How would I create a custom list class in Python?
I would like to write a custom list class in Python (let's call it MyCollection) where I can eventually call:
for x in myCollectionInstance:
    #do something here
How would I go about doing that? Is there some class I have to extend, or are there any functions I must override in order to do so?
Solution 1:[1]
You can subclass list if your collection basically behaves like a list:
class MyCollection(list):
    def __init__(self, *args, **kwargs):
        # Forward the arguments unchanged, so MyCollection() with no arguments also works
        super(MyCollection, self).__init__(*args, **kwargs)
However, if your main wish is that your collection supports the iterator protocol, you just have to provide an __iter__ method:
class MyCollection(object):
    def __init__(self):
        self._data = [4, 8, 15, 16, 23, 42]

    def __iter__(self):
        for elem in self._data:
            yield elem
This allows you to iterate over any instance of MyCollection.
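As a minimal sketch of what that buys you (the variable names below are illustrative only), anything that consumes an iterable works with such a class, including comprehensions, sum() and the in operator, which falls back to iteration when no __contains__ method is defined:

```python
class MyCollection(object):
    def __init__(self):
        self._data = [4, 8, 15, 16, 23, 42]

    def __iter__(self):
        # Delegate to the iterator of the underlying list.
        return iter(self._data)


numbers = MyCollection()
doubled = [x * 2 for x in numbers]  # list comprehension
total = sum(numbers)                # any function that accepts an iterable
has_15 = 15 in numbers              # membership test falls back to __iter__
```

Note that indexing (numbers[0]) and len(numbers) are still unsupported here; they need __getitem__ and __len__ respectively.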
Solution 2:[2]
I like to subclass MutableSequence, as recommended by Alex Martelli. This works well... I frequently need to add custom methods on top of the list I'm building.
#####################################################################
# For more complete methods, refer to UserList() in the CPython source...
# https://github.com/python/cpython/blob/208a7e957b812ad3b3733791845447677a704f3e/Lib/collections/__init__.py#L1215
#####################################################################
try:
    # Python 3
    from collections.abc import MutableSequence
except ImportError:
    # Python 2.7
    from collections import MutableSequence


class MyList(MutableSequence):
    """A container for manipulating lists of hosts"""

    def __init__(self, data=None):
        """Initialize the class"""
        super(MyList, self).__init__()
        if (data is not None):
            self._list = list(data)
        else:
            self._list = list()

    def __repr__(self):
        return "<{0} {1}>".format(self.__class__.__name__, self._list)

    def __len__(self):
        """List length"""
        return len(self._list)

    def __getitem__(self, ii):
        """Get a list item"""
        if isinstance(ii, slice):
            return self.__class__(self._list[ii])
        else:
            return self._list[ii]

    def __delitem__(self, ii):
        """Delete an item"""
        del self._list[ii]

    def __setitem__(self, ii, val):
        # optional: self._acl_check(val)
        self._list[ii] = val

    def __str__(self):
        return str(self._list)

    def insert(self, ii, val):
        # optional: self._acl_check(val)
        self._list.insert(ii, val)

    def append(self, val):
        self.insert(len(self._list), val)


if __name__=='__main__':
    foo = MyList([1,2,3,4,5])
    foo.append(6)
    print(foo)        # [1, 2, 3, 4, 5, 6]  (print() uses __str__)
    print(repr(foo))  # <MyList [1, 2, 3, 4, 5, 6]>
    for idx, ii in enumerate(foo):
        print("MyList[%s] = %s" % (idx, ii))
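The "optional: self._acl_check(val)" comments above hint at a validation hook. The following is only a sketch of how such a hook could look; the class name IntList and the integers-only rule are assumptions made for illustration, not part of the original answer. Because MutableSequence builds append(), extend() and += on top of insert(), checking in insert() and __setitem__() covers all of them:

```python
from collections.abc import MutableSequence


class IntList(MutableSequence):
    """Hypothetical list that accepts only integers."""

    def __init__(self, data=None):
        self._list = []
        if data is not None:
            self.extend(data)  # extend() -> append() -> insert(), so each item is checked

    def _acl_check(self, val):
        # Hypothetical validation hook; replace with whatever rule you need.
        if not isinstance(val, int):
            raise TypeError("expected int, got {0!r}".format(val))

    def __len__(self):
        return len(self._list)

    def __getitem__(self, ii):
        return self._list[ii]

    def __delitem__(self, ii):
        del self._list[ii]

    def __setitem__(self, ii, val):
        self._acl_check(val)  # note: slice assignment is not handled in this sketch
        self._list[ii] = val

    def insert(self, ii, val):
        self._acl_check(val)
        self._list.insert(ii, val)


nums = IntList([1, 2])
nums.extend([3, 4])  # inherited mixin method, still validated
nums += [5]          # __iadd__ is inherited too and goes through extend()
try:
    nums.append("6")
    rejected = False
except TypeError:
    rejected = True
```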
Solution 3:[3]
In Python 3 we have beautiful collections.UserList([list]):
Class that simulates a list. The instance’s contents are kept in a regular list, which is accessible via the data attribute of UserList instances. The instance’s contents are initially set to a copy of list, defaulting to the empty list []. list can be any iterable, for example a real Python list or a UserList object.
In addition to supporting the methods and operations of mutable sequences, UserList instances provide the following attribute:
data: A real list object used to store the contents of the UserList class.
https://docs.python.org/3/library/collections.html#userlist-objects
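A short sketch of how UserList can be used for the question's MyCollection; the total() method is a hypothetical custom addition, included only to show where your own methods would go. In recent Python 3 versions, slicing and concatenation return the subclass rather than a plain list:

```python
from collections import UserList


class MyCollection(UserList):
    def total(self):
        # Hypothetical custom method; self.data is the wrapped real list.
        return sum(self.data)


items = MyCollection([1, 2, 3])
items.append(4)

tail = items[1:]        # slicing keeps the subclass
combined = items + [5]  # so does concatenation

for x in items:         # iteration works out of the box
    pass
```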
Solution 4:[4]
You could extend the list class:
class MyList(list):
    def __init__(self, *args):
        super(MyList, self).__init__(args[0])
        # Do something with the other args (and potentially kwargs)
Example usage:
a = MyList((1,2,3), 35, 22)
print(a)
for x in a:
    print(x)
Expected output:
[1, 2, 3]
1
2
3
Solution 5:[5]
Implementing a list from scratch requires you to implement the full container protocol:
__len__()
__iter__() __reversed__()
__getitem__() __contains__()
__setitem__() __delitem__()
__eq__() __ne__() __gt__()
__lt__() __ge__() __le__()
__add__() __radd__() __iadd__()
__mul__() __rmul__() __imul__()
__str__() __repr__() __hash__
But the crux of the list is its read-only protocol,
as captured by collections.abc.Sequence's 3 core methods
(only the first two are abstract; Sequence derives a default __iter__() from them):
__len__() __getitem__() __iter__()
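To illustrate how much Sequence provides once the core methods exist, here is a minimal sketch; the Squares class is a made-up example, not from the original answer. It defines only __len__() and __getitem__(), and inherits iteration, membership tests, reversed(), index() and count() as mixin methods:

```python
from collections.abc import Sequence


class Squares(Sequence):
    """Hypothetical read-only sequence of the first n square numbers."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, ix):
        if isinstance(ix, slice):
            raise TypeError("slicing is not supported in this sketch")
        if ix < 0:
            ix += self._n  # negative indices count from the end
        if not 0 <= ix < self._n:
            raise IndexError(ix)  # also tells the inherited __iter__ where to stop
        return ix * ix


sq = Squares(5)
as_list = list(sq)               # inherited __iter__
contains_nine = 9 in sq          # inherited __contains__
backwards = list(reversed(sq))   # inherited __reversed__
position = sq.index(4)           # inherited index()
occurrences = sq.count(16)       # inherited count()
```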
To see that in action, here is a lazy read-only list backed
by a range instance
(super handy because it knows how to do slicing gymnastics),
where any materialized values are stored in a cache (e.g. a dictionary):
import copy
from collections.abc import Sequence
from typing import Any, Dict, Iterator, Union

Value = Any  # assumption: placeholder for the element type, which the original leaves undefined


class LazyListView(Sequence):
    def __init__(self, length):
        self._range = range(length)
        self._cache: Dict[int, Value] = {}

    def __len__(self) -> int:
        return len(self._range)

    def __getitem__(self, ix: Union[int, slice]) -> Value:
        length = len(self)
        if isinstance(ix, slice):
            clone = copy.copy(self)  # shallow copy: the clone shares the cache
            clone._range = self._range[ix]  # slicing, delegated to range
            return clone
        else:
            if ix < 0:
                ix += length  # negative indices count from the end
            if not (0 <= ix < length):
                raise IndexError(f"list index {ix} out of range [0, {length})")
            key = self._range[ix]  # position in the underlying data, so slices hit the right entry
            if key not in self._cache:
                ...  # update cache
            return self._cache[key]

    def __iter__(self) -> Iterator[Value]:
        for i in range(len(self)):
            yield self[i]
Although the above class is still missing the write protocol and the remaining methods
like __eq__() and __add__(), it is already quite functional.
>>> alist = LazyListView(12)
>>> type(alist[3:])
LazyListView
A nice thing is that slices retain the class, so they avoid breaking laziness
and materializing elements (you can keep it that way elsewhere too, e.g. by coding an appropriate repr() method).
Yet the class still fails miserably in simple tests:
>>> alist == alist[:]
False
You have to implement __eq__() to fix this, and use facilities like
functools.total_ordering()
to implement __gt__() etc:
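from functools import total_ordering

@total_ordering
class LazyListView(Sequence):
    ...  # __init__(), __len__(), __getitem__() and __iter__() as above

    def __eq__(self, other):
        if self is other:
            return True
        if len(self) != len(other):
            return False
        return all(a == b for a, b in zip(self, other))

    def __lt__(self, other):
        if self is other:
            return False
        # Lexicographic order, as for built-in lists: the first differing pair decides.
        for a, b in zip(self, other):
            if a != b:
                return a < b
        # All shared positions are equal, so the shorter sequence sorts first.
        return len(self) < len(other)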
But that is indeed considerable effort.
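For a self-contained illustration of what functools.total_ordering() fills in, here is a small sketch on an eager stand-in class; Items is hypothetical and wraps a plain tuple, so it says nothing about laziness. Only __eq__() and __lt__() are written by hand, while <=, > and >= are derived:

```python
from collections.abc import Sequence
from functools import total_ordering


@total_ordering
class Items(Sequence):
    """Hypothetical read-only sequence, used only to demonstrate ordering."""

    def __init__(self, data):
        self._data = tuple(data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, ix):
        return self._data[ix]

    def __eq__(self, other):
        if not isinstance(other, Sequence):
            return NotImplemented
        return len(self) == len(other) and all(a == b for a, b in zip(self, other))

    def __lt__(self, other):
        if not isinstance(other, Sequence):
            return NotImplemented
        for a, b in zip(self, other):
            if a != b:
                return a < b  # first differing pair decides
        return len(self) < len(other)  # a proper prefix sorts first


same = Items([1, 2]) == Items([1, 2])
less = Items([1, 2]) < Items([1, 3])
less_or_equal = Items([1, 2]) <= Items([1, 2])    # derived by total_ordering
greater = Items([1, 2, 0]) > Items([1, 2])        # derived by total_ordering
greater_or_equal = Items([2]) >= Items([1, 9])    # derived by total_ordering
```

Defining __eq__() without __hash__() makes instances unhashable, which matches the behavior of built-in lists.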
NOTICE: if you try to bypass the effort and inherit from list (instead of Sequence),
more modifications are needed because, e.g., copy.copy() would now also try to copy
the underlying list and end up calling __iter__(), destroying laziness;
furthermore, the inherited __add__() method works on the list's internal storage, which breaks adding slices.
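A related pitfall of inheriting from list is easy to check: the inherited slicing and __add__() return plain list objects, not the subclass. The sketch below uses two made-up classes, MyList and KeepTypeList, to show the behavior and one possible workaround for slicing:

```python
class MyList(list):
    pass


a = MyList([1, 2, 3])
sliced = a[1:]            # plain list, the subclass is lost
summed = a + MyList([4])  # plain list as well


class KeepTypeList(list):
    """Hypothetical subclass that re-wraps slices in its own type."""

    def __getitem__(self, ix):
        result = super().__getitem__(ix)
        # Only slices produce a new list; single items are returned as they are.
        return type(self)(result) if isinstance(ix, slice) else result


kept = KeepTypeList([1, 2, 3])[1:]
```

Every other list-returning method (__add__(), __mul__(), copy() and so on) would need the same treatment, which is part of the extra effort mentioned above.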
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | K Mehta |
| Solution 2 | |
| Solution 3 | ramusus |
| Solution 4 | |
| Solution 5 | |
