Random stealing calls to child initializer
There's a situation involving sub-classing I can't figure out.
I'm sub-classing Random (the reason is beside the point). Here's a basic example of what I have:
import random

class MyRandom(random.Random):
    def __init__(self, x):  # x isn't used here, but it's necessary to show the problem.
        print("Before")
        super().__init__()  # Nothing passed to parent
        print("After")

MyRandom([])
The above code, when run, gives the following error (and doesn't print "Before"):
>>> import test
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\_\PycharmProjects\first\test.py", line 11, in <module>
    MyRandom([])
TypeError: unhashable type: 'list'
To me, this doesn't make any sense. Somehow, the argument to MyRandom is apparently being passed directly to Random.__init__ even though I'm not passing it along, and the list is being treated as a seed. "Before" never prints, so apparently my initializer is never even being called.
I thought maybe this was somehow due to the parent of Random being implemented in C and this was causing weirdness, but a similar case with list sub-classing doesn't yield an error saying that ints aren't iterable:
class MyList(list):
    def __init__(self, y):
        print("Before")
        super().__init__()
        print("After")

r = MyList(2)  # Prints "Before", "After"
I have no clue how to even approach this. I rarely ever sub-class, and even rarer is it that I sub-class a built-in, so I must have developed a hole in my knowledge. This is not how I expect sub-classing to work. If anyone can explain what's going on here, I'd appreciate it.
Python 3.9
Solution 1:[1]
I found a way to pass a list into a subclass of Random and use it in __init__.
import random
from typing import List

class MyRandom(random.Random):
    internal_list: List

    def __init__(self, x=None):
        if type(x) is list:
            print(f"Access to the list from `__init__`: {MyRandom.internal_list}")
            super().__init__(MyRandom.internal_list[0])
        else:
            super().__init__(x)

    def __new__(cls, x):
        cls.internal_list = x
        return super().__new__(cls)

    def new_method(self):
        print(f"Access to the list from `new_method`: {MyRandom.internal_list}")

r1 = MyRandom([1, 2])
r1.new_method()
print(r1.random())

r2 = MyRandom([3, 4])
r2.new_method()
print(r2.random())
Output:
Access to the list from `__init__`: [1, 2]
Access to the list from `new_method`: [1, 2]
0.13436424411240122
Access to the list from `__init__`: [3, 4]
Access to the list from `new_method`: [3, 4]
0.23796462709189137
For the purposes of the example, I used MyRandom.internal_list[0] to initialize the PRNG. Of course, real code would need to check that the first element exists.
I'm not sure why __new__ receives the argument when you instantiate MyRandom. I couldn't find it documented; the stub that PyCharm shows for it is only this:
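One thing to watch in the code above: internal_list is assigned on the class, so every new MyRandom overwrites the list seen by earlier instances. A minimal sketch of a per-instance variant follows, assuming the goal is simply to keep the list with each object. Falling back to default seeding for an empty list is my own choice for the sketch, not something from the original answer.

```python
import random

class MyRandom(random.Random):
    def __new__(cls, x=None):
        # Do not forward x to the parent constructor; keep it on the instance.
        self = super().__new__(cls)
        self.internal_list = x
        return self

    def __init__(self, x=None):
        if isinstance(x, list):
            # Assumption for this sketch: an empty list means "use default seeding".
            seed = x[0] if x else None
        else:
            seed = x
        super().__init__(seed)

r1 = MyRandom([1, 2])
r2 = MyRandom([3, 4])
print(r1.internal_list, r2.internal_list)  # [1, 2] [3, 4]
```

Each instance now keeps its own list, and creating r2 no longer changes what r1 sees.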
@staticmethod  # known case of __new__
def __new__(*args, **kwargs):  # real signature unknown
    """ Create and return a new object. See help(type) for accurate signature. """
    pass
Solution 2:[2]
There is already a good answer showing how to work around this issue, but it got me curious about why this happens. I couldn't get to a definitive answer, but I'm posting my findings here for anyone who wants to follow through.
We know that when a new instance is created, __new__ is called first, creating the actual instance (allocating the memory at the C level). The newly created instance is then passed to the class's __init__ method. Both calls receive the same constructor arguments.
Since "Before" was never printed, it is safe to assume that the problem is in the __new__ method. Indeed, when I override it like this:
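To make the call order concrete, here is a small sketch with a made-up class (Traced is purely illustrative, not part of random) that records which method runs first and with which arguments:

```python
class Traced:
    calls = []  # records (method name, positional args) in call order

    def __new__(cls, *args, **kwargs):
        Traced.calls.append(("__new__", args))
        # Do not forward the arguments to object.__new__.
        return super().__new__(cls)

    def __init__(self, value):
        Traced.calls.append(("__init__", (value,)))
        self.value = value

obj = Traced(42)
print(Traced.calls)  # [('__new__', (42,)), ('__init__', (42,))]
```

The constructor argument reaches __new__ before __init__ ever runs, so a parent __new__ that inspects its arguments can fail before the child's __init__ prints anything.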
def __new__(cls, *args, **kwargs):
    print("in new")
    return super().__new__(cls)
No errors were raised, and I got the expected output:
in new
Before
After
Once I added *args to the super call:
return super().__new__(cls, *args)
The same error was back. So this must be an issue in Random's __new__.
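Putting that together with the class from the question, a minimal sketch of the workaround looks like this. Storing x on the instance is only there to show that the argument is still available in __init__.

```python
import random

class MyRandom(random.Random):
    def __new__(cls, *args, **kwargs):
        # Swallow the constructor arguments so the parent __new__ never sees them.
        return super().__new__(cls)

    def __init__(self, x):
        self.x = x          # the argument is still delivered to __init__
        super().__init__()  # nothing passed to the parent, as in the question

r = MyRandom([])
value = r.random()
print(0.0 <= value < 1.0)  # True
```

With the arguments held back in __new__, the unhashable list never reaches the parent's constructor.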
Inspecting the code with PyCharm, Random doesn't override its __new__ method, but the class signature is:
class Random(_random.Random):
Trying to inspect this parent class shows a bunch of methods containing only pass. This seemed weird, but after a quick search I found out (for some it is probably not a surprise) that standard-library modules whose names start with _ are often C implementations. The C implementation of _random is _randommodule.c.
Now, I don't have much knowledge or experience with inspecting C implementations of Python, but I found what seems to be the basic slots of the Random class:
static PyType_Slot Random_Type_slots[] = {
    {Py_tp_doc, (void *)random_doc},
    {Py_tp_methods, random_methods},
    {Py_tp_new, PyType_GenericNew},
    {Py_tp_init, random_init},
    {Py_tp_free, PyObject_Free},
    {0, 0},
};
My own understanding from this is that the class' __init__ is mapped to random_init, and the class' __new__ is mapped to PyType_GenericNew. But as its name suggests, PyType_GenericNew is just a generic object creator assigning the necessary amount of memory for the object. Its body is the single line:
PyObject *
PyType_GenericNew(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    return type->tp_alloc(type, 0);
}
The args are not even used.
On the other hand, the random_init function calls random_seed which does have some hashing in it:
Py_hash_t hash = PyObject_Hash(arg);
But then again, we established that the __init__ is not even called yet, at which point I'm stumped...
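One way to narrow this down is to test the interpreter directly rather than read the sources. The probe below calls the C-level __new__ with an unhashable argument and reports whether it complains. It assumes a CPython build that exposes the private _random module, and new_consumes_args is a helper name I made up for the sketch. One possible explanation, which I have not confirmed here, is that the slots quoted above come from a newer CPython than the 3.9 used in the question. If so, the result would differ between versions.

```python
import _random  # private CPython module; assumed to be available
import sys

def new_consumes_args():
    """Return True if _random.Random.__new__ rejects an unhashable argument."""
    try:
        _random.Random.__new__(_random.Random, [])
    except TypeError:
        return True
    return False

print(sys.version_info[:2], new_consumes_args())
```

A True result would mean __new__ itself processes the argument on that interpreter, which would match the traceback in the question.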
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Yevgeniy Kosmak |
| Solution 2 | Tomerikoo |
