sum() to remove bracket in a double list

I just saw someone write the code below and was confused about why sum() can be used to remove the inner brackets from a list of lists:

pwd = [['x'], ['y'], ['z']]

a = sum(pwd, [])
print(a)          # ['x', 'y', 'z']

Looking up the definition of sum(), I found:

sum(iterable, /, start=0)

iterable can be any iterable, such as a list, tuple or dictionary, but its items are normally expected to be numbers.

start is added to the sum of numbers in the iterable. If start is not given in the syntax, it is assumed to be 0.

How does passing an empty list as the start argument of sum() remove the inner lists from another list? This puzzles me… could anyone demystify this?



Solution 1:[1]

Think about what sum does. This:

x = sum([1,2,3,4],0)

Is the same as

x = 0 + 1 + 2 + 3 + 4

Similarly,

x = sum([['x'],['y'],['z']], [])

Is the same as

x = [] + ['x'] + ['y'] + ['z']

And that results in x = ['x','y','z']. It's a consequence of the fact that the list type defines the + operator as concatenation.
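As a minimal sketch of that equivalence (the variable names via_sum, via_plus and via_dunder are only illustrative), the three spellings below should all build the same flat list:

```python
pwd = [['x'], ['y'], ['z']]

# sum() with an empty list as the start value
via_sum = sum(pwd, [])

# The same computation written out with the + operator
via_plus = [] + ['x'] + ['y'] + ['z']

# + on lists dispatches to list.__add__, which returns a new concatenated list
via_dunder = [].__add__(['x']).__add__(['y']).__add__(['z'])

print(via_sum)
print(via_sum == via_plus == via_dunder)
```

Each + creates a new list rather than modifying the start value, so the empty list passed as start is left untouched.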

Solution 2:[2]

Python doesn't know what addition means. It relies on object methods to do the work. + is really a call to an object's __add__ method. Integers add, but lists extend - at least when adding another list.

sum adds iterated values to the start object. When you make start a list, it sums using the list addition rules. In your case, you start with an empty list, and then each iterated value, also a list, is added, extending the list. It's the same as

>>> a = []
>>> pwd = [['x'], ['y'], ['z']]
>>> for val in pwd:
...     print(val)
...     a = a + val
... 
['x']
['y']
['z']
>>> a
['x', 'y', 'z']
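The same mechanism should work for any type that defines __add__, not just lists. Here is a small sketch using a hypothetical Vec2 class (invented for this illustration, not taken from any library) whose instances are summed with a Vec2 start value:

```python
class Vec2:
    """Hypothetical 2-D vector, used only to illustrate __add__ with sum()."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Returning NotImplemented lets Python raise TypeError for other types
        if not isinstance(other, Vec2):
            return NotImplemented
        return Vec2(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        return isinstance(other, Vec2) and (self.x, self.y) == (other.x, other.y)

    def __repr__(self):
        return f"Vec2({self.x}, {self.y})"


points = [Vec2(1, 2), Vec2(3, 4), Vec2(5, 6)]

# start must be a Vec2, otherwise sum() would begin with 0 + Vec2(1, 2)
total = sum(points, Vec2(0, 0))
print(total)  # Vec2(9, 12)
```

Without the Vec2(0, 0) start value, sum() would try to add the integer 0 to a Vec2 first and fail, because this sketch does not define __radd__.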

This is part of the dynamic nature of Python and is leveraged in many ways in various packages. numpy and pandas broadcast operations across entire matrices, for example. pathlib overrides division to join paths.
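To illustrate the pathlib case, here is a short sketch. The path components are made up, and PurePosixPath is used so that the result does not depend on the operating system:

```python
from pathlib import PurePosixPath

base = PurePosixPath("/home/user")

# The / operator is implemented by the path class and joins path components
joined = base / "docs" / "notes.txt"
print(joined)  # /home/user/docs/notes.txt
```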

One could argue that any class you implement should prefer overriding the existing "magic methods" that implement Python operators over defining its own methods. Why would a queue have a put when it can implement +=? Okay, there are reasons why that would be a bad choice, too! That's design work.

Solution 3:[3]

We start with the empty list.

After processing the first element, we have [] + ['x'] == ['x'].

After processing the second element, we have ['x'] + ['y'] == ['x', 'y'].

After processing the third element, we have ['x', 'y'] + ['z'] == ['x', 'y', 'z'], as observed.
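The intermediate results listed above can be reproduced with itertools.accumulate, which yields every running total. This is only a sketch for inspection: the initial keyword requires Python 3.8 or later, and sum() itself does not expose these intermediate values.

```python
from itertools import accumulate
from operator import add

pwd = [['x'], ['y'], ['z']]

# accumulate applies add repeatedly, starting from the empty list,
# and keeps each intermediate result
steps = list(accumulate(pwd, add, initial=[]))

for step in steps:
    print(step)
# []
# ['x']
# ['x', 'y']
# ['x', 'y', 'z']
```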

Solution 4:[4]

Adding lists just concatenates them, so:

sum(pwd,[]) = [] + ['x'] + ['y'] + ['z']
            = ['x', 'y', 'z']

We need the empty list because sum(x) is the same as sum(x,0)

and sum(pwd,0) is equivalent to 0 + ['x'] + ['y'] + ['z']

which gives an error as an int cannot be added to a list.
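A short sketch of that failure, catching the exception so the script keeps running (error_message is just an illustrative variable name):

```python
pwd = [['x'], ['y'], ['z']]

error_message = None
try:
    # No start argument, so sum() begins with 0 + ['x']
    sum(pwd)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)  # unsupported operand type(s) for +: 'int' and 'list'

# Supplying an empty list as start avoids the int + list addition
flat = sum(pwd, [])
print(flat)  # ['x', 'y', 'z']
```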

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Tim Roberts
Solution 2 tdelaney
Solution 3 BrokenBenchmark
Solution 4 martineau