'Optimized dot product in Python
The dot product of two n-dimensional vectors u=[u1,u2,...un] and v=[v1,v2,...,vn] is is given by u1*v1 + u2*v2 + ... + un*vn.
A question posted yesterday encouraged me to find the fastest way to compute dot products in Python using only the standard library, no third-party modules or C/Fortran/C++ calls.
I timed four different approaches; so far the fastest seems to be sum(starmap(mul,izip(v1,v2))) (where starmap and izip come from the itertools module).
For the code presented below, these are the elapsed times (in seconds, for one million runs):
d0: 12.01215
d1: 11.76151
d2: 12.54092
d3: 09.58523
Can you think of a faster way to do this?
import timeit # module with timing subroutines
import random # module to generate random numnbers
from itertools import imap,starmap,izip
from operator import mul
def v(N=50,min=-10,max=10):
"""Generates a random vector (in an array) of dimension N; the
values are integers in the range [min,max]."""
out = []
for k in range(N):
out.append(random.randint(min,max))
return out
def check(v1,v2):
if len(v1)!=len(v2):
raise ValueError,"the lenght of both arrays must be the same"
pass
def d0(v1,v2):
"""
d0 is Nominal approach:
multiply/add in a loop
"""
check(v1,v2)
out = 0
for k in range(len(v1)):
out += v1[k] * v2[k]
return out
def d1(v1,v2):
"""
d1 uses an imap (from itertools)
"""
check(v1,v2)
return sum(imap(mul,v1,v2))
def d2(v1,v2):
"""
d2 uses a conventional map
"""
check(v1,v2)
return sum(map(mul,v1,v2))
def d3(v1,v2):
"""
d3 uses a starmap (itertools) to apply the mul operator on an izipped (v1,v2)
"""
check(v1,v2)
return sum(starmap(mul,izip(v1,v2)))
# generate the test vectors
v1 = v()
v2 = v()
if __name__ == '__main__':
# Generate two test vectors of dimension N
t0 = timeit.Timer("d0(v1,v2)","from dot_product import d0,v1,v2")
t1 = timeit.Timer("d1(v1,v2)","from dot_product import d1,v1,v2")
t2 = timeit.Timer("d2(v1,v2)","from dot_product import d2,v1,v2")
t3 = timeit.Timer("d3(v1,v2)","from dot_product import d3,v1,v2")
print "d0 elapsed: ", t0.timeit()
print "d1 elapsed: ", t1.timeit()
print "d2 elapsed: ", t2.timeit()
print "d3 elapsed: ", t3.timeit()
Notice that the name of the file must be dot_product.py for the script to run; I used Python 2.5.1 on a Mac OS X Version 10.5.8.
EDIT:
I ran the script for N=1000 and these are the results (in seconds, for one million runs):
d0: 205.35457
d1: 208.13006
d2: 230.07463
d3: 155.29670
I guess it is safe to assume that, indeed, option three is the fastest and option two the slowest (of the four presented).
Solution 1:[1]
Here is a comparison with numpy. We compare the fast starmap approach with numpy.dot
First, iteration over normal Python lists:
$ python -mtimeit "import numpy as np; r = range(100)" "np.dot(r,r)"
10 loops, best of 3: 316 usec per loop
$ python -mtimeit "import operator; r = range(100); from itertools import izip, starmap" "sum(starmap(operator.mul, izip(r,r)))"
10000 loops, best of 3: 81.5 usec per loop
Then numpy ndarray:
$ python -mtimeit "import numpy as np; r = np.arange(100)" "np.dot(r,r)"
10 loops, best of 3: 20.2 usec per loop
$ python -mtimeit "import operator; import numpy as np; r = np.arange(100); from itertools import izip, starmap;" "sum(starmap(operator.mul, izip(r,r)))"
10 loops, best of 3: 405 usec per loop
Seeing this, it seems numpy on numpy arrays is fastest, followed by python functional constructs working with lists.
Solution 2:[2]
I don't know about faster, but I'd suggest:
sum(i*j for i, j in zip(v1, v2))
it's much easier to read and doesn't require even standard-library modules.
Solution 3:[3]
Please benchmark this "d2a" function, and compare it to your "d3" function.
from itertools import imap, starmap
from operator import mul
def d2a(v1,v2):
"""
d2a uses itertools.imap
"""
check(v1,v2)
return sum(imap(mul, v1, v2))
map (on Python 2.x, which is what I assume you use) unnecessarily creates a dummy list prior to the computation.
Solution 4:[4]
In Mathematica (10^12 additions and multiplications):
a = RandomReal[1,{10^4,10^4}];
b = RandomReal[1,{10^4,10^4}];
AbsoluteTiming[c=a.b;]//First
9.65295 seconds
(Windows 10, Dell XPS17 9700, Mathematica 12.3)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | SilentGhost |
| Solution 3 | tzot |
| Solution 4 | Bart Romeny |
