Strange hashing issue using pip hash

I want to install some Python packages using pip but cannot, because every downloaded file produces the same hash, which then fails the comparison in pip's security check.

After playing around, I see that every file I download with curl from files.pythonhosted.org hashes to the same value. I download each file like so:

curl http://files.pythonhosted.org/packages/1a/80/b06ce333aabba7ab1b6a41ea3c4e46970ceb396e705733480a2d47a7f74b/Django-4.0.3-py3-none-any.whl -o django.whl

and then hash the downloads with this Python script:

import hashlib

hasher = hashlib.sha256()
BLOCKSIZE = 65536

def hash_stuff(file):
    with open(file, 'rb') as afile:
        buf = afile.read(BLOCKSIZE)
        while len(buf) > 0:
            hasher.update(buf)
            buf = afile.read(BLOCKSIZE)
    print(hasher.hexdigest())

hash_stuff("pynvim.tar.gz")
hash_stuff("opencv.tar.gz")
hash_stuff("django.whl")

which outputs:

➜  ~ python pythonhash.py
c77ab57a36e39ce205ca2327a3edd10399f4d78a3be91e80d845a1b97c29b7d6
ea75572349ed10da0f3224398737fd08352ae10e6f3c571345feb971e080a276
9e31adaf584633587df90d7be36e2fb287c7344eaa4bb23d619f4bdaa19a67d0

If I change the order of the hash_stuff calls like so:

hash_stuff("django.whl")
hash_stuff("opencv.tar.gz")
hash_stuff("pynvim.tar.gz")

the output does not change!

➜  ~ python pythonhash.py
c77ab57a36e39ce205ca2327a3edd10399f4d78a3be91e80d845a1b97c29b7d6
ea75572349ed10da0f3224398737fd08352ae10e6f3c571345feb971e080a276
9e31adaf584633587df90d7be36e2fb287c7344eaa4bb23d619f4bdaa19a67d0

If I create a fresh hasher object inside the function, I get the first hash (c77ab57...) three times:

def hash_stuff(file):
    hasher = hashlib.sha256()
    BLOCKSIZE = 65536
    with open(file, 'rb') as afile:

-----
➜  ~ python pythonhash.py
c77ab57a36e39ce205ca2327a3edd10399f4d78a3be91e80d845a1b97c29b7d6
c77ab57a36e39ce205ca2327a3edd10399f4d78a3be91e80d845a1b97c29b7d6
c77ab57a36e39ce205ca2327a3edd10399f4d78a3be91e80d845a1b97c29b7d6

I've written the same test in Ruby and get the same results:

require 'digest'

puts Digest::SHA256.hexdigest File.read "django.whl"
puts Digest::SHA256.hexdigest File.read "opencv.tar.gz"
puts Digest::SHA256.hexdigest File.read "pynvim.tar.gz"

As a sanity check, I've tested hashing some local files and they produce the same hash consistently, regardless of ordering.

  • How can the ordering of execution affect the hash?
  • Also, files.pythonhosted.org doesn't even seem to have a proper SSL certificate. Can I even trust this host?
  • What could I possibly be doing wrong?


Solution 1:[1]

It turns out my internet provider was blocking the content because files.pythonhosted was presenting a self-signed SSL certificate.
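
A quick way to spot this kind of blocked download is to look at the first bytes of the saved file instead of trusting its extension. The sketch below is only a rough check: sniff_download is a made-up helper name, and it only knows the zip signature (wheels are zip archives), the gzip signature (.tar.gz sdists) and a couple of common HTML openings.

```python
def sniff_download(path):
    """Guess what a downloaded file really contains from its first bytes."""
    with open(path, "rb") as f:
        head = f.read(512)

    if head.startswith(b"PK\x03\x04"):
        return "zip"    # .whl files are zip archives
    if head.startswith(b"\x1f\x8b"):
        return "gzip"   # .tar.gz sdists start with the gzip magic number

    # Block pages and error pages are usually plain HTML.
    text = head.lstrip().lower()
    if text.startswith(b"<!doctype html") or text.startswith(b"<html"):
        return "html"
    return "unknown"
```

Calling sniff_download("django.whl") on a real wheel should give "zip". If it gives "html", the file is a web page and not the package.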

The reason the hash was the same for all files is that I was getting an error HTML page instead of the packages (doh). Thanks @jasonharper for the pointer!
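
This also explains why the call order made no difference. A shared hashlib object keeps accumulating everything passed to update(), so the script in the question printed the digest of one, two and then three copies of the same page. The sketch below reproduces that with in-memory bytes; the page content and the helper names sha256_hex and running_digests are made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Digest of one blob using a fresh hasher."""
    return hashlib.sha256(data).hexdigest()


def running_digests(chunks):
    """Mimic a shared hasher: report the digest after each successive update."""
    hasher = hashlib.sha256()
    digests = []
    for chunk in chunks:
        hasher.update(chunk)  # state carries over from earlier chunks
        digests.append(hasher.hexdigest())
    return digests


# Hypothetical stand-in for the error page every download returned.
page = b"<html><body>Blocked</body></html>"
downloads = [page, page, page]

# Identical inputs: reordering them cannot change the running digests.
assert running_digests(downloads) == running_digests(downloads[::-1])

# A fresh hasher per file gives the same digest three times.
assert len({sha256_hex(d) for d in downloads}) == 1
```

With genuinely different files, the shared hasher would give different results for different orderings, which is why creating the hasher inside the function is the safer pattern.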

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 RandomEngineer