TypeError: __init__() got an unexpected keyword argument 'encoding'

I am trying to scrape table data with pandas in Python 3.6, running in Spyder 3 on a MacBook Pro (macOS 10.13.2, build 17C88). The code is:

import time
import pandas as pd
...

url = "https://coinmarketcap.com/currencies/bitcoin/historical-data/?start=20130428&end="+time.strftime("%Y%m%d")

# CODE FAILS HERE
bitcoin_market_info = pd.read_html(url)[0]

The results shown in the console:

bitcoin_market_info = pd.read_html(url)[0]
Traceback (most recent call last):

  File "<ipython-input-2-0b0d269a2c9d>", line 15, in <module>
    bitcoin_market_info = pd.read_html(url)[0]

  File "/Users/EL-C/anaconda3/lib/python3.6/site-packages/pandas/io/html.py", line 915, in read_html
    keep_default_na=keep_default_na)

  File "/Users/EL-C/anaconda3/lib/python3.6/site-packages/pandas/io/html.py", line 749, in _parse
    raise_with_traceback(retained)

  File "/Users/EL-C/anaconda3/lib/python3.6/site-packages/pandas/compat/__init__.py", line 385, in raise_with_traceback
    raise exc.with_traceback(traceback)

TypeError: __init__() got an unexpected keyword argument 'encoding'

pd.__version__ is '0.21.1'

Results of pd.show_versions():

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Darwin
OS-release: 17.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.1
pytest: 3.3.0
pip: 9.0.1
setuptools: 36.5.0.post20170921
Cython: 0.27.3
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.1.1
openpyxl: 2.4.9
xlrd: 1.1.0
xlwt: 1.2.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.1.13
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

I have attempted this fix, but it seems to target an older version and not exactly this situation, since I haven't imported html5lib myself.

In case it's needed:

html5lib.__version__ is 1.0.1

bs4.__version__ is 4.6.0

Running 'pip3 install -U html5lib=="0.9999999"' (as suggested) in the terminal doesn't change the version in Spyder3.

What I see in the terminal when running the command is:

Requirement already up-to-date: html5lib==0.9999999 in /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages
Requirement already up-to-date: six in /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages (from html5lib==0.9999999)

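Note that pip3 reports a site-packages directory under /Library/Frameworks/Python.framework, while the traceback shows Spyder loading pandas from /Users/EL-C/anaconda3. Maybe this is the root cause? If so, I need help figuring it out.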
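One way to check whether the terminal's pip3 and the Spyder console share an environment is to print, from inside Spyder, which interpreter is running and where each parser library is loaded from. This is only a minimal sketch; the helper name describe_module is hypothetical and not part of any library.

```python
import importlib
import sys


def describe_module(name):
    """Return (version, file path) for an importable module, or None if it is missing."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", "built-in")
    return version, location


# The interpreter that is executing this code (run it in the Spyder console).
print("interpreter:", sys.executable)

# The libraries involved in read_html; None means "not installed here".
for name in ("pandas", "lxml", "bs4", "html5lib"):
    print(name, "->", describe_module(name))
```

If the paths printed here point into anaconda3 while pip3 reports /Library/Frameworks/Python.framework, the pip3 command is modifying a different Python installation than the one Spyder uses.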



Solution 1:[1]

Try installing the Anaconda Python distribution (available for Linux, Windows, and macOS).

It works perfectly fine for me:

In [133]: bitcoin_market_info = pd.read_html(url)[0]

In [134]: bitcoin_market_info
Out[134]:
              Date      Open      High       Low     Close       Volume    Market Cap
0     Dec 26, 2017  14036.60  16461.20  14028.90  16099.80  13454300000  235294000000
1     Dec 25, 2017  13995.90  14593.00  13448.90  14026.60  10664700000  234590000000
2     Dec 24, 2017  14608.20  14626.00  12747.70  13925.80  11572300000  244824000000
3     Dec 23, 2017  13948.70  15603.20  13828.80  14699.20  13086000000  233748000000
4     Dec 22, 2017  15898.00  15943.40  11833.00  13831.80  22198000000  266381000000
5     Dec 21, 2017  16642.40  17567.70  15342.70  15802.90  16516600000  278827000000
6     Dec 20, 2017  17760.30  17934.70  16077.70  16624.60  22149700000  297526000000
7     Dec 19, 2017  19118.30  19177.80  17275.40  17776.70  16894500000  320242000000
8     Dec 18, 2017  19106.40  19371.00  18355.90  19114.20  14839500000  320000000000
9     Dec 17, 2017  19475.80  20089.00  18974.10  19140.80  13314600000  326141000000
...            ...       ...       ...       ...       ...          ...           ...
1694  May 07, 2013    112.25    113.44     97.70    111.50            -    1248470000
1695  May 06, 2013    115.98    124.66    106.64    112.30            -    1289470000
1696  May 05, 2013    112.90    118.80    107.14    115.91            -    1254760000
1697  May 04, 2013     98.10    115.00     92.50    112.50            -    1089890000
1698  May 03, 2013    106.25    108.13     79.10     97.75            -    1180070000
1699  May 02, 2013    116.38    125.60     92.28    105.21            -    1292190000
1700  May 01, 2013    139.00    139.89    107.72    116.99            -    1542820000
1701  Apr 30, 2013    144.00    146.93    134.05    139.00            -    1597780000
1702  Apr 29, 2013    134.44    147.49    134.00    144.54            -    1491160000
1703  Apr 28, 2013    135.30    135.98    132.10    134.21            -    1500520000

[1704 rows x 7 columns]

Module versions:

In [135]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US
LOCALE: None.None

pandas: 0.21.1
pytest: 3.3.0
pip: 9.0.1
setuptools: 36.5.0.post20170921
Cython: 0.27.3
numpy: 1.13.3
scipy: 1.0.0
pyarrow: 0.7.0
xarray: 0.10.0
IPython: 6.2.1
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: 1.5.1
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: 0.4.0
matplotlib: 2.1.1
openpyxl: 2.4.9
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.1.13
pymysql: 0.7.11.None
psycopg2: 2.7.3.1 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.5.0

read_html() uses lxml by default (flavor=None):

flavor : str or None, container of strings

The parsing engine to use. bs4 and html5lib are synonymous with each other, they are both there for backwards compatibility. The default of None tries to use lxml to parse and if that fails it falls back on bs4 + html5lib.
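Because the default tries lxml first and only then falls back to bs4 + html5lib, the TypeError in the question likely comes from the fallback, and whatever made lxml fail stays hidden. The sketch below tries each flavor separately and reports its individual outcome. The helper names and the inline sample table are illustrative assumptions; only pd.read_html and its flavor parameter come from pandas.

```python
import io

# Tiny inline document so the sketch needs no network access (made-up layout).
SAMPLE_HTML = """<table>
  <thead><tr><th>Date</th><th>Close</th></tr></thead>
  <tbody><tr><td>Apr 28, 2013</td><td>134.21</td></tr></tbody>
</table>"""


def read_tables_with_flavor(source, flavor):
    """Parse HTML tables with exactly one engine, so that engine's own error is visible."""
    import pandas as pd  # imported lazily; needs pandas plus the chosen parser installed
    return pd.read_html(source, flavor=flavor)


def try_each_flavor(html=SAMPLE_HTML, flavors=("lxml", "bs4")):
    """Return {flavor: 'ok' or the error text} instead of hiding the first failure."""
    results = {}
    for flavor in flavors:
        try:
            read_tables_with_flavor(io.StringIO(html), flavor)
            results[flavor] = "ok"
        except Exception as exc:  # report every failure, whatever its type
            results[flavor] = "{}: {}".format(type(exc).__name__, exc)
    return results


print(try_each_flavor())
```

For the page in the question, calling read_tables_with_flavor(url, "lxml") should raise lxml's own error, if there is one, rather than the html5lib TypeError.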

Solution 2:[2]

pip3 install -U html5lib=="0.9999999"
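A bare pip3 may belong to a different Python installation than the one running the code, as the paths in the question suggest. One common way around this is to invoke pip through the running interpreter with "-m pip". The sketch below builds that command from sys.executable; the function names are hypothetical, and the pinned version is simply the one suggested in this answer.

```python
import subprocess
import sys


def pip_install_command(requirement):
    """Build a pip command bound to the interpreter that is running this code."""
    return [sys.executable, "-m", "pip", "install", "-U", requirement]


def run_install(requirement):
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.check_call(pip_install_command(requirement))


command = pip_install_command("html5lib==0.9999999")
print(" ".join(command))

# To actually perform the install from this interpreter, call:
# run_install("html5lib==0.9999999")
```

Run from the Spyder console, this targets the Anaconda environment. In a conda-managed setup, installing through conda may be preferable to mixing in pip.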

Here's the html5lib bug on GitHub.

from: https://stackoverflow.com/a/39087283

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution     Source
Solution 1   Community
Solution 2   tgrrr