Hugging Face transformers module not recognized by Anaconda

I am using Anaconda, Python 3.7, Windows 10.

I tried to install transformers following https://huggingface.co/transformers/ in my env. I am aware that I must have either PyTorch or TF installed; I have PyTorch installed, as seen in the Anaconda Navigator environments.

I kept getting different kinds of errors, depending on where (Anaconda / prompt) I uninstalled and reinstalled PyTorch and transformers. In my last attempt, using conda install pytorch torchvision cpuonly -c pytorch and conda install -c conda-forge transformers, I get an error:

from transformers import BertTokenizer
bert_tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)

def tok(dataset):
    input_ids = []
    attention_masks = []
    sentences = dataset.Answer2EN.values
    labels = dataset.Class.values
    for sent in sentences:
        encoded_sent = bert_tokenizer.encode(sent,
                                             add_special_tokens=True,
                                             max_length=64,
                                             pad_to_max_length=True)

TypeError: _tokenize() got an unexpected keyword argument 'pad_to_max_length'

Does anyone know a reliable way to install transformers using Anaconda? Thank you.



Solution 1:[1]

The problem is that conda only offers the transformers library in version 2.1.1 (repository information), and this version didn't have a pad_to_max_length argument. I don't want to look up whether there was a different parameter for it, but you can simply pad the result yourself (it is just a list of integers):

from transformers import BertTokenizer
bert_tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)

sentences = ['this is just a test', 'this is another test']

max_length = 64

for sent in sentences:
    encoded_sent = bert_tokenizer.encode(sent,
                                         add_special_tokens=True,
                                         max_length=max_length)
    # pad manually up to max_length with the [PAD] token id (0 for bert-base-uncased)
    encoded_sent.extend([0] * (max_length - len(encoded_sent)))

    ### your other stuff

The better option in my opinion is to create a new conda environment and install everything via pip rather than conda. This will allow you to work with the most recent transformers version (2.11); a rough sketch of that approach follows.
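Something like the following, where the environment name is just an example; install PyTorch in the new environment as well, e.g. following the selector on pytorch.org:

conda create -n hf-env python=3.7
conda activate hf-env
pip install transformers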

Solution 2:[2]

As mentioned by cronoik already, conda somehow only installs transformers version 2.1.1, although later versions seem to be available (see: https://anaconda.org/conda-forge/transformers/files).

What solved it for me with regard to conda is that it is also possible to install from a direct download link.

So I installed the latest version using this command:

 conda install https://anaconda.org/conda-forge/transformers/4.16.2/download/noarch/transformers-4.16.2-pyhd8ed1ab_0.tar.bz2

Just browse through their repository and right-click > Copy Link Target on the build you want.

EDIT: I noticed that cronoik's answer was written at a time when the conda repository in fact did not provide any other versions yet. At the time of my answer it does provide other versions, but conda still installs only version 2.1.1 if not told otherwise; see the sketch below.
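If conda-forge now lists the version you need, an alternative to copying the download link is to pin the version explicitly; a sketch, using the same version number as the command above:

conda install -c conda-forge transformers=4.16.2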

Solution 3:[3]

The answer by @Hung worked for me, but I also needed to update the packaging version after receiving the error: "huggingface-hub 0.5.1 requires packaging>=20.9, but you'll have packaging 20.4 which is incompatible".

This other post already solved that as well by running the command below:

pip install --upgrade huggingface-hub
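If the warning about packaging persists, upgrading that package directly may also be needed; this command is my addition, not part of the original answer:

pip install --upgrade packaging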

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 cronoik
Solution 2 Hung
Solution 3 John T.