Unable to read text from csv into dataframe when text is "null" and "nan"

I am trying to load Google n-gram word frequency data into a pandas dataframe.

Dataset can be found here: https://www.kaggle.com/wheelercode/dictionary-word-frequency

Unfortunately, a couple of words are not loading. The word "null" appears on row 9156 of the csv file and the word "nan" appears on row 17230 of the csv file.

This is how I am loading the data:

import numpy as np
import pandas as pd
my_freq_df = pd.read_csv('ngram_freq_dict.csv', dtype={"word": str, "count": np.int32})
my_freq_df['word'] = my_freq_df['word'].astype("string")

Unfortunately, when I check whether those words were loaded as strings, I find that they were not:

count = 0

for index, row in my_freq_df.iterrows():
    count += 1
    try:
        len(row['word'])
    except TypeError:  # len() fails when the word is a missing value (NA)
        print(row['word'])
        print(count)
        print("****____*****")

The output of the try/except shows that I can't calculate the length of the words "nan" and "null". Both words are being read as NA.
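
The behaviour can be reproduced without the Kaggle file. The following is a minimal sketch, assuming a hypothetical three-row sample (sample_csv) with the same word and count columns. It uses isna() in place of the loop to list the rows whose word was parsed as a missing value.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for ngram_freq_dict.csv.
sample_csv = "word,count\nthe,100\nnull,50\nnan,25\n"

# Same loading steps as in the question (count dtype left to pandas).
df = pd.read_csv(io.StringIO(sample_csv), dtype={"word": str})
df["word"] = df["word"].astype("string")

# isna() flags every row whose word was read as a missing value.
missing_rows = df[df["word"].isna()]
print(missing_rows)
```

In this sample the rows that held "null" and "nan" are the ones flagged, even though the column was requested as str.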

How do I fix this?
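
One possible fix, sketched under the assumption that no cell in the file is meant to be a genuine missing value: read_csv converts a default list of strings (including "null", "nan", "NA" and the empty string) to NaN, and passing keep_default_na=False switches that list off. The sample data below is hypothetical and stands in for ngram_freq_dict.csv.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for ngram_freq_dict.csv.
sample_csv = "word,count\nthe,100\nnull,50\nnan,25\n"

fixed_df = pd.read_csv(
    io.StringIO(sample_csv),
    dtype={"word": "string"},
    keep_default_na=False,  # do not treat "null", "nan", "NA", ... as missing
)

# Every word is now a real string, so its length can be computed.
word_lengths = fixed_df["word"].str.len().tolist()
print(fixed_df["word"].tolist())
print(word_lengths)
```

If empty cells should still count as missing, combining keep_default_na=False with na_values=[""] is one option. Passing na_filter=False instead disables missing-value detection entirely.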



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
