Ghost NaN values in Pandas DataFrame, strange behaviour with NumPy

This is a very strange problem. I have tried a lot of things but can't find a way to solve it.

I have a DataFrame with data collected from an API, which works fine. I then use the pandas-ta library (https://github.com/twopirllc/pandas-ta), which adds new columns to the DataFrame.

Of course, the new columns sometimes contain NaN values (there are many reasons, but the main one is that some indicators are length-based).

A basic problem with a basic solution: I just call df.fillna(0, inplace=True) and it works!

But when I check df.values (or the result of to_numpy()), there are still NaN values.

Properties of the problem:

_The NaNs are not found in the array with np.where(), using either np.nan or pandas-ta.npNaN

_df.isna().any().any() returns False

_The NaNs are float values, not strings

_The array has dtype object

_I tried various methods to replace the NaNs, not only fillna, but since they are not recognized, none of them work

_I also thought large numbers might be the cause, but using to_numpy(dtype='float64') gives the same problem

So these values appear only after the conversion to a NumPy array, and they are not recognized as NaN.
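To make the checks above concrete, here is a minimal sketch of how NaNs can be located in an object-dtype array. The array `values` is a hypothetical stand-in for df.values, not my real data. It also shows that an equality comparison against np.nan never matches, because NaN is not equal to itself, so np.where(arr == np.nan) returns nothing even when NaNs are present. This does not explain where the NaNs come from; it only shows which tests can find them.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df.values with dtype=object (not the real data)
values = np.array([[1.0, np.nan], [3.0, 4.0]], dtype=object)

# Equality against np.nan never matches, because NaN != NaN
as_float = values.astype("float64")
eq_hits = np.argwhere(as_float == np.nan)
print(eq_hits.shape[0])  # 0

# np.isnan works on the float copy
nan_hits = np.argwhere(np.isnan(as_float))
print(nan_hits.tolist())  # [[0, 1]]

# pd.isna works directly on the object array (np.isnan would raise TypeError there)
obj_hits = np.argwhere(pd.isna(values))
print(obj_hits.tolist())  # [[0, 1]]
```

The row and column positions returned by np.argwhere can then be mapped back to the DataFrame's index and columns to see which cells are affected.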

They also show up when I apply PCA to my dataset, where I get an error message because of the NaNs.

Thanks a lot for your time, and sorry for any mistakes; I'm not a native speaker.

Have a good day y'all.

Edit:

Here is a screenshot of the operations I'm running and the printed result; you can see one NaN value. Picture



Solution 1:[1]

You will want to save the input as an int to compare:

students = {11111: "A+", 22222: "B+", 33333: "D+"}
ID = int(input("please enter the student ID:"))
for key in students:
    if ID == key:
        print(students[ID])
        break
else:
    # for/else: runs only when the loop ends without hitting break
    print("ID not found")
if len(str(ID)) < 5:
    print("invalid Id")
elif len(str(ID)) > 5:
    print("invalid Id")

That will let you compare correctly, but I think a better version would be the following:

students = {11111: "A+", 22222: "B+", 33333: "D+"}
ID = int(input("please enter the student ID:"))
if ID in students:
    print(students[ID])
else:
    print("ID not found")
if len(str(ID)) != 5:
    print("invalid Id")

Solution 2:[2]

I would suggest this code instead:

students = {11111: "A+", 22222: "B+", 33333: "D+"}

ID = int(input("please enter the student ID: "))
if len(str(ID)) == 5:
    if ID in students:
        print(students[ID])
    else:
        print("ID not found")
else:
    print("invalid ID")

It first checks the length of the input. Then, if the input is in the dictionary, it prints the corresponding grade; otherwise it prints "ID not found".

Solution 3:[3]

First of all, for your input loop you should follow "Asking the user for input until they give a valid response". It should look something like this:

while True:
    ID = input("please enter the student ID:")
    if len(ID) != 5:
        print("invalid Id")
    else:
        # check the id
        break

Now, regarding how you check the keys: it is not necessary to loop over a dict to check if a key exists. The whole advantage of dicts is that they are hash tables and give O(1) average look-up time. So you can simply do:

if ID in students:
    print(students[ID])
else:
    print("ID not found")

But since you're just printing, this can all be simplified using the get method, which takes a default value that is returned if the key is not found. So an if/else is not even necessary:

print(students.get(ID, "ID not found"))

Lastly, remember that input always returns a string. Your keys are ints. So you will have to convert the ID to an int before using it as a key.
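As a small illustration of that mismatch, the sketch below uses a hard-coded string `raw` as a stand-in for what input() would return (the variable names are only for illustration):

```python
students = {11111: "A+", 22222: "B+", 33333: "D+"}

raw = "11111"  # stand-in for the string that input() would return

# The string "11111" is not the same key as the int 11111
str_lookup = students.get(raw, "ID not found")
int_lookup = students.get(int(raw), "ID not found")

print(str_lookup)  # ID not found
print(int_lookup)  # A+
```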

All together your code could be:

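students = {11111: "A+", 22222: "B+", 33333: "D+"}

while True:
    ID = input("please enter the student ID:")
    if len(ID) != 5:
        print("invalid Id")
    else:
        # the keys are ints, so convert the string before the look-up
        print(students.get(int(ID), "ID not found"))
        break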
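One thing to keep in mind: int(ID) raises a ValueError if the user types five non-numeric characters such as abcde. A possible way to guard against that is sketched below. The helper name lookup_grade is hypothetical, and treating non-numeric input as "invalid Id" is my assumption about the behaviour you want. Keeping the check in a function separate from input() also makes it easy to test.

```python
def lookup_grade(raw_id, students):
    """Return the grade for raw_id, or "invalid Id" / "ID not found"."""
    raw_id = raw_id.strip()
    # isdecimal() is True only for strings of decimal digits,
    # so int() below cannot raise ValueError
    if len(raw_id) != 5 or not raw_id.isdecimal():
        return "invalid Id"
    return students.get(int(raw_id), "ID not found")


students = {11111: "A+", 22222: "B+", 33333: "D+"}

print(lookup_grade("11111", students))  # A+
print(lookup_grade("abcde", students))  # invalid Id
print(lookup_grade("99999", students))  # ID not found
```

Inside the while loop you would then call lookup_grade(ID, students) and break only when the result is not "invalid Id".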

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Ali SHOKOUH ABDI
Solution 3