Persisting NaN values from SQLite database
I am trying to drop all rows and columns with NaN values, because my code breaks on them.
Before you ask: yes, I asked Google, and I have what should be the correct code block to drop all NaN values.
import sqlite3
import pandas as pd

# ____________________________________________________________________________________ SQLite3 Integration
# Read SQLite query results into a pandas DataFrame
con = sqlite3.connect("Sensors Database.db")  # Name of the database file
df = pd.read_sql_query("SELECT * FROM Hollow_Data_1", con)
con.close()
# Drop all rows with NaN values in the coordinate columns
df = df.dropna(subset=['latitude', 'longitude', 'alt'])
# Verify that the result of the SQL query is stored in the DataFrame
# by printing the table in the console
print(df)
# df = pd.read_csv('DutchHollow_07222018_non_nan.csv')
The last (commented-out) line loads a test CSV to confirm the rest of my code works when no NaN values are present.
When I run print(df.head()), I see the same output as before. It's as if the rows were never dropped.
Database snippet (screenshot not reproduced here):

This database gathers data from sensors on drones, and the NaN values are junk data (the drone is powering on and programming the flight, not yet in the air). My dashboard plots these points on a Mapbox map.
This could be handled with NOT NULL constraints in the database schema in real time, so the table would never contain these NaN values in the first place.
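For example, something like this reduced sketch is what I have in mind (the real table has more columns; the names and types here are guesses based on my data):
import sqlite3

con = sqlite3.connect("Sensors Database.db")
# NOT NULL on the coordinate columns rejects rows with missing values at insert time
con.execute("""
    CREATE TABLE IF NOT EXISTS Hollow_Data_1 (
        gps_date  TEXT,
        latitude  REAL NOT NULL,
        longitude REAL NOT NULL,
        alt       REAL NOT NULL
    )
""")
con.commit()
con.close()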
Essentially, I want to drop all rows and columns with NaN values.
I have tried copying the DataFrame and every variant I can think of:
nan_df = df.dropna(), df.dropna(how='all'), df.dropna(subset=['latitude', 'longitude', 'alt']), and df.dropna(inplace=True).
Every time I run print(df.head()), the NaN values persist. It's as if the dropna call weren't there.
This code will be open source: https://github.com/coxchris859?tab=repositories
Solution 1:[1]
Messing around with your data, it appears that your CSV is just a bit scuffed.
Instead of true NaNs, your missing values are coming in as the string ' nan' <-- yes, with a space before it.
Given your df:
gps_date latitude longitude alt ... Temperature Humidity Pressure Voc
0 7/8/2018 20:41 37.7122685 -120.961832 30.4 ... 39.55 1011.68 27.130 1277076.0
1 7/8/2018 20:17 0 0 nan ... 39.66 1014.00 28.967 10943.0
2 7/8/2018 20:17 nan nan nan ... 41.19 1014.02 28.633 15895.0
3 7/8/2018 20:17 nan nan nan ... 42.05 1014.04 27.901 21403.0
4 7/8/2018 20:17 nan nan nan ... 42.49 1014.05 27.169 27909.0
... ... ... ... ... ... ... ... ... ...
4060 7/22/2018 21:50 37.7085305 -121.072975 38.1 ... 42.54 1014.45 22.296 995778.0
4061 7/22/2018 21:50 37.70852517 -121.0729798 38.1 ... 42.53 1014.45 22.305 998589.0
4062 7/22/2018 21:50 37.7085225 -121.0729787 38.2 ... 42.54 1014.44 22.307 999294.0
4063 7/22/2018 21:50 37.70852533 -121.072976 38.4 ... 42.54 1014.45 22.323 1000000.0
4064 7/22/2018 21:50 37.70853217 -121.0729735 38.6 ... 42.54 1014.46 22.323 999294.0
[4065 rows x 21 columns]
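A quick way to confirm this (my own check, not part of the original code): if the coordinate columns come back with dtype object, the values are strings, and dropna() has nothing to drop.
# Object dtype here means the "nan" entries are strings, not real NaN
print(df.dtypes[['latitude', 'longitude', 'alt']])
print(df['alt'].unique()[:5])  # expect entries like ' nan' with a leading space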
Doing:
import numpy as np

# Replace the string ' nan' with a real NaN, then drop those rows
df = df.replace(' nan', np.nan)
df = df.dropna()
Output:
gps_date latitude longitude alt ... Temperature Humidity Pressure Voc
0 7/8/2018 20:41 37.7122685 -120.961832 30.4 ... 39.55 1011.68 27.130 1277076.0
101 7/8/2018 20:19 37.72486737 -120.9415272 -179.979 ... 42.33 999.77 22.664 511798.0
103 7/8/2018 20:19 37.7193156 -120.9505354 10.642 ... 42.22 999.79 22.596 521619.0
104 7/8/2018 20:19 37.71908237 -120.9503735 1.043 ... 42.12 999.88 22.523 524717.0
105 7/8/2018 20:19 37.71871426 -120.9502485 -11.66 ... 42.03 999.80 22.539 528246.0
... ... ... ... ... ... ... ... ... ...
4060 7/22/2018 21:50 37.7085305 -121.072975 38.1 ... 42.54 1014.45 22.296 995778.0
4061 7/22/2018 21:50 37.70852517 -121.0729798 38.1 ... 42.53 1014.45 22.305 998589.0
4062 7/22/2018 21:50 37.7085225 -121.0729787 38.2 ... 42.54 1014.44 22.307 999294.0
4063 7/22/2018 21:50 37.70852533 -121.072976 38.4 ... 42.54 1014.45 22.323 1000000.0
4064 7/22/2018 21:50 37.70853217 -121.0729735 38.6 ... 42.54 1014.46 22.323 999294.0
[3939 rows x 21 columns]
Alternatively, you can run something like df.latitude = df.latitude.astype(float) on each affected column; this appears to parse the stray 'nan' strings into real NaN automatically and leaves you with the correct dtype for those columns.
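A minimal sketch of that alternative, using the column names from the DataFrame above (pd.to_numeric(..., errors='coerce') would be a more forgiving option if other non-numeric strings can appear):
# Casting to float parses the ' nan' strings into real NaN values,
# which dropna() can then remove
for col in ['latitude', 'longitude', 'alt']:
    df[col] = df[col].astype(float)
df = df.dropna(subset=['latitude', 'longitude', 'alt'])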
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | BeRT2me |
