Number of characters in the column name of a DataFrame
I'm consuming an API, and some of the column names are too long for my MySQL database.
How can I drop those fields from the DataFrame?
This is what I tried:
import pandas as pd
import numpy as np
lst =['Java', 'Python', 'C', 'C++','JavaScript', 'Swift', 'Go']
df = pd.DataFrame(lst)
limit = 7
for column in df.columns:
    if (pd.to_numeric(df[column].str.len())) > limit:
        df -= df[column]
print (df)
Result:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I would prefer to drop any column whose name is longer than my database supports.
I also tried slicing the name to shorten it, but that didn't work either.
I appreciate any help.
Solution 1:[1]
As I mentioned in my comment, df = pd.DataFrame(lst) creates a DataFrame with a single column whose rows are populated from your one-dimensional list. Iterating over the columns therefore accomplishes nothing, because there is only one column. The ValueError comes from the if statement: df[column].str.len() > limit compares every row at once and returns a boolean Series, which Python cannot reduce to a single True or False.
That being said, the single column is an advantage, because you can use a set-based approach to answer your question:
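To check the single-column claim yourself, a minimal sketch like the following inspects the shape and column labels of the frame built from the question's list (the names n_rows, n_cols and column_labels are just illustrative):

```python
import pandas as pd

lst = ['Java', 'Python', 'C', 'C++', 'JavaScript', 'Swift', 'Go']
df = pd.DataFrame(lst)

# A flat list becomes one column (labelled 0 by default) with one row per element.
n_rows, n_cols = df.shape
column_labels = list(df.columns)
print(n_rows, n_cols, column_labels)
```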
import pandas as pd
import numpy as np
lst =['Java', 'Python', 'C', 'C++','JavaScript', 'Swift', 'Go']
df = pd.DataFrame(lst)
limit = 7
print(df[df[0].str.len() > limit])
That prints a DataFrame with a single column and a single row containing "JavaScript", the only value longer than your character limit. If you want to keep the values that are within the limit instead, change that > to <=.
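If the strings to check are the column names themselves rather than cell values, the same idea can probably be applied to df.columns. The sketch below is only an illustration: the frame, its column names, and the limit of 10 are made up, so substitute the real API fields and your database's actual limit.

```python
import pandas as pd

# Hypothetical frame: these column names stand in for the API field names.
df = pd.DataFrame(
    [[1, 2, 3]],
    columns=['id', 'short_name', 'a_very_long_api_field_name'],
)
limit = 10  # assumed maximum name length, chosen only for this example

# Boolean mask over the column labels: True where the name fits the limit.
fits = df.columns.str.len() <= limit

# Option 1: drop the columns whose names are too long.
kept = df.loc[:, fits]

# Option 2: keep every column but cut each name down to the limit.
truncated = df.rename(columns=lambda name: name[:limit])

print(list(kept.columns))
print(list(truncated.columns))
```

Note that truncating can produce duplicate names when two long names share the same prefix, so dropping is the safer of the two options unless you check for collisions.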
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Corralien |
