Delimiter of read_csv is in text field

I received data extracted from a server. The problem is that the CSV extract uses ";" as its delimiter.

I read the folder with the following command:

import glob
import pandas as pd

files = glob.glob(r"path/*.csv")
dfs = [pd.read_csv(f, sep=";", engine='c') for f in files]
df2 = pd.concat(dfs, ignore_index=True)

and the output is:


columnA    columnB .... columnT columnU
2000        A      ....  I wish  NaN
1000        B     ....   that    NaN
this ends   NaN   ....    NaN    NaN
3000        A     .....    I      DUU
...

The text in row 3 belongs to columnT in the second row. So far I am only able to delete malformed rows like that one, but then I cannot keep that information.

df2.dropna(subset=['columnB'], how='all', inplace=True)

How can I read the files correctly? The problem is that the text in the field columnT also uses ";" as a normal character.

The original text (in the CSV file) is:

columnA;    columnB; .... columnT;          columnU;
2000;        A;      ....  I wish;            NaN;
1000;        B;     ....   that; this ends;    NaN;
3000;        A;     .....    I;               DUU;

Solution 1:^[1]

I wasn't aware of a programmatic approach to solve this (see my comment), but out of interest, a quick search led me to "Escaping quotes and delimiters in CSV files with Excel". Perhaps you could try the same, i.e. either manually or programmatically replace all single quotes with double quotes, and then try your code again.
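
If the text field is not quoted at all, one possible workaround is to repair each line before handing it to pandas. The following is only a minimal sketch under stated assumptions: exactly one column (here the hypothetical index `text_col`) may contain ";", the stray ";" sits on the same physical line as in the CSV sample from the question, and the expected column count is known. The names `merge_overflow_fields` and `to_quoted_csv` and the three-column sample are made up for illustration and are not part of the original question or answer.

```python
import csv
import io


def merge_overflow_fields(lines, n_columns, text_col, sep=";"):
    """Re-join fields that were split because `sep` occurs inside one text column.

    Assumption: only the column at index `text_col` can contain `sep`.
    """
    rows = []
    for line in lines:
        fields = line.rstrip("\r\n").split(sep)
        extra = len(fields) - n_columns  # how many surplus fields this line has
        if extra > 0:
            # Glue the surplus pieces back into the single text field.
            merged = sep.join(fields[text_col:text_col + extra + 1])
            fields = fields[:text_col] + [merged] + fields[text_col + extra + 1:]
        rows.append([f.strip() for f in fields])
    return rows


def to_quoted_csv(rows, sep=";"):
    """Write rows back as CSV text, quoting any field that contains `sep`."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=sep, quoting=csv.QUOTE_MINIMAL,
                        lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical, shortened version of the data in the question
# (three columns, no trailing separator).
sample = [
    "columnA;columnT;columnU",
    "2000;I wish;NaN",
    "1000;that; this ends;NaN",
    "3000;I;DUU",
]

rows = merge_overflow_fields(sample, n_columns=3, text_col=1)
repaired = to_quoted_csv(rows)
```

The repaired text encloses the problematic field in double quotes, which is the default `quotechar` of `pd.read_csv`, so it should be readable with `pd.read_csv(io.StringIO(repaired), sep=";")`. If more than one column can contain ";", this approach cannot tell which field the surplus pieces belong to, and the extract would need to be regenerated with proper quoting.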

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Dharman
