Replace numbers from fileinput maintaining old file formatting and ignoring commas
Hi, I have a config file from another program that I want to manipulate with Python. The only things that need to be replaced are the numbers in the file. The file itself has some tabs, whitespace and commas. I was able to read the file with pandas, keeping only the relevant information. Now I want to replace the numbers inside the file with the numbers from my pandas DataFrame. This is my code so far:
import pandas as pd
import fileinput

params = pd.read_csv("params.txt", skip_blank_lines=True, delim_whitespace=True,
                     skiprows=1, header=None,
                     names=["paramname", "equal", "param1", "param2"])
#params.columns = params.columns.str.strip()
params.drop(params.columns[1], axis=1, inplace=True)
params.replace(',', '', regex=True, inplace=True)
params.replace('\t', '', regex=True, inplace=True)
params.replace(' ', '', regex=True, inplace=True)

it = 0
rowidx = 0
colidx = "param1"

with fileinput.FileInput("params.txt", inplace=True, backup='.bak') as file:
    for line in file:
        if (it % 2 == 0):
            colidx = "param1"
        else:
            colidx = "param2"
        print(' '.join([params.loc[rowidx, colidx] if x.lstrip('-,').isnumeric() else x for x in line.split()]))
        if (it % 2):
            rowidx += 1
        it += 1
Most things are working as I intended, but I still have some issues that I can't find a solution for. I was able to also replace negative numbers with the lstrip function, but there are some numbers that have a comma after the digit, and lstrip('-,') doesn't seem to work for those.
Another problem is that, although the numbers are replaced correctly, rewriting the file destroys the formatting of the original file (tabs are being removed).
So my questions are:
- How can I replace numbers that are followed by a comma without replacing the comma itself?
- How do I maintain the original formatting of the input file (tabs etc.)?
- How do I also take float numbers like 0.5 into account?
This is what the file looks like:
Parameter File
QQ = 7, 6
RR = 5, 5
SS = 0, 0
ay_on = 0, 0
by_on = 0, 1
cvc_on = 1
mvc_on = 1
rc = 0
adus = -200, -200
bdus = 200, 200
au = -5, -2
bu = 5, 2
ay = 7, 0
by = 10, 0.5
con_Ayl =
con_byl =
Hp = 100
Hu = 20
Hw = 1
Tsamp = 100
model = 1
soft = 3
Qfac = 5
sgm_r = 0.01, 0.01
Tobs = -1
mpc_on = 1
And this is what the DataFrame looks like:
paramname param1 param2
0 QQ 7 6.00
1 RR 5 5.00
2 SS 0 0.00
3 ay_on 0 0.00
4 by_on 0 1.00
5 cvc_on 1 NaN
6 mvc_on 1 NaN
7 rc 0 NaN
8 adus -200 -200.00
9 bdus 200 200.00
10 au -5 -2.00
11 bu 5 2.00
12 ay 7 0.00
13 by 10 0.50
14 con_Ayl NaN NaN
15 con_byl NaN NaN
16 Hp 100 NaN
17 Hu 20 NaN
18 Hw 1 NaN
19 Tsamp 100 NaN
20 model 1 NaN
21 soft 3 NaN
22 Qfac 5 NaN
23 sgm_r 0.01 0.01
24 Tobs -1 NaN
25 mpc_on 1 NaN
EDIT: I solved the problem with the comma and floats with:
print(' '.join([params.loc[rowidx, colidx] if (x.lstrip('-').isnumeric() or isfloat(x.lstrip('-'))) else params.loc[rowidx, colidx] + ',' if x.endswith(',') else x for x in line.split()]))
But the problem that all the tabs are removed from the file is still there. How can I prevent this?
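One way to keep the tabs (a minimal sketch, not from the original thread, and assuming the parameter names themselves contain no digits): instead of splitting the line and re-joining it with spaces, substitute only the numeric tokens with re.sub and a replacement callback, so every other character on the line, including tabs, commas and the '=', is written back unchanged. The regex NUMBER_RE and the repl helper below are illustrative choices, not part of the original code.

import re
import fileinput
import pandas as pd

# Matches integers and floats with an optional leading minus sign; the
# surrounding tabs, commas and '=' are not part of the match and survive.
NUMBER_RE = re.compile(r'-?\d+(?:\.\d+)?')

# dtype=str keeps the values exactly as they appear in the file
params = pd.read_csv("params.txt", skip_blank_lines=True, delim_whitespace=True,
                     skiprows=1, header=None, dtype=str,
                     names=["paramname", "equal", "param1", "param2"])
params = params.drop(columns=["equal"]).replace(',', '', regex=True)

rowidx = 0
with fileinput.FileInput("params.txt", inplace=True, backup='.bak') as file:
    for line in file:
        if '=' not in line:
            # header and blank lines: pass through untouched
            print(line, end='')
            continue
        cols = iter(["param1", "param2"])
        row = rowidx

        def repl(match):
            # consume the next DataFrame value for each number on this line
            return str(params.loc[row, next(cols)])

        print(NUMBER_RE.sub(repl, line), end='')
        rowidx += 1

Because re.sub only touches the matched digits, a line like "by = 10,	0.5" keeps its tab and comma and only the two numbers change.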
Solution 1:[1]
I prefer to parse the file and then generate the records for a pd.DataFrame.
import pandas as pd
import numpy as np

def parse_file_to_records(file_name):
    records = []
    with open(file_name, 'r') as f:
        lines = f.readlines()
    # skip the first line
    for line in lines[1:]:
        line = line.strip().replace(' ', '')
        if not line:
            continue
        paramname, paramvalue = line.split('=')
        param2 = np.nan
        try:
            param1, param2 = paramvalue.split(',')
        except ValueError:
            param1 = paramvalue or np.nan
        records.append({
            'paramname': paramname,
            'param1': param1,
            'param2': param2,
        })
    return records

df = pd.DataFrame.from_records(parse_file_to_records('params.txt'))
print(df)
and the result looks like this
paramname param1 param2
0 QQ 7 6
1 RR 5 5
2 SS 0 0
3 ay_on 0 0
4 by_on 0 1
5 cvc_on 1 NaN
6 mvc_on 1 NaN
7 rc 0 NaN
8 adus -200 -200
9 bdus 200 200
10 au -5 -2
11 bu 5 2
12 ay 7 0
13 by 10 0.5
14 con_Ayl NaN NaN
15 con_byl NaN NaN
16 Hp 100 NaN
17 Hu 20 NaN
18 Hw 1 NaN
19 Tsamp 100 NaN
20 model 1 NaN
21 soft 3 NaN
22 Qfac 5 NaN
23 sgm_r 0.01 0.01
24 Tobs -1 NaN
25 mpc_on 1 NaN
If you want to judge whether a string is a valid float, you can write a simple function like this:
def isfloat(num):
    try:
        float(num)
        return True
    except ValueError:
        return False

print(isfloat('1.23'))
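As a side note (an observation, not part of the original answer), float() already accepts a leading minus sign, so isfloat covers negative numbers and floats in one test; only a trailing comma has to be stripped off before the check and re-attached afterwards, for example:

token = "-0.5,"                    # example token as it appears in the file
core = token.rstrip(',')           # '-0.5'
print(isfloat(core))               # True: sign and decimal point handled by float()
new = "1.5"                        # hypothetical replacement value
print(new + ',' if token.endswith(',') else new)   # prints '1.5,' with the comma preserved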
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
