Is there a Python solution to reshape a DataFrame, keeping the order in which the CSV files were read, after transposing the data?

This is how my CSV looks: 2 tables, one with a summary organized by row and the other organized by column.

I must read more than 500 CSV files with read_csv. I must convert the first table to the same range as the second, so that I can apply aggregation functions to analyse the data.

The second table was solved with:

df_value = (pd.read_csv(f, sep=';', encoding='latin1', skiprows=8, header=0, usecols=[0,1,2], index_col=False) for f in all_files)

c_df_value = pd.concat(df_value, ignore_index=True, axis=0, join='outer')

The issue is the first table. I repeated it to the range I need, but after transpose() I really don't know how to reshape the table while respecting the order in which the CSVs were read. This is what works so far:

df = pd.DataFrame()

df = (pd.read_csv(f, sep=';', encoding='latin1', skiprows=2, nrows=5, header=None, index_col=False, usecols=[1]) for f in all_files)

df2 = pd.concat(df, ignore_index=True, axis=0, join='outer')

df2 = df2.transpose()

df2 = pd.DataFrame(np.repeat(df2.values[0:], repeats=8760, axis=0), index=None)

When I try to use np.reshape, I can't get the rows in the order I need:

df = df.rename(columns={0:'Region', 1: 'Cod', 2: 'Lat', 3: 'Lon', 4: 'Alt'})

unique_cols = df.columns.unique().tolist()

new_df = pd.DataFrame(df.values.reshape((-1, len(unique_cols))),columns=unique_cols)

When I use reshape(order='C'), it mixes the columns by row; order='A' or order='F' mixes the column order.

I tried: pd.pivot, pd.melt, pd.wide_to_long

With reshape and order='C', df2 = pd.DataFrame(df2.values.reshape(-1, 5)) returns the rows out of order.
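The row mixing described above can be reproduced on a tiny array. This is only a sketch with made-up stand-in values (2 files, 5 summary fields, 3 repeats instead of 8760); the names file_a, file_b, wide, mixed and ordered are hypothetical. It suggests that repeating the wide row first and reshaping afterwards alternates the files, while reshaping to one row per file first and repeating afterwards keeps each file's rows together:

```python
import numpy as np

# Hypothetical stand-in data: 2 files, 5 summary values each,
# and 3 repeats instead of 8760 to keep the output readable.
n_fields, repeats = 5, 3
file_a = np.array(["A", "a1", "a2", "a3", "a4"], dtype=object)
file_b = np.array(["B", "b1", "b2", "b3", "b4"], dtype=object)

# What concat + transpose produces: one wide row holding every file in turn.
wide = np.concatenate([file_a, file_b]).reshape(1, -1)

# Repeat first, reshape afterwards: the files alternate row by row.
mixed = np.repeat(wide, repeats, axis=0).reshape(-1, n_fields)

# Reshape first (one row per file), repeat afterwards: each file's rows
# stay together, in the order the files were concatenated.
ordered = np.repeat(wide.reshape(-1, n_fields), repeats, axis=0)

print(mixed[:, 0].tolist())    # first column of the interleaved result
print(ordered[:, 0].tolist())  # first column of the grouped result
```

Applied to the question's data, this would mean calling reshape(-1, 5) on the concatenated values before np.repeat rather than after it.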

Could someone help me? Thanks.



Solution 1:[1]

I think I solved it, after 2 days. The code is below.


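import os
import pandas as pd

df_list = []
# os.listdir returns names in arbitrary order; sort so the order is reproducible (it must match the order used for the second table)
for file in sorted(os.listdir('met')):
  if file.endswith('.CSV'):
    # read the 5 summary values (second column) as a single column
    temp_df = pd.read_csv(os.path.join('met', file), sep=';', encoding='latin1', skiprows=2, nrows=5, header=None, usecols=[1])
    # turn them into one row and repeat it 8760 times to match the rows of the second table
    temp_df = pd.concat([temp_df.transpose()]*8760, ignore_index=True, axis=0, join='outer')
    df_list.append(temp_df)
# stack the per-file blocks in reading order
new_df = pd.concat(df_list, ignore_index=True, axis=0, join='outer')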
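A possible variation on the loop above, offered as a sketch rather than a tested replacement: collect one summary row per file, stack them, and let np.repeat do the repetition once instead of concatenating 8760 copies per file. The helper names build_summary and fake_file are hypothetical, the column names come from the rename step in the question, and the in-memory "files" only imitate the layout implied by skiprows=2 and nrows=5:

```python
import io

import numpy as np
import pandas as pd

COLUMNS = ["Region", "Cod", "Lat", "Lon", "Alt"]  # names from the question's rename step


def build_summary(sources, repeats=8760):
    """Return `repeats` rows per source, keeping the order of `sources`."""
    rows = []
    for src in sources:
        summary = pd.read_csv(src, sep=";", encoding="latin1", skiprows=2,
                              nrows=5, header=None, usecols=[1])
        rows.append(summary.iloc[:, 0].to_numpy())  # the 5 summary values
    stacked = np.vstack(rows)  # one row per file, in reading order
    return pd.DataFrame(np.repeat(stacked, repeats, axis=0), columns=COLUMNS)


def fake_file(region, code):
    """Hypothetical in-memory file: 2 lines to skip, then 5 label;value rows."""
    text = ("line 1\nline 2\n"
            f"REGION;{region}\nCODE;{code}\nLAT;-15.78\nLON;-47.92\nALT;1160\n")
    return io.BytesIO(text.encode("latin1"))


demo = build_summary([fake_file("CO", "A001"), fake_file("S", "A801")], repeats=2)
print(demo)
```

With real data, sources would be the same ordered list of paths used for the second table, so that both frames line up row for row.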

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 FelipeAllStack