Using pandas to replace header of dataframe

I have an XYZ file in the following format

X[m] Y[m]  DensD_1200c[m] 
   625268.27    234978.67    7.24   
   625268.34    234978.52    7.24   
   625268.38    234978.45    7.24   
   625268.43    234978.37    7.24   
   625268.47    234978.30    7.24   

I'm using this code to read it into Python via pandas and turn it into a shapefile.

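import os

import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

input_file = "C:/input_file.xyz" # hard-coded for the minute - to be changed
file_extension = os.path.splitext(input_file)[-1].lower()
output_file = input_file[:-4]


# `file_extension == ".xyz" or ".asc"` is always true, so test membership instead
if file_extension in (".xyz", ".asc"):
    df  = pd.read_table(input_file, skiprows=2, sep=r'\,|\t', engine='python', names=['x', 'y', 'z'])
    df.columns = ["x", "y", "z"]

elif file_extension in (".txt", ".csv"):
    # raw string for the regex separator; regex separators need the python engine
    df = pd.read_csv(input_file, sep=r'\,|\t', engine='python')
    df.columns = ["x", "y", "z"]
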

gdf = gpd.GeoDataFrame(df, geometry=df.apply(lambda row: Point(row.x,row.y,row.z), axis=1))

gdf.to_file(f"{output_file}.shp") # hard-coded for the minute - to be changed
shapefile = f"{output_file}.shp"

print("Shapefile Created!")

However, I'm wondering: is there a way to remove the header row containing text and replace it with X, Y, Z?

Note: not all files will have headers, so I need a way of recognising whether there's a header and, if so, replacing it with X, Y, Z.



Solution 1:[1]

You can specify the column names in pandas.read_csv using names=[...]. You then need to state explicitly that the first line is a header, so that it is replaced by your names rather than read as data:

pd.read_csv(..., header=0, names=['X', 'Y', 'Z'])

or with skiprows:

pd.read_csv(..., skiprows=1, names=['X', 'Y', 'Z'])

output:

           X          Y     Z
0  625268.27  234978.67  7.24
1  625268.34  234978.52  7.24
2  625268.38  234978.45  7.24
3  625268.43  234978.37  7.24
4  625268.47  234978.30  7.24
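
For files that may or may not have a header, one possible approach is to look at the first line before handing the file to pandas: if any field on it is not a number, treat it as a header and pass header=0, otherwise pass header=None. The sketch below is only an illustration under some assumptions: the helper names first_line_is_header and read_xyz are made up for this example, the input is a seekable text handle, and the columns are separated by whitespace (adjust sep for comma- or tab-separated files).

```python
import io

import pandas as pd


def first_line_is_header(line: str) -> bool:
    """Hypothetical helper: True if any field on the line is not a number."""
    # Treat commas like whitespace so the check works for both layouts.
    fields = line.replace(",", " ").split()
    if not fields:
        return False
    try:
        for field in fields:
            float(field)
    except ValueError:
        return True
    return False


def read_xyz(handle, sep=r"\s+"):
    """Hypothetical helper: read three columns and always name them X, Y, Z."""
    first_line = handle.readline()
    handle.seek(0)  # rewind so pandas sees the whole file
    # header=0 replaces an existing header row; header=None keeps every row as data.
    header = 0 if first_line_is_header(first_line) else None
    return pd.read_csv(handle, sep=sep, header=header, names=["X", "Y", "Z"])


# Small in-memory samples standing in for real files (assumed layout).
with_header = io.StringIO(
    "X[m] Y[m] DensD_1200c[m]\n"
    "625268.27 234978.67 7.24\n"
    "625268.34 234978.52 7.24\n"
)
without_header = io.StringIO(
    "625268.27 234978.67 7.24\n"
    "625268.34 234978.52 7.24\n"
)

df_a = read_xyz(with_header)
df_b = read_xyz(without_header)
```

With a real file, the same function can be called as read_xyz(open(input_file)). Both calls should give a frame with columns X, Y, Z and only the numeric rows. Note that a header made up entirely of numbers would not be detected by this check.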

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1