Using pandas to replace the header of a dataframe
I have an XYZ file in the following format:
X[m] Y[m] DensD_1200c[m]
625268.27 234978.67 7.24
625268.34 234978.52 7.24
625268.38 234978.45 7.24
625268.43 234978.37 7.24
625268.47 234978.30 7.24
I'm using this code to read it into Python via pandas and turn it into a shapefile:
import os
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

input_file = "C:/input_file.xyz"  # hard-coded for the minute - to be changed
file_extension = os.path.splitext(input_file)[-1].lower()
output_file = input_file[:-4]
# `file_extension == ".xyz" or ".asc"` is always true, so test membership instead
if file_extension in (".xyz", ".asc"):
    # skiprows=2 skips the first two lines of the file; names sets the column labels
    df = pd.read_table(input_file, skiprows=2, sep=r'\,|\t', engine='python', names=['x', 'y', 'z'])
elif file_extension in (".txt", ".csv"):
    df = pd.read_csv(input_file, sep=r'\,|\t', engine='python')
    df.columns = ["x", "y", "z"]
gdf = gpd.GeoDataFrame(df, geometry=df.apply(lambda row: Point(row.x, row.y, row.z), axis=1))
gdf.to_file(f"{output_file}.shp")  # hard-coded for the minute - to be changed
shapefile = f"{output_file}.shp"
print("Shapefile Created!")
However, is there a way to remove the header row containing text and replace it with X, Y, Z?
Note: not all files will have headers, so I need a way of recognising whether there's a header and, if so, replacing it with X, Y, Z.
Solution 1:[1]
You can specify the column names in pandas.read_csv using names=[...]. You then need to explicitly mention that the first line is a header:
pd.read_csv(..., header=0, names=['X', 'Y', 'Z'])
or with skiprows:
pd.read_csv(..., skiprows=1, names=['X', 'Y', 'Z'])
output:
X Y Z
0 625268.27 234978.67 7.24
1 625268.34 234978.52 7.24
2 625268.38 234978.45 7.24
3 625268.43 234978.37 7.24
4 625268.47 234978.30 7.24
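The question also notes that some files have no header at all, which the calls above do not handle on their own. One possible approach, sketched here under assumptions, is to peek at the first line and treat it as a header if any field fails to parse as a number. The helper names `first_line_is_header` and `read_xyz` are hypothetical, and the sketch assumes whitespace-delimited data with a non-empty first line:

```python
import io
import re

import pandas as pd


def first_line_is_header(first_line):
    """Return True if any field on the line cannot be parsed as a float."""
    # Split on commas and/or whitespace so the check works for both layouts.
    fields = [f for f in re.split(r"[,\s]+", first_line.strip()) if f]
    for field in fields:
        try:
            float(field)
        except ValueError:
            return True
    return False


def read_xyz(handle, sep=r"\s+"):
    """Read a three-column text stream and name the columns X, Y, Z.

    `handle` is an open text file or any seekable text stream.
    """
    first_line = handle.readline()
    handle.seek(0)  # rewind so pandas sees the whole file
    # header=0 replaces an existing header row; header=None keeps every row as data.
    header = 0 if first_line_is_header(first_line) else None
    return pd.read_csv(handle, sep=sep, header=header, names=["X", "Y", "Z"])


# Small in-memory stand-ins for a file with and without a header row.
with_header = io.StringIO(
    "X[m] Y[m] DensD_1200c[m]\n"
    "625268.27 234978.67 7.24\n"
    "625268.34 234978.52 7.24\n"
)
without_header = io.StringIO(
    "625268.27 234978.67 7.24\n"
    "625268.34 234978.52 7.24\n"
)

df_a = read_xyz(with_header)
df_b = read_xyz(without_header)
```

For a file on disk, the same function could be called as `with open(input_file) as f: df = read_xyz(f)`. The check is a heuristic: a header whose labels all look numeric (or are literally `nan` or `inf`) would be mistaken for data.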
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |