Python DataFrame index as a tuple of XY coordinates

My function collectHUC takes XY coordinates and builds a dictionary of HUC codes named stationsXY for 1000 stations. The keys are tuples like ('41.462605 ', '-74.387089') with values like 0202000600. This process takes time, so I convert stationsXY to a DataFrame and save it as follows for later use:

xy = pd.DataFrame.from_dict(stationsXY, orient='index')
xy.to_csv('xy_huc.csv', index=False)

Here is the issue: when I read the file xy_huc.csv back with

xy_coor = pd.read_csv('xy_huc.csv', index_col=0, header=None)

The DataFrame xy_coor has 2 columns: column 0 for the XY coordinates and column 1 for the HUC codes. After setting column 0 as the index, the index values are strings, not tuples. I have tried these so far:

  1. Read the csv file, setting the index as a multi-index from the column

    xy_coor = pd.read_csv('xy_huc.csv', index_col=0, header=None)
    xy = pd.DataFrame(xy_coor, index=pd.MultiIndex.from_tuples(xy_coor.index))
    

    but I got this error

    TypeError: Expected tuple, got str
    
  2. Read the csv with index_col=False, set the index to column zero

    xy_coor = pd.read_csv('xy_huc.csv', index_col=False, header=None)
    xy = pd.DataFrame(stations_xy, index=pd.MultiIndex.from_tuples(xy_coor[0]))
    

    but the indexes of xy come out in this form

    (image: xy DataFrame for point 2)

I would like to retrieve the file in the same format in which it was saved. Any suggestions/recommendations are appreciated.

Thanks



Solution 1:[1]

There are multiple ways to do this, but I think the easiest would be to save x and y as separate columns.
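
A minimal sketch of the separate-columns idea, assuming the keys are (x, y) string tuples as in the question. The sample dictionary, the column names x, y and huc, and the in-memory buffer standing in for xy_huc.csv are illustrative assumptions, not part of the original code:

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real stationsXY dictionary.
stationsXY = {
    ('41.462605', '-74.387089'): '0202000600',
    ('40.712776', '-74.005974'): '0203010400',
}

# One row per station, with x, y and the HUC code in separate columns.
rows = [(x, y, huc) for (x, y), huc in stationsXY.items()]
xy = pd.DataFrame(rows, columns=['x', 'y', 'huc'])

# An in-memory buffer stands in for the file 'xy_huc.csv'.
buffer = io.StringIO()
xy.to_csv(buffer, index=False)
buffer.seek(0)

# dtype=str keeps the coordinates and HUC codes as text, so leading zeros survive.
xy_coor = pd.read_csv(buffer, dtype=str).set_index(['x', 'y'])

# The two-level index yields tuple keys again when converted back to a dictionary.
restored = xy_coor['huc'].to_dict()
```

Here restored should have the same tuple keys as the original dictionary, and the two index levels can also be used directly for lookups.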

Another way would be to parse the tuple string, for example:

def back_to_tuple(input_string):
    # remove the parentheses and quotes, but keep the comma
    punctuation = "()'"
    for punc in punctuation:
        input_string = input_string.replace(punc, '')

    # split on the comma and strip the whitespace around each coordinate
    parts = input_string.split(',')

    return tuple(part.strip() for part in parts)
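
As an alternative to stripping characters by hand, the standard library's ast.literal_eval can parse the text form of a tuple directly. This is a different technique from the one in the answer, and the helper name parse_tuple_string is hypothetical; it assumes the CSV cell holds exactly what str() produced for the tuple:

```python
import ast


def parse_tuple_string(text):
    """Turn a string such as "('41.462605 ', '-74.387089')" back into a tuple."""
    # literal_eval only evaluates Python literals, so it is safe on untrusted text.
    value = ast.literal_eval(text)
    if not isinstance(value, tuple):
        raise ValueError(f"expected a tuple literal, got {type(value).__name__}")
    return value


key = parse_tuple_string("('41.462605 ', '-74.387089')")
```

Unlike the manual approach, this keeps each element exactly as it was stored, including any whitespace inside the quotes.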

You could also save the xy tuple as a single string, with the two values separated by ';':

def tuple_to_string(input_tuple):
    return ';'.join(input_tuple)

And then retrieve it like so:

def back_to_tuple(input_string):
    return tuple(input_string.split(';'))
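
One way the two helpers might fit into a full save-and-load round trip is sketched below. The sample dictionary, the column names xy and huc, and the in-memory buffer standing in for xy_huc.csv are assumptions made for illustration:

```python
import io

import pandas as pd


def tuple_to_string(input_tuple):
    return ';'.join(input_tuple)


def back_to_tuple(input_string):
    return tuple(input_string.split(';'))


# Hypothetical sample standing in for the real stationsXY dictionary.
stationsXY = {
    ('41.462605', '-74.387089'): '0202000600',
    ('40.712776', '-74.005974'): '0203010400',
}

# Store each key as a single 'x;y' string next to its HUC code.
xy = pd.DataFrame({
    'xy': [tuple_to_string(key) for key in stationsXY],
    'huc': list(stationsXY.values()),
})

buffer = io.StringIO()  # stands in for the file 'xy_huc.csv'
xy.to_csv(buffer, index=False)
buffer.seek(0)

# Read everything back as text, then rebuild the tuple index from the 'xy' column.
xy_coor = pd.read_csv(buffer, dtype=str)
tuples = [back_to_tuple(text) for text in xy_coor.pop('xy')]
xy_coor.index = pd.MultiIndex.from_tuples(tuples, names=['x', 'y'])

restored = xy_coor['huc'].to_dict()
```

The ';' separator is chosen because it does not clash with the CSV comma, so the joined key stays in a single column.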

Solution 2:[2]

Why not convert the lat and long into a concatenated string and use that as the index? When you use the index later, split the string.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Golden Lion