How to plot multiple values in CSV file in one plot?

First of all, please excuse my English; I am not a native speaker.

I have a large CSV file with around 84,000 rows and 5 columns, as follows:

filename    max     probability  size_start size_end
   file1     33     0.001         30          10
   file2     4      0.001         30          10
   file3     50     0.001         40          10
   file4     0      0.001         50          10

I would like to plot probability against max for each pair of size_start and size_end; I have 10 different pairs of start and end.
I think I need a for loop or while loop over the start and end values, plotting the corresponding probability and max values for each pair, but I do not know how to loop over a CSV file. Please enlighten me if you have any idea about the answer.



Solution 1:[1]

You could use the pandas DataFrame.plot() method, after subdividing your data into groups:

import pandas as pd

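df = pd.read_csv('data.csv')
# Each iteration yields one (size_start, size_end) pair and the rows that share it.
for (size_start, size_end), group in df.groupby(['size_start', 'size_end']):
    ax = group.plot(x='probability', y='max', kind='scatter')
    # Save one image per pair, named after the pair's values.
    ax.get_figure().savefig(f'plot_{size_start}_{size_end}.png')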

This will create a separate image for each size_start/size_end combination. There are many options to customize the design and layout of the plot; have a look at the pandas documentation for DataFrame.plot().
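To see what the loop above iterates over, here is a minimal sketch that uses only the four sample rows from the question. The rows are inlined through io.StringIO and written comma-separated, which is an assumption made so the sketch runs without data.csv. If the real file is whitespace-separated as displayed in the question, passing sep=r'\s+' to pd.read_csv would likely be needed.

```python
import io

import pandas as pd

# Sample rows from the question, inlined (assumption: comma-separated)
# so the sketch runs without the real data.csv.
SAMPLE_CSV = """filename,max,probability,size_start,size_end
file1,33,0.001,30,10
file2,4,0.001,30,10
file3,50,0.001,40,10
file4,0,0.001,50,10
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Grouping by a list of columns yields ((size_start, size_end), sub-frame) pairs.
group_sizes = {}
for (size_start, size_end), group in df.groupby(['size_start', 'size_end']):
    group_sizes[(int(size_start), int(size_end))] = len(group)
    print(size_start, size_end, group['filename'].tolist())
```

Each sub-frame contains only the rows for one pair, so no explicit loop over the CSV lines is needed.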

If you want to combine all the plots into one diagram, you can simply call the plot() function multiple times:

from matplotlib import pyplot as plt

fig = plt.figure()
for (size_start, size_end), group in df.groupby(['size_start', 'size_end']):
    plt.plot('probability', 'max', data=group, label=f'{size_start} / {size_end}')

plt.xlabel('probability')
plt.ylabel('max')
plt.legend()
plt.show() # or plt.savefig('figure.png')
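Note that plt.plot() connects the points of each group with lines by default. If separate markers are preferred, one possible variant is to draw every group with scatter() on a single Axes. The sketch below is self-contained and again assumes the question's sample rows, inlined as comma-separated text; the 'Agg' backend is chosen only so it runs without a display.

```python
import io

import matplotlib
matplotlib.use('Agg')  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows from the question, inlined (assumption: comma-separated).
SAMPLE_CSV = """filename,max,probability,size_start,size_end
file1,33,0.001,30,10
file2,4,0.001,30,10
file3,50,0.001,40,10
file4,0,0.001,50,10
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

fig, ax = plt.subplots()
# One scatter call per (size_start, size_end) pair; each gets its own
# color and its own legend entry.
for (size_start, size_end), group in df.groupby(['size_start', 'size_end']):
    ax.scatter(group['probability'], group['max'],
               label=f'{size_start} / {size_end}')

ax.set_xlabel('probability')
ax.set_ylabel('max')
ax.legend(title='size_start / size_end')
```

Calling fig.savefig('figure.png') afterwards writes the combined plot to disk.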

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
