How do I use NumPy to return an array that sums up data from a CSV file where the first column contains strings and the others contain floats

I have a CSV file that contains the power supply per plant for a month. I want to sum up the total supply per hour for each plant using NumPy and still maintain the dimensions. Below is an example of the data in the CSV file.

PLANT  1.00hrs  2.00hrs  ...  22.00hrs  23.00hrs  24.00hrs
AFAM IV - V     30.0     30.0  ...      50.0      50.0      50.0
AFAM IV - V     30.0     20.0  ...      50.0      30.0      30.0
AFAM IV - V     30.0     30.0  ...      50.0      50.0      50.0
AFAM IV - V    116.0    117.2  ...     166.1     170.6     164.6
AFAM IV - V     50.0     50.0  ...      48.0      48.0      50.0

Here is what I have tried:

import pandas as pd
import numpy as np

path = 'C:\\Users\\user\\PycharmProjects\\pycharmProject\\NESI_REPORT_JAN.csv'

pdf = pd.read_csv(path)
print(np.sum(np.sum(pdf)))

This gives me the following output:

PLANT       AFAM IV - VAFAM IV - VAFAM IV - VAFAM IV - VAF...
1.00hrs                                              111962.9
2.00hrs                                              106835.2
3.00hrs                                             101608.21
4.00hrs                                               99191.9
5.00hrs                                             102670.56
6.00hrs                                             112298.41

I have also tried this:

import numpy as np

path = 'C:\\Users\\user\\PycharmProjects\\pycharmProject\\NESI_REPORT_JAN.csv'

data = np.genfromtxt(path, dtype=None, delimiter=',', names=True)
newdata = np.array(data)

print(np.sum(data, axis=0, keepdims=True))

How do I sum the hours for each plant using NumPy arrays while keeping the original dimensions?



Solution 1:[1]

import numpy as np

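data = np.genfromtxt(fname='data.csv', delimiter=',', skip_header=1)
# skip_header: the number of lines to skip at the beginning of the file.
# The plant names in the first column cannot be parsed as floats, so they become NaN.
data = np.nan_to_num(data, nan=0)

# Now you have a normal float np.array.
# axis=0 adds the rows together, giving one total per column (hour).
row_sum = np.sum(data, axis=0)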
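
If the goal is one row of hourly totals per plant, the same idea can be extended by reading the plant names separately and grouping on them. The following is only a sketch under assumptions: the sample text, the second plant name "PLANT B", and the variable names are made up for illustration and stand in for the real file, which is assumed to be comma-separated with the plant name in the first column.

```python
import io
import numpy as np

# Hypothetical sample standing in for the real CSV (only three hour columns).
csv_text = """PLANT,1.00hrs,2.00hrs,3.00hrs
AFAM IV - V,30.0,30.0,50.0
AFAM IV - V,30.0,20.0,50.0
PLANT B,10.0,,5.0
PLANT B,1.0,2.0,3.0
"""

# Plant names: read only the first column, as strings.
plants = np.genfromtxt(io.StringIO(csv_text), delimiter=",",
                       skip_header=1, usecols=(0,), dtype=str)

# Numeric part: parse everything as float, then drop the first column,
# which is all NaN because the names are not numbers.
values = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)[:, 1:]
values = np.nan_to_num(values, nan=0.0)  # treat missing readings as 0

# Total per hour over all rows; keepdims=True keeps the result 2-D, shape (1, n_hours).
hourly_total = values.sum(axis=0, keepdims=True)

# Total per hour for each plant: map every row to the index of its plant
# and accumulate the rows into one output row per unique plant.
names, inverse = np.unique(plants, return_inverse=True)
per_plant = np.zeros((names.size, values.shape[1]))
np.add.at(per_plant, inverse, values)

for name, totals in zip(names, per_plant):
    print(name, totals)
```

With a real file, replace the two io.StringIO(csv_text) arguments with the path to the CSV. np.add.at is used instead of per_plant[inverse] += values because the latter does not accumulate when the same index appears more than once.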

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Alessandro Bossi