Need to read CSV files in Python (when there are multiple input files)
I have a school assignment that is asking me to write a program that first reads in the name of an input file and then reads the file using the csv.reader() method. The file contains a list of words separated by commas. The program should output the words and their frequencies (the number of times each word appears in the file) without any duplicates. I have been able to figure out how to do this somewhat for one specific input file, but the program needs to be able to read multiple input files. This is what I have so far:
with open('input1.csv', 'r') as input1file:
    csv_reader = csv.reader(input1file, delimiter = ',')
    for row in csv_reader:
        new_row = set(row)
    for m in new_row:
        count = row.count(m)
        print(m, count)
This is what I get:
woman 1
man 2
Cat 1
Hello 1
boy 2
cat 2
dog 2
hey 2
hello 1
This (almost) works for the input1 file, except that the output order changes each time I run it. I also need it to work for two other input files.
sample CSV
hello,cat,man,hey,dog,boy,Hello,man,cat,woman,dog,Cat,hey,boy
Solution 1:[1]
See the code below for an example. I've commented it so you can see what it does and why.
The order is different on each run of your implementation because of the use of set. A set is by definition unordered.
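To illustrate the ordering point: a plain dict keeps its keys in insertion order (guaranteed from Python 3.7), so it can remove duplicates from a row without shuffling it. This is only a minimal sketch; the row literal is the sample data from the question, and unique_in_order is a name chosen for illustration.

```python
row = ["hello", "cat", "man", "hey", "dog", "boy", "Hello",
       "man", "cat", "woman", "dog", "Cat", "hey", "boy"]

# set(row) gives no ordering guarantee, and for strings the
# iteration order can differ from one run to the next.
# dict.fromkeys keeps the first occurrence of each word, in order.
unique_in_order = list(dict.fromkeys(row))

for word in unique_in_order:
    print(word, row.count(word))
```

This still calls count() once per unique word, so it fixes the ordering but not the repeated passes over the row.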
Also note that your implementation passes over each row more than once: once to turn it into a set, and again for every call to count(). Besides this, if the file contains more than one row, your logic fails, because the counting part is only reached after the last line of the file has been read, so only that last row gets counted.
import csv

def count_things(filename):
    with open(filename) as infile:
        csv_reader = csv.reader(infile, delimiter = ',')
        result = {}
        for row in csv_reader:
            # go over the row by element
            for element in row:
                # does it exist already?
                if element in result:
                    # if yes, increase count
                    result[element] += 1
                else:
                    # if no, add and set count to 1
                    result[element] = 1
    # sorting, explained in detail here:
    # https://stackoverflow.com/a/613218/9267296
    return {k: v for k, v in sorted(result.items(), key=lambda item: item[1], reverse=True)}
    # you could just return unsorted result by using:
    # return result

for key, value in count_things("input1.csv").items():
    # iterate over items() by key/value pairs
    # see this link:
    # https://www.w3schools.com/python/python_dictionaries_access.asp
    print(key, value)
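Since the question asks for a program that works with more than one input file, one option is to pass the file name in instead of hard-coding it. The following is a minimal sketch, assuming Python 3.7 or later. It swaps the manual dictionary for collections.Counter from the standard library. The names count_words and print_counts are hypothetical helpers, not part of the original answer.

```python
import csv
from collections import Counter


def count_words(filename):
    """Return a Counter of every word in a comma-separated file."""
    counts = Counter()
    # newline="" is what the csv module documentation recommends for open()
    with open(filename, newline="") as infile:
        for row in csv.reader(infile):
            # update() adds one to the count of each element in the row,
            # so counts accumulate across all rows of the file
            counts.update(row)
    return counts


def print_counts(filename):
    """Print each word and its frequency, in first-seen order."""
    for word, count in count_words(filename).items():
        print(word, count)
```

Calling print_counts(input()) reads the file name from the user, so the same program can be run against each input file in turn. Because Counter is a dict subclass, the words come out in the order they first appear in the file, which keeps the output stable between runs.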
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Edo Akse |