Split a large CSV file into multiple CSV files with VS Code
Is it possible to split a CSV file larger than 8 GB into multiple CSV files with VS Code?
I tried to split the file using the approach from https://mungingdata.com/python/split-csv-write-chunk-pandas/:
chunk_size = 2000000

def write_chunk(part, lines):
    with open('split_'+ str(part) +'.csv', 'w') as f_out:
        f_out.write(header)
        f_out.writelines(lines)

with open(filepath, "r", encoding="utf-8") as f:
    count = 0
    header = f.readline()
    lines = []
    for line in f:
        count += 1
        lines.append(line)
        if count % chunk_size == 0:
            write_chunk(count//chunk_size, lines)
            lines = []
    # write remainder
    if len(lines) > 0:
        write_chunk((count//chunk_size) + 1, lines)
However, when filepath points to the large CSV file, VS Code cannot read the file because it runs out of memory.
I am considering cleaning the data in SQL before exporting it as CSV, but I would still like to know how you would split a large CSV file.
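For reference, one possible way to keep memory use low is to copy each line straight to the current output file instead of collecting up to chunk_size lines in a list first. The following is only a sketch under assumptions: the function name split_csv, the out_dir parameter, and the split_N.csv naming are illustrative, and it assumes no quoted field contains an embedded newline (otherwise a row could be cut across two files).

```python
from pathlib import Path


def split_csv(filepath, out_dir, rows_per_file=2_000_000, encoding="utf-8"):
    """Stream `filepath` into numbered part files, one line at a time.

    Hypothetical helper: only the current line is held in memory.
    Assumes one CSV record per physical line (no embedded newlines).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    f_out = None
    # newline="" keeps the original line endings untouched on read and write
    with open(filepath, "r", encoding=encoding, newline="") as f_in:
        header = f_in.readline()
        try:
            for row_index, line in enumerate(f_in):
                # Start a new part file every `rows_per_file` data rows
                if row_index % rows_per_file == 0:
                    if f_out is not None:
                        f_out.close()
                    part_path = out_dir / f"split_{len(parts) + 1}.csv"
                    f_out = open(part_path, "w", encoding=encoding, newline="")
                    f_out.write(header)  # repeat the header in every part
                    parts.append(part_path)
                f_out.write(line)
        finally:
            if f_out is not None:
                f_out.close()
    return parts
```

A call such as split_csv(filepath, "parts") could be run from a terminal (for example the VS Code integrated terminal), so the large file never has to be opened in the editor itself.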
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
