'Deleting characters in csv cell
I am trying to delete characters in a .csv file that is before and follows a string of numbers.
| Description |
| ----------------------------------------------- |
| Case Number and Notes: 21-12311 check building |
| Case Number and Notes: 10-50 w/c , 21-10411 |
| Case Number and Notes: 21-10421 , Vehicle Maint |
| Case Number and Notes: Test |
| Case Number and Notes: 21-9693 , traffic Maint |
I need to delete all characters in this column that come before and after the 21-#####. Is this possible to do?
Eventually I'd also like to have a couple leading zeros after the dash to make the number the same length, but I'm still working on that.
Results should be
| 21-12311 |
| 21-10411 |
| 21-10421 |
| 21-9693 |
Solution 1:[1]
You can use a regular expression to extract possible case numbers. In the case of your 3rd row, you can take the last possible match per row.
For example:
import csv
import re
re_casenum = re.compile(r'\b(\d+-\d+)\b')
with open('input.csv') as f_input, open('output.csv', 'w', newline='') as f_output:
csv_input = csv.DictReader(f_input)
csv_output = csv.DictWriter(f_output, fieldnames=csv_input.fieldnames + ['Case Number'])
csv_output.writeheader()
for row in csv_input:
description = row['Description']
case_numbers = re_casenum.findall(description)
if case_numbers:
row['Case Number'] = case_numbers[-1]
csv_output.writerow(row)
This would give you a output.csv containing:
Description,Case Number
Case Number and Notes: 21-12311 check building,21-12311
"Case Number and Notes: 10-50 w/c , 21-10411",21-10411
"Case Number and Notes: 21-10421 , Vehicle Maint",21-10421
"Case Number and Notes: 21-9693 , traffic Maint",21-9693
This would add a new Case Number column to your CSV. Note, from your example it is not clear what delimiter is used for your CSV file, you might need to add suitable delimiter=';' type parameters. The quotes are added as there is a comma in the field.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Martin Evans |
