Rearrange txt to csv file using Python

I have data in a txt file in the following format:

Santosh kumar

+92 123 1234567

Voted For Voted 2 8 months ago

Doc...sapna

+92 123 1234567

Voted For Voted 2 8 months ago

Ramesh Dinani

+92 123 1234567

604PMO S BH: all & GD

Poll e)

Details Options Voters Settings Message

Mk we

+92 242342

Voted For Voted 4 8 months ago

+92 123 1234567

Voted For Voted 2 8 months ago

Nenoram Kolhi

+123 1234567

There are more rough lines of data between the numbers, like:

r SKL

+92 12323232

Voted For Voted

I need only the name and phone number, like:

Name,Number

Santosh kumar,+92 123 1234567

Nenoram Kolhi,+123 1234567

with all the rough data removed. My code is not working properly:

import csv

with open('File001.txt', 'r') as in_file:
    stripped = (line.strip() for line in in_file)
    lines = (line.split("+") for line in stripped if line)
    with open('log1.csv', 'w') as out_file:
        writer = csv.writer(out_file)
        writer.writerow(('title', 'intro'))
        writer.writerows(lines)

#########

import pandas as pd
read_file = pd.read_csv('log1.csv',header = None,delimiter = ',')
read_file.columns = ['Name','number']
read_file.to_csv('Final1.csv', index=None)


Solution 1:[1]

The use of regular expressions or "regex" should be of great help in your case.

For example, this piece of code looks for every phone number (a "+" followed by digits or spaces) and takes the line two lines above it (skipping the blank line in between) as the name:

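import re

# Read the whole file; the leading "\n" lets a name on the very first line match.
with open('File001.txt') as in_file:
    text = "\n" + in_file.read()

# A line (the name), a blank line, then "+" followed by digits or whitespace.
re_contact = re.compile(r"\n(.*?)\n\n(\+[\d\s]+?)\n")

for contact in re_contact.finditer(text):
    print("name=", contact.group(1))
    print("number=", contact.group(2))
    print()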

This gives:

name= Santosh kumar
number= +92 123 1234567

name= Doc...sapna
number= +92 123 1234567

name= Ramesh Dinani
number= +92 123 1234567

name= Mk we
number= +92 242342

name= Voted For Voted 4 8 months ago
number= +92 123 1234567

name= r SKL
number= +92 12323232

As you can see, one phone number in your data has no name clearly associated with it.
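
To go from these matches to the Name,Number file asked for in the question, one possible sketch is shown here. It assumes the file is laid out as in the question (a name line, a blank line, then the number alone on its line) and treats any captured "name" that starts with "Voted For" as rough data rather than a real name. The helper names extract_contacts and write_csv are hypothetical, and the pattern is a slightly stricter variant of the one above, not the original.

```python
import csv
import re

# A non-empty line (the name), a blank line, then "+" followed by digits and
# spaces up to the end of the line. [ \t]*$ tolerates trailing whitespace.
RE_CONTACT = re.compile(r"^(.+)\n\n(\+[\d ]+?)[ \t]*$", re.MULTILINE)


def extract_contacts(text):
    """Return (name, number) pairs, skipping numbers with no usable name."""
    contacts = []
    for match in RE_CONTACT.finditer(text):
        name = match.group(1).strip()
        number = match.group(2).strip()
        # Assumption: a "name" that is really a poll status line is rough data.
        if name.startswith("Voted For"):
            continue
        contacts.append((name, number))
    return contacts


def write_csv(contacts, path):
    """Write the pairs to a CSV file with a Name,Number header."""
    with open(path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        writer.writerow(("Name", "Number"))
        writer.writerows(contacts)


# Usage (file names taken from the question):
# with open("File001.txt") as in_file:
#     write_csv(extract_contacts(in_file.read()), "Final1.csv")
```

This filter only drops the "Voted For" lines. Other rough lines that happen to sit right above a number, such as "r SKL", would still be written as names and would need extra rules of your own.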

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 manu190466