Python - Edit lines in a list

I have read my text file, which has about 10 lines, into a list. Now I want to cut off the first part of each line.

To be precise, the first 5 words should be cut off.

How exactly do I have to do this?

Edit:

This is how I read in my text file:

with open("test.txt", "r") as file:
    lines = []  # avoid the name "list", which shadows the built-in type
    for line in file:
        lines.append(line.strip())
    print(lines)

If I only have one line, this works for me:

new_line = " ".join(line.split(" ")[5:])  # here "line" is a single string
print(new_line)

But how can I do this with a list (10 lines)?



Solution 1:[1]

Python strings have a split() method that splits a string on a delimiter and returns a list of the pieces. Called without arguments, it splits on any run of whitespace. To cut the first 5 words, you can either keep only the items from index 5 onward (a slice such as words[5:]), or delete the items at indexes 0-4 from the list.
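
As a minimal sketch of the slicing approach described above, applied to every line of a list: the sample lines and the helper name drop_first_words are made up for illustration and are not part of the original answer.

```python
def drop_first_words(line, count=5):
    """Return the line without its first `count` whitespace-separated words."""
    # split() with no arguments splits on any run of whitespace
    words = line.split()
    # Slicing past the end is safe: short lines simply give an empty list
    return " ".join(words[count:])


# Hypothetical sample data standing in for the lines read from the file
lines = [
    "one two three four five six seven",
    "a b c d e f",
    "too short",
]

new_lines = [drop_first_words(line) for line in lines]
print(new_lines)
```

Note that joining with a single space collapses any repeated spaces or tabs that were in the remaining part of the line.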

Solution 2:[2]

Perhaps something along the lines of:

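text = []
with open('input.txt') as f:
    for line in f:  # a file object can be iterated line by line
        words = line.split()  # split on any run of whitespace
        text.append(words[5:])  # each entry is a list of the remaining words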

Obviously you should add error checking here (for example, for a missing file or for lines with fewer than five words), but the gist should be there.
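
One way such error checking might look, as a sketch only: the function name read_without_prefix, the temporary demo file, and the choice to return an empty list when the file is missing are all assumptions, not part of the original answer. It uses split(maxsplit=...) so that the remainder of each line keeps its original spacing.

```python
import os
import tempfile


def read_without_prefix(path, count=5):
    """Read a file and drop the first `count` words of every line."""
    result = []
    try:
        with open(path) as f:
            for line in f:
                # At most count + 1 parts: `count` words plus the remainder
                parts = line.strip().split(maxsplit=count)
                # Lines with `count` words or fewer have no remainder
                result.append(parts[count] if len(parts) > count else "")
    except FileNotFoundError:
        # Assumed policy for this sketch: report and return what we have
        print(f"File not found: {path}")
    return result


# Demo with a temporary file so the sketch runs on its own
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "input.txt")
    with open(demo_path, "w") as f:
        f.write("one two three four five six  seven\nshort line\n")
    print(read_without_prefix(demo_path))
```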

Solution 3:[3]

If you want to remove the first 5 words from every line in the file, you don't have to read it line by line and then split it.

You can read the whole file, then use re.sub to remove the first 5 words of each line, together with the spaces around them.

import re

pattern = r"(?m)^[^\S\n]*(?:\S+[^\S\n]+){4}\S+[^\S\n]*"
with open('test.txt') as f:
    print(re.sub(pattern, "", f.read()))

The pattern matches:

  • (?m) Inline modifier for multiline
  • ^ Start of a line (the start of the string or right after a newline, because of the multiline flag)
  • [^\S\n]* Match optional leading spaces without newlines
  • (?:\S+[^\S\n]+){4} Repeat 4 times: match 1+ non whitespace chars followed by 1+ whitespace chars other than a newline
  • \S+ Match 1+ non whitespace chars
  • [^\S\n]* Match optional trailing spaces without a newline
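
To see the pattern at work without a file, here is a small sketch that applies it to an in-memory string; the sample text is made up for illustration.

```python
import re

pattern = r"(?m)^[^\S\n]*(?:\S+[^\S\n]+){4}\S+[^\S\n]*"

# Hypothetical sample text: two lines with six words, one with only two
sample = "one two three four five six seven\n  a b c d e f\ntoo short\n"

result = re.sub(pattern, "", sample)
print(result)
```

Lines with fewer than five words do not match the pattern, so they are left unchanged rather than emptied. This differs from slicing the result of split(), which yields an empty result for such lines.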

See a regex101 demo for the matches.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 tobydank
Solution 2 Andrej Prsa
Solution 3 The fourth bird