Print the first word of each sentence in a text using Python

How can I ignore text inside parentheses ()? In the example below, the fragments directions) and right) must not be printed.

Example:

Text = "A paragraph is a self-contained unit of discourse in writing dealing with a particular point or idea. A paragraph consists of one or more sentences. Though not required by the syntax of any language, paragraphs are usually an expected part of formal writing, used to organize longer prose.The oldest classical British and Latin writing had little or no space between words and could be written in boustrophedon (alternating. directions). Over time, text direction (left to. right) became standardized, and word dividers and terminal punctuation became common."

Code I used:

for x in Text.split('. '):
    y=x.split(" ")
    print(y[0])

Output for this code:

A
A
Though
directions)
Over
right)


Solution 1:[1]

You should use the re module that comes with Python. You want to substitute all instances of '(<any character(s)>)' with the empty string ''.

Try the following:

import re
text1 = re.sub(r'\(.*?\)', '', Text)

This generates a text with every parenthesised group removed. The lazy quantifier .*? matters here: a greedy .* would match from the first ( to the last ) and would also delete the sentence boundary that sits between the two groups. The output of the above will be:

'A paragraph is a self-contained unit of discourse in writing dealing with a particular point or idea. A paragraph consists of one or more sentences. Though not required by the syntax of any language, paragraphs are usually an expected part of formal writing, used to organize longer prose.The oldest classical British and Latin writing had little or no space between words and could be written in boustrophedon . Over time, text direction  became standardized, and word dividers and terminal punctuation became common.'
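
For reference, here is a minimal sketch of how a greedy .* and a lazy .*? behave on a string with two parenthesised groups. The shortened sample string and the variable names are illustrative assumptions, not part of the original answer.

```python
import re

# Shortened excerpt of the question's text (assumption: used only for illustration).
sample = ("written in boustrophedon (alternating. directions). "
          "Over time, text direction (left to. right) became standardized.")

# Greedy: .* runs from the first "(" to the last ")", swallowing the text in between.
greedy = re.sub(r"\(.*\)", "", sample)

# Lazy: .*? stops at the nearest ")", so each group is removed separately.
lazy = re.sub(r"\(.*?\)", "", sample)

print(greedy)
print(lazy)
```

Neither pattern handles nested parentheses, so text such as (a (b) c) would need a different approach.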

Solution 2:[2]

Try this out:

import re
Text = "Your Text here"
Text = re.sub(r"\(.*?\)", "", Text)  # remove each parenthesised group (lazy match)
for x in Text.split('.'):   # split on periods
    x = x.strip()    # strip surrounding whitespace
    if not x:        # skip empty pieces, e.g. after the final period
        continue
    y = x.split(" ")
    print(y[0])
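
The same idea can be wrapped in a small function that returns the words instead of printing them. This is only a sketch: the name first_words and the sample string are hypothetical, and splitting on every period is a naive sentence rule that will also split abbreviations such as "e.g.".

```python
import re

def first_words(text):
    """Return the first word of each sentence, ignoring parenthesised text."""
    # Drop every "(...)" group; the lazy .*? keeps separate groups separate.
    cleaned = re.sub(r"\(.*?\)", "", text)
    words = []
    # Naive sentence split on periods (assumption: no abbreviations or decimals).
    for sentence in cleaned.split("."):
        tokens = sentence.split()  # no argument: collapses runs of whitespace
        if tokens:                 # skip empty pieces, e.g. after the final period
            words.append(tokens[0])
    return words

sample = "One idea (see. note). Two ideas (left to. right) follow. Three."
print(first_words(sample))  # ['One', 'Two', 'Three']
```

Returning a list keeps the function easy to test and lets the caller decide how to print or join the result.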

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ssm
Solution 2 Jake Korman