How to extract numbers between square brackets?
I have a text file containing text like
index of cluster is 18585 points index are [18585, 14290, 18503, 7220, 6835, 10009,6615, 1269, 14161, 26545, 18140, 9292, 20355, 16401, 7713, 582, 1865, 17247, 26256, 19034, 7282, 1847, 19293, 16944, 27748, 29312,.... ]
index of the cluster is 3014 points index are [ ....] and so on ..
I need to extract the numbers between "[" and "]" for every cluster in the file. I tried checking whether a line contains "[" and then getting the numbers, but it didn't work correctly:
import os
f = open("cluster.txt","r")
for line in f.readlines():
    if "[" in line:
        print("true")
Solution 1:[1]
For each line in the file you can use a regular expression to identify the data within the brackets. Then you can split the resulting string and use a list comprehension (or a map as shown here) to give you a list of all the numbers.
For example:
import re
line = '''index of cluster is 18585 points index are [18585, 14290, 18503, 7220, 6835, 10009,6615, 1269, 14161, 26545, 18140, 9292, 20355, 16401, 7713, 582, 1865, 17247, 26256, 19034, 7282, 1847, 19293, 16944, 27748, 29312]'''
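# raw string so the backslashes reach the regex engine unchanged
a = re.findall(r'\[(.*?)\]', line)
if a:
    # a[0] is the text inside the first pair of brackets
    nums = list(map(int, a[0].split(',')))
    print(nums)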
Output:
[18585, 14290, 18503, 7220, 6835, 10009, 6615, 1269, 14161, 26545, 18140, 9292, 20355, 16401, 7713, 582, 1865, 17247, 26256, 19034, 7282, 1847, 19293, 16944, 27748, 29312]
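The same pattern can be applied line by line to collect one list per cluster. This is a minimal sketch, assuming each list sits on a single line and contains only non-negative integers. The helper name numbers_in_brackets and the sample lines are made up for illustration:

```python
import re

BRACKETS = re.compile(r'\[(.*?)\]')

def numbers_in_brackets(line):
    """Return the integers inside the first [...] on the line, or [] if none."""
    match = BRACKETS.search(line)
    if match is None:
        return []
    # \d+ matches each run of digits, so uneven spacing around commas is fine
    # (assumption: no negative numbers, since a leading '-' would be dropped)
    return [int(n) for n in re.findall(r'\d+', match.group(1))]

# Illustrative sample lines; with a real file, iterate over open("cluster.txt") instead
lines = [
    "index of cluster is 18585 points index are [18585, 14290, 10009,6615]",
    "a line without any list",
    "index of the cluster is 3014 points index are [3014, 7, 42]",
]
clusters = [nums for nums in map(numbers_in_brackets, lines) if nums]
print(clusters)  # [[18585, 14290, 10009, 6615], [3014, 7, 42]]
```

Lines without brackets yield an empty list and are filtered out, so they do not need special handling in the loop.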
Solution 2:[2]
You can do something like this:
with open("cluster.txt", "r") as f:
    for line in f:
        # keep only the text between the first '[' and the following ']'
        numbers_only = line.split('[')[1].split(']')[0]
        list_of_number_strings = numbers_only.split(',')
        list_of_numbers = [int(number) for number in list_of_number_strings]
With this, list_of_numbers holds the numbers from the current line, converted to integers. The code first splits the line to keep only the part between [ and ], then splits that part on commas and converts each piece to an integer. This assumes that every line contains a list; a line without "[" raises an IndexError, and an empty list such as "[ ]" raises a ValueError. If some lines have a different format, you need to add extra logic for those cases.
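One possible way to add that extra logic is to skip lines that lack a bracketed list and to ignore empty entries. This is only a sketch: the function name parse_line is hypothetical, and the sample lines stand in for the contents of the file:

```python
def parse_line(line):
    """Return the integers between the first '[' and the next ']', or None."""
    # partition never raises: if the separator is missing, the middle item is ''
    _, found_open, rest = line.partition('[')
    inside, found_close, _ = rest.partition(']')
    if not found_open or not found_close:
        return None  # this line has no complete [...] section
    # int() ignores surrounding whitespace; blank pieces (e.g. from "[ ]") are skipped
    return [int(piece) for piece in inside.split(',') if piece.strip()]

sample = [
    "index of cluster is 18585 points index are [18585, 14290,6615]",
    "some other line",
    "index of the cluster is 3014 points index are [ ]",
]
for line in sample:
    print(parse_line(line))
# [18585, 14290, 6615]
# None
# []
```

Returning None for lines without a list keeps them distinguishable from a cluster whose list is simply empty.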
Solution 3:[3]
You could alternatively do the following:
lst = []
with open("cluster.txt", "r") as f:
    for line in f:
        # text between '[' and ']' split on commas, each piece converted to int
        lst += list(map(int, line.split("[")[1].split("]")[0].split(",")))
print(lst)
lst accumulates the numbers from every line of the file into one flat list. map converts the extracted strings into integers; its result is turned into a list and appended to lst with +=.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Xenty |
| Solution 3 | DharmanBot |
