Python: Split string into substrings of a certain length BUT don't split a word [duplicate]
I have some long strings that I would like to split up into substrings of a certain length.
The catch is that the long strings are paragraphs, and I would not like to break up words.
I have this code so far, which does split up words:
my_string = 'this is a test sentence'
n=10
chunks = [my_string[i:i+n] for i in range(0, len(my_string), n)]
print(chunks)
result:
['this is a ', 'test sente', 'nce']
I would like the result to be ['this is a ', 'test ', 'sentence'] (i.e., if a cut would fall inside a word, cut before that word instead).
(In the real script the chunks would be 200 characters long, so no word would ever be longer than the chunk size.)
Any ideas? This is a really tricky one!!
Solution 1:[1]
Try the following:
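my_string = 'this is a test sentence'
n = 10

def chunk(s, n):
    i = 0
    result = []
    while True:
        start = i
        i += n
        # At end of string: the remainder is the last chunk
        if i >= len(s):
            result.append(s[start:i])
            return result
        # Back up until the chunk ends just after a space
        while i > start and s[i - 1] != ' ':
            i -= 1
        # No space in this window (word longer than n): fall back to a hard cut
        if i == start:
            i = start + n
        result.append(s[start:i])

print(chunk(my_string, n))
# ['this is a ', 'test ', 'sentence']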
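As a possible alternative to the hand-written loop above, the standard library's textwrap module does word-aware splitting too. The sketch below is not part of the original answer; the helper name wrap_chunks and its keep_spaces parameter are hypothetical and exist only for illustration. Note that textwrap.wrap strips the spaces at chunk edges by default, so it only reproduces the exact output asked for in the question when drop_whitespace=False is passed.

```python
import textwrap


def wrap_chunks(s, n, keep_spaces=False):
    """Split s into pieces of at most n characters without cutting words.

    Hypothetical helper (not from the original answer) built on
    textwrap.wrap from the standard library.
    """
    # drop_whitespace=True (the default) removes spaces at chunk edges;
    # passing False keeps them, so chunks retain their trailing space.
    return textwrap.wrap(s, width=n, drop_whitespace=not keep_spaces)


print(wrap_chunks('this is a test sentence', 10))
# ['this is a', 'test', 'sentence']
print(wrap_chunks('this is a test sentence', 10, keep_spaces=True))
# ['this is a ', 'test ', 'sentence']
```

Be aware that textwrap.wrap, with its default settings, also converts newlines and tabs to spaces, may break after hyphens, and splits words that are longer than the width, so the chunks may not join back into the original paragraph byte for byte.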
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | iz_ |
