Splitting a dot-delimited string into words, but with a special case

Not sure if there is an easy way to split the following string:

'school.department.classes[cost=15.00].name'

Into this:

['school', 'department', 'classes[cost=15.00]', 'name']

Note: I want to keep 'classes[cost=15.00]' intact.



Solution 1:[1]

Skip dots within brackets:

import re

s = 'school.department.classes[cost=15.00].name'
# split on dots that are not inside [...]
print(re.split(r'[.](?![^][]*\])', s))

Output:

['school', 'department', 'classes[cost=15.00]', 'name']
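
The lookahead `(?![^][]*\])` rejects any dot that is followed by a `]` with no bracket in between, which is to say a dot sitting inside a bracket pair. For comparison, the same idea can be sketched without a regex by tracking bracket depth. The function name `split_outside_brackets` is made up for this illustration, and the sketch assumes the brackets are balanced:

```python
def split_outside_brackets(text, sep='.'):
    """Split text on sep, ignoring any sep that appears inside [...]."""
    parts = []
    current = []
    depth = 0  # number of '[' currently open
    for ch in text:
        if ch == '[':
            depth += 1
        elif ch == ']':
            depth -= 1
        if ch == sep and depth == 0:
            # separator outside brackets: close the current piece
            parts.append(''.join(current))
            current = []
        else:
            current.append(ch)
    parts.append(''.join(current))
    return parts


s = 'school.department.classes[cost=15.00].name'
print(split_outside_brackets(s))
```

Because it counts depth, this version also copes with nested brackets such as 'a[b.c[d]].e', which the lookahead does not handle in every case.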

Solution 2:[2]

This could get messy in a hurry; you may need to actually parse this string instead of just splitting it up:

from pyparsing import (Forward,Suppress,Word,alphas,quotedString,
                        alphanums,Regex,oneOf,Group,delimitedList)


# define some basic punctuation, numerics, operators
LBRACK,RBRACK = map(Suppress, '[]')
ident = Word(alphas+'_',alphanums+'_')
real = Regex(r'[+-]?\d+\.\d*').setParseAction(lambda t:float(t[0]))
integer = Regex(r'[+-]?\d+').setParseAction(lambda t:int(t[0]))
compOper = oneOf('= != < > <= >=')

# a full reference may be composed of full references, i.e., a recursive
# grammar - forward declare a full reference
fullRef = Forward()

# a value in a filtering expression could be a full ref or numeric literal
value = fullRef | real | integer | quotedString
filterExpr = Group(value + compOper + value)

# a single dotted ref could be one with a bracketed filter expression
# (which we would want to keep together in a group) or just a plain identifier
ref = Group(ident + LBRACK + filterExpr + RBRACK) | ident

# now insert the definition of a fullRef, using '<<' instead of '='
fullRef << delimitedList(ref, '.')

# try it out
s = 'school.department.classes[cost=15.00].name'
print(fullRef.parseString(s))
s = 'school[size > 10000].department[school.type="TECHNICAL"].classes[cost=15.00].name'
print(fullRef.parseString(s))

Prints:

['school', 'department', ['classes', ['cost', '=', 15.0]], 'name']
[['school', ['size', '>', 10000]], ['department', ['school', 'type', '=', '"TECHNICAL"']], ['classes', ['cost', '=', 15.0]], 'name']

(It isn't difficult to put "classes[cost=15.00]" back together if you need to.)
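
As one possible way to do that, the nested result can be walked and each grouped reference rejoined. This is only a sketch: `rejoin` is a hypothetical helper, it works on a plain nested list (for instance what `asList()` returns for the first example) rather than on pyparsing objects, and it assumes each group has the shape `[name, [filter tokens...]]`. Because the parse action already converted `15.00` to a float, the original number formatting is not recovered.

```python
def rejoin(parsed):
    """Turn [name, [tok, tok, ...]] groups back into 'name[toktok...]' strings."""
    result = []
    for item in parsed:
        if isinstance(item, list):
            # grouped reference: identifier plus its bracketed filter tokens
            name, filter_expr = item
            joined = ''.join(str(tok) for tok in filter_expr)
            result.append('%s[%s]' % (name, joined))
        else:
            # plain identifier: keep as-is
            result.append(item)
    return result


parsed = ['school', 'department', ['classes', ['cost', '=', 15.0]], 'name']
print(rejoin(parsed))  # ['school', 'department', 'classes[cost=15.0]', 'name']
```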

Solution 3:[3]

The simplest way to split the string is to use .split('.'), as shown below:

s = 'school.department.classes[cost=15.00].name'

s.split('.')

Note that this does not give the expected output, because it also splits on the dot inside the brackets:

['school', 'department', 'classes[cost=15', '00]', 'name']
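
Since a plain split on '.' also breaks 'classes[cost=15.00]' in two, one alternative is to match the pieces instead of splitting on the separator. The pattern below is an illustration rather than part of the original answer, and it assumes brackets are not nested:

```python
import re

s = 'school.department.classes[cost=15.00].name'

# A piece is a run of either: a character that is not '.', '[' or ']',
# or a whole [...] group (which may contain dots).
piece = re.compile(r'(?:[^.\[\]]|\[[^\]]*\])+')
parts = piece.findall(s)
print(parts)  # ['school', 'department', 'classes[cost=15.00]', 'name']
```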

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3