How to separate tokens (parentheses, colons, etc.) when a scanner is scanning a file?
I have written a scanner in Python, a lexical analyzer that classifies tokens using dictionaries of patterns. A file is passed as a command-line argument, and the scanner reads it and prints each token. The problem is that when there is no space between two tokens, they are treated as a single string, which is obviously not what I want. Is there a way to write a single rule that handles the missing whitespace, inserting a separation and then continuing, or will it take a separate conditional for each token? Here is an example of the command-line output that shows the problem:
Line # 2 - Program: hello_world
Token: Program:, token, 3004
Token: hello_world, token, 3004
----------------------------------------
Line # 3 - Author: example
Token: Author:, token, 3004
Token: example, token, 3004
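The usual way to get that separation in Python is to split on the punctuation itself rather than only on whitespace, keeping the punctuation in the result. re.split with a capturing group does exactly that. Below is a minimal sketch, not the scanner's own code: the character class only covers ':', '(', ')' and ',' as an illustration, and split_line is a made-up helper name.

    import re

    # re.split keeps the text captured by the group, so punctuation comes back
    # as its own piece even when it is glued to a word.
    def split_line(line):
        parts = re.split(r'([():,])', line)
        # ''.split() == [], so empty pieces disappear and leftover whitespace
        # around words is trimmed at the same time.
        return [piece for part in parts for piece in part.split()]

    print(split_line("Program: hello_world"))   # ['Program', ':', 'hello_world']

The relevant part of the current scanner, which only splits on whitespace, is shown below.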
# Implement split() for separation
for word in line.split():
    # Implementation of block comment encounter
    # If encountered, skip
    if tokenize(word).value == 3006:
        self.insideComment = True
        return ""
    if tokenize(word).value == 3007:
        self.insideComment = False
        return ""
    if tokenize(word).value == 3008:
        return (tokenizedLine)
    if tokenize(word).value == 3014:
        self.insideComment = False
        return ""
    # If program encounters ':' then white space
    if tokenize(word).value == 3013:
        return...
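To keep this word-by-word loop, the same splitting idea can be applied one level earlier: expand each whitespace-separated word into sub-pieces before handing it to tokenize, instead of adding a conditional per punctuation code. A rough sketch, where split_punctuation is a hypothetical helper and tokenize / insideComment refer to the existing code above:

    import re

    def split_punctuation(word):
        # ':', '(', ')' and ',' become separate pieces; empty strings are dropped
        return [piece for piece in re.split(r'([():,])', word) if piece]

    # inside the existing scanner method:
    # for word in line.split():
    #     for piece in split_punctuation(word):
    #         token = tokenize(piece)   # existing helper from the question
    #         ...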
# Special characters, other tokens
"otherTokens": {
    '\"': 3001,
    # cannot start with number
    '\[([a-zA-Z]|([\-]?[0-9]?.[0-9]))+\]': 3002,
    '[\-]?[0-9]': 3003,
    '[a-zA-Z]+': 3004,
    '\(': 3005,
    '\)': 3006,
    '\/\*': 3007,
    '\*\/': 3008,
    '\/\/': 3009,
    '\[\]': 3010,
    '\,': 3011,
    '\s+': 3012,
    ':': 3013,  # colon
}
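Since the question asks for a single rule rather than a pile of conditionals, another option is to compile all of the patterns above into one alternation and let re.finditer walk the line; the regex engine then matches punctuation even with no surrounding whitespace, so no pre-splitting is needed. The following is a rough sketch, not the scanner's actual code: it assumes the token codes from the dictionary above, simplifies a couple of the patterns, and silently skips any character that no pattern covers. Order matters, so longer patterns such as '/*' are listed before shorter ones.

    import re

    # (pattern, token code) pairs, most specific first
    TOKEN_PATTERNS = [
        (r'/\*', 3007),                     # block comment start
        (r'\*/', 3008),                     # block comment end
        (r'//', 3009),                      # line comment
        (r'[A-Za-z_][A-Za-z0-9_]*', 3004),  # identifier (simplified)
        (r'-?[0-9]+', 3003),                # number (simplified)
        (r'\(', 3005),
        (r'\)', 3006),
        (r',', 3011),
        (r':', 3013),
        (r'\s+', 3012),                     # whitespace, skipped below
    ]

    MASTER_RE = re.compile('|'.join('(%s)' % p for p, _ in TOKEN_PATTERNS))

    def tokenize_line(line):
        # Yield (lexeme, code) pairs; lastindex tells us which alternative matched.
        for match in MASTER_RE.finditer(line):
            code = TOKEN_PATTERNS[match.lastindex - 1][1]
            if code == 3012:    # drop whitespace tokens
                continue
            yield match.group(0), code

    for lexeme, code in tokenize_line("Program: hello_world"):
        print("Token:", lexeme, code)
    # Token: Program 3004
    # Token: : 3013
    # Token: hello_world 3004

With this approach the scanner no longer needs line.split() at all: the one compiled expression handles both separation and classification in a single pass over each line.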
Sources
Source: Stack Overflow, licensed under CC BY-SA 3.0.
