Regex to get Java import statements

I am writing a Java program to read other Java source files and pull out their import statements:

package com.me.myapp;

import blah.example.dog.client.Fizz;
import blah.example.cat.whiskers.client.Buzz;
import blah.example.shared.Foo;
import blah.example.server.Bar;
...etc.

I want the regex to match any import that starts with import blah.example. and has client somewhere later in the package name. So the regex should pick up Fizz and Buzz in the example above, but not Foo or Bar.

My best attempt is:

String regex = "import blah.example*client*";
if(someString.matches(regex))
    // Do something

This regex isn't throwing an exception, but it isn't working. Where am I going wrong with it? Thanks in advance!



Solution 1:[1]

You can try ^import blah[.]example[.](\\w+[.])*client[.]\\w+;$ with the MULTILINE flag, which makes ^ and $ also match at the start and end of each line.

Here is a demo:

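import java.util.regex.Matcher;
import java.util.regex.Pattern;

String data = "package com.me.myapp;\n\nimport blah.example.dog.client.Fizz;\nimport blah.example.cat.whiskers.client.Buzz;\nimport blah.example.shared.Foo;\nimport blah.example.server.Bar;";

// MULTILINE lets ^ and $ anchor at every line, not just at the ends of the input
Pattern p = Pattern.compile(
        "^import blah[.]example[.](\\w+[.])*client[.]\\w+;$",
        Pattern.MULTILINE);
Matcher m = p.matcher(data);
// find() walks through the input and stops at each matching line
while (m.find()) {
    System.out.println(m.group());
}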

Output

import blah.example.dog.client.Fizz;
import blah.example.cat.whiskers.client.Buzz;

You can also use a similar regex, without the ^ and $ anchors, to check each line with matches():

import java.util.Scanner;

String data = "package com.me.myapp;\n\nimport blah.example.dog.client.Fizz;\nimport blah.example.cat.whiskers.client.Buzz;\nimport blah.example.shared.Foo;\nimport blah.example.server.Bar;";

Scanner scanner = new Scanner(data);
while (scanner.hasNextLine()) {
    String line = scanner.nextLine();
    // matches() must cover the whole line, so no ^ or $ is needed
    if (line.matches("import blah[.]example[.](\\w+[.])*client[.]\\w+;")) {
        System.out.println(line);
    }
}
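
As for where the original attempt goes wrong: in a Java regex, * is not a wildcard but a quantifier that repeats the preceding element, and an unescaped . matches any single character. On top of that, matches() must cover the entire string. The sketch below compares the pattern from the question with a glob-style translation; the class name OriginalAttemptCheck and the GLOB_STYLE pattern are illustrative assumptions, not part of the answer above.

```java
import java.util.regex.Pattern;

class OriginalAttemptCheck {
    // Pattern from the question: "e*" means zero or more 'e', "t*" means
    // zero or more 't', and each unescaped '.' matches any one character.
    static final Pattern ORIGINAL =
            Pattern.compile("import blah.example*client*");

    // Hypothetical literal translation of the intent:
    // "\\." is a literal dot and ".*" is "any run of characters".
    static final Pattern GLOB_STYLE =
            Pattern.compile("import blah\\.example\\..*client.*");

    static boolean original(String line) {
        // Equivalent to line.matches(...): the whole line must match.
        return ORIGINAL.matcher(line).matches();
    }

    static boolean globStyle(String line) {
        return GLOB_STYLE.matcher(line).matches();
    }

    public static void main(String[] args) {
        String fizz = "import blah.example.dog.client.Fizz;";
        System.out.println(original(fizz));                                // false
        System.out.println(original("import blah.exampleclient"));         // true
        System.out.println(globStyle(fizz));                               // true
        System.out.println(globStyle("import blah.example.shared.Foo;"));  // false
    }
}
```

The glob-style version is looser than the pattern in this answer: it also accepts names that merely contain the text client, such as import blah.example.clientele.Foo;, which is why spelling out the dot-separated segments is the safer choice.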

Solution 2:[2]

Assuming that someString is one of the lines from the Java source code:

Java String

"import\\s+blah\\.example(?:\\.\\w+)*\\.client(?:\\.\\*|(?:\\.\\w+)*);"

Regex

import\s+blah\.example(?:\.\w+)*\.client(?:\.\*|(?:\.\w+)*);
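
To see what this pattern accepts, here is a minimal sketch that runs it against a few sample lines. The class and method names (ClientImportMatcher, isClientImport) and the sample lines with extra whitespace, a wildcard, and a sub-package are assumptions added for illustration.

```java
import java.util.regex.Pattern;

class ClientImportMatcher {
    // Pattern copied from the answer above, compiled once for reuse.
    static final Pattern CLIENT_IMPORT = Pattern.compile(
            "import\\s+blah\\.example(?:\\.\\w+)*\\.client(?:\\.\\*|(?:\\.\\w+)*);");

    // True when the whole line is a matching import statement.
    static boolean isClientImport(String line) {
        return CLIENT_IMPORT.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] lines = {
            "import blah.example.dog.client.Fizz;",      // true
            "import   blah.example.client.*;",           // true: wildcard import
            "import blah.example.dog.client.sub.Fizz;",  // true: sub-package of client
            "import blah.example.shared.Foo;",           // false
            "import blah.example.server.Bar;"            // false
        };
        for (String line : lines) {
            System.out.println(isClientImport(line) + "  " + line);
        }
    }
}
```

Compared with Solution 1, this pattern tolerates extra whitespace after import, accepts wildcard imports, and accepts classes in sub-packages below client.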

Solution 3:[3]

Treating source files as plain text can be problematic.

I would try one of the following approaches instead:

* using the javac processor framework to integrate your matcher into the compiler
* using the ASM library
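
As a rough sketch of the compiler-based idea, the code below does not write an annotation processor; it swaps in the JDK's Compiler Tree API (com.sun.source), which only parses the source and lists its import declarations. It assumes a full JDK (not just a JRE) at run time. The names ImportLister, StringSource, listImports and clientImports are hypothetical, and the final filter is a simple string check standing in for whatever matcher you prefer.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.ImportTree;
import com.sun.source.util.JavacTask;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

class ImportLister {
    // In-memory source file, so the sketch needs no files on disk.
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String name, String code) {
            super(URI.create("string:///" + name), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Parses the source (no type resolution) and returns every imported name.
    static List<String> listImports(String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler: run on a JDK");
        }
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null,
                List.of(new StringSource("Sample.java", source)));
        List<String> imports = new ArrayList<>();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                for (ImportTree imp : unit.getImports()) {
                    imports.add(imp.getQualifiedIdentifier().toString());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return imports;
    }

    // Assumed filter: under blah.example and with a package segment "client".
    static List<String> clientImports(String source) {
        List<String> result = new ArrayList<>();
        for (String name : listImports(source)) {
            if (name.startsWith("blah.example.") && name.contains(".client.")) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "package com.me.myapp;\n"
                + "/* import blah.example.dog.client.Fizz; */\n"
                + "import blah.example.cat.whiskers.client.Buzz; import blah.example.shared.Foo;\n"
                + "class Sample {}\n";
        System.out.println(clientImports(source));
    }
}
```

Because the parser understands comments and statement boundaries, commented-out imports are skipped and several imports on one line are still found.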

Solution 4:[4]

A regex may parse the source incorrectly, e.g. with commented-out imports:

/*
import blah.example.dog.client.Fizz;
import blah.example.cat.whiskers.client.Buzz;
*/

or with code that is not formatted one statement per line:

import blah.example.dog.client.Fizz; import blah.example.cat.whiskers.client.Buzz;
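
Both cases can be reproduced with the line-anchored pattern from Solution 1. The following is a small sketch; the class name RegexPitfallDemo and the helper countMatches are made up for the demonstration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexPitfallDemo {
    // Line-anchored pattern taken from Solution 1.
    static final Pattern LINE_PATTERN = Pattern.compile(
            "^import blah[.]example[.](\\w+[.])*client[.]\\w+;$",
            Pattern.MULTILINE);

    // Counts how many lines of the given source the pattern reports.
    static int countMatches(String source) {
        Matcher m = LINE_PATTERN.matcher(source);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String commented = "/*\nimport blah.example.dog.client.Fizz;\n*/";
        String oneLine = "import blah.example.dog.client.Fizz; "
                + "import blah.example.cat.whiskers.client.Buzz;";

        // 1: the commented-out import is reported as if it were live
        System.out.println(countMatches(commented));
        // 0: both real imports are missed because they share a line
        System.out.println(countMatches(oneLine));
    }
}
```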

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3 Zoltán Haindrich
Solution 4 Evgeniy Dorofeev