Which parsers are available for parsing C# code? [closed]

Which parsers are available for parsing C# code?

I'm looking for a C# parser that can be used from C# and that gives me access to line and file information for each artefact of the analysed code.



Solution 1:[1]

Works on source code:

Works on assembly:

The problem with assembly "parsing" is that less line and file information is available: it comes from the .pdb file, and a PDB contains line information only for methods.

I personally recommend Mono.Cecil and NRefactory.

Solution 2:[2]

Mono (open source) includes a C# compiler (and, of course, a parser).

Solution 3:[3]

If you are going to compile C# v3.5 to .NET assemblies:

    var cp = new Microsoft.CSharp.CSharpCodeProvider(new Dictionary<string, string>() { { "CompilerVersion", "v3.5" } });

http://msdn.microsoft.com/en-us/library/microsoft.csharp.csharpcodeprovider.aspx

Solution 4:[4]

If you're familiar with ANTLR, you can use the ANTLR C# grammar.

Solution 5:[5]

I've implemented just what you are asking (AST Parsing of C# code) at the OWASP O2 Platform project using SharpDevelop AST APIs.

To make it easier to consume, I wrote a quick API that exposes a number of key source code elements (using statements, types, methods, properties, fields, comments) and is able to rewrite the original C# code into C# and into VB.NET.

You can see this API in action on this O2 XRule script file: ascx_View_SourceCode_AST.cs.o2 .

For example, this is how you process C# source code text and populate a number of TreeViews and TextBoxes:

    public void updateView(string sourceCode)
    {   
        var ast = new Ast_CSharp(sourceCode);
        ast_TreeView.show_Ast(ast);
        types_TreeView.show_List(ast.astDetails.Types, "Text");
        usingDeclarations_TreeView.show_List(ast.astDetails.UsingDeclarations,"Text");
        methods_TreeView.show_List(ast.astDetails.Methods,"Text");
        fields_TreeView.show_List(ast.astDetails.Fields,"Text");
        properties_TreeView.show_List(ast.astDetails.Properties,"Text");
        comments_TreeView.show_List(ast.astDetails.Comments,"Text");

        rewritenCSharpCode_SourceCodeEditor.setDocumentContents(ast.astDetails.CSharpCode, ".cs");
        rewritenVBNet_SourceCodeEditor.setDocumentContents(ast.astDetails.VBNetCode, ".vb");                                
    }

The example in ascx_View_SourceCode_AST.cs.o2 also shows how you can then use the information gathered from the AST to select a type, method, comment, etc. in the source code.

For reference, here is the API code that I wrote (note that this is my first pass at using SharpDevelop's C# AST parser, and I am still getting my head around how it works):

Solution 6:[6]

We have recently released a C# parser that handles all C# 4.0 features plus the new async feature: C# Parser and CodeDOM

This library generates a semantic object model which retains comments and formatting information and can be modified and saved. It also supports the use of LINQ queries to analyze source code.

Solution 7:[7]

You should definitely check out Roslyn, since MS just opened (or will soon open) the code under an Apache 2 license. You can also check out a way to parse this info with this code from GitHub.
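As a rough sketch of how Roslyn can give line and file information per declaration, the snippet below parses a source string and reports where each method starts. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced; the `RoslynLocations` class, the `DescribeMethods` helper and the sample `Greeter` source are made up for illustration and are not taken from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

static class RoslynLocations
{
    // Hypothetical helper: returns "file(line): Type.Method" for each method in the source text.
    public static List<string> DescribeMethods(string source, string path)
    {
        // The path is only recorded in the tree for reporting; nothing is read from disk.
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source, path: path);
        var result = new List<string>();

        foreach (var method in tree.GetRoot().DescendantNodes().OfType<MethodDeclarationSyntax>())
        {
            FileLinePositionSpan span = method.GetLocation().GetLineSpan();
            var owner = method.Parent as TypeDeclarationSyntax;
            string ownerName = owner != null ? owner.Identifier.Text : "?";

            // Roslyn line numbers are zero-based, so add 1 for display.
            result.Add(string.Format("{0}({1}): {2}.{3}",
                span.Path, span.StartLinePosition.Line + 1, ownerName, method.Identifier.Text));
        }
        return result;
    }

    static void Main()
    {
        string source = "class Greeter\n{\n    void Hello() { }\n    void Bye() { }\n}\n";
        foreach (string line in DescribeMethods(source, "Greeter.cs"))
            Console.WriteLine(line);
    }
}
```

The same `DescendantNodes().OfType<T>()` pattern should work for other declaration kinds (for example `PropertyDeclarationSyntax` or `FieldDeclarationSyntax`) if you need locations for more than methods.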

Solution 8:[8]

Solution 9:[9]

SharpDevelop, an open source IDE, comes with a visitor-based code parser which works really well. It can be used independently of the IDE.

Solution 10:[10]

Consider using reflection on a built binary instead of parsing the C# code directly. The reflection API is really easy to use, and perhaps you can get all the information you need.
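A minimal sketch of what reflection gives you, using only `System.Reflection`. The `ReflectionOutline` and `Calculator` names are invented for the example. Note that reflection describes types and members but does not expose source file or line numbers, so on its own it does not cover that part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical sample type to inspect.
public class Calculator
{
    public int Add(int a, int b) { return a + b; }
}

static class ReflectionOutline
{
    // Lists the public methods declared directly on a type as "Type.Method(ParamTypes)".
    public static List<string> DescribeMethods(Type type)
    {
        var result = new List<string>();
        BindingFlags flags = BindingFlags.Public | BindingFlags.Instance
                           | BindingFlags.Static | BindingFlags.DeclaredOnly;

        foreach (MethodInfo method in type.GetMethods(flags))
        {
            var parameterTypes = new List<string>();
            foreach (ParameterInfo parameter in method.GetParameters())
                parameterTypes.Add(parameter.ParameterType.Name);

            result.Add(string.Format("{0}.{1}({2})",
                type.Name, method.Name, string.Join(", ", parameterTypes.ToArray())));
        }
        return result;
    }

    static void Main()
    {
        foreach (string line in DescribeMethods(typeof(Calculator)))
            Console.WriteLine(line);
    }
}
```

To inspect a compiled binary rather than a type in the running program, the same loop can be applied to the types returned by `Assembly.LoadFrom(path).GetTypes()`.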

Solution 11:[11]

Have a look at Gold Parser. It has a very intuitive UI that lets you interactively test your grammar and generate C# code. There are plenty of examples available with it, and it is completely free.

Solution 12:[12]

Maybe you could try with Irony on irony.codeplex.com.

It's very fast, and a C# grammar already exists.

The grammar itself is written directly in C# in a BNF-like way (achieved with some operator overloads).

The best thing with it is that the "grammar" produces the AST directly.

Solution 13:[13]

Something that is gaining momentum and is very appropriate for the job is Nemerle.

You can see how it could solve it in these videos from NDC:

Solution 14:[14]

Not in C#, but a full C# 2/3/4 parser that builds full ASTs is available with our DMS Software Reengineering Toolkit.

DMS provides a vast infrastructure for parsing, tree building, construction of symbol tables and flow analyses, source-to-source transformation, and regeneration of source code from the (modified) ASTs. (It also handles many other languages than just C#.)

EDIT (September 2013): This answer hasn't been updated recently. DMS has long handled C# 5.0.

Solution 15:[15]

GPPG might be of use, if you are willing to write your own parser (which is fun).