Roslyn beyond 'Hello world' 02 - Visual Studio extension for refactoring

[2017 August 01] .NET, C#, Roslyn, Visual Studio, Refactoring

Part 1 - Important concepts and development setup
The project used in this article is available on GitHub

From the point of view of the developer using them (we will call them 'user-dev'), Refactorings in Roslyn are additional commands that pop up in Visual Studio when they click certain pieces of code. From our point of view, Refactorings are classes inheriting from CodeRefactoringProvider. They get a chance to examine the current syntax graph every time the user-dev clicks something in it, and to decide whether to offer any transformations of that graph based on its state and on what was clicked.

We will be building a Refactoring that allows our user-dev to regenerate a given class's public constructor by adding any missing assignments for members that match a certain pattern and are not yet assigned during construction. This is the refactoring I use at work to regenerate dependency-injected constructors.

Before we begin, a few things to remember.

Roslyn's APIs are asynchronous and built on Tasks to conserve resources. Since there can be dozens or even hundreds of refactorings and analyzers running at a time, it is important to keep efficiency in mind.

The syntax graph is an immutable data structure. It mirrors the text of the code: every element of the graph has a span, containing information about the start and end of the text it corresponds to. Elements of the graph are not yet bound to the actual symbols available in the current solution (types, their members, etc.). You have only as much information about them as can be determined by analyzing a single file.
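
To make the idea of spans concrete, here is a minimal sketch that parses a small piece of code and lists each node's kind together with the span of text it covers. It assumes the Microsoft.CodeAnalysis.CSharp package is referenced; SpanDemo and DescribeNodeSpans are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper, not part of the article's project.
public static class SpanDemo
{
    // Returns "Kind [start..end)" for the root and every node below it.
    public static List<string> DescribeNodeSpans(string source)
    {
        SyntaxNode root = CSharpSyntaxTree.ParseText(source).GetRoot();

        return root.DescendantNodesAndSelf()
            // Span excludes leading and trailing trivia; FullSpan includes it
            .Select(n => $"{n.Kind()} [{n.Span.Start}..{n.Span.End})")
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in DescribeNodeSpans("class Foo { void Bar() { } }"))
        {
            Console.WriteLine(line);
        }
    }
}
```

Note that no symbol lookup happens here: everything printed comes from parsing the text of one file.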

Because it is immutable, any transformations you propose to the user-dev are created by substituting parts of the graph to produce a new graph. If they accept one of them, the text restored from it will replace the current document text. Since each node has a span with information about its position, nodes have to be copied during each graph modification. I.e. if you construct a new piece of graph and insert it into the current graph, the constructed piece will be a distinct instance, != to the one now in the graph.

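using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using SF = Microsoft.CodeAnalysis.CSharp.SyntaxFactory;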
// we will construct the following
//class Foo
//{
//    void Bar() { }
//}

MethodDeclarationSyntax methodDecl =
    SF.MethodDeclaration(
            returnType: SF.PredefinedType(SF.Token(SyntaxKind.VoidKeyword)),
            identifier: SF.Identifier("Bar"))
        .WithBody(SF.Block());

ClassDeclarationSyntax @classDecl = SF.ClassDeclaration("Foo")
                    .WithMembers(SF.SingletonList<MemberDeclarationSyntax>
                                                                (methodDecl));

MethodDeclarationSyntax insertedMethodDecl = 
                                @classDecl
                                        .Members
                                        .OfType<MethodDeclarationSyntax>()
                                        .Single();

//true
var textIsEqual = methodDecl.GetText().ToString()
                    == insertedMethodDecl.GetText().ToString();

//false
var referenceIsEqual = methodDecl == insertedMethodDecl;

Where do we start?

When you have a refactoring that you want to automate with Roslyn, the first question is “What exactly should I do?”. The syntax graph can be quite overwhelming at first sight. Remember from part 1 that a one-line constructor consisted of over 30 elements. The key here is nodes. Trivia represents things that don’t matter to the compiler, like comments and whitespace. Tokens are small pieces of code like keywords, identifiers and brackets. Nodes are at the top: they represent core C# concepts like type and member declarations. Unless you are making a simple formatting tool, you will be working with nodes for the most part.
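
As a rough illustration of how the three kinds of elements relate, the sketch below counts the nodes, tokens and trivia found in a piece of source text. ElementKindsDemo and CountElements are hypothetical names used only for this example.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper, not part of the article's project.
public static class ElementKindsDemo
{
    // Counts nodes, tokens and trivia below the root of the parsed source.
    public static (int nodes, int tokens, int trivia) CountElements(string source)
    {
        SyntaxNode root = CSharpSyntaxTree.ParseText(source).GetRoot();

        return (root.DescendantNodes().Count(),
                root.DescendantTokens().Count(),
                root.DescendantTrivia().Count());
    }

    public static void Main()
    {
        var counts = CountElements("class Foo { /* a comment */ void Bar() { } }");
        Console.WriteLine(
            $"nodes: {counts.nodes}, tokens: {counts.tokens}, trivia: {counts.trivia}");
    }
}
```

Even for a tiny class, nodes are the smallest group, which is why filtering the graph down to nodes makes it much easier to read.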

Knowing this, it is best to start by writing out the ‘before’ and ‘after’ of your change in plain C# and examining the difference in syntax graph nodes between the two. For this I prefer to use LINQPad, since it has a readily available option to hide everything but nodes.

Seeing this, we just need to figure out the graph transformation that will turn ‘before’ into ‘after’.
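
If you don't have LINQPad at hand, the same comparison can be approximated in code. The following is only a sketch under my own assumptions: NodeKindDiff, CountNodeKinds and Delta are made-up names, and the comparison simply counts nodes per SyntaxKind in the ‘before’ and ‘after’ texts and reports the differences.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper, not part of the article's project.
public static class NodeKindDiff
{
    // Counts how many nodes of each SyntaxKind the parsed source contains.
    public static Dictionary<SyntaxKind, int> CountNodeKinds(string source)
    {
        return CSharpSyntaxTree.ParseText(source)
            .GetRoot()
            .DescendantNodes()
            .GroupBy(n => n.Kind())
            .ToDictionary(g => g.Key, g => g.Count());
    }

    // For each node kind, how many more (or fewer) nodes 'after' has than 'before'.
    public static Dictionary<SyntaxKind, int> Delta(string before, string after)
    {
        var b = CountNodeKinds(before);
        var a = CountNodeKinds(after);

        int Count(Dictionary<SyntaxKind, int> d, SyntaxKind k)
            => d.TryGetValue(k, out var c) ? c : 0;

        return a.Keys.Union(b.Keys)
            .Select(k => new { Kind = k, Diff = Count(a, k) - Count(b, k) })
            .Where(x => x.Diff != 0)
            .ToDictionary(x => x.Kind, x => x.Diff);
    }

    public static void Main()
    {
        var before = "class FooBar { readonly FooBar _p1; public FooBar() { } }";
        var after = "class FooBar { readonly FooBar _p1; " +
                    "public FooBar(FooBar p1) { _p1 = p1; } }";

        foreach (var pair in Delta(before, after))
        {
            Console.WriteLine($"{pair.Key}: {pair.Value:+#;-#}");
        }
    }
}
```

The kinds that show up in the result (a parameter, an expression statement, an assignment and a few identifier names) are exactly the pieces our transformation will have to construct.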

Step 1 - inspect context of user-dev clicks

We start by analyzing the click site and the current syntax graph to see whether we should offer our transformation at all in the current click context.

public sealed override async Task 
                ComputeRefactoringsAsync(CodeRefactoringContext context)
{
    // Get current syntax graph root
    var root = await context.Document
        .GetSyntaxRootAsync(context.CancellationToken)
        .ConfigureAwait(false);

    // Find the node which was clicked.
    var node = root.FindNode(context.Span);

    ClassDeclarationSyntax classDecl = null;
    ConstructorDeclarationSyntax constructorDecl = null;
    // we can offer modification if a class name 
    // or one of its constructors were clicked
    switch (node)
    {
        case ClassDeclarationSyntax @class:
            classDecl = @class;
            break;
        case ConstructorDeclarationSyntax @constructor 
                when @constructor.Parent is ClassDeclarationSyntax @class:
            constructorDecl = @constructor;
            classDecl = @class;
            break;
        default:
            // not something we can work with
            return;
    }
                  
    var action = CodeAction.Create("(Re)Generate dependency " + 
                                                    "injected constructor",
        cancelToken => RegenerateDependencyInjectedConstructor(
                                                        context.Document, 
                                                        classDecl, 
                                                        constructorDecl, 
                                                        cancelToken));
    // offer possible transformation
    context.RegisterRefactoring(action);
}

Step 2 - propose transformation

private async Task<Document> RegenerateDependencyInjectedConstructor(
                            Document document, 
                            ClassDeclarationSyntax @class,
                            ConstructorDeclarationSyntax constructor,
                            CancellationToken cancellationToken)
{
    // get root
    var root = await document.GetSyntaxRootAsync(cancellationToken);

                                                                          
    // in order to regenerate a constructor, the class must have:
    // - exactly 1 public constructor, or a public constructor marked with attr
    // - at least 1 injectable member that is not yet assigned
    // - no 2 injectable members sharing the exact same type;
    // the following method checks for the presence of all those pieces 
    // and returns them via ref parameters. 
    // If it succeeds in finding all required pieces, it returns true;
    // if not, it returns false and document will contain 
    // a transformed document with an included comment 
    // explaining the reason for the failure 
    var hasAllNeededParts = TryGetOrAddRequiredParts(
                                ref document,
                                ref root,
                                ref @class,
                                ref constructor,
                                out MemberDeclarationSyntax[] injectables);

    if(hasAllNeededParts == false)
    {
        return document;
    }

    // produce a syntax graph representing a modified constructor
    var newConstructor = RegenereateConstructorSyntax(injectables, constructor);

    // replace old constructor with new one
    var newDocumentRoot = root.ReplaceNode(constructor, newConstructor);
    document = document.WithSyntaxRoot(newDocumentRoot);

    // return fully formed new syntax graph
    return document;
}

At this point, we will skip some pieces of code and concentrate on the most relevant parts, so make sure to check out this project on GitHub.

We need to get the ‘injectable’ members of a type: read-only fields, and properties marked with a special [InjectedDependencyAttribute]. We are only interested in the ones that don’t yet have an assignment.

private static MemberDeclarationSyntax[] 
                            GetInjectableMembers(TypeDeclarationSyntax type)
{
    SyntaxKind[] propOrFieldDeclaration = new[]
    {
        SyntaxKind.FieldDeclaration,
        SyntaxKind.PropertyDeclaration
    };

    // every node has methods to sort through its children:
    // Child|Descendant + Nodes|Tokens|etc.
    // as you can imagine, 'Child' only gives you first level children
    // of specified type, while 'Descendant' walks entire sub-graph
    var injectableMembers = 
        @type
            .ChildNodes()
            // 'Fits' is an extension method checking the .Kind property
            .Where(x => x.Fits(propOrFieldDeclaration))
            .OfType<MemberDeclarationSyntax>()
            .Where(x =>
            {
                // to check assignment, we look for an EqualsValueClause node
                // (the '= value' initializer) inside a given node
                bool alreadyHasAssignments = x.DescendantNodes()
                               .Any(n => n.Fits(SyntaxKind
                                                    .EqualsValueClause));

                if (alreadyHasAssignments)
                {
                    return false;
                }

                bool isReadonlyField = x.DescendantTokens()
                                .Any(n => n.Fits(SyntaxKind
                                                    .ReadOnlyKeyword));

                // GetAttributeIdentifiers is a util extension method 
                SyntaxToken[] attributes = x.GetAttributeIdentifiers();

                bool hasDependencyAttr = attributes
                        // this method compares attributes by name
                        .Any(y => y.FitsAttrIdentifier(
                                        typeof(InjectedDependencyAttribute)));

                bool hasExcludedForDepAttr = attributes
                        .Any(y => y.FitsAttrIdentifier(
                            typeof(ExcludeFromInjectedDependenciesAttribute)));

                if (hasExcludedForDepAttr)
                {
                    return false;
                }

                return isReadonlyField || hasDependencyAttr;

            })
            .ToArray();

    return injectableMembers;
}

If we didn’t find any suitable injectables, we need to notify the user-dev by adding a comment. Adding a comment is our first syntax-graph transformation.

private static bool TryGetOrAddRequiredParts(
                                ref Document document,
                                ref SyntaxNode root,
                                ref ClassDeclarationSyntax @class,
                                ref ConstructorDeclarationSyntax constructor,
                                out MemberDeclarationSyntax[] injectables)
{
           
    injectables = null;

    injectables = GetInjectableMembers(@class);
    if (injectables.Any() == false)
    {
        var errorMessage = "Can't regenerate constructor, no unassigned " + 
                            "candidate members found " +
                            "(readonly fields, properties marked " + 
                            $"with {nameof(InjectedDependencyAttribute)}).";
        document = NotifyErrorViaCommentToClassOrConstructor(
                                                            document, 
                                                            root, 
                                                            @class, 
                                                            constructor, 
                                                            errorMessage);
        return false;
    }

    //...
}

We will look at the case where the class name itself was clicked and we are inserting the comment next to the class declaration's opening brace.

private static Document ClassDeclWithCommentAtOpeningBrace(
                                                Document document, 
                                                SyntaxNode root, 
                                                ClassDeclarationSyntax type, 
                                                string errorMessage)
{
    // we prepare some trivia
    var explanatoryCommentTrivia = SF.Comment("//" + errorMessage);
    var endOfLineTrivia = SF.EndOfLine("\r\n");
    // and get preexisting leading trivia, since we want to preserve it
    var leadingTrivia = @type.OpenBraceToken.LeadingTrivia;

    // Nodes have many 'WithX' methods, depending on their type;
    // they produce a new graph, replacing or adding a specific piece
    // to the old one
    var typeUpdatedWithExplanatoryComment = @type.WithOpenBraceToken(
            SF.Token(
                leadingTrivia,
                SyntaxKind.OpenBraceToken,
                SF.TriviaList(
                    explanatoryCommentTrivia,
                    endOfLineTrivia)));

    // Notice, that we end up with a totally new version 
    // of our @class syntax graph,
    // that is currently outside of our document.
    // There is no way to insert it (immutability), 
    // rather, we repeat the process
    // of replacement, to get a new Root, and then a whole new Document
    var newDocumentRoot = root.ReplaceNode(
                                @type, 
                                typeUpdatedWithExplanatoryComment);
    var newDocument = document.WithSyntaxRoot(newDocumentRoot);

    // this is the new document state, that will be offered to the user-dev
    return newDocument;
}

Figuring out how to construct a piece of syntax graph corresponding to a given text is a very tedious task. Luckily, there is a great tool, "Roslyn Quoter" by Kirill Osenkov, to help us. It is available on GitHub and as an online version.

From a given piece of C# code, it will produce the Roslyn calls necessary to reproduce it. It assumes that the SyntaxFactory type was statically imported, i.e.
using static Microsoft.CodeAnalysis.CSharp.SyntaxFactory;
I prefer to import it as
using SF = Microsoft.CodeAnalysis.CSharp.SyntaxFactory;
and prefix all calls with “SF” to clearly mark them.

Let’s look at the method that finds all the necessary parts again.

private static bool TryGetOrAddRequiredParts(
                        ref Document document,
                        ref SyntaxNode root,
                        ref ClassDeclarationSyntax @class,
                        ref ConstructorDeclarationSyntax constructor,
                        out MemberDeclarationSyntax[] injectables)
{         
    injectables = null;

    injectables = GetInjectableMembers(@class);
    if (injectables.Any() == false)
    {
        // make document with explanation and return false
    }

    var injectablesWithSameType =
        injectables.GroupBy(x => x.GetMemberType().GetTypeName())
                    .FirstOrDefault(x => x.Count() > 1);

    if (injectablesWithSameType != null)
    {
        // make document with explanation and return false
    }

    if (constructor != null)
    {
        // a constructor was already chosen (by the user-dev's click)
        return true;
    }

    // A class name was clicked. Find public constructors. 
    // If any of them is marked with special attribute - 
    // only consider marked ones
    ConstructorDeclarationSyntax[] publicConstructors 
                             = GetPublicEligableConstructors(@class);

    // remember, we need exactly one match!
    var constructorsCount = publicConstructors.Count();

    if (constructorsCount > 1)
    {
        // more than one eligible constructor,
        // make document with explanation and return false
    }
    else if (constructorsCount == 0)
    {
        // No public constructors. 
        // We will make an empty public constructor.
        // Notice, we set all of the relevant ref parameters,
        // that were passed to this method.
        // Immutability! We will have a new document,
        // new class in it with new constructor.
        (document, root, @class, constructor) =
            GetTypeDeclarationWithEmptyConstructor(
                                                document, 
                                                root, 
                                                @class, 
                                                injectables);
        return true;
    }
    else
    {
        constructor = publicConstructors.Single();
        return true;
    }
}

Let’s examine the method, which adds an empty constructor to the class.

// we return a C#7 tuple
private static (
        Document doc,
        SyntaxNode root,
        ClassDeclarationSyntax @class,
        ConstructorDeclarationSyntax constructor)
    GetTypeDeclarationWithEmptyConstructor(Document document,
                                        SyntaxNode root,
                                        ClassDeclarationSyntax @class,
                                        MemberDeclarationSyntax[] injectables)
{
    // get the class name
    var className = @class.Identifier.Text;

    var newConstructor = SF.ConstructorDeclaration(
            SF.Identifier(className))
                .WithModifiers(SF.TokenList(
                    SF.Token(
                        SF.TriviaList(),
                        SyntaxKind.PublicKeyword, // public
                        SF.TriviaList(SF.Space))))
                .WithBody(SF.Block());	// body is empty

    // If we just add the constructor - it will be at the end of the class.
    // Instead, we would like to insert right after last injectable member.
    // (According to our team style, they go first in the class)
    var members = @class.Members;
    var lastInjectableIndex = @class.Members.IndexOf(injectables.Last());

    // the last valid index is Count - 1
    var lastInjectableIsLastMember = 
                        (lastInjectableIndex == @class.Members.Count - 1);
    ClassDeclarationSyntax newClass;
    if (lastInjectableIsLastMember)
    {
        // just add it last as is
        newClass = @class.AddMembers(newConstructor);
    }
    else
    {
        // insert it after last injectable
        var newMembers = members.Insert(lastInjectableIndex + 1, newConstructor);
        newClass = @class.WithMembers(newMembers);
    }

    // construct new document
    var newDocumentRoot = root.ReplaceNode(@class, newClass);
    var newDocument = document.WithSyntaxRoot(newDocumentRoot);

    // remember, on replacement or insertion nodes are copied;
    // to continue working, we need to get references to the copies
    // after last insertion
    newDocumentRoot = newDocument.GetSyntaxRootAsync().Result;

    // get a class node with the same Identifier text as argument
    newClass = newDocumentRoot.GetMatchingClassDeclaration(newClass);
    newConstructor = GetPublicEligableConstructors(newClass).Single();

    return (newDocument, newDocumentRoot, newClass, newConstructor);
}
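
The method above finds the copied nodes again by matching on the class name. As an aside, Roslyn also offers SyntaxAnnotation for this purpose: an annotation attached to a node survives replacement and can be used to look the copy up afterwards. The sketch below shows that alternative technique, not the approach used in the article's project; AnnotationDemo and AddAndFindConstructor are hypothetical names.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using SF = Microsoft.CodeAnalysis.CSharp.SyntaxFactory;

// Hypothetical helper, not part of the article's project.
public static class AnnotationDemo
{
    // Adds an empty public constructor to the first class in the source
    // and returns the copy of that constructor living in the new graph.
    public static ConstructorDeclarationSyntax AddAndFindConstructor(string source)
    {
        SyntaxNode root = CSharpSyntaxTree.ParseText(source).GetRoot();
        ClassDeclarationSyntax @class = root.DescendantNodes()
                                            .OfType<ClassDeclarationSyntax>()
                                            .First();

        // the annotation acts as a marker that is copied along with the node
        var marker = new SyntaxAnnotation();

        ConstructorDeclarationSyntax constructor =
            SF.ConstructorDeclaration(SF.Identifier(@class.Identifier.Text))
              .WithModifiers(SF.TokenList(SF.Token(SyntaxKind.PublicKeyword)))
              .WithBody(SF.Block())
              .WithAdditionalAnnotations(marker);

        SyntaxNode newRoot = root.ReplaceNode(@class, @class.AddMembers(constructor));

        // 'constructor' itself is not part of newRoot; its annotated copy is
        return newRoot.GetAnnotatedNodes(marker)
                      .OfType<ConstructorDeclarationSyntax>()
                      .Single();
    }

    public static void Main()
    {
        var found = AddAndFindConstructor("class Foo { }");
        Console.WriteLine(found.Parent is ClassDeclarationSyntax);
    }
}
```

Whether this is preferable to matching by name depends on the situation; annotations avoid ambiguity when several nodes would match the same text.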

By now, we know that we have all the right pieces: references to the class declaration, the constructor declaration and all missing injectable member declarations. We need to update the constructor's parameter list and add assignments of the new parameters to its body.

Remember, we need to exclude injectables that are already assigned in the constructor. In order to do that, we will analyze its code to find these statements.

// we want the names of all assigned members
private static string[] GetExistingAssignmentsInConstructor(
                                    ConstructorDeclarationSyntax constructor)
{
    return constructor
         .Body
         .ChildNodes()
         .OfType<ExpressionStatementSyntax>()
         .Select(x => x.DescendantNodes()
                        .Where(y => y
                                    .Fits(SyntaxKind.SimpleAssignmentExpression))
                                    .SingleOrDefault())
         .Where(x => x != null)
         .OfType<AssignmentExpressionSyntax>()
         .Select(x => x.Left)
         .OfType<IdentifierNameSyntax>()
         .Select(x => x.Identifier.Text)
         .ToArray();
}
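
One thing to be aware of: because the query above keeps only IdentifierNameSyntax on the left-hand side, an assignment written as this._p1 = p1 would not be picked up, since its left-hand side is a MemberAccessExpressionSyntax. The following self-contained sketch shows one possible way to handle both forms. It is a variant written for illustration, not the code from the project, and AssignmentDemo and GetAssignedNames are hypothetical names.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, not part of the article's project.
public static class AssignmentDemo
{
    // Names of members assigned by top-level statements of the constructor body,
    // written either as 'name = value' or as 'this.name = value'.
    // Assumes a block body (constructor.Body is not null).
    public static string[] GetAssignedNames(ConstructorDeclarationSyntax constructor)
    {
        return constructor.Body.Statements
            .OfType<ExpressionStatementSyntax>()
            .Select(s => s.Expression)
            .OfType<AssignmentExpressionSyntax>()
            .Where(a => a.Kind() == SyntaxKind.SimpleAssignmentExpression)
            .Select(a => a.Left)
            // unwrap 'this.name' to just 'name'
            .Select(left => left is MemberAccessExpressionSyntax access
                                && access.Expression is ThisExpressionSyntax
                            ? (ExpressionSyntax)access.Name
                            : left)
            .OfType<IdentifierNameSyntax>()
            .Select(x => x.Identifier.Text)
            .ToArray();
    }

    public static string[] GetAssignedNames(string source)
    {
        var constructor = CSharpSyntaxTree.ParseText(source)
            .GetRoot()
            .DescendantNodes()
            .OfType<ConstructorDeclarationSyntax>()
            .Single();
        return GetAssignedNames(constructor);
    }

    public static void Main()
    {
        var names = GetAssignedNames(
            "class C { int _a; int _b; C(int a, int b) { _a = a; this._b = b; } }");
        Console.WriteLine(string.Join(", ", names));
    }
}
```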

We will skip the method adding parameters for missing injectables to constructor and look at adding corresponding assignment statements. By now, you should be familiar with what is going on.

IEnumerable<StatementSyntax> newBodyStatements =
    GetBodyStatementsWithMissinggAssignmentsPrepended(
                                            constructor,
                                            updatedParamters,
                                            injectablesMissingAnAssignment);

var newBodySyntaxList = SF.List(newBodyStatements);

var newBody = constructor.Body.WithStatements(newBodySyntaxList);

constructor = constructor.WithBody(newBody);

private static IEnumerable<StatementSyntax> 
    GetBodyStatementsWithMissinggAssignmentsPrepended(
                        ConstructorDeclarationSyntax constructor, 
                        ParameterSyntax[] updatedParamters, 
                        IEnumerable<MemberDeclarationSyntax>
                                            injectablesMissingAnAssignment)
{
    // for each pair of missing injectable and added parameter
    // we create an assignment 
    ExpressionStatementSyntax[] assignmentStatementsToAdd = 
        GetMissingAssignmentExpressions(
                            updatedParamters, 
                            injectablesMissingAnAssignment);
    
    // and prepend them to the existing statements in body
    var newBodyStatements = Enumerable.Concat(
                                assignmentStatementsToAdd,
                                constructor.Body.Statements);
    return newBodyStatements;
}

private static ExpressionStatementSyntax[] GetMissingAssignmentExpressions(
        ParameterSyntax[] updatedParamters, 
        IEnumerable<MemberDeclarationSyntax> injectablesMissingAnAssignment)
{
    return injectablesMissingAnAssignment.Select(injectable =>
    {
        var injectableTypeIdentifier = injectable.GetMemberType();
        var injectableType = injectableTypeIdentifier.GetTypeName();
        var injectableName = injectable.GetMemberIdentifier().Text;

        var correspondingParameter = updatedParamters
            .SingleOrDefault(parameter =>
            {
                var paramType = parameter.Type.GetTypeName();

                return paramType == injectableType;
            });

        if (correspondingParameter == null)
        {
            return null;
        }

        var paramName = correspondingParameter.Identifier.Text;

        return SF.ExpressionStatement(
                            SF.AssignmentExpression(
                                SyntaxKind.SimpleAssignmentExpression,
                                SF.IdentifierName(injectableName),
                                SF.IdentifierName(paramName)));

    })
    .Where(x => x != null)
    .ToArray();
}

That’s it. Roslyn refactorings lend themselves very nicely to Test Driven Development, so I urge you to use it. The testing setup was covered in Part 1, and most of our tests look like this.

[TestMethod]
public void CanAddSingleParameterInjectionToConstructor()
{
    var testClassFileContents = @"
using System;
public class FooBar
{
    public readonly FooBar _p1;
    public FooBar()
    {
        
    }
}";

    var testClassExpectedNewContents = @"
using System;
public class FooBar
{
    public readonly FooBar _p1;
    public FooBar(FooBar p1)
    {
        _p1 = p1;
    }
}";

    TestUtil.TestAssertingEndText(
                    testClassFileContents,
                    "FooBar",
                    testClassExpectedNewContents);
}

One more thing about working with the syntax graph. Remember, every element in it corresponds to a span of program text. You start with text, it is parsed into a graph, you create a transformed graph, and then it is turned back into text. When you start working with it, especially if you’ve worked with expression trees before, you are tempted to analyze node types, covering every possible subtype and making decisions based on that. You would start with

public static string GetTypeName(this TypeSyntax member)
{
    switch (member)
    {
        case PredefinedTypeSyntax predefined:
            return predefined.Keyword.Text;
        case IdentifierNameSyntax name:
            return name.Identifier.Text;
        default:
            throw new ArgumentException(
            $"Unknown TypeSyntax node type : {member.Kind().ToString()}");
    }
}

Then you remember generics and add them

public static string GetTypeName(this TypeSyntax member)
{
    switch (member)
    {
        case PredefinedTypeSyntax predefined:
            return predefined.Keyword.Text;
        case IdentifierNameSyntax name:
            return name.Identifier.Text;
        case GenericNameSyntax generic:
            return generic.GetText().ToString().Trim();
        default:
            throw new ArgumentException(
            $"Unknown TypeSyntax node type : {member.Kind().ToString()}");
    }
}

Then you encounter arrays, and more possibilities... But in the end, it will often be easier to just work with the text of a given Node.

public static string GetTypeName(this TypeSyntax member)
{
    return member.GetText().ToString().Trim();
}

It takes some practice to figure out when just dealing with text will be more beneficial than trying to analyze node types. Hints can be picked up from the structure and inheritance of node types. For example, in the case above, we chose to stop our analysis at the level of TypeSyntax, which is an abstract base type; we should pay attention to common base types. Also, even when you get rid of the complex switch and the function code becomes trivial, as in our case, it is a good idea to keep the function. This will clearly mark our intention and allow us to easily get back to deeper analysis, should we need it in the future.
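
To see how many different node types hide behind TypeSyntax, and why the text-based version copes with all of them, here is a small sketch that parses a few type names and prints the concrete node class next to the text-based name. It uses SyntaxFactory.ParseTypeName to obtain the nodes; TypeNameDemo is a made-up name, and GetTypeName is written as a plain static method so the example stands on its own.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using SF = Microsoft.CodeAnalysis.CSharp.SyntaxFactory;

// Hypothetical helper, not part of the article's project.
public static class TypeNameDemo
{
    // Same idea as the final GetTypeName above: rely on the node's text.
    public static string GetTypeName(TypeSyntax type)
    {
        return type.GetText().ToString().Trim();
    }

    public static void Main()
    {
        var samples = new[] { "int", "Foo", "List<int>", "int[]", "System.String", "int?" };

        foreach (var text in samples)
        {
            TypeSyntax type = SF.ParseTypeName(text);

            // the concrete class differs per sample, the text-based name does not care
            Console.WriteLine($"{type.GetType().Name}: {GetTypeName(type)}");
        }
    }
}
```

Each sample would have needed its own case in the switch-based version.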

In part 3, we will look at building a Diagnostics Analyzer and working with the next level of compilation – bound symbol graph.
