Expression trees and advanced queries in C# 01 - IQueryable and Expression Tree basics

[2017 May 07] .NET, C#, IQueryable, Expression Trees

A sample project demonstrating the theory explained in this article is available on GitHub.

Many .NET developers don’t realize or don’t pay attention to the differences between IEnumerable and IQueryable. Most tutorials on the topic don’t go beyond trivial examples, thus missing the huge potential hidden inside.

IQueryable is IEnumerable and much more. From a practical perspective, an IQueryable represents a logical query defined by an Expression Tree, together with a Provider that can convert and execute this logical query against a specific data source.

The extension methods for the two interfaces look very similar, but the IEnumerable versions expect Func<T,Boolean>, whereas the IQueryable versions expect Expression<Func<T,Boolean>>. That Expression wrapper means an Expression Tree containing a serialized Func<T,Boolean>. IEnumerable can be thought of as a pipeline producing (enumerating) results step by step, where each step is a single C# function (usually supplied by a lambda we pass as an argument). IQueryable, on the other hand, can be thought of as a query that you gradually build from pieces of logic and that is meant to produce an entire result every time it is executed. Instead of C# functions, IQueryable accepts their Expression Tree equivalents. The Provider examines them to extract the abstract, language-agnostic logic of the query and, usually, produces an equivalent query in a different concrete language.
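To make the difference in parameter types concrete, here is a minimal sketch that calls both Where overloads with the same lambda text. The class and method names are made up for illustration. AsQueryable() is used only to get an IQueryable backed by the built-in in-memory provider, so no database is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical helper class, for illustration only.
public static class WhereOverloads
{
    // Enumerable.Where takes a compiled delegate: Func<int, bool>.
    public static List<int> FilterInMemory(IEnumerable<int> source)
    {
        Func<int, bool> isLarge = n => n > 55;
        return source.Where(isLarge).ToList();
    }

    // Queryable.Where takes an expression tree: Expression<Func<int, bool>>.
    // AsQueryable() wraps the sequence in the in-memory provider,
    // which compiles the tree when the query is enumerated.
    public static List<int> FilterAsQuery(IEnumerable<int> source)
    {
        Expression<Func<int, bool>> isLarge = n => n > 55;
        return source.AsQueryable().Where(isLarge).ToList();
    }

    // Until it is enumerated, the query is just data: its Expression
    // property holds the call to Where as a MethodCallExpression.
    public static string DescribeQuery(IEnumerable<int> source)
    {
        Expression<Func<int, bool>> isLarge = n => n > 55;
        IQueryable<int> query = source.AsQueryable().Where(isLarge);
        return query.Expression.NodeType.ToString(); // "Call"
    }
}
```

Both filters return the same items for the same input. The only visible difference in the calling code is the declared type of the lambda.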

Expression Trees in C# are pieces of serialized logic represented by a tree data structure. They are similar to the tree data structures produced by the parsing stage of the compilation process (if you have ever used the Syntax Visualizer tool with the Roslyn compiler, think of the syntax graph visualization, but simpler).

Built-in support for Expression Trees in traditional C# IDEs is so good that most developers use them without even realizing it: the C# compiler does all the work for you, hiding their differences from good old lambdas.

This looks like a lambda (fiddle).

Expression<Func<Bar, bool>> foo = (x) => x.Baz > 55;

But because we have specified the type to be Expression<Func<T,Boolean>>, the compiler will stop one step short of producing actual IL for the lambda body and will instead store the result of parsing that code. Here is code equivalent to what we actually get (fiddle).

var parameter = Expression.Parameter(typeof(Bar), "x");

var bazProp = typeof(Bar).GetProperty(nameof(Bar.Baz));
var body = Expression.GreaterThan(
                        Expression.MakeMemberAccess(parameter, bazProp),
                        Expression.Constant(55));

Expression<Func<Bar, bool>> foo2 
    = Expression.Lambda<Func<Bar, bool>>(body, new[] { parameter });

Here is a diagram representation of what we get. Every node is an expression itself.
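As a textual stand-in for the diagram, the following sketch walks the tree of x => x.Baz > 55 and prints the kind of each node. The Bar class here is a minimal assumption (only an int Baz property), and TreeDump is a made-up helper name.

```csharp
using System;
using System.Linq.Expressions;

// Minimal stand-in for the article's Bar type (assumed shape).
public class Bar
{
    public int Baz { get; set; }
}

// Hypothetical helper, for illustration only.
public static class TreeDump
{
    // Assumes the lambda has the shape "member > constant",
    // as in the article's example; other shapes would fail the casts.
    public static string Describe(Expression<Func<Bar, bool>> lambda)
    {
        var comparison = (BinaryExpression)lambda.Body;       // x.Baz > 55
        var left = (MemberExpression)comparison.Left;         // x.Baz
        var right = (ConstantExpression)comparison.Right;     // 55

        return $"{lambda.NodeType}({comparison.NodeType}(" +
               $"{left.NodeType} {left.Member.Name}, " +
               $"{right.NodeType} {right.Value}))";
    }
}

// TreeDump.Describe(x => x.Baz > 55)
// returns "Lambda(GreaterThan(MemberAccess Baz, Constant 55))"
```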

So, we have a piece of parsed logic / code, what can we use it for?

We can instruct the runtime to finish compilation and produce actual executable code. (fiddle).

 var result1 = foo2.Compile().Invoke(new Bar() { Baz = 50 }); //false
 var result2 = foo2.Compile().Invoke(new Bar() { Baz = 60 }); //true

We can examine the expression that we have both visually in debug...

...and at runtime to get type information, logic and pieces of data used in its construction. Here is probably the simplest possible example: getting the name of a property at runtime in a strongly typed fashion (fiddle).

public static string NameOf<T>(Expression<Func<T, object>> expression)
{
    var memberExpression = expression.Body as MemberExpression
                            ?? ((UnaryExpression)expression.Body).Operand 
                                                            as MemberExpression;
    return memberExpression.Member.Name;
}
     
Util.NameOf<Bar>(x => x.Baz); //Baz

No nameof operator needed; this works starting with C# 3.0 (.NET Framework 3.5). This is much better than using strings with reflection, since your expression is checked by the compiler and maintained by the IDE. Rename the property, and the expression will be part of that renaming. Delete the property, and the compiler will give you an error about a missing member.
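The fallback to UnaryExpression in NameOf exists because the lambda returns object: for a value-type property such as an int, the compiler wraps the member access in a Convert node (boxing), while a reference-type property is left as a plain member access. The sketch below shows this. The Name property and the BoxingDemo class are assumptions added for illustration.

```csharp
using System;
using System.Linq.Expressions;

// Minimal stand-in for the article's Bar type; Name is a hypothetical addition.
public class Bar
{
    public int Baz { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper, for illustration only.
public static class BoxingDemo
{
    // Returns the node type at the root of the lambda body.
    public static ExpressionType BodyKind<T>(Expression<Func<T, object>> expression)
    {
        return expression.Body.NodeType;
    }
}

// BoxingDemo.BodyKind<Bar>(x => x.Baz)  is ExpressionType.Convert (int boxed to object)
// BoxingDemo.BodyKind<Bar>(x => x.Name) is ExpressionType.MemberAccess
```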

The most frequent use would probably be to use the whole Expression Tree or its subtrees to construct a new piece of logic that incorporates them.

Expression<Func<Bar, bool>> isBarGood = (x) => x.Baz > 5000;
Expression<Func<Bar, int>> getTheRock = (x) => x.Rock;
Expression<Func<int, string>> describeTheRock = (x) => 
                                            x < 7000 ? "Good Rock of " + x :
                                            x < 9000 ? "Great Rock of " + x :
                                            "Over 9000!!!";

IQueryable<string> goodRockDescriptions = DataContext.Bars
                                            .Where(isBarGood)
                                            .Select(getTheRock)
                                            .Select(describeTheRock);

When the Provider, in this case one that targets T-SQL, converts the logic of the query into SQL, we get something like the following query. You can see that the logic has been extracted from each Expression Tree and recompiled into parts of the new query.

-- Region Parameters
DECLARE @p0 Int = 5000
DECLARE @p1 Int = 7000
DECLARE @p2 NVarChar(1000) = 'Good Rock of '
DECLARE @p3 Int = 9000
DECLARE @p4 NVarChar(1000) = 'Great Rock of '
DECLARE @p5 NVarChar(1000) = 'Over 9000!'
-- EndRegion
SELECT
    (CASE
       WHEN [t0].[Rock] < @p1 THEN @p2 + (CONVERT(NVarChar,[t0].[Rock]))
       WHEN [t0].[Rock] < @p3 THEN @p4 + (CONVERT(NVarChar,[t0].[Rock]))
       ELSE CONVERT(NVarChar(MAX), @p5)
    END) AS [value]
FROM [Dbo].[Bar] AS [t0]
WHERE [t0].[Baz] > @p0

Expression Trees in .NET can represent almost any C# code. However, their main use is to produce "serialized" logic that will be translated into something other than C#, often of a functional nature, so compiler support for converting C# code into Expression Trees is limited to expression lambdas, i.e.

Expression<Func<TParam1, TResult>> e1 = (x) => {...}; //statement lambda, error
Expression<Func<TParam1, TResult>> e2 = (x) => ...; //expression lambda, will work*
//*provided the expression inside does not contain anything else that is unsupported

Here are more examples of Expression Tree construction and usage. If you ever wonder how to manually create expression trees representing statement ("body") lambdas, check out BlockExpression and the list of all possible expressions.
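As a rough sketch of the manual route, the following builds the equivalent of a small statement lambda with Expression.Block. The logic (double the input, then add one) and the BlockDemo name are arbitrary choices for illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper, for illustration only.
public static class BlockDemo
{
    // Builds the equivalent of:
    //   x => { var doubled = x * 2; return doubled + 1; }
    // which the C# compiler refuses to convert to an Expression Tree by itself.
    public static Func<int, int> BuildDoublePlusOne()
    {
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        ParameterExpression doubled = Expression.Variable(typeof(int), "doubled");

        BlockExpression body = Expression.Block(
            new[] { doubled },                                   // local variables of the block
            Expression.Assign(
                doubled,
                Expression.Multiply(x, Expression.Constant(2))), // doubled = x * 2
            Expression.Add(doubled, Expression.Constant(1)));    // last expression is the block's value

        return Expression.Lambda<Func<int, int>>(body, x).Compile();
    }
}

// BlockDemo.BuildDoublePlusOne()(20) returns 41
```

A tree like this can be compiled and run, but a query Provider that translates to another language may not support block expressions at all.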

What is important about Expression Trees when it comes to queries is that you can combine them. The new expression, of course, has to represent a valid piece of code, but that mostly comes down to its "return" type and parameter access. For example, let’s take our previous expression

(x) => x.Baz > 55;

The value 55 will be represented by a ConstantExpression with its Type set to Int32. You can swap it for almost any other expression with a compatible return Type...

int someCapturedVar = 55; 
(x) => x.Baz > (someCapturedVar);//will work
(x) => x.Baz > (x.Bac);//will work, if Bac is int
(x) => x.Baz > (DataContext.Counter.Select(c => (int?)c.Value).FirstOrDefault() ?? 0);//should still work...
(x) => x.Baz > (someQueryableUsingSameProvider ?? 0);//..and will probably look like this, read below
(x) => x.Baz > (?);

... and under most circumstances no part of the tree outside of ? needs to change, and nothing in the ? tree needs to know how it is being used. Their only point of coupling is the expected / returned Type. The rules here are almost exactly what you would expect when swapping one piece of C# code for another, except that now you swap expressions.
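One way such a swap could be done in code is with an ExpressionVisitor that replaces the constant node and leaves the rest of the tree alone. This is only a sketch: ConstantSwapper and SwapDemo are made-up names, Bar is a minimal assumed type, and the visitor naively replaces every constant of the matching type.

```csharp
using System;
using System.Linq.Expressions;

// Minimal stand-in for the article's Bar type (assumed shape).
public class Bar
{
    public int Baz { get; set; }
    public int Bac { get; set; }
}

// Hypothetical visitor: swaps every constant whose Type matches the replacement.
public sealed class ConstantSwapper : ExpressionVisitor
{
    private readonly Expression _replacement;

    public ConstantSwapper(Expression replacement)
    {
        _replacement = replacement;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        // The only thing checked is the Type, the point of contact between the two trees.
        return node.Type == _replacement.Type ? _replacement : base.VisitConstant(node);
    }
}

public static class SwapDemo
{
    // Turns "x => x.Baz > 55" into "x => x.Baz > x.Bac" and compiles it.
    public static Func<Bar, bool> BazGreaterThanBac()
    {
        Expression<Func<Bar, bool>> original = x => x.Baz > 55;

        // Reuse the original parameter so the new subtree refers to the same x.
        ParameterExpression parameter = original.Parameters[0];
        Expression bac = Expression.Property(parameter, nameof(Bar.Bac));

        var rewritten = (Expression<Func<Bar, bool>>)new ConstantSwapper(bac).Visit(original);
        return rewritten.Compile();
    }
}

// SwapDemo.BazGreaterThanBac()(new Bar { Baz = 10, Bac = 5 })  returns true
// SwapDemo.BazGreaterThanBac()(new Bar { Baz = 10, Bac = 20 }) returns false
```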

Knowing that

IQueryable consists of a Provider and an Expression Tree
Expression Trees can be combined almost as easily as pieces of C# code

the logical conclusion is that IQueryables should be as easy to combine as IEnumerables. For the most part they are. Even better, advanced providers will actively eliminate dead code for you and optimize execution plans to suit the underlying technology, much like a compiler would.

We will talk about this in part 2, IQueryable composition.

If you want more info on what useful things we can do with expression trees and how we do them, check out part 3, Expression Tree Modification.
