Pattern matching in C# – part 3 – tuples

Hello again, you potential RSS subscriber (probably still reading this in Google Reader, trying to determine which reader you’ll choose next) ! Last time I left you with a working pattern-matching mechanism in C#, but very limited… as it could only match a single value at a time.

What makes the pattern-matching mechanism so powerful in functional languages is its ability to match many sorts of data structures, amongst which are tuples.

Defining a match expression on a tuple or 2 values in F# can be done this way :

let result =
    match input with
    | 1, 1 -> 1
    | 2, _ -> 2
    | _, 3 -> 3
    | a, b when a > 4 -> 4
    | _ -> 5

This sample shows some interesting cases. A tuple of 2 values can be matched with several expressions, each value of the tuple being compared with individual patterns.

In part 1 of this series, we stated these individual patterns:

  • a constant pattern,
  • a conditional pattern,
  • a default pattern, known as the wildcard pattern (the underscore in F#).

Now, we have to handle every possible combination of these patterns!

The following sample translated in C# in my experimental syntax is the following:

var tupleSample =
    PatternMatcher.CreateSwitch<int, int, int>()
    .Case(1         , 1, (a, b) => 1)
    .Case(2         , _, (a, b) => 2)
    .Case(_         , 3, (a, b) => 3)
    .Case(a => a > 4, _, (a, b) => 4)
    .Case(_         , _, (a, b) => 5);

Welcome to the parenthesis hell. Soon it will like like Lisp if we go on… But how do we enable that syntax ?

In order to handle gracefully the wildcard (“_”) pattern, I have used a non-generic MatchCase class with a static field called Any:

public class MatchCase
{
    public static readonly MatchCase Any;

    static MatchCase()
    {
        Any = new MatchCase();
    }

    private MatchCase()
    {
    }
}

This trick allows me to define methods with a MatchCase-typed parameter to handle the wildcard pattern case. Moreover, if I add the following line before my patterns definitions :

var _ = MatchCase.Any;

I’m then able to use a simple “_” to represent the “any” case.

The method used previously to build the pattern matching Func and its invocation in the Eval method doesn’t change: only the “recording” process is impacted. The pain point here is that we need to define each necessary overload of the Case extension method, in order to handle all the patterns. These extensions methods are all based on the following model:

public static PatternMatcher<Tuple<T1, T2>, TResult>
    Case<T1, T2, TResult>(
    this PatternMatcher<Tuple<T1, T2>, TResult> pattern,
    {arg1},
    {arg2},
    Expression<Func<T1, T2, TResult>> selector)

where the {arg1} and {arg2} placeholders are replaced with :

  • T1 testValue1, T2 testValue2
  • T1 testValue1, Expression<Func<T2, bool>> testExpression2
  • T1 testValue1, MatchCase any2
  • Expression<Func<T1, bool>> testExpression1, T2 testValue2
  • Expression<Func<T1, bool>> testExpression1, Expression<Func<T2, bool>> testExpression2
  • Expression<Func<T1, bool>> testExpression1, MatchCase any2
  • MatchCase any1, T2 testValue2
  • MatchCase any1, Expression<Func<T2, bool>> testExpression2
  • MatchCase any1, MatchCase any2

The implementation of each method follows the same pattern as in the previous post, and builds an expression that determines whether a given tuple matches the pattern, returning, and evaluates the selector expression when there is a match.

But if we want to handle tuples with more values… do we have to write all the overloads ?!

Posted in Functional Inspiration, Syntax Puzzles | Tagged , , , | Comments Off

Pattern matching in C# – part 2

Last time, I just stopped before generating any real expression tree… Now is the perfect time to continue the process !

In order to handle match expressions where none of the expressions would match a given candidate, I decided to use an option type. This allows the generation of this type of expressions:

Func<double, Option<double>> matchSomeOrNone =
    (double d) =>
        (d > 0.0)
        ? Option<double>.Some(d)
        : Option<double>.None();

Given this these options, the translation of last time’s sample code:

var stringPattern = PatternMatcher
    .CreateSwitch<string, double>()
    .Case("TEST"                        , s => Math.PI)
    .Case("TEST_1"                      , s => Math.E)
    .Case((string)null                  , s => 0.0)
    .Case(s => s.StartsWith("TEST__")   , s => Math.E * 2)
    .CaseElse(                            s => s.Length);

would become:

Func<string, Option<double>> generatedFuncOption =
    (string s) =>
        (s == "TEST")
            ? Option<double>.Some(Math.PI)
            : (s == "TEST_1")
                ? Option<double>.Some(Math.E)
                : (s == null)
                    ? Option<double>.Some(0.0)
                    : s.StartsWith("TEST__")
                        ? Option<double>.Some(Math.E * 2)
                        : Option<double>.Some(s.Length);

In order to instantiate option types in LINQ expressions, I have to use the corresponding MethodInfo instances, obtained from reflexion :

MethodInfo createSomeMethodInfo =
    typeof(Option<TResult>)
        .GetMethod(
            "Some",
            BindingFlags.Public | BindingFlags.Static);

MethodInfo createNoneMethodInfo =
    typeof(Option<TResult>)
        .GetMethod(
            "None",
            BindingFlags.Public | BindingFlags.Static);

And then, in order to finally build a matching expression, I loop over the match conditions and result selector expressions the following way:

private Func<TSource, Option<TResult>> BuildPatternMatchingFunc()
{
    ParameterExpression parameter =
        Expression.Parameter(typeof(TSource));

    Expression noneExpression = Expression.Call(
        createNoneMethodInfo);

    Expression caseExpression = noneExpression;

    foreach (var matchCase in this.cases.Reverse())
    {
        caseExpression =
            Expression.Condition(
                Expression.Invoke(
                    matchCase.TestExpression,
                    parameter),
                Expression.Call(
                    createSomeMethodInfo,
                    Expression.Invoke(
                        matchCase.SelectorExpression,
                        parameter)),
                caseExpression);
    }

    return Expression
        .Lambda<Func<TSource, Option<TResult>>>(
            caseExpression,
            parameter)
        .Compile();
}

The createNoneMethodInfo and createSomeMethodInfo are the MethodInfo corresponding to the generic Option<T>.None() and Option<T>.Some() methods obtained by reflection.

Debugging this kind of expression building can be complicated, particularly without the ability to visualize clearly the expression tree… Hopefully, I have built an expression tree visualizer in the past (I could also have used the easy path and install the classical one) ! From the C# sample again, I can produce the following expression tree:

This tree may seem wide, but is is just a succession of conditional expressions… Finally, the the function that evaluates the result of the pattern-matching expression is the following one:

public TResult Eval(TSource candidate)
{
    if (this.patternMatchingFunc == null)
    {
        this.patternMatchingFunc = BuildPatternMatchingFunc();
    }

    Option<TResult> matchResult = patternMatchingFunc
                                    .Invoke(candidate);

    if (!matchResult.IsSome)
        throw new NoMatchFoundException();

    return matchResult.Value;
}

As you can see, it’s a wrapper around the expression evaluation invocation. It first generates the expression  instantiates upon the first call (in a non-threadsafe way, but who cares !) and handles “nothing matches” case by throwing an exception. And this works !

Now, you might say that matching a single value in of very limited interest. I can’t deny it. Why don’t we try to match several values, then ?

Posted in Functional Inspiration, Syntax Puzzles | Tagged , , , | Comments Off

Pattern matching in C# – part 1

In my previous post, I tried to explain what pattern matching is, and in particular the F# match expression, that I’m trying to mimic in C#. For my first attempt, I’ll try to analyse the matching cases that can apply on a single value. Let’s start right now with an F# sample[1]:

let test str =
    match str with
    | "TEST" -> Math.PI
    | "TEST_1" -> Math.E
    | null -> 0.0
    | s when s.StartsWith("TEST__") -> Math.E * 2.0
    | s -> float s.Length

If we analyse these patterns, we see 3 different cases:

  • a constant pattern (including the null value),
  • a conditional pattern (the lambda one)
  • a default pattern (the else case).

So in C#, what I’ve managed to build is the following:

var stringPattern = PatternMatcher
.CreateSwitch<string, double>()
.Case("TEST"                        , s => Math.PI)
.Case("TEST_1"                      , s => Math.E)
.Case((string)null                  , s => 0.0)
.Case(s => s.StartsWith("TEST__")   , s => Math.E * 2)
.CaseElse(                            s => s.Length);

What is difficult here is to write something that is valid C#, because:

  1. I can’t change the compiler,
  2. To keep things simple, I don’t want to use Roslyn in the first attempt (although that might come later).

Recording expressions for individual tests

This solution is mainly based upon a generic class called PatternMatcher<TSource, TResult>, which represents the match expression being built. Each Case overload builds an instance of a MatchCase class. As with many “fluent” syntax nowadays, in order to chain the calls in a single statement, each Case overload returns a new instance of the PatternMatcher, which keeps track of all the cases registered so far.

The MatchCase’s sole purpose is to encapsulate two expressions:

  • a test expression that will be used to determine if a particular source value matches the case: this will be an expression of Func<TSource, bool>,
  • a selector expression that will be used to produce the result of the match expression when the input value matches this particular case: this will be an expression of Func<TSource, TResult>.

Getting an Expression of a given Func type is nothing more that changing the type of a variable. The compiler will change its work accordingly, and produce an Expression from a given lambda-expression, instead of a typed delegate.

The most general case is the conditional pattern. Let’s take a look at the following extension method:

public static PatternMatcher<TSource, TResult>
    Case<TSource, TResult>(
        this PatternMatcher<TSource, TResult> pattern,
        Expression<Func<TSource, bool>> testExpression,
        Expression<Func<TSource, TResult>> selector)

You can see that the testExpression and selector parameters are already the types that my MatchCase class needs. From this values, I just have to instantiate the class and store it for later use.

The two other  cases (constant and default) will also provide directly the selector expression, only the test expression will have to be provided:

  • as an expression that tests equality with a given constant, for the constant pattern,
  • as an expression that is always true, for the default pattern.

These cases are just special cases of the previous one, called with an appropriate lambda, which will be converted to the test expression.

Building an expression tree

Of course, what we want to do is more that just record the expressions ! The next step is to combine them in a global match expression.

The kind of code I would like to produce, for the previous sample, would be the following:

Func<string, double> generatedFunc =
(string s) =>
    {
        if (s == "TEST") return Math.PI;
        if (s == "TEST_1") return Math.E;
        if (s == null) return 0.0;
        if (s.StartsWith("TEST__")) return Math.E * 2;
        return s.Length;
    };

Or even “better”, the same, but with conditional expressions instead of if and return statements:

Func<string, double> generatedFuncNoIf =
    (string s) =>
        (s == "TEST")
            ? Math.PI
            : (s == "TEST_1")
                ? Math.E
                : (s == null)
                    ? 0.0
                    : s.StartsWith("TEST__")
                        ? Math.E * 2
                        : s.Length;

An expression corresponding to the previous code could be generated, but I also want to handle an additional case: when there is no match. This means that my expression has to be able to return some result or… none. Some or none ? Isn’t there a type for that ? We’ll see how to use it next time.

[1] I know this sample is useless and theoretical, but isn’t this whole series ? If you really want to use match expressions, embrace F#, not C# !

Posted in Functional Inspiration, Syntax Puzzles | Tagged , , | Comments Off

Pattern matching in C# – introduction

Although bringing real pattern matching to C# in not my objective, I’ve tackled yet another syntax puzzle, trying to build a “sort-of” F#-ish pattern-matching-like syntax.
Disclaimer : as you can see from the quotes and italic, don’t expect me to build a fully-functional F# pattern matching in C#.

What is pattern matching ?

From the MSDN entry on Pattern matching :

Patterns are rules for transforming input data. They are used throughout the F# language to compare data with a logical structure or structures, decompose data into constituent parts, or extract information from data in various ways.

The point I want to focus on is in fact the F# match expression. The match expression is a powerful way to express conditions over data. It is often described as a “switch statement on steroids”. Its characteristics are the following :

  • Just like a switch, it takes a value as its input,
  • Unlike a switch, the value is not limited only to enums or built-in value types.
  • It has a return type. If there is nothing to return, F# uses a special unit type, but there is no void as a return type neither a null value.
  • If the values to return can be of different types depending on the input, Discriminated Unions (DUs) can be used as return types.

I’ll start with a simple F# sample, to show what a discriminated union is :

type Shape =
    | Point
    | Circle of float
    | Rectangle of float * float

In this sample, a type Shape is defined, that can be either a Point, a Circle or a Rectangle. Although the Point could be considered a simple enum value, the Circle and Rectangle cases, unlike enums, carry data. For instance :

let p = Point
let c = Circle(5.0)
let r = Rectangle(10.0, 20.0)

If we want to extract the data components of a particular union type, we use pattern-matching in the following way :

let Rectangle(w, h) = r

This expression binds the value of w and h to the data elements of the r Rectangle defined earlier. But my purpose in the blog series is to introduce in C# a syntax that would look like a match expression, which in F# is the following :

let getArea s =
    match s with
    | Point -> 0.0
    | Circle(r) -> r * r * System.Math.PI
    | Rectangle(w, h) -> w * h

In this attempt, I first consider matching simple values (which has no practical interest of any sort), then matching tuples, and I’ll finish with matching against some sort of Discriminated Union, although DUs are not a native C# construct.

First, let’s consider what we want to build. From the sample above, we can see that what we get from the match expression is a “getArea” that takes an instance of the union type as input, and gives a float as output. As this is just the definition of a function, it looks like we’re trying to build a construct that generates instances of the generic Func<TInput, TOutput> delegate.

In F#, match expressions can also be used to perform actions, and in this case will return the special unit type. To perform actions in C# we have the Action<T> delegate type. In this series I’ll consider only the functions, but actions could be built exactly the same way.

In the next part, I’ll show my first naive attempt to deal with simple values, before targeting any tuples or discriminated union types.

Posted in Functional Inspiration, Syntax Puzzles | Tagged , , | Comments Off

Functional Inspiration – A simple Option type

In F#, the option type is a built-in Discriminated Union type which allows to handle nicely the fact that an expression can represent some value or none. In C#, we have almost this construct when we use the Nullable<T>, but as I’ve already stated before, this type is available only for value types. Here is an option type that deals also with reference types.

I’ve already played with an Irreleventable<T> type in the past, and the implementation of it is quite the same as what I’ll do today, but the semantics are different, and I don’t want to re-use this type in a misleading way.

So today, I bring you “yet another option type” in C# :

public struct Option<T>
{
    private readonly bool isSome;
    public bool IsSome
    {
        get
        {
            return isSome;
        }
    }

    private readonly T value;
    public T Value
    {
        get
        {
            if (!isSome) throw new NotSupportedException();
            return value;
        }
    }

    private Option(bool isSome, T value)
    {
        this.isSome = isSome;
        this.value = value;
    }

    public static Option<T> Some(T value)
    {
        return new Option<T>(true, value);
    }

    public static Option<T> None()
    {
        return new Option<T>(false, default(T));
    }

    public override int GetHashCode()
    {
        return isSome
            ? value.GetHashCode()
            : -1;
    }

    public override bool Equals(object other)
    {
        if (!(other is Option<T>))
            return false;

        Option<T> otherOption = (Option<T>)other;

        return IsSome
            ? (value == null && otherOption.Value == null)
                || value.Equals(otherOption.Value)
            : !otherOption.IsSome;
    }

    public override string ToString()
    {
        return this.IsSome
            ? this.value.ToString()
            : String.Empty;
    }
}

The only noticeable differences with the Irreleventable type are the (chosen) lack of conversion operators, and the equality comparison, as an Option<T> will never be considered equal to an instance of T (unlike the Irreleventable).

Why would I write a blog post simply to introduce this type, without any context ? More to come soon.

Posted in Functional Inspiration | Tagged , | Comments Off