Pattern match Span<char> on a constant string

Note

This article is a feature specification. The specification serves as the design document for the feature. It includes proposed specification changes, along with information needed during the design and development of the feature. These articles are published until the proposed spec changes are finalized and incorporated in the current ECMA specification.

There may be some discrepancies between the feature specification and the completed implementation. Those differences are captured in the pertinent language design meeting (LDM) notes.

You can learn more about the process for adopting feature speclets into the C# language standard in the article on the specifications.

Summary

Permit pattern matching a Span<char> and a ReadOnlySpan<char> on a constant string.

Motivation

For performance, Span<char> and ReadOnlySpan<char> are preferred over string in many scenarios. The framework has added many new APIs that accept a ReadOnlySpan<char> in place of a string.

A common operation on a string is to use a switch to test whether it has a particular value, and the compiler optimizes such a switch. However, there is currently no efficient way to do the same on a ReadOnlySpan<char>, other than implementing the switch and the optimization manually.
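The manual approach mentioned above might look like the following sketch. The names ManualSpanSwitch and ToCode, and the literals being compared, are hypothetical and chosen only for illustration. The chain of sequential comparisons works, but it does not get the dispatch optimizations that the compiler applies to a switch on a string.

```csharp
using System;

static class ManualSpanSwitch
{
    // Hypothetical helper: maps a command name to a numeric code
    // without allocating a string from the span.
    public static int ToCode(ReadOnlySpan<char> s)
    {
        // Each branch compares the span against one literal in turn.
        if (s.SequenceEqual("start".AsSpan())) return 1;
        if (s.SequenceEqual("stop".AsSpan())) return 2;
        if (s.SequenceEqual("pause".AsSpan())) return 3;
        return 0; // no literal matched
    }
}
```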

To encourage adoption of ReadOnlySpan<char>, we allow pattern matching a ReadOnlySpan<char> on a constant string, which also allows it to be used in a switch.

static bool Is123(ReadOnlySpan<char> s)
{
    return s is "123";
}

static bool IsABC(Span<char> s)
{
    return s switch { "ABC" => true, _ => false };
}

Detailed design

We alter the spec for constant patterns as follows (the proposed addition is the second bullet):

Given a pattern input value e and a constant pattern P with converted value v,

  • if e has integral type or enum type, or a nullable form of one of those, and v has integral type, the pattern P matches the value e if the result of the expression e == v is true; otherwise
  • if e is of type System.Span<char> or System.ReadOnlySpan<char>, and c is a constant string, and c does not have a constant value of null, then the pattern is considered matching if System.MemoryExtensions.SequenceEqual<char>(e, System.MemoryExtensions.AsSpan(c)) returns true; otherwise
  • the pattern P matches the value e if object.Equals(e, v) returns true.
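As a rough illustration of the span rule above, the two methods in the following sketch are expected to return the same result for any input. The class and method names are hypothetical, the example assumes a compiler that supports this feature (C# 11 or later), and the explicit call only spells out the semantics in the rule; it is not necessarily the code the compiler emits.

```csharp
using System;

static class ConstantPatternSketch
{
    // Uses the proposed constant pattern directly on a span.
    public static bool MatchesByPattern(ReadOnlySpan<char> e)
    {
        return e is "123";
    }

    // Spells out the call named in the rule; illustrative only.
    public static bool MatchesByCall(ReadOnlySpan<char> e)
    {
        return MemoryExtensions.SequenceEqual<char>(e, MemoryExtensions.AsSpan("123"));
    }
}
```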

Well-known members

System.Span<T> and System.ReadOnlySpan<T> are matched by name, must be ref structs, and can be defined outside corlib.

System.MemoryExtensions is matched by name and can be defined outside corlib.

The signature of System.MemoryExtensions.SequenceEqual overloads must match:

  • public static bool SequenceEqual<T>(System.Span<T>, System.ReadOnlySpan<T>)
  • public static bool SequenceEqual<T>(System.ReadOnlySpan<T>, System.ReadOnlySpan<T>)

The signature of System.MemoryExtensions.AsSpan must match:

  • public static System.ReadOnlySpan<char> AsSpan(string)

Methods with optional parameters are excluded from consideration.

Drawbacks

None

Alternatives

None

Unresolved questions

  1. Should matching be defined independently from MemoryExtensions.SequenceEqual() etc.?

    ... the pattern is considered matching if e.Length == c.Length and e[i] == c[i] for all characters in e.

    Recommendation: Define in terms of MemoryExtensions.SequenceEqual() for performance. If MemoryExtensions is missing, report a compile-time error.

  2. Should matching against (string)null be allowed?

    If so, should (string)null subsume "", since MemoryExtensions.AsSpan(null) and MemoryExtensions.AsSpan("") both produce an empty span and therefore compare equal under SequenceEqual?

    static bool IsEmpty(ReadOnlySpan<char> span)
    {
        return span switch
        {
            (string)null => true, // ok?
            "" => true,           // error: unreachable?
            _ => false,
        };
    }
    

    Recommendation: Constant pattern (string)null should be reported as an error.

  3. Should the constant pattern match include a runtime type test of the expression value for Span<char> or ReadOnlySpan<char>?

    static bool Is123<T>(Span<T> s)
    {
        return s is "123"; // test for Span<char>?
    }
    
    static bool IsABC<T>(Span<T> s)
    {
        return s is Span<char> and "ABC"; // ok?
    }
    
    static bool IsEmptyString<T>(T t) where T : ref struct
    {
        return t is ""; // test for ReadOnlySpan<char>, Span<char>, string?
    }
    

    Recommendation: No implicit runtime type test for constant pattern. (IsABC<T>() example is allowed because the type test is explicit.)

    This recommendation was not implemented. All of the preceding samples produce a compiler error.

  4. Should subsumption consider constant string patterns, list patterns, and Length property pattern?

    static int ToNum(ReadOnlySpan<char> s)
    {
        return s switch
        {
            { Length: 0 } => 0,
            "" => 1,        // error: unreachable?
            ['A',..] => 2,
            "ABC" => 3,     // error: unreachable?
            _ => 4,
        };
    }
    

    Recommendation: Same subsumption behavior as used when the expression value is string. (Does that mean no subsumption between constant strings, list patterns, and Length, other than treating [..] as matching any?)
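For reference, the element-wise definition raised in the first unresolved question (matching lengths and equal characters at every index) can be sketched as follows. The class and method names are hypothetical, and this is only an illustration of that alternative wording, not the behavior the recommendation adopts.

```csharp
using System;

static class ElementwiseMatchSketch
{
    // Hypothetical helper: matches when the lengths agree and
    // every character of e equals the character of c at the same index.
    public static bool Matches(ReadOnlySpan<char> e, string c)
    {
        if (e.Length != c.Length) return false;
        for (int i = 0; i < e.Length; i++)
        {
            if (e[i] != c[i]) return false;
        }
        return true;
    }
}
```

For non-null constants this gives the same answer as the SequenceEqual-based definition; the recommendation prefers SequenceEqual for performance.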

Design meetings

https://github.com/dotnet/csharplang/blob/master/meetings/2020/LDM-2020-10-07.md#readonlyspanchar-patterns