Collection expressions

Note

This article is a feature specification. The specification serves as the design document for the feature. It includes proposed specification changes, along with information needed during the design and development of the feature. These articles are published until the proposed spec changes are finalized and incorporated in the current ECMA specification.

There may be some discrepancies between the feature specification and the completed implementation. Those differences are captured in the pertinent language design meeting (LDM) notes.

You can learn more about the process for adopting feature speclets into the C# language standard in the article on the specifications.

Summary

Collection expressions introduce a new terse syntax, [e1, e2, e3, etc], to create common collection values. Inlining other collections into these values is possible using a spread element ..e like so: [e1, ..c2, e2, ..c2].

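Several collection-like types can be created without requiring external BCL support. These types are:

  • Array types, such as int[].
  • Span<T> and ReadOnlySpan<T>.
  • Types that support collection initializers, such as List<T>.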

Further support is present for collection-like types not covered under the above through a new attribute and API pattern that can be adopted directly on the type itself.
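
As a quick illustration of the syntax summarized above, the following sketch creates a few common collection values and uses spread elements. It assumes a C# 12 (or later) compiler; the variable names are purely illustrative.

```csharp
using System;
using System.Collections.Generic;

// Arrays, collection-initializer types, and spans share one syntax.
int[] array = [1, 2, 3];
List<string> names = ["Ada", "Grace"];
ReadOnlySpan<char> vowels = ['a', 'e', 'i', 'o', 'u'];

// A spread element ..e inlines the items of another collection.
int[] more = [0, ..array, 4, ..array];

Console.WriteLine(string.Join(", ", more)); // 0, 1, 2, 3, 4, 1, 2, 3
```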

Motivation

  • Collection-like values are hugely present in programming, algorithms, and especially in the C#/.NET ecosystem. Nearly all programs will utilize these values to store data and send or receive data from other components. Currently, almost all C# programs must use many different and unfortunately verbose approaches to create instances of such values. Some approaches also have performance drawbacks. Here are some common examples:

    • Arrays, which require either new Type[] or new[] before the { ... } values.
    • Spans, which may use stackalloc and other cumbersome constructs.
    • Collection initializers, which require syntax like new List<T> (lacking inference of a possibly verbose T) prior to their values, and which can cause multiple reallocations of memory because they use N .Add invocations without supplying an initial capacity.
    • Immutable collections, which require syntax like ImmutableArray.Create(...) to initialize the values, and which can cause intermediary allocations and data copying. More efficient construction forms (like ImmutableArray.CreateBuilder) are unwieldy and still produce unavoidable garbage.
  • Looking at the surrounding ecosystem, we also find examples everywhere of list creation being more convenient and pleasant to use. TypeScript, Dart, Swift, Elm, Python, and more opt for a succinct syntax for this purpose, with widespread usage, and to great effect. Cursory investigations have revealed no substantive problems arising in those ecosystems with having these literals built in.

  • C# has also added list patterns in C# 11. This pattern allows matching and deconstruction of list-like values using a clean and intuitive syntax. However, unlike almost all other pattern constructs, this matching/deconstruction syntax lacks the corresponding construction syntax.

  • Getting the best performance for constructing each collection type can be tricky. Simple solutions often waste both CPU and memory. Having a literal form allows for maximum flexibility from the compiler implementation to optimize the literal to produce at least as good a result as a user could provide, but with simple code. Very often the compiler will be able to do better, and the specification aims to allow the implementation large amounts of leeway in terms of implementation strategy to ensure this.

An inclusive solution is needed for C#. It should meet the vast majority of cases for customers in terms of the collection-like types and values they already have. It should also feel natural in the language and mirror the work done in pattern matching.

This leads to a natural conclusion that the syntax should be like [e1, e2, e3, e-etc] or [e1, ..c2, e2], which correspond to the pattern equivalents of [p1, p2, p3, p-etc] and [p1, ..p2, p3].

Detailed design

The following grammar productions are added:

primary_no_array_creation_expression
  ...
+ | collection_expression
  ;

+ collection_expression
  : '[' ']'
  | '[' collection_element ( ',' collection_element )* ']'
  ;

+ collection_element
  : expression_element
  | spread_element
  ;

+ expression_element
  : expression
  ;

+ spread_element
  : '..' expression
  ;

Collection literals are target-typed.

Spec clarifications

  • For brevity, collection_expression will be referred to as "literal" in the following sections.

  • expression_element instances will commonly be referred to as e1, e_n, etc.

  • spread_element instances will commonly be referred to as ..s1, ..s_n, etc.

  • span type means either Span<T> or ReadOnlySpan<T>.

  • Literals will commonly be shown as [e1, ..s1, e2, ..s2, etc] to convey any number of elements in any order. Importantly, this form will be used to represent all cases such as:

    • Empty literals []
    • Literals with no expression_element in them.
    • Literals with no spread_element in them.
    • Literals with arbitrary ordering of any element type.
  • The iteration type of ..s_n is the type of the iteration variable determined as if s_n were used as the expression being iterated over in a foreach_statement.

  • Variables starting with __name are used to represent the results of the evaluation of name, stored in a location so that it is only evaluated once. For example __e1 is the evaluation of e1.

  • List<T>, IEnumerable<T>, etc. refer to the respective types in the System.Collections.Generic namespace.

  • The specification defines a translation of the literal to existing C# constructs. Similar to the query expression translation, the literal is itself only legal if the translation would result in legal code. The purpose of this rule is to avoid having to repeat other rules of the language that are implied (for example, about convertibility of expressions when assigned to storage locations).

  • An implementation is not required to translate literals exactly as specified below. Any translation is legal if the same result is produced and there are no observable differences in the production of the result.

    • For example, an implementation could translate literals like [1, 2, 3] directly to a new int[] { 1, 2, 3 } expression that itself bakes the raw data into the assembly, eliding the need for __index or a sequence of instructions to assign each value. Importantly, this does mean that if any step of the translation might cause an exception at run time, the program state is still left in the state indicated by the translation.
  • References to 'stack allocation' refer to any strategy to allocate on the stack and not the heap. Importantly, it does not imply or require that that strategy be through the actual stackalloc mechanism. For example, the use of inline arrays is also an allowed and desirable approach to accomplish stack allocation where available. Note that in C# 12, inline arrays can't be initialized with a collection expression. That remains an open proposal.

  • Collections are assumed to be well-behaved. For example:

    • It is assumed that the value of Count on a collection will be the same as the number of elements produced when the collection is enumerated.
    • The types used in this spec defined in the System.Collections.Generic namespace are presumed to be side-effect free. As such, the compiler can optimize scenarios where such types might be used as intermediary values, but otherwise not be exposed.
    • It is assumed that a call to some applicable .AddRange(x) member on a collection will result in the same final value as iterating over x and adding all of its enumerated values individually to the collection with .Add.
    • The behavior of collection literals with collections that are not well-behaved is undefined.

Conversions

A collection expression conversion allows a collection expression to be converted to a type.

An implicit collection expression conversion exists from a collection expression to the following types:

  • A single dimensional array type T[], in which case the element type is T
  • A span type:
    • System.Span<T>
    • System.ReadOnlySpan<T>
      in which cases the element type is T
  • A type with an appropriate create method, in which case the element type is the iteration type determined from a GetEnumerator instance method or enumerable interface, not from an extension method
  • A struct or class type that implements System.Collections.IEnumerable where:
    • The type has an applicable constructor that can be invoked with no arguments, and the constructor is accessible at the location of the collection expression.

    • If the collection expression has any elements, the type has an instance or extension method Add where:

      • The method can be invoked with a single value argument.
      • If the method is generic, the type arguments can be inferred from the collection and argument.
      • The method is accessible at the location of the collection expression.

      In which case the element type is the iteration type of the type.

  • An interface type:
    • System.Collections.Generic.IEnumerable<T>
    • System.Collections.Generic.IReadOnlyCollection<T>
    • System.Collections.Generic.IReadOnlyList<T>
    • System.Collections.Generic.ICollection<T>
    • System.Collections.Generic.IList<T>
      in which cases the element type is T

The implicit conversion exists if the type has an element type U where for each element Eᵢ in the collection expression:

  • If Eᵢ is an expression element, there is an implicit conversion from Eᵢ to U.
  • If Eᵢ is a spread element ..Sᵢ, there is an implicit conversion from the iteration type of Sᵢ to U.

There is no collection expression conversion from a collection expression to a multi dimensional array type.

Types for which there is an implicit collection expression conversion from a collection expression are the valid target types for that collection expression.

The following additional implicit conversions exist from a collection expression:

  • To a nullable value type T? where there is a collection expression conversion from the collection expression to a value type T. The conversion is a collection expression conversion to T followed by an implicit nullable conversion from T to T?.

  • To a reference type T where there is a create method associated with T that returns a type U and an implicit reference conversion from U to T. The conversion is a collection expression conversion to U followed by an implicit reference conversion from U to T.

  • To an interface type I where there is a create method associated with I that returns a type V and an implicit boxing conversion from V to I. The conversion is a collection expression conversion to V followed by an implicit boxing conversion from V to I.
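
The conversion rules above can be seen in a short sketch, one declaration per kind of target type. It assumes C# 12 on .NET 8 or later, where ImmutableArray<T> carries a [CollectionBuilder] attribute; all variable names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;

int[] array = [1, 2, 3];                    // single-dimensional array, element type int
Span<int> span = [1, 2, 3];                 // span type
HashSet<int> set = [1, 2, 2, 3];            // IEnumerable implementer: parameterless constructor + Add
IReadOnlyList<int> readOnly = [1, 2, 3];    // interface type
ImmutableArray<int> immutable = [1, 2, 3];  // type with a create method
ImmutableArray<int>? nullable = [1, 2, 3];  // conversion to T, then from T to T?

// Each element only needs an implicit conversion to the element type:
// int converts to long, and so does the spread's iteration type int.
long[] widened = [1, ..array, 4L];

// int[,] grid = [[1, 2], [3, 4]];          // error: no conversion to a multi-dimensional array

Console.WriteLine(set.Count);      // 3, because Add ignores the duplicate 2
Console.WriteLine(widened.Length); // 5
```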

Create methods

A create method is indicated with a [CollectionBuilder(...)] attribute on the collection type. The attribute specifies the builder type and method name of a method to be invoked to construct an instance of the collection type.

namespace System.Runtime.CompilerServices
{
    [AttributeUsage(
        AttributeTargets.Class | AttributeTargets.Struct | AttributeTargets.Interface,
        Inherited = false,
        AllowMultiple = false)]
    public sealed class CollectionBuilderAttribute : System.Attribute
    {
        public CollectionBuilderAttribute(Type builderType, string methodName);
        public Type BuilderType { get; }
        public string MethodName { get; }
    }
}

The attribute can be applied to a class, struct, ref struct, or interface. The attribute is not inherited although the attribute can be applied to a base class or an abstract class.

The builder type must be a non-generic class or struct.

First, the set of applicable create methods CM is determined.
It consists of methods that meet the following requirements:

  • The method must have the name specified in the [CollectionBuilder(...)] attribute.
  • The method must be defined on the builder type directly.
  • The method must be static.
  • The method must be accessible where the collection expression is used.
  • The arity of the method must match the arity of the collection type.
  • The method must have a single parameter of type System.ReadOnlySpan<E>, passed by value.
  • There is an identity conversion, implicit reference conversion, or boxing conversion from the method return type to the collection type.

Methods declared on base types or interfaces are ignored and not part of the CM set.

If the CM set is empty, then the collection type doesn't have an element type and doesn't have a create method. None of the following steps apply.

If only one method among those in the CM set has an identity conversion from E to the element type of the collection type, that is the create method for the collection type. Otherwise, the collection type doesn't have a create method.

An error is reported if the [CollectionBuilder] attribute does not refer to an invokable method with the expected signature.

For a collection expression with a target type C<S0, S1, …> where the type declaration C<T0, T1, …> has an associated builder method B.M<U0, U1, …>(), the generic type arguments from the target type are applied in order — and from outermost containing type to innermost — to the builder method.

The span parameter for the create method can be explicitly marked scoped or [UnscopedRef]. If the parameter is implicitly or explicitly scoped, the compiler may allocate the storage for the span on the stack rather than the heap.

For example, a possible create method for ImmutableArray<T>:

[CollectionBuilder(typeof(ImmutableArray), "Create")]
public struct ImmutableArray<T> { ... }

public static class ImmutableArray
{
    public static ImmutableArray<T> Create<T>(ReadOnlySpan<T> items) { ... }
}

With the create method above, ImmutableArray<int> ia = [1, 2, 3]; could be emitted as:

[InlineArray(3)] struct __InlineArray3<T> { private T _element0; }

Span<int> __tmp = new __InlineArray3<int>();
__tmp[0] = 1;
__tmp[1] = 2;
__tmp[2] = 3;
ImmutableArray<int> ia =
    ImmutableArray.Create((ReadOnlySpan<int>)__tmp);
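
The same pattern can be adopted by user-defined types. The following is a minimal sketch, assuming C# 12 on .NET 8 or later; Bag<T> and BagBuilder are hypothetical names invented for this example and are not part of any library.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

Bag<string> bag = ["a", "b", "c"];   // the compiler calls BagBuilder.Create<string>(span)
Console.WriteLine(bag.Count);        // 3

// Hypothetical collection type; the attribute names the builder type and method.
[CollectionBuilder(typeof(BagBuilder), nameof(BagBuilder.Create))]
public sealed class Bag<T> : IEnumerable<T>
{
    private readonly T[] _items;
    internal Bag(T[] items) => _items = items;
    public int Count => _items.Length;

    // The iteration type (T) found through GetEnumerator is the element type.
    public IEnumerator<T> GetEnumerator() => ((IEnumerable<T>)_items).GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

// Non-generic builder type. The method is static, has the same arity as Bag<T>,
// and takes a single ReadOnlySpan<T> parameter by value.
public static class BagBuilder
{
    public static Bag<T> Create<T>(ReadOnlySpan<T> items) => new Bag<T>(items.ToArray());
}
```

Copying the span with ToArray keeps the sketch simple; a real builder is free to store the items however it likes, as long as it does not hold on to the span itself.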

Construction

The elements of a collection expression are evaluated in order, left to right. Each element is evaluated exactly once, and any further references to the elements refer to the results of this initial evaluation.

A spread element may be iterated before or after the subsequent elements in the collection expression are evaluated.

An unhandled exception thrown from any of the methods used during construction will be uncaught and will prevent further steps in the construction.

Length, Count, and GetEnumerator are assumed to have no side effects.


If the target type is a struct or class type that implements System.Collections.IEnumerable, and the target type does not have a create method, the construction of the collection instance is as follows:

  • The elements are evaluated in order. Some or all elements may be evaluated during the steps below rather than before.

  • The compiler may determine the known length of the collection expression by invoking countable properties — or equivalent properties from well-known interfaces or types — on each spread element expression.

  • The constructor that is applicable with no arguments is invoked.

  • For each element in order:

    • If the element is an expression element, the applicable Add instance or extension method is invoked with the element expression as the argument. (Unlike classic collection initializer behavior, element evaluation and Add calls are not necessarily interleaved.)
    • If the element is a spread element then one of the following is used:
      • An applicable GetEnumerator instance or extension method is invoked on the spread element expression and for each item from the enumerator the applicable Add instance or extension method is invoked on the collection instance with the item as the argument. If the enumerator implements IDisposable, then Dispose will be called after enumeration, regardless of exceptions.
      • An applicable AddRange instance or extension method is invoked on the collection instance with the spread element expression as the argument.
      • An applicable CopyTo instance or extension method is invoked on the spread element expression with the collection instance and int index as arguments.
  • During the construction steps above, an applicable EnsureCapacity instance or extension method may be invoked one or more times on the collection instance with an int capacity argument.
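
The steps above can be sketched by hand for a List<int> target. This is only one of the lowerings the rules permit, not necessarily what a given compiler emits; the __-prefixed names follow the convention used in this specification.

```csharp
using System;
using System.Collections.Generic;

int a = 1;
int[] b = [2, 3];

// What the programmer writes:
List<int> viaExpression = [a, ..b];

// One permitted lowering, written by hand:
var __a = a;                              // elements are evaluated left to right, once
var __b = b;
var __result = new List<int>();           // constructor applicable with no arguments
__result.EnsureCapacity(1 + __b.Length);  // optional; uses the known length
__result.Add(__a);                        // expression element: Add
__result.AddRange(__b);                   // spread element: AddRange (or GetEnumerator plus Add)

Console.WriteLine(string.Join(", ", __result)); // 1, 2, 3
```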


If the target type is an array, a span, a type with a create method, or an interface, the construction of the collection instance is as follows:

  • The elements are evaluated in order. Some or all elements may be evaluated during the steps below rather than before.

  • The compiler may determine the known length of the collection expression by invoking countable properties — or equivalent properties from well-known interfaces or types — on each spread element expression.

  • An initialization instance is created as follows:

    • If the target type is an array and the collection expression has a known length, an array is allocated with the expected length.
    • If the target type is a span or a type with a create method, and the collection has a known length, a span with the expected length is created referring to contiguous storage.
    • Otherwise intermediate storage is allocated.
  • For each element in order:

    • If the element is an expression element, the initialization instance indexer is invoked to add the evaluated expression at the current index.
    • If the element is a spread element then one of the following is used:
      • A member of a well-known interface or type is invoked to copy items from the spread element expression to the initialization instance.
      • An applicable GetEnumerator instance or extension method is invoked on the spread element expression and for each item from the enumerator, the initialization instance indexer is invoked to add the item at the current index. If the enumerator implements IDisposable, then Dispose will be called after enumeration, regardless of exceptions.
      • An applicable CopyTo instance or extension method is invoked on the spread element expression with the initialization instance and int index as arguments.
  • If intermediate storage was allocated for the collection, a collection instance is allocated with the actual collection length and the values from the initialization instance are copied to the collection instance, or if a span is required the compiler may use a span of the actual collection length from the intermediate storage. Otherwise the initialization instance is the collection instance.

  • If the target type has a create method, the create method is invoked with the span instance.


Note: The compiler may delay adding elements to the collection — or delay iterating through spread elements — until after evaluating subsequent elements. (When subsequent spread elements have countable properties that would allow calculating the expected length of the collection before allocating the collection.) Conversely, the compiler may eagerly add elements to the collection — and eagerly iterate through spread elements — when there is no advantage to delaying.

Consider the following collection expression:

int[] x = [a, ..b, ..c, d];

If spread elements b and c are countable, the compiler could delay adding items from a and b until after c is evaluated, to allow allocating the resulting array at the expected length. After that, the compiler could eagerly add items from c, before evaluating d.

var __tmp1 = a;
var __tmp2 = b;
var __tmp3 = c;
var __result = new int[2 + __tmp2.Length + __tmp3.Length];
int __index = 0;
__result[__index++] = __tmp1;
foreach (var __i in __tmp2) __result[__index++] = __i;
foreach (var __i in __tmp3) __result[__index++] = __i;
__result[__index++] = d;
x = __result;

Empty collection literal

  • The empty literal [] has no type. However, similar to the null-literal, this literal can be implicitly converted to any constructible collection type.

    For example, the following is not legal as there is no target type and there are no other conversions involved:

    var v = []; // illegal
    
  • Spreading an empty literal is permitted to be elided. For example:

    bool b = ...
    List<int> l = [x, y, .. b ? [1, 2, 3] : []];
    

    Here, if b is false, it is not required that any value actually be constructed for the empty collection expression since it would immediately be spread into zero values in the final literal.

  • The empty collection expression is permitted to be a singleton if used to construct a final collection value that is known to not be mutable. For example:

    // Can be a singleton, like Array.Empty<int>()
    int[] x = []; 
    
    // Can be a singleton. Allowed to use Array.Empty<int>(), Enumerable.Empty<int>(),
    // or any other implementation that can not be mutated.
    IEnumerable<int> y = [];
    
    // Must not be a singleton.  Value must be allowed to mutate, and should not mutate
    // other references elsewhere.
    List<int> z = [];
    

Ref safety

See safe context constraint for definitions of the safe-context values: declaration-block, function-member, and caller-context.

The safe-context of a collection expression is:

  • The safe-context of an empty collection expression [] is the caller-context.

  • If the target type is a span type System.ReadOnlySpan<T>, and T is one of the primitive types bool, sbyte, byte, short, ushort, char, int, uint, long, ulong, float, or double, and the collection expression contains constant values only, the safe-context of the collection expression is the caller-context.

  • If the target type is a span type System.Span<T> or System.ReadOnlySpan<T>, the safe-context of the collection expression is the declaration-block.

  • If the target type is a ref struct type with a create method, the safe-context of the collection expression is the safe-context of an invocation of the create method where the collection expression is the span argument to the method.

  • Otherwise the safe-context of the collection expression is the caller-context.

A collection expression with a safe-context of declaration-block cannot escape the enclosing scope, and the compiler may store the collection on the stack rather than the heap.

To allow a collection expression for a ref struct type to escape the declaration-block, it may be necessary to cast the expression to another type.

static ReadOnlySpan<int> AsSpanConstants()
{
    return [1, 2, 3]; // ok: span refers to assembly data section
}

static ReadOnlySpan<T> AsSpan2<T>(T x, T y)
{
    return [x, y];    // error: span may refer to stack data
}

static ReadOnlySpan<T> AsSpan3<T>(T x, T y, T z)
{
    return (T[])[x, y, z]; // ok: span refers to T[] on heap
}

Type inference

var a = AsArray([1, 2, 3]);          // AsArray<int>(int[])
var b = AsListOfArray([[4, 5], []]); // AsListOfArray<int>(List<int[]>)

static T[] AsArray<T>(T[] arg) => arg;
static List<T[]> AsListOfArray<T>(List<T[]> arg) => arg;

The type inference rules are updated as follows.

The existing rules for the first phase are extracted to a new input type inference section, and a rule is added to input type inference and output type inference for collection expressions.

11.6.3.2 The first phase

For each of the method arguments Eᵢ:

  • An input type inference is made from Eᵢ to the corresponding parameter type Tᵢ.

An input type inference is made from an expression E to a type T in the following way:

  • If E is a collection expression with elements Eᵢ, and T is a type with an element type Tₑ or T is a nullable value type T0? and T0 has an element type Tₑ, then for each Eᵢ:
    • If Eᵢ is an expression element, then an input type inference is made from Eᵢ to Tₑ.
    • If Eᵢ is a spread element with an iteration type Sᵢ, then a lower-bound inference is made from Sᵢ to Tₑ.
  • [existing rules from first phase] ...

11.6.3.7 Output type inferences

An output type inference is made from an expression E to a type T in the following way:

  • If E is a collection expression with elements Eᵢ, and T is a type with an element type Tₑ or T is a nullable value type T0? and T0 has an element type Tₑ, then for each Eᵢ:
    • If Eᵢ is an expression element, then an output type inference is made from Eᵢ to Tₑ.
    • If Eᵢ is a spread element, no inference is made from Eᵢ.
  • [existing rules from output type inferences] ...
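
To see how expression elements and spread elements both contribute to inference, consider the following sketch (assuming C# 12 or later). AsArray mirrors the helper shown earlier in this section; the other names are illustrative.

```csharp
using System;

long[] longs = [10L, 20L];

// The expression element 1 contributes int as a lower bound for T.
// The spread contributes its iteration type, long, as another lower bound.
// long is the only candidate that both bounds convert to, so T is long.
var inferred = AsArray([1, ..longs]);

Console.WriteLine(inferred.GetType()); // System.Int64[]

static T[] AsArray<T>(T[] arg) => arg;
```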

Extension methods

No changes to extension method invocation rules.

12.8.10.3 Extension method invocations

An extension method Cᵢ.Mₑ is eligible if:

  • ...
  • An implicit identity, reference, or boxing conversion exists from expr to the type of the first parameter of Mₑ.

A collection expression does not have a natural type so the existing conversions from type are not applicable. As a result, a collection expression cannot be used directly as the first parameter for an extension method invocation.

static class Extensions
{
    public static ImmutableArray<T> AsImmutableArray<T>(this ImmutableArray<T> arg) => arg;
}

var x = [1].AsImmutableArray();           // error: collection expression has no target type
var y = [2].AsImmutableArray<int>();      // error: ...
var z = Extensions.AsImmutableArray([3]); // ok

Overload resolution

Better conversion from expression is updated to prefer certain target types in collection expression conversions.

In the updated rules:

  • A span_type is one of:
    • System.Span<T>
    • System.ReadOnlySpan<T>.
  • An array_or_array_interface is one of:
    • an array type
    • one of the following interface types implemented by an array type:
      • System.Collections.Generic.IEnumerable<T>
      • System.Collections.Generic.IReadOnlyCollection<T>
      • System.Collections.Generic.IReadOnlyList<T>
      • System.Collections.Generic.ICollection<T>
      • System.Collections.Generic.IList<T>

Given an implicit conversion C₁ that converts from an expression E to a type T₁, and an implicit conversion C₂ that converts from an expression E to a type T₂, C₁ is a better conversion than C₂ if one of the following holds:

  • E is a collection expression and one of the following holds:
    • T₁ is System.ReadOnlySpan<E₁>, and T₂ is System.Span<E₂>, and an implicit conversion exists from E₁ to E₂
    • T₁ is System.ReadOnlySpan<E₁> or System.Span<E₁>, and T₂ is an array_or_array_interface with element type E₂, and an implicit conversion exists from E₁ to E₂
    • T₁ is not a span_type, and T₂ is not a span_type, and an implicit conversion exists from T₁ to T₂
  • E is not a collection expression and one of the following holds:
    • E exactly matches T₁ and E does not exactly match T₂
    • E exactly matches both or neither of T₁ and T₂, and T₁ is a better conversion target than T₂
  • E is a method group, ...

Examples of differences with overload resolution between array initializers and collection expressions:

static void Generic<T>(Span<T> value) { }
static void Generic<T>(T[] value) { }

static void SpanDerived(Span<string> value) { }
static void SpanDerived(object[] value) { }

static void ArrayDerived(Span<object> value) { }
static void ArrayDerived(string[] value) { }

// Array initializers
Generic(new[] { "" });      // string[]
SpanDerived(new[] { "" });  // ambiguous
ArrayDerived(new[] { "" }); // string[]

// Collection expressions
Generic([""]);              // Span<string>
SpanDerived([""]);          // Span<string>
ArrayDerived([""]);         // ambiguous

Span types

The span types ReadOnlySpan<T> and Span<T> are both constructible collection types. Support for them follows the design for params Span<T>. Specifically, constructing either of those spans will result in an array T[] created on the stack if the params array is within limits (if any) set by the compiler. Otherwise the array will be allocated on the heap.

If the compiler chooses to allocate on the stack, it is not required to translate a literal directly to a stackalloc at that specific point. For example, given:

foreach (var x in y)
{
    Span<int> span = [a, b, c];
    // do things with span
}

The compiler is allowed to translate that using stackalloc as long as the Span meaning stays the same and span-safety is maintained. For example, it can translate the above to:

Span<int> __buffer = stackalloc int[3];
foreach (var x in y)
{
    __buffer[0] = a;
    __buffer[1] = b;
    __buffer[2] = c;
    Span<int> span = __buffer;
    // do things with span
}

The compiler can also use inline arrays, if available, when choosing to allocate on the stack. Note that in C# 12, inline arrays can't be initialized with a collection expression. That feature is an open proposal.

If the compiler decides to allocate on the heap, the translation for Span<T> is simply:

T[] __array = [...]; // using existing rules
Span<T> __result = __array;

Collection literal translation

A collection expression has a known length if the compile-time type of each spread element in the collection expression is countable.

Interface translation

Non-mutable interface translation

Given a target type which does not contain mutating members, namely IEnumerable<T>, IReadOnlyCollection<T>, and IReadOnlyList<T>, a compliant implementation is required to produce a value that implements that interface. If a type is synthesized, it is recommended the synthesized type implements all these interfaces, as well as ICollection<T> and IList<T>, regardless of which interface type was targeted. This ensures maximal compatibility with existing libraries, including those that introspect the interfaces implemented by a value in order to light up performance optimizations.

In addition, the value must implement the nongeneric ICollection and IList interfaces. This enables collection expressions to support dynamic introspection in scenarios such as data binding.

A compliant implementation is free to:

  1. Use an existing type that implements the required interfaces.
  2. Synthesize a type that implements the required interfaces.

In either case, the type used is allowed to implement a larger set of interfaces than those strictly required.

Synthesized types are free to employ any strategy they want to implement the required interfaces properly. For example, a synthesized type might inline the elements directly within itself, avoiding the need for additional internal collection allocations. A synthesized type could also not use any storage whatsoever, opting to compute the values directly. For example, returning index + 1 for [1, 2, 3, 4, 5, 6, 7, 8, 9, 10].

  1. The value must return true when queried for ICollection<T>.IsReadOnly (if implemented) and nongeneric IList.IsReadOnly and IList.IsFixedSize. This ensures consumers can appropriately tell that the collection is non-mutable, despite implementing the mutable views.
  2. The value must throw on any call to a mutation method (like IList<T>.Add). This ensures safety, preventing a non-mutable collection from being accidentally mutated.

Mutable interface translation

Given a target type that contains mutating members, namely ICollection<T> or IList<T>:

  1. The value must be an instance of List<T>.
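
The difference between the two interface translations can be observed at run time. The sketch below assumes C# 12 or later. Whether the non-mutable value also implements ICollection<T>, and which exception type a mutation attempt throws, are left to the implementation, so the code checks for both defensively.

```csharp
using System;
using System.Collections.Generic;

IReadOnlyList<int> fixedView = [1, 2, 3];

// Implementing ICollection<T> is only recommended for the non-mutable case,
// so this branch runs only if the implementation in use provides it.
if (fixedView is ICollection<int> asCollection)
{
    Console.WriteLine(asCollection.IsReadOnly);   // required to be true when implemented
    try
    {
        asCollection.Add(4);                      // required to throw
    }
    catch (Exception ex)
    {
        Console.WriteLine($"mutation rejected: {ex.GetType().Name}");
    }
}

// Mutable interface targets are backed by List<T>, so mutation works.
IList<int> growable = [1, 2, 3];
growable.Add(4);
Console.WriteLine(growable is List<int>);
Console.WriteLine(growable.Count); // 4
```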

Known length translation

Having a known length allows for efficient construction of a result with the potential for no copying of data and no unnecessary slack space in a result.

Not having a known length does not prevent any result from being created. However, it may result in extra CPU and memory costs producing the data, then moving to the final destination.

  • For a known length literal [e1, ..s1, etc], the translation first starts with the following:

    int __len = count_of_expression_elements +
                __s1.Count +
                ... +
                __s_n.Count;
    
  • Given a target type T for that literal:

    • If T is some T1[], then the literal is translated as:

      T1[] __result = new T1[__len];
      int __index = 0;
      
      __result[__index++] = __e1;
      foreach (T1 __t in __s1)
          __result[__index++] = __t;
      
      // further assignments of the remaining elements
      

      The implementation is allowed to utilize other means to populate the array. For example, utilizing efficient bulk-copy methods like .CopyTo().

    • If T is some Span<T1>, then the literal is translated the same as above, except that the __result initialization is translated as:

      Span<T1> __result = new T1[__len];
      
      // same assignments as the array translation
      

      The translation may use stackalloc T1[] or an inline array rather than new T1[] if span-safety is maintained.

    • If T is some ReadOnlySpan<T1>, then the literal is translated the same as for the Span<T1> case except that the final result will be that Span<T1> implicitly converted to a ReadOnlySpan<T1>.

      A ReadOnlySpan<T1> where T1 is some primitive type and all collection elements are constant does not need its data to be on the heap or on the stack. For example, an implementation could construct this span directly as a reference to a portion of the data segment of the program.

      The above forms (for arrays and spans) are the base representations of the collection expression and are used for the following translation rules:

      • If T is some C<S0, S1, …> which has a corresponding create-method B.M<U0, U1, …>(), then the literal is translated as:

        // Collection literal is passed as is as the single B.M<...>(...) argument
        C<S0, S1, …> __result = B.M<S0, S1, …>([...]);
        

        As the create method must have an argument type of some instantiated ReadOnlySpan<T>, the translation rule for spans applies when passing the collection expression to the create method.

      • If T supports collection initializers, then:

        • if the type T contains an accessible constructor with a single parameter int capacity, then the literal is translated as:

          T __result = new T(capacity: __len);
          __result.Add(__e1);
          foreach (var __t in __s1)
              __result.Add(__t);
          
          // further additions of the remaining elements
          

          Note: the name of the parameter is required to be capacity.

          This form allows for a literal to inform the newly constructed type of the count of elements to allow for efficient allocation of internal storage. This avoids wasteful reallocations as the elements are added.

        • otherwise, the literal is translated as:

          T __result = new T();
          
          __result.Add(__e1);
          foreach (var __t in __s1)
              __result.Add(__t);
          
          // further additions of the remaining elements
          

          This allows creating the target type, albeit with no capacity optimization to prevent internal reallocation of storage.
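
The known length array translation above can be sketched by hand as follows. This is only an illustration of the shape of the lowering, not what any compiler is required to emit. The variable names are hypothetical, and List<T>.CopyTo stands in for the bulk-copy option mentioned earlier.

```csharp
using System;
using System.Collections.Generic;

int e1 = 1;
List<int> s1 = new List<int> { 2, 3, 4 };
int e2 = 5;

// Known length: two expression elements plus the Count of the spread.
int len = 2 + s1.Count;

int[] result = new int[len];
int index = 0;

result[index++] = e1;

// Bulk copy of the spread instead of an element-by-element loop.
s1.CopyTo(result, index);
index += s1.Count;

result[index++] = e2;

// The collection expression form is expected to produce the same elements.
int[] viaExpression = [e1, ..s1, e2];

Console.WriteLine(string.Join(", ", result));
Console.WriteLine(string.Join(", ", viaExpression));
```

Because the final length is computed up front, the array is allocated once and never resized.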

Unknown length translation

  • Given a target type T for an unknown length literal:

    • If T supports collection initializers, then the literal is translated as:

      T __result = new T();
      
      __result.Add(__e1);
      foreach (var __t in __s1)
          __result.Add(__t);
      
      // further additions of the remaining elements
      

      This allows spreading of any iterable type, albeit with the least amount of optimization possible.

    • If T is some T1[], then the literal has the same semantics as:

      List<T1> __list = [...]; /* initialized using predefined rules */
      T1[] __result = __list.ToArray();
      

      The above is inefficient though; it creates the intermediary list, and then creates a copy of the final array from it. Implementations are free to optimize this away, for example producing code like so:

      T1[] __result = <private_details>.CreateArray<T1>(
          count_of_expression_elements);
      int __index = 0;
      
      <private_details>.Add(ref __result, __index++, __e1);
      foreach (var __t in __s1)
          <private_details>.Add(ref __result, __index++, __t);
      
      // further additions of the remaining elements
      
      <private_details>.Resize(ref __result, __index);
      

      This allows for minimal waste and copying, without additional overhead that library collections might incur.

      The counts passed to CreateArray are used to provide a starting size hint to prevent wasteful resizes.

    • If T is some span type, an implementation may follow the above T[] strategy, or any other strategy with the same semantics, but better performance. For example, instead of allocating the array as a copy of the list elements, CollectionsMarshal.AsSpan(__list) could be used to obtain a span value directly.
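
As a small illustration of the unknown length case, the sketch below spreads an iterator, which exposes no Count or Length, into both an array and a List<int>. The Evens helper is hypothetical and exists only for this example.

```csharp
using System;
using System.Collections.Generic;

// An iterator has no Count or Length, so spreading it makes the literal's length unknown.
static IEnumerable<int> Evens(int count)
{
    for (int i = 0; i < count; i++)
        yield return i * 2;
}

// Both targets can still be built from the iterator.
int[] array = [-1, ..Evens(3), 99];
List<int> list = [-1, ..Evens(3), 99];

Console.WriteLine(string.Join(", ", array));
Console.WriteLine(string.Join(", ", list));
```

How the intermediate storage is managed is left to the implementation; the observable result is the same for either strategy described above.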

Unsupported scenarios

While collection literals can be used for many scenarios, there are a few that they are not capable of replacing. These include:

  • Multi-dimensional arrays (e.g. new int[5, 10] { ... }). There is no facility to include the dimensions, and all collection literals are either linear or map structures only.
  • Collections which pass special values to their constructors. There is no facility to access the constructor being used.
  • Nested collection initializers, e.g. new Widget { Children = { w1, w2, w3 } }. This form needs to stay since it has very different semantics from Children = [w1, w2, w3]. The former calls .Add repeatedly on .Children while the latter would assign a new collection over .Children. We could consider having the latter form fall back to adding to an existing collection if .Children can't be assigned, but that seems like it could be extremely confusing.
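
The difference between the last two forms can be seen in a short sketch. The Widget type here is hypothetical, and it starts with one pre-existing child so that adding and replacing are distinguishable.

```csharp
using System;
using System.Collections.Generic;

// Nested collection initializer: calls Add on the list that Children already holds.
var appended = new Widget { Children = { "w1", "w2" } };

// Collection expression: assigns a brand-new list over Children.
var replaced = new Widget { Children = ["w1", "w2"] };

Console.WriteLine(string.Join(", ", appended.Children));
Console.WriteLine(string.Join(", ", replaced.Children));

// Hypothetical type for illustration; it starts with one pre-existing child.
class Widget
{
    public List<string> Children { get; set; } = new List<string> { "existing" };
}
```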

Syntax ambiguities

  • There are two "true" syntactic ambiguities where there are multiple legal syntactic interpretations of code that uses a collection_literal_expression.

    • The spread_element is ambiguous with a range_expression. One could technically have:

      Range[] ranges = [range1, ..e, range2];
      

      To resolve this, we can either:

      • Require users to parenthesize (..e) or include a start index 0..e if they want a range.
      • Choose a different syntax (like ...) for spread. This would be unfortunate for the lack of consistency with slice patterns.
  • There are two cases where there isn't a true ambiguity but where the syntax greatly increases parsing complexity. While not a problem given engineering time, this does still increase cognitive overhead for users when looking at code.

    • Ambiguity between collection_literal_expression and attributes on statements or local functions. Consider:

      [X(), Y, Z()]
      

      This could be one of:

      // A list literal inside some expression statement
      [X(), Y, Z()].ForEach(() => ...);
      
      // The attributes for a statement or local function
      [X(), Y, Z()] void LocalFunc() { }
      

      Without complex lookahead, it would be impossible to tell without consuming the entirety of the literal.

      Options to address this include:

      • Allow this, doing the parsing work to determine which of these cases this is.
      • Disallow this, and require the user wrap the literal in parentheses like ([X(), Y, Z()]).ForEach(...).
    • Ambiguity between a collection_literal_expression in a conditional_expression and a null_conditional_operations. Consider:

      M(x ? [a, b, c]
      

      This could be one of:

      // A ternary conditional picking between two collections
      M(x ? [a, b, c] : [d, e, f]);
      
      // A null conditional safely indexing into 'x':
      M(x ? [a, b, c]);
      

      Without complex lookahead, it would be impossible to tell without consuming the entirety of the literal.

      Note: this is a problem even without a natural type because target typing applies through conditional_expressions.

      As with the others, we could require parentheses to disambiguate. In other words, presume the null_conditional_operation interpretation unless written like so: x ? ([1, 2, 3]) :. However, that seems rather unfortunate. This sort of code does not seem unreasonable to write and will likely trip people up.
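
The sketch below shows how these forms might look in user code. It assumes the option of parenthesizing (or adding a start index) to obtain a range, and it assumes that a conditional whose branches are collection expressions is accepted when a target type is available. Both are assumptions about the eventual resolution, not statements made by the text above.

```csharp
using System;

int[] source = [10, 20, 30, 40];
int end = 2;

// Inside a collection expression, '..source' is read as a spread element.
int[] copy = [..source];

// Parenthesizing, or writing an explicit start index, yields a Range value instead.
Range[] ranges = [(..end), 0..end];

// A conditional whose branches are collection expressions, target-typed to int[].
bool pickFirst = true;
int[] chosen = pickFirst ? [1, 2, 3] : [4, 5, 6];

Console.WriteLine(copy.Length);
Console.WriteLine(ranges.Length);
Console.WriteLine(string.Join(", ", chosen));
```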

Drawbacks

  • This introduces yet another form for collection expressions on top of the myriad ways we already have. This is extra complexity for the language. That said, this also makes it possible to unify on one ring syntax to rule them all, which means existing codebases can be simplified and moved to a uniform look everywhere.
  • Using [...] instead of {...} moves away from the syntax we've generally used for arrays and collection initializers already. However, this was already settled on by the language team when we did list patterns. We attempted to make {...} work with list patterns and ran into insurmountable issues. Because of this, we moved to [...] which, while new for C#, feels natural in many programming languages and allowed us to start fresh with no ambiguity. Using [...] as the corresponding literal form is complementary with our latest decisions, and gives us a clean place to work without problem.

This does introduce warts into the language. For example, the following are both legal and (fortunately) mean the exact same thing:

int[] x = { 1, 2, 3 };
int[] x = [ 1, 2, 3 ];

However, given the breadth and consistency brought by the new literal syntax, we should consider recommending that people move to the new form. IDE suggestions and fixes could help in that regard.

Alternatives

  • What other designs have been considered? What is the impact of not doing this?

Resolved questions

  • Should the compiler use stackalloc for stack allocation when inline arrays are not available and the iteration type is a primitive type?

    Resolution: No. Managing a stackalloc buffer requires additional effort over an inline array to ensure the buffer is not allocated repeatedly when the collection expression is within a loop. The additional complexity in the compiler and in the generated code outweighs the benefit of stack allocation on older platforms.

  • In what order should we evaluate literal elements compared with Length/Count property evaluation? Should we evaluate all elements first, then all lengths? Or should we evaluate an element, then its length, then the next element, and so on?

    Resolution: We evaluate all elements first, then everything else follows that.

  • Can an unknown length literal create a collection type that needs a known length, like an array, span, or Construct(array/span) collection? This would be harder to do efficiently, but it might be possible through clever use of pooled arrays and/or builders.

    Resolution: Yes, we allow creating a fixed-length collection from an unknown length literal. The compiler is permitted to implement this in as efficient a manner as possible.

    The following text exists to record the original discussion of this topic.

    Users could always make an unknown length literal into a known length one with code like this:

    ImmutableArray<int> x = [a, ..unknownLength.ToArray(), b];
    

    However, this is unfortunate due to the need to force allocations of temporary storage. We could potentially be more efficient if we controlled how this was emitted.

  • Can a collection_expression be target-typed to an IEnumerable<T> or other collection interfaces?

    For example:

    void DoWork(IEnumerable<long> values) { ... }
    // Needs to produce `longs` not `ints` for this to work.
    DoWork([1, 2, 3]);
    

    Resolution: Yes, a literal can be target-typed to any interface type I<T> that List<T> implements. For example, IEnumerable<long>. This is the same as target-typing to List<long> and then assigning that result to the specified interface type. The following text exists to record the original discussion of this topic.

    The open question here is determining what underlying type to actually create. One option is to look at the proposal for params IEnumerable<T>. There, we would generate an array to pass the values along, similar to what happens with params T[].

  • Can/should the compiler emit Array.Empty<T>() for []? Should we mandate that it does this, to avoid allocations whenever possible?

    Yes. The compiler should emit Array.Empty<T>() for any case where this is legal and the final result is non-mutable. For example, targeting T[], IEnumerable<T>, IReadOnlyCollection<T> or IReadOnlyList<T>. It should not use Array.Empty<T> when the target is mutable (ICollection<T> or IList<T>).

  • Should we expand on collection initializers to look for the very common AddRange method? It could be used by the underlying constructed type to perform adding of spread elements potentially more efficiently. We might also want to look for things like .CopyTo as well. There may be drawbacks here as those methods might end up causing excess allocations/dispatches versus directly enumerating in the translated code.

    Yes. An implementation is allowed to utilize other methods to initialize a collection value, under the presumption that these methods have well-defined semantics, and that collection types should be "well behaved". In practice though, an implementation should be cautious as benefits in one way (bulk copying) may come with negative consequences as well (for example, boxing a struct collection).

    An implementation should take advantage in the cases where there are no downsides. For example, with an .AddRange(ReadOnlySpan<T>) method.
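
Several of the resolutions above can be exercised together in a short sketch. The helper and variable names are illustrative. Whether a shared empty instance is reused for [] is left to the compiler, so the code does not depend on it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Target-typing to an interface: the int literals are converted to long elements.
static long Total(IEnumerable<long> values) => values.Sum();

long total = Total([1, 2, 3]);

// A spread with no known length can still feed a fixed-length target such as an array.
IEnumerable<int> unknownLength = Enumerable.Range(0, 3).Where(n => n != 1);
int[] fixedLength = [100, ..unknownLength, 200];

// An empty literal for a non-mutable target.
IEnumerable<int> none = [];

Console.WriteLine(total);
Console.WriteLine(string.Join(", ", fixedLength));
Console.WriteLine(none.Any());
```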

Unresolved questions

  • Should we allow inferring the element type when the iteration type is "ambiguous" (by some definition)? For example:

    Collection x = [1L, 2L];

    // error CS1640: foreach statement cannot operate on variables of type 'Collection' because it implements multiple instantiations of 'IEnumerable<T>'; try casting to a specific interface instantiation
    foreach (var item in new Collection()) { }

    static class Builder
    {
        public static Collection Create(ReadOnlySpan<long> items) => throw null;
    }

    [CollectionBuilder(...)]
    class Collection : IEnumerable<int>, IEnumerable<string>
    {
        IEnumerator<int> IEnumerable<int>.GetEnumerator() => throw null;
        IEnumerator<string> IEnumerable<string>.GetEnumerator() => throw null;
        IEnumerator IEnumerable.GetEnumerator() => throw null;
    }

  • Should it be legal to create and immediately index into a collection literal? Note: this requires an answer to the unresolved question below of whether collection literals have a natural type.

  • Stack allocations for huge collections might blow the stack. Should the compiler have a heuristic for placing this data on the heap? Should the language be unspecified to allow for this flexibility? We should follow the spec for params Span<T>.

  • Do we need to target-type spread_element? Consider, for example:

    Span<int> span = [a, ..b ? [c] : [d, e], f];
    

    Note: this may commonly come up in the following form to allow conditional inclusion of some set of elements, or nothing if the condition is false:

    Span<int> span = [a, ..b ? [c, d, e] : [], f];
    

    In order to evaluate this full literal, we need to evaluate the element expressions within. That means being able to evaluate b ? [c] : [d, e]. However, absent a target type to evaluate this expression in the context of, and absent any sort of natural type, we would be unable to determine what to do with either [c] or [d, e] here.

    To resolve this, we could say that when evaluating a literal's spread_element expression, there was an implicit target type equivalent to the target type of the literal itself. So, in the above, that would be rewritten as:

    int __e1 = a;
    Span<int> __s1 = b ? [c] : [d, e];
    int __e2 = f;
    
    Span<int> __result = stackalloc int[2 + __s1.Length];
    int __index = 0;
    
    __result[__index++] = __e1;
    foreach (int __t in __s1)
      __result[__index++] = __t;
    __result[__index++] = __e2;
    
    Span<int> span = __result;
    

Specification of a constructible collection type utilizing a create method is sensitive to the context at which conversion is classified

The existence of the conversion in this case depends on the notion of the iteration type of the collection type. If there is a create method that takes a ReadOnlySpan<T> where T is the iteration type, the conversion exists. Otherwise, it doesn't.

However, an iteration type is sensitive to the context at which foreach is performed. For the same collection type it can be different based on what extension methods are in scope, and it can also be undefined.

That feels fine for the purpose of foreach when the type isn't designed to be foreach-able by itself. If it is, extension methods cannot change how the type is iterated, no matter what the context is.

However, it feels somewhat strange for a conversion to be context sensitive like that. Effectively, the conversion is "unstable". A collection type explicitly designed to be constructible is allowed to leave out the definition of a very important detail, its iteration type, which leaves the type "unconvertible" by itself.

Here is an example:

using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

[CollectionBuilder(typeof(MyCollectionBuilder), nameof(MyCollectionBuilder.Create))]
class MyCollection
{
}
class MyCollectionBuilder
{
    public static MyCollection Create(ReadOnlySpan<long> items) => throw null;
    public static MyCollection Create(ReadOnlySpan<string> items) => throw null;
}

namespace Ns1
{
    static class Ext
    {
        public static IEnumerator<long> GetEnumerator(this MyCollection x) => throw null;
    }
    
    class Program
    {
        static void Main()
        {
            foreach (var l in new MyCollection())
            {
                long s = l;
            }
        
            MyCollection x1 = ["a", // error CS0029: Cannot implicitly convert type 'string' to 'long'
                               2];
        }
    }
}

namespace Ns2
{
    static class Ext
    {
        public static IEnumerator<string> GetEnumerator(this MyCollection x) => throw null;
    }
    
    class Program
    {
        static void Main()
        {
            foreach (var l in new MyCollection())
            {
                string s = l;
            }
        
            MyCollection x1 = ["a",
                               2]; // error CS0029: Cannot implicitly convert type 'int' to 'string'
        }
    }
}

namespace Ns3
{
    class Program
    {
        static void Main()
        {
            // error CS1579: foreach statement cannot operate on variables of type 'MyCollection' because 'MyCollection' does not contain a public instance or extension definition for 'GetEnumerator'
            foreach (var l in new MyCollection())
            {
            }
        
            MyCollection x1 = ["a", 2]; // error CS9188: 'MyCollection' has a CollectionBuilderAttribute but no element type.
        }
    }
}

Given the current design, if the type doesn't define its iteration type itself, the compiler is unable to reliably validate an application of a CollectionBuilder attribute. If we don't know the iteration type, we don't know what the signature of the create method should be. If the iteration type comes from context, there is no guarantee that the type is always going to be used in a similar context.

The Params Collections feature is also affected by this. It feels strange to be unable to reliably predict the element type of a params parameter at the declaration point. The current proposal also requires ensuring that the create method is at least as accessible as the params collection type. It is impossible to perform this check in a reliable fashion unless the collection type defines its iteration type itself.

Note that we also have https://github.com/dotnet/roslyn/issues/69676 opened for the compiler, which basically observes the same issue but discusses it from the perspective of optimization.

Proposal

Require a type utilizing the CollectionBuilder attribute to define its iteration type on itself. In other words, the type should either implement IEnumerable/IEnumerable<T>, or it should have a public GetEnumerator method with the right signature (this excludes any extension methods).

Also, right now the create method is required to "be accessible where the collection expression is used". This is another point of context dependency based on accessibility. The purpose of this method is very similar to the purpose of a user-defined conversion method, and that one must be public. Therefore, we should consider requiring the create method to be public as well.
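
A minimal sketch of a collection type that would satisfy this proposal follows. Samples and SamplesBuilder are hypothetical names. The type declares its own iteration type by implementing IEnumerable<long>, and the create method is public.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// The element type comes from IEnumerable<long> on the type itself, not from usage context.
Samples samples = [1, 2, 3];

foreach (long value in samples)
    Console.WriteLine(value);

// Hypothetical collection type: it declares its own iteration type (long).
[CollectionBuilder(typeof(SamplesBuilder), nameof(SamplesBuilder.Create))]
public sealed class Samples : IEnumerable<long>
{
    private readonly long[] _items;

    public Samples(long[] items) => _items = items;

    public int Count => _items.Length;

    public IEnumerator<long> GetEnumerator() => ((IEnumerable<long>)_items).GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

// Hypothetical builder: a public static create method taking ReadOnlySpan<long>.
public static class SamplesBuilder
{
    public static Samples Create(ReadOnlySpan<long> items) => new Samples(items.ToArray());
}
```

With the iteration type fixed on the type, the attribute can be validated at the declaration without looking at which extension methods happen to be in scope at a use site.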

Conclusion

Approved with modifications LDM-2024-01-08

The notion of iteration type is not applied consistently throughout conversions

  • To a struct or class type that implements System.Collections.Generic.IEnumerable<T> where:
    • For each element Ei there is an implicit conversion to T.

It looks like an assumption is made that T is necessarily the iteration type of the struct or class type in this case. However, that assumption is incorrect, which can lead to very strange behavior. For example:

using System.Collections;
using System.Collections.Generic;

class MyCollection : IEnumerable<long>
{
    IEnumerator<long> IEnumerable<long>.GetEnumerator() => throw null;
    IEnumerator IEnumerable.GetEnumerator() => throw null;

    public void Add(string l) => throw null;
    
    public IEnumerator<string> GetEnumerator() => throw null; 
}

class Program
{
    static void Main()
    {
        foreach (var l in new MyCollection())
        {
            string s = l; // Iteration type is string
        }
        
        MyCollection x1 = ["a", // error CS0029: Cannot implicitly convert type 'string' to 'long'
                           2];
        MyCollection x2 = new MyCollection() { "b" };
    }
}

  • To a struct or class type that implements System.Collections.IEnumerable and does not implement System.Collections.Generic.IEnumerable<T>.

It looks like the implementation assumes that the iteration type is object, but the specification leaves this fact unspecified and simply doesn't require each element to convert to anything. In general, however, the iteration type is not necessarily the object type, as can be observed in the following example:

using System.Collections;
using System.Collections.Generic;

class MyCollection : IEnumerable
{
    public IEnumerator<string> GetEnumerator() => throw null; 
    IEnumerator IEnumerable.GetEnumerator() => throw null;
}

class Program
{
    static void Main()
    {
        foreach (var l in new MyCollection())
        {
            string s = l; // Iteration type is string
        }
    }
}

The notion of iteration type is fundamental to the Params Collections feature, and this issue leads to a strange discrepancy between the two features. For example:

using System.Collections;
using System.Collections.Generic;

class MyCollection : IEnumerable<long>
{
    IEnumerator<long> IEnumerable<long>.GetEnumerator() => throw null;
    IEnumerator IEnumerable.GetEnumerator() => throw null;

    public IEnumerator<string> GetEnumerator() => throw null; 

    public void Add(long l) => throw null; 
    public void Add(string l) => throw null; 
}

class Program
{
    static void Main()
    {
        Test("2"); // error CS0029: Cannot implicitly convert type 'string' to 'long'
        Test(["2"]); // error CS1503: Argument 1: cannot convert from 'collection expressions' to 'string'
        Test(3); // error CS1503: Argument 1: cannot convert from 'int' to 'string'
        Test([3]); // Ok

        MyCollection x1 = ["2"]; // error CS0029: Cannot implicitly convert type 'string' to 'long'
        MyCollection x2 = [3];
    }

    static void Test(params MyCollection a)
    {
    }
}

using System.Collections;
using System.Collections.Generic;

class MyCollection : IEnumerable
{
    IEnumerator IEnumerable.GetEnumerator() => throw null;

    public IEnumerator<string> GetEnumerator() => throw null; 
    public void Add(object l) => throw null;
}

class Program
{
    static void Main()
    {
        Test("2", 3); // error CS1503: Argument 2: cannot convert from 'int' to 'string'
        Test(["2", 3]); // Ok
    }

    static void Test(params MyCollection a)
    {
    }
}

It will probably be good to align one way or the other.

Proposal

Specify convertibility of a struct or class type that implements System.Collections.Generic.IEnumerable<T> or System.Collections.IEnumerable in terms of its iteration type, and require an implicit conversion for each element Ei to the iteration type.

Conclusion

Approved LDM-2024-01-08

Should collection expression conversion require availability of a minimal set of APIs for construction?

A constructible collection type according to conversions may in fact not be constructible, which is likely to lead to some unexpected overload resolution behavior. For example:

class C1
{
    public static void M1(string x)
    {
    }
    public static void M1(char[] x)
    {
    }
    
    void Test()
    {
        M1(['a', 'b']); // error CS0121: The call is ambiguous between the following methods or properties: 'C1.M1(string)' and 'C1.M1(char[])'
    }
}

However, 'C1.M1(string)' is not a candidate that can be used because:

error CS1729: 'string' does not contain a constructor that takes 0 arguments
error CS1061: 'string' does not contain a definition for 'Add' and no accessible extension method 'Add' accepting a first argument of type 'string' could be found (are you missing a using directive or an assembly reference?)

Here is another example with a user-defined type and a stronger error that doesn't even mention a valid candidate:

using System.Collections;
using System.Collections.Generic;

class C1 : IEnumerable<char>
{
    public static void M1(C1 x)
    {
    }
    public static void M1(char[] x)
    {
    }

    void Test()
    {
        M1(['a', 'b']); // error CS1061: 'C1' does not contain a definition for 'Add' and no accessible extension method 'Add' accepting a first argument of type 'C1' could be found (are you missing a using directive or an assembly reference?)
    }

    public static implicit operator char[](C1 x) => throw null;
    IEnumerator<char> IEnumerable<char>.GetEnumerator() => throw null;
    IEnumerator IEnumerable.GetEnumerator() => throw null;
}

It looks like the situation is very similar to what we used to have with method group to delegate conversions; that is, there were scenarios where the conversion existed but was erroneous. We decided to improve that by ensuring that, if a conversion is erroneous, then it doesn't exist.

Note that with the "Params Collections" feature we will run into a similar issue. It might be good to disallow usage of the params modifier for collections that are not constructible. However, in the current proposal that check is based on the conversions section. Here is an example:

using System.Collections;
using System.Collections.Generic;

class C1 : IEnumerable<char>
{
    public static void M1(params C1 x) // It is probably better to report an error about an invalid `params` modifier
    {
    }
    public static void M1(params ushort[] x)
    {
    }

    void Test()
    {
        M1('a', 'b'); // error CS1061: 'C1' does not contain a definition for 'Add' and no accessible extension method 'Add' accepting a first argument of type 'C1' could be found (are you missing a using directive or an assembly reference?)
        M2('a', 'b'); // Ok
    }

    public static void M2(params ushort[] x)
    {
    }

    IEnumerator<char> IEnumerable<char>.GetEnumerator() => throw null;
    IEnumerator IEnumerable.GetEnumerator() => throw null;
}

It looks like the issue was somewhat discussed previously, see https://github.com/dotnet/csharplang/blob/main/meetings/2023/LDM-2023-10-02.md#collection-expressions. At that time an argument was made that the rules, as specified right now, are consistent with how interpolated string handlers are specified. Here is a quote:

In particular, interpolated string handlers were originally specified this way, but we revised the specification after considering this issue.

While there is some similarity, there is also an important distinction worth considering. Here is a quote from https://github.com/dotnet/csharplang/blob/main/proposals/csharp-10.0/improved-interpolated-strings.md#interpolated-string-handler-conversion:

Type T is said to be an applicable_interpolated_string_handler_type if it is attributed with System.Runtime.CompilerServices.InterpolatedStringHandlerAttribute. There exists an implicit interpolated_string_handler_conversion to T from an interpolated_string_expression, or an additive_expression composed entirely of interpolated_string_expressions and using only + operators.

The target type must have a special attribute, which is a strong indicator of the author's intent for the type to be an interpolated string handler. It is fair to assume that the presence of the attribute is not a coincidence. In contrast, the fact that a type is "enumerable" doesn't necessarily mean that the author intended the type to be constructible. The presence of a create method, however, which is indicated with a [CollectionBuilder(...)] attribute on the collection type, feels like a strong indicator of the author's intent for the type to be constructible.

Proposal

For a struct or class type that implements System.Collections.IEnumerable and that does not have a create method, the conversions section should require the presence of at least the following APIs:

  • An accessible constructor that is applicable with no arguments.
  • An accessible Add instance or extension method that can be invoked with value of iteration type as the argument.

For the purpose of the Params Collections feature, such types are valid params types when these APIs are declared public and are instance (vs. extension) methods.
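
A minimal sketch of a type that exposes exactly this API set follows. Bag is a hypothetical name; the constructor and Add method are public instance members, so the type would also meet the stricter condition stated for params.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Conversion is expected to exist: Bag is enumerable, has a parameterless
// constructor, and has an Add method that accepts its iteration type.
Bag bag = [1, 2, 3];

Console.WriteLine(bag.Count);

// Hypothetical type with the minimal API set from the proposal above.
public class Bag : IEnumerable<int>
{
    private readonly List<int> _items = new List<int>();

    public Bag() { }

    public int Count => _items.Count;

    public void Add(int item) => _items.Add(item);

    public IEnumerator<int> GetEnumerator() => _items.GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Removing either the constructor or the Add method from Bag would, under the proposal, make the conversion not exist rather than exist and fail later.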

Conclusion

Approved with modifications LDM-2024-01-10

Design meetings

  • https://github.com/dotnet/csharplang/blob/main/meetings/2021/LDM-2021-11-01.md#collection-literals
  • https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-03-09.md#ambiguity-of--in-collection-expressions
  • https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-09-28.md#collection-literals
  • https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-01-08.md
  • https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-01-10.md

Working group meetings

  • https://github.com/dotnet/csharplang/blob/main/meetings/working-groups/collection-literals/CL-2022-10-06.md
  • https://github.com/dotnet/csharplang/blob/main/meetings/working-groups/collection-literals/CL-2022-10-14.md
  • https://github.com/dotnet/csharplang/blob/main/meetings/working-groups/collection-literals/CL-2022-10-21.md
  • https://github.com/dotnet/csharplang/blob/main/meetings/working-groups/collection-literals/CL-2023-04-05.md
  • https://github.com/dotnet/csharplang/blob/main/meetings/working-groups/collection-literals/CL-2023-04-28.md
  • https://github.com/dotnet/csharplang/blob/main/meetings/working-groups/collection-literals/CL-2023-05-26.md
  • https://github.com/dotnet/csharplang/blob/main/meetings/working-groups/collection-literals/CL-2023-06-12.md
  • https://github.com/dotnet/csharplang/blob/main/meetings/working-groups/collection-literals/CL-2023-06-26.md
  • https://github.com/dotnet/csharplang/blob/main/meetings/working-groups/collection-literals/CL-2023-08-03.md
  • https://github.com/dotnet/csharplang/blob/main/meetings/working-groups/collection-literals/CL-2023-08-10.md

Upcoming agenda items

  • Stack allocations for huge collections might blow the stack. Should the compiler have a heuristic for placing this data on the heap? Should the language be unspecified to allow for this flexibility? We should follow what the spec/impl does for params Span<T>. Options are:

    • Always stackalloc. Teach people to be careful with Span. This allows things like Span<T> span = [1, 2, ..s] to work, and be fine as long as s is small. If this could blow the stack, users could always create an array instead, and then get a span around it. This seems most in line with what people might want, but it is extremely dangerous.
    • Only stackalloc when the literal has a fixed number of elements (i.e. no spread elements). This then likely makes things always safe, with fixed stack usage, and the compiler (hopefully) able to reuse that fixed buffer. However, it means things like [1, 2, ..s] would never be possible, even if the user knows it is completely safe at runtime.
  • How does overload resolution work? If an API has:

    public void M(T[] values);
    public void M(List<T> values);
    

    What happens with M([1, 2, 3])? We likely need to define 'betterness' for these conversions.

  • Should we expand on collection initializers to look for the very common AddRange method? The underlying constructed type could use it to add spread elements, potentially more efficiently. We might also want to look for things like .CopyTo. There may be drawbacks here, as those methods might end up causing excess allocations/dispatches versus directly enumerating in the translated code.

  • Generic type inference should be updated to flow type information to/from collection literals. For example:

    void M<T>(T[] values);
    M([1, 2, 3]);
    

    It seems natural that this should be something the inference algorithm can be made aware of. Once this is supported for the 'base' constructible collection type cases (T[], I<T>, Span<T>, new T()), then it should also fall out of the Collect(constructible_type) case. For example:

    void M<T>(ImmutableArray<T> values);
    M([1, 2, 3]);
    

    Here, ImmutableArray<T> is constructible through an init void Construct(T[] values) method. So the T[] values type would be used with inference against [1, 2, 3], leading to an inference of int for T.

  • Cast/Index ambiguity.

    Today, the following is a parenthesized expression that is indexed into:

    var v = (Expr)[1, 2, 3];
    

    But it would be nice to be able to do things like:

    var v = (ImmutableArray<int>)[1, 2, 3];
    

    Can/should we take a breaking change here?

  • Syntactic ambiguities with ?[.

    It might be worthwhile to change the rules for nullable index access to state that no space can occur between ? and [. That would be a breaking change (but likely minor as VS already forces those together if you type them with a space). If we do this, then we can have x?[y] be parsed differently than x ? [y].

    A similar thing occurs if we want to go with https://github.com/dotnet/csharplang/issues/2926. In that world x?.y is ambiguous with x ? .y. If we require the ?. to abut, we can syntactically distinguish the two cases trivially.
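Returning to the generic type inference item above, the intended behavior can be sketched as follows. This is a minimal sketch, assuming a C# 12 or later compiler targeting .NET 8 or later, where inference from collection expressions is available and ImmutableArray<T> is constructible from a collection expression. The class and method names (InferenceDemo, ElementTypeOfArray, ElementTypeOfImmutableArray) are hypothetical.

```csharp
using System;
using System.Collections.Immutable;

public static class InferenceDemo
{
    // T is expected to be inferred from the elements of the collection
    // expression passed for 'values' (the 'base' T[] case).
    public static Type ElementTypeOfArray<T>(T[] values) => typeof(T);

    // The same inference, flowing through a constructible collection type.
    public static Type ElementTypeOfImmutableArray<T>(ImmutableArray<T> values) => typeof(T);

    public static void Run()
    {
        // No explicit type arguments: T comes from the literal's elements.
        Console.WriteLine(ElementTypeOfArray([1, 2, 3]));
        Console.WriteLine(ElementTypeOfImmutableArray([1, 2, 3]));
    }
}
```

Under those assumptions, both calls in Run infer int for T, matching the reasoning in that item.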