Muokkaa

Jaa


Unsafe code, pointer types, and function pointers

Most of the C# code you write is "verifiably safe code." Verifiably safe code means .NET tools can verify that the code is safe. In general, safe code doesn't directly access memory using pointers. It also doesn't allocate raw memory. It creates managed objects instead.

C# supports an unsafe context, in which you may write unverifiable code. In an unsafe context, code may use pointers, allocate and free blocks of memory, and call methods using function pointers. Unsafe code in C# isn't necessarily dangerous; it's just code whose safety cannot be verified.

Unsafe code has the following properties:

  • Methods, types, and code blocks can be defined as unsafe.
  • In some cases, unsafe code may increase an application's performance by removing array bounds checks.
  • Unsafe code is required when you call native functions that require pointers.
  • Using unsafe code introduces security and stability risks.
  • The code that contains unsafe blocks must be compiled with the AllowUnsafeBlocks compiler option.

Pointer types

In an unsafe context, a type may be a pointer type, in addition to a value type, or a reference type. A pointer type declaration takes one of the following forms:

type* identifier;
void* identifier; //allowed but not recommended

The type specified before the * in a pointer type is called the referent type. Only an unmanaged type can be a referent type.

Pointer types don't inherit from object and no conversions exist between pointer types and object. Also, boxing and unboxing don't support pointers. However, you can convert between different pointer types and between pointer types and integral types.

When you declare multiple pointers in the same declaration, you write the asterisk (*) together with the underlying type only. It isn't used as a prefix to each pointer name. For example:

int* p1, p2, p3;   // Ok
int *p1, *p2, *p3;   // Invalid in C#

A pointer can't point to a reference or to a struct that contains references, because an object reference can be garbage collected even if a pointer is pointing to it. The garbage collector doesn't keep track of whether an object is being pointed to by any pointer types.

The value of the pointer variable of type MyType* is the address of a variable of type MyType. The following are examples of pointer type declarations:

  • int* p: p is a pointer to an integer.
  • int** p: p is a pointer to a pointer to an integer.
  • int*[] p: p is a single-dimensional array of pointers to integers.
  • char* p: p is a pointer to a char.
  • void* p: p is a pointer to an unknown type.

The pointer indirection operator * can be used to access the contents at the location pointed to by the pointer variable. For example, consider the following declaration:

int* myVariable;

The expression *myVariable denotes the int variable found at the address contained in myVariable.

There are several examples of pointers in the articles on the fixed statement. The following example uses the unsafe keyword and the fixed statement, and shows how to increment an interior pointer. You can paste this code into the Main function of a console application to run it. These examples must be compiled with the AllowUnsafeBlocks compiler option set.

// Normal pointer to an object.
int[] a = [10, 20, 30, 40, 50];
// Must be in unsafe code to use interior pointers.
unsafe
{
    // Must pin object on heap so that it doesn't move while using interior pointers.
    fixed (int* p = &a[0])
    {
        // p is pinned as well as object, so create another pointer to show incrementing it.
        int* p2 = p;
        Console.WriteLine(*p2);
        // Incrementing p2 bumps the pointer by four bytes due to its type ...
        p2 += 1;
        Console.WriteLine(*p2);
        p2 += 1;
        Console.WriteLine(*p2);
        Console.WriteLine("--------");
        Console.WriteLine(*p);
        // Dereferencing p and incrementing changes the value of a[0] ...
        *p += 1;
        Console.WriteLine(*p);
        *p += 1;
        Console.WriteLine(*p);
    }
}

Console.WriteLine("--------");
Console.WriteLine(a[0]);

/*
Output:
10
20
30
--------
10
11
12
--------
12
*/

You can't apply the indirection operator to a pointer of type void*. However, you can use a cast to convert a void pointer to any other pointer type, and vice versa.

A pointer can be null. Applying the indirection operator to a null pointer causes an implementation-defined behavior.

Passing pointers between methods can cause undefined behavior. Consider a method that returns a pointer to a local variable through an in, out, or ref parameter or as the function result. If the pointer was set in a fixed block, the variable to which it points may no longer be fixed.

The following table lists the operators and statements that can operate on pointers in an unsafe context:

Operator/Statement Use
* Performs pointer indirection.
-> Accesses a member of a struct through a pointer.
[] Indexes a pointer.
& Obtains the address of a variable.
++ and -- Increments and decrements pointers.
+ and - Performs pointer arithmetic.
==, !=, <, >, <=, and >= Compares pointers.
stackalloc Allocates memory on the stack.
fixed statement Temporarily fixes a variable so that its address may be found.

For more information about pointer-related operators, see Pointer-related operators.

Any pointer type can be implicitly converted to a void* type. Any pointer type can be assigned the value null. Any pointer type can be explicitly converted to any other pointer type using a cast expression. You can also convert any integral type to a pointer type, or any pointer type to an integral type. These conversions require an explicit cast.

The following example converts an int* to a byte*. Notice that the pointer points to the lowest addressed byte of the variable. When you successively increment the result, up to the size of int (4 bytes), you can display the remaining bytes of the variable.

int number = 1024;

unsafe
{
    // Convert to byte:
    byte* p = (byte*)&number;

    System.Console.Write("The 4 bytes of the integer:");

    // Display the 4 bytes of the int variable:
    for (int i = 0 ; i < sizeof(int) ; ++i)
    {
        System.Console.Write(" {0:X2}", *p);
        // Increment the pointer:
        p++;
    }
    System.Console.WriteLine();
    System.Console.WriteLine("The value of the integer: {0}", number);

    /* Output:
        The 4 bytes of the integer: 00 04 00 00
        The value of the integer: 1024
    */
}

Fixed-size buffers

You can use the fixed keyword to create a buffer with a fixed-size array in a data structure. Fixed-size buffers are useful when you write methods that interoperate with data sources from other languages or platforms. The fixed-size buffer can take any attributes or modifiers that are allowed for regular struct members. The only restriction is that the array type must be bool, byte, char, short, int, long, sbyte, ushort, uint, ulong, float, or double.

private fixed char name[30];

In safe code, a C# struct that contains an array doesn't contain the array elements. The struct contains a reference to the elements instead. You can embed an array of fixed size in a struct when it's used in an unsafe code block.

The size of the following struct doesn't depend on the number of elements in the array, since pathName is a reference:

public struct PathArray
{
    public char[] pathName;
    private int reserved;
}

A struct can contain an embedded array in unsafe code. In the following example, the fixedBuffer array has a fixed size. You use a fixed statement to get a pointer to the first element. You access the elements of the array through this pointer. The fixed statement pins the fixedBuffer instance field to a specific location in memory.

internal unsafe struct Buffer
{
    public fixed char fixedBuffer[128];
}

internal unsafe class Example
{
    public Buffer buffer = default;
}

private static void AccessEmbeddedArray()
{
    var example = new Example();

    unsafe
    {
        // Pin the buffer to a fixed location in memory.
        fixed (char* charPtr = example.buffer.fixedBuffer)
        {
            *charPtr = 'A';
        }
        // Access safely through the index:
        char c = example.buffer.fixedBuffer[0];
        Console.WriteLine(c);

        // Modify through the index:
        example.buffer.fixedBuffer[0] = 'B';
        Console.WriteLine(example.buffer.fixedBuffer[0]);
    }
}

The size of the 128 element char array is 256 bytes. Fixed-size char buffers always take 2 bytes per character, regardless of the encoding. This array size is the same even when char buffers are marshalled to API methods or structs with CharSet = CharSet.Auto or CharSet = CharSet.Ansi. For more information, see CharSet.

The preceding example demonstrates accessing fixed fields without pinning. Another common fixed-size array is the bool array. The elements in a bool array are always 1 byte in size. bool arrays aren't appropriate for creating bit arrays or buffers.

Fixed-size buffers are compiled with the System.Runtime.CompilerServices.UnsafeValueTypeAttribute, which instructs the common language runtime (CLR) that a type contains an unmanaged array that can potentially overflow. Memory allocated using stackalloc also automatically enables buffer overrun detection features in the CLR. The previous example shows how a fixed-size buffer could exist in an unsafe struct.

internal unsafe struct Buffer
{
    public fixed char fixedBuffer[128];
}

The compiler-generated C# for Buffer is attributed as follows:

internal struct Buffer
{
    [StructLayout(LayoutKind.Sequential, Size = 256)]
    [CompilerGenerated]
    [UnsafeValueType]
    public struct <fixedBuffer>e__FixedBuffer
    {
        public char FixedElementField;
    }

    [FixedBuffer(typeof(char), 128)]
    public <fixedBuffer>e__FixedBuffer fixedBuffer;
}

Fixed-size buffers differ from regular arrays in the following ways:

  • May only be used in an unsafe context.
  • May only be instance fields of structs.
  • They're always vectors, or one-dimensional arrays.
  • The declaration should include the length, such as fixed char id[8]. You can't use fixed char id[].

How to use pointers to copy an array of bytes

The following example uses pointers to copy bytes from one array to another.

This example uses the unsafe keyword, which enables you to use pointers in the Copy method. The fixed statement is used to declare pointers to the source and destination arrays. The fixed statement pins the location of the source and destination arrays in memory so that they will not be moved by garbage collection. The memory blocks for the arrays are unpinned when the fixed block is completed. Because the Copy method in this example uses the unsafe keyword, it must be compiled with the AllowUnsafeBlocks compiler option.

This example accesses the elements of both arrays using indices rather than a second unmanaged pointer. The declaration of the pSource and pTarget pointers pins the arrays.

static unsafe void Copy(byte[] source, int sourceOffset, byte[] target,
    int targetOffset, int count)
{
    // If either array is not instantiated, you cannot complete the copy.
    if ((source == null) || (target == null))
    {
        throw new System.ArgumentException("source or target is null");
    }

    // If either offset, or the number of bytes to copy, is negative, you
    // cannot complete the copy.
    if ((sourceOffset < 0) || (targetOffset < 0) || (count < 0))
    {
        throw new System.ArgumentException("offset or bytes to copy is negative");
    }

    // If the number of bytes from the offset to the end of the array is
    // less than the number of bytes you want to copy, you cannot complete
    // the copy.
    if ((source.Length - sourceOffset < count) ||
        (target.Length - targetOffset < count))
    {
        throw new System.ArgumentException("offset to end of array is less than bytes to be copied");
    }

    // The following fixed statement pins the location of the source and
    // target objects in memory so that they will not be moved by garbage
    // collection.
    fixed (byte* pSource = source, pTarget = target)
    {
        // Copy the specified number of bytes from source to target.
        for (int i = 0; i < count; i++)
        {
            pTarget[targetOffset + i] = pSource[sourceOffset + i];
        }
    }
}

static void UnsafeCopyArrays()
{
    // Create two arrays of the same length.
    int length = 100;
    byte[] byteArray1 = new byte[length];
    byte[] byteArray2 = new byte[length];

    // Fill byteArray1 with 0 - 99.
    for (int i = 0; i < length; ++i)
    {
        byteArray1[i] = (byte)i;
    }

    // Display the first 10 elements in byteArray1.
    System.Console.WriteLine("The first 10 elements of the original are:");
    for (int i = 0; i < 10; ++i)
    {
        System.Console.Write(byteArray1[i] + " ");
    }
    System.Console.WriteLine("\n");

    // Copy the contents of byteArray1 to byteArray2.
    Copy(byteArray1, 0, byteArray2, 0, length);

    // Display the first 10 elements in the copy, byteArray2.
    System.Console.WriteLine("The first 10 elements of the copy are:");
    for (int i = 0; i < 10; ++i)
    {
        System.Console.Write(byteArray2[i] + " ");
    }
    System.Console.WriteLine("\n");

    // Copy the contents of the last 10 elements of byteArray1 to the
    // beginning of byteArray2.
    // The offset specifies where the copying begins in the source array.
    int offset = length - 10;
    Copy(byteArray1, offset, byteArray2, 0, length - offset);

    // Display the first 10 elements in the copy, byteArray2.
    System.Console.WriteLine("The first 10 elements of the copy are:");
    for (int i = 0; i < 10; ++i)
    {
        System.Console.Write(byteArray2[i] + " ");
    }
    System.Console.WriteLine("\n");
    /* Output:
        The first 10 elements of the original are:
        0 1 2 3 4 5 6 7 8 9

        The first 10 elements of the copy are:
        0 1 2 3 4 5 6 7 8 9

        The first 10 elements of the copy are:
        90 91 92 93 94 95 96 97 98 99
    */
}

Function pointers

C# provides delegate types to define safe function pointer objects. Invoking a delegate involves instantiating a type derived from System.Delegate and making a virtual method call to its Invoke method. This virtual call uses the callvirt IL instruction. In performance critical code paths, using the calli IL instruction is more efficient.

You can define a function pointer using the delegate* syntax. The compiler will call the function using the calli instruction rather than instantiating a delegate object and calling Invoke. The following code declares two methods that use a delegate or a delegate* to combine two objects of the same type. The first method uses a System.Func<T1,T2,TResult> delegate type. The second method uses a delegate* declaration with the same parameters and return type:

public static T Combine<T>(Func<T, T, T> combinator, T left, T right) => 
    combinator(left, right);

public static T UnsafeCombine<T>(delegate*<T, T, T> combinator, T left, T right) => 
    combinator(left, right);

The following code shows how you would declare a static local function and invoke the UnsafeCombine method using a pointer to that local function:

static int localMultiply(int x, int y) => x * y;
int product = UnsafeCombine(&localMultiply, 3, 4);

The preceding code illustrates several of the rules on the function accessed as a function pointer:

  • Function pointers can only be declared in an unsafe context.
  • Methods that take a delegate* (or return a delegate*) can only be called in an unsafe context.
  • The & operator to obtain the address of a function is allowed only on static functions. (This rule applies to both member functions and local functions).

The syntax has parallels with declaring delegate types and using pointers. The * suffix on delegate indicates the declaration is a function pointer. The & when assigning a method group to a function pointer indicates the operation takes the address of the method.

You can specify the calling convention for a delegate* using the keywords managed and unmanaged. In addition, for unmanaged function pointers, you can specify the calling convention. The following declarations show examples of each. The first declaration uses the managed calling convention, which is the default. The next four use an unmanaged calling convention. Each specifies one of the ECMA 335 calling conventions: Cdecl, Stdcall, Fastcall, or Thiscall. The last declaration uses the unmanaged calling convention, instructing the CLR to pick the default calling convention for the platform. The CLR will choose the calling convention at run time.

public static T ManagedCombine<T>(delegate* managed<T, T, T> combinator, T left, T right) =>
    combinator(left, right);
public static T CDeclCombine<T>(delegate* unmanaged[Cdecl]<T, T, T> combinator, T left, T right) =>
    combinator(left, right);
public static T StdcallCombine<T>(delegate* unmanaged[Stdcall]<T, T, T> combinator, T left, T right) =>
    combinator(left, right);
public static T FastcallCombine<T>(delegate* unmanaged[Fastcall]<T, T, T> combinator, T left, T right) =>
    combinator(left, right);
public static T ThiscallCombine<T>(delegate* unmanaged[Thiscall]<T, T, T> combinator, T left, T right) =>
    combinator(left, right);
public static T UnmanagedCombine<T>(delegate* unmanaged<T, T, T> combinator, T left, T right) =>
    combinator(left, right);

You can learn more about function pointers in the Function pointer feature spec.

C# language specification

For more information, see the Unsafe code chapter of the C# language specification.