Tutorial: Write queries in C# using language integrated query (LINQ)

In this tutorial, you create a data source and write several LINQ queries. You can experiment with the query expressions and see the differences in the results. This walkthrough demonstrates the C# language features that are used to write LINQ query expressions. You can follow along and build the app and experiment with the queries yourself. This article assumes you've installed the latest .NET SDK. If not, go to the .NET Downloads page and install the latest version on your machine.

First, create the application. From the console, type the following command:

dotnet new console -o WalkthroughWritingLinqQueries

Or, if you prefer Visual Studio, create a new console application named WalkthroughWritingLinqQueries.

Create an in-memory data source

The first step is to create a data source for your queries. The data source for the queries is a simple list of Student records. Each Student record has a first name, family name, and an array of integers that represents their test scores in the class. Add a new file named students.cs, and copy the following code into that file:

namespace WalkthroughWritingLinqQueries;

public record Student(string First, string Last, int ID, int[] Scores);

Note the following characteristics:

  • The Student record consists of automatically implemented properties.
  • Each student in the list is initialized with the primary constructor.
  • The sequence of scores for each student is initialized with a primary constructor.

Next, create a sequence of Student records that serves as the source of this query. Open Program.cs, and remove the following boilerplate code:

// See https://aka.ms/new-console-template for more information
Console.WriteLine("Hello, World!");

Replace it with the following code that creates a sequence of Student records:

using WalkthroughWritingLinqQueries;

// Create a data source by using a collection initializer.
IEnumerable<Student> students =
[
    new Student(First: "Svetlana", Last: "Omelchenko", ID: 111, Scores: [97, 92, 81, 60]),
    new Student(First: "Claire",   Last: "O'Donnell",  ID: 112, Scores: [75, 84, 91, 39]),
    new Student(First: "Sven",     Last: "Mortensen",  ID: 113, Scores: [88, 94, 65, 91]),
    new Student(First: "Cesar",    Last: "Garcia",     ID: 114, Scores: [97, 89, 85, 82]),
    new Student(First: "Debra",    Last: "Garcia",     ID: 115, Scores: [35, 72, 91, 70]),
    new Student(First: "Fadi",     Last: "Fakhouri",   ID: 116, Scores: [99, 86, 90, 94]),
    new Student(First: "Hanying",  Last: "Feng",       ID: 117, Scores: [93, 92, 80, 87]),
    new Student(First: "Hugo",     Last: "Garcia",     ID: 118, Scores: [92, 90, 83, 78]),

    new Student("Lance",   "Tucker",      119, [68, 79, 88, 92]),
    new Student("Terry",   "Adams",       120, [99, 82, 81, 79]),
    new Student("Eugene",  "Zabokritski", 121, [96, 85, 91, 60]),
    new Student("Michael", "Tucker",      122, [94, 92, 91, 91])
];
  • The sequence of students is initialized with a collection expression.
  • The Student record type holds the static list of all students.
  • Some of the constructor calls use named arguments to clarify which argument matches which constructor parameter.

Try adding a few more students with different test scores to the list of students to get more familiar with the code so far.

Create the query

Next, you create your first query. Your query, when you execute it, produces a list of all students whose score on the first test was greater than 90. Because the whole Student object is selected, the type of the query is IEnumerable<Student>. Although the code could also use implicit typing by using the var keyword, explicit typing is used to clearly illustrate results. (For more information about var, see Implicitly Typed Local Variables.) Add the following code to Program.cs, after the code that creates the sequence of students:

// Create the query.
// The first line could also be written as "var studentQuery ="
IEnumerable<Student> studentQuery =
    from student in students
    where student.Scores[0] > 90
    select student;

The query's range variable, student, serves as a reference to each Student in the source, providing member access for each object.

Run the query

Now write the foreach loop that causes the query to execute. Each element in the returned sequence is accessed through the iteration variable in the foreach loop. The type of this variable is Student, and the type of the query variable is compatible, IEnumerable<Student>. After you added the following code, build and run the application to see the results in the Console window.

// Execute the query.
// var could be used here also.
foreach (Student student in studentQuery)
{
    Console.WriteLine($"{student.Last}, {student.First}");
}

// Output:
// Omelchenko, Svetlana
// Garcia, Cesar
// Fakhouri, Fadi
// Feng, Hanying
// Garcia, Hugo
// Adams, Terry
// Zabokritski, Eugene
// Tucker, Michael

To further refine the query, you can combine multiple Boolean conditions in the where clause. The following code adds a condition so that the query returns those students whose first score was over 90 and whose last score was less than 80. The where clause should resemble the following code.

where student.Scores[0] > 90 && student.Scores[3] < 80  

Try the preceding where clause, or experiment yourself with other filter conditions. For more information, see where clause.

Order the query results

It's easier to scan the results if they are in some kind of order. You can order the returned sequence by any accessible field in the source elements. For example, the following orderby clause orders the results in alphabetical order from A to Z according to the family name of each student. Add the following orderby clause to your query, right after the where statement and before the select statement:

orderby student.Last ascending

Now change the orderby clause so that it orders the results in reverse order according to the score on the first test, from the highest score to the lowest score.

orderby student.Scores[0] descending

Change the WriteLine format string so that you can see the scores:

Console.WriteLine($"{student.Last}, {student.First} {student.Scores[0]}");

For more information, see orderby clause.

Group the results

Grouping is a powerful capability in query expressions. A query with a group clause produces a sequence of groups, and each group itself contains a Key and a sequence that consists of all the members of that group. The following new query groups the students by using the first letter of their family name as the key.

IEnumerable<IGrouping<char, Student>> studentQuery =
    from student in students
    group student by student.Last[0];

The type of the query changed. It now produces a sequence of groups that have a char type as a key, and a sequence of Student objects. The code in the foreach execution loop also must change:

foreach (IGrouping<char, Student> studentGroup in studentQuery)
{
    Console.WriteLine(studentGroup.Key);
    foreach (Student student in studentGroup)
    {
        Console.WriteLine($"   {student.Last}, {student.First}");
    }
}
// Output:
// O
//   Omelchenko, Svetlana
//   O'Donnell, Claire
// M
//   Mortensen, Sven
// G
//   Garcia, Cesar
//   Garcia, Debra
//   Garcia, Hugo
// F
//   Fakhouri, Fadi
//   Feng, Hanying
// T
//   Tucker, Lance
//   Tucker, Michael
// A
//   Adams, Terry
// Z
//   Zabokritski, Eugene

Run the application and view the results in the Console window. For more information, see group clause.

Explicitly coding IEnumerables of IGroupings can quickly become tedious. Write the same query and foreach loop much more conveniently by using var. The var keyword doesn't change the types of your objects; it just instructs the compiler to infer the types. Change the type of studentQuery and the iteration variable group to var and rerun the query. In the inner foreach loop, the iteration variable is still typed as Student, and the query works as before. Change the student iteration variable to var and run the query again. You see that you get exactly the same results.

IEnumerable<IGrouping<char, Student>> studentQuery =
    from student in students
    group student by student.Last[0];

foreach (IGrouping<char, Student> studentGroup in studentQuery)
{
    Console.WriteLine(studentGroup.Key);
    foreach (Student student in studentGroup)
    {
        Console.WriteLine($"   {student.Last}, {student.First}");
    }
}

For more information about var, see Implicitly Typed Local Variables.

Order the groups by their key value

The groups in the previous query aren't in alphabetical order. You can provide an orderby clause after the group clause. But to use an orderby clause, you first need an identifier that serves as a reference to the groups created by the group clause. You provide the identifier by using the into keyword, as follows:

var studentQuery4 =
    from student in students
    group student by student.Last[0] into studentGroup
    orderby studentGroup.Key
    select studentGroup;

foreach (var groupOfStudents in studentQuery4)
{
    Console.WriteLine(groupOfStudents.Key);
    foreach (var student in groupOfStudents)
    {
        Console.WriteLine($"   {student.Last}, {student.First}");
    }
}

// Output:
//A
//   Adams, Terry
//F
//   Fakhouri, Fadi
//   Feng, Hanying
//G
//   Garcia, Cesar
//   Garcia, Debra
//   Garcia, Hugo
//M
//   Mortensen, Sven
//O
//   Omelchenko, Svetlana
//   O'Donnell, Claire
//T
//   Tucker, Lance
//   Tucker, Michael
//Z
//   Zabokritski, Eugene

Run this query, and the groups are now sorted in alphabetical order.

You can use the let keyword to introduce an identifier for any expression result in the query expression. This identifier can be a convenience, as in the following example. It can also enhance performance by storing the results of an expression so that it doesn't have to be calculated multiple times.

// This query returns those students whose
// first test score was higher than their
// average score.
var studentQuery5 =
    from student in students
    let totalScore = student.Scores[0] + student.Scores[1] +
        student.Scores[2] + student.Scores[3]
    where totalScore / 4 < student.Scores[0]
    select $"{student.Last}, {student.First}";

foreach (string s in studentQuery5)
{
    Console.WriteLine(s);
}

// Output:
// Omelchenko, Svetlana
// O'Donnell, Claire
// Mortensen, Sven
// Garcia, Cesar
// Fakhouri, Fadi
// Feng, Hanying
// Garcia, Hugo
// Adams, Terry
// Zabokritski, Eugene
// Tucker, Michael

For more information, see the article on the let clause.

Use method syntax in a query expression

As described in Query Syntax and Method Syntax in LINQ, some query operations can only be expressed by using method syntax. The following code calculates the total score for each Student in the source sequence, and then calls the Average() method on the results of that query to calculate the average score of the class.

var studentQuery =
    from student in students
    let totalScore = student.Scores[0] + student.Scores[1] +
        student.Scores[2] + student.Scores[3]
    select totalScore;

double averageScore = studentQuery.Average();
Console.WriteLine("Class average score = {0}", averageScore);

// Output:
// Class average score = 334.166666666667

To transform or project in the select clause

It's common for a query to produce a sequence whose elements differ from the elements in the source sequences. Delete or comment out your previous query and execution loop, and replace it with the following code. The query returns a sequence of strings (not Students), and this fact is reflected in the foreach loop.

IEnumerable<string> studentQuery =
    from student in students
    where student.Last == "Garcia"
    select student.First;

Console.WriteLine("The Garcias in the class are:");
foreach (string s in studentQuery)
{
    Console.WriteLine(s);
}

// Output:
// The Garcias in the class are:
// Cesar
// Debra
// Hugo

Code earlier in this walkthrough indicated that the average class score is approximately 334. To produce a sequence of Students whose total score is greater than the class average, together with their Student ID, you can use an anonymous type in the select statement:

var aboveAverageQuery =
    from student in students
    let x = student.Scores[0] + student.Scores[1] +
        student.Scores[2] + student.Scores[3]
    where x > averageScore
    select new { id = student.ID, score = x };

foreach (var item in aboveAverageQuery)
{
    Console.WriteLine("Student ID: {0}, Score: {1}", item.id, item.score);
}

// Output:
// Student ID: 113, Score: 338
// Student ID: 114, Score: 353
// Student ID: 116, Score: 369
// Student ID: 117, Score: 352
// Student ID: 118, Score: 343
// Student ID: 120, Score: 341
// Student ID: 122, Score: 368