Share via


Introduction to Grouping with LINQ/Lambda (VB.NET) Part 1

Introduction

Grouping information from a variety of data sources in a Visual Basic project for the purpose of categorizing data by one or more key elements outside of a reporting component will be discussed utilizing Language-Integrated Query LINQ and lambda (technically, this is called an “expression lambda”) using GroupBy clause and Enumerable.GroupBy method. Both anonymous and strong typed result sets examples used along with how to transform an anonymous result set when the results are created in a function needed in another method. Also covered, when to use LINQ verses lambda as there are advantages and disadvantages to both depending on a task to perform.

There are countless ways to group data while examples within are basic to intermediate level form the base for advance level grouping.  While learning how to group information what should not be done is attempting to fit code to a task without understanding both the code and task, otherwise simply copying and pasting code more likely than not will fail.

In the various code samples which follow different data sources and scenarios are used for exposure to real world task along with providing a broader scope of what can be done with grouping data. 

Part 2 LINQ/Lambda Group by multiple properties (VB.NET)

Requirements 

Basic LINQ Group by

Important
For proper learning starting out simple allows understanding how parts of a group by query statement works. 

The first group by task counts the occurrences of each character in a given string in a console application where results are written to the console output.  The code is in a procedure called by the main procedure.

Module Module1
 
    Sub Main()
        BasicExample()
 
        Console.ReadLine()
    End Sub
    Sub BasicExample()
        Dim InputString As String  = "T0*A1?0*23aTA3 4T4\+a4 ?407#?A*6T+"
 
        Dim characterGroup =
                (
                    From character In  InputString.ToCharArray()
                    Group characterElement = character By charElement = character Into Group
                    Select New  With {.Character = charElement, .Occurrences = Group.Count}
                ).
                ToList.
                OrderBy(Function(anonymous) anonymous.Character)
 
        Console.WriteLine($"Char    Occurrences")
        For Each  currentItem In  characterGroup
            Console.WriteLine($"[{currentItem.Character}] {currentItem.Occurrences,5}")
        Next
 
    End Sub
 
End Module

The variable containing results, characterGroup is an anonymous type which is only available within the procedure, to use the results a container such as a concrete class or DataTable with matching properties is needed. When working with anonymous type variables generally developers will use single character variable names and/or meaningless names for variable which when working with a simple set of code which is fine until weeks, months or years later maintenance is required to code were the LINQ anonymous statement is within complex code, with meaningless variable names its difficult to follow code logic and one character variables in a LINQ query is hard to hover over these variables during a debug session.

Here is an example that would be hard to maintain while the code sample above variable names are descriptive.

Dim InputString As String  = "T0*A1?0*23aTA3 4T4\+a4 ?407#?A*6T+"
 
Dim results = (From c In InputString.ToCharArray() Group c By c Into Group
               Select New  With {.Item = c, .Occurrences = Group.Count}).ToList.OrderBy(Function(x) x.Item)

Caveat, the query works as expected other than if the input string does not know how to differentiate between casing of letters which will be addressed in another example using IEqualityComparer Interface.

Continuing with this task, the next step is to make the result set available outside of the current code block, in this case a procedure which will move to creating a public container class. Although the result is anonymous the properties are easy enough to figure out, hover the mouse over character in the From clause reveals a Char or hover over various parts to see the types as shown below.

Although this works for a simple query this may not when dealing with complex queries. An easy way to reveal various types is to run the query from a console project as per below where in the For Each statement instead of displaying currentItem.Character and currentItem.Occurrences just the object currentItem is shown.

Dim InputString As String  = "T0*A1?0*23aTA3 4T4\+a4 ?407#?A*6T+"
 
Dim characterGroup =
        (
            From character In  InputString.ToCharArray()
            Group characterElement = character By charElement = character Into Group
            Select New  With {.Character = charElement, .Occurrences = Group.Count}
        ).
        ToList.
        OrderBy(Function(anonymous) anonymous.Character)
 
Console.WriteLine($"Char    Occurrences")
For Each  currentItem In  characterGroup
    Console.WriteLine($"{currentItem} - {currentItem.Character.GetType()} - {currentItem.Occurrences.GetType()}")
Next

The results show both expected results which in this case are the default for the ToString property followed by showing types of each property.

Knowing the types a class can be used which matches up to the types learned above.

Public Class  OccurrenceContainer
    Property Character() As Char
    Property Occurrences() As Integer
 
    Public Overrides  Function ToString() As String
        Return $"{Character} {Occurrences}"
    End Function
End Class

To use the class, replace 

Select New  With {.Character = charElement, .Occurrences = Group.Count}

With

Select New  OccurrenceContainer With {.Character = charElement, .Occurrences = Group.Count}

Which ends up in a function returning type IOrderedEnumerable(Of OccurrenceContainer)

Function BasicExampleStrongTyped() As IOrderedEnumerable(Of OccurrenceContainer)
    Dim InputString As String  = "T0*A1?0*23aTA3 4T4\+a4 ?407#?A*6T+"
 
    Dim characterGroup As IOrderedEnumerable(Of OccurrenceContainer) =
            (
                From character In  InputString.ToCharArray()
                Group characterElement = character By charElement = character Into Group
                Select New  OccurrenceContainer With {.Character = charElement, .Occurrences = Group.Count}
            ).
            ToList.
            OrderBy(Function(anonymous) anonymous.Character)
 
    Console.WriteLine($"Char    Occurrences")
    For Each  currentItem As  OccurrenceContainer In characterGroup
        Console.WriteLine(
            $"{currentItem} - {currentItem.Character.GetType()} - {currentItem.Occurrences.GetType()}")
    Next
 
    Return characterGroup
 
End Function

Suppose a similar task to obtain words rather than characters, little change is needed. Here a list is used, variable names changed to reflect words rather than characters

Sub BasicWordExample()
    Dim wordList = New List(Of String) From {"Hello", "Goodnight",  "Hello", "Good", "Last"}
 
    Dim wordGroup =
            (
                From word In  wordList
                Group wordElement = word By token = word Into Group
                Select New  With {.Word = token, .Occurrences = Group.Count}
            ).
            ToList.
            OrderBy(Function(anonymous) anonymous.Word)
 
 
    For Each  currentWord In  wordGroup
        Console.WriteLine($"{currentWord.Word}, {currentWord.Occurrences}")
    Next
 
End Sub

Just like working with characters the above result is anonymous, only available within the procedure. To fix this a class scoped properly is needed e.g.

Public Class  WordOccurrenceContainer
    Property Word() As String
    Property Occurrences() As Integer
 
    Public Overrides  Function ToString() As String
        Return $"{Word} {Occurrences}"
    End Function
End Class

Now the results are available using a function.

Function WordExampleStrongTyped() As IOrderedEnumerable(Of WordOccurrenceContainer)
    Dim wordList = New List(Of String) From {"Hello", "Goodnight",  "Hello", "Good", "Last"}
 
    Dim wordGroup As IOrderedEnumerable(Of WordOccurrenceContainer) =
            (
                From word In  wordList
                Group wordElement = word By token = word Into Group
                Select New  WordOccurrenceContainer With {.Word = token, .Occurrences = Group.Count}
            ).
            ToList.
            OrderBy(Function(anonymous) anonymous.Word)
 
    Return wordGroup
 
End Function

Group by single property 

Anonymous types
Allow you to select arbitrary fields into data structures that are strongly typed later on in your code

Collections of a class were statistics are required often are done best using a Group By in a LINQ/Lambda query for example, in a school a report is required for listing students by last name. Walking through obtaining this is done projecting using a group by clause and optionally filtering by a count.

Definition of a student starts with a Person class which the Student class inherits.

Public Class  Person
    Public Property  Id() As  Integer
    Public Property  FirstName() As  String
    Public Property  LastName As  String
    Public Property  City As  String
    Public Property  Country As  String
 
    Public Overrides  Function ToString() As String
        Return $"{FirstName} {LastName}"
    End Function
 
End Class

Student class

Public Class  Student
    Inherits Person
    Public Property  Year() As  Integer
    Public ExamScores As List(Of Integer)
 
End Class

For a data source a mocked list is used rather than a database table were using a database table is fine but requires more attention while a mocked list or a delimited text file works better as there are less moving parts.

Namespace Classes
    Public Class  Students
        Public Shared  Function List() As List(Of Student)
            Dim studentList As New  List(Of Student)
 
            studentList.Add(New Student() With {
                               .FirstName = "Mary",
                               .LastName = "Lee",
                               .Year = 2019,
                               .ExamScores = New  List(Of Integer) From {97, 72, 81, 60}})
            studentList.Add(New Student() With {
                               .FirstName = "Jerry",
                               .LastName = "Foster",
                               .Year = 2019,
                               .ExamScores = New  List(Of Integer) From {75, 84, 91, 39}})
            studentList.Add(New Student() With {
                               .FirstName = "Chris",
                               .LastName = "Scott",
                               .Year = 2020,
                               .ExamScores = New  List(Of Integer) From {88, 94, 65, 85}})
            studentList.Add(New Student() With {
                               .FirstName = "Jason",
                               .LastName = "Wong",
                               .Year = 2019, .ExamScores = New  List(Of Integer) From {97, 89, 85, 82}})
            studentList.Add(New Student() With {
                               .FirstName = "James",
                               .LastName = "Lang",
                               .Year = 2020,
                               .ExamScores = New  List(Of Integer) From {35, 72, 91, 70}})
            studentList.Add(New Student() With {
                               .FirstName = "Ming",
                               .LastName = "Lee",
                               .Year = 2019,
                               .ExamScores = New  List(Of Integer) From {55, 62, 95, 75}})
            studentList.Add(New Student() With {
                               .FirstName = "Karen",
                               .LastName = "Scott",
                               .Year = 2020, .ExamScores = New  List(Of Integer) From {45, 82, 81, 80}})
            studentList.Add(New Student() With {
                               .FirstName = "Edward",
                               .LastName = "Lee",
                               .Year = 2019,
                               .ExamScores = New  List(Of Integer) From {55, 72, 81, 90}})
            studentList.Add(New Student() With {
                               .FirstName = "Mary",
                               .LastName = "Smith",
                               .Year = 2019, .ExamScores = New  List(Of Integer) From {75, 77, 91, 80}})
 
            Return studentList
 
        End Function
    End Class
End Namespace

Starting off using an anonymous LINQ query which does not use the classes above, only the mocked data.

Dim students = PersonLibrary.Classes.Students.List()
 
Dim queryLastNames =
        From student In  students
        Group By LastName = student.LastName Into studentGroup = Group
        Where studentGroup.Count() > 1
        Order By LastName
 
 
For Each  currentItem In  queryLastNames
 
    Console.WriteLine(currentItem.LastName)
 
    For Each  student As  Student In  currentItem.studentGroup
        Console.WriteLine($"   {student}")
    Next
 
Next

Results in a console application

If the intent is to use this data outside of the current procedure the query may be altered a strong typed query as done in the character and string examples, simply hover over the query to see the types.

Rather than modify the query above simply use it as shown below which create a strong type result.

Container

Public Class  StudentContainer1
    Public Property  LastName() As  String
    Public Property  List() As  IEnumerable(Of Student)
End Class

Additional query

Dim students = PersonLibrary.Classes.Students.List()
 
Dim queryLastNames =
        From student In  students
        Group By LastName = student.LastName Into studentGroup = Group
        Where studentGroup.Count() > 1
        Order By LastName
 
 
Dim EnumerationOfStudentContainer As IEnumerable(Of StudentContainer1) =
        queryLastNames.Select(Function(item) New  StudentContainer1 With {.LastName = item.LastName, .List = item.studentGroup})
 
For Each  currentItem As  StudentContainer1 In  EnumerationOfStudentContainer
    Console.WriteLine($"{currentItem.LastName}")
    For Each  student As  Student In  currentItem.List
        Console.WriteLine($"   {student}")
    Next
Next

Caveat: a LINQ query can have conditions appended onto a query similarly to what was done above. For example, the following query reads xml data from a file then allows where conditions to be added to the original query.

What should not be done is to merge the two queries together, for instance this is perfect, two queries that are easy to maintain.

Dim students = PersonLibrary.Classes.Students.List()
 
Dim queryLastNames =
        From student In  students
        Group By LastName = student.LastName Into studentGroup = Group
        Where studentGroup.Count() > 1
        Order By LastName
 
 
Dim EnumerationOfStudentContainer As IEnumerable(Of StudentContainer1) =
        queryLastNames.Select(Function(item) New  StudentContainer1 With {
                                 .LastName = item.LastName,
                                 .List = item.studentGroup})

While combing the two queries into one is going to be harder to alter and maintain as shown below. At first this may seem cool but wait a week or two and revisit the code and see if its still cool.

Dim EnumerationOfStudentContainer As IEnumerable(Of StudentContainer1) = 
        (
            From student In  students
            Group By LastName = student.LastName Into studentGroup = Group
            Where studentGroup.Count() > 1 Order By LastName).
        Select(Function(item) New  StudentContainer1 With {
                  .LastName = item.LastName,
                  .List = item.studentGroup})

Group by with min, max, count, average

A common requirement is to obtain count, totals, min, max and averages from a group by. The following example gets max orders for customer orders using Entity Framework where Orders is considered a navigation property of customers.

Public Sub  MaxOrdersByCountry(context As NorthWindContext)
 
    Dim maximumOrdersByCountry =
            From customer In  context.Customers
            Group By customer.Country
            Into MaxOrders = Max(customer.Orders.Count)
 
    Dim results = maximumOrdersByCountry.ToList()
 
    For Each  item In  results
        Console.WriteLine($"{item.Country.Name} - {item.MaxOrders}")
    Next
 
End Sub

Results shown provide count of orders per country.

There is a problem with the query above, using Visual Studio local window the following shows the entire country object is included.

To only bring across country name and max order a select clause is needed.

Dim maximumOrdersByCountry =
        From customer In  context.Customers
        Group customer By customer.Country Into grouping = Group
        Select Country.Name, MaxOrders = grouping.Max(Function(x) x.Orders.Count)

To allow the results to be used by the caller create a class to represent the above properties, Country name and max orders.

Public Class  MaxOrder
    Public Property  Country() As  String
    Public Property  MaxOrders() As  Integer
End Class

Implement in the query

Dim maximumOrdersByCountry =
        From customer In  context.Customers
        Group customer By customer.Country Into grouping = Group
        Select New  MaxOrder With  {.Country = Country.Name, .MaxOrders = grouping.Max(Function(x) x.Orders.Count)}
 
Dim results As List(Of MaxOrder) = maximumOrdersByCountry.ToList()

Entity Framework 6

Grouping is written no different from prior examples using LINQ, Lambda or a combination of the two. The the following example grouping is on Microsoft NorthWind database grouping products by categories.

The following container class is used to return data to a calling form from a class performing reading/grouping of the data. When first writing a lambda query which follows below to know what properties are required use the same trick discussed at the beginning of this article or set a breakpoint after an anonymous query has executed, inspect the result set using the local window in Visual Studio while in a debug session. After writing several statements it will become apparent what properties and their types are needed.
 

Imports CategoryProductEntitiesLibrary.Models
 
Namespace ContainerClasses
    Public Class  ProductByCategory
        Public Property  Category() As  Category
        Public Property  GroupCategoryProducts() As IGrouping(Of Category, Product)
    End Class
End Namespace

Backend code to perform the group by operation. Note the code is executed asynchrony using lambda syntax. 

  • The Task performing the query does not have a useful name although this is inferred by the method return type.
  • The .Select selector uses 'group' variable, not a useful name although this is know from product.Category in the GroupBy above.
Public Async Function GroupProductByCategoryTask() As Task(Of List(Of ProductByCategory))
 
    Using context As  New ProductContext
 
        Dim results = Await Task.Run(
            Function()
 
                Return context.Products.GroupBy(Function(product) product.Category).
                                        Select(Function(group) New  ProductByCategory With {
                                                      .Category = group.Key,
                                                      .GroupCategoryProducts = group
                                                  }).
                                        ToListAsync()
            End Function)
 
 
        Return results
 
    End Using
 
End Function

The Key keyword used above [.Category = group.Key] enables you to specify behavior for properties of anonymous types.

In the form calling the code above uses the results in productByCategories into a ListView.

Private Async Sub GroupProductByCategoryButton_Click(sender As Object, e As  EventArgs) _
    Handles GroupProductByCategoryButton.Click
 
    ListView1.Items.Clear()
 
    Dim productByCategories As List(Of ProductByCategory) = Await operations.GroupProductByCategoryTask()
 
    For Each  group As  ProductByCategory In  productByCategories
 
        ListView1.Items.Add(New ListViewItem(group.Category.CategoryName))
 
        For Each  product As  Product In  group.GroupCategoryProducts
            ListView1.Items.Add(New ListViewItem(New String() {"", product.ProductName}))
        Next
 
        Await Task.Delay(1)
 
    Next
 
    TidyupListView()
 
End Sub
Private Sub  TidyupListView()
    ListView1.AutoResizeColumns(ColumnHeaderAutoResizeStyle.HeaderSize)
    ListView1.EndUpdate()
 
    ListView1.FocusedItem = ListView1.Items(0)
    ListView1.Items(0).Selected = True
 
    ActiveControl = ListView1
 
End Sub


Suppose rather than A-Z sort on category Z-A is desired, the following uses OrderByDecending which results a different container for returning results. Note the use of Key which is a keyword specifying the property of the group.

Public Function  GroupProductSortCategoryDescending() As List(Of GroupSortProduct)
    Dim results As IOrderedQueryable(Of GroupSortProduct)
 
    Using context As  New ProductContext
        '
        ' Here we are grouping by CategoryName under Products
        ' then creating a strong type for the select and within the select
        ' sorting the products by product name followed
        ' by sorting category, the key for the group in descending order.
        '
        results =
                context.Products.GroupBy(Function(product) product.Category.CategoryName).
                Select(Function(group) New  GroupSortProduct With {
                              .CategoryName = group.Key,
                              .Products = group.OrderBy(Function(prod) prod.ProductName)
                          }).
                OrderByDescending(Function(group) group.CategoryName)
 
        Return results.ToList()
 
    End Using
End Function

Container for the above method.

Imports CategoryProductEntitiesLibrary.Models
 
Namespace ContainerClasses
    Public Class  GroupSortProduct
        Public Property  CategoryName() As  String
        Public Property  Products() As  IOrderedEnumerable(Of Product)
    End Class
End Namespace

Often calculations are required e.g. sum, max, min of values in items within each group. In the following example the task is to group products by category, same as above with the total/sum of units of products.

Container class

Namespace ContainerClasses
 
    Public Class  ProductsGroupedSummed
        Public Property  Category() As  String
        Public Property  TotalUnitsInStock() As Integer?
    End Class
End Namespace

Code to perform the group with sumon the property UnitsInStock. The same code logic can be used in other queries on different tables.

Public Function  ProductsGroupedSummed() As List(Of ProductsGroupedSummed)
    Using context As  New ProductContext
 
        Dim productsGroupedSummedResults As New  List(Of ProductsGroupedSummed)
 
        Dim categories = context.Products.GroupBy(Function(product) product.Category).
                Select(Function(productGroup) New  With {
                          Key .Category = productGroup.Key,
                          Key .TotalUnitsInStock = productGroup.Sum(Function(p) p.UnitsInStock)
                      })
 
        For Each  cat In  categories
 
            productsGroupedSummedResults.Add(New ProductsGroupedSummed With {
                           .Category = cat.Category.CategoryName,
                           .TotalUnitsInStock = cat.TotalUnitsInStock
                       })
        Next
 
 
        Return productsGroupedSummedResults
 
    End Using
 
End Function

Working with DataTable containers

The main difference between working with classes and DataTable containers is syntactical sugar.  When working with classes values are accessed using class properties while with DataTable containers values are access using DataRow.Field(Of T)("Field name") extension method along with .AsEnumerable() method of the DataTable container.

First a group by using classes

Sub AnonymousToStrongTypedVersion()
    Dim operations = New DataOperations
    Dim customers = operations.CustomerList()
 
    Dim baseQuery =
            From customer In  customers Order By customer.City
            Group By CountryName = customer.Country Into regionalCustomers = Group
            Order By CountryName
 
    Dim customersByCountry As IEnumerable(Of CountryCompanyContainer) =
            baseQuery.Select(Function(customer) New  CountryCompanyContainer With {
                                .CountryName = customer.CountryName,
                                .Customers = customer.regionalCustomers,
                                .Count = customer.regionalCustomers.Count()
                                })
 
    For Each  TopGroup In  customersByCountry
        Console.WriteLine($"{TopGroup.CountryName} ({TopGroup.Count})")
        For Each  innerGroup In  TopGroup.Customers
            Console.WriteLine($"{innerGroup.CustomerIdentifier,5} {innerGroup.CompanyName} ({innerGroup.City})")
        Next
    Next
 
End Sub

The same operation performed using a DataTable

Private Sub  CustomersDataTableGrouping()
    Dim operations = New DataOperations
    Dim customers = operations.CustomerDataTable()
 
    Dim baseQuery =
            From customer In  customers.AsEnumerable()
            Order By customer.Field(Of String)("City")
            Group By CountryName = customer.Field(Of String)("Country") Into regionalCustomers = Group, Count()
            Order By CountryName
 
    Dim customersByCountry =
            baseQuery.Select(Function(customer) New  CountryCompanyDataTableContainer With {
                                .CountryName = customer.CountryName,
                                .Customers = customer.regionalCustomers,
                                .Count = customer.regionalCustomers.Count()
                                })
 
    For Each  container As  CountryCompanyDataTableContainer In customersByCountry
        Console.WriteLine($"{container.CountryName} ({container.Count})")
        For Each  dataRow As  DataRow In  container.Customers
            Console.WriteLine($"     {dataRow.Values()}")
        Next
    Next
 
End Sub

To shortcut displaying data row values the following extension method provides output.

Imports System.Runtime.CompilerServices
 
Namespace Classes
    Public Module  DataExtensions
        <Extension>
        Public Function  Values(sender As  DataRow) As  String
            Return String.Join(",", sender.ItemArray)
        End Function
    End Module
End NameSpace

Summary

Information provided to write basic LINQ/Lambda Group by query using classes and other containers to assist with writing queries effectively in Visual Studio solutions is a starting point. In the next part focus will be on grouping by multiple properties both with classes and other data containers.

See also

LINQ/Lambda Group by multiple properties (VB.NET)
VB.NET Writing better code recommendations

Source code

The following GitHub repository contains all code samples presented along with extra code samples all guaranteed to work.