LINQ 101: A Beginner's Guide to Mastering Data Queries in .NET

Learn the Fundamentals of Querying Data with LINQ in .NET

LINQ stands for Language-Integrated Query; "integrated" means that querying is part of the language itself. It is available in the System.Linq namespace.
LINQ is a technology that integrates queries into the C# programming language. Normally you have to learn a different query language for each type of data source (XML documents, SQL databases, and so on). With LINQ, a query is a first-class language construct, just like classes or events. You write queries against collections of objects using language keywords and operators, and you can filter, order, and group those collections with minimal code. You can write LINQ queries in C# for SQL databases, XML documents, and any object collection that supports the IEnumerable interface.
LINQ provides a compact, expressive, and readable syntax for manipulating data.

If you are new to C#, "What is C# Programming Language? A Beginner's Guide" will help you get started. If you want to learn OOP as well, check out "Learn object oriented programming in C# from scratch".

Why should I learn LINQ?

As a C# developer, you should learn LINQ for multiple reasons:

  • By learning LINQ, you don't have to learn a separate query language for each data source; the same query syntax works across different data sources.

  • Using LINQ, you write less code than with traditional approaches such as hand-written loops.

  • LINQ queries are type-checked at compile time, so errors such as a misspelled property or a type mismatch are reported by the compiler instead of at run time. LINQ also has many built-in methods to support different operations.

How does LINQ work?

LINQ queries are translated into the form that the data source requires with the help of a LINQ provider. A LINQ provider is a piece of software that implements the IQueryProvider and IQueryable interfaces for a particular data store. In other words, it allows you to write queries against that data source.
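To make the provider idea a little more concrete, here is a minimal sketch that wraps an in-memory array with AsQueryable(), so the query goes through the built-in in-memory provider. The array and variable names are made up for illustration; a real database provider would translate the same expression into SQL instead.

```csharp
// Top-level statements (C# 9+); in older projects, place the body inside Main.
using System;
using System.Linq;

int[] numbers = { 4, 9, -5, 2 };

// AsQueryable() exposes the array as an IQueryable<int> backed by an in-memory provider.
IQueryable<int> query = numbers.AsQueryable().Where(n => n > 0);

// The query is kept as an expression tree that the provider interprets when it runs.
Console.WriteLine(query.Provider.GetType().Name);
Console.WriteLine(query.Expression);

// Enumerating the query asks the provider to execute it.
int[] positives = query.ToArray();
Console.WriteLine(string.Join(", ", positives)); // 4, 9, 2
```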

Before you continue reading: if you don't know me, my name is Hamza, and I am a C# and ML software engineer. If you want to start developing software and learn ML as a software engineer, I will guide you in this adventure and give you tips every day. Follow me and subscribe to my newsletter to receive my daily articles.

Query expressions

  • Query expressions are easy to learn because they use a readable, SQL-like syntax built from familiar C# constructs.

  • Queries are used to filter and transform data from any data source that supports the IEnumerable interface.

  • A query is not executed until you iterate over it, for example with a foreach statement.
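The last point, deferred execution, is easiest to see by changing the data source after the query is defined. The following is a small sketch with a made-up list of numbers:

```csharp
// Top-level statements (C# 9+); in older projects, place the body inside Main.
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3 };

// Defining the query does not run it.
IEnumerable<int> evens = from n in numbers
                         where n % 2 == 0
                         select n;

// This element is added after the query was defined, but before it is iterated.
numbers.Add(4);

// The query runs here, so it sees the element added above.
foreach (var n in evens)
{
    Console.WriteLine(n); // prints 2, then 4
}
```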

Ways to write a query

In LINQ there are three ways to write a query:

  • Query Syntax

  • Method Syntax

  • Mixed Syntax (Query + Method)
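As a quick preview, the sketch below writes the same filter in each of the three styles. The sample array is invented for the illustration; each style is covered in more detail later in the article.

```csharp
// Top-level statements (C# 9+); in older projects, place the body inside Main.
using System;
using System.Linq;

int[] numbers = { 4, 9, -5, 2, -8 };

// Query syntax
var viaQuery = from n in numbers
               where n > 0
               select n;

// Method syntax
var viaMethod = numbers.Where(n => n > 0);

// Mixed syntax: a query expression followed by a method call
int count = (from n in numbers
             where n > 0
             select n).Count();

Console.WriteLine(viaQuery.SequenceEqual(viaMethod)); // True
Console.WriteLine(count);                             // 3
```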

LINQ Queries

A query is a set of instructions applied to a data source to perform an operation on it (one of the CRUD operations); in LINQ, queries are used to extract data from data sources. Queries are traditionally expressed in specialized languages, such as SQL for relational databases, so developers were obliged to learn a new query language for each kind of data source. LINQ simplifies this by allowing you to work with any data source that supports the IEnumerable interface. In LINQ you are dealing with collections of objects, such as a list of cars or an array of integers.

The syntax of the query is as follows:

from Object in Datasource
where Condition
select Object

Now, let's take a simple example where we show how LINQ queries are used in C#:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace LINQ_DEMO
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create a data source
            int[] arr = new int[8] 
            { 4, 9, -5, 2, -8, 7, 6, -3 };

            // Create a Query 
            var myQuery =
                from number in arr
                where number <= 0
                select number;

            // Execute the query
            foreach(var item in myQuery)
            {
                Console.WriteLine(item);
            }
            Console.ReadLine();
        }
    }
}

Here, we created our data source, which is an array of integers. Then we created a query named myQuery, whose type is IEnumerable<int>. Finally, we iterated over the query to print all the values. In LINQ, the execution of a query is distinct from the query itself: you do not get any data just by creating the query with var myQuery = from number in arr where number <= 0 select number;

To work with a LINQ query, we need the following components: the data source, the query, and the execution of the query.

The data source

In the previous example, our data source is an array of integers. Arrays support the IEnumerable<T> interface, which means they can be used with LINQ queries. A query is executed in a foreach statement, and foreach works with types that implement IEnumerable or IEnumerable<T>.
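Arrays are only one kind of data source. As a rough illustration, the sketch below queries a string (an IEnumerable<char>) and a dictionary (an IEnumerable of key/value pairs); the sample word and the stock dictionary are made up for the example.

```csharp
// Top-level statements (C# 9+); in older projects, place the body inside Main.
using System;
using System.Collections.Generic;
using System.Linq;

// A string is an IEnumerable<char>.
string word = "Language";
var vowels = from c in word
             where "aeiou".Contains(c)
             select c;

// A Dictionary<TKey, TValue> is an IEnumerable<KeyValuePair<TKey, TValue>>.
var stock = new Dictionary<string, int> { ["apple"] = 3, ["pear"] = 0, ["plum"] = 7 };
var inStock = from item in stock
              where item.Value > 0
              select item.Key;

Console.WriteLine(string.Join("", vowels));                    // auae
Console.WriteLine(string.Join(", ", inStock.OrderBy(k => k))); // apple, plum
```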

The Query

The query specifies which data to retrieve from a given data source. It can also describe how that data should be sorted, grouped, and shaped before it is returned. The query is stored in a variable of type IEnumerable<T>. The previous query returns all the numbers from the array that are less than or equal to zero, which here are the negative ones: -5, -8, and -3.

The query execution

As stated before, the query variable stores only the query command; to get results, it has to be executed, typically with a foreach statement. Since the query does not store the results, you can execute it whenever you want, and as many times as you want. You can, however, force immediate execution of a query by calling the ToList() or ToArray() methods.

// Forcing the execution: ToArray() returns an int[]
var myQuery2 = (from number in arr
                where number <= 0
                select number).ToArray();
// Forcing the execution: ToList() returns a List<int>
var myQuery3 = (from number in arr
                where number <= 0
                select number).ToList();
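The practical difference between a deferred query and a forced one shows up when the data source changes afterwards. Here is a small sketch with a made-up list; the names live and snapshot are only for illustration.

```csharp
// Top-level statements (C# 9+); in older projects, place the body inside Main.
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 4, -5, 2 };

// Deferred: re-evaluated against the list every time it is iterated.
var live = from n in numbers where n <= 0 select n;

// Immediate: ToList() runs the query now and stores the results.
var snapshot = (from n in numbers where n <= 0 select n).ToList();

numbers.Add(-8);

Console.WriteLine(live.Count());   // 2 (sees -5 and the newly added -8)
Console.WriteLine(snapshot.Count); // 1 (only -5, captured before the Add)
```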

Avoid using the generic syntax

You can avoid writing out the generic types (IEnumerable<T>, List<T>, ...) by using the var keyword, which lets the compiler infer the type from the query. The var keyword is useful when the type of the variable is obvious or when it is not important to state it explicitly.
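As an illustration, the sketch below declares the same query with an explicit type and with var, then shows a projection into an anonymous type, where var is the only option because the type has no name you could write. The array and variable names are invented for the example.

```csharp
// Top-level statements (C# 9+); in older projects, place the body inside Main.
using System;
using System.Collections.Generic;
using System.Linq;

int[] arr = { 4, 9, -5, 2 };

// Explicit generic type
IEnumerable<int> explicitQuery = from n in arr where n > 0 select n;

// Same query; the compiler infers IEnumerable<int>
var inferredQuery = from n in arr where n > 0 select n;

// With an anonymous type there is no type name to write, so var is required.
var labelled = from n in arr
               select new { Value = n, IsPositive = n > 0 };

foreach (var item in labelled)
{
    Console.WriteLine($"{item.Value}: {item.IsPositive}");
}
```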

LINQ Query operations

Before we start talking in detail about LINQ queries, let's create a class called Person:

/// <summary>
/// Class Person
/// </summary>
internal class Person
{
    // Name of the person
    public string Name { get; set; }
    // Age of the person
    public int Age { get; set; }
    // The city of the person
    public string City { get; set; }
}

Then, in the Main method, let's create a list of people:

var people = new List<Person>()
{ 
    new Person(){Name = "john", Age = 5, City = "Rabat"},
    new Person(){Name = "Joe", Age = 54, City = "Madrid"},
    new Person(){Name = "Aladdin", Age = 40, City = "Madrid"},
    new Person(){Name = "Mohammed", Age = 60, City = "Paris"},
    new Person(){Name = "Alain", Age = 10, City = "Madrid"},
    new Person(){Name = "Josh", Age = 18, City = "Paris"}
};

Obtaining a Data Source

In a LINQ query, the first step is to obtain the data source. The first keyword in a LINQ statement is from, which specifies the data source and the range variable.

// myQuery is an IEnumerable<Person>
var myQuery = from person in people  
              select person;

The range variable person is like the iteration variable in a foreach loop.

Now, to show the result of the query, we use a foreach statement to execute it:

foreach (var item in myQuery)
{
    Console.WriteLine(item.Name);
}

Filtering

Now that we have obtained the data source, we can apply a filter to our query. The filter causes the query to return only those elements for which the condition is true.
The filter can be applied to any property of Person. On the name, for example, you can select people whose name contains a specific letter, whose name is longer than a given length, or whose name starts with a given letter. The filter can also be applied to the age. In the following example, we select the people whose age is greater than 20:

// myQuery is an IEnumerable<Person>
var myQuery = from person in people  
              where person.Age > 20
              select person;

Here, the result of the query will be the set of people where the condition person.Age > 20 was true.

You can use the C# logical operators && (AND) and || (OR) to combine as many conditions as you want. For example, let's select the elements of the list where the name contains the letter 'i' and the age is greater than 20:

// myQuery is an IEnumerable<Person>
var myQuery = from person in people
                where person.Age > 20 && person.Name.Contains('i')
                select person;

In the same way, to return people whose age is greater than 20 or whose name contains the letter 'i':

// myQuery is an IEnumerable<Person>
var myQuery = from person in people
                where person.Age > 20 || person.Name.Contains('i')
                select person;
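To see what the two combined filters return on the sample data, here is a runnable sketch. It assumes the same six people as above, but uses anonymous objects in place of the Person class so that it is self-contained; the names both and either are only illustrative.

```csharp
// Top-level statements (C# 9+); in older projects, place the body inside Main.
using System;
using System.Linq;

// Anonymous objects stand in for the Person list so the snippet runs on its own.
var people = new[]
{
    new { Name = "john", Age = 5, City = "Rabat" },
    new { Name = "Joe", Age = 54, City = "Madrid" },
    new { Name = "Aladdin", Age = 40, City = "Madrid" },
    new { Name = "Mohammed", Age = 60, City = "Paris" },
    new { Name = "Alain", Age = 10, City = "Madrid" },
    new { Name = "Josh", Age = 18, City = "Paris" }
};

// Both conditions must hold.
var both = from person in people
           where person.Age > 20 && person.Name.Contains('i')
           select person.Name;

// At least one condition must hold.
var either = from person in people
             where person.Age > 20 || person.Name.Contains('i')
             select person.Name;

Console.WriteLine(string.Join(", ", both));   // Aladdin
Console.WriteLine(string.Join(", ", either)); // Joe, Aladdin, Mohammed, Alain
```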

Ordering

Sometimes we want the results sorted. The orderby keyword causes the returned elements to be ordered according to the default comparer for the type of the sort key. For an int or a double, the elements are sorted in ascending order by default:

// myQuery is an IEnumerable<Person>
var myQuery = from person in people
                where person.Name.Length > 3
                orderby person.Age ascending 
                select person;

We can order the previous elements in reverse order by using the descending keyword:

// myQuery is an IEnumerable<Person>
var myQuery = from person in people
                where person.Name.Length > 3
                orderby person.Age descending 
                select person;
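The sketch below runs both orderings on the sample data. It again uses anonymous objects instead of the Person class to stay self-contained, and it writes the descending version with the OrderByDescending method purely to show the equivalent method call; the variable names are made up.

```csharp
// Top-level statements (C# 9+); in older projects, place the body inside Main.
using System;
using System.Linq;

// Anonymous objects stand in for the Person list so the snippet runs on its own.
var people = new[]
{
    new { Name = "john", Age = 5, City = "Rabat" },
    new { Name = "Joe", Age = 54, City = "Madrid" },
    new { Name = "Aladdin", Age = 40, City = "Madrid" },
    new { Name = "Mohammed", Age = 60, City = "Paris" },
    new { Name = "Alain", Age = 10, City = "Madrid" },
    new { Name = "Josh", Age = 18, City = "Paris" }
};

// orderby without a direction sorts in ascending order.
var youngestFirst = from person in people
                    where person.Name.Length > 3
                    orderby person.Age
                    select person.Name;

// Method-syntax equivalent of the descending query.
var oldestFirst = people
    .Where(person => person.Name.Length > 3)
    .OrderByDescending(person => person.Age)
    .Select(person => person.Name);

Console.WriteLine(string.Join(", ", youngestFirst)); // john, Alain, Josh, Aladdin, Mohammed
Console.WriteLine(string.Join(", ", oldestFirst));   // Mohammed, Aladdin, Josh, Alain, john
```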

Grouping

The group keyword allows you to group the results based on a key that you specify.

// myQuery is an IEnumerable<IGrouping<string, Person>>
var myQuery = from person in people
                group person by person.City;

foreach (var item in myQuery)
{
    Console.WriteLine(item.Key);
}

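The result is a sequence of groups, where each group has a Key and contains the elements that share that key. When you iterate over a query that produces a sequence of groups, you should use a nested foreach loop: the outer loop walks the groups and the inner loop walks the elements of each group.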

// myQuery is an IEnumerable<IGrouping<string, Person>>
var myQuery = from person in people
                group person by person.City into grouped
                where grouped.Count() > 2
                select grouped;

foreach (var PersonGroup in myQuery)
{
    foreach (var item in PersonGroup)
    {
        Console.WriteLine(item.Name);
    }
}
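Putting the key and the members together, a runnable sketch of grouping on the sample data might look like this. Anonymous objects replace the Person class to keep the snippet self-contained, and the names byCity and largeCities are only illustrative.

```csharp
// Top-level statements (C# 9+); in older projects, place the body inside Main.
using System;
using System.Linq;

// Anonymous objects stand in for the Person list so the snippet runs on its own.
var people = new[]
{
    new { Name = "john", Age = 5, City = "Rabat" },
    new { Name = "Joe", Age = 54, City = "Madrid" },
    new { Name = "Aladdin", Age = 40, City = "Madrid" },
    new { Name = "Mohammed", Age = 60, City = "Paris" },
    new { Name = "Alain", Age = 10, City = "Madrid" },
    new { Name = "Josh", Age = 18, City = "Paris" }
};

var byCity = from person in people
             group person by person.City;

// Each group exposes its Key and can be enumerated like any other sequence.
foreach (var cityGroup in byCity)
{
    Console.WriteLine($"{cityGroup.Key}: {string.Join(", ", cityGroup.Select(p => p.Name))}");
}
// Rabat: john
// Madrid: Joe, Aladdin, Alain
// Paris: Mohammed, Josh

// Keep only the cities with more than two people.
var largeCities = from person in people
                  group person by person.City into grouped
                  where grouped.Count() > 2
                  select grouped.Key;

Console.WriteLine(string.Join(", ", largeCities)); // Madrid
```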

Method Syntax

Method syntax uses lambda expressions to define conditions. It is easy to read and write for simple queries; for complex queries, however, it can be harder to write than query syntax. In this approach, the query is written by chaining multiple method calls with the dot (.) operator.

The method syntax is as follows:

Datasource.ConditionMethod().SelectionMethod()

Method syntax is one of the two primary ways to write LINQ queries. It uses extension methods to build up a query, and the chained calls read like a fluent sentence.

Here is an example of a LINQ query using the method syntax to retrieve all the odd numbers from a list of integers:

List<int> numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
IEnumerable<int> oddNumbers = numbers.Where(n => n % 2 == 1);

In this example, the Where method is an extension method provided by LINQ that filters a sequence based on a predicate (a boolean expression). The predicate is defined as the lambda expression n => n % 2 == 1, which keeps the elements whose remainder after division by 2 is 1. Note that this test only suits non-negative numbers: in C#, -3 % 2 evaluates to -1, so n % 2 != 0 is the safer check when the list can contain negative values.

Method syntax is often preferred for its simplicity and readability, especially for simple queries. For more complex queries, it can be more verbose than the query syntax.
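Method calls can be chained, with each one working on the result of the previous one. As a small sketch of the Datasource.ConditionMethod().SelectionMethod() shape, the following filters, orders, and projects the same list; the name oddSquares is made up for the example.

```csharp
// Top-level statements (C# 9+); in older projects, place the body inside Main.
using System;
using System.Collections.Generic;
using System.Linq;

List<int> numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

var oddSquares = numbers
    .Where(n => n % 2 == 1)     // keep the odd numbers
    .OrderByDescending(n => n)  // largest first
    .Select(n => n * n);        // project each number to its square

Console.WriteLine(string.Join(", ", oddSquares)); // 81, 49, 25, 9, 1
```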

Mixed Syntax

Note that we can mix the query syntax and the method syntax:

(from Object in Datasource
where condition
select Object).Method()

In LINQ, mixed syntax refers to a style of querying data that combines elements of both the method syntax and the query syntax. It allows you to take advantage of the strengths of both in the same query. It is often used when a query is hard to express with only one syntax, for example when it needs an operator such as ToList() or Count() that has no query keyword.

Here is an example of a LINQ query using the mixed syntax to retrieve all the odd numbers from a list of integers and sort them in descending order:

List<int> numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

var oddNumbers = (from n in numbers
                where n % 2 == 1
                orderby n descending
                select n).ToList();

In this example, the from, where, orderby, and select clauses are all written in the query syntax. The resulting query expression is wrapped in parentheses so that the ToList method can be called on it using the method syntax. This lets the query keep the readability of the query syntax while still using an operator that has no query keyword.

The ToList method at the end executes the query immediately and converts the result to a List<int>. Without it, oddNumbers would hold a deferred query, which runs only when it is iterated.
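ToList is not the only method that can follow a query expression. The sketch below applies a few aggregate methods to the same odd-number query; the variable names are invented for the illustration.

```csharp
// Top-level statements (C# 9+); in older projects, place the body inside Main.
using System;
using System.Collections.Generic;
using System.Linq;

List<int> numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

// The query expression selects the odd numbers; the method call aggregates them.
int oddCount = (from n in numbers where n % 2 == 1 select n).Count();
int oddSum = (from n in numbers where n % 2 == 1 select n).Sum();
int largestOdd = (from n in numbers where n % 2 == 1 select n).Max();

Console.WriteLine(oddCount);   // 5
Console.WriteLine(oddSum);     // 25
Console.WriteLine(largestOdd); // 9
```

Like ToList, these aggregate methods execute the query immediately, because they have to walk the whole sequence to produce a single value.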

Thank you for reading, and let's connect!

Thank you for reading my blog. Feel free to subscribe to my email newsletter and connect on Twitter.
If you liked this article, don't miss the upcoming ones: follow me and subscribe to my newsletter to receive more!
See you soon :)

Did you find this article valuable?

Support Hamza EL Yousfi by becoming a sponsor. Any amount is appreciated!