LINQ to SQL Performance Optimization, Re-Revisited

What Came Before

In my seemingly endless series of LINQ to SQL Performance Optimization series (see this initial post and this revisiting post), I’ve demonstrated some methods for improving LINQ to SQL loading times for hierarchies of data. This is the data we want to load;

And we want to do a deep load, we want the entire hierarchy. If you need really fast performance, you might need to do a bit of hand tweaking. This article series demonstrates some of the tweaks you might want to try.

The Fastest Method

My fastest method is fairly cumbersome and requires a lot of maintenance.

Basically, you must;

  • Create a stored procedure that retrieves all the data your require (I’ve somewhat automated the creation of the stored procedure, but you would still need to execute this by hand as a manual development step, which is bad)
  • A method for executing the query and loading all the data it delivers
  • A method that assembles the data from the bullet above to the appropriate hierarchy

If you add a new field/change an existing field for an object, you need to update the dbml file and the stored procedure, as typically with LINQ to SQL, you’d only need to update the dbml file.

There is another option, that isn’t as fast as my fastest method, but the maintenance requirements are lower and it’s easier to keep up to date. This is probably a better method than the Full Load version in the vast majority of cases.

The Last (But One) Method

So, tell me tell me, what is this clever new method? Well, we can load the required data using only LINQ instead of using a stored procedure. This means that we can do away with maintaining the stored procedure. The downside is that it generates more queries and is slightly slower.

Basically, you must;

  • Create a method that loads all the datasets you need to assemble your hierarchy – this is written in LINQ as opposed to the Stored Procedure in the previous example
  • A method that assembles the data from the bullet above to the appropriate hierarchy

The reason I call this the last but one method is because we can either run the LINQ queries as deferred execution or immediate execution, and the difference is significant. Which one is faster will depend on your particular situation, but in my case, deferred execution was considerably faster. I’m guessing it will most often be faster.

Deferred execution looks like this in C# (this only retrieves part of the hierarchy, suppliers=>products=>order_details);

            IEnumerable<Supplier> suppliers = context.Suppliers;
            IEnumerable<Product> products =
                context
                    .Products
                    .Join(
                    suppliers,
                    product => product.SupplierID,
                    supplier => supplier.SupplierID,
                    (product, supplier) => product)
                    .Distinct();

            IEnumerable<Order_Detail> orderDetails =
                context
                    .Order_Details
                    .Join(
                    products,
                    detail => detail.ProductID,
                    product => product.ProductID,
                    (child, parent) => child)
                    .Distinct();

And the generated SQL looks like this;

SELECT [t0].[SupplierID], [t0].[CompanyName], [t0].[ContactName], [t0].[ContactTitle], [t0].[Address], [t0].[City], [t0].[Region], [t0].[PostalCode], [t0].[Country], [t0].[Phone], [t0].[Fax], [t0].[HomePage]
FROM [dbo].[Suppliers] AS [t0]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.1

SELECT DISTINCT [t0].[ProductID], [t0].[ProductName], [t0].[SupplierID], [t0].[CategoryID], [t0].[QuantityPerUnit], [t0].[UnitPrice], [t0].[UnitsInStock], [t0].[UnitsOnOrder], [t0].[ReorderLevel], [t0].[Discontinued]
FROM [dbo].[Products] AS [t0]
INNER JOIN [dbo].[Suppliers] AS [t1] ON [t0].[SupplierID] = ([t1].[SupplierID])
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.1

SELECT DISTINCT [t0].[OrderID], [t0].[ProductID], [t0].[UnitPrice], [t0].[Quantity], [t0].[Discount]
FROM [dbo].[Order Details] AS [t0]
INNER JOIN (
    SELECT DISTINCT [t1].[ProductID], [t1].[ProductName], [t1].[SupplierID], [t1].[CategoryID], [t1].[QuantityPerUnit], [t1].[UnitPrice], [t1].[UnitsInStock], [t1].[UnitsOnOrder], [t1].[ReorderLevel], [t1].[Discontinued]
    FROM [dbo].[Products] AS [t1]
    INNER JOIN [dbo].[Suppliers] AS [t2] ON [t1].[SupplierID] = ([t2].[SupplierID])
    ) AS [t3] ON [t0].[ProductID] = [t3].[ProductID]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.1

Note how we’re not using any data that’s been returned from the SQL Server, LINQ is building more and more complex SQL statements. But these are nicely optimized by the SQL Server, so we don’t need to worry too much about them.

The immediately executed version looks like this (we execute the query by calling ".ToList()”, which is the immediate part);

            List<Supplier> suppliers = context.Suppliers.ToList();
            List<Product> products =
                context
                    .Products
                    .Where(product => suppliers.Select(supplier => (int?)supplier.SupplierID).Contains(product.SupplierID))
                    .Distinct()
                    .ToList();

            List<Order_Detail> orderDetails =
                context
                    .Order_Details
                    .Where(detail => products.Select(product => product.ProductID).Contains(detail.ProductID))
                    .Distinct()
                    .ToList();

This generates these SQL statements;

SELECT [t0].[SupplierID], [t0].[CompanyName], [t0].[ContactName], [t0].[ContactTitle], [t0].[Address], [t0].[City], [t0].[Region], [t0].[PostalCode], [t0].[Country], [t0].[Phone], [t0].[Fax], [t0].[HomePage]
FROM [dbo].[Suppliers] AS [t0]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.1

SELECT DISTINCT [t0].[ProductID], [t0].[ProductName], [t0].[SupplierID], [t0].[CategoryID], [t0].[QuantityPerUnit], [t0].[UnitPrice], [t0].[UnitsInStock], [t0].[UnitsOnOrder], [t0].[ReorderLevel], [t0].[Discontinued]
FROM [dbo].[Products] AS [t0]
WHERE [t0].[SupplierID] IN (@p0, @p1, @p2, @p3, @p4, @p5, @p6, @p7, @p8, @p9, @p10, @p11, @p12, @p13, @p14, @p15, @p16, @p17, @p18, @p19, @p20, @p21, @p22, @p23, @p24, @p25, @p26, @p27, @p28)
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [1]
-- @p1: Input Int (Size = 0; Prec = 0; Scale = 0) [2]
-- @p2: Input Int (Size = 0; Prec = 0; Scale = 0) [3]
-- @p3: Input Int (Size = 0; Prec = 0; Scale = 0) [4]
-- @p4: Input Int (Size = 0; Prec = 0; Scale = 0) [5]
-- @p5: Input Int (Size = 0; Prec = 0; Scale = 0) [6]
-- @p6: Input Int (Size = 0; Prec = 0; Scale = 0) [7]
...snip...
-- @p23: Input Int (Size = 0; Prec = 0; Scale = 0) [24]
-- @p24: Input Int (Size = 0; Prec = 0; Scale = 0) [25]
-- @p25: Input Int (Size = 0; Prec = 0; Scale = 0) [26]
-- @p26: Input Int (Size = 0; Prec = 0; Scale = 0) [27]
-- @p27: Input Int (Size = 0; Prec = 0; Scale = 0) [28]
-- @p28: Input Int (Size = 0; Prec = 0; Scale = 0) [29]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.1

SELECT DISTINCT [t0].[OrderID], [t0].[ProductID], [t0].[UnitPrice], [t0].[Quantity], [t0].[Discount]
FROM [dbo].[Order Details] AS [t0]
WHERE [t0].[ProductID] IN (@p0, @p1, @p2, @p3, @p4, @p5, @p6, @p7, @p8, @p9, @p10, @p11, @p12, @p13, @p14, @p15, @p16, @p17, @p18, @p19, @p20, @p21, @p22, @p23, @p24, @p25, @p26, @p27, @p28, @p29, @p30, @p31, @p32, @p33, @p34, @p35, @p36, @p37, @p38, @p39, @p40, @p41, @p42, @p43, @p44, @p45, @p46, @p47, @p48, @p49, @p50, @p51, @p52, @p53, @p54, @p55, @p56, @p57, @p58, @p59, @p60, @p61, @p62, @p63, @p64, @p65, @p66, @p67, @p68, @p69, @p70, @p71, @p72, @p73, @p74, @p75, @p76)
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [1]
-- @p1: Input Int (Size = 0; Prec = 0; Scale = 0) [2]
-- @p2: Input Int (Size = 0; Prec = 0; Scale = 0) [3]
-- @p3: Input Int (Size = 0; Prec = 0; Scale = 0) [4]
-- @p4: Input Int (Size = 0; Prec = 0; Scale = 0) [5]
-- @p5: Input Int (Size = 0; Prec = 0; Scale = 0) [6]
-- @p6: Input Int (Size = 0; Prec = 0; Scale = 0) [7]
...snip...
-- @p65: Input Int (Size = 0; Prec = 0; Scale = 0) [71]
-- @p66: Input Int (Size = 0; Prec = 0; Scale = 0) [72]
-- @p67: Input Int (Size = 0; Prec = 0; Scale = 0) [73]
-- @p68: Input Int (Size = 0; Prec = 0; Scale = 0) [74]
-- @p69: Input Int (Size = 0; Prec = 0; Scale = 0) [75]
-- @p70: Input Int (Size = 0; Prec = 0; Scale = 0) [76]
-- @p71: Input Int (Size = 0; Prec = 0; Scale = 0) [77]
-- @p72: Input Int (Size = 0; Prec = 0; Scale = 0) [82]
-- @p73: Input Int (Size = 0; Prec = 0; Scale = 0) [83]
-- @p74: Input Int (Size = 0; Prec = 0; Scale = 0) [84]
-- @p75: Input Int (Size = 0; Prec = 0; Scale = 0) [85]
-- @p76: Input Int (Size = 0; Prec = 0; Scale = 0) [86]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.1

The Last (But One) Method : Deferred Execution

Deferred execution takes about 0.191 seconds, it produces 7 queries in and the actual query code comes out to 6kb of data. This is only slightly slower than the Full Load version, but I’m certain that there are examples were the Full Load version will be considerably faster. You’ll find my full code for this method below.

The Last Method : Immediate Execution

Immediate execution takes about 0.4 seconds, it also produces 7 queries – though since each query gets a very large list of parameters, the amount of data is much larger, it comes out to 261kb of data. That’s 43 times more data!! You’ll find my full code for this method below.

Leader Board

Here’s the leader board as it currently stands;

Method Execution time Queries Query text
Full Load 0.137s 1 query 1kb (actually 195 bytes)
LINQ Full Load – Deferred Execution 0.191s 7 queries 6kb
LINQ Full Load – Immediate Execution 0.4s 7 queries 261kb
Smarter Load Options 0.45s 78 queries 114kb
Default/Lazy Load 1.4s 1031 queries 448kb
Naive Load Options 2.4s 31 queries 77kb

The code – and there’s a lot of it!

Assemble the hierarchy;

public partial class NorthwindDataContext
    {
        public static List<Supplier> AssembleHierarchy(List<Supplier> suppliers, List<Product> products, List<Category> categories, List<Order_Detail> orderDetails, List<Order> orders, List<Shipper> shippers, List<Customer> customers)
        {
            suppliers.ForEach(
                supplier => supplier.Products.SetSource(products.Where(product => product.SupplierID == supplier.SupplierID)));

            products.ForEach(
                product =>
                {
                    product.Category = categories.Single(category => category.CategoryID == product.CategoryID);
                    product.Order_Details.SetSource(
                        orderDetails.Where(detail => detail.ProductID == product.ProductID));
                });

            orderDetails.ForEach(
                detail =>
                {
                    detail.Order = orders.Single(order => order.OrderID == detail.OrderID);
                });

            orders.ForEach(
                order =>
                {
                    order.Shipper = shippers.Single(shipper => shipper.ShipperID == order.ShipVia);
                    order.Customer = customers.Single(customer => customer.CustomerID == order.CustomerID);
                });

            return suppliers;
        }
}

Deferred data fetch and hierarchy assembly;

        private static List<Supplier> ExecuteSimpleFullFetchDeferredExecution(NorthwindDataContext context)
        {
            IEnumerable<Supplier> suppliers = context.Suppliers;
            IEnumerable<Product> products =
                context
                    .Products
                    .Join(
                    suppliers,
                    product => product.SupplierID,
                    supplier => supplier.SupplierID,
                    (product, supplier) => product)
                    .Distinct();

            IEnumerable<Category> categories =
                context
                    .Categories
                    .Join(
                    products,
                    category => category.CategoryID,
                    product => product.CategoryID,
                    (category, product) => category)
                    .Distinct();

            IEnumerable<Order_Detail> orderDetails =
                context
                    .Order_Details
                    .Join(
                    products,
                    detail => detail.ProductID,
                    product => product.ProductID,
                    (child, parent) => child)
                    .Distinct();

            IEnumerable<Order> orders =
                context
                    .Orders
                    .Join(
                    orderDetails,
                    detail => detail.OrderID,
                    order => order.OrderID,
                    (child, parent) => child)
                    .Distinct();

            IEnumerable<Shipper> shippers =
                context
                    .Shippers
                    .Join(
                    orders,
                    shipper => shipper.ShipperID,
                    order => order.ShipVia,
                    (child, parent) => child)
                    .Distinct();

            IEnumerable<Customer> customers =
                context
                    .Customers
                    .Join(
                    orders,
                    customer => customer.CustomerID,
                    order => order.CustomerID,
                    (child, parent) => child)
                    .Distinct();

            return
                NorthwindDataContext.AssembleHierarchy(
                    suppliers.ToList(),
                    products.ToList(),
                    categories.ToList(),
                    orderDetails.ToList(),
                    orders.ToList(),
                    shippers.ToList(),
                    customers.ToList());
        }

Immediate data fetch and hierarchy assembly;

        private static List<Supplier> ExecuteSimpleFullFetchImmediateExecution(NorthwindDataContext context)
        {
            List<Supplier> suppliers = context.Suppliers.ToList();
            List<Product> products =
                context
                    .Products
                    .Where(product => suppliers.Select(supplier => (int?)supplier.SupplierID).Contains(product.SupplierID))
                    .Distinct()
                    .ToList();

            List<Category> categories =
                context
                    .Categories
                    .Where(category => products.Select(product => product.CategoryID).Contains(category.CategoryID))
                    .Distinct()
                    .ToList();

            List<Order_Detail> orderDetails =
                context
                    .Order_Details
                    .Where(detail => products.Select(product => product.ProductID).Contains(detail.ProductID))
                    .Distinct()
                    .ToList();

            List<Order> orders =
                context
                    .Orders
                    .Where(order => orderDetails.Select(detail => detail.OrderID).Contains(order.OrderID))
                    .Distinct()
                    .ToList();

            List<Shipper> shippers =
                context
                    .Shippers
                    .Where(shipper => orders.Select(order => order.ShipVia).Contains(shipper.ShipperID))
                    .Distinct()
                    .ToList();

            List<Customer> customers =
                context
                    .Customers
                    .Where(customer => orders.Select(order => order.CustomerID).Contains(customer.CustomerID))
                    .Distinct()
                    .ToList();

            return
                NorthwindDataContext.AssembleHierarchy(
                    suppliers,
                    products,
                    categories,
                    orderDetails,
                    orders,
                    shippers,
                    customers);
        }

About mfagerlund
Writes code in my sleep - and sometimes it even compiles!

One Response to LINQ to SQL Performance Optimization, Re-Revisited

  1. Pingback: LINQ to SQL, Lambda, Generics and Readability « Mattias Fagerlund's Coding Blog

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: