What Came Before
In my seemingly endless series of LINQ to SQL Performance Optimization series (see this initial post and this revisiting post), I’ve demonstrated some methods for improving LINQ to SQL loading times for hierarchies of data. This is the data we want to load;
And we want to do a deep load, we want the entire hierarchy. If you need really fast performance, you might need to do a bit of hand tweaking. This article series demonstrates some of the tweaks you might want to try.
The Fastest Method
My fastest method is fairly cumbersome and requires a lot of maintenance.
Basically, you must;
- Create a stored procedure that retrieves all the data your require (I’ve somewhat automated the creation of the stored procedure, but you would still need to execute this by hand as a manual development step, which is bad)
- A method for executing the query and loading all the data it delivers
- A method that assembles the data from the bullet above to the appropriate hierarchy
If you add a new field/change an existing field for an object, you need to update the dbml file and the stored procedure, as typically with LINQ to SQL, you’d only need to update the dbml file.
There is another option, that isn’t as fast as my fastest method, but the maintenance requirements are lower and it’s easier to keep up to date. This is probably a better method than the Full Load version in the vast majority of cases.
The Last (But One) Method
So, tell me tell me, what is this clever new method? Well, we can load the required data using only LINQ instead of using a stored procedure. This means that we can do away with maintaining the stored procedure. The downside is that it generates more queries and is slightly slower.
Basically, you must;
- Create a method that loads all the datasets you need to assemble your hierarchy – this is written in LINQ as opposed to the Stored Procedure in the previous example
- A method that assembles the data from the bullet above to the appropriate hierarchy
The reason I call this the last but one method is because we can either run the LINQ queries as deferred execution or immediate execution, and the difference is significant. Which one is faster will depend on your particular situation, but in my case, deferred execution was considerably faster. I’m guessing it will most often be faster.
Deferred execution looks like this in C# (this only retrieves part of the hierarchy, suppliers=>products=>order_details);
IEnumerable<Supplier> suppliers = context.Suppliers;
IEnumerable<Product> products =
context
.Products
.Join(
suppliers,
product => product.SupplierID,
supplier => supplier.SupplierID,
(product, supplier) => product)
.Distinct();
IEnumerable<Order_Detail> orderDetails =
context
.Order_Details
.Join(
products,
detail => detail.ProductID,
product => product.ProductID,
(child, parent) => child)
.Distinct();
And the generated SQL looks like this;
SELECT [t0].[SupplierID], [t0].[CompanyName], [t0].[ContactName], [t0].[ContactTitle], [t0].[Address], [t0].[City], [t0].[Region], [t0].[PostalCode], [t0].[Country], [t0].[Phone], [t0].[Fax], [t0].[HomePage]
FROM [dbo].[Suppliers] AS [t0]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.1
SELECT DISTINCT [t0].[ProductID], [t0].[ProductName], [t0].[SupplierID], [t0].[CategoryID], [t0].[QuantityPerUnit], [t0].[UnitPrice], [t0].[UnitsInStock], [t0].[UnitsOnOrder], [t0].[ReorderLevel], [t0].[Discontinued]
FROM [dbo].[Products] AS [t0]
INNER JOIN [dbo].[Suppliers] AS [t1] ON [t0].[SupplierID] = ([t1].[SupplierID])
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.1
SELECT DISTINCT [t0].[OrderID], [t0].[ProductID], [t0].[UnitPrice], [t0].[Quantity], [t0].[Discount]
FROM [dbo].[Order Details] AS [t0]
INNER JOIN (
SELECT DISTINCT [t1].[ProductID], [t1].[ProductName], [t1].[SupplierID], [t1].[CategoryID], [t1].[QuantityPerUnit], [t1].[UnitPrice], [t1].[UnitsInStock], [t1].[UnitsOnOrder], [t1].[ReorderLevel], [t1].[Discontinued]
FROM [dbo].[Products] AS [t1]
INNER JOIN [dbo].[Suppliers] AS [t2] ON [t1].[SupplierID] = ([t2].[SupplierID])
) AS [t3] ON [t0].[ProductID] = [t3].[ProductID]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.1
Note how we’re not using any data that’s been returned from the SQL Server, LINQ is building more and more complex SQL statements. But these are nicely optimized by the SQL Server, so we don’t need to worry too much about them.
The immediately executed version looks like this (we execute the query by calling ".ToList()”, which is the immediate part);
List<Supplier> suppliers = context.Suppliers.ToList();
List<Product> products =
context
.Products
.Where(product => suppliers.Select(supplier => (int?)supplier.SupplierID).Contains(product.SupplierID))
.Distinct()
.ToList();
List<Order_Detail> orderDetails =
context
.Order_Details
.Where(detail => products.Select(product => product.ProductID).Contains(detail.ProductID))
.Distinct()
.ToList();
This generates these SQL statements;
SELECT [t0].[SupplierID], [t0].[CompanyName], [t0].[ContactName], [t0].[ContactTitle], [t0].[Address], [t0].[City], [t0].[Region], [t0].[PostalCode], [t0].[Country], [t0].[Phone], [t0].[Fax], [t0].[HomePage]
FROM [dbo].[Suppliers] AS [t0]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.1
SELECT DISTINCT [t0].[ProductID], [t0].[ProductName], [t0].[SupplierID], [t0].[CategoryID], [t0].[QuantityPerUnit], [t0].[UnitPrice], [t0].[UnitsInStock], [t0].[UnitsOnOrder], [t0].[ReorderLevel], [t0].[Discontinued]
FROM [dbo].[Products] AS [t0]
WHERE [t0].[SupplierID] IN (@p0, @p1, @p2, @p3, @p4, @p5, @p6, @p7, @p8, @p9, @p10, @p11, @p12, @p13, @p14, @p15, @p16, @p17, @p18, @p19, @p20, @p21, @p22, @p23, @p24, @p25, @p26, @p27, @p28)
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [1]
-- @p1: Input Int (Size = 0; Prec = 0; Scale = 0) [2]
-- @p2: Input Int (Size = 0; Prec = 0; Scale = 0) [3]
-- @p3: Input Int (Size = 0; Prec = 0; Scale = 0) [4]
-- @p4: Input Int (Size = 0; Prec = 0; Scale = 0) [5]
-- @p5: Input Int (Size = 0; Prec = 0; Scale = 0) [6]
-- @p6: Input Int (Size = 0; Prec = 0; Scale = 0) [7]
...snip...
-- @p23: Input Int (Size = 0; Prec = 0; Scale = 0) [24]
-- @p24: Input Int (Size = 0; Prec = 0; Scale = 0) [25]
-- @p25: Input Int (Size = 0; Prec = 0; Scale = 0) [26]
-- @p26: Input Int (Size = 0; Prec = 0; Scale = 0) [27]
-- @p27: Input Int (Size = 0; Prec = 0; Scale = 0) [28]
-- @p28: Input Int (Size = 0; Prec = 0; Scale = 0) [29]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.1
SELECT DISTINCT [t0].[OrderID], [t0].[ProductID], [t0].[UnitPrice], [t0].[Quantity], [t0].[Discount]
FROM [dbo].[Order Details] AS [t0]
WHERE [t0].[ProductID] IN (@p0, @p1, @p2, @p3, @p4, @p5, @p6, @p7, @p8, @p9, @p10, @p11, @p12, @p13, @p14, @p15, @p16, @p17, @p18, @p19, @p20, @p21, @p22, @p23, @p24, @p25, @p26, @p27, @p28, @p29, @p30, @p31, @p32, @p33, @p34, @p35, @p36, @p37, @p38, @p39, @p40, @p41, @p42, @p43, @p44, @p45, @p46, @p47, @p48, @p49, @p50, @p51, @p52, @p53, @p54, @p55, @p56, @p57, @p58, @p59, @p60, @p61, @p62, @p63, @p64, @p65, @p66, @p67, @p68, @p69, @p70, @p71, @p72, @p73, @p74, @p75, @p76)
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [1]
-- @p1: Input Int (Size = 0; Prec = 0; Scale = 0) [2]
-- @p2: Input Int (Size = 0; Prec = 0; Scale = 0) [3]
-- @p3: Input Int (Size = 0; Prec = 0; Scale = 0) [4]
-- @p4: Input Int (Size = 0; Prec = 0; Scale = 0) [5]
-- @p5: Input Int (Size = 0; Prec = 0; Scale = 0) [6]
-- @p6: Input Int (Size = 0; Prec = 0; Scale = 0) [7]
...snip...
-- @p65: Input Int (Size = 0; Prec = 0; Scale = 0) [71]
-- @p66: Input Int (Size = 0; Prec = 0; Scale = 0) [72]
-- @p67: Input Int (Size = 0; Prec = 0; Scale = 0) [73]
-- @p68: Input Int (Size = 0; Prec = 0; Scale = 0) [74]
-- @p69: Input Int (Size = 0; Prec = 0; Scale = 0) [75]
-- @p70: Input Int (Size = 0; Prec = 0; Scale = 0) [76]
-- @p71: Input Int (Size = 0; Prec = 0; Scale = 0) [77]
-- @p72: Input Int (Size = 0; Prec = 0; Scale = 0) [82]
-- @p73: Input Int (Size = 0; Prec = 0; Scale = 0) [83]
-- @p74: Input Int (Size = 0; Prec = 0; Scale = 0) [84]
-- @p75: Input Int (Size = 0; Prec = 0; Scale = 0) [85]
-- @p76: Input Int (Size = 0; Prec = 0; Scale = 0) [86]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.1
The Last (But One) Method : Deferred Execution
Deferred execution takes about 0.191 seconds, it produces 7 queries in and the actual query code comes out to 6kb of data. This is only slightly slower than the Full Load version, but I’m certain that there are examples were the Full Load version will be considerably faster. You’ll find my full code for this method below.
The Last Method : Immediate Execution
Immediate execution takes about 0.4 seconds, it also produces 7 queries – though since each query gets a very large list of parameters, the amount of data is much larger, it comes out to 261kb of data. That’s 43 times more data!! You’ll find my full code for this method below.
Leader Board
Here’s the leader board as it currently stands;
Method |
Execution time |
Queries |
Query text |
Full Load |
0.137s |
1 query |
1kb (actually 195 bytes) |
LINQ Full Load – Deferred Execution |
0.191s |
7 queries |
6kb |
LINQ Full Load – Immediate Execution |
0.4s |
7 queries |
261kb |
Smarter Load Options |
0.45s |
78 queries |
114kb |
Default/Lazy Load |
1.4s |
1031 queries |
448kb |
Naive Load Options |
2.4s |
31 queries |
77kb |
The code – and there’s a lot of it!
Assemble the hierarchy;
public partial class NorthwindDataContext
{
public static List<Supplier> AssembleHierarchy(List<Supplier> suppliers, List<Product> products, List<Category> categories, List<Order_Detail> orderDetails, List<Order> orders, List<Shipper> shippers, List<Customer> customers)
{
suppliers.ForEach(
supplier => supplier.Products.SetSource(products.Where(product => product.SupplierID == supplier.SupplierID)));
products.ForEach(
product =>
{
product.Category = categories.Single(category => category.CategoryID == product.CategoryID);
product.Order_Details.SetSource(
orderDetails.Where(detail => detail.ProductID == product.ProductID));
});
orderDetails.ForEach(
detail =>
{
detail.Order = orders.Single(order => order.OrderID == detail.OrderID);
});
orders.ForEach(
order =>
{
order.Shipper = shippers.Single(shipper => shipper.ShipperID == order.ShipVia);
order.Customer = customers.Single(customer => customer.CustomerID == order.CustomerID);
});
return suppliers;
}
}
Deferred data fetch and hierarchy assembly;
private static List<Supplier> ExecuteSimpleFullFetchDeferredExecution(NorthwindDataContext context)
{
IEnumerable<Supplier> suppliers = context.Suppliers;
IEnumerable<Product> products =
context
.Products
.Join(
suppliers,
product => product.SupplierID,
supplier => supplier.SupplierID,
(product, supplier) => product)
.Distinct();
IEnumerable<Category> categories =
context
.Categories
.Join(
products,
category => category.CategoryID,
product => product.CategoryID,
(category, product) => category)
.Distinct();
IEnumerable<Order_Detail> orderDetails =
context
.Order_Details
.Join(
products,
detail => detail.ProductID,
product => product.ProductID,
(child, parent) => child)
.Distinct();
IEnumerable<Order> orders =
context
.Orders
.Join(
orderDetails,
detail => detail.OrderID,
order => order.OrderID,
(child, parent) => child)
.Distinct();
IEnumerable<Shipper> shippers =
context
.Shippers
.Join(
orders,
shipper => shipper.ShipperID,
order => order.ShipVia,
(child, parent) => child)
.Distinct();
IEnumerable<Customer> customers =
context
.Customers
.Join(
orders,
customer => customer.CustomerID,
order => order.CustomerID,
(child, parent) => child)
.Distinct();
return
NorthwindDataContext.AssembleHierarchy(
suppliers.ToList(),
products.ToList(),
categories.ToList(),
orderDetails.ToList(),
orders.ToList(),
shippers.ToList(),
customers.ToList());
}
Immediate data fetch and hierarchy assembly;
private static List<Supplier> ExecuteSimpleFullFetchImmediateExecution(NorthwindDataContext context)
{
List<Supplier> suppliers = context.Suppliers.ToList();
List<Product> products =
context
.Products
.Where(product => suppliers.Select(supplier => (int?)supplier.SupplierID).Contains(product.SupplierID))
.Distinct()
.ToList();
List<Category> categories =
context
.Categories
.Where(category => products.Select(product => product.CategoryID).Contains(category.CategoryID))
.Distinct()
.ToList();
List<Order_Detail> orderDetails =
context
.Order_Details
.Where(detail => products.Select(product => product.ProductID).Contains(detail.ProductID))
.Distinct()
.ToList();
List<Order> orders =
context
.Orders
.Where(order => orderDetails.Select(detail => detail.OrderID).Contains(order.OrderID))
.Distinct()
.ToList();
List<Shipper> shippers =
context
.Shippers
.Where(shipper => orders.Select(order => order.ShipVia).Contains(shipper.ShipperID))
.Distinct()
.ToList();
List<Customer> customers =
context
.Customers
.Where(customer => orders.Select(order => order.CustomerID).Contains(customer.CustomerID))
.Distinct()
.ToList();
return
NorthwindDataContext.AssembleHierarchy(
suppliers,
products,
categories,
orderDetails,
orders,
shippers,
customers);
}