SQL Server Profiler and LINQ to SQL

If you’re using LINQ to SQL, sooner or later you’ll need to trace SQL statements in SQL Server Profiler, either for performance work or for debugging. When you do, you’ll find that you’re in a world of hurt!

Below, I’ll demonstrate the approach I’m using. It may not be perfect, but it’s a huge step in the right direction!

A world of hurt

So, you want to trace a SQL statement back to the LINQ code that generated it? You’re really in it now! The reason is that a single LINQ to SQL statement can generate any number of SQL statements, and it’s nearly impossible to trace those back to the code that generated them – especially on a busy production server where lots and lots of stuff is flowing through the profiler.

Even worse, you’ll typically have DAL (Data Access Layer) methods – used much the way you’d previously use stored procedures – that each execute a number of LINQ to SQL statements. So the offending SQL might be a tiny part of a large set of SQL statements originating from a particular DAL method.
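For example, a harmless-looking DAL method like the sketch below generates one SELECT plus one UPDATE per modified row (this is a made-up example – the GiveEveryoneARaise method and the Salary property are hypothetical, but the data context and Employee entity are the ones used throughout this post);

        private static void GiveEveryoneARaise(decimal amount)
        {
            using (MyDataClassesDataContext context =
                new MyDataClassesDataContext())
            {
                // One SELECT when the query below starts enumerating...
                foreach (Employee employee in context.Employees)
                {
                    employee.Salary += amount;
                }

                // ...and one UPDATE per modified row here, so a single
                // DAL call can show up as dozens or hundreds of
                // statements in the profiler trace.
                context.SubmitChanges();
            }
        }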

What I’d like

I’d love to have a way of tracing every single LINQ to SQL statement in SQL Server Profiler back to my code in Visual Studio. But I haven’t been able to find one.

What I’ve got out of the box

Basically, I have a long, long, long list of SQL statements in a log that I have no chance of matching up with the actual code.

What I’ve come up with

I’ve been able to add a line to the SQL Server Profiler log when each DAL method starts and when it ends – that way I can determine which DAL method a particular SQL statement belongs to.

The naive way

Now, this can be done in a relatively straightforward way: we simply add a comment at the start and at the end of the DAL method. But this makes our code messy and hard to read – especially if we get fancy and wrap things in try/finally to make the logging safer.

This is what the code looks like to start out with (we’re using a data context factory);

        private static List<Employee> GetEmployees()
        {
            using (MyDataClassesDataContext context =
                new MyDataClassesDataContext())
            {
                return
                    context
                        .Employees
                        .Where(employee => employee.LastName.StartsWith("a"))
                        .ToList();
            }
        }

Adding some logging;

        private static List<Employee> GetEmployees()
        {
            using (MyDataClassesDataContext context =
                new MyDataClassesDataContext())
            {
                // The "--" prefix makes the batch a comment: a no-op for
                // SQL Server, but still visible in the profiler trace.
                context.ExecuteCommand("-- BeginCall: GetEmployees");
                List<Employee> employees =
                    context
                        .Employees
                        .Where(employee => employee.LastName.StartsWith("a"))
                        .ToList();
                context.ExecuteCommand("-- EndCall: GetEmployees");
                return employees;
            }
        }

That

  • looks pretty bad
  • forced us to refactor the code to use a temporary variable – fine if you need one, but fairly pointless here
  • has to be added to every DAL method!
  • makes the risk of missing one or a few fairly large
  • isn’t refactoring safe – if the method is renamed, the hard-coded strings no longer match
  • doesn’t survive exceptions – if the call fails, the exception prevents the “EndCall” text from ever making it to the log (a sketch of an exception-safe variant follows below).
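For completeness, this is roughly what that exception-safe variant would look like – a sketch only, reusing the example method from above;

        private static List<Employee> GetEmployees()
        {
            using (MyDataClassesDataContext context =
                new MyDataClassesDataContext())
            {
                context.ExecuteCommand("-- BeginCall: GetEmployees");
                try
                {
                    return
                        context
                            .Employees
                            .Where(employee => employee.LastName.StartsWith("a"))
                            .ToList();
                }
                finally
                {
                    // Runs even when the query throws, so the EndCall
                    // marker always reaches the trace.
                    context.ExecuteCommand("-- EndCall: GetEmployees");
                }
            }
        }

Hardly an improvement in readability.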

What to do?

A cleverer way

Instead, we write the BeginCall marker every time the context is created and the EndCall marker every time Dispose is called. This lets us structure our methods exactly as before, and the logging can’t be forgotten. Also, instead of passing the method name as a string, we use reflection and pass a MethodBase – that makes it refactoring safe.

This is what the method looks like now – note that the logging part is invisible; the only change from the original method is the call to MethodBase.GetCurrentMethod();

        private static List<Employee> GetEmployees()
        {
            using (MyDataClassesDataContext context =
                new MyDataClassesDataContext(MethodBase.GetCurrentMethod()))
            {
                return
                    context
                        .Employees
                        .Where(employee => employee.LastName.StartsWith("a"))
                        .ToList();
            }
        }

Next we need to do the actual logging. MyDataClassesDataContext is a class that’s autogenerated by the DBML designer (as used with LINQ to SQL), but it’s a partial class, so adding to it is fairly straightforward.

The tricky part is that DataContext, which MyDataClassesDataContext inherits from, implements Dispose, and if we try to hook into its dispose logic directly we won’t be able to log anything, because by the time our code runs the context is already disposed!

But not to worry, we simply re-implement Dispose and call the base method once we’ve logged what we need;

using System;
using System.Reflection;

namespace TraceLINQToSQL
{
    partial class MyDataClassesDataContext : IDisposable
    {
        private readonly string _methodName;

        public MyDataClassesDataContext(MethodBase methodBase)
            : this()
        {
            // Store the fully qualified method name and write the
            // BeginCall marker as soon as the context is created.
            _methodName = methodBase.DeclaringType.FullName + "." + methodBase.Name;
            AddLogEntry("BeginCall Method={0}", _methodName);
        }

        public static bool LoggingEnabled { get; set; }

        // DataContext.Dispose isn't virtual, so we hide it with "new" and
        // re-list IDisposable above; that re-implements the interface, so
        // the using statement calls our Dispose before the base one.
        public new void Dispose()
        {
            AddLogEntry("EndCall Method={0}", _methodName);
            base.Dispose();
        }

        private void AddLogEntry(string paramString, params object[] args)
        {
            if (LoggingEnabled)
            {
                // A comment-only batch is a no-op for SQL Server,
                // but it still shows up in the profiler trace.
                ExecuteCommand("--" + string.Format(paramString, args));
            }
        }
    }
}

With this, every DAL method logs each call, and traceability is greatly improved. The cost is two extra roundtrips to the database per DAL method, so it might be a good idea to turn logging off in production – although in my tests, the overhead was negligible.
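Since LoggingEnabled is a static property, turning it on or off is a one-liner at application startup – for example something like this (a sketch; the TraceDalCalls appSettings key is made up, and ConfigurationManager requires a reference to the System.Configuration assembly);

using System.Configuration;

// Somewhere at application startup (Main, Application_Start, ...):
bool traceDalCalls;
bool.TryParse(ConfigurationManager.AppSettings["TraceDalCalls"], out traceDalCalls);
MyDataClassesDataContext.LoggingEnabled = traceDalCalls;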

Going forward

From this, things could be greatly improved with a tool: one that groups all the SQL statements from a single DAL call, sums up roundtrips/reads/writes/duration/CPU usage, and counts how many times each DAL method is called. I haven’t had the time to create such a tool, but if you know of one, please let us know!
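As a starting point, if you save the trace to a table from Profiler (File > Save As > Trace Table), something along these lines gets part of the way there. It’s a rough sketch: the table name dbo.MyTrace is made up, it assumes the standard trace columns (RowNumber, SPID, TextData, Duration, CPU, Reads, Writes), and it simply attributes each statement to the most recent BeginCall marker on the same connection;

SELECT
    calls.Method,
    COUNT(*)        AS Roundtrips,
    SUM(t.Duration) AS TotalDuration,
    SUM(t.CPU)      AS TotalCPU,
    SUM(t.Reads)    AS TotalReads,
    SUM(t.Writes)   AS TotalWrites
FROM dbo.MyTrace t
CROSS APPLY
(
    -- The most recent BeginCall marker on the same SPID.
    SELECT TOP 1 CONVERT(varchar(500), m.TextData) AS Method
    FROM dbo.MyTrace m
    WHERE m.SPID = t.SPID
      AND m.RowNumber < t.RowNumber
      AND m.TextData LIKE '--BeginCall%'
    ORDER BY m.RowNumber DESC
) calls
WHERE t.TextData NOT LIKE '--%Call%' -- exclude the markers themselves
GROUP BY calls.Method
ORDER BY TotalDuration DESC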

Sleep in T-SQL / SQL Server script

Sometimes you need to make a script sleep for a short period of time, either for debugging purposes, or to make sure that something else has time to finish. To do this in T-SQL, you can use this command, which took me a lot of googling to find;

WAITFOR DELAY '00:00:00:50' -- 50ms delay (equivalently '00:00:00.050' in the documented hh:mm:ss.mmm form)
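If what you’re really doing is waiting for something else to finish, a small polling loop is often more reliable than a one-shot delay – a sketch, with a hypothetical dbo.JobStatus table standing in for whatever signals completion;

-- Poll every 500ms until the (hypothetical) job reports done.
WHILE NOT EXISTS (SELECT 1 FROM dbo.JobStatus WHERE Done = 1)
BEGIN
    WAITFOR DELAY '00:00:00.500'
END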

Benchmarking SQL Server operations

Sometimes you’re stuck trying to optimize a large stored procedure, and the first step in optimization is figuring out where the time is being spent. It’s not always where you expect, and going at it haphazardly doesn’t cut it.

The answer is, of course, benchmarking (timing): it tells you where the time is spent, and whether your “fix” actually improved performance.

First of all, declare the variables you’ll need. Do this at the top of the procedure/script that you’re benchmarking;

declare
  @totalTime datetime=GetDate(),
  @partTime datetime=GetDate()

Then drop in scripts like these throughout the code you’re benchmarking;

print ' Timing: [insert_description_here]'
print ' * Total time='+convert(varchar(10), datediff(ms, @totalTime,GetDate()))
print ' * Last Part='+convert(varchar(10), datediff(ms, @partTime,GetDate()))
set @partTime=GetDate()

Once you’ve found a block that takes an inordinate amount of time, either try to figure out how to optimize it, or insert more benchmarking points to pinpoint the actual culprit.

Here’s an example;

declare
  @totalTime datetime=GetDate(),
  @partTime datetime=GetDate()

-- Simulates actual script
WAITFOR DELAY '00:00:00:50' -- 50ms delay

print ' Timing: [Initiated]'
print ' * Total time='+convert(varchar(10), datediff(ms, @totalTime,GetDate()))
print ' * Last Part='+convert(varchar(10), datediff(ms, @partTime,GetDate()))
set @partTime=GetDate()

-- Simulates actual script
WAITFOR DELAY '00:00:10:00' -- 10s delay

print ' Timing: [Collected base data]'
print ' * Total time='+convert(varchar(10), datediff(ms, @totalTime,GetDate()))
print ' * Last Part='+convert(varchar(10), datediff(ms, @partTime,GetDate()))
set @partTime=GetDate()

-- Simulates actual script
WAITFOR DELAY '00:00:00:50' -- 50ms delay

print ' Timing: [Compute sums]'
print ' * Total time='+convert(varchar(10), datediff(ms, @totalTime,GetDate()))
print ' * Last Part='+convert(varchar(10), datediff(ms, @partTime,GetDate()))
set @partTime=GetDate()

print 'Done!'

This script produces this output;

Timing: [Initiated]
 * Total time=50
 * Last Part=50
Timing: [Collected base data]
 * Total time=10043
 * Last Part=9993
Timing: [Compute sums]
 * Total time=10096
 * Last Part=53
Done!

As you’ll notice, it’s the part up to “Collected base data” that’s slow, so that’s where you should start either benchmarking further or actually optimizing!