Tuesday, December 10, 2013

ORM vs SQL Part 2: Ordering

Most applications will need to to control the ordering of the data being presented to users, the exact requirements will differ depending on the application and situation, but virtually all will require it. Sometimes the same data will presented in several places, each requiring it's own order for natural reasons. Other times dynamic tables will be presented and users will control what order the view information in. Some even allow full control and let the users specify multiple orders.

In the later two cases ordering usually goes hand in hand with filtering, but I'm going to leave the filtering until part three. In part one I looked at paging.


Ordering with an ORM


Simple ordering is straight forward with an ORM (and with sql for that matter):

  1. var users = session.QueryOver<User>().OrderBy(x => x.LastName).Asc.List();
In fact, at this level, sql is even simpler. An ORM starts to shine when you have some other logic controlling the ordering because you get a composable query object to work with, for example:
  1. var query = session.QueryOver<User>();
  2. if (orderByLastName == true)
  3.   query = query.OrderBy(x=>x.LastName).Asc;
  4. if (orderByLastLogin == true)
  5.   query = query.OrderBy(x=>x.LastLogin).Asc;
By composing queries like this, any ordering that can be logically described can be performed, your only limit is your own code and how much power and flexibility you want to give the end users.

The ORM will generate the sql needed for that individual scenario and make full use of our indexes.


Ordering with Sql


Simple ordering in sql is extremely straight forward:

  1. select TOP 10 * from users order by lastName

Most importantly, this will use the index on lastName. On my computer this will execute in ~5ms with 1,000,000 rows. This is essentially what our ORM will generate. 

Now, let's say our application is showing a list of users in two places, the first we are trying to find users by last name and the second we want to see who were the latest to login. We can write two stored procedures, which will be identical apart from the order by clause. Or we can add some dynamic ordering to the query:

  1. SELECT TOP 10 id, lastName, firstName
  2. FROM users
  3. order by CASE
  4.   WHEN @OrderBy = 'LastNameAsc' THEN lastName
  5.   WHEN @OrderBy = 'FirstName' THEN firstName
  6. END
The drawbacks for this are that the data types have to be the same in the case statement, so we can't use the same query to order by lastLogin. Likewise, we can't mix ascending and descending. And even if we are ordering on lastName, which has an index in this scenario, the index will not get used. It also performs quite badly, 753ms  which is more than two orders of magnitude slower for the same functionality.

In part one I said the new 2012 offset/next syntax performed quite slowly with dynamic ordering, this is why.


Ordering with Sql - Again


So being unable to mix datatypes is a deal breaker in our scenario. we want to order by either lastName or lastLogin. The easiest way to do this is to invite our old friend from part one, rownumber: 

  1. SELECT TOP 10 id, lastName, firstName, CASE
  2.   WHEN @OrderBy = 'LastNameAsc' THEN ROW_NUMBER() OVER(ORDER BY lastName)
  3.   WHEN @OrderBy = 'LastLoginDesc' THEN ROW_NUMBER() OVER(ORDER BY lastLogin desc)
  4. END  as rownumber 
  5. FROM users
  6. order by rownumber
Finally, we can order by multiple data types and by ascending or descending order. But it comes at a hefty price, it's yet another order of magnitude slower, ~5800ms, way beyond what I would consider acceptable.


Ordering and Paging


So now let's look at the combination of paging and ordering:
  1. SELECT TOP(@take) id, lastName, firstName, email, lastLogin 
  2. FROM (
  3.   SELECT id, lastName, firstName, email, lastLogin, CASE
  4.     WHEN @OrderBy = 'LastNameAsc' THEN ROW_NUMBER() OVER(ORDER BY lastName)
  5.     WHEN @OrderBy = 'LastLoginDesc' THEN ROW_NUMBER() OVER(ORDER BY lastLogin desc)
  6.   END  as rownumber 
  7.   FROM users
  8. ) as query 
  9. WHERE query.rownumber > @skip
  10. ORDER BY rownumber

Our simply little select isn't so simple any more and the performance has suffered terribly (~8500ms), to the point where it is immediately noticeable to end users. All because you thought ORM's were too slow.

We are quickly approaching the limits of what we can do with sql. The ORM can easily handle ordering over multiple columns dynamically, to do the same with sql would require a fundamentally different, and probably much more complex, strategy.


Why are ORM's faster?


In this scenario our ORM is 3 orders of magnitude faster than a stored procedure with the same functionality. This is simply because it generates queries dynamically, it doesn't have to account for every possibility at programming time because it can adjust at run time instead.

Sql, despite frequently being championed as faster, can not even make use of indexes appropriately as soon as you throw a real world situation at it.

The final score, with paging and ordering is ORM: ~10ms, Sql: ~8500ms.

Next up, I'll take a look at dynamic filtering.




















No comments:

Post a Comment