SQL: Order By with NULL Values Last


I ran across a situation today where I needed to sort a result set on a column that could contain NULL values. By default, all rows with a NULL value in the sorted column are returned first but in this case, I wanted to have all the NULL values show up last in the result set.

Click here to jump to the correct solution or keep reading for a detail explanation of failed attempts at solving this problem.

For example, imagine you have an Employees table as defined below and with the indicated data.

Running a simple query ordering by the nullable DepartmentId column produces the following result (NULL values at the top).

SELECT * 
FROM Employees
ORDER BY DepartmentId

Ordering the result set by DepartmentId descending obviously won’t work as even though the NULL values will appear at the bottom all the other rows will be in descending order as well.

SELECT * 
FROM Employees
ORDER BY DepartmentId DESC

I have seen some people attempt to solve this problem using the following query.

SELECT Id, FirstName, LastName,
	   (CASE 
			WHEN DepartmentId IS NULL THEN 99999
			ELSE DepartmentId
		END) AS DepartmentId	   
FROM Employees 
ORDER BY DepartmentId

While this will most likely sort the rows as desired, this presents multiple problems. First, you are specifying a DepartmentId of 99999 for rows that really don’t have a DepartmentId and second, you have to ensure that you never have a DepartmentId over 99999 in your Employee records.

You can eliminate the first problem of the query above by doing the following:

SELECT *	 	   
FROM Employees 
ORDER BY ISNULL(DepartmentId, 99999)

The ISNULL function will replace a NULL DepartmentId with the value 99999, thus giving the desired ordering while still preserving the NULL value in the result set. But, this still leaves us with the second issue mentioned above.

The Solution

Both issues can be resolved by a simple case statement in the ORDER BY clause as shown below.

SELECT *	   
FROM Employees 
ORDER BY (CASE 
			WHEN DepartmentId IS NULL THEN 1 
			ELSE 0 
	      END), 
	     DepartmentId

Here, the result set is first sorted by a temporary column with the value of 1 if the associated row’s DepartmentId value is NULL and 0 if it is not. Doing such will ensure that all records with a NULL DepartmentId appear after records with a non-NULL DepartmentId. The two groups of records (ones with a NULL DepartmentId and ones with a non-NULL DepartmentId) are then sorted by the DepartmentId which ensures that all records with a DepartmentId are sorted correctly.

A simple solution to a random problem.

Posted in Sorting, SQL. Tags: , . 5 Comments »

Arbitrary Sort


Sorting is an issue that traces way back to the beginning of programming. All programming languages provide different ways to sort different items. Many even allow you to implement your own comparison function so you can sort a list of items in any way you want; by length or string, in reverse order, string with the most spaces, etc. But, I ran into a sorting problem that couldn’t easily be solved by any built in functions. My problem is as follows: the application I’m writing allows the user to post transactions to a resident in an apartment and each transaction is associated with a category. Categories include Rent, Pet Fees, Late Fees, Storage Unit Charge, Garage Charge, Carport Charge, etc. When the user posts a payment, the payment needs to be applied to the various oustanding charges for the resident in an order that can be arbitrarily set by an administrator. For instance, if a user has an outstanding rent charge, late fee charge, and garage charge, an administrator could setup the application so that the payment would first be applied to the late fee, then the garage charge, and lastly the rent charge. So, when a payment is made, the outstanding charges need to be retrieved, and then ordered according to what the user has defined. There in lies the problem. I wasn’t trying to sort a list by length, alphabetically, or something simple like that, I needed to sort the list by the ordering in another list defined by the user.

To solve my problem, I built an anonymous typed class called ArbitrarySort that implements the IComparer interface. The constructor takes a List of objects of type T which defines the desired sort order. From the passed in List, a Dictionary with key of type T and value of type int is created. Each item in the List is added in order into the Dictionary with an associated value that is incremented by one after each addition. Thus, if you passed in the List

Dog
Cat
Mouse
Hampster

the corresponding Dictionary in (Key, Value) form would be:

(Dog, 1)
(Cat, 2)
(Mouse, 3)
(Hampster, 4)

The Compare method takes two arguments of type T and looks up the value associated with each item in the Dictionary and compares them. Thus, if the Compare method was called on “Hampster” and “Cat”, the method would compare the values 4 and 2 and 1 would be returned as “Hampster” is considered to be greater than “Cat” and comes later in the list.

With this class, I can pass in a list of all transaction categories ordered according to which category a payment should be applied first and then when a payment is received, a list of outstanding charges is sorted according to the predefined order in the ArbitrarySort class.

    /// <summary>
    /// Creates a sorting class that lets you sort a list of items in an arbitrary order.
    /// </summary>
    /// <typeparam name="T"></typeparam>
    public class ArbitrarySort<T> : IComparer<T>
    {
        /// <summary>
        /// Dictionary that stores the objects used for ordering.  The key represents the
        /// object that is to be stored and the value is the index in the list.  Retrieving
        /// the value of an object will give the index it appears in the list.
        /// </summary>
        Dictionary<T, int> _sortOrder;

        /// <summary>
        /// Constructs a new ArbitrarySort class to sort a list in an abritrary order.
        /// </summary>
        /// <param name="sortOrder">Specifies the order in which items should be ordered.
        /// Example: To sort a list of single digit integers first by even numbers and then odd numbers, pass in { 0, 2, 4, 6, 8, 1, 3, 5, 7, 9 }.</param>
        public ArbitrarySort(List<T> sortOrder)
        {
            // Create the dictionary
            _sortOrder = new Dictionary<T, int>();

            for (int i = 0; i < sortOrder.Count(); i++)
                _sortOrder.Add(sortOrder[i], i);
        }

        /// <summary>
        /// Compares two objects and returns a value indicating whether one is less than, equal to or greater than the other.
        /// </summary>
        /// <param name="x">First object of type T to compare.</param>
        /// <param name="y">Second object of type T to compare</param>
        /// <returns>If x is less than y then -1.  If x is equal to y then 0.  If x is greater than y then 1</returns>
        public int Compare(T x, T y)
        {
            // Perform null testing.  A null reference is considered to be less
            // than the other object
            if (x == null && y == null)
                return 0;
            else if (x == null)
                return -1;
            else if (y == null)
                return 1;
            else
            {
                // Variables to store the indicies in the original sort order of each of the objects
                int xIndex = 0;
                int yIndex = 0;

                // Try and get both objects from the _sortOrder collection
                bool xExists = _sortOrder.TryGetValue(x, out xIndex);
                bool yExists = _sortOrder.TryGetValue(y, out yIndex);

                // If one object is not in the list it is considered to be
                // greater than the other object.  Otherwise, compare the two
                // indicies, if the index for object x is less than object y,
                // then x is smaller than y and visa versa.
                if (!xExists && !yExists)
                    return 0;
                else if (!xExists)
                    return 1;
                else if (!yExists)
                    return -1;
                else
                    return xIndex.CompareTo(yIndex);
            }
        }
    }

Since the AribitrarySort class is implemented using a Dictionary, which provides O(1) lookup time, the sorting of lists is extremely quick. For example, sorting a list of one million elements only took 2.6 seconds. For reference, sorting the same list alphabetically using the generic quick sort implemented in the .Net Framework took 1.9 seconds.

            Random random = new Random(DateTime.Now.Millisecond);
            List<string> sortOrder = new List<string>() { "Dog", "Cat", "Fish", "Bird", "Rat", "Monkey", "Frog", "Snake", "Spider", "Lizard", "Rabbit" };
            List<string> arbitrarySortList = new List<string>();
            List<string> quickSortList = new List<string>();
            for (int i = 0; i < 1000000; i++)
            {
                string item = sortOrder[random.Next(0, 9)];
                arbitrarySortList.Add(item);
                quickSortList.Add(item);
            }

            Stopwatch sw = Stopwatch.StartNew();
            arbitrarySortList.Sort(new ArbitrarySort<string>(sortOrder));
            sw.Stop();

            Console.WriteLine("Arbitrary Sort Time: " + sw.ElapsedMilliseconds + " ms");

            sw.Reset();
            sw.Start();
            quickSortList.Sort();
            sw.Stop();

            Console.WriteLine("Quick Sort Time: " + sw.ElapsedMilliseconds + " ms");

            // Output
            // ------
            // Arbitrary Sort Time: 2628 ms
            // Quick Sort Time: 1938 ms
Posted in Sorting. Tags: , . 1 Comment »

Crystal Reports: Sorting String Fields Numerically


Generally speaking, if you need to sort by a field in a numerical fashion, it is obviously best to have the value of the field be a number. But, I have had a few situations where I needed to store the value of the field as a string but sort it in a numerical fashion (for example, fields where letters are allowed but generally are all numbers). If you do a generic string sort on a set of numbers, it does not come out in numerical order due to the fact that lengths are not taken into account. Anything that starts with a 0 comes first, then with a 1 next, etc. So, the following strings ‘200’, ’10’, ‘3020’, ‘420’, ’11’, ‘8’ get sorted as follows:

10
11
200
3020
420
8

Obviously, not the desired result. On the other hand, if we prepend zeros to each of the strings like so, ‘0200’, ‘0010’, ‘3020’, ‘0420’, ‘0011’, ‘0008’, then the strings will be sorted in numeric order.

0008
0010
0011
0200
0420
3020

So, all you need to do is pad your string field with zeros so that each string is the same length. To do this in Crystal Reports, create a forumla field with the following code.

Right("000000000000000" & {Field},15)

The length of the string of zeros should be as long as the maximum length of the field and the second parameter of the Right function is as well the maximum length of the field. Then sort by this forumal field as opposed to the original field.

A simple solution that for some reason wasn’t completely apparent to me at first.