SQL: Order By with NULL Values Last


I ran across a situation today where I needed to sort a result set on a column that could contain NULL values. By default, all rows with a NULL value in the sorted column are returned first but in this case, I wanted to have all the NULL values show up last in the result set.

Click here to jump to the correct solution or keep reading for a detail explanation of failed attempts at solving this problem.

For example, imagine you have an Employees table as defined below and with the indicated data.

Running a simple query ordering by the nullable DepartmentId column produces the following result (NULL values at the top).

SELECT * 
FROM Employees
ORDER BY DepartmentId

Ordering the result set by DepartmentId descending obviously won’t work as even though the NULL values will appear at the bottom all the other rows will be in descending order as well.

SELECT * 
FROM Employees
ORDER BY DepartmentId DESC

I have seen some people attempt to solve this problem using the following query.

SELECT Id, FirstName, LastName,
	   (CASE 
			WHEN DepartmentId IS NULL THEN 99999
			ELSE DepartmentId
		END) AS DepartmentId	   
FROM Employees 
ORDER BY DepartmentId

While this will most likely sort the rows as desired, this presents multiple problems. First, you are specifying a DepartmentId of 99999 for rows that really don’t have a DepartmentId and second, you have to ensure that you never have a DepartmentId over 99999 in your Employee records.

You can eliminate the first problem of the query above by doing the following:

SELECT *	 	   
FROM Employees 
ORDER BY ISNULL(DepartmentId, 99999)

The ISNULL function will replace a NULL DepartmentId with the value 99999, thus giving the desired ordering while still preserving the NULL value in the result set. But, this still leaves us with the second issue mentioned above.

The Solution

Both issues can be resolved by a simple case statement in the ORDER BY clause as shown below.

SELECT *	   
FROM Employees 
ORDER BY (CASE 
			WHEN DepartmentId IS NULL THEN 1 
			ELSE 0 
	      END), 
	     DepartmentId

Here, the result set is first sorted by a temporary column with the value of 1 if the associated row’s DepartmentId value is NULL and 0 if it is not. Doing such will ensure that all records with a NULL DepartmentId appear after records with a non-NULL DepartmentId. The two groups of records (ones with a NULL DepartmentId and ones with a non-NULL DepartmentId) are then sorted by the DepartmentId which ensures that all records with a DepartmentId are sorted correctly.

A simple solution to a random problem.

Posted in Sorting, SQL. Tags: , . 5 Comments »

C#: Most Recently Used List


I wanted to add a button to one of the applications that I wrote that listed the last ten reports that a user has viewed so they could quickly and easily access recently viewed reports. I found a lot of information about how to create a Most Recently Used (MRU) list of files that have been opened but I couldn’t find anything built into the .NET Framework to create a custom list of objects where the most recently used item was at the beginning of the list.

I’m sure many of you are saying to yourself, “Why didn’t he just use a stack?”. The reason I didn’t use the already built Stack class is because I needed to implement two other features:

  • Restrict the size of the set
  • Ensure no duplicate items exist in the set

The Stack class found in the System.Collections namespace doesn’t enforce these two properties inherently and thus the need for a custom collection.

Note: I could have optimized this collection a bit more by adding each item to the end of the list and keeping a pointer to the location of the true first item but given the fact that I only will be storing a few items in the set, I didn’t take the time to implement it that way. I leave that optimization to you.

/// <summary>
/// Stores a set of items with the most recently added at item the front and the 
/// least recently added item at the end
/// </summary>
public class RecentSet<T> : IEnumerable<T>
{
    private List<T> _list;
    private int _size = -1;

    /// <summary>
    /// Creates a new RecentSet object.
    /// </summary>
    public RecentSet()
    {
        _list = new List<T>();            
    }
     /// <summary>
    /// Creates a new RecentSet object with a fixed size. The return set may be smaller than
    /// the specified size but it will never be larger
    /// </summary>
    /// <param name="size">The maximum size of the set</param>
    public RecentSet(int size)
    {
        _list = new List<T>();
        _size = size;
    }
 
    /// <summary>
    /// Creates a new RecentSet object initializing it with the indicated items. Note: 
    /// the initialized RecentSet will be in the order of parameter items.  If items are {1, 2, 3, 4},
    /// iterating through RecentSet will result in a list of {1, 2, 3, 4} not {4, 3, 2, 1}        
    /// </summary>
    public RecentSet(IEnumerable<T> items)
    {
        _list = items.ToList();
    }

    /// <summary>
    /// Creates a new RecentSet object with a fixed size initializing it with the indicated items. Note: 
    /// the initialized RecentSet will be in the order of parameter items.  If items are {1, 2, 3, 4},
    /// iterating through RecentSet will result in a list of {1, 2, 3, 4} not {4, 3, 2, 1}        
    /// </summary>
    public RecentSet(int size, IEnumerable<T> items)
    {
        _list = items.ToList();
        _size = size;

        TrimList();
    }

    /// <summary>
    /// Adds an item to the RecentSet
    /// </summary>
    public void Add(T item)
    {
        // If the item is already in the set, remove it
        int i = _list.IndexOf(item);
        if (i > -1)
            _list.RemoveAt(i);

        // Add the item to the front of the list.
        _list.Insert(0, item);

        TrimList();
    }

    public int Count
    {
        get { return _list.Count; }
    }

    private void TrimList()
    {
        // If there is a set size, make sure the set only contains that many elements
        if (_size != -1)
            while (_list.Count > _size)
                _list.RemoveAt(_list.Count - 1);
    }

    /// <summary>
    /// Returns the set in the form of a List
    /// </summary>
    public List<T> ToList()
    {
        return _list;
    }

    #region IEnumerable<T> Members

     public IEnumerator<T> GetEnumerator()
    {           
        return _list.GetEnumerator();
    }

    #endregion

    #region IEnumerable Members

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return _list.GetEnumerator();
    }

    #endregion
}

Arbitrary Sort


Sorting is an issue that traces way back to the beginning of programming. All programming languages provide different ways to sort different items. Many even allow you to implement your own comparison function so you can sort a list of items in any way you want; by length or string, in reverse order, string with the most spaces, etc. But, I ran into a sorting problem that couldn’t easily be solved by any built in functions. My problem is as follows: the application I’m writing allows the user to post transactions to a resident in an apartment and each transaction is associated with a category. Categories include Rent, Pet Fees, Late Fees, Storage Unit Charge, Garage Charge, Carport Charge, etc. When the user posts a payment, the payment needs to be applied to the various oustanding charges for the resident in an order that can be arbitrarily set by an administrator. For instance, if a user has an outstanding rent charge, late fee charge, and garage charge, an administrator could setup the application so that the payment would first be applied to the late fee, then the garage charge, and lastly the rent charge. So, when a payment is made, the outstanding charges need to be retrieved, and then ordered according to what the user has defined. There in lies the problem. I wasn’t trying to sort a list by length, alphabetically, or something simple like that, I needed to sort the list by the ordering in another list defined by the user.

To solve my problem, I built an anonymous typed class called ArbitrarySort that implements the IComparer interface. The constructor takes a List of objects of type T which defines the desired sort order. From the passed in List, a Dictionary with key of type T and value of type int is created. Each item in the List is added in order into the Dictionary with an associated value that is incremented by one after each addition. Thus, if you passed in the List

Dog
Cat
Mouse
Hampster

the corresponding Dictionary in (Key, Value) form would be:

(Dog, 1)
(Cat, 2)
(Mouse, 3)
(Hampster, 4)

The Compare method takes two arguments of type T and looks up the value associated with each item in the Dictionary and compares them. Thus, if the Compare method was called on “Hampster” and “Cat”, the method would compare the values 4 and 2 and 1 would be returned as “Hampster” is considered to be greater than “Cat” and comes later in the list.

With this class, I can pass in a list of all transaction categories ordered according to which category a payment should be applied first and then when a payment is received, a list of outstanding charges is sorted according to the predefined order in the ArbitrarySort class.

    /// <summary>
    /// Creates a sorting class that lets you sort a list of items in an arbitrary order.
    /// </summary>
    /// <typeparam name="T"></typeparam>
    public class ArbitrarySort<T> : IComparer<T>
    {
        /// <summary>
        /// Dictionary that stores the objects used for ordering.  The key represents the
        /// object that is to be stored and the value is the index in the list.  Retrieving
        /// the value of an object will give the index it appears in the list.
        /// </summary>
        Dictionary<T, int> _sortOrder;

        /// <summary>
        /// Constructs a new ArbitrarySort class to sort a list in an abritrary order.
        /// </summary>
        /// <param name="sortOrder">Specifies the order in which items should be ordered.
        /// Example: To sort a list of single digit integers first by even numbers and then odd numbers, pass in { 0, 2, 4, 6, 8, 1, 3, 5, 7, 9 }.</param>
        public ArbitrarySort(List<T> sortOrder)
        {
            // Create the dictionary
            _sortOrder = new Dictionary<T, int>();

            for (int i = 0; i < sortOrder.Count(); i++)
                _sortOrder.Add(sortOrder[i], i);
        }

        /// <summary>
        /// Compares two objects and returns a value indicating whether one is less than, equal to or greater than the other.
        /// </summary>
        /// <param name="x">First object of type T to compare.</param>
        /// <param name="y">Second object of type T to compare</param>
        /// <returns>If x is less than y then -1.  If x is equal to y then 0.  If x is greater than y then 1</returns>
        public int Compare(T x, T y)
        {
            // Perform null testing.  A null reference is considered to be less
            // than the other object
            if (x == null && y == null)
                return 0;
            else if (x == null)
                return -1;
            else if (y == null)
                return 1;
            else
            {
                // Variables to store the indicies in the original sort order of each of the objects
                int xIndex = 0;
                int yIndex = 0;

                // Try and get both objects from the _sortOrder collection
                bool xExists = _sortOrder.TryGetValue(x, out xIndex);
                bool yExists = _sortOrder.TryGetValue(y, out yIndex);

                // If one object is not in the list it is considered to be
                // greater than the other object.  Otherwise, compare the two
                // indicies, if the index for object x is less than object y,
                // then x is smaller than y and visa versa.
                if (!xExists && !yExists)
                    return 0;
                else if (!xExists)
                    return 1;
                else if (!yExists)
                    return -1;
                else
                    return xIndex.CompareTo(yIndex);
            }
        }
    }

Since the AribitrarySort class is implemented using a Dictionary, which provides O(1) lookup time, the sorting of lists is extremely quick. For example, sorting a list of one million elements only took 2.6 seconds. For reference, sorting the same list alphabetically using the generic quick sort implemented in the .Net Framework took 1.9 seconds.

            Random random = new Random(DateTime.Now.Millisecond);
            List<string> sortOrder = new List<string>() { "Dog", "Cat", "Fish", "Bird", "Rat", "Monkey", "Frog", "Snake", "Spider", "Lizard", "Rabbit" };
            List<string> arbitrarySortList = new List<string>();
            List<string> quickSortList = new List<string>();
            for (int i = 0; i < 1000000; i++)
            {
                string item = sortOrder[random.Next(0, 9)];
                arbitrarySortList.Add(item);
                quickSortList.Add(item);
            }

            Stopwatch sw = Stopwatch.StartNew();
            arbitrarySortList.Sort(new ArbitrarySort<string>(sortOrder));
            sw.Stop();

            Console.WriteLine("Arbitrary Sort Time: " + sw.ElapsedMilliseconds + " ms");

            sw.Reset();
            sw.Start();
            quickSortList.Sort();
            sw.Stop();

            Console.WriteLine("Quick Sort Time: " + sw.ElapsedMilliseconds + " ms");

            // Output
            // ------
            // Arbitrary Sort Time: 2628 ms
            // Quick Sort Time: 1938 ms
Posted in Sorting. Tags: , . 1 Comment »