Sorting by locale with comparers in multiple columns

Lists can be easily sorted using comparer functions, but sorting table column(s) with comparer functions is much more difficult. This post introduces a function that allows you to do this. One major reason to use this function is for sorting according to the rules of a specific language or locality (also known as culture or locale).

What is sorting by locale?

Lets say I have a list of movies:
MovieNames = {“Toy story”,”Jurassic Park”, “Aeon Flux”, “Æon Flux”, “jurassic park”}

List.Sort(MovieNames) results in:
{“Aeon Flux”, “Jurassic Park”, “Toy story”, “jurassic park”, “Æon Flux”}
This is sorted according to the Unicode sequence which in this circumstance is not desirable.

Using List.Sort(MovieNames, Comparer.FromCulture(“en”,false)) however, results in:
{“Aeon Flux”, “Æon Flux”, “jurassic park”, “Jurassic Park”, “Toy story”}
This is now correct (for the English language, anyways). “jurassic park” comes just before “Jurassic park” as the ignoreCase argument is set to false. (If it was set to true, “jurassic park” would still be grouped with “Jurassic park” but could be in any order).  Also “Aeon Flux” is combined with “Æon Flux” as Comparer.FromCulture(“en”,false)( “Ae”,”Æ”) returns 0 because “Æ” is equivalent to “AE” in the English language.

Other languages have different rules when it comes it to sorting:

In Danish (culture name: “da”), Æ is regarded as a separate letter and not equivalent to “AE”.

Hawaiian (culture name: “haw”) alphabetical order differs from the normal Latin order. It’s five vowels come first (with the short vowel coming before it’s corresponding long vowel) and after that comes its eight consonants.

Tamil (culture name: “ta”) has a completely different script and is syllabic not alphabetic. Uyirmei characters are formed by a combination of Uyir characters (vowels) and Mei characters (consonants). For example, the combination of க, ் and ஷ is equivalent to: க்ஷ.

Sorting Simplified Chinese can be done in at least 3 different ways: by pronounciation: “zh-CN”, by stroke/radical count: “zh-CN_stroke” and Chinese phone book (surname) order: “zh-CN_phoneb”.

The list goes on and on. Two other articles that might be of interest can find can be found here and here.

Can you give me a full list of locales?

842 locales can be found here and 18 more “special” codes exist here. They all work but these lists are not complete. The “culture name” or “language tag” is not case sensitive so “en-gb” is equivalent to “en-GB”. Also some are functionally equivalent despite being of different text, e.g. “fil-latn” is equivalent to “fil”.

Can I sort tables using Comparer.FromCulture?

As shown above, it is very easy to sort lists with a comparer function. Sorting tables on multiple columns with comparers is harder. I have created a function here that allows you to do this:

(Table as table
,comparisonCriteria as any 
,optional defaultcomparisonCriteria as any) =>
    fMakeSureIsList = each if _ is list then _ else {_}

   ,Default =
        List = fMakeSureIsList(defaultcomparisonCriteria)
       ,Sort = if List{0} is number then List{0} else 
               if (List{0} is null or List{0} is function) then Order.Ascending 
               else error "invalid defaultcomparisonCriteria"     
       ,Comparer = if List{0} is function then List{0} else 
                   if List{1}? is function then List{1} else 
                   if List{1}? is null then Value.Compare else 
                   error "invalid defaultcomparisonCriteria"
       ,Record = [Sort = Sort, Comparer = Comparer]

   ,compCritListRecords = 
        List = fMakeSureIsList(comparisonCriteria) //e.g "Col1" => {"Col1"}
       ,FirstCheck =  if Value.Type(List{0}?) = type text    //So ListList does not transform
                     and Value.Type(List{1}?) = type number //{"Col1", Order.Ascending} 
                     then {List} else List                 //to {{"Col1"}, {Order.Ascending}} 
       ,ListList = List.Transform(List, fMakeSureIsList)
       ,fSort = (Order) => if Order = Order.Ascending then 1 else 
                           if Order = Order.Descending then -1 else 
                           error "Order must be Order.Ascending or Order.Descending"
       ,ListRec =  List.Transform(ListList
                  ,each let ColNam = _{0} as text
                           ,Order = _{1}? as nullable number
                           ,Comp = _{2}? as nullable function
                         in [ColNam = ColNam
                            ,Order = fSort(if Order = null then Default[Sort] else Order)
                            ,Comp = if Comp = null then Default[Comparer] else Comp ]) 
        List.Buffer(ListRec as list)
    ,fComparer = (x as record, y as record) =>
         fFullCompare = (x,y, compCrit) => 
            cC = compCrit{0}
           ,comparison = cC[Comp](Record.Field(x,cC[ColNam]),Record.Field(y,cC[ColNam]))
           ,comparisonAllCols = if comparison = 0 and List.Count(compCrit)>1 then 
                                @fFullCompare(x,y,List.Skip(compCrit)) else 
                                comparison *cC[Order]
        ,CompareRec = fFullCompare(x,y,compCritListRecords)

    ,Execute = Table.Sort(Table, fComparer)        

How do I use this function?

For the purposes of explanation, the function above will be called TableComparerSort, henceforth.

A valid argument (of the first style) for Table.Sort will also work for TableComparerSort and will have the same results (although TableComparerSort will be slower).

So let’s say there is a table “Table” with columns named “A”,”B” and “C”.

The 4 declarations below are equivalent:

Table.Sort(Table, {“A”, {“B”, Order.Descending}})
Table.Sort(Table, {{“A”, Order.Ascending},{“B”, Order.Descending}})
TableComparerSort(Table, {“A”, {“B”, Order.Descending}})
TableComparerSort(Table, {{“A”, Order.Ascending},{“B”, Order.Descending}})

Now lets change one of the comparers. We want column “A” to be sorted according to English locality. We cannot not do it directly with Table.Sort, but we can do it by using the TableComparerSort function.

TableComparerSort(Table, {{“A”, Order.Ascending, Comparer.FromCulture(“en”,false)},{“B”, Order.Descending}})

There is also an optional defaultcomparisonCriteria option which can sometimes simplify the comparisonCriteria. This optional argument allows you to override the default sort direction (Order.Ascending) and the default comparer (Value.Compare) which are used when these values are not specified for a column in the comparisonCriteria. The defaultcomparisonCriteria can either be an order (e.g. Order.Ascending), a comparer (e.g. Comparer.FromCulture(“haw”,true)) or a list that contains the order as the first element and the comparer as the second element.

To explain this a bit better here is another scenario. We want to order primarily on “A”, secondarily on “B” and thirdly on “C”. “A” and “B” are to be sorted with Order.Descending and Comparer.FromCulture(“haw”,true), and column “C” in ascending Unicode sequence. The top declaration uses the comparisonCriteria while bottom declaration uses both the comparisonCriteria and the defaultcomparisonCriteria, yet the 2 declarations below are equivalent:

TableComparerSort(Table, {{“A”,Order.Descending,Comparer.FromCulture(“Haw”,false)},{“B”,Order.Descending,Comparer.FromCulture(“Haw”,false)},{“C”}}
TableComparerSort(Table, {“A”,”B”,{“C”,Order.Ascending,Value.Compare}},{Order.Descending,Comparer.FromCulture(“Haw”,false)}

And for a simpler, more realistic example (unrelated to the above scenario):

TableComparerSort(Table, {“FamilyName”,”FirstName”},Comparer.FromCulture(“Haw”,false))

Using custom comparers instead, some ideas

This “TableComparerSort” function is not limited to comparers generated by the Comparer.FromCulture function. You can also create your own. Some good background information can be found in my previous post.

You might also want change the “TableComparerSort” function and create your own comparer so that the comparer takes the order into regard instead. For example, you might have a column containing values of different types, e.g. text and null. Normally, Order.Ascending will have the nulls come first and Order.Descending will have the nulls come last. Changing the “TableComparerSort” function and creating your own comparer you could have the nulls always coming last no matter what the order is.

Add a Comment

Your email address will not be published. Required fields are marked *

Follow this blog!

Get the latest posts delivered to your mailbox: