Algorithm to sort various data types
Conceptually, I would like to sort the data in a single column of variant type, similar to how Excel does it:
In Excel, numbers are sorted before strings, which are sorted before Booleans, which are sorted before errors. So I would like to have a function that works like this in SQL:
    SELECT * FROM table ORDER BY
        CASE WHEN type='number' THEN 0 WHEN type='string' THEN 1 /* ... */ END,
        SortFunction(variantData)
Here is what I'd like to accomplish:
- The SortFunction needs to return a value of a single data type, for example a string, a number, or a binary type.
- I am fine with limiting the length of a text field in the function if that is necessary (for example, if we have a string that is 10,000 characters long, just limiting it to the first 100 characters).
Any programming language is fine; I'm mainly interested in a technique for accomplishing this Excel-like sorting.
For a numeric field we can keep the value as-is, and for a date/time field we can use a Unix timestamp, but how would we do it for a string or binary data type?
Solution 1:[1]
I would build this atop a simple function that takes an ordered list of pairs, each consisting of a data-type predicate and a comparator function. Then we can simply list the pairs in the order we want: number, then string, then boolean, then error, then whatever else.
A simple, barely-tested version of that idea is as follows:
    const multiSort = (cfgs) => (xs) => xs .sort ((a, b) => {
      const ia = cfgs .findIndex (([f]) => f (a))
      const ib = cfgs .findIndex (([f]) => f (b))
      return ia == ib ? cfgs [ia] [1] (a, b) : ia - ib
    })

    const excelSorter = multiSort ([
      [(x) => typeof x == 'number' && Number .isFinite (x), (a, b) => a < b ? -1 : a > b ? 1 : 0],
      [(x) => typeof x == 'string', (a, b) => a < b ? -1 : a > b ? 1 : 0], // or case-insensitive
      [(x) => typeof x == 'boolean', (a, b) => a ? b ? 0 : 1 : b ? -1 : 0],
      [isNaN, () => 0],
      [() => true, (a, b) => a < b ? -1 : a > b ? 1 : 0] // defaulting to JS's relational comparison otherwise
    ])

    console .log (excelSorter (
      [1, true, "00A", "Something", -2.14, false, 1.29375e-17, 9, "Z", 2.4, NaN, "Hello", 2.98129e+30]
    ))
    //=> [-2.14, 1.29375e-17, 1, 2.4, 9, 2.98129e+30, "00A", "Hello", "Something", "Z", false, true, NaN]
We simply find the index of the first configuration pair whose test function returns true for our first value, and likewise the index for our second value. If those indices are different, we return their difference. If they are the same, we return the result of calling the associated comparator function on the two values.
Many variants of this are possible. We might want the function itself to supply the final pair used here, so there's always a fallback. We might want {test, compare} objects instead of ordered pairs in an array. We might want to change that final comparator to simply always return 0, rather than depending upon JS's odd value comparison rules.
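As a rough illustration of those variants, here is a minimal sketch that combines all three: rules are {test, compare} objects, the function appends its own catch-all rule, and that fallback always returns 0. The names multiSortBy, fallback, byValue, and excelLike are made up for this example. The sketch also sorts a copy of the input rather than the array itself, which is an extra design choice and not something the version above does.

```javascript
// Hypothetical variant of multiSort: rules are {test, compare} objects,
// and a catch-all rule is appended automatically.
const fallback = { test: () => true, compare: () => 0 }

const multiSortBy = (rules) => {
  const all = [...rules, fallback]
  // Index of the first rule whose test accepts x; the fallback guarantees a match.
  const rank = (x) => all.findIndex(({ test }) => test(x))
  // Copy the input so the caller's array is left untouched.
  return (xs) => [...xs].sort((a, b) => {
    const ia = rank(a)
    const ib = rank(b)
    return ia === ib ? all[ia].compare(a, b) : ia - ib
  })
}

const byValue = (a, b) => (a < b ? -1 : a > b ? 1 : 0)

const excelLike = multiSortBy([
  { test: (x) => typeof x === 'number' && Number.isFinite(x), compare: byValue },
  { test: (x) => typeof x === 'string', compare: byValue },
  { test: (x) => typeof x === 'boolean', compare: (a, b) => Number(a) - Number(b) },
])

console.log(excelLike([true, 'b', 2, NaN, 'a', false, 1]))
//=> [1, 2, "a", "b", false, true, NaN]
```

Because the fallback comparator returns 0 and Array.prototype.sort is stable in current engines, values that match no explicit rule keep their original relative order at the end of the result.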
Solution 2:[2]
Treat each element as a byte array and apply a Comparator:
    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;
    import java.util.Comparator;

    public class SortAnyObjects {
        public static void main(String[] args) {
            Object[] arr = {1, 'c', '&', "z", "testing", "hello world", '?', 'å'};
            byte[][] a = new byte[arr.length][]; // inner arrays are assigned in the loop below
            for (int i = 0; i < arr.length; i++) {
                if (arr[i] instanceof Integer) {
                    a[i] = String.valueOf((int) arr[i]).getBytes(StandardCharsets.UTF_8);
                }
                else if (arr[i] instanceof Character) {
                    a[i] = String.valueOf((char) arr[i]).getBytes(StandardCharsets.UTF_8);
                }
                else { // expand this branch for the other data types you expect
                    a[i] = ((String) arr[i]).getBytes(StandardCharsets.UTF_8);
                }
            }
            Arrays.sort(a, new Comparator<byte[]>() {
                @Override
                public int compare(final byte[] o1, final byte[] o2) {
                    if (o1 == o2) {
                        return 0;
                    }
                    if (o1 == null) {
                        return 1; // nulls sort last
                    }
                    if (o2 == null) {
                        return -1;
                    }
                    // Compare only the shared prefix, so we never read past the shorter array.
                    int n = Math.min(o1.length, o2.length);
                    for (int i = 0; i < n; i++) {
                        if (o1[i] != o2[i]) {
                            // Byte.compare is signed: bytes >= 0x80 (non-ASCII in UTF-8) sort before ASCII.
                            return Byte.compare(o1[i], o2[i]);
                        }
                    }
                    // Equal prefix: the shorter array sorts first.
                    return Integer.compare(o1.length, o2.length);
                }
            });
            System.out.println(Arrays.toString(a));
            for (int i = 0; i < a.length; i++) {
                System.out.println(new String(a[i], StandardCharsets.UTF_8));
            }
        }
    }
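The byte-array idea can also be turned into the single-typed SortFunction the question asks for. The following is only a sketch, written in JavaScript to match Solution 1 rather than the Java above, and the names sortKey, numberBytes, compareKeys, and MAX_CHARS are hypothetical. It assumes a key made of one type-rank byte followed by an order-preserving payload: finite numbers as bit-flipped big-endian IEEE 754 doubles, strings as UTF-8 truncated to the first 100 UTF-16 code units, and Booleans as a single byte. Keys are compared as unsigned bytes, and string ordering is case-sensitive, which may differ from Excel's collation.

```javascript
// Sketch of a single-type sort key: one rank byte, then an order-preserving payload.
// sortKey, numberBytes, compareKeys and MAX_CHARS are hypothetical names.
const MAX_CHARS = 100

const numberBytes = (x) => {
  const view = new DataView(new ArrayBuffer(8))
  view.setFloat64(0, x === 0 ? 0 : x) // big-endian by default; -0 is normalized to +0
  const bytes = new Uint8Array(view.buffer)
  // Flip bits so that unsigned byte order matches numeric order:
  // negatives get every bit flipped, non-negatives only the sign bit.
  const negative = (bytes[0] & 0x80) !== 0
  return bytes.map((b, i) => (negative ? b ^ 0xff : i === 0 ? b ^ 0x80 : b))
}

const sortKey = (v) => {
  if (typeof v === 'number' && Number.isFinite(v)) {
    return Uint8Array.of(0, ...numberBytes(v))
  }
  if (typeof v === 'string') {
    // Truncation is by UTF-16 code units, as a simplification.
    return Uint8Array.of(1, ...new TextEncoder().encode(v.slice(0, MAX_CHARS)))
  }
  if (typeof v === 'boolean') return Uint8Array.of(2, v ? 1 : 0)
  return Uint8Array.of(3) // everything else (errors, NaN, null) sorts last
}

// Lexicographic comparison of unsigned bytes; a shorter key sorts before its extensions.
const compareKeys = (a, b) => {
  const n = Math.min(a.length, b.length)
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return a[i] - b[i]
  }
  return a.length - b.length
}

const data = [true, 'b', -2.5, 'a', false, 10, 1.5, 'ab']
const sorted = data
  .map((v) => ({ v, key: sortKey(v) }))
  .sort((x, y) => compareKeys(x.key, y.key))
  .map(({ v }) => v)

console.log(sorted)
//=> [-2.5, 1.5, 10, "a", "ab", "b", false, true]
```

Since the type rank is the first byte of the key, a key like this would make the separate CASE expression in the ORDER BY unnecessary, provided the database compares binary values byte by byte as unsigned values.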
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Scott Sauyet |
| Solution 2 | |

