'What is the fastest way to split Integer into digits?

I do a lot of operations with splitting numbers into separate digits, putting digits in ArrayList and passing this digits one by one to other ArrayList for further operations until tempList is empty - then goes the next number which is bigger than previous.

I'd like to know which approach is faster.

The part that is common to both approaches:

// Split number into digits and put into array
BigInteger number; // the number can be very big => BigInteger
ArrayList<Integer> tempList = new ArrayList<>();
while (number.compareTo(BigInteger.ZERO) == 1) {
   tempList.add((number.mod(BigInteger.TEN)).intValue());
   number = number.divide(BigInteger.TEN);
}

Then I can go 2 ways passing those digits to other ArrayList mainList one by one and deleting them after: Way 1:

// reverse the tempList (as digits are put there backwards) and take the first elements
Collections.reverse(tempList);
while (!tempList.isEmpty()) {
    mainList.add(tempList.get(0);
    tempList.remove(0);
}

Way 2:

// take the last elements
while (!tempList.isEmpty()) {
    mainList.add(tempList.get(tempList.size()-1);
    tempList.remove(tempList.get(tempList.size()-1);
}

Which way is faster? Taking into consideration, that it's billions of operations with billions of numbers being split and added. I supposed that method such as Collections.reverse() takes more time, but I only call it every time tempList is updated with new digits from next number. But in Way 2 I call .size()-1 operation on EVERY operation.

Also the bigger the number - the bigger the gap between updating the tempList and taking numbers from it (obviously), so less .reverse() method calling. The number starts at 1 and goes to infinity.

tempList is there for a reason, so please, do not suggest to bypass it.

Additional question: what is the best practice of measuring such things?



Solution 1:[1]

Caution. There are some gotchas involved here.

In fact, there are two questions mixed:

  • How to transfer data from one list to another, in reverse order?
  • How to create a list of digits from a BigInteger?

I agree with the comment by Roman C: "Moving from one list to another is useless". At least, it seems to be useless in this case. But if there is something happening to the tempList, and the general approach of removing elements from one list and adding them to another list (one by one) is justified in any way, then the question of how to improve the performance for this particular case may still be feasible.


Regarding the core question of how to transfer data from one list to another, in reverse order:

Surprisingly, in the form that it is written now,

...the second snippet is far slower than the first one!

(Explanation follows)

A simple test like this one compares both approaches. (Of course, "micorbenchmarks" like this one should be taken with a grain of salt, but as the performance here is related to asymptotic running times, this is reasonable here)

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Random;

public class BigIntegerDigitsPerformanceLists
{
    public static void main(String[] args)
    {
        testListConversion();
    }

    private static void testListConversion()
    {
        long before = 0;
        long after = 0;

        for (int size = 10; size <= 1000000; size *= 10)
        {
            List<Integer> inputA = createRandomList(size);
            List<Integer> inputB = createRandomList(size);

            before = System.nanoTime();
            List<Integer> resultA = convertA(inputA);
            after = System.nanoTime();
            System.out.printf(Locale.ENGLISH,
                "A: size %8d time %8.2fms result %d\n",
                size, (after-before)/1e6, resultA.get(0));

            before = System.nanoTime();
            List<Integer> resultB = convertB(inputB);
            after = System.nanoTime();
            System.out.printf(Locale.ENGLISH,
                "B: size %8d time %8.2fms result %d\n",
                size, (after-before)/1e6, resultB.get(0));
        }
    }

    private static List<Integer> createRandomList(int size)
    {
        List<Integer> result = new ArrayList<Integer>();
        Random random = new Random(0);
        for (int i=0; i<size; i++)
        {
            result.add(random.nextInt(10));
        }
        return result;
    }


    private static List<Integer> convertA(List<Integer> list)
    {
        List<Integer> result = new ArrayList<Integer>();
        Collections.reverse(list);
        while (!list.isEmpty())
        {
            result.add(list.get(0));
            list.remove(0);
        }
        return result;
    }

    private static List<Integer> convertB(List<Integer> list)
    {
        List<Integer> result = new ArrayList<Integer>();
        while (!list.isEmpty())
        {
            result.add(list.get(list.size() - 1));
            list.remove(list.get(list.size() - 1));
        }
        return result;
    }
}

The output on my machine is

A: size       10 time     0.08ms result 4
B: size       10 time     0.05ms result 4
A: size      100 time     0.13ms result 1
B: size      100 time     0.39ms result 1
A: size     1000 time     1.27ms result 6
B: size     1000 time     2.96ms result 6
A: size    10000 time    39.72ms result 1
B: size    10000 time   220.82ms result 1
A: size   100000 time  3766.45ms result 7
B: size   100000 time 21734.66ms result 7
...

But....

This is due to a wrong method call. The second approach contains the line

list.remove(list.get(list.size() - 1));

and this is the culprit in this case: You have a list of Integer objects. And you are calling remove, passing in an Integer object. This method will search through the whole list, and remove the first occurrence of the argument. This is not only slow, it also causes the result to be plainly wrong!

What you actually want to do is to remove the last element, using the index of the last element. So changing this line to

list.remove((int)list.size() - 1);

gives an entirely different timing result:

A: size       10 time     0.08ms result 4
B: size       10 time     0.03ms result 4
A: size      100 time     0.13ms result 1
B: size      100 time     0.10ms result 1
A: size     1000 time     1.28ms result 6
B: size     1000 time     0.46ms result 6
A: size    10000 time    39.09ms result 1
B: size    10000 time     2.63ms result 1
A: size   100000 time  3763.97ms result 7
B: size   100000 time     9.83ms result 7
...

So, when implemented properly, then

...the first snippet is far slower than the second one!

For the reasons that Eran mentioned in his answer.


Regarding the question of how to create a list of digits from a BigInteger: There are several possible performance improvements.

Extracting the digits manually, with a sequence of %= 10 and /= 10 calls is very slow. Avoiding the modulo operation already brings a small speedup. So instead of

digit = number % 10;
number = number / 10;

you could do

nextNumber = number / 10;
digit = number - (nextNumber * 10);
number = nextNumber;

But due to the immutability of BigInteger and the expensive division, this is still orders of magnitude slower than simply converting the BigInteger into a string, and extracting the digits from there, as dasblinkenlight suggested in his answer.

A simple comparison:

import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Random;

public class BigIntegerDigitsPerformance
{
    public static void main(String[] args)
    {
        testListCreation();
    }

    private static void testListCreation()
    {
        long before = 0;
        long after = 0;

        for (int size = 10; size <= 100000; size *= 10)
        {
            BigInteger number = createRandomBigInteger(size);

            before = System.nanoTime();
            List<Integer> resultA = createA(number);
            after = System.nanoTime();
            System.out.printf(Locale.ENGLISH,
                "A: size %8d time %8.2fms result %d\n",
                size, (after-before)/1e6, resultA.get(0));

            before = System.nanoTime();
            List<Integer> resultB = createB(number);
            after = System.nanoTime();
            System.out.printf(Locale.ENGLISH,
                "B: size %8d time %8.2fms result %d\n",
                size, (after-before)/1e6, resultB.get(0));

            before = System.nanoTime();
            List<Integer> resultC = createC(number);
            after = System.nanoTime();
            System.out.printf(Locale.ENGLISH,
                "B: size %8d time %8.2fms result %d\n",
                size, (after-before)/1e6, resultC.get(0));
        }
    }


    private static BigInteger createRandomBigInteger(int size)
    {
        StringBuilder sb = new StringBuilder();
        Random random = new Random(0);
        for (int i=0; i<size; i++)
        {
            sb.append(String.valueOf(random.nextInt(10)));
        }
        return new BigInteger(sb.toString());
    }


    private static List<Integer> createA(BigInteger number)
    {
        ArrayList<Integer> list = new ArrayList<Integer>();
        while (number.compareTo(BigInteger.ZERO) == 1)
        {
            list.add((number.mod(BigInteger.TEN)).intValue());
            number = number.divide(BigInteger.TEN);
        }
        return list;
    }

    private static List<Integer> createB(BigInteger number)
    {
        ArrayList<Integer> list = new ArrayList<Integer>();
        while (number.compareTo(BigInteger.ZERO) == 1)
        {
            BigInteger next = number.divide(BigInteger.TEN);
            BigInteger diff = number.subtract(next.multiply(BigInteger.TEN));
            list.add(diff.intValue());
            number = next;
        }
        return list;
    }

    private static List<Integer> createC(BigInteger number)
    {
        String s = number.toString();
        ArrayList<Integer> list = new ArrayList<Integer>(s.length());
        for (int i=s.length()-1; i>=0; i--)
        {
            list.add(s.charAt(i) - '0');
        }
        return list;
    }
}

The output will be along the lines of this:

...
A: size     1000 time     9.20ms result 6
B: size     1000 time     6.44ms result 6
C: size     1000 time     1.96ms result 6
A: size    10000 time   452.44ms result 1
B: size    10000 time   334.82ms result 1
C: size    10000 time    16.29ms result 1
A: size   100000 time 43876.93ms result 7
B: size   100000 time 32334.84ms result 7
C: size   100000 time   297.92ms result 7

showing that the toString approach is more than a hundred times faster than the manual ones.

Solution 2:[2]

The second snippet should be faster for two reasons :

  1. Not having to reverse the ArrayList, as you recognized yourself.

  2. Removing the last element of an ArrayList is faster than removing the first element, since removing the first element involves decreasing the index of all the remaining elements (which is done via System.arraycopy).

Solution 3:[3]

Both snippets are slow. If you want to do it faster, make a list of the proper size, and fill it in from the back.

This answer shows how to obtain the length of the required ArrayList<Integer>. Now you can do this:

BigInteger number; // the number can be very big => BigInteger
ArrayList<Integer> res = new ArrayList<Integer>(
    // getDigitCount comes from the answer linked above
    Collections.nCopies(getDigitCount(number), 0)
);
int pos = res.size()-1;
while (number.compareTo(BigInteger.ZERO) == 1) {
    res.set(pos--, (number.mod(BigInteger.TEN)).intValue());
    number = number.divide(BigInteger.TEN);
}

This produces ArrayList in the desired order right away, so the reversing routines become completely unnecessary.

Note: Java-8 lets you do the conversion to an array of integer in one line:

BigInteger number = new BigInteger("12345678910111213141516171819");
List<Integer> res = number
    .toString()
    .chars()
    .mapToObj(c -> Character.digit(c, 10))
    .collect(Collectors.toList());

Demo.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Community
Solution 2 Eran
Solution 3 Community