Why is my ExecutorService getting stuck when it's close to completion every single time?

To put it simply, I have a List<String> userIds whose size can be anywhere from 10k to 100k. These are user_ids, and my core logic is that for each user_id I call two different REST API endpoints, get their payloads, and then run a verification operation across the payloads.

I have used an executor service for faster processing in the following way: I split the records into batches and predefined 100 threads, so that one thread processes (int)(userIds.size()/100) users.

public void execute(List<String> userIds) {
    int numberOfThreads = 100;
    int WINDOW = (int) Math.floor(userIds.size() / numberOfThreads);

    int st = 0;
    int end = WINDOW;

    for (; st < userIds.size();) {
        // lambdas can only capture effectively final variables, hence the copies
        int realStart = st;
        int realEnd = end;

        executor.execute(
            () -> {
                verify(realStart, realEnd, userIds); // this method has logic to verify users for the given start and end indexes from the userIds list
            }
        );

        st = end + 1;
        if (end + WINDOW >= userIds.size()) {
            end = userIds.size() - 1;
        } else {
            end = st + WINDOW;
        } // code to update the indexes realStart and realEnd for next 'n' batches
    }
    executor.shutdown();
    try {
        System.out.println("Done");
        executor.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
        System.out.println("Done on a rope");
    } catch (Exception e) {
        e.printStackTrace();
    }
}

Let's say for 10k records: when it is nearing completion, the code gets stuck, which makes me think a few threads are stuck inside the ExecutorService. If so, how do I work around it, or is there an obvious flaw in what I'm doing here? Any help is appreciated :)



Solution 1:[1]

Assuming that 100 threads is OK and your system can handle it, the issue may be within the verify() method. Let's say you have 10k users, so each thread processes 100 users at a time. Possible causes:

  1. The logic of the verify() method cannot handle a batch of that size.
  2. The user IDs are not unique within the userIds list, so several threads try to process the same user ID concurrently and may deadlock over some resource related to that user ID.
  3. Some other issue with threads blocking each other; again, that depends on your application logic.
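To rule out point 2, one option is to remove duplicate IDs before splitting the list into batches. This is only a minimal sketch: the class name UserIdDeduplicator and the method distinct are hypothetical and not part of the original code.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper, not from the original post.
class UserIdDeduplicator {
    // Returns a new list with duplicates removed, keeping first-seen order.
    static List<String> distinct(List<String> userIds) {
        return new ArrayList<>(new LinkedHashSet<>(userIds));
    }
}
```

Calling execute(UserIdDeduplicator.distinct(userIds)) would then guarantee that no two batches contain the same ID. If the hang persists, duplicates were not the cause.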

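Note: executor.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS); only blocks until the executor has finished, the timeout elapses, or the waiting thread is interrupted, and then reports whether the executor terminated. It does not force the executor to terminate once the time has elapsed, and with Long.MAX_VALUE as the timeout it effectively waits forever if even one task never returns. According to Oracle's documentation, the correct way to finish would be: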

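executor.shutdown(); // stop accepting new tasks
try {
    // wait a bounded time for running tasks to finish
    if (!executor.awaitTermination(800, TimeUnit.MILLISECONDS)) {
        executor.shutdownNow(); // interrupt tasks that are still running
    }
} catch (InterruptedException e) {
    executor.shutdownNow();
    Thread.currentThread().interrupt(); // preserve the interrupt status
}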
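To see the pattern in isolation, here is a small self-contained sketch. The class ShutdownDemo, its method names, and the sleeping task are all illustrative assumptions; the sleep merely stands in for a verify() call that never returns.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not from the original post.
class ShutdownDemo {
    // Returns true if the pool terminated, either on its own or after shutdownNow().
    static boolean shutdownAndAwait(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // stop accepting new tasks
        try {
            if (!pool.awaitTermination(timeout, unit)) {
                pool.shutdownNow(); // interrupt tasks that are still running
                return pool.awaitTermination(timeout, unit);
            }
            return true;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    // Submits one task that would block for a minute, then shuts the pool down.
    static boolean demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> {
            try {
                Thread.sleep(60_000); // stands in for a call that never returns
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // exit when interrupted
            }
        });
        return shutdownAndAwait(pool, 200, TimeUnit.MILLISECONDS);
    }
}
```

Keep in mind that shutdownNow() typically only interrupts the worker threads. A task blocked in I/O that does not respond to interruption may still never finish, so if verify() makes REST calls it is probably worth configuring connect and read timeouts on whatever HTTP client it uses.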

Another note: you can use while (st < userIds.size()) instead of the half-empty for (; st < userIds.size();).
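As a rough illustration of the while form, the batching could be written with half-open ranges so that no index is skipped or handled twice. The class Batches and the method ranges are hypothetical names, and this sketch assumes verify() could be adapted to treat the end index as exclusive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the original post.
class Batches {
    // Splits [0, size) into consecutive {start, end} pairs, end exclusive,
    // each covering at most batchSize elements.
    static List<int[]> ranges(int size, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<int[]> out = new ArrayList<>();
        int st = 0;
        while (st < size) {
            int end = Math.min(st + batchSize, size); // clamp the last batch
            out.add(new int[] {st, end});
            st = end; // next batch starts where this one stopped
        }
        return out;
    }
}
```

Each pair could then be passed to executor.execute(...) in place of the manually updated st and end variables.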

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Sve Kamenska