Reactive Redis: only one thread is used in a WebFlux service

In a WebFlux server, the detect function first looks up two Redis hash values, then does a heavy CPU calculation on the returned values, saves the result back to Redis, and returns the response. The idea is like this:

@PostMapping("detect")
public Mono<RecognitionResponse> detect(@Valid @RequestBody ImageVo vo) {
    float[] vec = ai.calculateVec(vo.getImg());
    // DEFAULT_MATRIX stands in for the elided "...a default matrix..."
    Mono<Matrix> aLib = redisOps.get("aLib", "field").switchIfEmpty(Mono.just(DEFAULT_MATRIX));
    Mono<Matrix> bLib = redisOps.get("bLib", "field").switchIfEmpty(Mono.just(DEFAULT_MATRIX));
    return aLib.zipWith(bLib)
               .flatMap(tuple -> {
                   R r1 = ai.calculationSimilar(tuple.getT1(), vec);
                   R r2 = ai.calculationSimilar(tuple.getT2(), vec);
                   if (r1.isNewImage() || r2.isNewImage()) {  // "r1 or r2 is a new image"
                       return redisOps.put("aLib", "field", r1.getVec())
                                      .map(b -> new RecognitionResponse(r1, r2));
                   } else {
                       return Mono.just(new RecognitionResponse(r1, r2));
                   }
               });
}

In a performance test, the above works fine on a 4-CPU server: around 380% CPU is used. On a 32-CPU server, however, only a small part of the CPU is used, and top shows a single lettuce-epoll thread with high CPU consumption, around 90%. I then checked the logs: only one lettuce-epoll thread is used for the whole redisOps.get()-to-return path, which includes the heavy CPU calculation. (I printed the thread name in every ai function, and they all show the same lettuce-epoll thread.)
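
For reference, this is roughly how I checked which thread runs the ai functions (the log line is illustrative, and AiService is a stand-in name, not my real class):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

private static final Logger log = LoggerFactory.getLogger(AiService.class);

public R calculationSimilar(Matrix lib, float[] vec) {
    // prints the same thread name for every request, e.g. lettuce-epollEventLoop-5-12
    log.info("calculationSimilar on {}", Thread.currentThread().getName());
    // ... heavy CPU similarity calculation ...
}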

Querying Redis first and then running a handler on the returned data seems like ordinary WebFlux usage. If only one lettuce-epoll thread is used for all of it, performance will be very bad. I am starting to think that the above is a misuse of WebFlux reactive Redis.
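
If this really is a misuse, my understanding is that the CPU-bound part must be moved off the event loop explicitly. A minimal sketch of what I mean, assuming Schedulers.parallel() is the appropriate scheduler for CPU-bound work:

import reactor.core.scheduler.Schedulers;

return aLib.zipWith(bLib)
           // publishOn moves everything downstream off the lettuce event loop
           .publishOn(Schedulers.parallel())
           .flatMap(tuple -> {
               R r1 = ai.calculationSimilar(tuple.getT1(), vec);
               R r2 = ai.calculationSimilar(tuple.getT2(), vec);
               // ... same branching on r1/r2 as above ...
           });

Is something like this expected, or should the reactive pipeline spread this work across threads by itself?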

Supplement: in a test, thread-level top shows that the thread eating the most CPU is thread 4807:

top - 11:33:00 up 486 days, 9 min,  2 users,  load average: 1.55, 1.62, 1.17
Threads: 326 total,  10 running, 316 sleeping,   0 stopped,   0 zombie
%Cpu(s):  8.2 us,  1.1 sy,  0.0 ni, 90.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 13183149+total, 63810604 free, 35841708 used, 32179184 buff/cache
KiB Swap: 20971516 total, 20971516 free,        0 used. 89526880 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 4807 tina  20   0 78.933g 0.013t  44980 R 95.7 11.0   8:13.02 java
14946 tina  20   0 78.933g 0.013t  44980 R  6.0 11.0   0:35.34 java
14733 tina  20   0 78.933g 0.013t  44980 S  5.3 11.0   0:28.25 java

And thread 4807, which is 0x12c7 in hex, is lettuce-epollEventLoop-5-12 in the thread dump:

"lettuce-epollEventLoop-5-12" #197 daemon prio=5 os_prio=0 tid=0x00007f9554002800 nid=0x12c7 runnable [0x00007f911a4ef000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.StringBuffer.append(StringBuffer.java:388)
        - locked <0x000000009b0a79d0> (a java.lang.StringBuffer)
        at java.nio.ByteBuffer.toString(ByteBuffer.java:1088)
        at java.lang.String.valueOf(String.java:2994)
        at java.lang.StringBuilder.append(StringBuilder.java:131)
        at java.util.AbstractMap.toString(AbstractMap.java:557)
        at java.lang.String.valueOf(String.java:2994)
        at java.lang.StringBuilder.append(StringBuilder.java:131)
        at io.lettuce.core.output.CommandOutput.toString(CommandOutput.java:151)
        at java.lang.String.valueOf(String.java:2994)
        at java.lang.StringBuilder.append(StringBuilder.java:131)
        at io.lettuce.core.protocol.CommandWrapper.toString(CommandWrapper.java:219)
        at org.slf4j.helpers.MessageFormatter.safeObjectAppend(MessageFormatter.java:277)
        at org.slf4j.helpers.MessageFormatter.deeplyAppendParameter(MessageFormatter.java:249)
        at org.slf4j.helpers.MessageFormatter.arrayFormat(MessageFormatter.java:211)
        at org.slf4j.helpers.MessageFormatter.arrayFormat(MessageFormatter.java:161)
        at org.slf4j.helpers.MessageFormatter.format(MessageFormatter.java:151)
        at io.netty.util.internal.logging.LocationAwareSlf4JLogger.debug(LocationAwareSlf4JLogger.java:115)
        at io.lettuce.core.protocol.RedisStateMachine.decode(RedisStateMachine.java:145)
        at io.lettuce.core.protocol.CommandHandler.decode(CommandHandler.java:742)
        at io.lettuce.core.protocol.CommandHandler.decode0(CommandHandler.java:706)
        at io.lettuce.core.protocol.CommandHandler.decode(CommandHandler.java:701)
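
One thing I notice while writing this up: nearly every frame above decode() in that stack is StringBuffer/StringBuilder appending and slf4j MessageFormatter work, so the event loop seems to be spending its time rendering DEBUG log messages for RedisStateMachine rather than decoding responses. If I read that correctly, raising the log level for the Lettuce protocol package should remove that cost, e.g. via Spring Boot's logging properties (logger name taken from the trace):

logging.level.io.lettuce.core.protocol=INFO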

In the jvisualvm results for self time and thread CPU time, io.netty.util.concurrent.SingleThreadEventExecutor.takeTask() is the highest, followed by io.netty.channel.epoll.Native.epollWait:

self time:

[screenshot: jvisualvm self time]

thread cpu time:

[screenshot: jvisualvm thread cpu time]
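
My reading of those two hotspots is that SingleThreadEventExecutor.takeTask() and Native.epollWait() are just where event-loop threads block waiting for work, which is consistent with most of the 32 cores sitting idle while a single lettuce thread is saturated.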

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
