Reactive Redis: only one thread is used in a WebFlux service
In a WebFlux server, the detect function first looks up two Redis hash values, then does a heavy CPU calculation using the values returned from Redis, then saves the result back to Redis and returns the response. The idea is like this:
@PostMapping("detect")
public Mono<RecognitionResponse> detect(@Valid @RequestBody ImageVo vo) {
    float[] vec = ai.calculateVec(vo.getImg());
    Mono<Matrix> aLib = redisOps.get("aLib", "field").switchIfEmpty(...a default matrix...);
    Mono<Matrix> bLib = redisOps.get("bLib", "field").switchIfEmpty(...a default matrix...);
    return aLib.zipWith(bLib)
            .flatMap(tuple -> {
                R r1 = ai.calculationSimilar(tuple.getT1(), vec);
                R r2 = ai.calculationSimilar(tuple.getT2(), vec);
                if (/* r1 or r2 indicates a new image */) {
                    return redisOps.put("aLib", "field", r1.getVec())
                            .map(b -> new RecognitionResponse(r1, r2));
                } else {
                    return Mono.just(new RecognitionResponse(r1, r2));
                }
            });
}
In a performance test, the above works fine on a 4-CPU server: about 380% CPU is used. On a 32-CPU server, only a small fraction of the CPU is used, and top shows a single lettuce-epoll thread with high CPU consumption, around 90%. I then checked the logs: only one lettuce-epoll thread is used for the whole redisOps.get()-to-return part, which includes the heavy CPU calculation. (I printed the thread for every ai function call, and they all show the same lettuce-epoll thread number.)
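For reference, this is roughly how I printed the thread for the ai calls (a sketch; the println statements are only for illustration):

Mono<RecognitionResponse> response = aLib.zipWith(bLib)
        // the zipped value is emitted by the lettuce event loop, so everything
        // inside the flatMap below runs on that same lettuce-epollEventLoop-* thread
        .doOnNext(t -> System.out.println("zip emitted on " + Thread.currentThread().getName()))
        .flatMap(tuple -> {
            System.out.println("calculationSimilar runs on " + Thread.currentThread().getName());
            R r1 = ai.calculationSimilar(tuple.getT1(), vec);
            R r2 = ai.calculationSimilar(tuple.getT2(), vec);
            return Mono.just(new RecognitionResponse(r1, r2));
        });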
I think querying Redis first and then calling a handler on the returned data is an ordinary usage pattern for WebFlux. But if only one lettuce-epoll thread is used, performance will be very bad. I am starting to think that the above is a misuse of WebFlux reactive Redis.
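If the pattern itself is normal and the problem is only where the continuation runs, a minimal sketch of what I could try is to move the CPU-heavy part off the lettuce event loop with publishOn (assuming Reactor's Schedulers.parallel() is a reasonable choice for CPU-bound work; the new-image branch is omitted here):

import reactor.core.scheduler.Schedulers;

return aLib.zipWith(bLib)
        // publishOn switches the downstream flatMap from the lettuce
        // epoll event loop onto Reactor's parallel scheduler
        .publishOn(Schedulers.parallel())
        .flatMap(tuple -> {
            R r1 = ai.calculationSimilar(tuple.getT1(), vec);
            R r2 = ai.calculationSimilar(tuple.getT2(), vec);
            return Mono.just(new RecognitionResponse(r1, r2));
        });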
Supplement
In a test, thread-level top shows that the thread consuming the most CPU is thread 4807:
top - 11:33:00 up 486 days, 9 min, 2 users, load average: 1.55, 1.62, 1.17
Threads: 326 total, 10 running, 316 sleeping, 0 stopped, 0 zombie
%Cpu(s): 8.2 us, 1.1 sy, 0.0 ni, 90.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 13183149+total, 63810604 free, 35841708 used, 32179184 buff/cache
KiB Swap: 20971516 total, 20971516 free, 0 used. 89526880 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4807 tina 20 0 78.933g 0.013t 44980 R 95.7 11.0 8:13.02 java
14946 tina 20 0 78.933g 0.013t 44980 R 6.0 11.0 0:35.34 java
14733 tina 20 0 78.933g 0.013t 44980 S 5.3 11.0 0:28.25 java
Thread 4807 (0x12c7 in hex) is lettuce-epollEventLoop-5-12:
"lettuce-epollEventLoop-5-12" #197 daemon prio=5 os_prio=0 tid=0x00007f9554002800 nid=0x12c7 runnable [0x00007f911a4ef000]
java.lang.Thread.State: RUNNABLE
at java.lang.StringBuffer.append(StringBuffer.java:388)
- locked <0x000000009b0a79d0> (a java.lang.StringBuffer)
at java.nio.ByteBuffer.toString(ByteBuffer.java:1088)
at java.lang.String.valueOf(String.java:2994)
at java.lang.StringBuilder.append(StringBuilder.java:131)
at java.util.AbstractMap.toString(AbstractMap.java:557)
at java.lang.String.valueOf(String.java:2994)
at java.lang.StringBuilder.append(StringBuilder.java:131)
at io.lettuce.core.output.CommandOutput.toString(CommandOutput.java:151)
at java.lang.String.valueOf(String.java:2994)
at java.lang.StringBuilder.append(StringBuilder.java:131)
at io.lettuce.core.protocol.CommandWrapper.toString(CommandWrapper.java:219)
at org.slf4j.helpers.MessageFormatter.safeObjectAppend(MessageFormatter.java:277)
at org.slf4j.helpers.MessageFormatter.deeplyAppendParameter(MessageFormatter.java:249)
at org.slf4j.helpers.MessageFormatter.arrayFormat(MessageFormatter.java:211)
at org.slf4j.helpers.MessageFormatter.arrayFormat(MessageFormatter.java:161)
at org.slf4j.helpers.MessageFormatter.format(MessageFormatter.java:151)
at io.netty.util.internal.logging.LocationAwareSlf4JLogger.debug(LocationAwareSlf4JLogger.java:115)
at io.lettuce.core.protocol.RedisStateMachine.decode(RedisStateMachine.java:145)
at io.lettuce.core.protocol.CommandHandler.decode(CommandHandler.java:742)
at io.lettuce.core.protocol.CommandHandler.decode0(CommandHandler.java:706)
at io.lettuce.core.protocol.CommandHandler.decode(CommandHandler.java:701)
In the jvisualvm results for self time and thread CPU time, io.netty.util.concurrent.SingleThreadEventExecutor.takeTask() is the highest, followed by io.netty.channel.epoll.Native.epollWait.
(jvisualvm screenshots: self time and thread CPU time)
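One more thing I notice in the stack trace above: the epoll thread is inside an SLF4J debug() call made from RedisStateMachine.decode, which builds toString() output for every decoded command. If DEBUG logging on io.lettuce.core.protocol is part of the cost, this is a sketch of how I could raise that logger's level (assuming Logback is the SLF4J backend; with Spring Boot, logging.level.io.lettuce.core.protocol=INFO in application.properties would have the same effect):

import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.Logger;
import org.slf4j.LoggerFactory;

public class LettuceLogTuning {
    public static void quietLettuceProtocol() {
        // raise io.lettuce.core.protocol above DEBUG so RedisStateMachine.decode
        // no longer formats command/output toString() on the epoll event loop
        Logger protocolLogger = (Logger) LoggerFactory.getLogger("io.lettuce.core.protocol");
        protocolLogger.setLevel(Level.INFO);
    }
}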
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow