Effect of sched_rr_timeslice_ms on a program's performance

Consider the following code, which sets the scheduling policy to SCHED_RR and then runs a long dummy loop.

#include <unistd.h>
#include <sched.h>
#include <stdio.h>

int main(void)
{
    pid_t pid = getpid();
    struct sched_param sp = { .sched_priority = 99 };
    if (sched_setscheduler(pid, SCHED_RR, &sp) == -1) {
        perror("sched_setscheduler");   /* needs root or CAP_SYS_NICE */
        return 1;
    }
    switch (sched_getscheduler(pid)) {
        case SCHED_OTHER: printf("SCHED_OTHER\n"); break;
        case SCHED_RR:    printf("SCHED_RR\n");    break;
        case SCHED_FIFO:  printf("SCHED_FIFO\n");  break;
        default:          printf("Unknown...\n");
    }
    unsigned long long sum = 0;
    for (unsigned long long i = 0; i < 30000000000ULL; i++)
        sum += i;
    printf("%llu\n", sum);
    return 0;
}

I have tested the code with two kernel.sched_rr_timeslice_ms values, 1 and 1000. The perf result shows:

$ sudo sysctl kernel.sched_rr_timeslice_ms=1
kernel.sched_rr_timeslice_ms = 1
$ perf stat -a -e instructions,cycles,context-switches,cpu-migrations -- sudo ./test
SCHED_RR
7278142215970761216

 Performance counter stats for 'system wide':

   120,100,665,611      instructions              #    3.98  insn per cycle
    30,160,659,660      cycles
             1,717      context-switches
                29      cpu-migrations

       7.810369637 seconds time elapsed

$ sudo sysctl kernel.sched_rr_timeslice_ms=1000
kernel.sched_rr_timeslice_ms = 1000
$ perf stat -a -e instructions,cycles,context-switches,cpu-migrations -- sudo ./test
SCHED_RR
7278142215970761216

 Performance counter stats for 'system wide':

   120,094,568,266      instructions              #    3.98  insn per cycle
    30,151,055,338      cycles
             1,724      context-switches
                23      cpu-migrations

       7.726291605 seconds time elapsed
$ uname -r
5.13.0-27-generic
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.3 LTS
Release:        20.04
Codename:       focal

Despite the large change in the RR time slice, the run time, context-switch count, and the other counters are essentially the same.

What effect does modifying that kernel parameter actually have, then? I would expect a larger RR time slice to result in fewer context switches, at least.



Solution 1:[1]

I think it doesn't count as a context-switch when the kernel chooses to return back to the same user-space task that was interrupted (by a timer interrupt or whatever at the end of its timeslice). That's a surprisingly high number of context-switches, I think.

Anyway, realtime scheduling isn't about optimizing for overall throughput, which is what you're measuring (the time to finish this long-running loop).

The important factor here is what happens if you run more threads or processes (tasks) than you have physical cores: with a 1000ms round-robin timeslice, once all your cores were occupied with those tasks, nothing else in user-space would get a timeslice for a whole second, not even your X server or terminal emulator. Only kernel interrupt handlers.

Real-time systems are about latency guarantees, and how long a task can monopolize a core without being switched away from is an important scheduler consideration. (I think most tasks you'd want to run with SCHED_RR would not be purely doing compute work, but likely doing frequent I/O as well, or at least interacting with other processes via shared memory or signals.)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Peter Cordes