How to disable the page cache in the Linux kernel?

How can I bypass the page cache so that an application reads and writes data directly from and to disk? How is this set up in the kernel?



Solution 1:[1]

The application needs to open the file with the O_DIRECT flag. See the man page: http://man7.org/linux/man-pages/man2/open.2.html

This tells the kernel to bypass the page cache when doing I/O on that file.

O_DIRECT (since Linux 2.4.10) Try to minimize cache effects of the I/O to and from this file. In general this will degrade performance, but it is useful in special situations, such as when applications do their own caching. File I/O is done directly to/from user-space buffers. The O_DIRECT flag on its own makes an effort to transfer data synchronously, but does not give the guarantees of the O_SYNC flag that data and necessary metadata are transferred. To guarantee synchronous I/O, O_SYNC must be used in addition to O_DIRECT. See NOTES below for further discussion.

          A semantically similar (but deprecated) interface for block
          devices is described in raw(8).
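As a rough illustration of the above, here is a minimal sketch of writing one block with O_DIRECT | O_SYNC. The helper names (direct_round_up, direct_alloc, direct_write_block) and the 4096-byte alignment are assumptions made for this example, not part of any API. O_DIRECT generally requires the buffer address, the transfer length and the file offset to be aligned, and the exact requirement depends on the filesystem and device, so real code should query it instead of hardcoding a value.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on glibc */
#endif
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed alignment for this sketch; the real requirement is device-specific. */
#define DIRECT_ALIGN 4096

/* Round n up to the next multiple of DIRECT_ALIGN. */
size_t direct_round_up(size_t n)
{
    return (n + DIRECT_ALIGN - 1) / DIRECT_ALIGN * DIRECT_ALIGN;
}

/* Allocate a zeroed buffer whose address and size are both aligned. */
void *direct_alloc(size_t len)
{
    void *buf = NULL;
    size_t padded = direct_round_up(len);

    if (posix_memalign(&buf, DIRECT_ALIGN, padded) != 0)
        return NULL;
    memset(buf, 0, padded);
    return buf;
}

/* Write data (zero-padded to a full block) to path, bypassing the page
 * cache where the filesystem supports it. Returns 0 on success, -1 on error. */
int direct_write_block(const char *path, const void *data, size_t len)
{
    size_t padded = direct_round_up(len);
    void *buf = direct_alloc(len);
    int fd, rc;
    ssize_t n;

    if (buf == NULL)
        return -1;
    memcpy(buf, data, len);

    /* O_SYNC is added because O_DIRECT alone does not guarantee durability. */
    fd = open(path, O_WRONLY | O_CREAT | O_DIRECT | O_SYNC, 0644);
    if (fd < 0) {
        free(buf);
        return -1;
    }
    n = write(fd, buf, padded);
    rc = (n == (ssize_t)padded) ? 0 : -1;
    close(fd);
    free(buf);
    return rc;
}
```

A call such as direct_write_block("/path/to/file", "hello", 5) writes one padded 4096-byte block. Some filesystems (tmpfs, for example) reject O_DIRECT, in which case open() fails with EINVAL.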

Solution 2:[2]

UPDATE

The write caches in the specs below are not related to the page cache. "Cache" here refers to RAM/NVRAM integrated into disk controllers; such memory should not be confused with the page cache!




AFAIK, the specs only guarantee a write cache enable/disable switch for SATA and NVMe devices:

SATA

Refer to the SATA 3.0 spec:

SET FEATURES (Write Cache Enable/Disable): The write cache enable/disable setting established by the SET FEATURES command with subcommand code of 02h or 82h.

In the Linux kernel, the HDIO_SET_WCACHE ioctl controls it:

static DEFINE_MUTEX(ide_disk_ioctl_mutex);
static const struct ide_ioctl_devset ide_disk_ioctl_settings[] = {
{ HDIO_GET_ADDRESS, HDIO_SET_ADDRESS,   &ide_devset_address   },
{ HDIO_GET_MULTCOUNT,   HDIO_SET_MULTCOUNT, &ide_devset_multcount },
{ HDIO_GET_NOWERR,  HDIO_SET_NOWERR,    &ide_devset_nowerr    },
{ HDIO_GET_WCACHE,  HDIO_SET_WCACHE,    &ide_devset_wcache    },
{ HDIO_GET_ACOUSTIC,    HDIO_SET_ACOUSTIC,  &ide_devset_acoustic  },
{ 0 }
};

int ide_disk_ioctl(ide_drive_t *drive, struct block_device *bdev, fmode_t mode,
           unsigned int cmd, unsigned long arg)
{
    int err;

    mutex_lock(&ide_disk_ioctl_mutex);
    err = ide_setting_ioctl(drive, bdev, cmd, arg, ide_disk_ioctl_settings);
    if (err != -EOPNOTSUPP)
        goto out;

    err = generic_ide_ioctl(drive, bdev, cmd, arg);
out:
    mutex_unlock(&ide_disk_ioctl_mutex);
    return err;
}
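From user space, the same ioctl pair can be called on an open block device. The following is only a sketch: get_write_cache and set_write_cache are hypothetical wrapper names, and whether the ioctls succeed depends on the driver behind the device. The kernel code above is the legacy IDE driver, and other drivers may return an error instead.

```c
#include <errno.h>
#include <linux/hdreg.h>   /* HDIO_GET_WCACHE, HDIO_SET_WCACHE */
#include <sys/ioctl.h>

/* Query the drive write cache: 1 = enabled, 0 = disabled, -1 = error.
 * The kernel stores the current setting into a long passed by pointer. */
int get_write_cache(int fd)
{
    long state = 0;

    if (ioctl(fd, HDIO_GET_WCACHE, &state) != 0)
        return -1;
    return state ? 1 : 0;
}

/* Enable (nonzero) or disable (0) the drive write cache.
 * The new value is passed directly as the ioctl argument, not by pointer.
 * Returns 0 on success, -1 on error with errno set. */
int set_write_cache(int fd, int enable)
{
    return ioctl(fd, HDIO_SET_WCACHE, (unsigned long)(enable ? 1 : 0));
}
```

The fd would come from something like open("/dev/sdX", O_RDONLY | O_NONBLOCK), and changing the setting normally requires root privileges.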

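You can also use hdparm -W0 /dev/sdx or hdparm -W1 /dev/sdx to conveniently disable or enable the write cache. hdparm also invokes HDIO_SET_WCACHE internally, and falls back to issuing the SET FEATURES command itself if the ioctl fails: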

}
        if (!wcache)
            err = flush_wcache(fd);
        if (ioctl(fd, HDIO_SET_WCACHE, wcache)) {
            __u8 setcache[4] = {ATA_OP_SETFEATURES,0,0,0};
            setcache[2] = wcache ? 0x02 : 0x82;
            if (do_drive_cmd(fd, setcache, 0)) {
                err = errno;
                perror(" HDIO_DRIVE_CMD(setcache) failed");
            }
        }

NVMe

Kernel source:

static ssize_t queue_wc_show(struct request_queue *q, char *page)
{
    if (test_bit(QUEUE_FLAG_WC, &q->queue_flags))
        return sprintf(page, "write back\n");

    return sprintf(page, "write through\n");
}

static ssize_t queue_wc_store(struct request_queue *q, const char *page,
                  size_t count)
{
    int set = -1;

    if (!strncmp(page, "write back", 10))
        set = 1;
    else if (!strncmp(page, "write through", 13) ||
         !strncmp(page, "none", 4))
        set = 0;

    if (set == -1)
        return -EINVAL;

    if (set)
        blk_queue_flag_set(QUEUE_FLAG_WC, q);
    else
        blk_queue_flag_clear(QUEUE_FLAG_WC, q);

    return count;
}
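These two handlers back the write_cache attribute under /sys/block/<dev>/queue/. As a hedged sketch, the current mode can be read from user space as shown below. parse_write_cache_mode and read_write_cache_mode are hypothetical helper names, and the parser simply mirrors the prefix matching done in queue_wc_store. As far as I know, writing to this attribute changes the kernel's view of the device (whether it needs to issue cache flushes) and does not by itself reconfigure the device's cache.

```c
#include <stdio.h>
#include <string.h>

/* Same prefix matching as queue_wc_store:
 * 1 = write back, 0 = write through / none, -1 = unrecognized. */
int parse_write_cache_mode(const char *s)
{
    if (strncmp(s, "write back", 10) == 0)
        return 1;
    if (strncmp(s, "write through", 13) == 0 || strncmp(s, "none", 4) == 0)
        return 0;
    return -1;
}

/* Read /sys/block/<dev>/queue/write_cache for a device such as "nvme0n1".
 * Returns the parsed mode, or -1 if the file cannot be read. */
int read_write_cache_mode(const char *dev)
{
    char path[256];
    char line[32];
    FILE *f;
    int mode;

    snprintf(path, sizeof path, "/sys/block/%s/queue/write_cache", dev);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    mode = fgets(line, sizeof line, f) ? parse_write_cache_mode(line) : -1;
    fclose(f);
    return mode;
}
```

For example, read_write_cache_mode("nvme0n1") would return 1 on a queue that reports "write back". The device name here is only a placeholder.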

NVMe spec: [screenshot from the original answer, image not available]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Chaitanya Lala
Solution 2