Yanyg - SAN Software Engineer



1 Introduction



Readahead is a technique employed by the kernel in an attempt to improve file reading performance. If the kernel has reason to believe that a particular file is being read sequentially, it will attempt to read blocks from the file into memory before the application requests them. When readahead works, it improves the system's throughput, since the reading application does not have to wait for its requests. When readahead fails, however, it generates useless I/O and occupies memory pages that are needed for other purposes.
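The trade-off described above can be made concrete with a small toy model. The sketch below is not the Linux algorithm: the class name `ToyReadahead`, the window sizes, and the doubling rule are all assumptions chosen for illustration. It treats two consecutive block numbers as evidence of sequential access, prefetches a growing window of blocks, and records which prefetched blocks were never read.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ToyReadahead:
    """Toy model of readahead (hypothetical, not the kernel's real heuristics)."""
    initial_window: int = 4   # assumed: blocks prefetched on the first sequential hit
    max_window: int = 32      # assumed: upper bound on the prefetch window
    cache: Set[int] = field(default_factory=set)
    unused_prefetch: Set[int] = field(default_factory=set)
    last_block: Optional[int] = None
    window: int = 0
    hits: int = 0
    misses: int = 0

    def read(self, block: int) -> bool:
        """Read one block; return True if it was already in the cache."""
        hit = block in self.cache
        if hit:
            self.hits += 1
        else:
            self.misses += 1          # the application would wait for I/O here
            self.cache.add(block)
        self.unused_prefetch.discard(block)

        sequential = self.last_block is not None and block == self.last_block + 1
        if sequential:
            # Grow the window while the sequential pattern continues.
            if self.window:
                self.window = min(self.max_window, self.window * 2)
            else:
                self.window = self.initial_window
            for b in range(block + 1, block + 1 + self.window):
                if b not in self.cache:
                    self.cache.add(b)
                    self.unused_prefetch.add(b)
        else:
            self.window = 0           # pattern broken: stop prefetching
        self.last_block = block
        return hit


# Sequential scan: after the first two reads, every block is already cached.
seq = ToyReadahead()
for b in range(100):
    seq.read(b)

# Short sequential burst followed by a jump: the prefetched blocks are wasted.
burst = ToyReadahead()
for b in (0, 1, 500):
    burst.read(b)
```

In this model the sequential scan waits for I/O only on its first two reads, while the short burst leaves four prefetched blocks in the cache that are never used, which is the "useless I/O" case.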

Readahead is the process of speculatively reading file data into the page cache in the hope that it will be useful to an application in the near future. When readahead works well, it can significantly improve the performance of I/O-bound applications by avoiding the need for those applications to wait for data and by increasing I/O transfer size. On the other hand, readahead risks making performance worse as well: if it guesses wrong, scarce memory and I/O bandwidth will be wasted on data that will never be used. So, as is the case with memory management in general, readahead algorithms are both performance-critical and heavily based on heuristics.

"Readahead" is the act of speculatively reading a portion of a file's contents into memory in the expectation that a process working with that file will soon want that data. When readahead works well, a data-consuming process will find that the information it needs is available to it when it asks, and that waiting for disk I/O is not necessary. The Linux kernel has done readahead for a long time, but that does not mean that it cannot be done better. To that end, Fengguang Wu has been working on a set of "adaptive readahead" patches for a couple of years.
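Because readahead is a guess, an application that already knows its access pattern can tell the kernel instead of leaving it to the heuristics. The sketch below uses `posix_fadvise`, exposed in Python as `os.posix_fadvise` on Unix systems; the helper name `read_sequentially` and the 1 MiB chunk size are assumptions for illustration. The call is advisory only: the kernel is free to ignore it, and the exact effect on the readahead window depends on the platform.

```python
import os


def read_sequentially(path: str, chunk_size: int = 1 << 20) -> int:
    """Read a file front to back after hinting sequential access.

    Returns the number of bytes read. Hypothetical helper for illustration.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0 and len=0 apply the advice to the whole file.
        # The call is only available on Unix, so guard it.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
    finally:
        os.close(fd)
    return total
```

The opposite hint, `os.POSIX_FADV_RANDOM`, suits workloads where speculative reads would mostly waste memory and bandwidth.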


3 References

512K readahead size with thrashing safe readahead
Improving readahead
Adaptive file readahead
Huge pages in the ext4 filesystem
zswap: compressed swap caching