In the Linux kernel, the following vulnerability has been resolved:
fs/erofs/fileio: call erofs_onlinefolio_split() after bio_add_folio()
If bio_add_folio() fails (because it is full),
erofs_fileio_scan_folio() needs to submit the I/O request via
erofs_fileio_rq_submit() and allocate a new I/O request with an empty
struct bio. Then it retries the bio_add_folio() call.
However, at this point, erofs_onlinefolio_split() has already been
called which increments folio->private; the retry will call
erofs_onlinefolio_split() again, but there will never be a matching
erofs_onlinefolio_end() call. This leaves the folio locked forever
and all waiters will be stuck in folio_wait_bit_common().
This bug has been added by commit ce63cb62d794 ("erofs: support
unencoded inodes for fileio"), but was practically unreachable because
there was room for 256 folios in the struct bio - until commit
9f74ae8c9ac9 ("erofs: shorten bvecs[] for file-backed mounts") which
reduced the array capacity to 16 folios.
It was now trivial to trigger the bug by manually invoking readahead from userspace, e.g.:
posix_fadvise(fd, 0, st.st_size, POSIX_FADV_WILLNEED);
This should be fixed by invoking erofs_onlinefolio_split() only after bio_add_folio() has succeeded. This is safe: asynchronous completions invoking erofs_onlinefolio_end() will not unlock the folio because erofs_fileio_scan_folio() is still holding a reference to be released by erofs_onlinefolio_end() at the end.
| Software | From | Fixed in |
|---|---|---|
| linux / linux_kernel | 6.12 | 6.12.29 |
| linux / linux_kernel | 6.13 | 6.14.7 |
| linux / linux_kernel | 6.15-rc1 | 6.15-rc1.x |
| linux / linux_kernel | 6.15-rc2 | 6.15-rc2.x |
| linux / linux_kernel | 6.15-rc3 | 6.15-rc3.x |
| linux / linux_kernel | 6.15-rc4 | 6.15-rc4.x |
| linux / linux_kernel | 6.15-rc5 | 6.15-rc5.x |