Mon, 03 Apr 2006
AIO sendfile.
Let's briefly describe how generic reading is done in the Linux kernel.
When a chunk of file data is not in the VFS cache, it is requested through the block layer.
The bio->bi_end_io callback, generally mpage_end_io_read(),
is called in hard IRQ context; it only marks the pages as uptodate or in error and unlocks them.
Generally cold pages from the VFS cache, i.e. pages which were inserted recently but not yet used,
are used for this technique.
Network processing code must wait until those pages are marked as uptodate before it can
start using them.
AIO sendfile's state machine is invoked in the bio->bi_end_io callback, where
it basically does the same as mpage_end_io_read() and additionally schedules reading
of the next chunk of file data.
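The two completion callbacks described above can be sketched in userspace C. This is only a rough model under stated assumptions: struct sim_page, struct sim_bio, struct sim_aio_req and all sim_* functions are hypothetical stand-ins invented for illustration, not the kernel's struct page, struct bio or the real AIO sendfile code, and "scheduling the next chunk" is reduced to bumping a counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page cache page (not the kernel struct page). */
struct sim_page {
    bool locked;    /* I/O in flight, readers have to wait */
    bool uptodate;  /* contents are valid */
    bool error;     /* the read failed */
};

/* Hypothetical stand-in for a bio that covers several pages. */
struct sim_bio {
    struct sim_page **pages;
    size_t nr_pages;
    void (*end_io)(struct sim_bio *bio, int err);
    void *priv;     /* per-request state, used by the AIO variant */
};

/* Models the plain read completion: mark pages and unlock them, nothing else. */
static void sim_end_io_read(struct sim_bio *bio, int err)
{
    for (size_t i = 0; i < bio->nr_pages; i++) {
        struct sim_page *p = bio->pages[i];

        if (err)
            p->error = true;
        else
            p->uptodate = true;
        p->locked = false;  /* waiters may now look at the page */
    }
}

/* Hypothetical AIO request state: which chunk is read next. */
struct sim_aio_req {
    size_t next_chunk;
    size_t nr_chunks;
};

/* Models the AIO completion: same marking, plus advancing the state machine. */
static void sim_aio_end_io(struct sim_bio *bio, int err)
{
    struct sim_aio_req *req = bio->priv;

    sim_end_io_read(bio, err);
    if (!err && req->next_chunk < req->nr_chunks)
        req->next_chunk++;  /* stand-in for scheduling the next read */
}
```

A caller would set bio.end_io to either function and invoke it once the simulated read finishes. The only difference between the two is the extra state machine step in the AIO variant.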
Let's compare what happens when we call do_generic_file_read() with how kevent-based
AIO sendfile works.
do_generic_file_read() runs through all pages which would contain
data from the requested region of the file. If file data is not in the VFS cache, it invokes
the block layer through mapping->a_ops->readpage(), which is likely mpage_readpage()
or block_read_full_page(), and then waits for the VFS pages to become ready.
The selected VFS page is then passed to the actor function specified by the higher layer,
which actually copies the data or sends it over the net.
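The shape of that loop can be illustrated with a small userspace simulation. Everything here is an assumption made for the sketch: the sim_* names, the tiny 4-byte "page", the fixed-size cache array and the byte-array "disk" are hypothetical, and the read completes immediately instead of sleeping on a locked page.

```c
#include <stddef.h>
#include <string.h>

#define SIM_PAGE_SIZE 4u  /* tiny on purpose, real pages are much larger */
#define SIM_NR_PAGES  8u  /* capacity of the simulated cache */

struct sim_cache_page {
    int uptodate;                       /* data in this cache slot is valid */
    unsigned char data[SIM_PAGE_SIZE];
};

struct sim_file {
    const unsigned char *disk;          /* backing "device" contents */
    size_t size;
    struct sim_cache_page cache[SIM_NR_PAGES];
    unsigned long disk_reads;           /* how often the "block layer" was used */
};

/* Actor: consumes up to len bytes and returns how many it took. */
typedef size_t (*sim_actor_t)(void *ctx, const unsigned char *data, size_t len);

/* Stand-in for ->readpage() plus waiting until the page is ready. */
static void sim_readpage(struct sim_file *f, size_t index)
{
    struct sim_cache_page *p = &f->cache[index];
    size_t off = index * SIM_PAGE_SIZE;
    size_t len = f->size - off < SIM_PAGE_SIZE ? f->size - off : SIM_PAGE_SIZE;

    memset(p->data, 0, SIM_PAGE_SIZE);
    memcpy(p->data, f->disk + off, len);
    p->uptodate = 1;
    f->disk_reads++;
}

/* Walks the requested region page by page and feeds each page to the actor. */
static size_t sim_generic_file_read(struct sim_file *f, size_t pos, size_t count,
                                    sim_actor_t actor, void *ctx)
{
    size_t done = 0;

    if (pos >= f->size)
        return 0;
    if (count > f->size - pos)
        count = f->size - pos;

    while (done < count) {
        size_t index = (pos + done) / SIM_PAGE_SIZE;
        size_t offset = (pos + done) % SIM_PAGE_SIZE;
        size_t len = SIM_PAGE_SIZE - offset;
        size_t n;

        if (index >= SIM_NR_PAGES)
            break;                      /* beyond the simulated cache */
        if (len > count - done)
            len = count - done;
        if (!f->cache[index].uptodate)
            sim_readpage(f, index);     /* cache miss: read and "wait" */

        n = actor(ctx, f->cache[index].data + offset, len);
        done += n;
        if (n < len)
            break;                      /* actor could not take everything */
    }
    return done;
}

/* Example actor: copies the data into a bounded destination buffer. */
struct sim_copy_ctx {
    unsigned char *dst;
    size_t used;
    size_t cap;
};

static size_t sim_copy_actor(void *ctx, const unsigned char *data, size_t len)
{
    struct sim_copy_ctx *c = ctx;

    if (len > c->cap - c->used)
        len = c->cap - c->used;
    memcpy(c->dst + c->used, data, len);
    c->used += len;
    return len;
}
```

Calling sim_generic_file_read() twice over the same region touches the simulated disk only on the first call, because the second call finds every page uptodate in the cache.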
The current AIO approach does almost the same. It has a kevent which holds an array of preallocated
clean pages. These pages are used either by the actor function, which handles
uptodate pages found in the VFS cache, or by a block layer mechanism which is very similar
to mpage_readpage(), except that its bio->bi_end_io not only marks pages
as ready but also invokes the kevent state machine. New work is then scheduled in a callback invoked
from the kevent state machine; that work processes the kevent's pages and then starts reading
the next chunk of data.
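A minimal single-threaded model of that cycle (read a batch, complete, schedule work, process, start the next read) might look as follows. The names struct sim_kevent and sim_kevent_*, the batch size of two pages and the 4-byte page are all hypothetical. The work queue is reduced to a pending flag, the read completes at once, and the path for uptodate pages already in the VFS cache is left out.

```c
#include <stddef.h>
#include <string.h>

#define KEV_PAGE_SIZE 4u  /* tiny on purpose */
#define KEV_BATCH     2u  /* pages preallocated per kevent, arbitrary value */

/* Hypothetical model of a kevent that owns a fixed set of private pages. */
struct sim_kevent {
    const unsigned char *disk;   /* backing "device" contents */
    size_t size;
    size_t pos;                  /* next file offset to read */
    unsigned char pages[KEV_BATCH][KEV_PAGE_SIZE];
    size_t filled;               /* valid bytes in the current batch */
    unsigned char *out;          /* where the actor copies data to */
    size_t sent;
    int work_pending;            /* stand-in for a queued work item */
    unsigned long work_runs;     /* how many times the work function ran */
};

/* Completion path: mark the batch ready and let the state machine queue work. */
static void sim_kevent_end_io(struct sim_kevent *k, size_t bytes)
{
    k->filled = bytes;
    k->work_pending = 1;
}

/* Submits a read for the next batch. Returns 0 when the file is exhausted. */
static int sim_kevent_submit(struct sim_kevent *k)
{
    size_t left = k->size - k->pos;
    size_t want = KEV_BATCH * KEV_PAGE_SIZE;

    if (left == 0)
        return 0;
    if (want > left)
        want = left;
    memcpy(k->pages, k->disk + k->pos, want);  /* simulated block read */
    k->pos += want;
    sim_kevent_end_io(k, want);                /* completes immediately here */
    return 1;
}

/* Work function: process the batch with the actor, then start the next read. */
static void sim_kevent_work(struct sim_kevent *k)
{
    k->work_pending = 0;
    k->work_runs++;
    memcpy(k->out + k->sent, k->pages, k->filled);  /* the "actor" */
    k->sent += k->filled;
    sim_kevent_submit(k);
}

/* Drives the state machine until no more work is queued. */
static size_t sim_kevent_run(struct sim_kevent *k)
{
    sim_kevent_submit(k);
    while (k->work_pending)
        sim_kevent_work(k);
    return k->sent;
}
```

The out buffer must be at least size bytes long. After sim_kevent_run() returns, work_runs tells how many times processing was handed over to the simulated work queue.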
So, what is the difference between the two approaches?
Basically, synchronous buffered reading handles data in page-sized chunks.
If the data is not in the VFS cache, it allocates a new page, inserts it into the VFS cache,
and waits until the data is read into it. Then the VFS page is processed by the higher layer actor
function. Synchronous buffered reading is never interrupted between pages (I mean
do_generic_file_read() does not exit except on errors).
The AIO approach works similarly, but it does not insert the pages it reads into the VFS cache,
uses a bigger set of pages, and is interrupted each time a given number of pages has been processed,
i.e. only a predefined number of pages in the kevent is processed at a time, while buffered reading
processes all requested data.
Since AIO processing happens in a work queue, we should also add the overhead of a process switch
each time the predefined number of pages in the kevent has been processed.
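To get a feeling for how often that hand-over happens, the number of work queue runs can be estimated as the page count divided by the kevent batch size, rounded up. The helpers below are a hypothetical back-of-the-envelope sketch; the page size and batch size passed in are assumptions chosen by the caller, not values taken from the AIO sendfile code.

```c
#include <stddef.h>

/* Number of pages needed to cover `bytes` of file data. */
static size_t sim_pages_needed(size_t bytes, size_t page_size)
{
    if (page_size == 0)
        return 0;
    return (bytes + page_size - 1) / page_size;
}

/*
 * Number of work queue runs for the AIO path: one per batch of pages,
 * including a final partial batch. Each run implies a process switch.
 */
static size_t sim_aio_work_runs(size_t pages, size_t batch)
{
    if (batch == 0)
        return 0;
    return (pages + batch - 1) / batch;
}
```

For example, a 1 MiB file with 4096-byte pages is 256 pages. With an assumed batch of 16 pages per kevent this gives 16 work queue runs, while the synchronous path would process the same 256 pages in a single uninterrupted loop.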
New w1 release.
It only includes a sync with the in-kernel tree, which uses
the new locking primitive called mutexes instead of semaphores,
so it can only be compiled with 2.6.16+ kernels, which is reflected
in the README.
This release does not contain new functionality.