Zbr's days.

About :: TODO :: Blog :: RSS :: Old blog :: Projects :: GIT :: Gallery :: Notes

Mon, 14 Aug 2006

Network tree allocator.


Weekend was quite productive: I've completed per CPU support for NTA, so it is fully per-cpu right now excecpt one case when freeing happens on different CPU than original allocation, in that case I put a chunk into queue to be freed on original CPU.

I've also added support for combined pages, so it is possible to allocate upto 16k on x86 with netwrok tree allocator right now.

While hacking on NTA I've decided to completely drop tree from the allocator, since struct page has enough place to put there a pointer to the node. I'm also working on removing so called container cache for network tree allocator (container is a structure which holds free chunks in a list), so when that tasks are completed I will do first release. I expect it to be done today.

Ok, I've removed container cache entirely, so neither allocation, nor freeing requires any kind allocation anymore (sounds really crazy, but it is).

There is some problem with extensive struct page usage in the network tree allocator - combined pages use page->private member as a pointer to the head of combined pages, while it is a spinlock_t for mapping code, so it is impossible to map combined pages and mappind destroys combining, so I need to create some tricks with page->lru instead of stock combining usage.

Here we go: when chunk of memory is free, it is stored in special LIFO list, since it is free, it is possible to dereference it into list entry itself without any kind of containers around it, since each chunk is at least 32bytes long (it should be L1 cache size actually), it is possbile to store there double linked entry, so removing as long as lookup of that entry takes O(1) (lookup is just a dereferencing of the pointer into list entry).
Since each page->lru has two pointers unused (well, they are used in by kernel, but since allocator is not supposed to return it's pages to the kernel, it is perfectly ok to overwrite them), I placed there a pointer to the node and a cpu number where that page was allocated. So freeing just gets that pointers and checks if CPU it runs on differs from allocation one or not, in case it is the same CPU, node is obtained from page->lru and appropriate neighbour pointers are calculated, which are then dereferenced into struct list_head and removed from appropriate lists. Pointers are combined and thus fragmentation is greatly reduced.

/devel/networking/nta :: Link / Comments ()