Wed, 17 Jan 2007
Initial implementation of the ntl (new threading library) M-on-N threading library.

    mmap2(NULL, 8396800, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7684000
    sigprocmask(SIG_BLOCK, NULL, []) = 0
    futex(0xb7fdac20, FUTEX_WAKE, 1) = 0
    sigprocmask(SIG_SETMASK, [], []) = 0
    futex(0xb7fdac20, FUTEX_WAKE, 1) = 0
    futex(0xb7fdaaa0, FUTEX_WAKE, 1) = 0
    sigprocmask(SIG_SETMASK, [], NULL) = 0
    futex(0xb7fdaaa0, FUTEX_WAKE, 1) = 0
    ...
    munmap(0xb7fd7000, 4096) = 0

So we get four additional futex calls, because two locks are processed: one when the stack is unlinked and returned to the stack cache, and another when the thread is added to and removed from the scheduler's queue.

Performance differs noticeably. The test case creates a thread which exits immediately, and repeats this the requested number of times:

    $ ./ntl_test 100000
    num: 100000, diff: 388234, speed: 3.882340

That is 3.882340 microseconds per iteration, compared to 1.793600 microseconds without the futex calls. There is no concurrency at all in this situation, since it is a synthetic test, so the four extra calls cost about 2.09 microseconds in total and one _empty_ futex call takes about 0.5 microseconds, of which pure syscall overhead is 50% (the test machine is an Intel Core Duo 3.40GHz, running at 3.7 GHz).

I cannot say whether futex performance is slow or fast, but I would like to avoid this cost, so in practice semaphores should not be used for thread serialization; lightweight locks must be introduced instead. In the current code all locks are abstracted and implemented in a separate file, so lock changes are trivial, but I do not want to introduce per-arch usage right now.
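
A lightweight lock of the kind mentioned above might look like the following minimal sketch. This is not the ntl code: the names `ntl_lock_t`, `ntl_lock_init`, `ntl_lock_try`, `ntl_lock` and `ntl_unlock` are hypothetical, and the sketch assumes a compiler with C11 `<stdatomic.h>`, which postdates this post but avoids per-arch assembly. The uncontended path is a single atomic exchange in userspace, with no syscall.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type; the real ntl lock abstraction may differ. */
typedef struct {
    atomic_bool locked;
} ntl_lock_t;

static inline void ntl_lock_init(ntl_lock_t *l)
{
    atomic_init(&l->locked, false);
}

/* Try to take the lock once; returns true on success. */
static inline bool ntl_lock_try(ntl_lock_t *l)
{
    /* The exchange returns the previous value: false means it was free. */
    return !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}

/* Spin until the lock is taken. No syscall is ever made. */
static inline void ntl_lock(ntl_lock_t *l)
{
    while (!ntl_lock_try(l)) {
        /* Wait on plain loads so the cache line is not written while busy. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;
    }
}

static inline void ntl_unlock(ntl_lock_t *l)
{
    /* Debug check: releasing a lock that is not held is a bug. */
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed));
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

A pure spinlock like this burns CPU while waiting, so it only suits very short critical sections such as the stack cache and scheduler queue updates described above. A fuller design would likely fall back to a futex wait under contention, which this sketch does not attempt.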