Zbr's days.

Wed, 01 Oct 2008

Tbench regression. SLAB vs SLUB.

After I found a small fix for tbench regression over loopback, I decided to run some tests with it.

Tbench over loopback regression


As expected, turning off TSO/GSO does not fix the whole issue: performance increased from 366 MB/s to 381 MB/s, which is still less than the 398 MB/s of 2.6.26-slub.

Another interesting issue I found is the SLAB vs SLUB difference. The former is always faster (by about 5-7 MB/s): 366 vs 361 MB/s for 2.6.27-rc7, and 381 vs 374 MB/s when TSO and GSO are turned off. Pekka Enberg suggested reverting commit 5595cffc8248e4672c5803547445e85e4053c8fc, which could have caused this performance degradation, but without this commit SLUB is slightly slower: 372 vs 374 MB/s.
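
To put these numbers in proportion, here is a small sketch that computes the relative differences. The throughput figures are the ones quoted in this post; the helper name `rel_change` and the variable names are made up for illustration.

```python
def rel_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

# Throughput in MB/s, as quoted in the post.
rc7_slab = 366.0          # 2.6.27-rc7, SLAB
rc7_slub = 361.0          # 2.6.27-rc7, SLUB
rc7_slab_no_tso = 381.0   # 2.6.27-rc7, SLAB, TSO/GSO off
rc7_slub_no_tso = 374.0   # 2.6.27-rc7, SLUB, TSO/GSO off
v26_slub = 398.0          # 2.6.26, SLUB

print("gain from TSO/GSO off:      %+.1f%%" % rel_change(rc7_slab, rc7_slab_no_tso))
print("still missing vs 2.6.26:    %+.1f%%" % rel_change(v26_slub, rc7_slab_no_tso))
print("SLUB vs SLAB:               %+.1f%%" % rel_change(rc7_slab, rc7_slub))
print("SLUB vs SLAB (TSO/GSO off): %+.1f%%" % rel_change(rc7_slab_no_tso, rc7_slub_no_tso))
```

Note that the comparison against 2.6.26 mixes allocators, exactly as the post does, so it is only a rough indication.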

I will try to find out why there is a huge drop between 2.6.23 and 2.6.24 (54 MB/s) next.

/devel/other :: Link / Comments (0)


Mon, 29 Sep 2008

First fix for the tbench over loopback regression.

It brought me back about 4% in a Xen domain with 256 MB of RAM in a 4-client tbench test:

current: 187 MB/s
patched: 194 MB/s
The patch is rather trivial: it disables TSO and GSO on loopback and, generically, on devices which are capable of scatter-gather (where they were automatically enabled by commit e5a4a72d4f8, which I bisected to be the guilty one). Actually, disabling TSO provided more gain than disabling GSO on SG devices.
The idea behind those patches is clear: we create a bigger packet, so its processing overhead should be smaller, but apparently the GSO/TSO packet creation overhead dominates, at least on loopback.
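
For quick experiments the same offloads can also be switched off at runtime, without patching the kernel. The sketch below only builds the `ethtool -K` command line (`tso` and `gso` are real ethtool feature keywords); the helper name and the choice of `lo` as the device are assumptions for illustration, and this is not what the posted patch does (the patch changes the kernel defaults).

```python
import shlex

def offload_off_cmd(dev, features=("tso", "gso")):
    """Build an `ethtool -K` argv that turns the given offloads off."""
    argv = ["ethtool", "-K", dev]
    for feature in features:
        argv += [feature, "off"]
    return argv

cmd = offload_off_cmd("lo")
print(" ".join(shlex.quote(arg) for arg in cmd))
# To actually apply it (needs root): subprocess.run(cmd, check=True)
```

Whether a given driver allows toggling each feature depends on the kernel version and the device.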

All three of my (big) test machines died in various (apparently unbootable) bisections, so I tested it in a small and very slow Xen domain. Because of that I did not run the 2.6.22 kernel, since git operations and compilation take ages on this 'machine'. For example, I was only able to perform about a dozen git checkouts/resets/bisections and compilations in a whole day.

I've posted the patch to netdev@, let's see the result.

Forgot to mention that I wanted to sell this patch for a review of the DST, POHMELFS or netchannels patches the next time I post them :)

/devel/other :: Link / Comments (2)


Sun, 28 Sep 2008

Tbench Linux regressions with time.

It was reported that, starting from 2.6.23, the Linux kernel has a continuous network-related regression, which results in more than 20% performance degradation. I checked it and got interesting results.
It is better to see it once than to hear about it 1000 times.

Tbench regressions
Yes, we suck!

I decided to try to fix these issues, and started to bisect 2.6.22->2.6.23 and 2.6.26->2.6.27 on two identical machines, which have 4 logical CPUs (HT enabled) and 4 GB of RAM.

The result was quite surprising: the second bisection step in 22->23 froze the machine in the middle of compilation, and the first bisection step in 26->27 did not boot. Since I ran it remotely, there will be no progress on this until tomorrow.
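
Bisection itself is just a binary search over the commit range, which is why a single frozen or unbootable step hurts so much: every step costs a full build and a reboot. A minimal sketch, assuming a linear history and a hypothetical `is_bad(commit)` test (in real life, a kernel build plus a tbench run):

```python
def first_bad(commits, is_bad):
    """Return (first bad commit, number of tests run).

    commits is ordered oldest to newest; the commit just before
    commits[0] is known good and commits[-1] is known bad.
    """
    lo, hi = 0, len(commits) - 1   # invariant: commits[hi] is bad
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid               # first bad commit is at or before mid
        else:
            lo = mid + 1           # first bad commit is after mid
    return commits[hi], tests

# Toy history: commit 700 out of 1000 introduced the regression.
history = list(range(1000))
culprit, steps = first_bad(history, lambda c: c >= 700)
print(culprit, steps)
```

Real `git bisect` additionally offers `git bisect skip` for commits that cannot be tested at all, such as ones that do not boot.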

/devel/other :: Link / Comments (0)


Wed, 24 Sep 2008

More Kernel Summit photos.


At kernel summit.


Check new ones!

/devel/other :: Link / Comments (0)


Tue, 23 Sep 2008

2008 Linux Kernel Summit photos.


At kernel summit.


The last couple of photos were taken at the Linux Plumbers Conference (filesystem BoF). Not all of them got into the gallery though; I need to try to find the missing bits.
I should have got a real flash instead of using the built-in one, which sometimes mangled the images...

Take a look?

/devel/other :: Link / Comments (3)


Mon, 22 Sep 2008

Do you know why mammoths are dead?

I'm sorry, I'm not going to waste time on this if you keep acting this dishonest; welcome to my mail filter...
If we pretend not to know about the problem, the problem comes and knocks us off our feet.

/devel/other :: Link / Comments (6)


Sun, 21 Sep 2008

Mark IPW2100 driver as broken in linux kernel.

Just sent a patch to zillions of mailing lists (netdev@, linux-kernel@, linux-wireless@) and to lots of developers because of its "Fatal interrupt. Scheduling firmware restart." problem.

Let's see if the Intel folks will do anything.

I also added a couple of jokes about conspiracy theories (like the bug firing because Intel wants to force us to buy a new adapter) to make it a little bit more flammable and to attract attention. I really hope Intel does not do it intentionally.

/devel/other :: Link / Comments (3)


Wed, 17 Sep 2008

A small gift from Gumstix.

Overo board:

Texas Instruments X-Loader 1.4.2 (Sep 10 2008 - 08:47:04)
Reading boot sector
Loading u-boot.bin from mmc

U-Boot 1.3.4 (Sep 10 2008 - 08:47:30)

OMAP3503-GP rev 2, CPU-OPP2 L3-165MHz
Gumstix Overo board + LPDDR/NAND
DRAM:  128 MB
NAND:  256 MiB
*** Warning - bad CRC or NAND, using default environment

In:    serial
Out:   serial
Err:   serial
Hit any key to stop autoboot:  0
reading uImage

2501840 bytes read
## Booting kernel from Legacy Image at 82000000 ...
   Image Name:   Angstrom/2.6.27-rc6+r27+giteddca
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2501776 Bytes =  2.4 MB
   Load Address: 80008000
   Entry Point:  80008000
   Verifying Checksum ... OK
   Loading Kernel Image ... OK
OK

Starting kernel ...

Uncompressing Linux............................................................................
..................................................
Linux version 2.6.27-rc6-omap1 (sakoman@tera) (gcc version 4.2.1) #1 Wed Sep 10 20:32:01 PDT 2008
CPU: ARMv7 Processor [411fc082] revision 2 (ARMv7), cr=00c5387f
Machine: Gumstix Overo
Memory policy: ECC disabled, Data cache writeback
OMAP3430 ES2.2
SRAM: Mapped pa 0x40200000 to va 0xd7000000 size: 0x100000
CPU0: L1 I VIPT cache. Caches unified at level 2, coherent at level 3
CPU0: Level 1 cache is separate instruction and data
CPU0: I cache: 16384 bytes, associativity 4, 64 byte lines, 64 sets,
 supports RA
CPU0: D cache: 16384 bytes, associativity 4, 64 byte lines, 64 sets,
 supports RA WB WT
CPU0: Level 2 cache is unified
CPU0: unified cache: 262144 bytes, associativity 8, 64 byte lines, 512 sets,
 supports WA RA WB WT
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 32512
Kernel command line: setenv bootargs console=ttyS2,115200n8
	root=/dev/mmcblk0p2 rw rootfstype=ext3 rootdelay=1
Clocking rate (Crystal/DPLL/ARM core): 26.0/331/500 MHz
GPMC revision 5.0
IRQ: Found an INTC at 0xd8200000 (revision 4.0) with 96 interrupts
Total of 96 interrupts on 1 active controller
OMAP34xx GPIO hardware version 2.5
PID hash table entries: 512 (order: 9, 2048 bytes)
OMAP clockevent source: GPTIMER1 at 32768 Hz
Console: colour dummy device 80x30
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 128MB = 128MB total
Memory: 124372KB available (4088K code, 368K data, 916K init)
SLUB: Genslabs=12, HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Calibrating delay loop... 499.92 BogoMIPS (lpj=1949696)
Mount-cache hash table entries: 512
CPU: Testing write buffer coherency: ok
net_namespace: 488 bytes
NET: Registered protocol family 16
Found NAND on CS0
Registering NAND on CS0
OMAP DMA hardware revision 4.0
USB: No board-specific platform config found
i2c_omap i2c_omap.1: bus 1 rev3.12 at 2600 kHz
i2c_omap i2c_omap.3: bus 3 rev3.12 at 400 kHz
TWL4030: TRY attach Slave TWL4030-ID0 on Adapter OMAP I2C adapter [1]
TWL4030: TRY attach Slave TWL4030-ID1 on Adapter OMAP I2C adapter [1]
TWL4030: TRY attach Slave TWL4030-ID2 on Adapter OMAP I2C adapter [1]
TWL4030: TRY attach Slave TWL4030-ID3 on Adapter OMAP I2C adapter [1]
Initialized TWL4030 USB module
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
musb_hdrc: version 6.0, pio, host, debug=0
musb_hdrc: USB Host mode controller at d80ab000 using PIO, IRQ 92
musb_hdrc musb_hdrc: MUSB HDRC host driver
musb_hdrc musb_hdrc: new USB bus registered, assigned bus number 1
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 1 port detected
usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: MUSB HDRC host driver
usb usb1: Manufacturer: Linux 2.6.27-rc6-omap1 musb-hcd
usb usb1: SerialNumber: musb_hdrc
Bluetooth: Core ver 2.13
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
NET: Registered protocol family 2
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 4096 (order: 3, 32768 bytes)
TCP bind hash table entries: 4096 (order: 2, 16384 bytes)
TCP: Hash tables configured (established 4096 bind 4096)
TCP reno registered
NET: Registered protocol family 1
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
JFFS2 version 2.2. (NAND) (SUMMARY)  © 2001-2006 Red Hat, Inc.
msgmni has been set to 243
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
omapfb: configured for panel overo
omapfb: DISPC version 3.0 initialized
Console: switching to colour frame buffer device 128x48
omapfb: Framebuffer initialized. Total vram 1572864 planes 1
omapfb: Pixclock 54000 kHz hfreq 45.1 kHz vfreq 57.7 Hz
Serial: 8250/16550 driver4 ports, IRQ sharing enabled
serial8250.0: ttyS0 at MMIO 0x4806a000 (irq = 72) is a ST16654
serial8250.0: ttyS1 at MMIO 0x4806c000 (irq = 73) is a ST16654
serial8250.0: ttyS2 at MMIO 0x49020000 (irq = 74) is a ST16654
console [ttyS2] enabled
brd: module loaded
loop: module loaded
usbcore: registered new interface driver asix
usbcore: registered new interface driver cdc_ether
usbcore: registered new interface driver usb8xxx
libertas_sdio: Libertas SDIO driver
libertas_sdio: Copyright Pierre Ossman
i2c /dev entries driver
TWL4030 GPIO Demux: IRQ Range 384 to 402, Initialization Success
input: triton2-pwrbutton as /class/input/input0
triton2 power button driver initialized
Driver 'sd' needs updating - please use bus_type methods
omap2-nand driver initializing
NAND device: Manufacturer ID: 0x2c, Chip ID: 0xba (Micron NAND 256MiB 1,8V 16-bit)
cmdlinepart partition parsing not available
Creating 5 MTD partitions on "omap2-nand":
0x00000000-0x00080000 : "xloader"
0x00080000-0x00240000 : "uboot"
0x00240000-0x00280000 : "uboot environment"
0x00280000-0x00680000 : "linux"
0x00680000-0x10000000 : "rootfs"
ehci-omap ehci-omap.0: new USB bus registered, assigned bus number 2
ehci-omap ehci-omap.0: irq 77, io mem 0x48064800
ehci-omap ehci-omap.0: USB 0.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
usb usb2: New USB device found, idVendor=1d6b, idProduct=0002
usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: OMAP-EHCI Host Controller
usb usb2: Manufacturer: Linux 2.6.27-rc6-omap1 ehci_hcd
usb usb2: SerialNumber: ehci-omap.0
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
mice: PS/2 mouse device common for all mice
twl4030_rtc twl4030_rtc: rtc core: registered twl4030_rtc as rtc0
twl4030_rtc twl4030_rtc: Power up reset detected.
twl4030_rtc twl4030_rtc: Enabling TWL4030-RTC.
OMAP Watchdog Timer Rev 0x31: initial timeout 60 sec
Bluetooth: HCI UART driver ver 2.2
Bluetooth: HCI H4 protocol initialized
Bluetooth: HCI BCSP protocol initialized
Bluetooth: Broadcom Blutonium firmware driver ver 1.2
usbcore: registered new interface driver bcm203x
Bluetooth: Digianswer Bluetooth USB driver ver 0.10
usbcore: registered new interface driver bpa10x
mmci-omap mmci-omap.2: No Slots
usbcore: registered new interface driver usbhid
usbhid: v2.6:USB HID core driver
Advanced Linux Sound Architecture Driver Version 1.0.17.
usbcore: registered new interface driver snd-usb-audio
ASoC version 0.13.2
overo SoC init
TWL4030 Audio Codec init
asoc: twl4030 <-> omap-mcbsp-dai mapping ok
ALSA device list:
  #0: overo (twl4030)
oprofile: using timer interrupt.
TCP cubic registered
NET: Registered protocol family 17
NET: Registered protocol family 15
Bluetooth: L2CAP ver 2.11
Bluetooth: L2CAP socket layer initialized
Bluetooth: SCO (Voice Link) ver 0.6
Bluetooth: SCO socket layer initialized
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM TTY layer initialized
Bluetooth: RFCOMM ver 1.10
Bluetooth: BNEP (Ethernet Emulation) ver 1.3
Bluetooth: BNEP filters: protocol multicast
Bluetooth: HIDP (Human Interface Emulation) ver 1.2
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
ieee80211: 802.11 data/management/control stack, git-1.1.13
ieee80211: Copyright (C) 2004-2005 Intel Corporation 
ThumbEE CPU extension supported.
Power Management for TI OMAP3.
SmartReflex driver initialized
VFP support v0.3: implementor 41 architecture 3 part 30 variant c rev 1
twl4030_rtc twl4030_rtc: setting system clock to 2000-01-01 00:00:00 UTC (946684800)
Waiting 1sec before mounting root device...
mmc0: host does not support reading read-only switch. assuming write-enable.
mmc0: new high speed SD card at address 0007
mmcblk0: mmc0:0007 SD02G 1992704KiB
 mmcblk0: p1 p2
kjournald starting.  Commit interval 5 seconds
EXT3 FS on mmcblk0p2, internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem).
Freeing init memory: 916K
INIT: version 2.86 booting
Starting udevudevd version 124 started

Remounting root file system...
mount: according to mtab, /proc is already mounted on /proc

NET: Registered protocol family 10
Setting up IP spoofing protection: rp_filter.
Configuring network interfaces... ifconfig: SIOCGIFFLAGS: No such device
eth0 No such device

udhcpc: SIOCGIFINDEX: No such device
Error for wireless request "Set Mode" (8B06) :
    SET failed on device wlan0 ; No such device.
Error for wireless request "Set ESSID" (8B1A) :
    SET failed on device wlan0 ; No such device.
ifconfig: SIOCGIFFLAGS: No such device
wlan0No such device

udhcpc: SIOCGIFINDEX: No such device
done.
Starting portmap daemon: portmap.
Sat Sep 13 22:36:00 UTC 2008
Turning echo off on /dev/ttyS1
INIT: Entering runlevel: 5
ALSA: Restoring mixer settings...
Starting Dropbear SSH server: dropbear.
Starting advanced power management daemon: No APM support in kernel
(failed.)
Starting system message bus: dbus.
Starting Hardware abstraction layer hald
Starting syslogd/klogd: start-stop-daemon: lseek: Invalid argument
 * Starting Avahi mDNS/DNS-SD Daemon: avahi-daemon
[ ok ]
Starting Bluetooth subsystem:

Initialization timed out.
Running ntpdate to synchronize clockError : Temporary failure in name resolution
.
Starting GPE display manager: gpe-dm

.-------.
|       |                  .-.
|   |   |-----.-----.-----.| |   .----..-----.-----.
|       |     | __  |  ---'| '--.|  .-'|     |     |
|   |   |  |  |     |---  ||  --'|  |  |  '  | | | |
'---'---'--'--'--.  |-----''----''--'  '-----'-'-'-'
                -'  |
                '---'

The Angstrom Distribution overo ttyS2

Angstrom 2008.1-test-20080911 overo ttyS2

overo login: root
Welcome to Gumstix Overo
For more information visit:
http://www.gumstix.net/Overo/115.html
login[1458]: root login  on `ttyS2'

root@overo:~#
Gumstix Overo

Neat toy! The computer itself is actually the size of a finger (not that thick though), but it has no power supply or interface connectors, so it is essentially unusable as a stand-alone board. With an extension motherboard (as in the picture) it becomes very interesting, with several USB connectors, an HDMI display output and audio connectors.

The WiFi/Bluetooth module is a wi2wi W2CBW003 based on the Marvell 88W8686 chip. It is pretty unlikely that Marvell will share documentation (in my experience, if you do not order more than 1000 chips at once, you will not be allowed into its intranet to get access to the needed datasheets), so I will not be able to work on the wireless driver, but I would gladly implement it otherwise.

/devel/other :: Link / Comments (6)


Al Viro uses Apple's Mac.

Alexander Viro and his Mac

And an interesting mix of Russian and English from him.

/devel/other :: Link / Comments (2)


Tue, 16 Sep 2008

Intel, BURN IN HELL!

[47477.938968] ipw2100: Fatal interrupt. Scheduling firmware restart.
[47478.808276] ipw2100: Fatal interrupt. Scheduling firmware restart.
[47480.611796] ipw2100: Fatal interrupt. Scheduling firmware restart.
[47483.415218] ipw2100: Fatal interrupt. Scheduling firmware restart.
[47487.154543] ipw2100: Fatal interrupt. Scheduling firmware restart.
There is no wired access at the kernel summit, but Pavel Emelyanov set up a NAT for me, so I can write this exceptionally informative note.

/devel/other :: Link / Comments (8)


Mon, 08 Sep 2008

GIT web interface for the projects.

I've put the distributed storage with its tools, POHMELFS with its server, and netchannels there.

Enjoy!

/devel/other :: Link / Comments (0)


Tue, 02 Sep 2008

Linux Kernel Summit.

I got a visa after a short interview at the US embassy, so I will be in Portland September 13-20. Although the consul asked me in Russian about what I would be doing in the USA, about my experience, education and degrees, and so on, I somewhat suddenly started to answer in English. My 'perfect' pronunciation frequently confused even me, but the consul seemed to understand something from that flow of sounds. I hope he now knows what filesystems and Linux are.

See you in Portland in 1.5 weeks!

/devel/other :: Link / Comments (2)


Wed, 20 Aug 2008

To SSD or not to SSD? High-end SSD vs. SAS disks benchmark.

There is a lot of hype around SSDs these days... People frequently believe that they are a panacea for hard drive problems related to random disk access. Let's see how high-end SSDs behave compared to SAS disks.

I got access to an IOzone benchmark performed by Vladislav Seliverstov from the Yandex team.

Tested disks: SSD MTRON MSP-SATA7035 (7000?), maximum read speed: 120 MB/s, write speed: 90 MB/s, access latency: 0.1 ms.
The rotating disk was a Fujitsu SAS MBA3147RC.

Tests were performed on an ext3 filesystem with the following options (the stride option differs accordingly in the RAID tests):

# mke2fs -b 4096 -R stride=16 -j -J size=384 -m 0 -O dir_index /dev/md0
Mount options:
/dev/md0 /mnt/raid ext3 noatime,reservation,data=writeback,commit=300 0
The dirty_writeback_centisecs VM sysctl was set to 3000.
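
The `stride` value is the RAID chunk size expressed in filesystem blocks, so with 4096-byte blocks `stride=16` corresponds to a 64 KB chunk. A short sketch of that arithmetic follows; the helper name is made up, and it assumes that the "stripe size" quoted for the RAID-0 runs means the per-disk chunk size.

```python
def ext3_stride(chunk_bytes, block_bytes=4096):
    """Number of filesystem blocks per RAID chunk (the mke2fs stride)."""
    if chunk_bytes % block_bytes:
        raise ValueError("chunk size must be a multiple of the block size")
    return chunk_bytes // block_bytes

for chunk_kb in (64, 128, 256):
    print("%3d KB chunk -> stride=%d" % (chunk_kb, ext3_stride(chunk_kb * 1024)))

# dirty_writeback_centisecs is given in hundredths of a second.
print("writeback interval: %.0f s" % (3000 / 100.0))
```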

First, single disk performance: SAS vs SSD.

SAS single disk
SSD single disk

Sequential access speed (both reading and writing) is almost 20% higher for the SAS disk than for the SSD. But let's look at random access speed.
SSD reading jumps to the maximum theoretical 100-120 MB/s plateau very quickly (with impressive peaks at 64 and 128 KB, which can tell us a bit about how the firmware structures data blocks); the SAS disk is definitely the loser here, since it reaches its maximum performance only at 8-16 MB records.
But SSD random writing is more than two times slower than that of SAS until the latter reaches its maximum performance.
It is also very interesting to note that sequential access is actually noticeably slower than the maximum random access speed for the SSD.

Now let's check the performance of a two-SSD software RAID-0 array with different stripe sizes.

SSD RAID-0 64KB stripe
SSD RAID-0 128KB stripe
SSD RAID-0 256KB stripe

Random read peaks move around depending on stripe size.

So clearly, if your workload depends on random writing, an SSD may not be an appropriate solution, while it is definitely the winner for random-read workloads. Please also note that this was a high-end SSD with 0.1 ms seek latency, and right now most of the popular SSDs do not have such shiny numbers.

Great thanks to Vladislav Seliverstov for his data and analysis.

/devel/other :: Link / Comments (0)


Thu, 07 Aug 2008

To Matthew Garrett.

Just a flood note to talk a bit :)

Summary: Almost all problems caused by bugs in Linux, one problem caused by BIOS vendors interpreting the ACPI specification differently to the Linux implementation and trivially worked around. No sabotage.
Except for the fact that it only works with the 'Windows XXXX' OSI label.

A short analogy.

Before.
Several years ago there were no problems with using the ndiswrapper driver for wireless NICs.
Even more years ago there were no problems with reverse-engineered drivers for ATA/IDE and ordinary NICs.

Now.
Try to tell anyone that you do not support Linux as a server platform, so you will not provide a driver or spec for your SATA/SCSI controller or NIC. Some companies even join the ongoing development of the reverse-engineered drivers (wireless, ethernet, (S)ATA/IDE...). But of course there are exceptions, no need to make it a red point.

Sorry, but "we do not support the 'Linux' ACPI label in OS/OSI because vendors will not test it" and other similar ... is just a dubious excuse not to push them hard enough. The most exciting example is the Atheros wireless driver situation.

/devel/other :: Link / Comments (6)


Sun, 03 Aug 2008

Continuing crazy security ideas.

Implement an LSM module which 'guards' some configured dir, so that every read/write/lookup/readdir for any object in it requires cryptographically strong authentication; otherwise an empty dir (or some other 'old' content) is shown. Applications from the previous round are the ones which are capable of communicating with this system, and thus are able to read and write the directory content.
There is a problem when this filesystem is read on a system which does not have that magic LSM module to authenticate reading.
We do not want to produce garbage directory content in this case; an empty dir should be returned instead. This can be achieved by hiding the links to the actual encrypted directory content not in the parent dir (where they could be read as garbage without the security module), but in some other place (like extended attributes), which can be decrypted by the security module only.
In this case, reading the directory without the security module yields empty content, since every read and write to the directory made with the security module resulted in an update of the special extended attribute and not of the actual inode. New inodes still exist in the FS and contain valid data (everything is just encrypted), but they are linked to the inode hidden in the extended attributes instead of the actual directory inode. The security module redirects directory operations to that hidden object instead of the visible one.
This approach should work with all underlying filesystems, since extended attribute management has generic helpers with appropriate callbacks into the FS code.

I need to think about it more, although I have already updated the security ideas TODO entry...

/devel/other :: Link / Comments (2)


Sat, 02 Aug 2008

Game theory for kernel.

I've just thought up an excellent project on how to control/tune the behaviour of various kernel subsystems based on game theory approaches.
The simplest case is the block layer scheduler, where it results in maximization of the performance for all participating users.

Just a thought so far of course, but I want to dig my books out of their dusty grave for before-sleep reading...

Meanwhile, the DST project got some features implemented: I moved async network processing from its own threads into a per-node thread pool, and crypto processing uses other threads. DST is also able to create and send a full block IO transaction (encrypted and/or hashed if needed). Now it's time to implement completion handling (it's just a search for the given transaction and a dst_trans_put() call, which will complete the block IO request if there are no users of the transaction anymore), read/write processing for the server part, and the client-accepting machinery for the server.
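
The completion scheme described above is plain reference counting: whoever drops the last reference completes the request. A userspace sketch of the idea follows; the class and method names are invented for illustration and only loosely mirror `dst_trans_put()`, they are not DST's actual code.

```python
import threading

class Transaction:
    """Toy model of a refcounted block IO transaction."""

    def __init__(self, trans_id, on_complete):
        self.id = trans_id
        self._refs = 1                  # reference held by the creator
        self._lock = threading.Lock()
        self._on_complete = on_complete

    def get(self):
        """Take an extra reference (a new user of the transaction)."""
        with self._lock:
            self._refs += 1

    def put(self):
        """Drop one reference; complete the IO when none are left."""
        with self._lock:
            self._refs -= 1
            last = self._refs == 0
        if last:
            self._on_complete(self.id)

completed = []
pending = {}                            # id -> transaction, searched on reply

t = Transaction(42, completed.append)
pending[t.id] = t
t.get()                                 # e.g. a crypto thread still uses it
pending.pop(42).put()                   # reply arrived: search, then put
assert completed == []                  # the crypto thread holds a reference
t.put()                                 # last user gone: request completes
print(completed)
```

Calling the completion callback outside the lock keeps the callback free to take other locks without risking a deadlock on the transaction itself.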

/devel/other :: Link / Comments (2)


Tue, 29 Jul 2008

Another excellent LISP book.

Common LISP Cookbook covers such interesting things as threads, sockets and the foreign function interface.
I believe "Common LISP Cookbook" and "Practical Common LISP" form a must-have library for every LISP programmer. So far I think that is all that is needed, since this set covers the vast majority of possible use cases. Even DSLs are covered there in detail.

/devel/other :: Link / Comments (0)


Wed, 23 Jul 2008

Manager's thoughts: unused extensibility and used de-facto standards.

After some before-sleep reading (this time the DNS RFCs) I found that the DNS protocol is so extensible that it can perfectly cover not only its own area, but also help with lots of related problems. It already has many interesting (though completely unused) RRs and types which have nothing to do with DNS (like the NULL RR, which allows transmitting binary data, or the TXT RR, which is also not related to the DNS area). And the most popular RRs are A, PTR, SOA, CNAME and MX. That's all out of about 20 others. The same applies to (q)type and class (I read about the Hesiod class for the first time, for example). And DNS allows introducing one's own classes, types and resource records.
It is just not used, but we could create a distributed DNS system with new types. It would be really simple (and actually it can be done even without new DNS extensions).
But it is not actually needed, since people are used to having DNS just as it is.
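
To show how little it takes to use the rarely seen types and classes, here is a sketch that builds a raw DNS query on the wire. The numeric type and class codes are the ones assigned in RFC 1035; the function names, the query ID and the domain `example.org` are placeholders chosen for illustration.

```python
import struct

# A few RR types and classes from RFC 1035.
QTYPE = {"A": 1, "CNAME": 5, "SOA": 6, "NULL": 10, "PTR": 12, "MX": 15, "TXT": 16}
QCLASS = {"IN": 1, "CH": 3, "HS": 4}    # HS is the Hesiod class

def encode_name(name):
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("label must be 1..63 bytes")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(name, qtype="A", qclass="IN", query_id=0x1234):
    """Build a single-question DNS query with the recursion-desired bit set."""
    # Header: ID, flags (0x0100 = RD), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", QTYPE[qtype], QCLASS[qclass])
    return header + question

pkt = build_query("example.org", "TXT", "HS")
print(len(pkt), pkt.hex())
```

Asking for a TXT record in the Hesiod class differs from an ordinary A query only in the last four bytes of the question.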

Another example is internet video. There is the de-facto Adobe standard; no matter what W3C puts into its new standard, everyone will continue to use the existing one. Just because it works OK. Not excellent or perfect or whatever, it just works the way we are used to.

And there are lots and lots of similar examples.

People are so inert in these questions (and I think in most areas, just because it is convenient not to do something better when the existing solution just works, even if not perfectly and even if not well) that no one will ever bother to change something dramatically, because it would require not only a huge amount of money, but also changes in the way people are used to thinking about the given area, which is likely an even more complex (and money-hungry) problem.

All this talk is about a simple thing I just discovered for myself: when you create something completely new, even if it is not the best solution for the given problem, and you start pushing it to a wide audience, you are able to get the whole 'market'. That's why, when you bring something new to a market where most of the users are already used to working with one or another solution (even if your project is potentially very good and definitely much better than the existing solutions), there will not be any major gain, only a handful of completely new users.
This is probably told to first-year MBA students, but I was quite excited and disappointed by this issue: the first new idea, when properly presented, even if it is not the best solution for the given problem, can get all the users, after which they will not switch to a newer one just because they are used to having it this way.

/devel/other :: Link / Comments (1)


Sun, 20 Jul 2008

Crazy security idea.

I've just realized that I do not know a way to make some (running) application encrypt all its data that hits the disk (either via swap or in the usual way, like an editor writing the file and all its temporary files).
I actually consider this a very useful feature for editors, browsers, instant messengers and mail clients, download applications, music players and so on. This is especially relevant for temporary files, when one expects the editor to be highly secure (or even to work on an encrypted partition), while its temporary files are stored somewhere in /tmp, which is not encrypted.

It could be started via some wrapper, which tells the kernel the encryption algorithm, key, IV and all other needed info and attaches a crypto processing callback to the process, so that when disk activity is started by the given pid (swap, or data writing or reading), the data is encrypted/decrypted in flight.
The kernel should check all file descriptors opened by the given process and process them appropriately. There may be some problems with communication with unprotected applications, which should be thought out, but overall I like the idea...

I have put it into the todo list.

/devel/other :: Link / Comments (0)


Project presentation.

I've just realized that lots of my blog posts are valid enough presentation abstracts; at least they contain enough words describing the problem, a possible solution and topics of overall interest for the given area. But I have never presented such projects in English before, although quite frankly I'm not that bad a speaker in Russian; at least I am not afraid to talk and probably like contact with an interesting audience. After all there is this blog :) and I have even given a number of presentations of a similar kind, from 15 minutes to a couple of hours including the question/answer part.
My English used in the blog is rather ugly, and I rarely (if at all) fix errors which I detect when rereading the text in the browser (and I detect lots of them), the same as in mails and other posts.
So probably eventually we will have interesting talks about different areas, but expect to 'listen' to the world-wide language of gestures :)

/devel/other :: Link / Comments (0)


Tue, 08 Jul 2008

Anecdotes and allegories.

I'm not a major kernel contributor, but I was invited to the kernel summit 3 times in the last 3 years.
And I will try to get to this year's one in Portland, Oregon; at least I started some preparation and contacted the needed people. I hope I will also participate in the Plumbers conference.
As before I will bring a bottle of vodka (the number of people who wanted to talk suddenly dropped to the ground) and will greatly appreciate your contact and discussion topics :)
That's of course if the stars align, but I will push them a bit.

/devel/other :: Link / Comments (0)


Wed, 02 Jul 2008

Holy shit: kernel summit.

We would like to invite you to the 2008 Kernel summit, and we hope that you will be able to join us...
I'm trying to recall the previous kernel summit:



That was fun, but no one wanted to play football instead of talking about whatever we talked about.

That year I only committed a HIFN driver into the tree, and there was no kevent :)

This time it is in the US, thinking...

/devel/other :: Link / Comments (5)


Thu, 26 Jun 2008

Hacking your ISP for fun and profit.

My ISP has again blocked my account and cannot unblock it, although there is money on the deposit. There are serious problems in its billing system which require manual intervention by an operator. Unfortunately it is a real challenge to call them; it already took more than half an hour yesterday, without success.
So, I decided to implement an interesting idea on how to bypass the blocking.

It is based on a security 'hole' in its DNS configuration (and I think the vast majority of ISPs do the same), which allows requesting any DNS record even if the account is blocked. The record will be fetched from a remote DNS server if it is not in the ISP's cache.
Thus the attack vector becomes visible: implement an IP-over-DNS tunnel network device and set up local routing to use it by default. One has to control at least one remote machine which hosts the DNS records for a given domain name, since it is required to parse incoming DNS requests and process them accordingly.

There are at least two known IP over DNS tunnel solutions: NSTX (howto) and OzymanDNS (howto). Both solutions require that you own one or another server to run ip-over-dns tunnel server on it. Unfortunately I have only single machine with static IP address, which is not protected by lots of firewalls and allows incoming connections.

The simplest solution for this problem is to create an iptables input target rule on the server, which will parse incoming DNS requests, pass usual queries up the network stack to the userspace DNS server, and handle 'poisoned' queries as tunnel traffic.
The client can be TUN/TAP based, but can also be a tunnel network device.
I believe the weirder it looks, the more interesting it is, so I will likely think more about kernel based tunnels.

DNS queries are limited enough not to allow raw binary data (IIRC, the most interesting ones are DNS TXT records), but data can be appropriately encoded and enciphered. So, I will put it into the todo list.
I even think that it is not that bad idea to have such modules in kernel :)
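
As a rough illustration of the encoding step, here is a minimal Python sketch (not taken from NSTX or OzymanDNS) that packs binary data into the labels of a DNS query name and unpacks it again. The base32 alphabet is my assumption, picked because DNS names are case-insensitive; the domain t.example.com and both function names are hypothetical, and enciphering is left out.

```python
import base64

MAX_LABEL = 63   # a single DNS label holds at most 63 characters
MAX_NAME = 253   # a full dotted name holds at most 253 characters


def encode_payload(payload: bytes, domain: str) -> str:
    """Pack raw bytes into the labels of a query name under `domain`."""
    if not payload:
        raise ValueError("nothing to encode")
    # base32 survives case-insensitive resolvers; padding is dropped.
    text = base64.b32encode(payload).decode("ascii").rstrip("=").lower()
    labels = [text[i:i + MAX_LABEL] for i in range(0, len(text), MAX_LABEL)]
    name = ".".join(labels + [domain])
    if len(name) > MAX_NAME:
        raise ValueError("payload does not fit into one query name")
    return name


def decode_payload(name: str, domain: str) -> bytes:
    """Reverse of encode_payload(), as the tunnel server would do it."""
    suffix = "." + domain
    if not name.endswith(suffix):
        raise ValueError("query is not for the tunnel domain")
    text = name[:-len(suffix)].replace(".", "").upper()
    text += "=" * (-len(text) % 8)  # restore the stripped padding
    return base64.b32decode(text)


query = encode_payload(b"\x00\x01binary\xff", "t.example.com")
print(query)
print(decode_payload(query, "t.example.com"))
```

A usual query never ends with the tunnel domain, so the same suffix check is what would separate 'poisoned' queries from ordinary ones on the server side.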

/devel/other :: Link / Comments (8)


Tue, 24 Jun 2008

VM gotcha: forbidden double kmapping.

I've just learned that it is impossible to map the same page twice: for example the first time using kmap()/kunmap() and the second time via kmap_atomic()/kunmap_atomic().
Although the mechanisms behind the two mappings are a bit different, doing so is forbidden and the system will panic like this:

IP: [] kmap_atomic_prot+0x1b/0xc5
*pdpt = 0000000031c79001 *pde = 0000000000000000 
Oops: 0000 [#1] SMP 

Pid: 6478, comm: pohmelfs-crypto Not tainted (2.6.25 #27)
EIP: 0060:[] EFLAGS: 00010202 CPU: 2
EIP is at kmap_atomic_prot+0x1b/0xc5
EAX: ebc7c000 EBX: 00000003 ECX: 00000000 EDX: 00000003
ESI: 00000fdc EDI: 00000163 EBP: 80000000 ESP: ebc7dee4
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process pohmelfs-crypto (pid: 6478, ti=ebc7c000 task=f25040b0 task.ti=ebc7c000)
Stack: 00000000 00000003 00000fdc f7cf4078 00000fdc c0114144 00000163 80000000 
       c01991b1 ebc7df44 f70e3580 00000000 ebc7dfa8 ebc7df40 f70e3580 00000003 
       00000000 f7cf4000 f70e3580 f70ff8b0 f70ff880 f7096c00 c019a771 f70e3580 
Call Trace:
 [] kmap_atomic+0x11/0x14
 [] update2+0x7c/0x13f
 [] hmac_update+0x49/0x50
 [] pohmelfs_crypto_thread_func+0x304/0x3e8 [pohmelfs]
 [] hrtick_set+0x7a/0xd7
 [] autoremove_wake_function+0x0/0x2b
 [] pohmelfs_crypto_thread_func+0x0/0x3e8 [pohmelfs]
 [] kthread+0x38/0x5f
 [] kthread+0x0/0x5f
 [] kernel_thread_helper+0x7/0x10
This happened in exactly the above case, when a page was first mapped via kmap() in POHMELFS and then via kmap_atomic() in the HMAC crypto processing code.
I wonder what will happen if we ever try to send kmapped pages over an IPsec tunnel. Likely it will oops too...
This can happen for example when pages are mapped in tcp_sendpage() when calling sendfile() over an interface which does not support hardware checksumming and scatter-gather: mapped pages are pushed down the network stack, where they will eventually be encrypted/hashed in IPsec, which will in turn call kmap_atomic().

So, if you find an obscure oops in kmap_atomic() and friends, first check that the call stack did not map the page earlier.

/devel/other :: Link / Comments (0)


Thu, 19 Jun 2008

CLISP socket streams.

Excellent documentation with examples. I expect that it is implementation (i.e. CLISP) specific and will not work with SBCL or Allegro for example, but nevertheless I want to learn it and use it somewhat.
If it turns out to be good for my use cases, guess what my next userspace server will be written in? :)

/devel/other :: Link / Comments (0)


Wed, 18 Jun 2008

LISP macros rox!

(defmacro with-output-dir ((out pos dir flags) &body form)
  `(let ((,pos 2))
     (dolist (operation (nthcdr 2 *iozone-tests*))
       (let* ((output-dir (pathname-as-directory ,dir))
              (output-file (make-pathname
                            :directory (pathname-directory output-dir)
                            :name operation
                            :type "gnuplot")))
         (with-open-file (,out output-file :direction :output :if-exists ,flags)
           ,@form))
       (incf ,pos))))

(defun write-gnuplot-headers (dir)
  (with-output-dir (out pos dir :supersede)
		    (format out "set title \"Iozone performance: ~a, KB/s\"~%" operation)
	            (format out "set terminal png small size 450 350~%")
	            (format out "set logscale x~%")
	            (format out "set xlabel \"Record size in KBytes\"~%")
	            (format out "set ylabel \"Kbytes/sec\"~%")
	            (format out "set output \"~a.png\"~%" (elt *iozone-tests* pos))
		    (format out "plot ")))

(defun update-gnuplot-headers (dir file)
  (with-output-dir (out pos dir :append)
		   (unless *first-file-p*
		     (format out ", "))
		   (let* ((fstype (pathname-name file))
			  (name (make-output-name file)))
		     (format out "\"~a\" using 1:~d title \"~a\" with lines" name (1+ pos) fstype))))
Macros are really the coolest feature of LISP. Now I believe I have started to understand LISP kung-fu.
The iozone parser is essentially ready. I was a bit pessimistic yesterday: it took only half of a day and several hours today, and the code itself is rather ugly (and frequently really ugly, likely far from the LISP way), but it works: it runs over a given dir, searches there for files with given extensions, parses them (removes unneeded iozone information) and writes the result to the specified directory. It also runs over the iozone test strings and generates gnuplot scripts for them, which will build a graph based on the filesystem info gathered while traversing the tree above, so the result looks like this:
$ ./parser.lisp
Processing: /tmp/iozone/tmpfs/nfs.out ... done
Processing: /tmp/iozone/tmpfs/pohmelfs.out ... done
$ cat /tmp/iozone/tmpfs/out/read.gnuplot 
set title "Iozone performance: read, KB/s"
set terminal png small size 450 350
set logscale x
set xlabel "Record size in KBytes"
set ylabel "Kbytes/sec"
set output "read.png"
plot "/tmp/iozone/tmpfs/nfs.out.data" using 1:5 title "nfs" with lines,
	"/tmp/iozone/tmpfs/pohmelfs.out.data" using 1:5 title "pohmelfs" with lines

/devel/other :: Link / Comments (2)


Tue, 17 Jun 2008

LISP development zen.

(defun string_to_list (str)
  (let ((num 0) (ret '()) (string_len (length str)))
    (dotimes (i string_len)
      (let ((sym (elt str i)))
        (cond
	  ((not (char-number-p sym))
            (unless (eql num 0)
	      ;(format t ": ~d~%" num)
	      (push num ret)
              (setf num 0)))
          (t (setf num (+ (* num 10) (to_number sym)))
	     (when  (eql i (- string_len 1))
	       (push num ret))))))
  (nreverse ret)))
This is a part of my LISP parser for iozone output files. So far it is able to convert its output numbers (performance in KB/sec) into LISP lists (one list per record), so a single line of iozone output becomes a single list of numbers (ugh, I was forced to write a string-to-number conversion function).
It is likely not that serious an achievement, and it took the whole day, but nevertheless I like it, although I would write the same in C much faster :)

Main problem with Lisp for me is its functional-conditioning system. Converted to C it looks like:
if (a) {
  if (b) {
    if (c) {
      do_stuff();
    }
  }
}
While I would write:
if (!a)
  return;
if (!b)
  return;
if (!c)
  return;
do_stuff();
So far I did not use macros at all, and all the time looked into the Practical Common Lisp book (and frankly took the directory processing functions from there, although I modified them a bit), but what would you expect from a first project. Tomorrow I will extend it to write a gnuplot-compatible file and finally generate some graphs (I do not know how to call external programs from LISP though).
Frankly, I'm not yet excited about how cool LISP is, but I like it, since it is different. Just like I like my neverending apartment development process.
Ugh, and with proper automatic vim highlighting I am not afraid of parentheses.

Interested reader can grab my sources and comment on ugliness.

Also found an 'interesting' article at IEEE about LISP: Migration of Common Lisp Programs to the Java Platform -The Linj Approach :)

/devel/other :: Link / Comments (2)


Fri, 06 Jun 2008

Contributors we are losing and kernel summit talk about it.

By 'we' I mean kernel community, although I do not think I personally win or lose if someone decided not to hack on Linux kernel.

I even found myself in a 'contributors we are losing' list :)

And yes, very likely the Linux kernel community has lost me (and I do believe no one cares, me included). But not the Linux kernel, it is definitely the place I like.

People who want to hack on the Linux kernel will do that without all those empty talks and brilliant ideas, all of which are aimed in a single direction only: do what we ask you to do for us. Be fair and admit that you do not want new ideas implemented, you only want old bugs (introduced by someone else) fixed, so that the kernel gets more respect without possible additional work for you.

That is not how interested people work, instead they just decide themselves how and what to do. That's why the kernel janitor project did not succeed: it is not interesting for anyone. The same applies to its refocus on bugfixes.
And I do know what kernel janitorial work is: I started with it not that long ago, fixing trivial error checks like request_region()/check_region() code and other minor things like PCI remap errors.
That was a hell of crap. Frequently I fixed lots of drivers (like 20 or more) in one go and submitted a patch, but was asked to split it into separate patches, add each driver maintainer to the copy, wait for their ACK, resubmit and so on. And it frequently happened (especially when a new feature was introduced and lots of small pieces of code had to be changed a little) that while I did that, some other known kernel hacker did the same, and his patch was immediately applied.

Janitorial and all hypocrisy about 'we want more developers' just suck.

My advice for those who really want to hack on kernel: just do what you like, try yourself in whatever subsystem you want, implement your ideas, be creative and do whatever you like with kernel and not what all those kernel heads tell you to do.
The only way to succeed is to move forward!

Argh, and do not listen to any advice of this kind at all :)

/devel/other :: Link / Comments (3)


Mon, 02 Jun 2008

Pros are talking.

- If you haven't noticed, I don't take "no" for an answer,
- And now please tell us step 2 in your secret plan to win friends and influence.
- WTF are you getting at?
Fun thread :)

There is actually a serious problem in the kernel community when some new idea is being implemented and it goes against something which sits in the mind of one or another big kernel hacker out there. When such a person replies that this is a bad idea (sometimes without technical arguments), people just stop looking at the replies and do not follow the arguments of the author, just because they frequently do not know the area in question well enough to make a decision and thus rely on others.
This only works when the 'others', i.e. core kernel maintainers, are good and do not base their decisions on personal feelings but take only the technical side into account. Unfortunately that is not always the case, and political methods are used. Sometimes even only political methods are used...

/devel/other :: Link / Comments (0)


Sun, 25 May 2008

Every lisper did that.

#!/usr/bin/clisp
(defun f (m)
  (do ((k 0 (1+ k))
       (c 0 n)
       (n 1 (+ c n)))
    ((eql k m)
     (format t "~r" c))))
(f 317)

Guess the result: seven hundred and ninety-three vigintillion, five hundred and ninety-one novemdecillion, four hundred and seven octodecillion, eight hundred and four septendecillion, one hundred and fifty-one sexdecillion, nine hundred and twenty-six quindecillion, five hundred and ninety-three quattuordecillion, seven hundred and ninety-three tredecillion, forty-two duodecillion, one hundred and twenty-six undecillion, eight hundred and ninety-one decillion, one hundred and twenty-eight nonillion, eight hundred and nineteen octillion, six hundred and ten septillion, seven hundred and ten sextillion, one hundred and forty quintillion, one hundred and forty-five quadrillion, thirty-seven trillion, nine hundred and fifty-eight billion, two hundred and seventy-three million, seven hundred and seventy-seven thousand, three hundred and ninety-seven

/devel/other :: Link / Comments (4)


Wed, 21 May 2008

Things getting worse...

$ clisp 
  i i i i i i i       ooooo    o        ooooooo   ooooo   ooooo
  I I I I I I I      8     8   8           8     8     o  8    8
  I  \ `+' /  I      8         8           8     8        8    8
   \  `-+-'  /       8         8           8      ooooo   8oooo
    `-__|__-'        8         8           8           8  8
        |            8     o   8           8     o     8  8
  ------+------       ooooo    8oooooo  ooo8ooo   ooooo   8

Welcome to GNU CLISP 2.42 (2007-10-16) 

Copyright (c) Bruno Haible, Michael Stoll 1992, 1993
Copyright (c) Bruno Haible, Marcus Daniels 1994-1997
Copyright (c) Bruno Haible, Pierpaolo Bernardi, Sam Steingold 1998
Copyright (c) Bruno Haible, Sam Steingold 1999-2000
Copyright (c) Sam Steingold, Bruno Haible 2001-2007

Type :h and hit Enter for context help.

[1]> (defun test-func () (format t "It's a test func"))
TEST-FUNC
[2]> (test-func) 
It's a test func
NIL
[3]> (exit)
Bye.
This one has, imho, the least ugly command line... And I'm against SLIME and Emacs. Also tried SBCL, GNU CL and something else, but likely CLISP will stay.

Instead of sleeping (it will be time to wake up soon in the Moscow slums) or at least catching POHMELFS bugs (the last several days were solely devoted to this task, and a fair number of them were fixed, as well as some interesting (probably new) features introduced, so a new release will likely see the light later this week), I'm drinking some beer and making first steps into this. So far it looks quite new and probably interesting, but every introductory article about it I read said that if you are over 25 years old, it is likely impossible to change something in your perception. I am over, but I think that it will be fun and will probably become a really good tool for me.

The more I think about it, the more interesting tasks I find (in addition to those I'm already thinking about, like CAPTCHA)...

/devel/other :: Link / Comments (5)


Fri, 25 Apr 2008

Solaris vs 'Have you ever kissed a girl?'

As started by Ted Ts'o.

We forgot the answer:

No, but I can kiss the sky
He was 22 in those days? :)

From my developer's point of view Solaris sucks first of all because of its contributor agreement. There is no way I can devote my time to an organization which will get my work for free and do whatever it wants with it without my opinion as author (actually the same applies to BSD-style licenses to some degree. Yes, that can be trivial greediness).

It is not _that_ bad an OS, but modern medicine knows no practice of awakening the dead.
Slolaris has its niche, but that's it. Linux can be tuned to be faster in those areas (or if it has some bugs, they can be fixed), but that does not matter: people who make decisions already know what they want.

The pseudo openness of Solaris is just marketing noise. Those who want to hear it will hear just that, no matter how things are in real life.

/devel/other :: Link / Comments (1)


Tue, 22 Apr 2008

Debunked copy_to_user() from kernel thread problem.

It turned out to be really trivial. Not even any VM hacking :(

First, some background on how copy_to_user() works on x86.
Its asm looks pretty simple (and it is very small, check arch/x86/lib/usercopy_32.c:__copy_user()), so I always wondered how it can handle a missing page exception when the userspace page was swapped out.

Things live in a small part of the function: .section __ex_table. Each entry of this table contains two values: the place where an exception can happen and the fixup address (both are just instruction positions). The linker puts this table into a special section accessible by the page fault handler do_page_fault(). In some cases the page fault path is never executed: the code just searches for the page and locks it, even if it is already in the table (that is why get_user_pages() is at best as fast as copy_to_user()). This happens when the WP bit is not set and does not work (a speculation only though, derived from __copy_to_user_ll() and the Intel F00F bug errata).

When the WP bit works, we have the usual copy_to_user(), which will fault if there is no destination page, and do_page_fault() will eventually be called. After a number of checks the system determines that it is an exception in kernel mode, and if there is an exception table entry as described above (which is true for copy_to_user()), it tries to fix things up.

Here we come to essentially the same code that is called in get_user_pages(): we locate the VMA for the failed address and insert a new page into the page table. This involves allocation of all those strange 3-letter abbreviations: pgd, pud, pmd and pte ('and' is not a VMM abbreviation yet). I know what two or three of them mean, but completely forgot pud; with a 4-level page table it is hard to recall which two are the same, since iirc x86 has only 3 levels.
If the page was swapped out, it will be brought back and the faulting instruction is simply restarted. If the fault can not be resolved (there is no valid mapping for the address), the fault handler will instead try to fix things up via fixup_exception(), which replaces EIP with the appropriate value from the section table described above, so that the CPU returns to the fixup part of the __copy_user() code, which stops copying and reports the failure instead of oopsing.
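
To make the table lookup a bit more tangible, here is a toy Python model of it. This is only a sketch under my own assumptions: the addresses are invented, the register file is a plain dict, and the function names merely mirror the kernel ones. What it shows is a table of (faulting instruction, fixup) pairs sorted by address, searched by the instruction pointer at the time of the fault.

```python
import bisect

# Invented (faulting instruction address, fixup address) pairs,
# kept sorted by the first field so they can be binary-searched.
EX_TABLE = [
    (0x1000, 0x9000),
    (0x1040, 0x9010),
    (0x10A0, 0x9020),
]


def search_exception_table(table, fault_ip):
    """Return the fixup address registered for fault_ip, or None."""
    addresses = [insn for insn, _ in table]
    i = bisect.bisect_left(addresses, fault_ip)
    if i < len(addresses) and addresses[i] == fault_ip:
        return table[i][1]
    return None


def fixup_exception(regs, table=EX_TABLE):
    """Redirect execution to the fixup code if the fault was expected."""
    fixup = search_exception_table(table, regs["ip"])
    if fixup is None:
        return False      # no entry: a genuine kernel bug, i.e. an oops
    regs["ip"] = fixup    # the CPU resumes in the fixup code
    return True


regs = {"ip": 0x1040}
print(fixup_exception(regs), hex(regs["ip"]))
```

An instruction without an entry in the table leaves the registers untouched and the function returns False, which corresponds to the oops path.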

So, how to hook into the above mechanism and allow a completely different process to write data into userspace? Quite trivially: the above fixup (VMA searching and allocation of 3-letter abbreviations) happens for a particular mm_struct, which contains the VMA list, the page table lock and other (likely very) essential information for memory management. This structure is obtained from the current thread executing on the CPU, so by replacing the mm_struct in our kernel thread with the userspace thread's one, we can safely copy data to and from userspace. There is a race of course, when the userspace thread wants to access its own mm_struct (copied to the kernel thread), for example by calling mmap() or copy_*_user() from the kernel, so we have to be careful and properly guard against that.

Example code which copies to userspace from a kernel thread can be found in the archive. Just replace the kernel path in the Makefile with your own, call make and insert the module.
Each read from the /dev/tcopy file will end up with a copy of data from kernel to userspace in a dedicated kernel thread.

/devel/other :: Link / Comments (2)


Mon, 14 Apr 2008

A hypocrisy.

When a user files a bug, the developer is supposed to fix it. That is obvious and of course true.

But interesting things start showing in the details.
If a user pisses a developer off, it is ok. If the developer throws something back - it is bad.
If a user does not answer, it is ok. If a developer keeps silent - he is a bastard.
If a user files a bug, it is ok. If a developer asks the user for some help - the developer is a fucking monster.

Yes, there are real jerks in the development community as well as among users, and take the simple numbers: the user community is much bigger than the development one, so the number of crappy people scales as well. And nevertheless, people like to blame developers and pray to users. This comes down to the absurd, when a developer asks for help and is then blamed for not devoting time to solving the problem.

People like to look at others. I like to look at others too of course. And we frequently like to forget that we behave exactly like those whom we blame for being jerks. Exactly like them. We just forget that, or do not pay attention, or do not want to think about it, since when things come to us, this becomes a hypocrisy.

/devel/other :: Link / Comments (1)


Thu, 10 Apr 2008

get_user_pages() scalability.

Just found an article at LWN about get_user_pages(). The main problem turned out to be locking between multiple threads...

Out of curiosity, was this scalability problem fixed? (For the busy reader: this is my more than 2 years old test of get_user_pages() performance with a single thread, run to find bottlenecks in kevent AIO.)

Here is a graph (performance vs. number of pages):
get_user_page() scalability

/devel/other :: Link / Comments (0)


Thu, 03 Apr 2008

Coding style stupid talks.

Yet another one...

Blah-blah-blah, I like spaces, blah-blah-blah, I do not like spaces...

Here are just two examples (one from the thread), decide yourself, which is easier to read:

Becauseitmoreeasilyallowsyoureyestoseethedifferentoperators.
B e c a u s e i t m o r e e a s i l y a l l o w s y o u r e y e s t o s e e t h e d i f f e r e n t o p e r a t o r s .
The same applies to more common:
for (i=0; i<10; ++i) vs 
for (i = 0; i < 10; ++i)
The latter just wastes lots of space and forces eyes to move out of orbits.
That is my own opinion; obviously the more people are involved, the more opinions strike.

So, never kick someone when he is on the edge by forcing him to change simple stuff in coding style: he can return and kick you back when you are on your own edge...

Ugh, and forgot likely the favourite one:
for (i=0; i<10; ++i) vs 
for (i=0; i<10; i++)
Update: Oh holy crap: I recall people comparing their uptimes to show which dick is longer who is more cool, but comparing the number of whitespaces-instead-of-tabs errors per subsystem is a real winner of the modern cruel reality! Hope you have a sense of humor, let's convert the number of errors per 1000 lines of code into length (100*kloc/errors):
kernel/ maintainer has this big: ===========D
arch/alpha maintainer has this big: =D
arch/arm maintainer has this big: ==D
arch/avr32 maintainer has this big: ============D
arch/blackfin maintainer has this big: ===================================D
arch/cris maintainer has this big: =D
arch/frv maintainer has this big: ====D
arch/h8300 maintainer has this big: =D
arch/ia64 maintainer has this big: ==D
arch/m32r maintainer has this big: ====D
arch/m68k maintainer has this big: ==D
arch/m68knommu maintainer has this big: =====D
arch/mips maintainer has this big: ====D
arch/parisc maintainer has this big: D
arch/powerpc maintainer has this big: ==D
arch/ppc maintainer has this big: =D
arch/s390 maintainer has this big: =D
arch/sh maintainer has this big: ====D
arch/sparc maintainer has this big: ==D
arch/sparc64 maintainer has this big: ===D
arch/um maintainer has this big: ==D
arch/v850 maintainer has this big: ===D
arch/x86 maintainer has this big: =D
arch/xtensa maintainer has this big: ==D
And couple of my projects:
fs/pohmelfs maintainer has this big: =======D
drivers/block/dst/ maintainer has this big: ============D
drivers/connector maintainer has this big: ===D
drivers/w1 maintainer has this big: =======D
Not bad, will put it near the mirror...

/devel/other :: Link / Comments (8)


Tue, 01 Apr 2008

I believe Firefox as is can pass Turing test.

It is real artificial life on my desktop:

gettimeofday({1207056215, 592745}, NULL) = 0
gettimeofday({1207056215, 592792}, NULL) = 0
gettimeofday({1207056215, 592858}, NULL) = 0
gettimeofday({1207056215, 592909}, NULL) = 0
gettimeofday({1207056215, 592957}, NULL) = 0
gettimeofday({1207056215, 593005}, NULL) = 0
gettimeofday({1207056215, 593064}, NULL) = 0
gettimeofday({1207056215, 593139}, NULL) = 0
gettimeofday({1207056215, 593237}, NULL) = 0
gettimeofday({1207056215, 593292}, NULL) = 0
gettimeofday({1207056215, 593346}, NULL) = 0
gettimeofday({1207056215, 593382}, NULL) = 0
gettimeofday({1207056215, 593431}, NULL) = 0
gettimeofday({1207056215, 593491}, NULL) = 0
gettimeofday({1207056215, 593541}, NULL) = 0
gettimeofday({1207056215, 593589}, NULL) = 0
gettimeofday({1207056215, 593638}, NULL) = 0
gettimeofday({1207056215, 593696}, NULL) = 0
gettimeofday({1207056215, 593762}, NULL) = 0
gettimeofday({1207056215, 593843}, NULL) = 0
gettimeofday({1207056215, 593897}, NULL) = 0
gettimeofday({1207056215, 593951}, NULL) = 0
gettimeofday({1207056215, 593987}, NULL) = 0
gettimeofday({1207056215, 594034}, NULL) = 0
gettimeofday({1207056215, 594093}, NULL) = 0
Suddenly it started to eat my CPU by getting the time every 50 µs or so... I can not say why it is needed, except as some sign of an AI calibrating its ion cannon. Fortunately it was killed before any damage (except for the screaming cooler on the processor) was done.
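
For the curious, the spacing between the calls can be checked directly from the trace. The second field of the structure printed by strace is microseconds, so a small Python sketch like the one below (only the first five lines of the log above are pasted in; the function names are mine) shows gaps of roughly 50 microseconds.

```python
import re

# First five lines of the strace output above.
TRACE = """\
gettimeofday({1207056215, 592745}, NULL) = 0
gettimeofday({1207056215, 592792}, NULL) = 0
gettimeofday({1207056215, 592858}, NULL) = 0
gettimeofday({1207056215, 592909}, NULL) = 0
gettimeofday({1207056215, 592957}, NULL) = 0
"""


def call_times_us(trace):
    """Return the absolute time of every call in microseconds."""
    times = []
    for sec, usec in re.findall(r"gettimeofday\(\{(\d+), (\d+)\}", trace):
        times.append(int(sec) * 1_000_000 + int(usec))
    return times


def intervals_us(trace):
    """Return the gaps between consecutive calls in microseconds."""
    times = call_times_us(trace)
    return [later - earlier for earlier, later in zip(times, times[1:])]


gaps = intervals_us(TRACE)
print(gaps, sum(gaps) / len(gaps))
```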

/devel/other :: Link / Comments (4)


Wed, 19 Mar 2008

I have very bad karma: hardware specification of the testing machines.

3 Intel E7520 systems, each one has two 3 GHz Xeon CPUs with HT enabled and EDAC bits, 4 Gb of RAM, Adaptec AIC7902 Ultra320 SCSI adapter. Disks: FUJITSU MAU3036NC 15k rpm 32 Gb system disk (will also be used in testing), two of them will be installed in mirror later, and SEAGATE ST3300007LC 10k rpm 300 Gb testing disk.
The former has about 90 MB/s linear read speed, the latter - 75 MB/s.
About 5 minutes to fully compile and link a loadable kernel.
Pretty neat machines, and I have managed to lose three system disks already, doesn't that say something about my bad karma? Without any load, without kernel changes, without anything... Is it because they are called devfs[123] and are thus struck by problems like that old virtual filesystem, which eventually died a tortured death?

Waiting again... Since one machine is still alive, will start filesystem contest tomorrow, development will be a bit postponed.

/devel/other :: Link / Comments (1)


Sat, 15 Mar 2008

Linux sucks? I believe I already told that.

Sorry, but:

# mount /dev/dvd /mnt
...^C
# dmesg | tail
[  853.189807] sr 1:0:0:0: [sr0] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
[  853.189822] sr 1:0:0:0: [sr0] Sense Key : Medium Error [current] 
[  853.189832] sr 1:0:0:0: [sr0] Add. Sense: No seek complete
[  853.189843] end_request: I/O error, dev sr0, sector 9180408
[  853.189852] Buffer I/O error on device sr0, logical block 1147551
...

# dd if=/dev/dvd of=/tmp/data bs=1M
# mount -o loop /tmp/data /mnt
# ls /mnt
 Doctor House
So I can not mount the dvd via mount, but can do the same after a sequential read of the dvd into a file. This seek error looks like a problem with the hardware _or_ the linux driver. I know hardware sucks, but if a sequential read works, I can not understand why any other does not...

Sigh, it is the 21st century out there iirc...
The main problem is that everything else sucks even more. And everyone is guilty, for example there is a bug in the hifn 795x hardware crypto accelerator driver I wrote, which is in the mainline, or two in the DST project, or anywhere else. I wish the world were perfect...

/devel/other :: Link / Comments (3)


Fri, 29 Feb 2008

Richard Stallman in Moscow.

Here is a schedule, he will present a lecture called "Free Software in Ethics and Practice" March 4 in MIPT where I studied (already long time ago?).

I will not visit though :)

/devel/other :: Link / Comments (1)


Tue, 26 Feb 2008

Traffic jam simulator and some math analysis.

Trying to do at least something while being fscking sick.

I've created a simple traffic simulator, which contains a variable number of cars and lights. Each car can be programmed with a different acceleration, maximum allowed speed, stop and deceleration distances. Each light can be programmed to switch after a different interval. There are only two lights in real life, red and green: (at least in Moscow, very unfortunately) no one ever cares about yellow, lots of drivers specially accelerate when they see yellow... So, only two colors.

Since I did not bother to implement a nice config for each car and light, there is only a single set of parameters, but command line parameters allow varying the initial number of cars and the distance between them, the number of time frames before the lights change state or a new car enters the road, and the number of lights and the distance between them.



There are two known problems with the lights on the road. First, bad drivers who do not maintain a big enough buffer, so they have to wait until the car in front of them moves far enough before they can start, which takes some time out of the limited time frame of the green light. If the buffer is large enough, drivers can start simultaneously and thus move much faster.
One can simulate this behaviour with a variable initial number of cars and different distances between them. If the distance is less than the stop distance (i.e. the distance at which a driver has to stop the car, it is 4 in the current setup), then the driver will have to wait until the distance becomes greater than or equal to the stop distance; if the driver stopped farther than the stop distance (let's say 5 'meters'), then the car can start simultaneously with the head. The latter approach allows moving more cars through the light during a fixed time frame, but psychologically it looks better to stay as close as possible to the head car, which introduces a latency, since we have to wait until the head car moves far enough before we can start. This leads to a negative exponential speed increase for each car behind, instead of the linear one drivers would get if they maintained the buffer. The appropriate equations are quite simple: the difference of the distance moved in a single time frame is proportional to the acceleration of the head car, which in turn is proportional to its coordinate, so we have a simple differential equation whose solution is a negative exponential. One can read a bit more here.
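
The effect of the buffer can be reproduced with a few lines of Python. This is not the code of the traffic program, just a minimal sketch of the rule described above: the stop distance of 4 and the gap of 5 come from the text, while the acceleration, the maximum speed, the number of cars and the length of the green light are my own arbitrary assumptions.

```python
def cars_through_light(n_cars, gap, green_frames,
                       stop_distance=4.0, accel=1.0, v_max=10.0):
    """Count cars that pass the light (at x = 0) while it stays green."""
    # The head car waits at the light, the others queue behind it.
    pos = [-i * gap for i in range(n_cars)]
    vel = [0.0] * n_cars
    for _ in range(green_frames):
        prev = list(pos)  # drivers react to the previous time frame
        for i in range(n_cars):
            room = i == 0 or prev[i - 1] - prev[i] >= stop_distance
            # With enough room the car accelerates, otherwise it stands still.
            vel[i] = min(v_max, vel[i] + accel) if room else 0.0
            pos[i] += vel[i]
    return sum(1 for x in pos if x > 0)


# Same queue of 40 cars and the same green light of 20 time frames.
buffered = cars_through_light(40, gap=5.0, green_frames=20)
tight = cars_through_light(40, gap=2.0, green_frames=20)
print(buffered, tight)
```

With the gap of 5 all cars start together, while with the gap of 2 each car has to wait a couple of frames for the one ahead of it, so far fewer of them cross the light before it switches back to red.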



The second problem is the light interval. If the interval is too short, then cars can not start, only a couple of them move forward; and if it is too long, then during the red light a large backlog of cars can accumulate, and it will not be cleared during the green light because of the above problem: each car has to wait until the head one moves to some distance. The latter is actually worse, since the backlog can become so huge that it will not be cleared at all, which will lead to a complete stall of the traffic flow (at the back side; the front will move, but the number of cars arriving at the tail will be bigger than the number of cars which leave the traffic jam).



One can play with the program, called traffic. It requires the gtk2 devel package installed. The homepage contains essentially the same text and a link to the source code. It also shows a usage example.

Enjoy!

/devel/other :: Link / Comments (7)


Tue, 19 Feb 2008

Fedora sucks. It is not even remotely designed for smaller than high-end systems.

At least yum developers do not know that there are systems with less than 1 Gb of RAM. And it is not even about how slow yum is. Nor about the fact that to install a 30 kb application yum will download a 3.5 Mb sqlite database file.
It is about yum programming bugs:

error: Couldn't fork %post: Cannot allocate memory
  Updating  : libebml                      ################### [  68/1218] 
error: Couldn't fork %post: Cannot allocate memory
  Updating  : xorg-x11-server-utils        ################### [  69/1218] 
  Updating  : fribidi                      ################### [  70/1218] 
error: Couldn't fork %post: Cannot allocate memory
  Updating  : lame-libs                    ################### [  71/1218] 
error: Couldn't fork %post: Cannot allocate memory
error: Couldn't fork %pre: Cannot allocate memory
error:   install: %pre scriptlet failed (2), skipping tk-8.4.17-2.fc8
  Updating  : libdvdnav                    ################### [  73/1218] 
error: Couldn't fork %post: Cannot allocate memory
*** glibc detected *** /usr/bin/python: corrupted double-linked list: 0x15cdde58 ***
As you might expect, all 70+ packages above also got 'Cannot allocate memory' error. My laptop has 256 Mb of RAM and 512 Mb of swap, more than a half was free.
After trying to start the same process again, after some applications were killed to get free memory, yum refused to install packages because of broken dependencies...

For example for xorg-x11-server-utils I have:
xorg-x11-server-utils-7.3-2.fc8
xorg-x11-server-utils-7.2-1.fc7
xorg-x11-server-utils-7.3-1.fc8
But libebml has one FC6 version. For the record, FC6 was never installed on this laptop:
libebml-0.7.7-2.fc6
libebml-0.7.7-3.fc8
Fedora Core also turns an FC9 stopper bug into a needinfo one without a single patch/version to test (at least I did not receive any such mail), although it was opened with a perfect description, with a probable buffer overflow somewhere in the image processing/rendering code, with a 100% reproducible example and an image to test with, and even after another person reported the same problem on rawhide (and marked it as an fc9 stopper).
How in the hell do you expect to get some info after two months of silence from developers? (one month after the bug was confirmed in rawhide) Some people still believe in miracles...

I would like to test it right now, but I cannot because of the yum problems... The old packages, as you might expect, still have that bug somewhere.

The world is far from perfect :)

I will not turn the laptop off or suspend it; I do not believe it will work after that. Instead I will wait until I am able to get a new DVD with some other distro. And it will not be Debian either.

/devel/other :: Link / Comments (3)


Sat, 16 Feb 2008

How to measure body temperature under very limited conditions.

Let's suppose one does not have a thermometer, but there are lots of instruments and equipment around, from screwdrivers to drills, from a simple ammeter/voltmeter to a laptop. As a hint: there is also electricity, water and an automatic teakettle.
The task is to measure your own body temperature and decide whether or not to take an aspirin. Or at least to have some fun in the process, since being sick is quite boring.

The solution is pretty geeky, but first try to think about it yourself.


So, the solution.
It is based on the fact that when the human body, or a part of it, is placed into an environment with essentially the same temperature but a much bigger thermal capacity, it does not feel any difference. Try taking a shower at about 36 degrees Centigrade, and you will feel neither cold nor hot. Things are different when the air outside is above 30 degrees Centigrade; that is because the thermal capacities of water (huge) and air (very small) differ so much.

So, back to the task. To determine your temperature you have to pour a precise volume of water into the teakettle (let's say 1 liter; I could measure it because I have water meters), connect the teakettle to the mains via the ammeter, and measure the voltage with the voltmeter.
Then you have to put your hand into the teakettle, turn it on (beware of the heating element) and note the first time. When your hand feels very comfortable (this is the main source of error), note the second time. Then remove your hand, wait until the water boils and note the third time.

Now it is time for school physics: the power of the teakettle, which is the product of current and voltage, multiplied by the time difference, is equal to the mass of the water multiplied by its thermal capacity and by the temperature change over that time frame.

So, here are practical results:
current strength I = 3.7 A
voltage V = 231 V
mass of the water m = 1 kg
thermal capacity of the water c = 4200 J/(kg*degree)
time difference for complete boiling (from unknown temperature to 100 degrees Centigrade) dt0 = 420 seconds
temperature difference dT can be found from following equation:
I*V*dt0 = m*c*dT
So we have dT = 100 (boiling temperature) - T0 (initial temperature of the water) = I*V*dt0/(m*c), which is about 85 degrees Centigrade, so the initial temperature of the water was about 15 degrees Centigrade.

The time difference between the start of the process and the comfortable temperature was about 30 seconds, so putting this time frame into the above equation we find that the temperature changed by about 6 degrees.

Since we already found that the initial temperature was about 15 degrees Centigrade, the calculated temperature of my body is about 21 degrees Centigrade.
It's time to go back to the grave...
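For anyone who wants to check the arithmetic, here is a small sketch of the same calculation. It only plugs the measured values from this post into I*V*dt = m*c*dT and, like the estimate itself, ignores every heat loss (the kettle body, the hand in the water, the air). The function and variable names are made up for this example.

```python
def temperature_rise(current_a, voltage_v, seconds, mass_kg, heat_capacity):
    """Temperature rise in degrees from I*V*dt = m*c*dT, ignoring all heat losses."""
    return current_a * voltage_v * seconds / (mass_kg * heat_capacity)


I = 3.7             # current, A
V = 231.0           # voltage, V
m = 1.0             # mass of the water, kg
c = 4200.0          # thermal capacity of water, J/(kg*degree)
dt_boil = 420.0     # seconds from switch-on to boiling
dt_comfort = 30.0   # seconds from switch-on to the "comfortable" moment

rise_to_boiling = temperature_rise(I, V, dt_boil, m, c)
initial_temperature = 100.0 - rise_to_boiling
rise_to_comfort = temperature_rise(I, V, dt_comfort, m, c)
body_temperature = initial_temperature + rise_to_comfort

print(f"rise to boiling:     {rise_to_boiling:.1f} degrees")
print(f"initial temperature: {initial_temperature:.1f} degrees")
print(f"rise to comfort:     {rise_to_comfort:.1f} degrees")
print(f"body temperature:    {body_temperature:.1f} degrees")
```

The unrounded numbers land within half a degree of the rounded ones used above, so the rounding is not what makes the result so low.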

P.S. Yes, I'm a former loser-physicist, that's why I became a kernel hacker. This can explain a lot...

/devel/other :: Link / Comments (6)


Mon, 11 Feb 2008

kernelplanet.org

Some good person added my RSS feed to kernelplanet.org, which is a kernel hackers' place of shame glory :)
Well, whoever did that probably saw that I frequently write quite a lot of notes per day, that I have no political/hacker/whatever ethics in the blog, that I make too many English errors (especially when I have no access to a dictionary) and so on. I hope it is not that bad.

So, a couple of words about what it is. This blog is fully devoted to how I spend my days: hacking, resting, sleeping and going to the toilet...
The blog has comments (with a somewhat user-unfriendly captcha), and their number can be found at the end of each entry. When a new comment is added, the entry is updated, so stream-based aggregators will see it as a new one. Usually there are 1-3 entries per day, sometimes more, sometimes none.

That's it. Stay tuned.

/devel/other :: Link / Comments (2)


Tue, 05 Feb 2008

Selecting a computer language for a new project.

assert youKnowWhatYouReallyWant == true;
if (iAmWritingForPersonalUseOnly()) {
    if (iWantAReallyNewParadigm()) { // actually you'll get some irreversible brain damage.
        try {
            return "Huskell"; // dude, I really mean the DAMAGE!
        } catch(ECriticalBrainFailure e) {
            if (preferDotNetWorld()){
                return "F#"; // it's the same as Gb, ain't it?
            } else if (processorCount() >= OH_SO_MANY) {
                return "Erlang"; // start thinking in 1000 threads
            } else if (preferPunctuation() == STRONGLY){
                try {
                    return "J"; // APL needed a transliteration -- and got it
                } catch (EBrainOverload e) {
                    return "K"; // better have a bank hire you soon!
                }
            } else {
                throw new RethinkParadigmException();
                // you should have better selected Haskeel before
            }
        }
    } else {
        if (isDynamicTypingOk()) { // hey, everyone wanna be a cool geek today.
            if (cannotLiveWithoutCurlyBraces()) { // well, who can ?!
                return "Ruby"; // it's Python done better.
            } else if (enjoyIndentation()){
                return "Python"; // it's Ruby done right.
            } else if (shizophrenia->isOK()){
                return "Perl"; // all the expressiveness and imprecision of a human language.
            } else if (sourceCodeConceptIsObsolete()){
                return "Smalltalk"; // ever modified the value of True -- on a live system?
            } else {
                throw new LameException("PHP5"); // stick with this, los^W poor dude
            }

        } else { // static typing obviously
            if (isManagedOk()) { // let PC do some job for me, they are so smart nowadays.
                                 // Sick of doing everything myself.
                if (preferJavaWorld()) { // die, MS, die!!!
                    return "Scala"; // huge, really huge. Must be inspired by Noah Arc.
                } else if (preferDotNetWorld()) { // stuck on Windows, ha?
                    return "Nemerle"; // kazalos' by... oh, not again...
                } else {
                    throw new IsThereReallyAnythingElseException();
                }
	    
            // computers will eliminate the humankind if they get enough control.
            } else if (unmanagedOnly()) {
                return "D"; // get a whole new language with every new release. Great fun.
            } else {
                throw new YouWantSomethingStrangeHereException();
            }
        }
    }
} else {
    return "Do Whatever Your Boss Says To And Keep Your Mouth Shut Programming Language";
}
I only know a bit of C; some time ago I tried Java and knew what C++ was... I think I'm living outside of this new and shiny world of programming, and that's cool.

/devel/other :: Link / Comments (6)


Mon, 04 Feb 2008

Linux.Conf.Au 2008 presentations available.

Check them out!

And taking the CRFS presentation (slides, ogg, SPX (what is that?)), we have:

http://oss.oracle.com/projects/crfs/

$ wc -l lk/*.[ch] | tail -1
 7335 total
$ wc -l crfsd/*.[ch] | tail -1
 5971 total
But the world is so cruel:
Not Found

The requested URL /projects/crfs/ was not found on this server.
Likely it is still the weekend in the USA.

/devel/other :: Link / Comments (1)


Fri, 18 Jan 2008

Fedora upgrade sucks.

Since I got a fast internet connection, I decided to upgrade Fedora Core 7, installed on my laptop, to its next version via yum. The machine has 256 MB of RAM and 512 MB of swap, so I expected there would be no problems, but I was wrong: it ended up in an OOM condition and yum got stuck, so I killed it (with SIGKILL, since it did not respond to anything else). Subsequent runs end up with a transaction check error where some packages (likely the just-installed ones) conflict with other ones (probably the old versions). I expect that the machine will be unusable after suspend/resume or reboot, and there is no way to roll back the installed packages...
Although I have started the downgrade process, it is about 3 o'clock in Moscow, so I will go to bed; hopefully things will be resolved by morning.

Another serious design problem of the whole yum system is its dependency tracking. It requires downloading a 3-5 MB sqlite database almost every time one wants to install any single package, even one that has no dependencies to resolve or is only several kilobytes in size.

/devel/other :: Link / Comments (0)


Thu, 10 Jan 2008

How to fix the Debian upgrade process failing with the "A non-dpkg owned copy of the libc6-i686 package was found." error.

I checked the preinst script in Debian's libc6_2.7-5_i386.deb and found that the above error only occurs if either the /lib/tls/i686/cmov/libc.so.6 or the /lib/i686/cmov/libc.so.6 file exists. My system had the former, which was a symlink to libc-2.3.6.so. I removed that link, and the upgrade from etch to testing completed successfully.
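Before removing anything, it may help to see which of the two files is present and where it points. The sketch below is not the preinst logic itself; it only looks for the two paths named above and reports symlink targets. The helper `find_stray_libc` and its `root` parameter are invented for this example (the parameter just makes it possible to try the check against a scratch directory).

```python
import os

# The two paths named in the post as triggering the preinst error.
SUSPECT_PATHS = (
    "lib/tls/i686/cmov/libc.so.6",
    "lib/i686/cmov/libc.so.6",
)


def find_stray_libc(root="/"):
    """Return (path, symlink target or None) for each suspect path that exists."""
    found = []
    for relative in SUSPECT_PATHS:
        path = os.path.join(root, relative)
        if os.path.lexists(path):  # lexists() also reports dangling symlinks
            target = os.readlink(path) if os.path.islink(path) else None
            found.append((path, target))
    return found


for path, target in find_stray_libc():
    if target is None:
        print(f"{path} exists (regular file)")
    else:
        print(f"{path} -> {target}")
```

On a system without either file the loop prints nothing.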

I performed the above steps on two different machines, one of which runs my own 2.6.23 kernel and the other Debian's 2.6.18 one. The former booted successfully and the latter did not, so take that into account: it looks like some kernel changes (from 2.6.18 to 2.6.22 in Debian testing) resulted in an unbootable machine.

/devel/other :: Link / Comments (4)


Why don't I like Debian.

Because it breaks my dreams.
Actually only a single dream: a dream about perfect life.
I always wanted to believe that Debian is able to perform an easy upgrade between major versions, since it has the much hated/loved stable/testing/unstable split. I know that Fedora, SuSE and others cannot perform a major leap between versions using only command line tools. Sometimes they can (especially Fedora on x86), but Debian (in my dreams) has to do that always.

And it has just fucked my sweet dream:

Do you want to upgrade glibc now? [Y/n] 


A non-dpkg owned copy of the libc6-i686 package was found.
It is not safe to upgrade the C library in this situation;
please remove that copy of the C library and try again.
dpkg: error processing /var/cache/apt/archives/libc6_2.7-5_i386.deb (--install):
 subprocess pre-installation script returned error exit status 1
What in the hell does it mean? How in the hell is this possible? I do not know, but as of today I hate Debian.
I tried a number of things to cure the situation, but failed. I'm pretty sure there is a chance that my hands are connected to my ass, and I only think and believe that they are connected to my shoulders.
After about 3-4 hours of this crap I eventually removed the libc6 package from my installation, and immediately everything stopped working:
# ls
bash: /bin/ls: No such file or directory
The only reason to break a seemingly cool Debian Etch installation was its too old glibc (libc6) package, which does not contain openat() and its friends, syscalls that are extensively used in the pohmelfs userspace server.

And here is the reason. I do not know how correct this change is, but it does not allow upgrading glibc (and, more generally, performing a dist-upgrade) from etch to testing in my setup.
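For readers who have not met openat(): it opens a file by a name that is resolved relative to an already open directory descriptor instead of the current working directory. The sketch below shows the idea from Python, where passing `dir_fd` to `os.open()` has this effect on Linux. It has nothing to do with the actual pohmelfs server code; the helper name and the scratch file are made up for this example.

```python
import os
import tempfile


def read_relative(directory, name):
    """Read a file whose name is resolved relative to an open directory descriptor."""
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        # With dir_fd set, `name` is looked up relative to that directory,
        # not relative to the current working directory.
        fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
        with os.fdopen(fd, "rb") as handle:
            return handle.read()
    finally:
        os.close(dir_fd)


# Scratch directory and file, created only for this demonstration.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "hello.txt"), "wb") as handle:
    handle.write(b"hello")

content = read_relative(workdir, "hello.txt")
print(content)
```

The directory descriptor keeps pointing at the same directory even if that directory is renamed in the meantime, which is handy for a server that works on many files under one tree.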

/devel/other :: Link / Comments (6)


My testing environment.

Just like the good old days: several machines with 256 MB of RAM and a 1-3 MB/sec connection to and between them. Things are not that bad: there are several Xeon (E5345) machines around with InfiniBand cards and several GB of RAM, but they require setup, installation and so on. So right now it is enough to have smaller systems, which take about 30 minutes to compile a small kernel and 4 minutes to untar it. I am in no hurry.

/devel/other :: Link / Comments (0)