Thu, 31 Jan 2008
BTRFS subvolumes.
"If Btrfs were to rely on device mapper or MD for mirroring, it would not be able to resolve checksum failures by checking the mirrored copy. The lower layers don't know the checksum or granularity of the filesystem blocks, and so they are not able to verify the data they return."

Well, that's not entirely correct: a checksum has to be verified not against the other mirror but against the data itself (i.e. it has to be recalculated after the read), since data can be damaged during transfer, and that is not such a rare condition. Checksums from different mirrors can both be wrong yet equal, so comparing them without recalculating would signal that everything is OK when it is not. For smaller blocks, recalculating the block checksum can also be faster than reading it from the other disk.

"If Btrfs were to rely on device mapper for aggregating all of the physical devices into a single big address space, it would not have sufficient information to allocate mirrored copies on different devices. Keeping this information in sync between Btrfs and the device mapper would be difficult and error prone."

Actually it is very simple. DST supports such interaction, for example.

Instead I propose, and will use, the following scheme for subvolumes (I like the name) in a local filesystem: there is a pool of devices, and there are allocation policies for each one in the following form (just an example): files matching the '*.jpg' pattern are allocated from device 1, '*.log' from device 2, metadata is stored on device 3, small files are allocated on device 4, and so on. Each device then has its own policy for mirroring its data to the needed number of storages.

And, a side note: it looks like Chris Mason uses Mac OS X for development, or at least for writing documentation, since a screenshot of the high-level design clearly has Mac's shadows and fonts :)
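
The allocation scheme for subvolumes described above can be sketched roughly in Python. This is only an illustration under assumptions of mine, not DST or Btrfs code: the names `Device`, `Rule`, `build_policy` and `pick_device`, the 4096-byte "small file" threshold, the mirror counts, and the first-match-wins ordering of rules are all hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Callable, List, Optional


@dataclass
class Device:
    name: str
    mirrors: int  # hypothetical: number of storages this device's data is mirrored to


@dataclass
class Rule:
    description: str
    # Predicate over (file name, size in bytes, is_metadata).
    matches: Callable[[str, int, bool], bool]
    device: Device


SMALL_FILE_LIMIT = 4096  # assumed threshold for "small files", in bytes


def build_policy() -> List[Rule]:
    """Build the example policy from the post: one rule per device in the pool."""
    dev1 = Device("device1", mirrors=1)
    dev2 = Device("device2", mirrors=2)
    dev3 = Device("device3", mirrors=3)
    dev4 = Device("device4", mirrors=1)
    return [
        Rule("metadata", lambda name, size, meta: meta, dev3),
        Rule("*.jpg", lambda name, size, meta: fnmatchcase(name, "*.jpg"), dev1),
        Rule("*.log", lambda name, size, meta: fnmatchcase(name, "*.log"), dev2),
        Rule("small files", lambda name, size, meta: size < SMALL_FILE_LIMIT, dev4),
    ]


def pick_device(rules: List[Rule], name: str, size: int,
                is_metadata: bool = False) -> Optional[Device]:
    """Return the device of the first matching rule, or None if nothing matches."""
    for rule in rules:
        if rule.matches(name, size, is_metadata):
            return rule.device
    return None


if __name__ == "__main__":
    rules = build_policy()
    for name, size in [("photo.jpg", 2_000_000), ("sys.log", 100), ("note.txt", 100)]:
        dev = pick_device(rules, name, size)
        print(name, "->", dev.name, "mirrored to", dev.mirrors, "storage(s)")
```

Because the filesystem itself picks the device, it also knows each device's mirroring policy at allocation time, so there is no separate mapping to keep in sync with a lower layer.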