Zbr's days.

Sat, 27 Oct 2007

DST merging plans.

Andrew Morton asked about the status of the distributed storage (DST) and noted that he actually sees no reason not to start thinking about merging it.
Although he is concerned about the quite active development he sees via my blog, DST itself is essentially complete.
It will likely need some additional features once I start distributed filesystem development, but right now it does not. I may add optional strong checksumming of the transferred data, though.
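
To illustrate what optional strong checksumming of transferred data could look like, here is a minimal sketch in Python. It is not DST code and does not reflect DST's wire format: the `seal` and `unseal` helpers are hypothetical names, and the choice of SHA-256 as the "strong" checksum is an assumption made only for the example.

```python
import hashlib

# Assumption: SHA-256 stands in for the "strong" checksum; DST may use something else.
DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def seal(payload: bytes) -> bytes:
    """Hypothetical sender side: append a SHA-256 digest to the payload."""
    return payload + hashlib.sha256(payload).digest()


def unseal(frame: bytes) -> bytes:
    """Hypothetical receiver side: verify and strip the digest.

    Raises ValueError if the frame is too short or the digest does not match.
    """
    if len(frame) < DIGEST_SIZE:
        raise ValueError("frame too short")
    payload, digest = frame[:-DIGEST_SIZE], frame[-DIGEST_SIZE:]
    if hashlib.sha256(payload).digest() != digest:
        raise ValueError("checksum mismatch")
    return payload
```

A receiver would call `unseal` on every block and treat a `ValueError` as a reason to request the block again. Making the check optional, as proposed above, lets a setup that already runs over an authenticated tunnel skip the extra hashing cost.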

/devel/dst :: Link / Comments (7)

mnp wrote at 2007-10-27 16:02:

Actually, Andrew said there are NO reasons not to think about merging: "I'm not seeing any reason why we shouldn't start thinking about merging this"

:)

Zbr wrote at 2007-10-27 16:14:

:) That is what I meant, but made a typo. Fixed, thanks!

mnp wrote at 2007-10-27 22:45:

I know it was a typo, that's why I was like.. :)

otherwise I'd be like.. :<

:)

take care!!

anonymous wrote at 2007-10-28 02:16:

Could it happen (I *really* don't know if it could) that a broken Ethernet card (or heavy load with a 'bad' driver, or I don't know, just something :) sends crap to the other node, and our distributed idea sinks like the Titanic without us even knowing? If that could happen (seriously, I don't know :) and adding "optional strong checksumming of the transferred data" could detect it, I think it would be a really nice feature.

Zbr wrote at 2007-10-28 12:16:

Yes, this is possible, but no one cares about such problems in real life (although I may be wrong): the TCP checksum is usually enough to protect against real crap, and errors that produce the same checksum as the good data are the exception rather than the rule. Even in that case the problem will be detected before any damage is done: for one TCP packet with the same checksum and wrong data there will be thousands of packets with both wrong data and a wrong checksum, and those will be detected.

One can also use IPsec (only AH checksum for example) to protect against such problems.
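
For readers who want to see what the TCP checksum does and does not catch, the following Python sketch computes the 16-bit ones' complement checksum described in RFC 1071 over a small buffer. The buffer contents and the variable names are made up for the illustration. A single flipped bit changes the checksum, while exchanging two 16-bit words does not, because the sum does not depend on word order. CRC-32 (via `zlib.crc32`) is included only as a comparison point.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Illustrative 16-byte payload (an assumption, not real traffic).
good = bytes.fromhex("12ac32be") * 4

# Corruption 1: a single flipped bit in one byte.
flipped = bytearray(good)
flipped[5] ^= 0x04
flipped = bytes(flipped)

# Corruption 2: the first two 16-bit words exchanged.
swapped = good[2:4] + good[0:2] + good[4:]

print("bit flip detected by checksum:", internet_checksum(good) != internet_checksum(flipped))
print("word swap detected by checksum:", internet_checksum(good) != internet_checksum(swapped))
print("word swap detected by CRC-32:  ", zlib.crc32(good) != zlib.crc32(swapped))
```

The word swap is a contrived case, but it shows the class of error that passes unnoticed: any corruption that leaves the ones' complement sum of the 16-bit words unchanged.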

murble wrote at 2007-10-28 17:34:

I have found that this does actually happen, with corrupted data surviving TCP checksums. It happened to me.

I noticed that backups that went over ssh failed, but when I was transferring several GB of /dev/zero over plain TCP/HTTP I had bits flipped. The TCP checksum is really not very good. Even ping complained with errors like:

wrong data byte #100 should be 0x12 but was 0x3a
#16 12 ac 32 be 12 ac 32 be 12 ac 32 be 12 ac 32 be 12 ac 32 be 12 ac 32 be 12 ac 32 be 12 ac 32 be
#48 12 ac 32 be 12 ac 32 be 12 ac 32 be 12 ac 32 be 12 ac 32 be 12 ac 32 be 12 ac 32 be 12 ac 32 be
#80 12 ac 32 be 12 ac 32 be 12 ac 32 be 12 ac 32 be 12 ac 32 be 3a e8 86 c3 d5 51 10 70 1 eb 54 25
#112 8c 96 2d b3 9f 2a a6 d7 4b 41 ab af c8 8f 5a 4c 3a 51 7b 15 32 3c 67 dc 39 a4 f6 4a 10 85 9e ea
#144 9a 18 e3 87 c7 74 ec 9 26 1a 1a d6 ab 21 bd 83 de d4 f0 38 6f 9a c1 33 e 9 7d 43 75

Sadly, the hardware that failed belonged to an upstream ISP, so it took a long time to convince them that they had a problem! Our short-term fix was to tunnel everything through an OpenVPN link, which then reported messages like "Authenticate/Decrypt packet error: packet HMAC authentication failed" all over the place, but at least it didn't corrupt our data! So using IPsec with AH checksums would be one solution.

Clark wrote at 2007-10-29 02:41:

I think checksumming is very important in real life. Network hardware can do stupid things and end up corrupting data silently. Please seriously consider an option to checksum the data between nodes! :)

