Thu, 04 Sep 2008
Simple exploit for Bernstein/Torek 33 hash.
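A minimal sketch of one way such a search could be done, assuming a plain base-33 digit decomposition of the target value and byte arrays with an explicit length (so zero bytes are allowed). The byte strings printed in the run below coincide with the base-33 digits of each target, but the names `bt33_hash` and `bt33_preimage` are illustrative and are not taken from djb_crack itself.

```c
#include <stddef.h>
#include <stdint.h>

/* The 33 hash over a byte array of known length, modulo 2^32. */
uint32_t bt33_hash(const uint8_t *data, size_t len)
{
    uint32_t h = 0;

    for (size_t i = 0; i < len; i++)
        h = h * 33 + data[i];
    return h;
}

/*
 * Write bytes that hash to `target` into out[] and return their count.
 * The bytes are the base-33 digits of target, most significant first.
 * At most 7 digits are needed because 33^7 > 2^32.
 */
size_t bt33_preimage(uint32_t target, uint8_t out[7])
{
    uint8_t digits[7];
    size_t n = 0;

    do {
        digits[n++] = (uint8_t)(target % 33);   /* least significant digit */
        target /= 33;
    } while (target != 0);

    for (size_t i = 0; i < n; i++)              /* reverse into hashing order */
        out[i] = digits[n - 1 - i];
    return n;
}
```

With this sketch, `bt33_preimage(0x12345678, buf)` fills `buf` with the six bytes 07 1a 11 19 01 0c, and feeding them back to `bt33_hash` returns 0x12345678. Every digit is below 33, so no intermediate value overflows and the result holds without relying on wrap-around.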
$ ./djb_crack 0x12345678 0 0xffffffff 0xabcdef12
07 1a 11 19 01 0c
hash: 12345678, calc: 12345678: MATCH
00
hash: 00000000, calc: 00000000: MATCH
03 0a 18 14 1a 09 03
hash: ffffffff, calc: ffffffff: MATCH
02 07 15 11 00 20 03
hash: abcdef12, calc: abcdef12: MATCH

The exploit takes one or more hash values and searches for data that produces each of them (it prints the data to stdout, as one can see above). Since this hash is so simple, it is actually possible to find matching data by brute force, but that is not interesting.

The exploit cannot work if the smallest allowed byte value is anything other than 0 or 1. We do not know the actual value of the hash, only its value modulo 2^32, so it is possible that a given value cannot be represented as a sum, with fixed multipliers, of the bytes we are allowed to use (just as we cannot represent 13 as a sum of several positive integers if the smallest one is bigger than 6). But it is always possible to represent any value when the smallest allowed byte is zero. With zero allowed, every hash can be matched by an array of at most 7 bytes (33^7 is bigger than 2^32).

I want to think some more about the cases where we know only the result modulo some value (the remainder of dividing the real result by 2^32, for example) and have to find input bytes whose hash matches the required one, while the input bytes are restricted to some set and the smallest allowed byte is not 0 or 1. This can be a tricky task...

/devel/math/hash

Wed, 03 Sep 2008
On multiplier selection for the Bernstein/Torek 33 hash.
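A small sketch of the update step this post analyses, with the multiplications by 33 and by 31 written as shifts. It assumes 32-bit unsigned arithmetic; the function names are illustrative only. Note that in C the `+` and `-` operators bind tighter than `<<`, so the shift has to be parenthesised.

```c
#include <stdint.h>

/* h * 33 written as a shift and an add (arithmetic is modulo 2^32). */
uint32_t mul33(uint32_t h)
{
    return h + (h << 5);
}

/* h * 31 written as a shift and a subtract. */
uint32_t mul31(uint32_t h)
{
    return (h << 5) - h;
}

/* One update step of the 33 hash for a single input byte. */
uint32_t bt33_step(uint32_t h, uint8_t byte)
{
    /* Without the parentheses this would parse as (h + h) << (5 + byte). */
    return h + (h << 5) + byte;
}
```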
    hash = hash * 33 + data[i];

where hash is initially set to zero and data[i] is the i-th byte of the input data. This can also be written as follows:

    hash = hash + (hash << 5) + data[i];

Now, let's take a look at the hash analysis. As we can see, the final hash is a sum of products of powers of 33 and data bytes. Let's split the sum into pairs of neighbouring bytes, as follows (assuming a big enough number of input bytes n):

    hash = (33^(n-2))*(33*a[0] + a[1]) + ... + (33^(n-k-2))*(33*a[k] + a[k+1]) + ...

Now let's check a single pair using the shift form of the multiplication above:

    33*a[k] + a[k+1] = a[k] + (a[k] << 5) + a[k+1]

Using any other multiplier, one which does not result in a[k] + a[k+1], will lead to a worse distribution, since the number of used bits decreases. A particularly bad (if not the worst) multiplier is 31, which leads to the following sum:

    31*a[k] + a[k+1] = (a[k] << 5) - a[k] + a[k+1]

This hash will have too few active bits; in particular, only the difference between neighbouring bytes will play a role in producing the final hash.

Now, recalling the history of the hash, namely that it was first introduced for strings, we can conclude that the 5-bit shift above moves the value by the number of bits needed to make room for a new English ASCII character, i.e. the shift could be bigger to work with higher byte values (so that the non-zero bits fit the new space).

Now, because of the time shift I made for myself for the US embassy interview (awake at 5:30 AM, going to sleep at 1:00 AM), my brain does not allow me to work on big projects, so I will try to create an exploit for this hash, running on the regular several-cups-of-coffee drug. Stay tuned!

/devel/math/hash

Mon, 01 Sep 2008
Some recent hash analysis: Bernstein/Torek famous (hash * 33) hash.
/* Bernstein/Torek hash: multiply by 33 and add the next character. */
unsigned long hash(const char *s)
{
    unsigned long h;

    for (h = 0; *s; s++) {
        h *= 33;
        h += *s;
    }
    return h;
}
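The function above is linear in the input bytes: byte i of n contributes 33^(n-1-i) times its value. A minimal sketch comparing the iterative form with that closed form, assuming 32-bit arithmetic and byte arrays with an explicit length; `bt33_iter` and `bt33_linear` are illustrative names.

```c
#include <stddef.h>
#include <stdint.h>

/* Iterative form, as in the function above but over a byte array. */
uint32_t bt33_iter(const uint8_t *a, size_t n)
{
    uint32_t h = 0;

    for (size_t i = 0; i < n; i++)
        h = h * 33 + a[i];
    return h;
}

/* Closed form: the sum of 33^(n-1-i) * a[i], modulo 2^32. */
uint32_t bt33_linear(const uint8_t *a, size_t n)
{
    uint32_t sum = 0, power = 1;    /* power is 33^0 for the last byte */

    for (size_t i = n; i-- > 0; ) {
        sum += power * a[i];
        power *= 33;
    }
    return sum;
}
```

Because the form is linear, raising one byte by 1 and lowering the next byte by 33 leaves the sum unchanged: the two-byte inputs "aB" and "b!" both hash to 3267.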
This hash appeared in Bernstein's djbdns server quite long ago (although Dr. Bernstein now favours a version with XOR instead of sum), but it looks like it appeared in comp.lang.c, attributed to Chris Torek.

I've spent some time on it to determine how it works. One can check the picture below to get my thoughts. Short details follow.

[Figure: Bernstein/Torek hash analysis.]

In a nutshell, the Bernstein/Torek 33 hash is a linear combination of the input bytes. Each input byte is multiplied by a constant (namely 33 raised to a power equal to the number of bytes minus the position of the input byte minus one), and the products are then summed. One can check the C source code.

It is simple only because all operations (namely sum and multiplication, which is effectively the same operation) are performed in the same structure, arithmetic modulo 2^32, written F(2^32) here. If one adds XOR there (as Dr. Bernstein did in the recent version), the whole approach shifts to a mix of F(2^32) and F(2^1), which is a completely different monster. In the former case, in particular, it is possible to first multiply/sum lots of elements and only then apply the modulo operation, while in the latter, mixed case it is not easily possible (well, I'm searching for group algebra books/articles about operations in mixed fields, so far without much success).

The linear combination of the input bytes allows a very simple way to create an input which has whatever output hash value you want. Actually, I do not believe in all those attacks which say "with our technique we managed to reduce something from X to x". Until there is a working implementation which does break the cipher, hash or anything else in question, it is just words. I do not have breaking code right now (although I believe it is simple), so nothing has been broken, and in fact this can be a completely wrong idea :) But I will develop it to show myself that my basic algebra skills are still valid...

/devel/math/hash

Tue, 19 Aug 2008
Modular arithmetic article needed.
/devel/math

Sun, 13 Jul 2008
Hermite interpolation.
/devel/math/bezier

Tue, 28 Aug 2007
LDPC iterative decoding.
/devel/math/codes

Mon, 27 Aug 2007
LDPC code presentation.
/devel/math/codes

Wed, 22 Aug 2007
LDPC codes.
1 1 1 0 0 0 0 0
0 1 1 1 0 0 0 0
0 0 1 1 1 0 0 0
0 0 0 1 1 1 0 0
0 0 0 0 1 1 1 0
0 0 0 0 0 1 1 1
1 0 0 0 0 0 1 1
1 1 0 0 0 0 0 1

And so on. I will check whether it works well.

/devel/math/codes

Mon, 20 Aug 2007
Gallager codes.
0 1 0 1 1 0 0 1
1 1 1 0 0 1 0 0
0 0 1 0 0 1 1 1
1 0 0 1 1 0 1 0

And the source codeword is:

1 0 0 1 0 1 0 1

The check word, calculated by multiplying the matrix by the codeword, is:

0 0 0 0

Suppose that during transmission of the codeword and check word over the channel the codeword was changed to this (second bit changed):

1 1 0 1 0 1 0 1

Here is a generation graph (originally proposed by Tanner) of the given matrix:

[Figure: Tanner graph of the matrix above.]

Here the decoding algorithm starts: each codeword node C first sends its bit to each check node F it is connected to.

F0 node will receive 1 1 0 1 bits from C1, C3, C4 and C7 accordingly.
F1 node will receive 1 1 0 1 bits from C0, C1, C2 and C5.
F2 node will receive 0 1 0 1 bits from C2, C5, C6 and C7.
F3 node will receive 1 1 0 0 bits from C0, C3, C4 and C6.

The next step is to calculate the answer for each code node. The received check word is 0 0 0 0 (calculated above), so a set of simple equations starts here. Each check node F takes three out of the four received bits, XORs them (sums them modulo 2, since this example works in the Galois field of two elements, GF(2)), and sends to the remaining codeword node the bit it expects to be correct in order to satisfy the received check bit. Here is an example for the first check node:

X0 ^ 1 ^ 0 ^ 1 = 0. X0 = 0
1 ^ X1 ^ 0 ^ 1 = 0. X1 = 0
1 ^ 1 ^ X2 ^ 1 = 0. X2 = 1
1 ^ 1 ^ 0 ^ X3 = 0. X3 = 0

Then we send X0, X1, X2 and X3 to the code bits C1, C3, C4 and C7 respectively. After all check nodes are processed, the codeword nodes have the following sets of bits:

C0: 0 from F1, 1 from F3, 1 from originally received codeword.
C1: 0 from F0, 0 from F1, 1 from originally received codeword.
C2: 1 from F1, 0 from F2, 0 from originally received codeword.
C3: 0 from F0, 1 from F3, 1 from originally received codeword.
C4: 1 from F0, 0 from F3, 0 from originally received codeword.
C5: 0 from F1, 1 from F2, 1 from originally received codeword.
C6: 0 from F2, 0 from F3, 0 from originally received codeword.
C7: 0 from F0, 1 from F2, 1 from originally received codeword.

Then, by voting on each bit (i.e. taking the value that has more 'votes' out of the three cases in the table above), we get a new codeword:

1 0 0 1 0 1 0 1

The same steps are then repeated until the codeword stops changing. In our case we get it after the first run. The soft decision algorithm uses essentially the same logic, but it operates on the probabilities of each bit being 1 or 0; the probabilities are recalculated in each run, and once a probability is higher than the requested value (or the error probability is less than the requested value), the loop stops.
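The hard-decision steps above can be sketched as follows. This is a simplified illustration, not the userspace implementation mentioned in this post: it assumes the 4x8 matrix of the example, uses the current word as the "originally received" vote in every round, and resolves ties (which cannot occur here, since every bit gets three votes) towards 0. The names `ldpc_round` and `ldpc_decode` are hypothetical.

```c
#include <stdint.h>

#define CHECKS 4
#define BITS   8

/* Parity-check matrix from the example above. */
static const uint8_t H[CHECKS][BITS] = {
    {0, 1, 0, 1, 1, 0, 0, 1},
    {1, 1, 1, 0, 0, 1, 0, 0},
    {0, 0, 1, 0, 0, 1, 1, 1},
    {1, 0, 0, 1, 1, 0, 1, 0},
};

/* One round of message passing; returns the number of bits that changed. */
int ldpc_round(uint8_t word[BITS])
{
    int ones[BITS], votes[BITS], changed = 0;

    for (int j = 0; j < BITS; j++) {    /* the bit's own value votes first */
        ones[j] = word[j];
        votes[j] = 1;
    }
    for (int f = 0; f < CHECKS; f++) {
        uint8_t parity = 0;

        for (int j = 0; j < BITS; j++)
            if (H[f][j])
                parity ^= word[j];
        for (int j = 0; j < BITS; j++) {
            if (!H[f][j])
                continue;
            /* XOR of the other bits of this check: the value bit j
             * must have for the check to come out as zero. */
            ones[j] += parity ^ word[j];
            votes[j]++;
        }
    }
    for (int j = 0; j < BITS; j++) {    /* majority vote per bit */
        uint8_t bit = (uint8_t)(2 * ones[j] > votes[j]);

        if (bit != word[j])
            changed++;
        word[j] = bit;
    }
    return changed;
}

/* Repeat rounds until the word stops changing; returns rounds executed. */
int ldpc_decode(uint8_t word[BITS], int max_rounds)
{
    int rounds = 0;

    while (rounds < max_rounds) {
        rounds++;
        if (ldpc_round(word) == 0)
            break;
    }
    return rounds;
}
```

Starting from the received word 1 1 0 1 0 1 0 1, the first round flips the second bit back and the second round changes nothing, so the loop stops with 1 0 0 1 0 1 0 1.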
Real-world examples use much bigger codewords (up to several thousand bits), but the logic is always the same. So, I've started an initial userspace implementation; if the stars are in the right order I will be able to complete it before I move to the climbing zone, otherwise I hope it will be ready tomorrow. Stay tuned.

/devel/math/codes

Tue, 07 Aug 2007
I was bloody wrong.
/devel/math/hash

Mon, 06 Aug 2007
Breaking Enigma code or cracking SHA1 hash for fun.
Input data:

5e ca 9b e6 38 cf cd 33 41 cf 61 b3 fb cd 39 df
65 87 61 b8 2c 1e 56 ac 69 d7 d0 18 7f 9b 0f a3
9c 13 99 4c c0 08 c2 de 2d ed c2 d5 99 f8 94 57
d7 a1 e2 35 93 73 0c 11 5a 80 5e 80 ff a8 54 fe

digest: 136be2b1 e949ef99 b85caa61 c97e39cc 7c53ccc5

Cracked data: workspace (substitute W in sha_transform() with this data):

a57d8667 a57d8667 a57d8667 a57d8667 a57d8667
a57d8667 a57d8667 a57d8667 a57d8667 a57d8667
a57d8667 a57d8667 a57d8667 a57d8667 a57d8667
96ccb97c a1900adc c7d34989 3c218123 b2380816

digest: 136be2b1 e949ef99 b85caa61 c97e39cc 7c53ccc5

So, the last task in breaking reduced SHA1 is to find 64 input bytes which, after being processed by this method:

    W[i] = rol32(W[i-3] ^ W[i-8] ^ W[i-14] ^ W[i-16], 1);

result in the workspace found above. I do not know whether this can be considered a SHA1 crack; my goal was to get exactly to this point, i.e. to break the SHA1 algorithm reduced to 20 rounds. I will ask Bruce Schneier tomorrow whether it is, though. Stay tuned, but I will go climbing now.

/devel/math/hash

Wed, 01 Aug 2007
A hash.
#define f1(x,y,z) (z ^ (x & (y ^ z)))
__u32 W[80];
__u32 digest[5];
__u8 in[64];
for (i = 0; i < 20; i++) {
    t = f1(b, c, d) + K1 + rol32(a, 5) + e + W[i];
    e = d; d = c; c = rol32(b, 30); b = a; a = t;
}
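The fragment above can be made self-contained roughly as follows. This is a sketch under assumptions: K1 is the usual SHA-1 constant for rounds 0-19, W[16..19] are filled with the standard SHA-1 message schedule, the caller supplies the starting state, and no final addition of the starting state is performed (whether the reduced variant studied here does that is not stated). The function names are illustrative.

```c
#include <stdint.h>

#define K1 0x5A827999u   /* SHA-1 constant for rounds 0-19 */

/* Rotate left; n must be in 1..31. */
uint32_t rol32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));
}

/* Round function for rounds 0-19: picks y where x has a 1 bit, else z. */
uint32_t f1(uint32_t x, uint32_t y, uint32_t z)
{
    return z ^ (x & (y ^ z));
}

/* Fill W[16..19] from W[0..15] with the standard SHA-1 schedule. */
void sha1_expand20(uint32_t W[20])
{
    for (int i = 16; i < 20; i++)
        W[i] = rol32(W[i - 3] ^ W[i - 8] ^ W[i - 14] ^ W[i - 16], 1);
}

/* Run only the first 20 rounds over the state s = {a, b, c, d, e}. */
void sha1_rounds20(uint32_t s[5], const uint32_t W[20])
{
    uint32_t a = s[0], b = s[1], c = s[2], d = s[3], e = s[4], t;

    for (int i = 0; i < 20; i++) {
        t = f1(b, c, d) + K1 + rol32(a, 5) + e + W[i];
        e = d; d = c; c = rol32(b, 30); b = a; a = t;
    }
    s[0] = a; s[1] = b; s[2] = c; s[3] = d; s[4] = e;
}
```

In a 20-round reduction only W[16] to W[19] are derived words; W[0] to W[15] are the 64 input bytes themselves, loaded as sixteen 32-bit words.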
I've started to think about how to solve this problem. I expect results similar to my one-round Jenkins hash analysis, or a complete failure. You know where to find the results - stay tuned.

/devel/math/hash

Tue, 19 Jun 2007
CRC performance depending on the word size.

size: 4096, num: 100000, bits: 64, speed: 459.630646 kop/sec.
size: 4096, num: 100000, bits: 32, speed: 348.913483 kop/sec.
size: 4096, num: 100000, bits: 16, speed: 176.067184 kop/sec.

As expected, CRC speed improved with increasing word size; unfortunately, for Galois field multiplication the speed increase is not enough.

Let's discuss Reed-Solomon coding, which requires Galois field multiplication of checksummed (or raw) data words. For a 16-bit CRC sum, Galois field multiplication can be performed using a lookup table of 2^16 entries (2 lookups per multiplication or division operation), but for a 32-bit CRC, multiplication in GF(2^32) cannot be performed using a table, since the table would be too big. To perform Galois field multiplication one essentially needs to solve a system of equations; the number of equations is equal to the size of the values being multiplied, i.e. 32 in our case. Even the fastest method of solving a system of equations will be much, much slower than a 16-bit CRC plus its 2 lookups. Even using Cauchy matrices instead of Vandermonde ones for Reed-Solomon encoding with Galois multiplication results in much slower code than table lookup.

So, for maximum performance we either need to limit the Reed-Solomon system to a 16-bit CRC (Galois field multiplication is used in RAID coding, for example), or use a different coding method. I'm studying WEAVER codes, which are faster by design (they use only XOR operations, without complex Galois field multiplications), but they have their own limits, so I'm trying to understand how a distributed system with such erasure coding will suffer from using it.

/devel/math/codes

Sun, 20 May 2007
First Bezier curve.
/devel/math/bezier

Sat, 19 May 2007
Bezier curves.
/devel/math/bezier

Sun, 01 Apr 2007
#define f1(x,y,z) (z ^ (x & (y ^ z))) /* x ? y : z */
#define K1 0x5A827999L /* Rounds 0-19: sqrt(2) * 2^30 */
for (i = 0; i < 20; i++) {
    t = f1(b, c, d) + K1 + rol32(a, 5) + e + W[i];
    e = d; d = c; c = rol32(b, 30); b = a; a = t;
}
where $a, $b, $c, $d and $e are parts of the digest and W[i] is the i-th word
of the input. Does it look similar to this one:
{ \
a -= b; a -= c; a ^= (c>>13); \
b -= c; b -= a; b ^= (a<<8); \
c -= a; c -= b; c ^= (b>>13); \
a -= b; a -= c; a ^= (c>>12); \
b -= c; b -= a; b ^= (a<<16); \
c -= a; c -= b; c ^= (b>>5); \
a -= b; a -= c; a ^= (c>>3); \
b -= c; b -= a; b ^= (a<<10); \
c -= a; c -= b; c ^= (b>>15); \
}
For me it does - these are roughly the same GF(2) and GF(2^32) transformations,
with different arguments and operations, but the ground is the same.

So, I have quite an ambitious goal: to write code similar to this for a new type of hash. I'm not sure I will have the time, but at least it is interesting. The first code snippet quoted above is part of a well-known hash; the other 3 parts are essentially the same, with a different f?() function.
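For experimenting with the second snippet, here is a sketch that transcribes the nine lines into a plain function and adds an inverse. The inverse is not from this post; it only relies on the fact that every line changes a single variable by subtractions and an XOR that can be undone when the other two variables are known. The names `jenkins_mix` and `jenkins_unmix` are hypothetical.

```c
#include <stdint.h>

/* The nine mixing lines quoted above, applied in place. */
void jenkins_mix(uint32_t *pa, uint32_t *pb, uint32_t *pc)
{
    uint32_t a = *pa, b = *pb, c = *pc;

    a -= b; a -= c; a ^= (c >> 13);
    b -= c; b -= a; b ^= (a << 8);
    c -= a; c -= b; c ^= (b >> 13);
    a -= b; a -= c; a ^= (c >> 12);
    b -= c; b -= a; b ^= (a << 16);
    c -= a; c -= b; c ^= (b >> 5);
    a -= b; a -= c; a ^= (c >> 3);
    b -= c; b -= a; b ^= (a << 10);
    c -= a; c -= b; c ^= (b >> 15);
    *pa = a; *pb = b; *pc = c;
}

/* Undo jenkins_mix(): the same lines in reverse order, each one inverted. */
void jenkins_unmix(uint32_t *pa, uint32_t *pb, uint32_t *pc)
{
    uint32_t a = *pa, b = *pb, c = *pc;

    c ^= (b >> 15); c += b; c += a;
    b ^= (a << 10); b += a; b += c;
    a ^= (c >> 3);  a += c; a += b;
    c ^= (b >> 5);  c += b; c += a;
    b ^= (a << 16); b += a; b += c;
    a ^= (c >> 12); a += c; a += b;
    c ^= (b >> 13); c += b; c += a;
    b ^= (a << 8);  b += a; b += c;
    a ^= (c >> 13); a += c; a += b;
    *pa = a; *pb = b; *pc = c;
}
```

Applying `jenkins_unmix` right after `jenkins_mix` returns the original three words, which makes the pair convenient for checking reduced-round experiments.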
/devel/math/hash

Wed, 28 Mar 2007
Breaking Enigma code.
Day one - cracking the first round of the Jenkins hash. That was pretty easy from the calculation point of view (i.e. how to select the i-th bit based on knowledge of the (i-1)-th or (i+1)-th bit), but it was not the main goal. I wanted to solve the problem from an algorithmic point of view, i.e. to find not a single solution but a generation law. I started with a simpler task - solving the following equation:

    (A + X) XOR X = B

Day two - theory. There are two possible ways of solving the above equation: either express the logical (bitwise) operation of the Galois field GF(2^1), XOR, in the arithmetic field (GF(2^32), i.e. arithmetic modulo 2^32), or express the sum operation of GF(2^32) as a bitwise operation in GF(2^1). There are different ways to do it.

Day three - sum as a bitwise operation. It can be presented in quite a simple form:

    (A + B)i = Ai xor Bi xor K(i-1),
    where Ki = 0 for i = 0, and otherwise
    Ki = (Ai and Bi) or (Ai and K(i-1)) or (Bi and K(i-1)).
    Xi means the i-th bit of X (with this indexing, bits are counted from 1).

I got a recursive formula for a simple sum, which was not what I wanted, so I started to solve the above logical equation using different models. The first one was polynomial algebra. Sum and bitwise operations are quite simple in polynomial algebra, but there is a serious problem: bitwise operations must be applied only to the normalized polynomial form, since information is lost when the overflow of the order is dropped. This limitation is nothing for a computer program, since it is possible to normalize the polynomial form before doing a bitwise operation, but it does not allow progress towards an algorithmic solution. So, the next step was to transform bitwise operations into GF(2^32) operations.

Day four - boolean polynomials. A boolean polynomial is a polynomial whose result and arguments can only be either 0 or 1. For example, the XOR operation can be presented in the following polynomial form:

    a XOR b = X*(3-X)/2, where X = 2*a + b

This looks exactly like what I wanted - to present all operations in polynomial form - but there is a small problem. After I expressed all logical operations in the simple equation above ((A + X) xor X = B, where each operation, the sum in GF(2^32) and the XOR, is transformed into boolean polynomials) in polynomial form, I got a system of equations of order 27. That is not what I expected to solve for a simple equation, so this step was dropped too.

Day five - give up? Not so fast: if I cannot solve the problem in algorithmic form, let's find at least one solution for all given inputs. Here I created a library for polynomial sum, subtraction and normalization (which as a bonus allows working with numbers of essentially any order, i.e. it is sum and subtraction in a Galois field of any order), and started cracking the first round of the Jenkins hash. I ended up with code which takes two 32-bit values and a hash result as input, and returns a third 32-bit value which, when hashed by the first round of the Jenkins hash, produces the required hash value. That was simple, but then I started the second round (the Jenkins hash has 9 rounds). I made my main mistake here: I started to calculate the second-round value based on the initial inputs (two of them are under the attacker's control), but a correct solution needed a trick.

Day six - breaking the second round of the Jenkins hash. The main trick with the Jenkins hash is that it uses the value calculated in the previous round as an input parameter for the current round, so after some mathematical transformation it is possible to change variables in the hash equation to work not with the initial values only, but with some initial values and the result of the previous round. Since each round's result depends only on its input values (some of them can be initial ones, others can be calculated in previous rounds), it can be done. This idea was in my brain from the beginning, but I could not formulate it into something usable, and today, sitting in the bus that takes me from home to my paid-work office in Moscow, I finally drew it (the latest sheet of paper). Now I have a program which calculates two inputs for a given input and the required hash output of the two-round Jenkins hash. It is quite simple, but the whole process was very interesting in itself. Further cracking is not an impossible task, but complexity increases with each round (although not exponentially).
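The Day three carry recurrence and the Day four XOR polynomial can be checked with a few lines of C, and the simple equation (A + X) XOR X = B can be solved bit by bit. The solver below is only one possible approach, not the polynomial library described above: it uses the fact that bit i of (A + X) XOR X equals Ai xor (carry into bit i), so the required carries are A xor B and each bit of X either is forced by the next carry or is free. Bits are counted from 0 in the code, and all names are illustrative.

```c
#include <stdint.h>

/* Day three: 32-bit addition built from the bitwise carry recurrence. */
uint32_t add_bitwise(uint32_t a, uint32_t b)
{
    uint32_t sum = 0, carry = 0;    /* no carry into the lowest bit */

    for (int i = 0; i < 32; i++) {
        uint32_t ai = (a >> i) & 1, bi = (b >> i) & 1;

        sum |= (ai ^ bi ^ carry) << i;
        carry = (ai & bi) | (ai & carry) | (bi & carry);
    }
    return sum;
}

/* Day four: XOR of two bits as the polynomial X*(3-X)/2, X = 2*a + b. */
int xor_poly(int a, int b)
{
    int x = 2 * a + b;

    return x * (3 - x) / 2;
}

/*
 * Find some x with ((a + x) ^ x) == b, modulo 2^32.
 * Returns 1 and stores x on success, 0 if no solution exists.
 */
int solve_add_xor(uint32_t a, uint32_t b, uint32_t *x)
{
    uint32_t c = a ^ b;             /* bit i: required carry into bit i */
    uint32_t sol = 0;

    if (c & 1)
        return 0;                   /* there is never a carry into bit 0 */
    for (int i = 0; i < 31; i++) {
        uint32_t ai = (a >> i) & 1;
        uint32_t ci = (c >> i) & 1;
        uint32_t cn = (c >> (i + 1)) & 1;

        if (ai == ci) {
            /* Carry out equals ai whatever x's bit is; the bit is free. */
            if (cn != ai)
                return 0;
        } else {
            sol |= cn << i;         /* carry out equals x's bit */
        }
    }
    *x = sol;                       /* free bits, including bit 31, stay 0 */
    return 1;
}
```

The solver returns only one of possibly many solutions, since every free bit could just as well be set to 1.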
I absolutely do not regret the time I spent solving this problem - that was fun. The interested reader can find a set of photos of the brainstorming on this problem in the gallery. And a solution:

Example:
old a: 15e28f3a, b: 1cb4ed1c, c: 7edcd3a0, h1: 70bc78e4
new a: 15e28f3a, b: 434c39d0, c: d28fc0ec, h1: 70bc78e4

As you can see, the $a parameter is the same, $b and $c are under the attacker's control, and all three produce the same hash value $h1 of the Jenkins hash reduced to two rounds. $b and $c were calculated from knowledge of $a and $h1 (or actually $h1 can be selected randomly, and $b and $c can be selected to produce it again and again; salting with a random value will not change the picture).

/devel/math/hash