Re: Why byte-by-byte comparison is the safest?

Comments & Suggestions

Why byte-by-byte comparison is the safest?


alan 12-04-2005, 23:41

Compare duplicate file finder using MD5, CRC and SHA1 hash function to compare file contents, they are not safe because any hash function is not unique, there must be collision. So NoClone compares byte by byte to ensure duplicate files are really the same.

There are lots of research paper on this topics:

Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD - Xiaoyun Wang, Dengguo Feng, Xuejia Lai, Hongbo Yu:

Collisions for Hash Functions
MD4, MD5, HAVAL-128 and RIPEMD

Xiaoyun Wang1, Dengguo Feng2, Xuejia Lai3, Hongbo Yu1

The School of Mathematics and System Science, Shandong University, Jinan250100, China1
Institute of Software, Chinese Academy of Sciences, Beijing100080, China2
Dept. of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai, China3
xywang@sdu.edu.cn1
revised on August 17, 2004

1 Collisions for MD5

2 Collisions for HAVAL-128

HAVAL is proposed in [10]. HAVAL is a hashing algorithm that can compress messages of any length in 3,4
or 5 passes and produce a fingerprint of length 128, 160, 192 or 224 bits.

Attack on a reduced version for HAVAL was given by P. R. Kasselman and W T Penzhorn [7], which
consists of last rounds for HAVAL-128. We break the full HAVAL-128 with only about the 26 HAVAL
computations. Here we give two examples of collisions of HAVAL-128, where

with non-zeros at position 0,11,18, and i 0,1,2,...31, such that HAVAL(M) HAVAL(M') .

Table 2 Two pairs of collision, where i=11 and these two examples differ only at the last word

3 Collisions for MD4

MD4 is designed by R. L. Rivest[8 ] . Attack of H. Dobbertin in Eurocrypto'96[2] can find collision with probability 1/222. Our attack can find collision with hand calculation, such that

Table 3 Two pairs of collisions for MD4

 

4 Collisions for RIPEMD

RIPEMD was developed for the RIPE project (RACE Integrrity Primitives Evalustion, 1988-1992). In 1995, H. Dobbertin proved that the reduced version RIPEMD with two rounds is not collision-free[4]. We show that the full RIPEMD also isn't collision-free. The following are two pairs of collisions for RIPEMD:

Table 4 The collisions for RIPEMD

5 Remark

Besides the above hash functions we break, there are some other hash functions not having ideal security. For
example, collision of SHA-0 [6 ] can be found with about 240 computations of SHA-0 algorithms, and a collision
for HAVAL-160 can be found with probability 1/232.
Note that the messages and all other values in this paper are composed of 32-bit words, in each 32-bit word
the most left byte is the most significant byte.


1 B. den Boer, Antoon Bosselaers, Collisions for the Compression Function of MD5, Eurocrypto,93.
2 H. Dobbertin, Cryptanalysis of MD4, Fast Software Encryption, LNCS 1039, D. , Springer-Verlag, 1996.
3 H. Dobbertin, Cryptanalysis of MD5 compress, presented at the rump session of EurocrZpt'96.
4 Hans Dobbertin, RIPEMD with Two-round Compress Function is Not Collision-Free, J. Cryptology 10(1),
1997.
5 H. Dobbertin, A. Bosselaers, B. Preneel, "RIPMEMD-160: A Strengthened Version of RIPMMD," Fast
Software EncrZption, LNCS 1039, D.Gollmann, Ed., Springer-Verlag, 1996, pp. 71-82.
6 FIPS 180-1, Secure hash standard, NIST, US Department of Commerce, Washington D. C., April 1995.
7 P. R. Kasselman, W T Penzhorn , Cryptananlysis od reduced version of HAVAL, Vol. 36, No. 1, Electronic
Letters, 2000.
8 R. L. Rivest, The MD4 Message Digest Algorithm, Request for Comments (RFC)1320, Internet Activities
Board, Internet Privacy Task Force, April 1992.
9 R. L Rivest, The MD5 Message Digest Algorithm, Request for Comments (RFC)1321, Internet Activities
Board, Internet PrivacZ Task Force, April 1992.3RIPEMD-1281
10 Y. Zheng, J. Pieprzyk, J. Seberry, HAVAL--A One-way Hashing Algorithm with Variable Length of Output,
Auscrypto'92.


NoClone Author
Reasonable Software House

Re: Why byte-by-byte comparison is the safest?


alan 02-01-2007, 15:43
Sample files cause duplicate MD5:
http://www.mscs.dal.ca/~selinger/md5collision/
NoClone Author
Reasonable Software House

Powered by Community Server, by Telligent Systems