NoClone  

Comments & Suggestions

Started by alan at 12-04-2005 23:41. Topic has 4 replies.

   12-04-2005, 23:41
alan

Top 10 Posts
Joined on 08-10-2005
Hong Kong
Posts 139
Why byte-by-byte comparison is the safest?

Some duplicate file finders use hash functions such as MD5, CRC and SHA-1 to compare file contents. This is not completely safe: a hash function maps inputs of any length to a fixed-size digest, so digests cannot be unique and collisions must exist. NoClone therefore compares files byte by byte to ensure that duplicate files are really the same.
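To illustrate why collisions must exist, here is a minimal sketch that shrinks MD5 to its first two bytes, so that only 65,536 digests are possible, and then hashes successive inputs until two of them share a digest. The names `truncated_md5` and `find_collision` are hypothetical and used only for this illustration; full-length MD5 collisions are far harder to find, but the counting argument is the same.

```python
import hashlib
from itertools import count

def truncated_md5(data: bytes, n_bytes: int = 2) -> bytes:
    """Return only the first n_bytes of the MD5 digest (a deliberately tiny hash)."""
    return hashlib.md5(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 2):
    """Hash b"0", b"1", b"2", ... until two different inputs share a digest."""
    seen = {}  # digest -> first input that produced it
    for i in count():
        message = str(i).encode()
        digest = truncated_md5(message, n_bytes)
        if digest in seen:
            # Two distinct inputs, one digest: a collision.
            return seen[digest], message, digest
        seen[digest] = message

first, second, digest = find_collision()
print(first, second, digest.hex())
```

With a two-byte digest the loop is guaranteed to stop within 65,537 inputs, because there are more inputs than possible digests.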

There are many research papers on this topic, for example:

Collisions for Hash Functions
MD4, MD5, HAVAL-128 and RIPEMD

Xiaoyun Wang (1), Dengguo Feng (2), Xuejia Lai (3), Hongbo Yu (1)

(1) The School of Mathematics and System Science, Shandong University, Jinan 250100, China
(2) Institute of Software, Chinese Academy of Sciences, Beijing 100080, China
(3) Dept. of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai, China
xywang@sdu.edu.cn (1)
Revised on August 17, 2004

1 Collisions for MD5

2 Collisions for HAVAL-128

HAVAL is proposed in [10]. HAVAL is a hashing algorithm that can compress messages of any length in 3, 4
or 5 passes and produce a fingerprint of length 128, 160, 192 or 224 bits.

An attack on a reduced version of HAVAL was given by P. R. Kasselman and W. T. Penzhorn [7], which
consists of the last rounds of HAVAL-128. We break the full HAVAL-128 with only about 2^6 HAVAL
computations. Here we give two examples of collisions of HAVAL-128, where

with non-zeros at positions 0, 11, 18, and i = 0, 1, 2, ..., 31, such that HAVAL(M) = HAVAL(M').

Table 2 Two pairs of collision, where i=11 and these two examples differ only at the last word

3 Collisions for MD4

MD4 was designed by R. L. Rivest [8]. The attack of H. Dobbertin in Eurocrypt'96 [2] can find a collision with probability 1/2^22. Our attack can find a collision with hand calculation, such that

Table 3 Two pairs of collisions for MD4

 

4 Collisions for RIPEMD

RIPEMD was developed for the RIPE project (RACE Integrity Primitives Evaluation, 1988-1992). In 1995, H. Dobbertin proved that the reduced version of RIPEMD with two rounds is not collision-free [4]. We show that the full RIPEMD also isn't collision-free. The following are two pairs of collisions for RIPEMD:

Table 4 The collisions for RIPEMD

5 Remark

Besides the above hash functions we break, there are some other hash functions not having ideal security. For
example, a collision of SHA-0 [6] can be found with about 2^40 computations of the SHA-0 algorithm, and a collision
for HAVAL-160 can be found with probability 1/2^32.
Note that the messages and all other values in this paper are composed of 32-bit words; in each 32-bit word
the leftmost byte is the most significant byte.


[1] B. den Boer, A. Bosselaers, Collisions for the Compression Function of MD5, Eurocrypt '93.
[2] H. Dobbertin, Cryptanalysis of MD4, Fast Software Encryption, LNCS 1039, D. Gollmann, Ed., Springer-Verlag, 1996.
[3] H. Dobbertin, Cryptanalysis of MD5 compress, presented at the rump session of Eurocrypt '96.
[4] H. Dobbertin, RIPEMD with Two-round Compress Function is Not Collision-Free, J. Cryptology 10(1), 1997.
[5] H. Dobbertin, A. Bosselaers, B. Preneel, RIPEMD-160: A Strengthened Version of RIPEMD, Fast Software Encryption, LNCS 1039, D. Gollmann, Ed., Springer-Verlag, 1996, pp. 71-82.
[6] FIPS 180-1, Secure hash standard, NIST, US Department of Commerce, Washington D.C., April 1995.
[7] P. R. Kasselman, W. T. Penzhorn, Cryptanalysis of reduced version of HAVAL, Vol. 36, No. 1, Electronic Letters, 2000.
[8] R. L. Rivest, The MD4 Message Digest Algorithm, Request for Comments (RFC) 1320, Internet Activities Board, Internet Privacy Task Force, April 1992.
[9] R. L. Rivest, The MD5 Message Digest Algorithm, Request for Comments (RFC) 1321, Internet Activities Board, Internet Privacy Task Force, April 1992.
[10] Y. Zheng, J. Pieprzyk, J. Seberry, HAVAL--A One-way Hashing Algorithm with Variable Length of Output, Auscrypt '92.


NoClone Author
Reasonable Software House
   02-01-2007, 15:43
alan
Re: Why byte-by-byte comparison is the safest?
Sample files that produce the same MD5 hash:
http://www.mscs.dal.ca/~selinger/md5collision/
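A pair of files like those can be checked with a short sketch such as the one below. It reports whether two files share an MD5 digest while differing in content, which is exactly the case where a hash-only duplicate finder would be wrong. The function name `is_md5_false_match` is hypothetical, and the whole files are read into memory, so this is only meant for small samples.

```python
import hashlib

def is_md5_false_match(path_a, path_b):
    """Return True if the files have equal MD5 digests but different bytes."""
    with open(path_a, "rb") as f:
        data_a = f.read()
    with open(path_b, "rb") as f:
        data_b = f.read()
    same_hash = hashlib.md5(data_a).digest() == hashlib.md5(data_b).digest()
    # A false match is a digest match that the byte comparison contradicts.
    return same_hash and data_a != data_b
```

For ordinary files this returns False; it should only return True for a deliberately constructed colliding pair.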
   06-08-2009, 4:38
viking3
Re: Why byte-by-byte comparison is the safest?
Byte-by-byte comparison is much slower than a checksum calculation.
I wish there were an option to use a checksum instead when comparing a large number of files.

I am trying to find a few unique files among several hundred thousand files, where all but the unique files are duplicates.
A checksum option would speed it up tremendously in this case!
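A rough sketch of what such a checksum option could look like is given below; it is not NoClone's implementation, and the names `file_md5`, `candidate_groups` and `unique_files` are hypothetical. Files are grouped by size first, and only files that share a size are hashed. For the "find the few unique files" case a hash is safe in one direction: two files with different digests are certainly different, so only the groups with matching digests would still need a byte-by-byte check to be sure.

```python
import hashlib
import os
from collections import defaultdict

def file_md5(path, chunk_size=1 << 20):
    """MD5 digest of a file, read in chunks to keep memory use small."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()

def candidate_groups(paths):
    """Group files by size, then by MD5; return groups with two or more members."""
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)
    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for p in same_size:
            by_hash[file_md5(p)].append(p)
        groups.extend(g for g in by_hash.values() if len(g) > 1)
    return groups

def unique_files(paths):
    """Files whose size or digest matches no other file (input order kept)."""
    in_a_group = {p for g in candidate_groups(paths) for p in g}
    return [p for p in paths if p not in in_a_group]
```

Calling `unique_files(list_of_paths)` returns the files that share a size and digest with no other file in the list.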

   08-04-2009, 15:16
alan
Re: Why byte-by-byte comparison is the safest?
We did enhance the algorithm to speed up comparison by binary contents, so true byte-by-byte comparison isn't slower than a hash function like MD5. However, due to strong demand, comparison by MD5 was put on the wish list for NoClone 2009.
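The post does not describe the enhanced algorithm, so the following is only a generic sketch of why a byte-by-byte comparison need not be slow: it can reject a pair on a size mismatch without reading anything, and it can stop at the first differing chunk, whereas a hash must always read both files to the end. The function name `files_identical` and the chunk size are assumptions made for this example.

```python
import os

def files_identical(path_a, path_b, chunk_size=1 << 16):
    """Compare two files byte by byte, stopping at the first difference."""
    # Files of different sizes can never be identical; no reads needed.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False  # early exit on the first mismatch
            if not chunk_a:
                return True   # both files ended without a mismatch
```

Python's standard library offers a similar check as `filecmp.cmp(path_a, path_b, shallow=False)`.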

   08-08-2009, 6:15
MasterCATZ
Re: Why byte-by-byte comparison is the safest?
I have found byte-by-byte comparison way faster than running SFV checks, but I assume this is because I can easily do 7000+ I/Os per second and the array will open many files at the same time without drama, whilst the SFV checks are not multitasked.

©2003-2006 Reasonable Software House, All rights reserved.
Phone: +852 35204490 Fax: +852 35204492 Email: Contact us
Address: 332 InnoCentre, 72 Tat Chee Avenue, Kowloon Tong, Hong Kong
