About
A while ago I ran into a problem with duplicate binary files
on my machine. Over the past ten years or so I have been collecting music
and backing it up on my computer. The thing is,
after a while duplicate files appear. Various backups and "cheap disks"
for use in the car caused my collection to grow at an exponential rate.
I looked around for a program that could remove these duplicate files from disk
but found none. This inspired me to write Duplicate. It is a command line tool that
takes a directory as a parameter and looks through that directory for duplicate files.
It was developed for music but can be used on any binary or text file.
The algorithm is rather simple, except for the VCDIFF part of it. Basically it
compares two files on size. If the sizes are equal, it does a binary diff using
the xdelta API developed by Josh MacDonald. If the files are equal, one is removed.
Duplicate was developed on a Monolithic system and should compile fine on Linux or OS X.
I haven't tried it on Windows yet.
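
The size-then-content comparison described above can be sketched roughly as
follows. This is not Duplicate's actual source: it is a minimal Python
illustration that swaps the xdelta binary diff for a plain byte-by-byte
comparison from the standard library (filecmp.cmp with shallow=False). The
names find_duplicates and remove_duplicates, and the dry_run parameter, are
made up for this example.

```python
import filecmp
import os
from collections import defaultdict


def find_duplicates(root):
    """Return (kept, duplicate) path pairs for files under root."""
    # Step 1: group files by size. Files of different sizes cannot be equal.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Step 2: within each size group, compare contents byte by byte.
    # (Assumption: stands in for the xdelta diff used by the real tool.)
    pairs = []
    for paths in by_size.values():
        kept = []  # one representative per distinct content
        for path in paths:
            for original in kept:
                if filecmp.cmp(original, path, shallow=False):
                    pairs.append((original, path))
                    break
            else:
                kept.append(path)
    return pairs


def remove_duplicates(root, dry_run=True):
    """Delete the second file of every duplicate pair unless dry_run is set."""
    pairs = find_duplicates(root)
    for _original, duplicate in pairs:
        if not dry_run:
            os.remove(duplicate)
    return pairs
```

Calling remove_duplicates("some/dir") only reports what would be removed;
passing dry_run=False actually deletes the files. Because every file is
checked against each distinct file of the same size already seen, this sketch
finds all copies in a single pass.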
Update
Duplicate needed to be run more than once to remove all duplicate files.
Fixed.