DupeKill_v1.2 ReadMe
Incredibly, I found room on the internet for yet another duplicate file remover. Like others, this tool compares files based on content and lists any files which have exact binary duplicates. What sets this one apart is that you can give it hints about which files you'd like to keep before starting a scan. For example, if you tell it to prefer files with the 'most descriptive name', the app looks at all duplicate filenames and defaults to the shortest, most descriptive name as the one to keep. A file with a name containing "copy of" or ".1.txt" would be considered less descriptive than one without, and a file named "lkePic.jpg" is considered less descriptive than "Lake Pictures.jpg". This way the default keep or remove action is pre-set, and the time you spend going through the list and making changes to keep the 'right' file is minimized. The tool has many other options for picking the default 'kept' file: use the newest or oldest by date, the longest or shortest filename, or specify one or more directories where files should always be kept or removed.
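To make the 'default keep' idea concrete, here is a minimal Python sketch of the four simpler deciders (newest, oldest, longest name, shortest name). It is an illustration only, not the app's actual code: the names FileEntry, KEEP_DECIDERS and assign_actions are made up for this example, and the 'most descriptive name' scoring is not attempted because its rules aren't spelled out here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FileEntry:
    """One file in a group of exact duplicates (hypothetical type)."""
    path: str        # full path, Windows or POSIX style
    modified: float  # modification time in seconds since the epoch


def _name(entry: FileEntry) -> str:
    # Take the last path component, accepting both '\\' and '/' separators.
    return entry.path.replace("\\", "/").rsplit("/", 1)[-1]


# Each decider takes a duplicate group and returns the entry to keep.
KEEP_DECIDERS: Dict[str, Callable[[List[FileEntry]], FileEntry]] = {
    "newest": lambda g: max(g, key=lambda e: e.modified),
    "oldest": lambda g: min(g, key=lambda e: e.modified),
    "longest_name": lambda g: max(g, key=lambda e: len(_name(e))),
    "shortest_name": lambda g: min(g, key=lambda e: len(_name(e))),
}


def assign_actions(group: List[FileEntry], decider: str,
                   remove_action: str = "Delete") -> Dict[str, str]:
    """Pre-set 'Keep' on one file and the removal action on the rest."""
    keeper = KEEP_DECIDERS[decider](group)
    return {e.path: ("Keep" if e is keeper else remove_action) for e in group}
```

The point of the sketch is only that every duplicate gets an action before the user sees the list, so manual changes are the exception rather than the rule.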
The app also makes use of a speed improvement I haven't seen anywhere else: it makes an extra pass over the file list to create a 'fasthash'. The fasthash uses small samples of the file (16 kilobytes), taken from the beginning, end, and three places in the middle, then does a duplicate check based on the hash of the samples. This is very quick for large files, and it eliminates the vast majority of potential duplicates early, as most files will have different samples. Most other duplicate finders omit this step, but it really speeds things up.
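A rough Python sketch of the fasthash idea follows. Several details are assumptions made for the example, not taken from the app: each sample is treated as 16 KiB, the three middle samples sit at 1/4, 1/2 and 3/4 of the file, the file size is mixed into the hash, small files are hashed whole, and the function names are invented.

```python
import hashlib
import os
from typing import Dict, List

SAMPLE_SIZE = 16 * 1024  # assumed: 16 KiB per sample


def fast_hash(path: str, sample_size: int = SAMPLE_SIZE) -> str:
    """Hash five small samples of a file instead of its full content."""
    size = os.path.getsize(path)
    h = hashlib.blake2b(digest_size=32)
    h.update(size.to_bytes(8, "little"))  # different sizes never collide
    with open(path, "rb") as f:
        if size <= 5 * sample_size:
            # Small file: sampling saves nothing, so hash everything.
            h.update(f.read())
        else:
            last = size - sample_size  # start offset of the final sample
            for offset in (0, size // 4, size // 2, (3 * size) // 4, last):
                f.seek(min(offset, last))
                h.update(f.read(sample_size))
    return h.hexdigest()


def candidate_groups(paths: List[str]) -> List[List[str]]:
    """Group files by fasthash; only groups of 2+ need a full comparison."""
    groups: Dict[str, List[str]] = {}
    for p in paths:
        groups.setdefault(fast_hash(p), []).append(p)
    return [g for g in groups.values() if len(g) > 1]
```

A matching fasthash only means two files are still candidates; a full-content hash is needed afterwards to confirm an exact duplicate.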
As of beta 6 the app is fairly robust, avoiding most of the traps in this excellent reference.
Note: There's no fuzzy searching; this app only finds exact duplicates.
Usage Notes
Version 4.6.2 or later of the .NET Framework is required. Get it from Microsoft.
No installation; just unpack and run. A settings file and ancillary files may be created in the program folder.
Instructions:
- Pick a directory where you have duplicate files and hit scan.
- After the program identifies the duplicate files, you'll be given a list to choose what to keep and what to delete. The app makes suggestions, but you can make changes by right clicking and selecting delete or keep, or by clicking the 'Action' column to toggle it. At this point scan results can be copied via the right click menu or exported in bulk via the 'Export Scan...' button.
- Hit the 'Execute Actions' button to run the selected actions. The app will warn you about deletions, and there's an extra warning if you're deleting all copies of a file. You can make a log to see what was done. The app will reset after this.
Each duplicate will have an action automatically assigned, and you can manually set the action by right clicking or using the keyboard. Available actions are: Keep [spacebar], Delete [delete key], Link [F4 key], Recycle, or Move. The context menu option 'Mark as auto' will reset the action to whatever was automatically assigned. The 'Invert Action' option will toggle the action between 'Keep' and whatever the default removal action is (set this in settings, by default it's 'Delete').
For complex searches, the 'Look In' drop-down has an 'Advanced Criteria' option. Advanced criteria let you include or exclude multiple folders and set filtering options, as well as specify a default 'keep' or 'remove from' folder. These criteria can be named and permanently saved. To create one, select 'Advanced Criteria...' from the 'Look In' drop-down; the Edit Criteria form will then show with further options.
When using Advanced Criteria the '...' button is used to edit an existing criteria. To remove a saved criteria: select it in the drop-down, edit it with the '...' button, and hit the 'Delete this Criteria' button in the lower left.
Notes about symbolic links
In recent versions the application can create symbolic links instead of deleting files. Symbolic links are like shortcuts: they point to the original file. This lets you keep the file's alternate name and location without wasting the space of having the file's data duplicated.
A few caveats:
- The app usually needs to be run as an administrator to create symbolic links.
- At least one file must be kept to create links (obviously).
- The links created by the app will always target the first file you are keeping.
Links are created in steps to prevent inadvertent deletions if link creation fails. For each file we want to link, we:
1: Rename the file we are removing/replacing to a dummy name.
2: Try to create our link, using the original name from the file in step 1.
3: If creating the link succeeded, we delete the original (renamed) file.
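The three steps above can be sketched in Python roughly like this. It is a simplified illustration rather than the app's code: the function name and the dummy-name scheme are invented, and renaming the file back when link creation fails is an assumption about what a sensible failure path looks like.

```python
import os
import uuid


def replace_with_symlink(duplicate: str, keeper: str) -> None:
    """Replace `duplicate` with a symbolic link that points at `keeper`."""
    # Step 1: move the duplicate out of the way under a dummy name.
    dummy = f"{duplicate}.{uuid.uuid4().hex}.tmp"
    os.rename(duplicate, dummy)
    try:
        # Step 2: create the link under the duplicate's original name.
        os.symlink(keeper, duplicate)
    except OSError:
        # Assumed failure path: restore the original name so nothing is lost.
        os.rename(dummy, duplicate)
        raise
    # Step 3: the link exists, so the renamed original can be deleted.
    os.remove(dummy)
```

Because the data is only deleted after the link exists, a failure at step 2 (for example, missing administrator rights on Windows) leaves the duplicate intact.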
Keyboard commands
After scanning, the following keys can be used on the file list:
- [Enter]: Open the file.
- [Delete]: Mark the file for deletion.
- [Space]: Mark the file to keep.
- [F4]: Mark the file to be replaced with a symbolic link.
- [Ctrl+A]: Select all.
- [Ctrl+C]: Copy filenames to the clipboard.
- [Ctrl+Alt+C]: Copy all info to the clipboard.
- [Ctrl+X]: Exchange the selected items' actions. (Swaps between Keep and the default removal action.)
Note: The directory text box, in addition to specifying plain directories, can have wildcard '*' or any-character '?' filters, e.g.:
c:\*.txt
c:\documents\just_files_like_201?.txt
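As a rough illustration of how such a filter could be interpreted, the Python sketch below splits the text into a directory and a filename pattern, then matches names against it. The function names are hypothetical, case-insensitive matching is an assumption based on Windows conventions, and Python's fnmatch also treats '[...]' specially, which the app may not.

```python
import fnmatch
import ntpath
from typing import Tuple


def split_scan_path(text: str) -> Tuple[str, str]:
    """Split the directory box text into (directory, filename pattern)."""
    head, tail = ntpath.split(text)
    if "*" in tail or "?" in tail:
        return head, tail
    # No wildcard in the last component: treat it all as a plain directory.
    return text, "*"


def matches(filename: str, pattern: str) -> bool:
    """'*' matches any run of characters, '?' exactly one character."""
    return fnmatch.fnmatchcase(filename.lower(), pattern.lower())
```

With this sketch, c:\*.txt splits into the directory c:\ and the pattern *.txt, while a plain c:\documents is scanned with the match-everything pattern *.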
Command Line Usage:
- -startHidden
If an instance of the program is not already running, starts the program and immediately hides it.
- -startDir=[directory]
Scan the specified directory.
ChangeLog
Version 1.2; 2020-02-19
- Added: 'Export Scan...' button (on the right above the file list) to dump the results of a scan to a .csv before executing any actions.
- Changed: Now storing the version # in the assembly information.
- Changed: Small mod to the 'descriptive name' scoring.
Version 1.1.1; 2019-11-19
- Fixed: Bug in prefer/discard folder logic.
- Fixed: Regression in 'most descriptive' scoring code.
Version 1.1; 2019-11-04
- Added: 'Dupes' column to show the # of duplicates.
- Added: Column sorts: click the 'size' or 'dupes' column to toggle sorting by those numbers.
- Added: New options for default actions in settings (the 'default keep decider' dropdown): now you can select the file to keep based on the longest/shortest name, or oldest/newest file.
- Added: New hash: SHA256Cng for those who have chips with SHA acceleration instructions.
- Added: Using 'advanced criteria', you can now specify folders to always keep files from or always remove files from.
- Added: 'Move' action to move duplicate files to a folder instead of deleting them.
- Changed: Greatly improved performance with large file lists (listview now in virtual mode); hopefully no bugz.
- Changed: Single click on the 'action' column now toggles the action, rather than a double-click. Dunno what I was thinking there...
- Changed: You can now change the 'default remove action' after running a scan: after changing the option in settings, select some files & hit 'mark as auto'; the new default remove action will be applied. (Previously a re-scan would be required.)
- Fixed: Was using the ascii version of CreateSymbolicLink() instead of the Unicode one. Whoops. Now Unicode link creation shouldn't fail.
- Note: Do remember new versions of the tool require the .NET Framework 4.6.2 or better; the app will crash on startup if it's not installed.
Version 1.0; 2019-08-21
- Added: Blake2B hash function; in most cases it's faster than even SHA1 and md5 while providing a vastly larger address space. This is the new default hash, no more worrying about those damn SHAttered pdf's :)
- Added: Extra warning in case the user has selected all extant copies of a file for deletion.
- Added: You can now drag a folder to the app to start scanning it. (Files too, it'll scan the parent folder.)
- Changed: Better progress bar(s). Also we now use a wait cursor during the 'execute actions' work.
- Changed: New requirement: we now target version 4.6.2 of the .net framework to take advantage of long-path aware API's. The older version supporting .net 4.0 will remain here: https://cresstone.com/apps/DupeKill/DupeKill_v0.8.zip#oldVersion
- Fixed: Some improvements in data entry & tab navigation on the 'advanced criteria' form.
Beta 8; 2017-07-09
- Added: Major feature: advanced criteria scans can be configured via the 'address bar' drop-down.
- Added: "Include subfolders" is now a remembered setting.
- Added: New option in settings to remember recent scan paths.
- Added: New option in settings to sort scan results by size or # of dupes. (On-demand sorts may be next, but are not easy due to the potential size of scans)
- Added: "Copy all info" command in the file list context menu (keyboard Ctrl+Alt+C). The following information about the selected items is copied to the clipboard (tab separated): full path, file size, creation date, modification date, file signature, action.
- Fixed: list no longer flickers on action change.
Beta 7; 2016-12-29
- Added: Symbolic link code finished and activated.
- Added: Ctrl+mousewheel zooming.
- Added: Some keyboard shortcuts for buttons, check the tooltips.
- Changed: Settings for window position now stored in a sane way.
- Fixed: Turned off single-instance limit.
- Changed: Start with windows removed (another setting inherited from the template that didn't make a lot of sense, since we're mostly running on-demand.)
Beta 6; 2016-02-09
- Fixed: Now ignoring reparse points & symbolic links.
- Added: Settings dialog updated (links in about textbox and start menu shortcut option)
Beta 5; 2015-11-07
- Fixed: Item coloring regression.
- Added: File creation/modification date columns
- Added: column sizes remembered
Beta 4; 2015-11-05
- Added: Choices for hash algo added to settings; also added hash benchmark option to see what's fastest.
- Some optimizations
- UI tweaks
Beta 3; 2015-11-03
- Changed: Now using FindFirstFile/FindNextFile via P/Invoke instead of Directory.EnumerateFiles/FileInfo, because FileInfo is slow on network drives.
- Added: Abort button to cancel active scan
- Added: Space savings from prospective deletions now shown.
- Added: Shell integration now available in settings.
- Added: Size is now shown under the id; Still not entirely happy with how this looks.
- Fixed: Can now delete read-only files.
Beta 1; 2015-10-28
- Initial Release. From concept to working in a single Sunday. Damn it feels good to be a gangster.
License Information
This software includes code or resources from the following sources:
Blake2B hashing code from the Blake2 project
Licensed under the terms of: The CC0 public domain dedication.
Host icon grabber routine by Sergey Stoyan
Licensed under the terms of: The Code Project Open License (CPOL) 1.02
Additional icon code by Steve McMahon
Licensed under the terms of: vbAccelerator Software License
This software is distributed as-is, without any representations or warranties of any kind.
The author of this software imposes no additional license terms or limits upon its use or redistribution.
Feedback/Bugs
Send to utils@cresstone.com
App Website