The 5 Best Tools to Find and Remove Duplicate Files in Linux
Source: https://www.makeuseof.com/best-tools-fi ... les-linux/
If your computer is cluttered with duplicate files and folders, check out these five Linux utilities to free up some space.
File management is a complicated task in and of itself. Add a large volume of duplicate files hogging storage space, and it becomes even more difficult.
The standard way to deal with duplicate files is to locate and delete them manually. However, a dedicated duplicate file finder can significantly speed up the process.
So if you're planning to get rid of duplicate files and clean up your computer, here’s a list of some of the best tools for finding and removing duplicate files in Linux.
1. Fslint
Fslint is a GUI and CLI-based utility for cleaning various kinds of clutter from your system. It calls this clutter "lint" and offers multiple tools to help you carry out a multitude of tasks, including finding duplicate files, empty directories, and problematic filenames.
By featuring both graphical and command-line modes of operation, fslint makes it easier for new Linux users to free up their computer storage from all sorts of system lint.
To access fslint via the GUI, all you need to do is open the terminal and run the fslint-gui command.
As for advanced functionality, the program offers 10 different tools in CLI mode, such as findup (duplicate files), findu8 (filenames with invalid UTF-8 encoding), findnl (problematic filenames), findtf (temporary files), and finded (empty directories). Using these, you can narrow the search to the specific kind of lint you want to find on your system.
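As a rough sketch of CLI use: the command-line tools are typically not on the PATH. Debian and Ubuntu packages have usually placed them under /usr/share/fslint/fslint, but treat that path as an assumption and check your own install. The wrapper name fslint_findup and the FSLINT_DIR variable are hypothetical, introduced only for this example.

```shell
# Hypothetical wrapper around fslint's findup tool.
# FSLINT_DIR is an assumed install location; adjust it for your distro.
FSLINT_DIR="${FSLINT_DIR:-/usr/share/fslint/fslint}"

fslint_findup() {
    if [ ! -x "$FSLINT_DIR/findup" ]; then
        echo "findup not found in $FSLINT_DIR" >&2
        return 127
    fi
    # findup lists duplicate files found under the given directories.
    "$FSLINT_DIR/findup" "$@"
}

# Example (placeholder path): fslint_findup ~/Downloads
```

The wrapper only lists duplicates; it does not delete anything.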
How to Install fslint
On Debian-based distros like Ubuntu:
sudo apt install fslint
On RHEL-based distros like CentOS and Fedora:
sudo yum install fslint    # older releases that use yum
sudo dnf install fslint    # Fedora and newer releases that use dnf
On Arch Linux and Manjaro:
sudo pacman -S fslint
2. Fdupes
Fdupes is one of the easiest programs to identify and delete duplicate files residing within directories. Released under the MIT License on GitHub, it's free and open-source.
The program works by using md5sum signature and byte-by-byte comparison verification to determine duplicate files in a directory. If required, you can also perform recursive searches, filter out search results, and get a summarized view of the discovered duplicate files.
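The checksum-then-compare idea can be sketched with standard GNU tools. This is an illustration of the general technique under stated assumptions, not fdupes' actual implementation: the function names list_md5_duplicates and confirm_identical are hypothetical, the uniq options are GNU-specific, and filenames containing newlines or backslashes are not handled.

```shell
# Print groups of files under a directory that share an MD5 checksum.
# Empty files are skipped, since they are all trivially identical.
list_md5_duplicates() {
    dir="${1:-.}"
    find "$dir" -type f -size +0c -exec md5sum {} + \
        | sort \
        | uniq -w32 --all-repeated=separate
}

# Confirm a candidate pair byte by byte; succeeds only if contents match.
confirm_identical() {
    cmp -s -- "$1" "$2"
}

# Example (placeholder path): list_md5_duplicates ~/Downloads
```

The checksum pass is cheap and narrows the candidates; the cmp pass guards against the unlikely case of two different files sharing a checksum.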
Once you have identified duplicate files in a directory, you can then use fdupes to either delete the files or replace them with links to the original file.
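To make the "replace with a link" idea concrete, here is a minimal sketch using cmp and ln rather than fdupes itself. The helper name replace_with_hardlink is hypothetical, and the approach assumes both files live on the same filesystem, since hard links cannot cross filesystems.

```shell
# Replace DUPLICATE with a hard link to ORIGINAL, but only if both are
# regular files with byte-for-byte identical content.
replace_with_hardlink() {
    original="$1"
    duplicate="$2"
    [ -f "$original" ] && [ -f "$duplicate" ] || return 1
    # Already the same inode: nothing to do.
    if [ "$original" -ef "$duplicate" ]; then
        return 0
    fi
    # Refuse to touch anything when the contents differ.
    cmp -s -- "$original" "$duplicate" || return 1
    # -f removes the duplicate and creates the link in its place.
    ln -f -- "$original" "$duplicate"
}

# Example (placeholder paths): replace_with_hardlink photo.jpg copy-of-photo.jpg
```

After linking, both names point at the same data, so editing one changes the other. That is the main trade-off compared with plain deletion.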
Fdupes Installation
On Debian-based distros:
sudo apt install fdupes
On RHEL-based distros:
sudo yum install fdupes    # older releases that use yum
sudo dnf install fdupes    # Fedora and newer releases that use dnf
To install on Arch Linux and Manjaro:
sudo pacman -S fdupes
3. Rdfind
Rdfind is another Linux utility to help you find redundant files on your computer across different directories. It relies on comparing files based on their content—and not their name—to identify duplicates, which makes it more effective at its job.
To achieve this, the program works by ranking equal files in a directory and determining the original and duplicates: the highest-ranked one is selected as the original while the rest are duplicates.
Besides, rdfind can also calculate checksums to compare files when required. And the best part is that it saves the scan results to a results.txt file in the current working directory, so you can refer to it before deleting duplicates to make sure you don't remove the wrong ones.
Of course, like most other duplicate file finders, rdfind also offers options to sort files, ignore empty files, or replace duplicates with symlinks. Last but not least, there's an option to delete duplicate files as well.
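A cautious first run might look like the sketch below. It assumes rdfind is installed and uses its -dryrun and -outputname options; the wrapper name rdfind_preview and the example paths are hypothetical. As far as the ranking goes, files found under directories listed earlier on the command line are ranked higher, so put the directory holding your "originals" first.

```shell
# Hypothetical wrapper: preview what rdfind would do, changing nothing.
# Usage: rdfind_preview REPORT_FILE DIR [DIR...]
rdfind_preview() {
    if ! command -v rdfind >/dev/null 2>&1; then
        echo "rdfind is not installed" >&2
        return 127
    fi
    report="$1"
    shift
    # -dryrun true: report duplicates without deleting or linking anything.
    # -outputname: write the report somewhere other than ./results.txt.
    rdfind -dryrun true -outputname "$report" "$@"
}

# Example (placeholder paths); ~/Pictures is listed first, so its files
# would be treated as the originals:
#   rdfind_preview /tmp/rdfind-report.txt ~/Pictures ~/Backup/Pictures
# Only after reviewing the report, and at your own risk:
#   rdfind -deleteduplicates true ~/Pictures ~/Backup/Pictures
```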
How to Install rdfind
On Debian/Ubuntu:
sudo apt install rdfind
On Fedora/CentOS:
sudo dnf install rdfind
4. DupeGuru
DupeGuru is a cross-platform tool for finding and deleting duplicate files on your machine. One of its best characteristics is the option to customize the matching engine to suit your preference so as to increase your chances of finding the right kind of duplicate files in a directory. And similar to a few other duplicate finder programs, it also offers a GUI to facilitate easier operations.
Talking about functionality, dupeGuru leverages its fuzzy matching algorithm to scan either filenames or file contents and find duplicates quickly and efficiently.
Plus, it's also good at dealing with music and picture-specific information, which gives it an edge over other duplicate file finders. Moreover, if required, you have the option to tweak its matching engine to locate exactly the kind of duplicate files you want to eliminate.
DupeGuru also lets you delete duplicate files. And for this, it has a reference directory system in place, which prevents you from accidentally deleting the wrong files. Besides deletion, there's the option to move or copy them elsewhere, too.
DupeGuru Installation
On Debian-based distros:
sudo add-apt-repository ppa:dupeguru/ppa
sudo apt-get update
sudo apt-get install dupeguru
On Arch Linux:
sudo pacman -S dupeguru
5. Rmlint
Rmlint is yet another finder and remover of lint (not just duplicate files) for Linux. It's free to use and extremely fast at identifying duplicate files and directories on your system. You also get support for the Btrfs filesystem, which makes it stand out from other tools on this list.
Other areas where rmlint beats competing duplicate file removal tools include the ability to search for files based on a particular timeframe, find files with broken user/group IDs, and find non-stripped binaries that occupy a lot of space. Besides, similar to a few other programs, it also saves the scan results to rmlint.json and rmlint.sh files, which come in handy during the delete operation.
However, do note that, unlike other tools, rmlint isn't the easiest to use: it generates a script for deleting duplicates, which requires some level of understanding to be used effectively.
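The scan-then-review workflow can be sketched as follows. This assumes rmlint is installed and relies on its default behavior of writing rmlint.sh and rmlint.json to the current directory; the wrapper name rmlint_scan and the example path are hypothetical. The -n option mentioned in the comments belongs to the generated script in the versions I'm aware of, so check the script's own usage text before relying on it.

```shell
# Hypothetical wrapper: scan a directory with rmlint without deleting anything.
rmlint_scan() {
    if ! command -v rmlint >/dev/null 2>&1; then
        echo "rmlint is not installed" >&2
        return 127
    fi
    # The scan itself removes nothing; it only writes rmlint.sh and
    # rmlint.json to the current working directory.
    rmlint "$1"
}

# Typical follow-up, run by hand after the scan (placeholder path):
#   rmlint_scan ~/Documents
#   less rmlint.sh      # read what the script intends to do
#   sh rmlint.sh -n     # dry run: print actions without modifying anything
#   sh rmlint.sh        # asks for confirmation, then removes duplicates
```

Keeping the scan and the deletion as separate steps is what makes the generated script useful: you get a chance to edit or veto it before anything is removed.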
How to Install rmlint
On Debian-based distros:
sudo apt install rmlint
On Fedora and CentOS:
sudo yum install rmlint    # older releases that use yum
sudo dnf install rmlint    # Fedora and newer releases that use dnf
On Arch-based distros like Manjaro:
sudo pacman -S rmlint
Keeping Duplicate Files at Bay on Linux
Using the duplicate file finder programs listed above, you can easily identify the duplicate files that might be taking up space on your machine and remove them altogether. However, a word of advice when working with such tools is to be extra cautious with your actions to avoid ending up deleting important files and documents on your system.
If you're unsure which files to delete and which ones to keep, back up all the data on your system first to be on the safe side.