Various

Order a list and remove duplicates

You have a list and want to remove its duplicates quickly and easily?

One possible solution, free and effective, is Notepad++ [1]. It is a free, lightweight text editor that offers many useful features:

  • files are opened in tabs,
  • syntax highlighting (files in a known format are coloured),
  • ability to undo many operations,
  • duplicate removal,
  • comparison of several files, …

You will also need to install the TextFX plugin. To do so, go to SourceForge [2] and download the latest version of the plugin (Fig 10). Then extract the downloaded archive into your Notepad++ installation folder (Fig 11).

Fig 10 : TextFX Plugin download on SourceForge
Fig 11 : Plugin installation for Notepad++

Once Notepad++ is launched, open the file containing duplicates (Fig 1).

Fig 1 : Notepad++ file with duplicates

To delete them, first check that the option “+Sort outputs only UNIQUE (at column) lines” is enabled (Fig 2), then select your data (Fig 3).

Fig 2 : Notepad++ sort unique
Fig 3 : Notepad++ select data

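You now have two choices:

  • sort case insensitive (Fig 4), which treats lines that differ only by case as duplicates and removes them (Fig 5),
  • sort case sensitive (Fig 6), which removes only strictly identical lines (Fig 7).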

Fig 4 : Notepad++ sort case insensitive
Fig 5 : Notepad++ duplicate lines removed
Fig 6 : Notepad++ sort case sensitive
Fig 7 : Notepad++ identical lines removed
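For readers who prefer a script, the two sort modes can be approximated in a few lines of Python. This is only a sketch of the idea, not how TextFX works internally: the function name `sort_unique` and the sample data are hypothetical, and the choice of which line to keep among lines that differ only by case (here, the first one in sort order) is an assumption. The exact ordering may also differ from the plugin's.

```python
def sort_unique(lines, case_sensitive=True):
    """Return the lines sorted, keeping a single copy of each duplicate.

    Hypothetical helper for illustration; not part of Notepad++ or TextFX.
    """
    if case_sensitive:
        # Only strictly identical lines count as duplicates.
        return sorted(set(lines))

    # Case-insensitive: lines that differ only by case count as duplicates.
    # Assumption: the first line met in sort order is the one that is kept.
    seen = set()
    result = []
    for line in sorted(lines, key=str.lower):
        key = line.lower()
        if key not in seen:
            seen.add(key)
            result.append(line)
    return result


data = ["pear", "Apple", "apple", "pear", "Banana"]
print(sort_unique(data, case_sensitive=False))  # ['Apple', 'Banana', 'pear']
print(sort_unique(data, case_sensitive=True))   # ['Apple', 'Banana', 'apple', 'pear']
```

The case-sensitive result keeps both "Apple" and "apple", which is the difference between the two menu entries shown in the figures.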

You can then copy the two resulting lists into separate files and compare them (Fig 8). The result shows, in the first file, the lines that do not exist in the second one and, in the second file, the lines that do not exist in the first one (Fig 9).

Fig 8 : Notepad++ compare files
Fig 9 : Notepad++ compare results
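The same comparison can be sketched in Python. The function name `compare_lists` and the two sample lists are hypothetical; the sketch simply reports the lines found in one list and not in the other, which is the result described above rather than a reproduction of the Notepad++ compare view.

```python
def compare_lists(first, second):
    """Return (lines only in first, lines only in second), in original order.

    Hypothetical helper for illustration; not part of Notepad++.
    """
    first_set = set(first)
    second_set = set(second)
    only_in_first = [line for line in first if line not in second_set]
    only_in_second = [line for line in second if line not in first_set]
    return only_in_first, only_in_second


# Sample data: a case-insensitive result and a case-sensitive result.
insensitive = ["Apple", "Banana", "pear"]
sensitive = ["Apple", "Banana", "apple", "pear"]

only_first, only_second = compare_lists(insensitive, sensitive)
print(only_first)   # []
print(only_second)  # ['apple']
```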