Learn and Learn

Learn And Learn - great place for tutorials, references and how-to

Python: Remove Duplicate Items/Elements from list – Examples & Explanation

There are several techniques you can use to remove duplicate elements from a list in Python, more than ten in all. We will discuss only the five best ways to remove duplicate elements or items from a list. These methods are tested and widely used by experienced Python developers. You can use the following techniques:

Method – 1:

Using set() Function to Remove Duplicate Items from list

A set is an unordered collection of unique elements, and Python implements sets internally with hashing. Converting a list to a set is therefore a simple and fast way to remove duplicate items. The original order of the list may not be preserved with this approach. It also does not work if the list contains dict or similar objects, because dict objects are not hashable in Python.
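
A minimal sketch of this approach. The variable names are illustrative, and the input list is the one shown in the output that follows.

```python
# Input list containing repeated values.
duplicates = [1, 2, 3, 4, 1, 2, 3, 5, 6, 7]

# set() keeps one copy of each value; list() turns the result back into a list.
# The order of the resulting list is not guaranteed.
unique = list(set(duplicates))

print("List with duplicate elements:", duplicates)
print("List with unique elements:", unique)
```

If the list contained a dict, for example `[{'a': 1}, {'a': 1}]`, the call to `set()` would raise a `TypeError` because dict objects are unhashable.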


Output:
List with duplicate elements: [1, 2, 3, 4, 1, 2, 3, 5, 6, 7]
List with unique elements: [1, 2, 3, 4, 5, 6, 7]

 

Method – 2:

Remove Duplicate Items from list and Keep Order in Python

The following code snippet does the job well. It removes duplicate items from the list and preserves their order. It does not work when the list contains set objects, because they are not hashable. It is also quite fast.
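
One common way to write such an order-preserving snippet is to track the items already seen in a set. This is a sketch under that assumption, not necessarily the exact code this method refers to; the function name `remove_duplicates_keep_order` is hypothetical.

```python
def remove_duplicates_keep_order(seq):
    """Return a new list with duplicates removed, keeping first occurrences."""
    seen = set()
    # seen.add(x) returns None, so the condition is true only the first
    # time x appears, and x is recorded in `seen` as a side effect.
    return [x for x in seq if not (x in seen or seen.add(x))]


original = ['b', 'b', 'a', 'a', 1, 2, 3, 4, 1, 2, 3, 5, 6, 7]
unique = remove_duplicates_keep_order(original)

print("Original List:", original)
print("Unique List:", unique)
```

Because every item is added to a set, all items must be hashable, which is why lists containing sets or dicts are not supported.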


Output:

Original List: ['b', 'b', 'a', 'a', 1, 2, 3, 4, 1, 2, 3, 5, 6, 7]
Unique List: ['b', 'a', 1, 2, 3, 4, 5, 6, 7]

 

Method – 3:

Using list Comprehension to Remove Duplicate Items from list in Python
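
A sketch of two list comprehensions that remove duplicates by checking membership against slices of the list itself. The input list `mylist` is an assumption chosen for illustration, since the original input is not shown. The first comprehension keeps the first occurrence of each item and the second keeps the last.

```python
# Assumed input list (not given in the article).
mylist = ['a', 'a', 'b', 'b', 1, 2, 3, 4, 1, 2, 3, 5, 6, 7]

# Keep an item only if it has not appeared earlier in the list.
newlist1 = [x for i, x in enumerate(mylist) if x not in mylist[:i]]

# Keep an item only if it does not appear again later in the list.
newlist2 = [x for i, x in enumerate(mylist) if x not in mylist[i + 1:]]

print("newlist1:", newlist1)
print("newlist2:", newlist2)
```

These comprehensions compare items with `==` instead of hashing them, so they also work on unhashable items such as sets. The cost is that every membership test scans a list slice, which becomes slow on large lists.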


Output:

newlist1: ['a', 'b', 1, 2, 3, 4, 5, 6, 7]
newlist2: ['a', 'b', 4, 1, 2, 3, 5, 6, 7]

 

Method – 4:

Using OrderedDict of collections Library to Remove Duplicate Items from list in Python

Note: In Python 3.5 and above, OrderedDict from the collections library has a C implementation. This makes it a fast way to remove duplicate elements from a list in Python while keeping their order. You should also give a try to
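
A minimal sketch using `OrderedDict.fromkeys`, which builds an ordered mapping whose keys are the list items in first-seen order. The input list `my_list` is an assumption; the four print statements mirror the four lines of the output that follows.

```python
from collections import OrderedDict

# Assumed input list (not given in the article).
my_list = ['a', 'a', 'b', 'b', 1, 2, 3, 4, 1, 2, 3, 5, 6, 7]

# Each item becomes a key with the value None; repeated keys are ignored.
ordered = OrderedDict.fromkeys(my_list)
unique_list = list(ordered)

print("myListObj:", ordered)
print("myListObj:", unique_list)
print("myListObj:", ordered.keys())
print("myListObj:", list(ordered.keys()))
```

The exact text printed for the OrderedDict object depends on the Python version. Since Python 3.7 a plain dict also preserves insertion order, so `list(dict.fromkeys(my_list))` gives the same list.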


Output:

myListObj: OrderedDict([('a', None), ('b', None), (1, None), (2, None), (3, None), (4, None), (5, None), (6, None), (7, None)])

myListObj: ['a', 'b', 1, 2, 3, 4, 5, 6, 7]

myListObj: odict_keys(['a', 'b', 1, 2, 3, 4, 5, 6, 7])
myListObj: ['a', 'b', 1, 2, 3, 4, 5, 6, 7]

 

Method – 5:

Using iteration_utilities.unique_everseen to Remove Duplicate Items from list in Python

This is an external implementation. Full details of iteration_utilities.unique_everseen are available in the package documentation. Among external packages, this is perhaps the fastest method to remove duplicate items from a list in Python, because most of the package is implemented in C. It preserves the order of the list items and also supports unhashable values such as dict. The iteration_utilities package must be installed before you can use it. If you need to handle large lists, you should give it a try.
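
A sketch of how `unique_everseen` can be called. The two input lists are assumptions chosen for illustration. The `except ImportError` branch is a hypothetical pure-Python stand-in added only so the example runs when the package is not installed; it is not the library's implementation.

```python
# Third-party package; install it first, for example: pip install iteration_utilities
try:
    from iteration_utilities import unique_everseen
except ImportError:
    # Illustrative stand-in (an assumption, not the library's code).
    def unique_everseen(iterable):
        seen_hashable = set()    # fast lookups for hashable items
        seen_unhashable = []     # slower fallback for unhashable items
        for item in iterable:
            try:
                hash(item)
            except TypeError:
                if item not in seen_unhashable:
                    seen_unhashable.append(item)
                    yield item
            else:
                if item not in seen_hashable:
                    seen_hashable.add(item)
                    yield item

lst = [1, 1, 2, 3, 3, 4]               # assumed input
lst_sets = [{1}, {1}, {2}, {3}, {2}]   # assumed input with unhashable items

# unique_everseen returns an iterator, so wrap it in list().
lst_unique = list(unique_everseen(lst))
lst_sets_unique = list(unique_everseen(lst_sets))

print("lst_unique:", lst_unique)
print("lst_sets_unique:", lst_sets_unique)
```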


Output:

lst_unique: [1, 2, 3, 4]
lst_sets_unique: [{1}, {2}, {3}]

Final words – How to remove duplicate items from the list in Python?

You have read five of the best ways to remove duplicate items from a list in Python. Every programmer has their own preferred technique for this purpose, and some may dig further and find their own fast and elegant method. The choice matters most when the list is large and the operation runs often. If performance is a requirement, consider all of these methods, pick the one that best suits your data, and run some tests before using it in a production environment.
Here is a brief summary of the five methods.

  1. Method 1 – If the order of the list does not matter and the list contains no unhashable items such as sets, then this method is quite fast and easy to read.
  2. Method 2 is also quite fast and preserves the order of the list, but it does not support sets.
  3. Method 3 uses the list comprehension technique. It can be used on set objects and some programmers may prefer it, but its drawback is that membership tests against a list are linear, so it slows down on large lists.
  4. Method 4 uses the OrderedDict class from the collections package. It is a very fast and elegant way to perform such operations. As already discussed, in Python 3.5 and above it is implemented in C. It is the recommended method when you don’t have sets in your list.
  5. Method 5 uses the external library iteration_utilities. According to its author, this library is implemented in C and performs very well. It requires some installation overhead. The installation procedure and package documentation can be read on the author's website.
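
The advice to test before choosing can be sketched with the standard `timeit` module. The function names, the test data, and the repeat count below are all assumptions for illustration; real measurements should use data that resembles your own lists.

```python
import timeit
from collections import OrderedDict


def dedupe_seen(seq):
    """Order-preserving removal using a set of seen items."""
    seen = set()
    return [x for x in seq if not (x in seen or seen.add(x))]


def dedupe_ordereddict(seq):
    """Order-preserving removal using OrderedDict keys."""
    return list(OrderedDict.fromkeys(seq))


def dedupe_scan(seq):
    """Order-preserving removal using list membership (no hashing needed)."""
    result = []
    for x in seq:
        if x not in result:
            result.append(x)
    return result


# Hypothetical test data: 2000 items with 200 distinct values.
data = [i % 200 for i in range(2000)]

candidates = {
    "seen set": dedupe_seen,
    "OrderedDict": dedupe_ordereddict,
    "list scan": dedupe_scan,
}

timings = {}
for name, func in candidates.items():
    # Total time in seconds for 5 runs over the same data.
    timings[name] = timeit.timeit(lambda: func(data), number=5)
    print(f"{name}: {timings[name]:.4f} s")
```

The absolute numbers depend on the machine and Python version, so only the relative ordering on your own data is meaningful.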

Don’t forget to write to us using the contact form if you have any suggestions or code. We will publish it with your name.

 

LearnAndLearn.com © 2018 - All rights reserved