Removing Duplicates From a List in Python
Let’s go over a few idiomatic ways to remove duplicates from lists in Python.
Method #1 - Create a new list (simplest)
This is the easiest algorithm to code, but it is also the slowest on long lists. It builds a second list, which takes extra memory, and every membership check scans that list from the start, so the total work grows as O(n^2).
def remove_duplicates(original):
    deduped = []
    for item in original:
        if item not in deduped:
            deduped.append(item)
    return deduped
We take advantage of Python’s in operator here, only adding each item to the final list if it isn’t already present.
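As a quick check, here is a small usage sketch. The function is repeated so the snippet runs on its own, and the sample inputs are made up for illustration. Because the membership test compares items with == rather than hashing them, this version also works on unhashable items such as nested lists.

```python
def remove_duplicates(original):
    deduped = []
    for item in original:
        # "in" scans deduped from the start, comparing with ==
        if item not in deduped:
            deduped.append(item)
    return deduped


# First occurrences are kept, in their original order.
print(remove_duplicates([3, 1, 3, 2, 1]))   # [3, 1, 2]

# Nested lists are not hashable, but equality checks still work.
print(remove_duplicates([[1], [1], [2]]))   # [[1], [2]]
```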
Method #2 - Create a new list with syntactic sugar (less code, harder to understand)
def remove_duplicates(original):
    deduped = []
    [deduped.append(item) for item in original if item not in deduped]
    return deduped
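This has the same performance characteristics as Method #1, but the loop fits on one line. The list comprehension is used only for its side effect and builds a throwaway list of None values, so the explicit loop is usually easier to read. If you’re into code golf, then this might be your solution.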
Method #3 - Use the built-in “set” data structure (fast, loses order)
A set is a collection of values that doesn’t contain any duplicates. By converting a list into a set and back, you remove all duplicates. The main drawbacks here are that you’ll lose your ordering and that every item must be hashable.
def remove_duplicates(original):
    return list(set(original))
This method will be faster in most circumstances than the previous two because each conversion is O(n) on average. Two O(n) passes are still O(n) overall, which beats the O(n^2) work of the first two methods as the list grows. As a bonus, it even uses less code.
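The two caveats of the set approach can be seen in a short sketch. The function is repeated so the snippet runs on its own, and the sample inputs are made up for illustration.

```python
def remove_duplicates(original):
    return list(set(original))


result = remove_duplicates(["b", "a", "b", "c"])
# The duplicates are gone, but the order of result is not guaranteed,
# so sort it before comparing.
print(sorted(result))   # ['a', 'b', 'c']

# Every item must be hashable; a list of lists raises TypeError.
try:
    remove_duplicates([[1], [1], [2]])
except TypeError as err:
    print("unhashable item:", err)
```

If the items are unhashable, Method #1 still works because it only relies on equality checks.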
Method #4 - Use an ordered dictionary (fast, maintains order)
By using the collections module’s OrderedDict type, we can maintain the ordering of the list while keeping the same O(n) running time that we had with a set.
from collections import OrderedDict

def remove_duplicates(original):
    return list(OrderedDict.fromkeys(original))
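On Python 3.7 and later, the built-in dict also preserves insertion order, so the same idea should work without the import. The sketch below assumes such an interpreter. The name remove_duplicates_dict is hypothetical and used only to distinguish it from the version above.

```python
def remove_duplicates_dict(original):
    # dict.fromkeys keeps one key per distinct item, in first-seen order
    # (insertion order is a language guarantee from Python 3.7 onward)
    return list(dict.fromkeys(original))


print(remove_duplicates_dict(["b", "a", "b", "c", "a"]))   # ['b', 'a', 'c']
```

As with sets, the items must be hashable because they are used as dictionary keys.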