Remove All Duplicates from a Given String in Python
The task of removing all duplicates from a given string in Python involves retaining only the first occurrence of each character while preserving the original order. Given an input string, the goal is to eliminate repeated characters and return a new string with unique characters. For example, with s = “geeksforgeeks”, the result would be “geksfor”.
Using OrderedDict.fromkeys()
OrderedDict removes duplicates from a string while preserving the order of characters. It works by converting the string into an ordered dictionary where each character is a key. Because dictionary keys are unique, only the first occurrence of each character remains.
from collections import OrderedDict
s = "geeksforgeeks"
res = "".join(OrderedDict.fromkeys(s))
print(res)
Output
geksfor
Explanation: This method creates an ordered dictionary of unique characters from the string and returns them in the original order.
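To make the intermediate step visible, the short sketch below inspects the ordered dictionary before it is joined back into a string. The variable names od and keys are illustrative only and do not appear in the example above.

```python
from collections import OrderedDict

s = "geeksforgeeks"

# Every character becomes a key; fromkeys() gives each key the value None.
od = OrderedDict.fromkeys(s)

# Iterating over the dictionary yields its keys in first-seen order.
keys = list(od)
print(keys)      # ['g', 'e', 'k', 's', 'f', 'o', 'r']
print(od["g"])   # None
```

Joining these keys with "".join() gives the final string geksfor.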
Using dict.fromkeys()
Similar to OrderedDict, this method uses a regular dictionary, which is guaranteed to preserve insertion order in Python 3.7 and later. Each character is inserted as a key once, so the string is processed in a single pass and no import is needed.
s = "geeksforgeeks"
res = "".join(dict.fromkeys(s))
print(res)
Output
geksfor
Explanation: This method uses a dictionary to remove duplicates while keeping the insertion order, similar to OrderedDict, but with slightly less overhead.
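If the same operation is needed in several places, it can be wrapped in a small helper. The function name remove_duplicates below is a hypothetical choice for illustration, not part of any library. The sketch also shows that the comparison is case-sensitive and that spaces are treated like any other character.

```python
def remove_duplicates(text):
    """Return text with only the first occurrence of each character kept."""
    # dict.fromkeys() keeps keys in insertion order (Python 3.7+).
    return "".join(dict.fromkeys(text))


print(remove_duplicates("geeksforgeeks"))  # geksfor
print(remove_duplicates("Hello World"))    # Helo Wrd
print(remove_duplicates("Aa"))             # Aa
print(repr(remove_duplicates("")))         # ''
```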
Using for loop
This approach removes duplicates by iterating through the string with a for loop, tracking previously seen characters using a set. The first occurrence of each character is added to the result string, which preserves the order.
s = "geeksforgeeks"
seen = set() # track unique characters
res = "" # result string
for char in s:
    if char not in seen:
        seen.add(char)
        res += char
print(res)
Output
geksfor
Explanation: This method checks if a character is already in the set and if not, it adds it to the result string, preserving the original order of characters.
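A common variation of the same loop collects the characters in a list and joins them once at the end, which avoids building a new string on every concatenation. This is a sketch of that alternative, not a change to the method above; the name parts is illustrative.

```python
s = "geeksforgeeks"
seen = set()   # characters encountered so far
parts = []     # first occurrences, in order

for char in s:
    if char not in seen:
        seen.add(char)
        parts.append(char)

res = "".join(parts)  # single join instead of repeated +=
print(res)  # geksfor
```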
Using list comprehension
This method uses a list comprehension with slicing rather than an actual set: a character is kept only if it does not appear in the part of the string that precedes it. The order is preserved, but every check scans the slice s[:i], so the running time grows quadratically with the length of the string, which makes this approach slow for larger strings.
s = "geeksforgeeks"
res = "".join([char for i, char in enumerate(s) if char not in s[:i]])
print(res)
Output
geksfor
Explanation: List comprehension iterates through the string s, adding each character to res only if it has not appeared before (i.e., it is not present in the substring s[:i]), effectively removing duplicates while maintaining order.
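For comparison, the slice check can be replaced with a set lookup inside the comprehension. The sketch below relies on set.add() returning None, so the condition both tests membership and records the character. This idiom is compact but less readable than the explicit for loop, so treat it as one possible variant rather than a recommended form.

```python
s = "geeksforgeeks"
seen = set()

# "char in seen" is True for repeats, so they are skipped.
# For a new character, seen.add(char) runs, returns None, and the
# character is kept because "not None" is True.
res = "".join([char for char in s if not (char in seen or seen.add(char))])
print(res)  # geksfor
```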