Python - Remove K length Duplicates from String

Last Updated : 18 Jan, 2025

To remove consecutive K-length duplicates from a string, iterate through the string, compare each substring with the next, and exclude the duplicates. For example, given s = "abcabcabcabc" and K = 3, the Counter-based method below produces "aaaabc". The exact result depends on how a duplicate is defined, so the methods shown here (Counter from collections, string replacement by slicing, a set with a sliding window, and list comprehension) give different outputs for the same input.

Using `collections.Counter`

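This method counts every substring of length K, then keeps the first character of each window whose substring does not appear exactly K times. The last K-1 characters are always kept.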

Python

from collections import Counter

s = "abcabcabcabc"
k = 3

# Count the occurrences of each substring of length K
sub_c = Counter(s[i:i+k] for i in range(len(s) - k + 1))

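# Keep the first character of each window unless its substring appears exactly K times
kept = [s[i] for i in range(len(s) - k + 1) if sub_c[s[i:i+k]] != k]

# The last K-1 characters start no full window, so keep them unchanged
tail = [s[i] for i in range(len(s) - k + 1, len(s))]

res = ''.join(kept + tail)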

print(res)

Output

aaaabc

Explanation:

Counter() counts the occurrences of every substring of length K in the string s.
The new string keeps the first character of each window whose substring does not occur exactly K times, then appends the last K-1 characters unchanged.
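
To see why the result is "aaaabc", it can help to inspect the counts and the window positions that survive the filter. The snippet below is a small illustrative sketch; the variable names counts and kept are chosen here and are not part of the original code.

```python
from collections import Counter

s = "abcabcabcabc"
k = 3

# Count every overlapping window of length k
counts = Counter(s[i:i + k] for i in range(len(s) - k + 1))
print(dict(counts))  # {'abc': 4, 'bca': 3, 'cab': 3}

# Start positions whose window does NOT occur exactly k times
kept = [i for i in range(len(s) - k + 1) if counts[s[i:i + k]] != k]
print(kept)  # [0, 3, 6, 9]
```

Only the windows starting at "a" survive, because "abc" occurs 4 times while "bca" and "cab" occur exactly 3 times each. The four kept characters plus the trailing "bc" give "aaaabc".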

Using String Slicing

This method removes consecutive K-length duplicates by slicing the string and rebuilding it without the repeated part. It does not call replace().

Python

s = "abcabcabcabc"
k = 3

# Loop to find and remove duplicates of length K
for i in range(len(s) - k + 1):
    if s[i:i+k] == s[i+k:i+2*k]:  # Check for consecutive K-length duplicates
        s = s[:i+k] + s[i+2*k:]

print(s)

Output

abcabc

Explanation:

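The loop iterates through the string, checking whether two consecutive substrings of length K are identical using s[i:i+k] == s[i+k:i+2*k].
When consecutive duplicates are found, the string is updated by removing the second copy using slicing (s[:i+k] + s[i+2*k:]).
Because the index still advances after each removal, a single pass can leave repeats behind: here four copies of "abc" are reduced to two.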
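
If the goal is to collapse every run of repeated K-length blocks down to a single copy, one possible variation is to re-check the same position after each removal instead of moving on. The sketch below is an illustration under that assumption, not the method shown above; the function name collapse_repeats is hypothetical.

```python
def collapse_repeats(s: str, k: int) -> str:
    """Collapse each run of identical consecutive k-length blocks to one copy."""
    if k <= 0:
        return s
    i = 0
    # Stop once there is no room left for two full blocks
    while i + 2 * k <= len(s):
        if s[i:i + k] == s[i + k:i + 2 * k]:
            # Drop the second copy, then compare again at the same index
            s = s[:i + k] + s[i + 2 * k:]
        else:
            i += 1
    return s


print(collapse_repeats("abcabcabcabc", 3))  # abc
```

Using a while loop with an explicit bound also avoids comparing empty slices near the end of the string, which the for loop above does harmlessly once the string has shrunk.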

Using Set and Sliding Window

This method slides a window of size K over the string and keeps each distinct substring only the first time it appears, using a set to record the substrings already seen.

Python

s = "abcabcabcabc"
k = 3

res = []
seen = set()

# Traverse the string with a sliding window of size K
for i in range(len(s) - k + 1):
    sub = s[i:i+k]
    if sub not in seen:
        res.append(sub)
        seen.add(sub)


print(''.join(res))

Output

abcbcacab

Explanation:

The loop slides a window of size K over the string one character at a time and extracts the substring s[i:i+k].
A substring is appended to res only if it is not already in seen, so the result joins the distinct windows in order of first appearance ("abc", "bca", "cab").
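
Because the windows overlap, the joined result is longer than any single block. As a hedged variation, the same idea can be wrapped in a function with a step argument: a step of 1 matches the overlapping windows above, while a step of K looks only at non-overlapping blocks. The names unique_windows and step are hypothetical and chosen for this sketch.

```python
def unique_windows(s: str, k: int, step: int = 1) -> str:
    """Join the distinct k-length windows of s in order of first appearance."""
    seen = set()
    parts = []
    for i in range(0, len(s) - k + 1, step):
        sub = s[i:i + k]
        if sub not in seen:
            seen.add(sub)
            parts.append(sub)
    return "".join(parts)


print(unique_windows("abcabcabcabc", 3))          # abcbcacab
print(unique_windows("abcabcabcabc", 3, step=3))  # abc
```

Note that with step=k any trailing characters that do not fill a whole block are dropped in this sketch.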

Using List Comprehension

This method builds the result in a list comprehension, keeping each K-length window only if it is not contained in the K characters that follow it.

Python

s = "abcabcabcabc"
k = 3

# Using list comprehension to remove K-length duplicates
res = ''.join([s[i:i+k] for i in range(len(s) - k + 1) if s[i:i+k] not in s[i+k:i+2*k]])

print(res)

Output

bcacababc

Explanation:

The list comprehension iterates over the string, extracts each substring of length K, and checks that it does not occur in the next K characters.
Substrings that pass the check are joined together using ''.join(). In this example only the last three windows ("bca", "cab", "abc") are kept, because the slice that follows each of them is shorter than K characters.
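
The check can be traced window by window. The following sketch lists each kept window together with the slice it was compared against; the variable name kept is chosen here for illustration.

```python
s = "abcabcabcabc"
k = 3

# (start index, window, following slice) for every window that passes the check
kept = [
    (i, s[i:i + k], s[i + k:i + 2 * k])
    for i in range(len(s) - k + 1)
    if s[i:i + k] not in s[i + k:i + 2 * k]
]

for item in kept:
    print(item)
# (7, 'bca', 'bc')
# (8, 'cab', 'c')
# (9, 'abc', '')
```

Every earlier window is followed by an identical K-length slice and is therefore dropped, which is why the output is "bcacababc".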