Python - Remove K length Duplicates from String
To remove consecutive K-length duplicates from a string, iterate through the string, compare each substring of length K with the one that follows it, and drop the repeats. For example, given s = "abcabcabcabc" and K = 3, we want to remove the K-length duplicates; the first approach below turns the string into "aaaabc". The approaches covered here use Counter from the collections module, string replacement via slicing, a set with a sliding window, and a list comprehension. Each one interprets "duplicate" slightly differently, so their outputs for the same input differ.
Using collections.Counter
This method counts every overlapping substring of length K and skips the positions whose substring appears exactly K times.
from collections import Counter
s = "abcabcabcabc"
k = 3
# Count the occurrences of each substring of length K
sub_c = Counter(s[i:i+k] for i in range(len(s) - k + 1))
# Remove substrings that appear exactly K times
res = ''.join([s[i] for i in range(len(s) - k + 1) if sub_c[s[i:i+k]] != k] + [s[i] for i in range(len(s)-k+1, len(s))])
print(res)
Output
aaaabc
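Explanation:
- Counter() counts the occurrences of every substring of length K in the string s.
- The new string is built character by character: the character at index i is kept only if the substring starting at i does not appear exactly K times. The last K - 1 characters are always kept.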
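To see why the sample string produces "aaaabc", it can help to look at the counts themselves. The following is a small sketch that reuses the same s and k; the helper name window_counts is hypothetical and only serves this illustration.

```python
from collections import Counter

def window_counts(s, k):
    """Count every overlapping window of length k in s (hypothetical helper)."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))

counts = window_counts("abcabcabcabc", 3)
print(counts["abc"], counts["bca"], counts["cab"])  # 4 3 3
```

Only "abc" has a count different from k = 3, so only the first character of each "abc" window survives, followed by the final "bc" that the code always appends.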
Using String Replacement
This method removes K-length duplicates explicitly by rebuilding the string with slicing. Despite the name, the code below does not call str.replace().
s = "abcabcabcabc"
k = 3
# Loop to find and remove duplicates of length K
for i in range(len(s) - k + 1):
    if s[i:i+k] == s[i+k:i+2*k]:  # Check for consecutive K-length duplicates
        s = s[:i+k] + s[i+2*k:]
print(s)
Output
abcabc
Explanation:
- The loop iterates through the string, checking whether two consecutive substrings of length K are identical using s[i:i+k] == s[i+k:i+2*k].
- When consecutive duplicates are found, the string is updated by removing the second copy with slicing (s[:i+k] + s[i+2*k:]).
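The loop above stops at "abcabc" because i moves forward after every removal, so a repeat that remains at an earlier position is never re-checked. If the goal is to collapse a whole run of repeats down to a single copy, one possible variation is to stay at the same index after each removal. This is a sketch under that assumption, not the method shown above; the function name collapse_adjacent_repeats is hypothetical.

```python
def collapse_adjacent_repeats(s, k):
    """Remove every block of length k that repeats the block right before it."""
    if k < 1:
        raise ValueError("k must be at least 1")
    i = 0
    # Only compare while two full blocks of length k fit from index i.
    while i + 2 * k <= len(s):
        if s[i:i + k] == s[i + k:i + 2 * k]:
            # Drop the second copy and re-check the same position.
            s = s[:i + k] + s[i + 2 * k:]
        else:
            i += 1
    return s

print(collapse_adjacent_repeats("abcabcabcabc", 3))  # abc
print(collapse_adjacent_repeats("aabbcc", 1))        # abc
```

Using a while loop instead of a fixed range also avoids comparing empty slices once the string has shrunk.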
Using Set and Sliding Window
This method uses a sliding window to keep track of substrings and removes duplicates by checking if the substring appears more than once.
s = "abcabcabcabc"
k = 3
res = []
seen = set()
# Traverse the string with a sliding window of size K
for i in range(len(s) - k + 1):
    sub = s[i:i+k]
    if sub not in seen:
        res.append(sub)
        seen.add(sub)
print(''.join(res))
Output
abcbcacab
Explanation:
- Loop iterates through the string, checking if two consecutive substrings of length
K
are identical usings[i:i+K] == s[i+K:i+2*K]
. - Consecutive duplicates are found string is updated by removing the second duplicate substring using slicing (
s[:i+K] + s[i+2*K:]
)
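The sliding window in this method moves one character at a time, so overlapping windows such as "bca" and "cab" end up in the output. If the string is instead read as back-to-back blocks of length K, a similar set-based check can be applied to non-overlapping chunks. This is a sketch of that variation; the function name unique_chunks and the second sample string are assumptions made for illustration.

```python
def unique_chunks(s, k):
    """Keep the first occurrence of each non-overlapping chunk of length k."""
    seen = set()
    out = []
    # Step by k so that chunks never overlap; the last chunk may be shorter.
    for i in range(0, len(s), k):
        chunk = s[i:i + k]
        if chunk not in seen:
            seen.add(chunk)
            out.append(chunk)
    return "".join(out)

print(unique_chunks("abcabcabcabc", 3))  # abc
print(unique_chunks("abcxyzabcpq", 3))   # abcxyzpq
```

Unlike the consecutive check used in the earlier methods, this version also drops a repeated chunk that is not adjacent to its first occurrence.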
Using List Comprehension
This method checks for duplicate substrings of length K and removes them by comparing substrings in a list comprehension.
s = "abcabcabcabc"
k = 3
# Using list comprehension to remove K-length duplicates
res = ''.join([s[i:i+k] for i in range(len(s) - k + 1) if s[i:i+k] not in s[i+k:i+2*k]])
print(res)
Output
bcacababc
Explanation:
- The list comprehension iterates over the string, extracting each substring of length K and checking that it does not occur in the next K characters.
- Substrings that pass the check are joined together using ''.join() to form the final string.
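To understand the output "bcacababc", it can be useful to print which windows pass the filter and what they were compared against. The following sketch reuses the same s, k and condition; the variable name kept is hypothetical.

```python
s = "abcabcabcabc"
k = 3

# Collect (index, window, following block) for every window that is kept.
kept = [
    (i, s[i:i + k], s[i + k:i + 2 * k])
    for i in range(len(s) - k + 1)
    if s[i:i + k] not in s[i + k:i + 2 * k]
]

for i, window, following in kept:
    print(i, repr(window), repr(following))
# 7 'bca' 'bc'
# 8 'cab' 'c'
# 9 'abc' ''
```

Only the last three windows survive, because near the end of the string the following slice is shorter than k and can no longer contain the window.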