Maximum Repeating Substring

This is an easy problem from LeetCode (Problem #1668), which demonstrates the power of dynamic programming.

Problem

For a string sequence, a string word is $k$-repeating if word concatenated $k$ times is a substring of sequence. The word’s maximum $k$-repeating value is the highest value $k$ where word is $k$-repeating in sequence. If word is not a substring of sequence, word’s maximum $k$-repeating value is 0.

Given strings sequence and word, return the maximum $k$-repeating value of word in sequence.

Example 1:

Input: sequence = "ababc", word = "ab"

Output: 2

Explanation: "abab" is a substring in "ababc".

Example 2:

Input: sequence = "ababc", word = "ba"

Output: 1

Explanation: "ba" is a substring in "ababc". "baba" is not a substring in "ababc".

Example 3:

Input: sequence = "ababc", word = "ac"

Output: 0

Explanation: "ac" is not a substring in "ababc".

Solution

We will use an array named dp of length len(sequence) + 1, where dp[i] denotes the maximum number of consecutive occurrence of word upto index i in sequence (i.e: the maximum number of consecutive occurrences of word that ended at index i-1).

If there are dp[i] number of consecutive words that end at index i-1, if word occurs again at index i, then we have discovered one more consecutive occurrence of word. This means, we will need to update dp[i + len(word)] to dp[i] + 1.

Alternatively, if we have just discovered the occurrence of word ending at index i-1, then dp[i] = dp[i-len(word)] + 1.

We initialize all the elements of dp to 0 at the beginning.

    def maxRepeating(sequence: str, word: str):
        dp=[0]*(len(sequence)+1)
        for i in range(len(word), len(sequence)+1):
            if word==sequence[i-len(word):i]:
                dp[i]=dp[i-len(word)]+1
        return max(dp)

We return the maximum value in the dp array as it denotes the maximum-consecutive occurrances of word in sequence.

Consider the following example: sequence = "ababcababab" and word = "ab".

At the end of the call to maxRepeating(sequence, word), the dp[] array will contain the following entries:

dp_array2

For the above example, the loop index $i$ starts at len(word) = 2 and ends at len(sequence) = 11. Let us follow each iteration in the for loop of the above algorithm:

$i = 2$: Since word = sequence[0:2], we set dp[2] = dp[i-len(word)] + 1 = dp[0] + 1 = 1.

$i = 3$: Since word $\neq$ sequence[1:3], we don’t update dp[3] (it remains at 0).

$i = 4$: Since word = sequence[2:4], we set dp[4] = dp[i-len(word)] + 1 = dp[2] + 1 = 2.

$i = 5$: Since word $\neq$ sequence[3:5], we don’t update dp[5] (it remains at 0).

$i = 6$: Since word $\neq$ sequence[4:6], we don’t update dp[6] (it remains at 0).

$i = 7$: Since word = sequence[5:7], we set dp[7] = dp[i-len(word)] + 1 = dp[5] + 1 = 1.

$i = 8$: Since word $\neq$ sequence[6:8], we don’t update dp[8] (it remains at 0).

$i = 9$: Since word = sequence[7:9], we set dp[9] = dp[i-len(word)] + 1 = dp[7] + 1 = 2.

$i = 10$: Since word $\neq$ sequence[8:10], we don’t update dp[10] (it remains at 0).

$i = 11$: Since word = sequence[9:11], we set dp[11] = dp[i-len(word)] + 1 = dp[9] + 1 = 3.

Now, we exit out of the for loop and return max(dp), which is 3.

Consider another example: sequence = "aaabaaaabaaabaaaabaaaabaaaabaaaaba" and word = "aaaba".

At the end of the call to maxRepeating(sequence, word), the dp[] array will contain the following entries:

dp = [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 3, 0, 0, 0, 0, 4, 0, 0, 0, 0, 5]

Since max(dp) = 5, there are 5 consecutive occurrance of "aaaba" in "aaabaaaabaaabaaaabaaaabaaaabaaaaba".

Runtime: Since we are iterating through the loop only once, the runtime is $O(n)$. Due to the use of dp array, the space complexity is $O(n)$.

Back to Dynamic Programming Problems

Anusha Murali

Maximum Repeating Substring

anusha-murali.github.io