Find length of longest substring with no more than K distinct characters

0 / 468
substring with no more than K distinct characters

Problem Statement

 

Given a string, find the length of the longest substring in it with no more than K distinct characters.


 

Example 1:


  Input: String=”araaci”, K=2 Output: 4 Explanation: The longest substring with no more than ‘2’ distinct characters is “araa”.      

Example 2:


  Input: String=”araaci”, K=1 Output: 2 Explanation: The longest substring with no more than ‘1’ distinct characters is “aa”.    

Example 3:


  Input: String=”cbbebi”, K=3 Output: 5 Explanation: The longest substrings with no more than ‘3’ distinct characters are “cbbeb” & “bbebi”.        

Brute force approach

 
 

As seen in the FINDING MAX SUM OF SUBARRAY WITH A GIVEN LENGTH IN AN ARRAY problem, a basic brute force solution can be employed by finding all substrings with length <= K, but that isn’t an effective solution and hence we will not discuss it.


 

Efficient Solution


 

This problem follows the Sliding Window pattern and we can use a similar dynamic sliding window strategy as discussed in Smallest Subarray with a given sum. We can use a map to remember the frequency of each character we have processed. Here is how we will solve this problem:


 
  • First, we will insert characters from the beginning of the string until we have ‘K’ distinct characters in the map by looping through the string.
 
  • These characters will make up our sliding window. We are asked to find the longest such window having no more than ‘K’ distinct characters. We will remember the length of this window as the longest window so far and keep looping through and finding the max window size in each pass.
 
  • After this, we will keep adding one character in the sliding window (i.e. slide the window ahead), in a stepwise fashion, one step at a time.
 
  • In each step, we will try to shrink the window from the beginning if the count of distinct characters in the Map is larger than ‘K’. We will shrink the window until we have no more than ‘K’ distinct characters in the Map. This is needed as we intend to find the longest window.
 
  • While shrinking, we’ll decrement the frequency of the character going out of the window and remove it from the Map if its frequency becomes zero as we have already processed it.
 
  • At the end of each step, we’ll check if the current window length is the longest so far, and if so, remember its length. At the end we will thus have the longest window which we will return.
   

Here is the visual representation of this algorithm for the Example-1:


 
substring with no more than K distinct characters

substring with no more than K distinct characters

      Javascript code below:-    
<script>
const longest_substring_with_k_distinct = function(str, k) {
  // TODO: Write your code here
  var windowStart = 0;
  var map = {};
  var maxLength = 0;

  for (var windowEnd = 0; windowEnd < str.length; windowEnd++) {
    if (typeof map[str[windowEnd]] === 'undefined') {
      map[str[windowEnd]] = 0;
    }

    map[str[windowEnd]]++;

    // shrink the sliding window, until we are left with 'k' distinct characters in the char_frequency
    while (Object.keys(map).length > k) {
      const leftChar = str[windowStart];
      // check if character frequency is distinct in map
      if (map[leftChar] === 1) {
        delete map[leftChar];
      }
      windowStart += 1; // shrink the window
    }
    // remember the maximum length so far
    maxLength = Math.max(maxLength, windowEnd - windowStart + 1);

  }

  return maxLength;
};

document.write(`Length of the longest substring: ${longest_substring_with_k_distinct('araaci', 2)}<br/>`);
document.write(`Length of the longest substring: ${longest_substring_with_k_distinct('araaci', 1)}<br/>`);
document.write(`Length of the longest substring: ${longest_substring_with_k_distinct('cbbebi', 3)}<br/>`);

</script>
   

Time Complexity 

   

The time complexity of the above algorithm will be O(N) where ‘N’ is the number of characters in the input string. The outer for loop runs for all characters and the inner while loop processes each character only once, therefore the time complexity of the algorithm will be O(N+N) which is asymptotically equivalent to O(N).


 

Space Complexity

   

The space complexity of the algorithm is O(K), as we will be storing a maximum of ‘K+1’ characters in the Map.


 

Comments

comments


An avid reader, responsible for generating creative content ideas for golibrary.co. His interests include algorithms and programming languages. Blogging is a hobby and passion.

Related Posts