13.13 Better time complexity for computing all string decompositions

#1

Here is my idea for 13.13 with a better time complexity. We can use an implementation similar to 13.8, i.e. we find the smallest substring that contains all the words and check whether that substring is indeed a concatenation of all the given words. Our result is all such substrings, reported by their starting indices.

Initially we find all occurrences of all words in the given string and store them in an unordered_map that maps each index at which a word starts in the given string to the word itself. After this we just advance two pointers, similar to the approach in 13.8, to find the smallest substring containing all the words. The time complexity of the code should be O(Nm + nm), because we find all occurrences of all words in the given string: the time taken to find all occurrences of one word is O(N + n), and since there are m words the total complexity is O(Nm + nm).

#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

// Returns the starting indices of all substrings of s that are a concatenation
// of every word in words (all words have equal length).
// Note: this version assumes each word appears only once in words (see follow-up post).
vector<int> allstringdecompositions(string& s, vector<string>& words)
{
	int unit_size = words[0].size(), total_words = words.size();

	// Map from each index in s at which some word starts to that word.
	unordered_map<int, string> sidx_to_word_mapping;
	for (const string& word : words)
	{
		size_t found = s.find(word);
		while (found != string::npos)
		{
			sidx_to_word_mapping[found] = word;
			found = s.find(word, found + 1);
		}
	}

	int lo = 0, hi = 0;
	vector<int> res;
	unordered_map<string, int> words_covered;  // word -> occurrences starting inside [lo, hi)
	while (true)
	{
		// Advance hi until the window covers every distinct word.
		while (hi < (int)s.size() && words_covered.size() < words.size())
		{
			auto iter = sidx_to_word_mapping.find(hi);
			if (iter != sidx_to_word_mapping.end())
				words_covered[iter->second]++;
			hi++;
		}
		if (words_covered.size() < words.size())
			break;
		// Shrink from the left; when a word is about to drop out of the window,
		// the window is minimal, so test whether it is exactly a concatenation.
		while (words_covered.size() >= words.size())
		{
			auto iter = sidx_to_word_mapping.find(lo);
			if (iter != sidx_to_word_mapping.end())
			{
				words_covered[iter->second]--;
				if (words_covered[iter->second] == 0)
				{
					// The last word in the window starts at hi-1 and all words
					// have length unit_size.
					int ending_idx = (hi - 1) + unit_size - 1;
					int starting_idx = lo;
					if (ending_idx - starting_idx + 1 == unit_size * total_words)
						res.push_back(lo);
					words_covered.erase(iter->second);
				}
			}
			lo++;
		}
	}
	return res;
}
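
For reference, here is a minimal driver showing how the function above might be called; the sample input is my own, not from the book, and it reuses the headers included above.

#include <iostream>

int main()
{
	string s = "barfoothefoobarman";
	vector<string> words = {"foo", "bar"};
	// Expected output: 0 9 (the decompositions "barfoo" and "foobar").
	for (int idx : allstringdecompositions(s, words))
		cout << idx << " ";
	cout << endl;
	return 0;
}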

#2

Actually, since the complexity of the library's find method is O(Nn), the overall complexity is still O(Nnm). Also, I didn't take into account that a single word can occur multiple times in the given list of words, as is handled in the book's implementation.
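
One way to account for duplicate words would be to count how many copies of each word are required and verify a candidate window chunk by chunk. A rough sketch of my own (not the book's code), assuming the same headers and using-declaration as above:

// Checks whether the substring of s starting at start is a concatenation of
// all the words, duplicates included (all words have equal length).
bool is_decomposition(const string& s, const vector<string>& words, int start)
{
	int unit_size = words[0].size(), total_words = words.size();
	if (start + total_words * unit_size > (int)s.size())
		return false;  // window would run past the end of s

	unordered_map<string, int> required;
	for (const string& word : words)
		required[word]++;  // how many copies of each word must appear

	for (int i = 0; i < total_words; i++)
	{
		string chunk = s.substr(start + i * unit_size, unit_size);
		auto iter = required.find(chunk);
		if (iter == required.end() || iter->second == 0)
			return false;  // chunk is not a word, or that word is already used up
		iter->second--;
	}
	return true;
}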
