(EPI1.3.1) Understanding solution 15.6

#1

I’m working on meta algorithms and the optimized O(nlogn) solution for longest nondecreasing subsequence stumped me.

template <typename T>
int longest_nondecreasing_subsequence(vector<T> &A)
{
	vector<T> tail_values;
	for(const T &a: A)
	{
	    auto it = upper_bound(tail_values.begin(), tail_values.end(), a);
	    if( it == tail_values.end())
     	    {
		tail_values.emplace_back(a);
            }
            else
            {
    	      *it = a;
            }
       }
       return tail_values.size();
}

As I understand the above problem, upper_bound(first, last, val) works on only sorted array as it is trying to do binary search for first value greater than val
Suppose we pass following vector as input: vector<int> A{4,2,7,8,1}
The result tail_values would be tail_values = {1,7,8}
I understand the tail_values.size() is the right number for answer but {1,7,8} is not the actual increasing subsequence. Also the upper_bound worked because every time we modified tail_values we made sure it remains sorted.
but Its just hard to get an intuition or the thought process behind this answer for me. Could you explain it better? The theoretical explanation in the book doesn’t make sense to me.

EDIT: just to be clear, theoretical explanation talks about all the steps in the code including modifying the tail_values array. What is unclear to me is how can one ever think that size of {1,7,8} is the right answer to an input array which has elements in the order{4,2, 7,8,1}, even though it is the right answer. I hope you understand what I mean.

0 Likes

#2

Hey Akash,

Thanks for your question, and this is a very hard question for sure. Basically, you cannot treat the values in tail_values as the subsequence you want to find as this method is limited in the sense of reconstructing of the subsequence you are looking for. Instead, each value in tail_values stores the most potential value to let it grow. For example, if we have an array A = <1, 3, 5, 4, 2>, when we process 5, tail_values will be <1, 3, 5> where 1 at index 1 denotes the most potential value for a subsequence length 1 to grow, 3 at index 2 denotes the most potential value for a subsequence length 2 to grow (note that it does not know how to construct the subsequence itself), etc. Let’s continue this example, when we process 4, tail_values will be <1, 3, 4> where 4 replace 5 becomes the most potential value of length 3 subsequence. Then when we process 2, tail_values becomes <1, 2, 4> and you shall understand that why <1, 2, 4> stores the most potential values to grow but itself does not represent the subsequence itself, and we never said it could in the book so don’t confuse about that. However, the size of itself actually tells us how long this subsequence could be.

Please let me know if you have any problem.

1 Like

#3

Thanks man.
Makes sense!

0 Likes