Hi:

I have a question regarding the code in the book camp of heaps chapter, Python version.

In the top_k(k, stream) implementation, the last line uses heapq.nsmallest function to sort the k elements in the current heap. I am just confused why not using a sort() function directly to the current heap array (if result needs to be sorted), or just output the elements in the heap array sequentially (if the result doesn’t need to be sorted)?

The code in the book is actually very similar to an implementation of nlargest(k, L) of a heap (except that the input is a stream). I searched its actual implementation: the last operation is actually sort(reverse = True). However, if you use nsmallest function in the last line, its real execution will be to construct an identical list of the current heap, heapify that list again (because the function doesn’t know it is already a heap. Though no operation is needed, checks will be done), and then pop out the elements one by one, which is wasteful in both time and space.

Am I correct?

Thanks.