Thanks for taking the trouble to share these issues with us.
-
You are correct about the memory overhead. We started with a C++ implementation, where there’s no notion of box-types, that mindset carried over when analyzing the Java program. We do prefer List to in[], it was a conscious decision, makes working with Collections easier (addAll(), retainAll(), some constructor calls, for example) Pedantically, RandomAccess would be the right call, since we implicitly assume O(1) random access times, but we didnt want to muddy things further.) I know there’s a big overhead to using box types (I did a compare recently, it’s about 5-10x slower to use Arrays.sort() on Integer[] compared to int[]) but for teaching and interview purposes, asymptotically they are the same and box-types are easier to work with.
-
Another good point, I think we will change the problem to say something like “in the abstract you have a sequence of integers and need to find a missing one” Since we use logical right shift, signedness issues are not relevant. If these were infact IP addresses, the long solution would work.
-
We’ve reworked this problem recently to take an Iterable as argument, instead of an input stream, which gives us the ability to iterate and restart. This reduces the burden of coding up I/O, which is full of gotcha’s (like the one you pointed out). We’ve actually removed streams just about everywhere, replaced with Iterable or Iterator as appropriate.
In summary, we agree with your comments, and want to clarify that many of our choices are based on what works from a teaching perspective (and also an interview setting). We’d expect candidates to know the tradeoffs, but most interviewers would be happy with an asympytotically optimum solution, with the candidate demonstrating understanding of tweaks needed for the real world (like you pointed out, also the use of better memory management, short vs int, that kind of thing)
To prove that no good deed goes unpunished, I was wondering if you could take a look at some new material we wrote up, introducing the data structures via a min working example, some tips, and library review (about 30 pages). Let me know if you are interested, from your feedback on this problem I’m sure your input is very helpful,
bye
Adnan