Why is itertools.tee() used in Problem 11.9 (EPI Python)?

#1

Hello,

In Problem 11.9, Find the Missing IP Address (Python version), the solution uses the itertools.tee() function to create a copy of the stream. According to the Python documentation: “This itertool may require significant auxiliary storage (depending on how much temporary data needs to be stored). In general, if one iterator uses most or all of the data before another iterator starts, it is faster to use list() instead of tee().”

Using this function seems to negate the crux of the problem, which, if I understand correctly, is that the entire file cannot fit in RAM. Since the first iterator is consumed completely before the second one starts, doesn’t tee() end up buffering a copy of the whole stream in RAM, thus defeating the purpose of the RAM constraint?
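To make the concern concrete, here is a minimal sketch (not the book's code). The generator `numbers` is a hypothetical stand-in for the stream. After the first iterator is exhausted, the underlying generator has nothing left, yet the second iterator still yields every item, so tee() must have been holding them in memory.

```python
import itertools


def numbers(n):
    """Hypothetical stand-in for the IP stream: yields 0..n-1 exactly once."""
    for i in range(n):
        yield i


src = numbers(5)
a, b = itertools.tee(src)

first = list(a)                 # consume the first iterator completely
leftover = next(src, None)      # the underlying generator is now exhausted
second = list(b)                # these items can only come from tee's buffer
```

With a stream too large for RAM, that buffer would grow to the size of the whole stream before the second pass begins.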

Can someone help clarify this? Thanks.

Best,
Steve


#2

Yeah, tee seems like a poor choice here. An example using a file iterator would have been better: going back to the beginning for the second pass would then just be file.seek(0).
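A rough sketch of what that could look like, not the book's solution. It assumes a seekable text file with one 32-bit integer per line and no blank lines; the name `find_missing_ip` and the file format are my own assumptions for illustration.

```python
import io


def find_missing_ip(f):
    """Return a 32-bit value absent from f.

    Assumes f is a seekable text file with one integer per line
    (hypothetical format). Memory use is two 65536-entry tables,
    independent of the file size.
    """
    bucket_size = 1 << 16

    # Pass 1: count how many values fall into each upper-16-bit bucket.
    counts = [0] * (1 << 16)
    for line in f:
        counts[int(line) >> 16] += 1

    # Any bucket holding fewer than 2**16 values must be missing one.
    # next() raises StopIteration if every bucket is full.
    bucket = next(i for i, c in enumerate(counts) if c < bucket_size)

    # Rewind the file instead of buffering a copy of the stream.
    f.seek(0)

    # Pass 2: mark which lower-16-bit values occur within that bucket.
    seen = bytearray(bucket_size)
    for line in f:
        x = int(line)
        if x >> 16 == bucket:
            seen[x & 0xFFFF] = 1

    return (bucket << 16) | seen.index(0)


# Small in-memory example; a real run would pass an open file object.
example = io.StringIO("0\n1\n3\n")
missing = find_missing_ip(example)
```

The file is read twice from disk, but nothing proportional to its size is ever held in memory, which is the point of the constraint.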
