I think we are on the same wavelength, but since you are an expert on streaming libraries I'm going to expand on Enum, as you might be interested. There are many feature axes for streaming libraries, and Enum tries to cover a bit too many of them. It was designed, by the Extlib project, as a go-between library to translate from one data structure to another, and to be as flexible as possible:
- It supports a wide variety of combinators common to stream libraries, on both finite and infinite streams. There is also a notion of whether it is easy/fast, for a given stream, to compute its size without consuming the stream (e.g. to preallocate arrays when converting from an enum to an array), and combinators try to preserve that information when they can.
- In the style of the venerable Stream module, Enum is a "destructive streaming" library: it mutates its internal state when you access the next element. I think the idea is that this is generally more memory-efficient (you cannot accidentally keep a reference to the start of the stream and leak memory), but it is also a bit error-prone in non-linear usage scenarios.
- To still support non-linear usage scenarios, Enum streams have a "clone" operation that duplicates a stream into two copies. The useful, interesting and difficult thing about clone is that getting the same element on both the original and the cloned stream should not duplicate any side-effects involved in computing the value: clone should duplicate the stream of values, but keep threading the computation effects.
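To make the clone contract concrete, here is a minimal sketch (hypothetical code, not Enum's actual implementation): the underlying effectful generator runs at most once per element, and both copies replay from a shared cache, so cloning duplicates values without duplicating effects.

```ocaml
(* Hypothetical sketch of effect-sharing clone; not Enum's real code. *)
type 'a shared = {
  gen : unit -> 'a option;   (* effectful source, called at most once per element *)
  mutable cache : 'a list;   (* elements produced so far, most recent first *)
  mutable len : int;
}

type 'a enum = { src : 'a shared; mutable pos : int }

let make gen = { src = { gen; cache = []; len = 0 }; pos = 0 }

(* Cloning only copies the cursor; the effectful source stays shared. *)
let clone e = { src = e.src; pos = e.pos }

let next e =
  let s = e.src in
  if e.pos < s.len then begin
    (* Already produced for the other copy: replay from the cache, no new effect. *)
    let x = List.nth s.cache (s.len - 1 - e.pos) in
    e.pos <- e.pos + 1;
    Some x
  end else
    match s.gen () with
    | None -> None
    | Some x ->
        s.cache <- x :: s.cache;
        s.len <- s.len + 1;
        e.pos <- e.pos + 1;
        Some x
```

Note that this toy version caches every element forever; a real implementation has to discard elements once both copies have consumed them, which is exactly the kind of bookkeeping that makes the feature costly.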
In my experience, most of the complexity (and fragility) of the current Enum implementation comes from the destructive aspects and cloning. The implementation is in a fairly imperative style (enumerations are defined as object-style records of *mutable* methods that close over the enumeration state), and there is a fair amount of bookkeeping involved on each update/next to support this feature set. This is not a big issue for the original use-case, which is to convert between data structures (have 2*N conversion functions, Foo.{to,of}_enum, instead of N^2 conversion functions), where performance bottlenecks are usually on the data-structure-processing side; but it means that using Enum for heavy stream processing is known to be slow. Again, I would expect BatSeq (which is neither destructive nor memoizing) to do much better on these workloads.
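From memory, the representation looks roughly like the following record of mutable closures (field names are illustrative, not necessarily Enum's exact ones); the mutable fields are what let combinators rewrite an enumeration's behaviour in place, and the `fast` flag carries the "cheap count" information mentioned earlier:

```ocaml
exception No_more_elements

(* Illustrative sketch of an Enum-style representation: a record of
   mutable closures over hidden mutable state. *)
type 'a t = {
  mutable count : unit -> int;  (* remaining elements; may have to force the stream *)
  mutable next  : unit -> 'a;   (* destructive: advances internal state *)
  mutable clone : unit -> 'a t;
  mutable fast  : bool;         (* true when [count] is cheap (no consumption) *)
}

(* An array-backed enumeration: counting is O(1), so [fast] is true,
   and cloning just restarts from the current cursor position. *)
let rec of_array_at arr start =
  let pos = ref start in
  { count = (fun () -> Array.length arr - !pos);
    next  = (fun () ->
      if !pos >= Array.length arr then raise No_more_elements
      else begin let x = arr.(!pos) in incr pos; x end);
    clone = (fun () -> of_array_at arr !pos);
    fast  = true }

let of_array arr = of_array_at arr 0
```

Even in this easy case you can see the pattern: every constructor must thread cursor state through three closures at once, and combinators that wrap an enumeration have to keep `count`, `next`, `clone` and `fast` mutually consistent on every step.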
It is perfectly reasonable to question whether we need this complex feature set for a central streaming library. I have mixed thoughts on this question:
- as a library developer, my experience is that the current implementation is too fragile and too slow for its own good, and I think that users would be better served by a simpler abstraction that does less in a more robust way. The pragmatic choice is thus to use simpler stream libraries. Another interesting point is that, once you develop those simpler, more robust stream libraries, it is sometimes possible to reuse them to build more complex streams on top (for example, a solid purely-functional stream implementation can be turned into a destructive stream implementation by passing around references to functional streams), so this separation of concerns would also help in rebuilding a more robust Enum. Simon Cruanes did good work in that direction in preliminary versions of his Containers/Sequence libraries (I can't find specific references right now to the various streaming types with different feature support).
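That layering can be sketched in a few lines with the stdlib's Seq (a hedged illustration of the general idea, not Simon's actual code): a destructive stream is just a mutable reference to a functional one, and clone becomes a one-liner because persistence makes sharing safe.

```ocaml
(* Sketch: a destructive, cloneable stream built on a purely functional
   stream (stdlib Seq) by threading a mutable reference to the tail. *)
type 'a enum = 'a Seq.t ref

let of_seq (s : 'a Seq.t) : 'a enum = ref s

(* Cloning is trivial: both copies share the same persistent tail. *)
let clone (e : 'a enum) : 'a enum = ref !e

(* Destructive read: advance the reference past the head. *)
let next (e : 'a enum) : 'a option =
  match !e () with
  | Seq.Nil -> None
  | Seq.Cons (x, rest) -> e := rest; Some x
```

One caveat: if forcing the underlying Seq performs side effects, this naive clone re-runs them on each copy, so recovering Enum's full "effects happen once" contract would still require a memoizing layer underneath.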