ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:105750] [Ruby master Feature#18262] Enumerator::Lazy#partition
@ 2021-10-22 12:09 zverok (Victor Shepelev)
  2021-10-22 12:22 ` [ruby-core:105751] " Dan0042 (Daniel DeLorme)
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: zverok (Victor Shepelev) @ 2021-10-22 12:09 UTC (permalink / raw)
  To: ruby-core

Issue #18262 has been reported by zverok (Victor Shepelev).

----------------------------------------
Feature #18262: Enumerator::Lazy#partition
https://bugs.ruby-lang.org/issues/18262

* Author: zverok (Victor Shepelev)
* Status: Open
* Priority: Normal
----------------------------------------
(Part of my set of proposals about making `.lazy` more useful/popular.)

Currently:
```ruby
file = File.open('very-large-file.txt')
lines_with_errors, lines_without_errors = file.lazy.partition { _1.start_with?('E:') }
lines_with_errors.class
# => Array; the whole file has already been read by this point
```
This might not be very practical, either performance-wise or memory-wise.

I am thinking that returning a pair of lazy enumerators might be a good addition to `Enumerator::Lazy`.

Naive prototype:

```ruby
class Enumerator::Lazy
  def partition(&block)
    buffer1 = []
    buffer2 = []
    source = self

    [
      Enumerator.new { |y|
        loop do
          if buffer1.empty?
            begin
              item = source.next
              if block.call(item)
                y.yield(item)
              else
                buffer2.push(item)
              end
            rescue StopIteration
              break
            end
          else
            y.yield buffer1.shift
          end
        end
      }.lazy,
      Enumerator.new { |y|
        loop do
          if buffer2.empty?
            begin
              item = source.next
              if !block.call(item)
                y.yield(item)
              else
                buffer1.push(item)
              end
            rescue StopIteration
              break
            end
          else
            y.yield buffer2.shift
          end
        end
      }.lazy
    ]
  end
end
```
Testing it:
```ruby
Enumerator.produce(1) { |i| puts "processing #{i}"; i + 1 }.lazy
  .take(30)
  .partition(&:odd?)
  .then { |odd, even|
    p odd.first(3), even.first(3)
  }
# Prints:
# processing 1
# processing 2
# processing 3
# processing 4
# processing 5
# [1, 3, 5]
# [2, 4, 6]
```
As you might notice from the "processing" log, it fetched only as many entries as the produced enumerators required.

The **drawback**, as my prototype implementation shows, would be the need for internal "buffering" (I don't think it is possible to implement lazy partition without it), but it still might be worth a shot?



-- 
https://bugs.ruby-lang.org/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [ruby-core:105751] [Ruby master Feature#18262] Enumerator::Lazy#partition
  2021-10-22 12:09 [ruby-core:105750] [Ruby master Feature#18262] Enumerator::Lazy#partition zverok (Victor Shepelev)
@ 2021-10-22 12:22 ` Dan0042 (Daniel DeLorme)
  2021-11-18  4:59 ` [ruby-core:106112] " knu (Akinori MUSHA)
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Dan0042 (Daniel DeLorme) @ 2021-10-22 12:22 UTC (permalink / raw)
  To: ruby-core

Issue #18262 has been updated by Dan0042 (Daniel DeLorme).


+1
Since a lazy enumerator is produced for both #select and #reject, it would make sense for #partition as well.

----------------------------------------
Feature #18262: Enumerator::Lazy#partition
https://bugs.ruby-lang.org/issues/18262#change-94257

* Author: zverok (Victor Shepelev)
* Status: Open
* Priority: Normal
----------------------------------------

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [ruby-core:106112] [Ruby master Feature#18262] Enumerator::Lazy#partition
  2021-10-22 12:09 [ruby-core:105750] [Ruby master Feature#18262] Enumerator::Lazy#partition zverok (Victor Shepelev)
  2021-10-22 12:22 ` [ruby-core:105751] " Dan0042 (Daniel DeLorme)
@ 2021-11-18  4:59 ` knu (Akinori MUSHA)
  2021-11-18 15:04 ` [ruby-core:106147] " Dan0042 (Daniel DeLorme)
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: knu (Akinori MUSHA) @ 2021-11-18  4:59 UTC (permalink / raw)
  To: ruby-core

Issue #18262 has been updated by knu (Akinori MUSHA).


I agree this would be a good addition, and I think existing users of `lazy` would understand that the incompatibility this brings is a necessary step to make `partition` more useful.

However, the buffering could be a pitfall for new users. In today's developer meeting, Matz and I agreed to suggest that the behavior should be well documented. If you were dividing a huge (or infinite) list into two where one enumerator yields values far less often than the other, the buffer could become huge. That is not what you would normally expect from "lazy", so it should be noted in the documentation.
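
To make the pitfall concrete, here is a minimal sketch of the skewed case. It assumes a condensed variant of the buffered prototype from the proposal; `buffered_partition` is a hypothetical helper written for this illustration (not part of Ruby), and it returns the internal buffers only so they can be inspected.

```ruby
# Hypothetical helper (not a Ruby API): a condensed form of the buffered
# prototype, written as a plain method so the buffers are visible outside.
def buffered_partition(source, &pred)
  buffers = { true => [], false => [] }
  make = lambda do |want|
    Enumerator.new do |y|
      loop do
        if buffers[want].empty?
          item = source.next           # StopIteration at the end is rescued by loop
          if !!pred.call(item) == want
            y << item
          else
            buffers[!want] << item     # park the item for the other enumerator
          end
        else
          y << buffers[want].shift
        end
      end
    end.lazy
  end
  [make.(true), make.(false), buffers]
end

# A predicate that is true for only one element in a thousand.
source = (1..Float::INFINITY).lazy
rare, common, buffers = buffered_partition(source) { |n| (n % 1000).zero? }

first_rare = rare.first(2)
buffered_count = buffers[false].size
p first_rare       # => [1000, 2000]
p buffered_count   # => 1998
```

Asking the rare side for just two elements parks 1998 elements for the common side. With an infinite source and a predicate that never matches, the same request would not return at all and the buffer would grow without bound.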

----------------------------------------
Feature #18262: Enumerator::Lazy#partition
https://bugs.ruby-lang.org/issues/18262#change-94707

* Author: zverok (Victor Shepelev)
* Status: Open
* Priority: Normal
----------------------------------------

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [ruby-core:106147] [Ruby master Feature#18262] Enumerator::Lazy#partition
  2021-10-22 12:09 [ruby-core:105750] [Ruby master Feature#18262] Enumerator::Lazy#partition zverok (Victor Shepelev)
  2021-10-22 12:22 ` [ruby-core:105751] " Dan0042 (Daniel DeLorme)
  2021-11-18  4:59 ` [ruby-core:106112] " knu (Akinori MUSHA)
@ 2021-11-18 15:04 ` Dan0042 (Daniel DeLorme)
  2021-11-18 16:04 ` [ruby-core:106150] " knu (Akinori MUSHA)
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Dan0042 (Daniel DeLorme) @ 2021-11-18 15:04 UTC (permalink / raw)
  To: ruby-core

Issue #18262 has been updated by Dan0042 (Daniel DeLorme).


I don't think buffering is a problem. If you consider that the current implementation fully buffers the two sub-lists before returning them, the lazy implementation literally cannot cause more buffering than the current non-lazy implementation.
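
A small sketch of the current behavior this comparison rests on, assuming a Ruby version in which `Enumerator::Lazy` does not define its own `partition` and therefore falls back to `Enumerable#partition`. The `consumed` counter is only there for illustration.

```ruby
consumed = 0
source = (1..10).lazy.map { |n| consumed += 1; n }

# On such Rubies this call is eager: it walks the whole source
# and returns two fully built arrays.
odd, even = source.partition(&:odd?)

p odd.class   # Array
p consumed    # 10: every element was read before partition returned
p odd, even
```

In other words, the eager version always holds every element in memory, which is the upper bound a buffered lazy version can reach only in its worst case.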

----------------------------------------
Feature #18262: Enumerator::Lazy#partition
https://bugs.ruby-lang.org/issues/18262#change-94748

* Author: zverok (Victor Shepelev)
* Status: Open
* Priority: Normal
----------------------------------------

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [ruby-core:106150] [Ruby master Feature#18262] Enumerator::Lazy#partition
  2021-10-22 12:09 [ruby-core:105750] [Ruby master Feature#18262] Enumerator::Lazy#partition zverok (Victor Shepelev)
                   ` (2 preceding siblings ...)
  2021-11-18 15:04 ` [ruby-core:106147] " Dan0042 (Daniel DeLorme)
@ 2021-11-18 16:04 ` knu (Akinori MUSHA)
  2021-11-19 20:28 ` [ruby-core:106182] " Eregon (Benoit Daloze)
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: knu (Akinori MUSHA) @ 2021-11-18 16:04 UTC (permalink / raw)
  To: ruby-core

Issue #18262 has been updated by knu (Akinori MUSHA).


Dan0042 (Daniel DeLorme) wrote in #note-3:
> I don't think buffering is a problem. If you consider that the current implementation fully buffers the two sub-lists before returning them, the lazy implementation literally cannot cause more buffering than the current non-lazy implementation.

Buffering may be a whole new idea in the history of Enumerator::Lazy, so it won't hurt to document that the new `Lazy#partition` implementation can consume a lot of memory.

----------------------------------------
Feature #18262: Enumerator::Lazy#partition
https://bugs.ruby-lang.org/issues/18262#change-94751

* Author: zverok (Victor Shepelev)
* Status: Open
* Priority: Normal
----------------------------------------

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [ruby-core:106182] [Ruby master Feature#18262] Enumerator::Lazy#partition
  2021-10-22 12:09 [ruby-core:105750] [Ruby master Feature#18262] Enumerator::Lazy#partition zverok (Victor Shepelev)
                   ` (3 preceding siblings ...)
  2021-11-18 16:04 ` [ruby-core:106150] " knu (Akinori MUSHA)
@ 2021-11-19 20:28 ` Eregon (Benoit Daloze)
  2021-11-20  4:12 ` [ruby-core:106187] " Dan0042 (Daniel DeLorme)
  2021-11-20 10:17 ` [ruby-core:106189] " zverok (Victor Shepelev)
  6 siblings, 0 replies; 8+ messages in thread
From: Eregon (Benoit Daloze) @ 2021-11-19 20:28 UTC (permalink / raw)
  To: ruby-core

Issue #18262 has been updated by Eregon (Benoit Daloze).


Mmh, I think the whole point of Enumerator::Lazy is that it uses constant memory and does not buffer an arbitrary number of elements.

Shouldn't it simply behave like:
```ruby
class Enumerator::Lazy
  def partition(&block)
    [select(&block), reject(&block)]
  end
end
```

No buffering, no surprises, consistent with general Lazy methods.
The block will be executed 2 times per element if both enumerators iterate that element.
That seems completely fine since it's a predicate, and expected given you'd do the same with `select`/`reject` manually.
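
A minimal sketch of what "executed 2 times per element" means in practice, using plain `select`/`reject` on a re-iterable source. The `calls` counter and the `counting_odd` proc are illustrative names, not part of any API.

```ruby
calls = 0
counting_odd = proc { |n| calls += 1; n.odd? }

numbers = (1..Float::INFINITY).lazy
odd  = numbers.select(&counting_odd)
even = numbers.reject(&counting_odd)

first_odd  = odd.first(3)    # evaluates the predicate for 1..5
first_even = even.first(3)   # starts over and evaluates it for 1..6
p first_odd    # => [1, 3, 5]
p first_even   # => [2, 4, 6]
p calls        # => 11
```

Nothing is buffered here, at the cost of 11 predicate calls. For the same two requests, the buffered prototype from the proposal would call the predicate once per fetched element, i.e. 6 times.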

----------------------------------------
Feature #18262: Enumerator::Lazy#partition
https://bugs.ruby-lang.org/issues/18262#change-94790

* Author: zverok (Victor Shepelev)
* Status: Open
* Priority: Normal
----------------------------------------

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [ruby-core:106187] [Ruby master Feature#18262] Enumerator::Lazy#partition
  2021-10-22 12:09 [ruby-core:105750] [Ruby master Feature#18262] Enumerator::Lazy#partition zverok (Victor Shepelev)
                   ` (4 preceding siblings ...)
  2021-11-19 20:28 ` [ruby-core:106182] " Eregon (Benoit Daloze)
@ 2021-11-20  4:12 ` Dan0042 (Daniel DeLorme)
  2021-11-20 10:17 ` [ruby-core:106189] " zverok (Victor Shepelev)
  6 siblings, 0 replies; 8+ messages in thread
From: Dan0042 (Daniel DeLorme) @ 2021-11-20  4:12 UTC (permalink / raw)
  To: ruby-core

Issue #18262 has been updated by Dan0042 (Daniel DeLorme).


I wouldn't say that constant memory is the "whole point" of Enumerator::Lazy. It's more about performing the minimum amount of computation needed, only when needed. Executing the block twice for each element is not minimal, and much more surprising to me than any amount of buffering. What if there are side effects in the block?

----------------------------------------
Feature #18262: Enumerator::Lazy#partition
https://bugs.ruby-lang.org/issues/18262#change-94795

* Author: zverok (Victor Shepelev)
* Status: Open
* Priority: Normal
----------------------------------------

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [ruby-core:106189] [Ruby master Feature#18262] Enumerator::Lazy#partition
  2021-10-22 12:09 [ruby-core:105750] [Ruby master Feature#18262] Enumerator::Lazy#partition zverok (Victor Shepelev)
                   ` (5 preceding siblings ...)
  2021-11-20  4:12 ` [ruby-core:106187] " Dan0042 (Daniel DeLorme)
@ 2021-11-20 10:17 ` zverok (Victor Shepelev)
  6 siblings, 0 replies; 8+ messages in thread
From: zverok (Victor Shepelev) @ 2021-11-20 10:17 UTC (permalink / raw)
  To: ruby-core

Issue #18262 has been updated by zverok (Victor Shepelev).


@Eregon your version does not behave the same for effectful enumerators (where `.lazy` is extremely useful):

```ruby
require 'stringio'

str = StringIO.new(<<~ROWS)
1: OK
2: Err
3: OK
4: Err
5: OK
6: Err
ROWS


err, ok = str.each_line(chomp: true).lazy.partition { _1.include?('Err') }

p [err.first(2), ok.first(2)]
# mine: [["2: Err", "4: Err"], ["1: OK", "3: OK"]]
# yours: [["2: Err", "4: Err"], ["5: OK"]]
```
...because yours consumes both kinds of rows while producing the `err` rows.

I agree that, even if it is not the "whole point", "it consumes much less memory" is an implicit expectation of a lazy enumerator. But I believe having a (well-documented) quirk in `partition` is better than not having a lazy `partition` at all.

----------------------------------------
Feature #18262: Enumerator::Lazy#partition
https://bugs.ruby-lang.org/issues/18262#change-94796

* Author: zverok (Victor Shepelev)
* Status: Open
* Priority: Normal
----------------------------------------

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-11-20 10:17 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-22 12:09 [ruby-core:105750] [Ruby master Feature#18262] Enumerator::Lazy#partition zverok (Victor Shepelev)
2021-10-22 12:22 ` [ruby-core:105751] " Dan0042 (Daniel DeLorme)
2021-11-18  4:59 ` [ruby-core:106112] " knu (Akinori MUSHA)
2021-11-18 15:04 ` [ruby-core:106147] " Dan0042 (Daniel DeLorme)
2021-11-18 16:04 ` [ruby-core:106150] " knu (Akinori MUSHA)
2021-11-19 20:28 ` [ruby-core:106182] " Eregon (Benoit Daloze)
2021-11-20  4:12 ` [ruby-core:106187] " Dan0042 (Daniel DeLorme)
2021-11-20 10:17 ` [ruby-core:106189] " zverok (Victor Shepelev)
