proposal: improve channels for M:N producer/consumer scenarios #14601

Closed
azdagron opened this issue Mar 2, 2016 · 11 comments

Comments

@azdagron

azdagron commented Mar 2, 2016

Channels work pretty well when you have 1 sender and N receivers, where the sender is producing some finite amount of work (after which it closes the channel) and the receivers consume until all work is done (i.e. the channel is closed).

Practically speaking though, it isn't uncommon to have M senders. When signalling to receivers that there is nothing more being produced, the senders have to coordinate using some outside synchronization mechanism that all senders are done and the channel is safe to close. Forgetting to do this either leads to hung receivers or panics due to multiple closes on a single channel.

In a similar vein, it also isn't uncommon to want to only have a sender producing if there are still receivers interested in consuming. Facilitating this also requires keeping track of each receiver and using some outside synchronization primitive to signal to the sender that it can stop producing.

To solve the multiple producer problem

  1. Every channel has both a recv and send reference count. Each count is initialized to 1 when make returns a channel.
  2. A channel is considered closed when either reference count drops to zero.
  3. A dup() built-in is used to increment reference counts on a channel. If a directional channel is passed into dup(), then only the reference count for that direction is incremented. dup() returns the passed in channel.
  4. Calling close() decrements the reference counts on a channel. If a directional channel is passed into close(), then only the reference count for that direction is decremented.
  5. Calling close() does not panic as long as there is a reference count > 0 on the direction being closed. This makes it safe to always close what you referenced (via make() or dup()).

To solve the consumers signaling producers to stop problem

  1. A new way to send down a channel that doesn't panic if the channel is closed. This can be done with recover but that does not compose well with for-select loops. Proposed syntax: ok := val -> ch
  2. Recv directional channels must now be closable.

Channels now more closely mirror unix pipes, where each side can be closed independently.
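
For reference, the recover-based workaround mentioned in item 1 might look roughly like the sketch below in today's Go. The helper name `trySend` is hypothetical and not part of the proposal.

```go
package main

import "fmt"

// trySend is a hypothetical helper: it sends v on ch and reports false
// instead of panicking if ch has already been closed.
func trySend(ch chan<- int, v int) (ok bool) {
	defer func() {
		// The only operation below that can panic is the send itself.
		if recover() != nil {
			ok = false
		}
	}()
	ch <- v
	return true
}

func main() {
	ch := make(chan int, 1)
	fmt.Println(trySend(ch, 1)) // true: the buffer has room
	close(ch)
	fmt.Println(trySend(ch, 2)) // false: the panic from the send was recovered
}
```

Because the send is wrapped in a function call, it cannot appear as a `case` in a `select` statement, which is the composition problem item 1 refers to.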

Examples

/////////////////////////////////////////////
// multiple producers (no close coordination required)
/////////////////////////////////////////////

func producer1(duped chan<- int) {
    defer close(duped)

    duped <- 1
    duped <- 2
    duped <- 3
}

func producer2(duped chan<- int) {
    defer close(duped)

    duped <- 4
    duped <- 5
    duped <- 6
}

func main() {
    ch := make(chan int)
    go producer1(dup(ch))
    go producer2(dup(ch))
    close(ch)

    // consume all
    for range ch {
    }
}
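
For comparison, here is a minimal sketch of the "outside synchronization mechanism" that the same program needs in today's Go, assuming a `sync.WaitGroup` is used to decide when the channel is safe to close. The function names are illustrative only.

```go
package main

import (
	"fmt"
	"sync"
)

// produce sends each value on out and reports completion through wg.
func produce(out chan<- int, wg *sync.WaitGroup, vals ...int) {
	defer wg.Done()
	for _, v := range vals {
		out <- v
	}
}

// sumFromProducers starts one producer per batch and returns the sum of
// everything received.
func sumFromProducers(batches ...[]int) int {
	ch := make(chan int)
	var wg sync.WaitGroup
	for _, b := range batches {
		wg.Add(1)
		go produce(ch, &wg, b...)
	}
	// A single goroutine owns the close, so the channel is closed exactly
	// once, and only after every producer has finished.
	go func() {
		wg.Wait()
		close(ch)
	}()
	sum := 0
	for v := range ch {
		sum += v
	}
	return sum
}

func main() {
	fmt.Println(sumFromProducers([]int{1, 2, 3}, []int{4, 5, 6})) // 21
}
```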

/////////////////////////////////////////////
// produce until no consumers
/////////////////////////////////////////////

func produceUntilNobodyConsuming(ch chan<- int) {
    var n int
    for {
        ok := n -> ch
        if !ok {
            return
        }
        n++
    }
}

func consumeN(duped <-chan int, n int) {
    defer close(duped)

    i := 0
    for range duped {
        i++
        if i == n {
            return
        }
    }
}

func main() {
    ch := make(chan int)
    go consumeN(dup(ch), 5)
    go consumeN(dup(ch), 10)
    close(ch)
    produceUntilNobodyConsuming(ch)
}
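
In today's Go the same scenario is usually written with a second channel that carries the "stop" signal. The sketch below is one way to do it, assuming a `done` channel that is closed once every consumer has finished; `run` and the parameter names are illustrative.

```go
package main

import (
	"fmt"
	"sync"
)

// produceUntilDone sends increasing integers until done is closed and
// returns how many sends succeeded.
func produceUntilDone(ch chan<- int, done <-chan struct{}) int {
	n := 0
	for {
		select {
		case ch <- n:
			n++
		case <-done:
			return n
		}
	}
}

// consumeN receives n values and then reports completion through wg.
func consumeN(ch <-chan int, n int, wg *sync.WaitGroup) {
	defer wg.Done()
	for i := 0; i < n; i++ {
		<-ch
	}
}

// run starts one consumer per entry in counts and returns the number of
// values the producer managed to send.
func run(counts ...int) int {
	ch := make(chan int) // unbuffered: every successful send was received
	done := make(chan struct{})
	var wg sync.WaitGroup
	for _, c := range counts {
		wg.Add(1)
		go consumeN(ch, c, &wg)
	}
	// Close done once no consumer is left.
	go func() {
		wg.Wait()
		close(done)
	}()
	return produceUntilDone(ch, done)
}

func main() {
	fmt.Println(run(5, 10)) // 15
}
```

This is the per-receiver bookkeeping that the proposal wants to fold into the channel itself.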

Open issues

  1. What happens when dup() is called on a closed channel? panic? return the closed channel without incrementing reference counts? reopen the channel? Probably panic. Maybe it supports comma-ok syntax?
  2. Better syntax for non-panic channel sends?
@minux
Member

minux commented Mar 2, 2016 via email

@azdagron
Author

azdagron commented Mar 2, 2016

Having to turn to the reflect package for normal everyday use cases is not ideal. It also doesn't scale well when you have lots of producers (each channel is an allocation). Also, what is the algorithmic complexity of select if there are lots of channels?

It also complicates adding or removing producers after the consumers have been started, as each consumer needs to be notified of the additional producer channel and keep track of which producers have stopped.

In essence, that solution scales linearly with M and N, where reference counts are constant.
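
The emailed comment being replied to is not preserved in this thread, so the following is only a guess at the kind of approach under discussion: one channel per producer, with the consumer using `reflect.Select` to receive from all of them. The function name `drainAll` is hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

// drainAll receives from every channel in chans until all of them are
// closed, and returns the sum of the received values.
func drainAll(chans []chan int) int {
	cases := make([]reflect.SelectCase, len(chans))
	for i, ch := range chans {
		cases[i] = reflect.SelectCase{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(ch)}
	}
	sum := 0
	for len(cases) > 0 {
		i, v, ok := reflect.Select(cases)
		if !ok {
			// Channel i was closed: the consumer has to track this itself
			// and drop the case.
			cases = append(cases[:i], cases[i+1:]...)
			continue
		}
		sum += int(v.Int())
	}
	return sum
}

func main() {
	chans := make([]chan int, 3)
	for i := range chans {
		chans[i] = make(chan int)
		go func(ch chan<- int, base int) {
			defer close(ch) // each producer owns and closes its own channel
			for j := 1; j <= 3; j++ {
				ch <- base + j
			}
		}(chans[i], i*10)
	}
	fmt.Println(drainAll(chans)) // 108
}
```

Every producer costs one channel allocation and one select case, and the consumer has to keep the case list up to date, which is the bookkeeping the reply above objects to.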

@minux
Member

minux commented Mar 2, 2016 via email

@rsc rsc added this to the Proposal milestone Mar 2, 2016
@rsc rsc added the Proposal label Mar 2, 2016
@rsc
Contributor

rsc commented Mar 2, 2016

There are two separate parts to this suggestion. The first is using close as a receiver->sender signal, and the second is making closes ref-counted.

Regarding close by receiver as a signal to a sender, I wrote this on #11344:

Closing the channel is a send-like communication from the senders to
the receivers that there will be no more sends. If a sender later
tries to send on the channel, then the goroutine who did the close is
clearly confused. That merits a panic, whether it is part of a select
or an ordinary operation.

The fact that close is (like ordinary sends) a communication from
sender to receiver is fundamental to its operation. It comes up
repeatedly that people want to use close as a reverse signal from
receivers to senders to say "stop sending to me". That is not what
it means. It would break the unidirectionality of channels. And at
least nine times out of ten the people who want to do this have not
completely thought through the implications of this kind of
cancellation mechanism. There almost always need to be two steps in a
cancellation: a request for the cancellation and an acknowledgement
that work has in fact stopped. Close can serve as the latter; it
cannot serve as both, and we make it as hard as possible for people
to do that accidentally.

This insistence that close is a sender -> receiver communication and
not the reverse is also the reason why you cannot close a
receive-only channel.

This is still my response to trying to use close as a signal from receiver to sender. At the very least that only makes sense for unbuffered channels, and even then you still have to worry about every sender knowing that the send might fail.
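
The two-step cancellation described in the quoted text can be sketched as follows, assuming a separate `stop` channel carries the request and the close of the data channel serves as the acknowledgement. The names `worker` and `takeThenCancel` are illustrative only.

```go
package main

import "fmt"

// worker sends increasing integers on out until stop is closed.
func worker(out chan<- int, stop <-chan struct{}) {
	// Step 2: closing out acknowledges that sending has really stopped.
	defer close(out)
	for n := 0; ; n++ {
		select {
		case out <- n:
		case <-stop:
			return
		}
	}
}

// takeThenCancel receives want values, requests cancellation, waits for
// the acknowledgement, and returns the values received before the request.
func takeThenCancel(want int) []int {
	out := make(chan int)
	stop := make(chan struct{})
	go worker(out, stop)

	got := make([]int, 0, want)
	for i := 0; i < want; i++ {
		got = append(got, <-out)
	}
	close(stop) // Step 1: request cancellation.
	for range out {
		// Discard anything sent before the worker noticed the request;
		// the loop ends when the worker closes out.
	}
	return got
}

func main() {
	fmt.Println(takeThenCancel(3)) // [0 1 2]
}
```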

Ref-counting the closes (on the sender side) is an interesting idea. If it were early in the language design it might be worth doing. But we fundamentally have to raise the bar for language additions as time goes on, to avoid ending up with a language that grows linearly with time since being created. If we allowed such growth, one of Go's key benefits - being small and easy to understand - would be lost. At this point in Go's development, there would need to be a compelling case that this feature is very commonly needed. I don't see evidence for that.

Also, ref-counting sender-side closes is not much code today: http://play.golang.org/p/ZPsg-tDZrR.
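
The linked playground code is not reproduced here. As a rough illustration of the idea, and not necessarily what that link contains, sender-side reference counting can be sketched with an atomic counter; the type `refChan` and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// refChan is a hypothetical wrapper that closes C when the last sender
// reference is released.
type refChan struct {
	C    chan int
	refs int32
}

// newRefChan returns a channel wrapper holding one reference for its creator.
func newRefChan() *refChan {
	return &refChan{C: make(chan int), refs: 1}
}

// Dup registers one more sender. It must be called by someone who still
// holds a live reference.
func (r *refChan) Dup() *refChan {
	atomic.AddInt32(&r.refs, 1)
	return r
}

// Release drops one reference and closes C when none remain.
func (r *refChan) Release() {
	if atomic.AddInt32(&r.refs, -1) == 0 {
		close(r.C)
	}
}

// producer sends vals and then releases its reference.
func producer(r *refChan, vals ...int) {
	defer r.Release()
	for _, v := range vals {
		r.C <- v
	}
}

// sumAll mirrors the proposal's first example using the wrapper.
func sumAll() int {
	r := newRefChan()
	go producer(r.Dup(), 1, 2, 3)
	go producer(r.Dup(), 4, 5, 6)
	r.Release() // drop the creator's reference
	sum := 0
	for v := range r.C {
		sum += v
	}
	return sum
}

func main() {
	fmt.Println(sumAll()) // 21
}
```

The wrapper is tied to `chan int`, so a different element type needs its own copy.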

@azdagron
Author

azdagron commented Mar 2, 2016

I hadn't considered closing on buffered channels, but I believe the proposal is still good. If all readers are closed (implying nobody is going to receive on the channel ever again), then sends on a closed buffered channel can still act like they do today, i.e. panic. All the proposal enables is for senders to know that no receives are going to happen ever again on that channel.

I'm totally for the unidirectionality of data across channels. However, if you take a hard line for unidirectionality of control you have to create outside mechanisms for bidirectional signalling. You don't have to look far to see great examples of the usefulness of bidirectional signalling. Can you imagine a world where you had to signal to a TCP server, or the other end of a unix pipe, that you were no longer going to be receiving data by using an out of band channel? That would certainly increase the complexity of using those transports. That is the story we have with channels today.

As for the evidence that sender side ref counting isn't commonly needed, you only need to hit the blogosphere for a short time to see the mounting complaints against channels when trying to compose them in a way that allows for clean tear down in practical use cases. Maybe I'll compile a list.

The code example that implements ref counting on top of channels is great. My complaint is that if you embrace channels in your architecture, you end up with all sorts of channel types. You get to repeat that code for every channel type. Cue discussion on generics. :trollface:

I love the conservative approach to raising the bar on language changes. Being able to fit Go in one's head is an attractive part of the language. I believe that this proposal helps make channels more widely useful given the cost of including them in the language in the first place.

@azdagron
Author

azdagron commented Mar 3, 2016

Added a number 5 to the proposal above to clarify close() semantics.

@rsc
Contributor

rsc commented Mar 3, 2016

> I hadn't considered closing on buffered channels, but I believe the proposal is still good.

If you do a receiver-side close, then when a future send fails, the sender can handle that failure by doing something else. But if you do a receiver-side close on a buffered channel, there are past sends into the buffer that succeeded that are now effectively failing after the fact, unless the semantics is that you can close the receiver side of the channel and then still receive from it to drain the buffer. Certainly that's not the semantics of TCP sockets or Unix pipes.
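
For comparison, today's sender-side close does have drain semantics: values buffered before the close are still delivered to receivers. A minimal sketch, with `drainAfterClose` as an illustrative name:

```go
package main

import "fmt"

// drainAfterClose buffers vals in a channel, closes it, and then receives
// everything that was buffered before the close.
func drainAfterClose(vals ...int) []int {
	ch := make(chan int, len(vals))
	for _, v := range vals {
		ch <- v
	}
	close(ch) // sender-side close: no more sends will follow

	var got []int
	for v := range ch {
		got = append(got, v) // buffered values are still received
	}
	return got
}

func main() {
	fmt.Println(drainAfterClose(1, 2, 3)) // [1 2 3]
}
```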

> You don't have to look far to see great examples of the usefulness of bidirectional signalling.

True, but you also don't have to work hard to create two channels.

> Can you imagine a world where you had to signal to a TCP server, or the other end of a unix pipe, that you were no longer going to be receiving data by using an out of band channel?

Isn't that the world we live in? By default when the receiver closes its end of a TCP connection or Unix pipe, a future send on that connection causes the sender to be killed by a SIGPIPE. In general TCP connections are bidirectional (more like two channels), and most TCP-based protocols do use an explicit message to end the protocol instead of just closing the receive side of one direction.
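
The pipe behavior can be observed from Go itself. The sketch below assumes a Unix-like system; there, for descriptors other than standard output and standard error, the Go runtime reports the broken pipe as an `EPIPE` error from `Write` instead of letting the signal terminate the program. The function name is illustrative.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// writeFailsWithEPIPE creates a pipe, closes the read end, and reports
// whether a later write on the write end fails with EPIPE.
func writeFailsWithEPIPE() bool {
	r, w, err := os.Pipe()
	if err != nil {
		return false
	}
	defer w.Close()

	// The receiver goes away without telling the sender anything.
	if err := r.Close(); err != nil {
		return false
	}

	// The sender only finds out when it tries to write again.
	_, err = w.Write([]byte("hello"))
	return errors.Is(err, syscall.EPIPE)
}

func main() {
	fmt.Println(writeFailsWithEPIPE()) // true on Unix-like systems
}
```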

The analogy to TCP connections and Unix pipes is further confusing since they do not have the dup operation described in the proposal, unless you count giving each reader and writer its own fd (which is semantically a bit different).

@azdagron
Author

azdagron commented Mar 3, 2016

If the producer is sending on a buffered channel, it already can't make any assumptions about data that was successfully "sent" on the channel, since there are no guarantees anybody received or ever will receive it. Two way communication is already a requirement for a sender to guarantee that something sent down a channel was "handled" by a receiver, orthogonal to whether a channel is buffered or unbuffered (just because somebody received it, doesn't mean it was "handled"). I don't see how this proposal changes anything except to provide a convenient way for senders to know that nobody will ever receive again so don't bother continuing to produce.

I concede that relating channels to sockets or pipes isn't a perfect analogy. However, I do feel that current channel semantics cause unnecessary friction, and people end up building up abstractions around them over and over again to facilitate producer cleanup. Considering the lack of generics, you end up with a ton of mostly non-reusable boilerplate littered all over the code which comes with all of the joys of maintenance hell.

I'm not married to my proposal. If somebody can come up with a cleaner way to provide two way signalling between senders and receivers, I'd be pleased as punch. I do feel that until there is such a mechanism, channels will be a shadow of their potential glory.

@egonelbre
Contributor

Can you provide a practical real-world example where you want to use this? Without a real-world problem, it's hard to suggest anything concrete... or know whether the proposed solution actually solves it in "the best way" in practice.

@mattn
Member

mattn commented Mar 10, 2016

I think using channels for this use case wastes resources. And I guess most of these use cases can be solved with a callback-based system, e.g. https://github.com/mattn/go-pubsub

@robpike
Contributor

robpike commented Aug 15, 2016

Thanks for your thoughtful proposal but this seems too big a change to make in the language.

@robpike robpike closed this as completed Aug 15, 2016
@golang golang locked and limited conversation to collaborators Aug 15, 2017