proposal: improve channels for M:N producer/consumer scenarios #14601
For the M:N producer/consumer problem, why not
use M channels, so that each producer is in charge
of its own channel, and use select to get tasks in
the N consumers? It's easy to implement an arbitrary
number of select cases with reflect.Select.
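A minimal sketch of that suggestion, assuming `int` tasks and a single consumer for brevity. The helper name `drain` is hypothetical; `reflect.Select` and `reflect.SelectCase` are the standard library API. Each producer owns and closes its own channel, and the consumer builds one receive case per producer:

```go
package main

import (
	"fmt"
	"reflect"
)

// drain (hypothetical helper) receives from every producer channel until
// all of them are closed, and returns the values in arrival order.
func drain(producers []chan int) []int {
	cases := make([]reflect.SelectCase, len(producers))
	for i, ch := range producers {
		cases[i] = reflect.SelectCase{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(ch)}
	}
	var got []int
	open := len(cases)
	for open > 0 {
		i, v, ok := reflect.Select(cases)
		if !ok {
			// Producer i closed its channel. A zero Chan value makes
			// reflect.Select ignore this case from now on.
			cases[i].Chan = reflect.Value{}
			open--
			continue
		}
		got = append(got, int(v.Int()))
	}
	return got
}

func main() {
	producers := make([]chan int, 3)
	for i := range producers {
		ch := make(chan int)
		producers[i] = ch
		go func(id int) {
			defer close(ch) // each producer closes only its own channel
			for j := 0; j < 2; j++ {
				ch <- id*10 + j
			}
		}(i)
	}
	fmt.Println(len(drain(producers)), "values received")
}
```

The consumer has to track which cases are still open, which is the bookkeeping cost raised in the reply below.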
Having to turn to the reflect package for normal everyday use cases is not ideal. It also doesn't scale well when you have lots of producers (each channel is an allocation). Also, what is the algorithmic complexity of select if there are lots of channels? It also complicates adding or removing producers after the consumers have been started, as each consumer needs to be notified of the additional producer channel and keep track of which producers have stopped. In essence, that solution scales linearly with M and N, where reference counts are constant.
The allocation overhead is a one-time setup cost.
You seem to imply that using one shared channel
scales better than multiple channels, but that
might not be the case. Operations on the shared
channel create a lot of contention on the channel,
especially when the number of producers and consumers
is large. (Note that in the current implementation,
each channel send/receive involves grabbing the
channel mutex.)
There are two separate parts to this suggestion. The first is using close as a receiver->sender signal, and the second is making closes ref-counted. Regarding close by receiver as a signal to a sender, I wrote this on #11344:
This is still my response to trying to use close as a signal from receiver to sender. At the very least that only makes sense for unbuffered channels, and even then you still have to worry about every sender knowing that the send might fail.
Ref-counting the closes (on the sender side) is an interesting idea. If it were early in the language design it might be worth doing. But we fundamentally have to raise the bar for language additions as time goes on, to avoid ending up with a language that grows linearly with time since being created. If we allowed such growth, one of Go's key benefits - being small and easy to understand - would be lost. At this point in Go's development, there would need to be a compelling case that this feature is very commonly needed. I don't see evidence for that.
Also, ref-counting sender-side closes is not much code today: http://play.golang.org/p/ZPsg-tDZrR.
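For readers who cannot open the playground link, here is one way sender-side ref-counting could look. This is a sketch only and is not the linked code: the type `sharedChan`, its methods `Dup` and `Close`, and the helper `sumFromSenders` are all hypothetical names, and the element type is fixed to `int`.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// sharedChan is a hypothetical wrapper that counts sender references
// and closes the underlying channel when the last one is released.
type sharedChan struct {
	C    chan int
	refs int64
}

func newSharedChan(buf int) *sharedChan {
	return &sharedChan{C: make(chan int, buf), refs: 1}
}

// Dup registers one more sender. It must be called by someone who still
// holds a live reference, so the count cannot reach zero concurrently.
func (s *sharedChan) Dup() *sharedChan {
	atomic.AddInt64(&s.refs, 1)
	return s
}

// Close releases one reference; only the final release closes the channel.
func (s *sharedChan) Close() {
	if atomic.AddInt64(&s.refs, -1) == 0 {
		close(s.C)
	}
}

// sumFromSenders starts n senders that each send one value (1..n) and
// returns the sum seen by a single receiver.
func sumFromSenders(n int) int {
	sc := newSharedChan(0)
	for i := 1; i <= n; i++ {
		h := sc.Dup() // take the reference before starting the goroutine
		go func(v int) {
			defer h.Close()
			h.C <- v
		}(i)
	}
	sc.Close() // drop the creator's own reference

	sum := 0
	for v := range sc.C { // ends once the last sender has called Close
		sum += v
	}
	return sum
}

func main() {
	fmt.Println(sumFromSenders(3))
}
```

Because the element type is baked into the wrapper, this has to be repeated per channel type, which is the objection raised in the next comment.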
I hadn't considered closing on buffered channels, but I believe the proposal is still good. If all readers are closed (implying nobody is going to receive on the channel ever again), then sends on a closed buffered channel can still act like they do today, i.e. panic. All the proposal enables is for senders to know that no receives are going to happen ever again on that channel.
I'm totally for the unidirectionality of data across channels. However, if you take a hard line for unidirectionality of control you have to create outside mechanisms for bidirectional signalling. You don't have to look far to see great examples of the usefulness of bidirectional signalling. Can you imagine a world where you had to signal to a TCP server, or the other end of a unix pipe, that you were no longer going to be receiving data by using an out of band channel? That would certainly increase the complexity of using those transports. That is the story we have with channels today.
As for the evidence that sender side ref counting isn't commonly needed, you only need to hit the blogosphere for a short time to see the mounting complaints against channels when trying to compose them in a way that allows for clean tear down in practical use cases. Maybe I'll compile a list.
The code example that implements ref counting on top of channels is great. My complaint is if you embrace channels in your architecture, you end up with all sorts of channel types. You get to repeat that code for every channel type. Cue discussion on generics.
I love the conservative approach to raising the bar on language changes. Being able to fit Go in one's head is an attractive part of the language. I believe that this proposal helps make channels more widely useful given the cost of including them in the language in the first place.
Added a number 5 to the proposal above to clarify close() semantics.
If you do a receiver-side close, then when a future send fails, the sender can handle that failure by doing something else. But if you do a receiver-side close on a buffered channel, there are past sends into the buffer that succeeded that are now effectively failing after the fact, unless the semantics is that you can close the receiver side of the channel and then still receive from it to drain the buffer. Certainly that's not the semantics of TCP sockets or Unix pipes.
True, but you also don't have to work hard to create two channels.
Isn't that the world we live in? By default when the receiver closes its end of a TCP connection or Unix pipe, a future send on that connection causes the sender to be killed by a SIGPIPE. In general TCP connections are bidirectional (more like two channels), and most TCP-based protocols do use an explicit message to end the protocol instead of just closing the receive side of one direction. The analogy to TCP connections and Unix pipes is further confusing since they do not have the dup operation described in the proposal, unless you count giving each reader and writer its own fd (which is semantically a bit different).
If the producer is sending on a buffered channel, it already can't make any assumptions about data that was successfully "sent" on the channel, since there are no guarantees anybody received or ever will receive it. Two way communication is already a requirement for a sender to guarantee that something sent down a channel was "handled" by a receiver, orthogonal to whether a channel is buffered or unbuffered (just because somebody received it, doesn't mean it was "handled"). I don't see how this proposal changes anything except to provide a convenient way for senders to know that nobody will ever receive again so don't bother continuing to produce.
I concede that relating channels to sockets or pipes isn't a perfect analogy. However, I do feel that current channel semantics cause unnecessary friction, and people end up building up abstractions around them over and over again to facilitate producer cleanup. Considering the lack of generics, you end up with a ton of mostly non-reusable boilerplate littered all over the code which comes with all of the joys of maintenance hell.
I'm not married to my proposal. If somebody can come up with a cleaner way to provide two way signalling between senders and receivers, I'd be pleased as punch. I do feel that until there is such a mechanism, channels will be a shadow of their potential glory.
Can you provide a practical real-world example where you want to use this? Without a real-world problem, it's hard to suggest anything concrete, or to know whether the proposed solution actually solves it in "the best way" in practice.
I think using channels for this use case wastes resources. I'd guess most of these use cases can be solved with a callback-based system, e.g. https://github.com/mattn/go-pubsub
Thanks for your thoughtful proposal but this seems too big a change to make in the language.
Channels work pretty well when you have 1 sender and N receivers, where the sender is producing some finite amount of work (after which it closes the channel) and the receivers consume until all work is done (i.e. the channel is closed).
Practically speaking though, it isn't uncommon to have M senders. When signalling to receivers that there is nothing more being produced, the senders have to coordinate using some outside synchronization mechanism that all senders are done and the channel is safe to close. Forgetting to do this either leads to hung receivers or panics due to multiple closes on a single channel.
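The "outside synchronization mechanism" is commonly a `sync.WaitGroup` with one dedicated goroutine that closes the channel. A minimal sketch of how this looks today, assuming hypothetical helpers `produce` and `consume` and `int` work items:

```go
package main

import (
	"fmt"
	"sync"
)

// produce (hypothetical) starts m senders that each send perSender items.
// A separate goroutine closes the channel exactly once, after every
// sender has finished, so no sender ever calls close itself.
func produce(m, perSender int) <-chan int {
	out := make(chan int)
	var wg sync.WaitGroup
	wg.Add(m)
	for i := 0; i < m; i++ {
		go func() {
			defer wg.Done()
			for j := 0; j < perSender; j++ {
				out <- 1
			}
		}()
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

// consume (hypothetical) starts n receivers that range over the channel
// until it is closed, and returns the total of everything received.
func consume(in <-chan int, n int) int {
	var (
		mu    sync.Mutex
		wg    sync.WaitGroup
		total int
	)
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			for v := range in {
				mu.Lock()
				total += v
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return total
}

func main() {
	fmt.Println(consume(produce(4, 5), 3)) // 4 senders x 5 items, 3 receivers
}
```

Omitting the closing goroutine leaves the receivers blocked in `range` forever; letting each sender call `close` instead panics on the second close.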
In a similar vein, it also isn't uncommon to want to only have a sender producing if there are still receivers interested in consuming. Facilitating this also requires keeping track of each receiver and using some outside synchronization primitive to signal to the sender that it can stop producing.
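Today that signal is usually a second `done` channel that the receiver closes. A minimal sketch with one sender and one receiver, where `generate` and `takeN` are hypothetical names chosen for illustration:

```go
package main

import "fmt"

// generate (hypothetical) sends 0, 1, 2, ... until done is closed.
// The second returned channel is closed once the sender has exited.
func generate(done <-chan struct{}) (<-chan int, <-chan struct{}) {
	out := make(chan int)
	stopped := make(chan struct{})
	go func() {
		defer close(stopped)
		defer close(out)
		for i := 0; ; i++ {
			select {
			case out <- i:
			case <-done:
				return // the receiver is no longer interested
			}
		}
	}()
	return out, stopped
}

// takeN (hypothetical) receives the first n values, then tells the
// sender to stop and waits until it has actually exited.
func takeN(n int) []int {
	done := make(chan struct{})
	values, stopped := generate(done)
	var got []int
	for i := 0; i < n; i++ {
		got = append(got, <-values)
	}
	close(done)
	<-stopped
	return got
}

func main() {
	fmt.Println(takeN(3)) // prints [0 1 2]
}
```

Every send has to be wrapped in a `select` on `done`, and with N receivers something still has to decide when the last one has left, which is the extra tracking described above.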
To solve the multiple producer problem
To solve the consumers signaling producers to stop problem
ok := val -> ch
Channels now more closely mirror unix pipes, where each side can be closed independently.
Examples
Open issues