I want to find a channel implementation like this one, but with the counts of readers and writers inverted, i.e. a single allowed writer with an unlimited number of readers. Does one exist, or do I need to write it myself?
The documentation page you linked to calls this kind of channel "mpsc", which stands for Multiple Producer, Single Consumer. The converse is therefore Single Producer, Multiple Consumer. Googling "rust spmc channel" leads to what you are looking for as the very first result.
I need to implement the following architecture:
I have a large Order that must be split into smaller orders (in parallel) and sent to a downstream async REST endpoint.
The downstream ordering API publishes a message, with correlation IDs, to a reply queue (Kafka/RabbitMQ) after completing each order (failed or success).
I need an aggregating listener to collect all the responses and send the final output to the caller.
I am thinking of using the Spring Integration scatter-gather pattern and other useful Spring features.
Can you show an example of how such an architecture can be implemented with Spring Integration?
large Order that must be split into smaller orders (in parallel)
This is not what scatter-gather is designed for. Its purpose is to make many requests from the same input, e.g. ask several dealerships for a car quote and then choose the best one for you.
What you are asking for is more like a splitter-aggregator.
So, you just perform a split function on your order object and produce as many items as you need into its output channel. That channel has to be an ExecutorChannel so that the split items can be processed in parallel.
Since you talk about a reply to the original client, you cannot make your aggregator distributed (several instances of the same application), but you will still gain the benefit of async, parallel processing just with that ExecutorChannel. Don't forget to carry the replyChannel header throughout your flow, so the aggregator at the end knows where to send its reply.
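The mechanics are easier to see stripped of the framework. Here is a minimal sketch of the splitter-aggregator shape in plain Go (all names are made up for illustration; in Spring Integration the split(), ExecutorChannel, and aggregate() pieces play these roles):

package main

import (
	"fmt"
	"sync"
)

type SubOrder struct{ ID int }

type Result struct {
	ID int
	OK bool
}

func main() {
	// The "split" step: break the large order into smaller sub-orders.
	subOrders := []SubOrder{{1}, {2}, {3}}
	results := make(chan Result, len(subOrders))

	var wg sync.WaitGroup
	for _, so := range subOrders {
		wg.Add(1)
		go func(so SubOrder) {
			defer wg.Done()
			// The ExecutorChannel role: each sub-order is processed in
			// parallel; the downstream REST call would go here.
			results <- Result{ID: so.ID, OK: true}
		}(so)
	}
	wg.Wait()
	close(results)

	// The "aggregate" step: correlate all replies, then answer the caller.
	for r := range results {
		fmt.Printf("sub-order %d done: %v\n", r.ID, r.OK)
	}
}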
The code should be written in C++. I'm mentioning this just in case someone suggests a solution that won't work efficiently when implemented in C++.
Objective:
A Producer running on thread t1 passes images to a Consumer running on thread t2. The Consumer has a list of clients to which it should send the images at various intervals, e.g. client1 requires images every 1 sec, client2 requires images every 5 sec, and so on.
Suggested implementation:
There is one main queue, imagesQ, in the Consumer, to which the Producer enqueues images. In addition to the main queue, the Consumer manages a vector of queues, clientImageQs, with one queue per client. The Consumer creates a sub-consumer, running on its own thread, for each client. Each sub-consumer dequeues images from its queue in clientImageQs and sends them to its client at that client's interval.
Every time a new image arrives in imagesQ, the Consumer duplicates it and enqueues a copy to each queue in clientImageQs. Thus, each sub-consumer can send images to its client at its own frequency.
Potential problem and solution:
If the Producer enqueues images at a much higher rate than one of the sub-consumers dequeues them, that queue will grow without bound. But the Consumer can check the size of each queue in clientImageQs before enqueuing and, if needed, dequeue a few old images before enqueuing new ones.
Question:
Is this a good design, or is there a better one?
You describe the problem within a set of already-determined solution constraints. Your description is complex, confusing and, I dare say, confused.
Why have a Consumer that only distributes images out of a shared buffer? Why not allow each "client", as you call it, to read from the buffer as it needs to?
Why not implement the shared buffer as a single-image buffer? The producer writes at its own rate. The clients perform non-destructive reads of the buffer at their own rates. Each client is guaranteed to read the most recent image whenever it reads the buffer. The producer simply overwrites the buffer with each write.
A multi-element queue offers no benefit in this application. In fact, as you have described, it greatly complicates the solution.
See http://sworthodoxy.blogspot.com/2015/05/shared-resource-design-patterns.html and look for the heading "unconditional buffer".
The examples in the post listed above are all implemented in Ada, but the concepts behind these concurrent design patterns apply to any programming language that supports concurrency.
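To make the idea concrete, here is a minimal sketch of such an unconditional single-image buffer, written in Go for brevity (a C++ version would use std::shared_mutex the same way; all names are made up):

package main

import (
	"sync"
	"time"
)

// LatestImage is a single-slot buffer: the producer overwrites it, and
// each client performs a non-destructive read at its own rate.
type LatestImage struct {
	mu  sync.RWMutex
	img []byte
}

// Set overwrites the buffer unconditionally; the producer never blocks
// waiting for consumers.
func (b *LatestImage) Set(img []byte) {
	b.mu.Lock()
	b.img = img
	b.mu.Unlock()
}

// Get returns the most recent image without removing it.
func (b *LatestImage) Get() []byte {
	b.mu.RLock()
	defer b.mu.RUnlock()
	return b.img
}

func main() {
	var buf LatestImage

	go func() { // producer writes at its own rate
		for i := 0; ; i++ {
			buf.Set([]byte{byte(i)})
			time.Sleep(100 * time.Millisecond)
		}
	}()

	// Each client polls at its own interval, e.g. every 1 s or every 5 s.
	for range time.Tick(time.Second) {
		_ = buf.Get() // send the latest image to this client
	}
}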
This is a follow-up to a previous thread with a similar name.
That thread has an accepted answer, but the answer does not really answer the question. From that thread, here is the use case:
if len(myChannel) > 0 {
    // Possible issue here: length could have changed to 0, making this blocking
    elm := <-myChannel
    return elm
}
The OP calls it a "Possible issue", but it's a definite issue: a race condition in which another consumer may pull a value from the channel between the evaluation of the if condition and the execution of the two statements that follow.
Now, we are told the Go way is to favor channels over mutexes, but here it seems we cannot achieve even a basic non-blocking read (by polling the length and reading atomically) without pairing a mutex with a channel and using that new concurrency data type instead of a channel.
Can that be right? Is there really no way to reliably ensure a receive does not block by checking ahead? (Compare with BlockingQueue.poll() in Java, or similar facilities in other queue-based messaging IPC facilities...)
This is exactly what default cases in select are for:
var elm myType
select {
case elm = <-myChannel: // a value was ready; receive it without blocking
default: // channel empty: fall through immediately instead of blocking
}
return elm
This assigns elm if it can, and otherwise returns a zero value. See "A leaky buffer" from Effective Go for a somewhat more extensive example.
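For instance, here is a small runnable illustration (the channel and values are made up); a boolean flag distinguishes "received a zero value" from "channel was empty":

package main

import "fmt"

func main() {
	myChannel := make(chan int, 1)

	// Nothing has been sent yet, so the default case fires immediately
	// and elm keeps its zero value.
	var elm int
	received := false
	select {
	case elm = <-myChannel:
		received = true
	default:
	}
	fmt.Println(elm, received) // 0 false

	myChannel <- 42
	select {
	case elm = <-myChannel: // a value is buffered, so this succeeds without blocking
		received = true
	default:
	}
	fmt.Println(elm, received) // 42 true
}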
Rob Napier's answer is correct.
However, you are possibly trying too hard to achieve non-blocking behaviour, on the assumption that blocking is an anti-pattern.
With Go, you don't have to worry about blocking. Go ahead, block without guilt. It can make code much easier to write, especially when dealing with I/O.
CSP allows you to design data-driven concurrent programs that can scale very well (because they don't lean heavily on mutexes). Small groups of goroutines communicating via channels can behave like a component of a larger system; these components (also communicating via channels) can be grouped into larger components; this pattern repeats at increasing scales.
Conventionally, people start with sequential code and then try to add concurrency by adding goroutines, channels, mutexes, etc. As an exercise, try something different: design the system to be maximally concurrent, using goroutines and channels as deeply as you possibly can. You might be unimpressed with the performance you achieve... so then consider how to improve it by combining (rather than dividing) blocks, reducing the total number of goroutines and arriving at a better level of concurrency.
Right now I'm using RabbitMQ to send data between two programs: one queue, one channel, one exchange. I'm going to extend it to be multithreaded, and I want to declare another queue on the second thread.
I understand that in this case I should use another channel, but what I would like to know is: is it necessary to declare another exchange with a different name as well?
What exactly is the relationship between the two?
In what kind of situation would you need multiple exchanges?
As you figured out, a channel is the communication endpoint used to reach a RabbitMQ object.
There are two kinds of objects:
Queues, which are simply buffers for messages,
Exchanges, which are broadcasting devices.
As a simple analogy, queues are the pipes in which messages accumulate, and exchanges are the T-shaped, cross-shaped, and other connectors between pipes.
Perhaps it works even better to compare it to a physical network, where queues would be like cables, and exchanges like switches or smart hubs, for which different distribution strategies can be chosen.
So basically, what you need to do is create a new queue from the new consumer thread, bind it to the exchange object (which you refer to by name), and have the producer thread send its messages exclusively to the exchange. Any new thread may follow the same protocol.
The last important point is to pick the correct distribution method for the exchange (fanout to broadcast each message to every bound queue, direct or topic for selective routing; round-robin dispatch, by contrast, is what you get when several consumers share a single queue), depending on how you wish your consumers to receive messages.
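As a sketch, here is what that protocol looks like with a Go AMQP client (github.com/streadway/amqp; error handling elided, and all names here are placeholders):

package main

import (
	"time"

	"github.com/streadway/amqp"
)

func main() {
	conn, _ := amqp.Dial("amqp://guest:guest@localhost:5672/")
	ch, _ := conn.Channel()

	// One named exchange shared by the producer and all consumers.
	// "fanout" broadcasts every message to every bound queue.
	_ = ch.ExchangeDeclare("images", "fanout", true, false, false, false, nil)

	// Each consumer declares its own server-named, exclusive queue and
	// binds it to the exchange. In a multithreaded program, each thread
	// would open its own Channel from the shared Connection.
	q, _ := ch.QueueDeclare("", false, true, true, false, nil)
	_ = ch.QueueBind(q.Name, "", "images", false, nil)
	msgs, _ := ch.Consume(q.Name, "", true, false, false, false, nil)
	go func() {
		for m := range msgs {
			_ = m.Body // handle the message
		}
	}()

	// The producer publishes to the exchange only, never to a queue directly.
	_ = ch.Publish("images", "", false, false, amqp.Publishing{Body: []byte("data")})
	time.Sleep(time.Second) // give the delivery a moment before exiting
}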
Take a look at our introduction to AMQP concepts
http://www.rabbitmq.com/tutorials/amqp-concepts.html
I am using a producer/consumer pattern backed by a BlockingCollection to read data from a file, parse/convert it, and then insert it into a database. The code I have is very similar to what can be found here: http://dhruba.name/2012/10/09/concurrent-producer-consumer-pattern-using-csharp-4-0-blockingcollection-tasks/
However, the main difference is that my consumer threads not only parse the data but also insert it into a database. This bit is slow, and I think it is causing the threads to block.
In the example, there are two consumer threads. I am wondering if there is a way to increase the number of threads in a somewhat intelligent way. I had thought a thread pool would do this, but I can't seem to grasp how that would be done.
Alternatively, how would you go about choosing the number of consumer threads? Two does not seem right to me, but I'm not sure what the best number would be. Any thoughts on the best way to choose the number of consumer threads?
The best way to choose the number of consumer threads is math: figure out how many packets per minute are coming in from the producers, divide that by how many packets per minute a single consumer can handle, and you have a pretty good idea of how many consumers you need.
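For example (numbers made up purely for illustration): if the producers generate 6,000 records per minute and a single consumer can parse and store 1,500 records per minute, you need 6000 / 1500 = 4 consumers, plus perhaps one spare to absorb bursts. Measure both rates before committing to a number.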
I solved the blocking output problem (consumers blocking when trying to update the database) by adding another BlockingCollection that the consumers put their completed packets in. A separate thread reads that queue and updates the database. So it looks something like:
input thread(s) => input queue => consumer(s) => output queue => output thread
This has the added benefit of divorcing the consumers from the output, meaning that you can optimize the output or completely change the output method without affecting the consumer. That might allow you, for example, to batch the database updates so that rather than making one database call per record, you could update a dozen or a hundred (or more) records with a single call.
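The original discussion is about C#, but the pipeline shape itself is language-neutral. Here is a compact sketch in Go, with a hypothetical batch size and a stubbed-out writeBatch standing in for the database call:

package main

import "sync"

type Record struct{ /* parsed fields */ }

func writeBatch(batch []Record) { /* one DB call for the whole batch */ }

func main() {
	input := make(chan string, 100)  // raw lines from the file reader
	output := make(chan Record, 100) // parsed records awaiting the DB

	var consumers sync.WaitGroup
	for i := 0; i < 4; i++ { // consumer count chosen by the math above
		consumers.Add(1)
		go func() {
			defer consumers.Done()
			for range input {
				output <- Record{} // parse/convert here
			}
		}()
	}

	// Single output thread: drains the output queue and batches DB
	// updates, so slow inserts never block the consumers.
	done := make(chan struct{})
	go func() {
		defer close(done)
		batch := make([]Record, 0, 100)
		for r := range output {
			batch = append(batch, r)
			if len(batch) == cap(batch) {
				writeBatch(batch)
				batch = batch[:0]
			}
		}
		if len(batch) > 0 {
			writeBatch(batch) // flush the remainder
		}
	}()

	input <- "example line" // the producer would feed the file here
	close(input)
	consumers.Wait()
	close(output)
	<-done
}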
I show a very simple example of this (using a single consumer) in my article Simple Multithreading, Part 2. That works with a text file filter, but the concepts are the same.