Is there an inbuilt or canonical way in Rust to ‘publish’ state? - multithreading

Is there any canonical way in Rust to publish a frequently updating ‘state’ such that any number of consumers can read it without providing access to the object itself?
My use case is that I have a stream of information coming in via a web socket and I want aggregated metrics to be available for other threads to consume. One could do this externally with something like Kafka, and I could probably roll my own internal solution, but I'm wondering if there is any other method?
An alternative, which I've used in Go, is to have consumers register themselves with the producer and each receive a channel, with the producer simply publishing to each channel separately. There will generally be a low number of consumers, so this may well work, but I'm wondering if there's anything better.
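For what it's worth, that register-a-channel-per-consumer approach translates fairly directly to Rust with std::sync::mpsc. A rough sketch, assuming a made-up Metrics type for the aggregated state:
use std::sync::mpsc;

// Hypothetical aggregated state the producer publishes; Clone so each
// consumer can receive its own copy.
#[derive(Clone)]
struct Metrics {
    messages_seen: u64,
}

// The producer keeps one Sender per registered consumer.
struct Publisher {
    subscribers: Vec<mpsc::Sender<Metrics>>,
}

impl Publisher {
    fn new() -> Self {
        Publisher { subscribers: Vec::new() }
    }

    // A consumer registers itself and gets back the receiving end of its own channel.
    fn subscribe(&mut self) -> mpsc::Receiver<Metrics> {
        let (tx, rx) = mpsc::channel();
        self.subscribers.push(tx);
        rx
    }

    // Publish by cloning the value into every consumer's channel, dropping
    // subscribers whose receivers have been dropped.
    fn publish(&mut self, metrics: &Metrics) {
        self.subscribers.retain(|tx| tx.send(metrics.clone()).is_ok());
    }
}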

It sounds like you want a "broadcast channel".
If you're using async, the popular tokio crate provides an implementation in its sync::broadcast module:
A multi-producer, multi-consumer broadcast queue. Each sent value is seen by all consumers.
A Sender is used to broadcast values to all connected Receiver values. Sender handles are clone-able, allowing concurrent send and receive actions. [...]
[...]
New Receiver handles are created by calling Sender::subscribe. The returned Receiver will receive values sent after the call to subscribe.
If that doesn't quite suit your fancy, other crates provide similar types; you can find them by searching for "broadcast" on crates.io.
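For illustration, here is a minimal sketch of the broadcast approach with tokio. The Metrics type, the channel capacity, and the single consumer are invented for the example; it assumes tokio with the rt and macros features enabled:
use tokio::sync::broadcast;

// Hypothetical aggregated metrics; values must be Clone because every
// receiver gets its own copy.
#[derive(Clone, Debug)]
struct Metrics {
    messages_seen: u64,
}

#[tokio::main]
async fn main() {
    // Capacity 16: a receiver that falls more than 16 values behind gets a Lagged error.
    let (tx, _rx) = broadcast::channel::<Metrics>(16);

    // Each consumer subscribes and sees every value sent after the subscribe call.
    let mut rx = tx.subscribe();
    let consumer = tokio::spawn(async move {
        while let Ok(m) = rx.recv().await {
            println!("consumer saw {:?}", m);
        }
    });

    // The producer (e.g. the web-socket task) publishes updated state.
    tx.send(Metrics { messages_seen: 1 }).unwrap();
    tx.send(Metrics { messages_seen: 2 }).unwrap();
    drop(tx); // dropping the last Sender ends the consumer loop

    consumer.await.unwrap();
}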

Related

Is it possible to make a Poller (or PollableMessageSource) poll messages as a List?

Following the example found on GitHub (https://github.com/spring-cloud/spring-cloud-gcp/tree/master/spring-cloud-gcp-samples/spring-cloud-gcp-pubsub-polling-binder-sample) regarding polling messages from a Pub/Sub subscription, I was wondering...
Is it possible to make a PollableMessageSource retrieve List<Message<?>> instead of a single message per poll?
I've seen the @Poller annotation only being used on Source-typed objects, never in a Processor or Sink. Is it possible to use it in such a context, for example with @StreamListener or with a functional approach?
The PollableMessageSource binding and the Source stream applications are fully based on the Poller and MessageSource abstractions from Spring Integration, whose contract is to produce a single message to the configured channel. The point of messaging is really to process a single message without affecting others; a failure for one message doesn't mean the others in the flow fail.
On the other hand, you probably mean for the GCP Pub/Sub messages to be produced as a list in the Spring message payload. That is indeed possible, but only via some custom code in the Pub/Sub consumer and the MessageSource implementation. I would think twice before expecting batches from the source, though. You could use an aggregator to build small windows if your downstream logic is about processing a list, but again: it is still going to be a single Spring message.
It may be better to start thinking about a reactive function implementation, where you can indeed expect a Flux<Message<?>> as an input and the Spring Cloud Stream framework will take care of emitting the data from Pub/Sub into the reactive stream you expect.
See more info in docs: https://docs.spring.io/spring-cloud-stream/docs/3.1.0/reference/html/spring-cloud-stream.html#_reactive_functions_support

Architecture issue - Azure servicebus and message order guarantee

OK, so I'm relatively new to Service Bus. I'm working on a project where we use Azure Service Bus for queueing messages. Our architecture roughly looks like the following:
The idea is that all kinds of things happen in our SourceSystem, which leads to messages being put on the Service Bus topics. Our responsibility is syncing these events to the external client so they are aware of what we are doing.
The issue is that we currently don't use Service Bus sessions, so message order isn't guaranteed. Also consider the following scenario:
OrderCreated
OrderUpdate 1
OrderUpdate 2
OrderClosed
What happens now is that if the external client's API is down for, say, OrderUpdate 1 and OrderUpdate 2, we could potentially send the messages in the order: OrderCreated, OrderClosed, OrderUpdate 1, OrderUpdate 2.
Currently we just retry a message a few times, and then it moves into the dead-letter queue for manual reprocessing.
What steps should we take to better guarantee message order? I feel that, within the scope of an order, message order needs to be guaranteed.
Should we force the source system to put all messages for an order in a Service Bus session? But how can we handle this with multiple topics? And what do we do if message 1 from a session ends up in the dead-letter queue?
There are a lot of considerations here. Should we use a single topic so it's easier to manage the sessions? But that opens up other problems, with different message structures ending up in a single topic.
I'd love to hear your opinions on this.
Have a look at Durable Functions in Azure. You can use the 'Async HTTP API' or one of the other patterns to achieve the orchestration you need.
NServiceBus's Sagas might also be a good option; here is an article that does a very good comparison between NServiceBus and Durable Functions.
If the external client has to receive all those events and order matters, sending those messages to multiple topics, with one topic per message type, will make your mission extremely hard to accomplish. For ordered messaging you first need to use a single entity (queue or topic) with sessions enabled; that way you can guarantee ordered message processing. If you have multiple external clients, you'd need a session-enabled entity (topic) per external client.
Another option is to implement a pattern known as Process Manager. The process manager would be responsible for making decisions about the incoming messages and concluding when the work for a given order is complete.
There are also libraries (MassTransit, NServiceBus, etc) that can help you. NServiceBus implements Process Manager via a feature called Saga (tutorial) and MassTransit has it as well (documentation).

Immutable message in Spring Integration

I was wondering what is the reasoning behind making messages immutable in Spring Integration.
Is it only because of thread-safety in multi-threaded environments?
Performance? Don't you pay a performance penalty when you have to create a new message each time you want to add something to an existing message?
Avoiding a range of bugs when passing by reference?
Just guessing here.
The simplest way to explain this comes from the original Java Immutable Objects idea:
Immutable objects are particularly useful in concurrent applications. Since they cannot change state, they cannot be corrupted by thread interference or observed in an inconsistent state.
Since we are talking about messaging here, we should always keep in mind the loose coupling principle, where the producer (caller) and consumer (executor) know nothing about each other and communicate only via messages (events, commands, packets, etc.). At the same time, the same message may have several consumers performing completely unrelated business logic. So, by keeping the shared object's state immutable, we prevent one process from affecting another. That can also be part of the security between processes, since each message is executed in isolation.
Spring Integration is plain Java, so the usual concurrency and safety constraints apply here as well; you would be unpleasantly surprised to distribute a message to several independent processes and then see modifications made by one process show up in another.
There is some information in the Reference Manual:
Therefore, when a Message instance is sent to multiple consumers (e.g. through a Publish Subscribe Channel), if one of those consumers needs to send a reply with a different payload type, it will need to create a new Message. As a result, the other consumers are not affected by those changes.
As you see, this applies to the Message object itself and its MessageHeaders. The payload is entirely your responsibility, and in the past I really have had problems adding and removing elements of an ArrayList payload in multi-threaded business logic.
Anyway, the framework offers a compromise: MutableMessage, MutableMessageHeaders and MutableMessageBuilder. You can also globally override the MessageBuilderFactory used internally by the framework with the MutableMessageBuilderFactory. For that you just need to register such a bean under the bean name IntegrationUtils.INTEGRATION_MESSAGE_BUILDER_FACTORY_BEAN_NAME:
@Bean(name = IntegrationUtils.INTEGRATION_MESSAGE_BUILDER_FACTORY_BEAN_NAME)
public static MessageBuilderFactory mutableMessageBuilderFactory() {
    return new MutableMessageBuilderFactory();
}
Then all messages in your integration flows will be mutable and will keep the same id and timestamp headers.

zmq: can multiple threads PUSH in a simple PUSH-PULL pattern

I have two processes: a producer, which pushes messages via ZMQ to a consumer, in a simple PUSH-PULL point-to-point pattern. The producer has several internal threads that call send() via zmq. However, 0MQ's docs suggest not sharing sockets between threads.
Must I use a single thread to send?
Assuming there is no strict requirement for keeping the sending order between the threads, doesn't the fact that the socket is a one-directional simplex allow multiple threads to use it without introducing locks?
The easiest thing to do is to create a separate PUSH socket on each of the producer's threads and connect all of these sockets to a single PULL socket in the consumer; a rough sketch follows below.
It's explicitly stated in the guide that ZeroMQ sockets must be used on a single thread. I'd say that violating this requirement is not a good idea, even if it seems to work: things may break in the next version of the library or on some specific platform or in some specific load scenario. So, it's just too risky.
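To make the one-PUSH-socket-per-thread layout concrete, here is a sketch using the Rust zmq crate purely for illustration; the endpoint, message contents, and thread count are made up, and everything is collapsed into one process so the example is self-contained (in the real setup the PULL socket would live in the consumer process):
use std::thread;

fn main() {
    // A single Context is thread-safe and cheap to clone; only sockets must
    // stay on the thread that created them.
    let ctx = zmq::Context::new();

    // Consumer side: one PULL socket fans in messages from all producer threads.
    let pull = ctx.socket(zmq::PULL).unwrap();
    pull.bind("tcp://127.0.0.1:5555").unwrap();

    // Producer side: each thread creates and owns its own PUSH socket.
    let workers: Vec<_> = (0..3)
        .map(|id| {
            let ctx = ctx.clone();
            thread::spawn(move || {
                let push = ctx.socket(zmq::PUSH).unwrap();
                push.connect("tcp://127.0.0.1:5555").unwrap();
                push.send(format!("update from thread {id}").as_bytes(), 0).unwrap();
            })
        })
        .collect();

    for _ in 0..3 {
        let msg = pull.recv_string(0).unwrap().unwrap();
        println!("consumer got: {msg}");
    }
    for w in workers {
        w.join().unwrap();
    }
}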

Distributed pub/sub with single consumer per message type

I have no clue if it's better to ask this here, or over on Programmers.SE, so if I have this wrong, please migrate.
First, a bit about what I'm trying to implement. I have a node.js application that takes messages from one source (a socket.io client), and then does processing on the message, which might result in zero or more messages back out, either to the sender, or other clients within that group.
For the processing, I would like to essentially just shove the message into a queue, then it works its way through various message processors that might kick off their own items, and eventually, the bit running socket.io is informed "Hey, send this message back"
As a concrete example, say a user signs into the service. That sign-in message is then placed in the queue, where the authorization processor gets it, does its thing, then places a message back in the queue saying the client has been authorized. This goes back to the socket.io socket that is connected to the client, along with other clients that might be interested. It can also go to other subsystems that might want to do more processing on authorization (looking up user info, sending more info to the client based on their data, etc.).
If I wanted strong coupling, this would be easy, but I tried that before, and it just turns into a mess of spaghetti code that's very fragile, which I would like to avoid. Another wrench in the setup is that this should be cluster-able, which is where the real problem comes in. There might be more than one, say, authorization processor running, but the authorization message should be processed only once.
So, in short, I'm looking for a pattern/technique that will allow me to, essentially, have multiple "groups" of subscribers for a message, and the message will be processed only once per group.
I thought about maybe having each instance of a processor generate a unique name that would be used as a list in Redis. This name would then be registered with some sort of dispatch handler and placed into a set for that group of subscribers. Then, when a message arrives, the dispatcher pulls a random member out of that set and places the message into that list. While it seems like this would work, it seems somewhat over-complicated and fragile.
The core problem is I've never designed a system like this, so I'm not even sure the proper terms to use or look up. If anyone can point me in the right direction for this, I would be most appreciative.
I think what you're describing is similar to the https://www.getbridge.com/ service. I tried it but ended up writing my own based on ZeroMQ; it allows you to register services (req -> <- rec) and channels, which are pub/sub workers.
As for the design, I used client -> broker -> services & channels, which are all plug-and-play using auto-discovery. The services register their schema with the brokers, which open a TCP connection so that brokers on other servers can communicate with that broker group's services. Internal services and clients then connect via Unix sockets or IPC channels, whichever is preferred.
I ended up wrapping the Redis publish/subscribe functions a bit to do this. Each type of message processor gets a "group name", and there can be multiple instances of the processor within that group (so multiple instances of the program can run for clustering).
When publishing a message, I generate an incremental ID, then store the message in a string key with that ID, then publish the message ID.
On the receiving end, the first thing the subscriber does is attempt to add the message ID it just got from the publisher into a set of received messages for that group with sadd. If sadd returns 0, the message has already been grabbed by another instance, and it just returns. If it returns 1, the full message is pulled out of the string key and sent to the listener.
Of course, this relies on redis being single threaded, which I imagine will continue to be the case.
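A minimal sketch of that claim-via-SADD idea, shown here with the Rust redis crate purely for illustration (the original answer did this from Node.js); the key names and channel are invented, and the subscribe loop that would call try_claim is omitted:
use redis::Commands;

// Publisher side: store the full payload under an incremental ID, then
// publish only the ID to the channel.
fn publish(con: &mut redis::Connection, payload: &str) -> redis::RedisResult<()> {
    let id: i64 = con.incr("msg:next_id", 1)?;
    let _: () = con.set(format!("msg:{id}"), payload)?;
    let _: () = con.publish("events", id)?;
    Ok(())
}

// Subscriber side: the first instance in a group to SADD the ID "claims" the
// message; every other instance in that group sees 0 and skips it.
fn try_claim(con: &mut redis::Connection, group: &str, id: i64) -> redis::RedisResult<Option<String>> {
    let newly_added: i64 = con.sadd(format!("group:{group}:seen"), id)?;
    if newly_added == 1 {
        let payload: String = con.get(format!("msg:{id}"))?;
        Ok(Some(payload))
    } else {
        Ok(None) // another instance of this group already took it
    }
}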
What you might be looking for is an AMQP protocol implementation, where you can have queues with custom exchanges and implement a pub/sub model.
RabbitMQ is a popular AMQP implementation with lots of client libraries.
It also has a Node.js library.
