Codecs were introduced in Spring Integration 4.2.
However, the description in the docs doesn't really explain how a Codec is different from a MessageConverter, or in which scenarios to use which abstraction.
Basically what I want to know is:
Why was the Codec abstraction introduced when it seems similar to what a MessageConverter does?
Why would you use a Codec over a MessageConverter and vice versa?
When would you choose to use one over the other?
This question was highlighted in the context of Spring Cloud Stream, where there is a default Kryo Codec configured, but recently there has been work around MessageConverters.
It's a bit of a grey area.
MessageConverters are used in Spring Integration in two areas:
To convert some external representation of a message to a spring-messaging Message<?> - e.g. to/from an MQTT message.
To implement the datatype on message channels.
Codecs, on the other hand, deal only with message payloads when putting them on the wire (the MessageBus in XD or the Binder in Spring Cloud Stream). Kryo is an alternative to Java serialization.
Applications will typically not deal with Codecs directly, but Spring Integration provides a CodecMessageConverter which takes a codec to encode/decode the payload while converting.
It also provides a codec-based transformer so an app can do the encoding/decoding (if it wishes) somewhere else in the flow.
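As a rough illustration, here is a minimal sketch of using the Kryo-backed PojoCodec from spring-integration-core directly; the Event payload class is made up for this demo:

    import org.springframework.integration.codec.Codec;
    import org.springframework.integration.codec.kryo.PojoCodec;

    public class KryoCodecDemo {

        // Hypothetical payload type, used only for this sketch.
        static class Event {
            String name;
            Event() { }
            Event(String name) { this.name = name; }
        }

        public static void main(String[] args) throws Exception {
            Codec codec = new PojoCodec();                     // Kryo-based Codec implementation
            byte[] wire = codec.encode(new Event("created"));  // encodes the payload only, no Message envelope
            Event restored = codec.decode(wire, Event.class);  // the target type must be supplied on decode
            System.out.println(restored.name);
        }
    }

Note how the target type has to be supplied on decode; the same codec could be handed to a CodecMessageConverter to do this as part of message conversion.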
So, in the context of Spring Cloud Stream, the Kryo Codec is used to encode/decode the payload within the Binder.
The message converters are used to implement conversion within the application bound to the transport by the Binder, using the channel dataType feature.
Let's look at an example using Spring Cloud Data Flow:
stream create foo --definition "source | processor --outputType=application/json | sink"
Let's say the source emits some POJO that is received by the processor, and the processor internally normally emits a Map object, but the sink wants to receive JSON; a MessageConverter does that conversion for you because of the outputType declaration.
The data between the source and processor, and between the processor and sink, is transported as Kryo.
Related
Following the example found in GitHub https://github.com/spring-cloud/spring-cloud-gcp/tree/master/spring-cloud-gcp-samples/spring-cloud-gcp-pubsub-polling-binder-sample regarding polling messages from a PubSub subscription, I was wondering...
Is it possible to make a PollableMessageSource retrieve List<Message<?>> instead of a single message per poll?
I've seen the @Poller annotation only being used in Source-typed objects, never in Processor or Sink. Is it possible to use it in such a context, for example with @StreamListener or with a functional approach?
The PollableMessageSource binding and Source stream applications are fully based on the Poller and MessageSource abstractions from Spring Integration, whose contract is to produce a single message to the configured channel. The point of messaging is really to process one message at a time, without affecting others: a failure for one message doesn't fail the others in the flow.
On the other hand, you probably mean for the GCP Pub/Sub messages to be produced as a list in the Spring message payload. That is possible, but only via some custom code in the Pub/Sub consumer and MessageSource implementation. Although I would think twice before expecting batches from the source. You could probably utilize an aggregator to build some small windows if your further logic is about processing a list. But again: it is going to be a single Spring message.
It may be better to start thinking about a reactive function implementation, where you can indeed expect a Flux<Message<?>> as an input, and the Spring Cloud Stream framework will take care of emitting the data from Pub/Sub into the reactive stream for you.
See more info in docs: https://docs.spring.io/spring-cloud-stream/docs/3.1.0/reference/html/spring-cloud-stream.html#_reactive_functions_support
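For illustration only, a minimal sketch of such a reactive function; the bean name, payload type and batch size are arbitrary assumptions:

    import java.util.function.Function;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.messaging.Message;
    import reactor.core.publisher.Flux;

    @Configuration
    public class WindowingFunctionConfig {

        // Buffers incoming messages into small lists of up to 10 payloads,
        // then processes each batch; bind this function to the Pub/Sub destination.
        @Bean
        public Function<Flux<Message<String>>, Flux<String>> batcher() {
            return flux -> flux
                    .map(Message::getPayload)
                    .buffer(10)  // collect up to 10 payloads per emitted List
                    .map(batch -> "processed " + batch.size() + " messages");
        }
    }

The framework subscribes to the Flux for you; the batching happens inside your reactive pipeline rather than in the binder.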
Spring Cloud Stream is based on at-least-once delivery. This means that in some rare cases a duplicate message can arrive at an endpoint.
Does Spring Cloud Stream keep a buffer of already received messages?
The IdempotentReceiver in the Enterprise Integration Patterns book suggests:
Design a receiver to be an Idempotent Receiver, one that can safely receive the same message multiple times.
Does Spring Cloud Stream control duplicate messages in consumers?
Update:
A paragraph from the Spring Cloud Stream docs says:
4.5.1. Durability
Consistent with the opinionated application model of Spring Cloud Stream, consumer group subscriptions are durable. That is, a binder implementation ensures that group subscriptions are persistent and that, once at least one subscription for a group has been created, the group receives messages, even if they are sent while all applications in the group are stopped.
Anonymous subscriptions are non-durable by nature. For some binder implementations (such as RabbitMQ), it is possible to have non-durable group subscriptions.
In general, it is preferable to always specify a consumer group when binding an application to a given destination. When scaling up a Spring Cloud Stream application, you must specify a consumer group for each of its input bindings. Doing so prevents the application’s instances from receiving duplicate messages (unless that behavior is desired, which is unusual).
I think your assumptions about the responsibility of the spring-cloud-stream framework are incorrect.
Spring-cloud-stream in a nutshell is a framework responsible for connecting and adapting producers/consumers provided by the developer to the message broker(s) exposed by the spring-cloud-stream binder (e.g., Kafka, Rabbit, Kinesis, etc.).
So connecting to a broker, receiving a message from the broker, deserialising it, invoking user code, serialising the message and sending it back to the broker is in the scope of the framework's responsibility. You can look at it as purely infrastructure.
What you're describing is more of an application concern, since the actual receiver is something that the user develops as part of the spring-cloud-stream development experience; hence the responsibility for idempotence resides with that user.
Also, on top of that, most brokers already handle idempotency (in a way) by ensuring that a particular message is delivered only once. That said, if someone sends an identical message to such a broker, it will have no idea that it is a duplicate, so the requirement for idempotency and/or deduplication is still valid. But as you can see, it is not as straightforward, given the number of factors in play, where your understanding of idempotence could be different from mine, and hence our approaches could be different as well.
One last thing (partially to prove my last point): "can safely receive the same message multiple times" - that is all it states, but what does safely really mean to you vs. me vs. some other person?
If you are concerned about a case where the application receives and processes a message from the broker but crashes before it acknowledges the message, that can happen. The Spring Cloud Stream app starters provide support for auto-configuration of a persistent message metadata store, which backs Spring Integration's IdempotentReceiverInterceptor. An example of this is in the SFTP source app starter. By default, the SFTP source uses an in-memory metadata store, so it would not survive a restart, but it can be customized to use a persistent store.
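As a hedged sketch of wiring this up by hand (the "myMessageId" header name is an assumption; swap SimpleMetadataStore for a persistent implementation to survive restarts):

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.integration.handler.advice.IdempotentReceiverInterceptor;
    import org.springframework.integration.metadata.ConcurrentMetadataStore;
    import org.springframework.integration.metadata.SimpleMetadataStore;
    import org.springframework.integration.selector.MetadataStoreSelector;

    @Configuration
    public class DedupConfig {

        // In-memory store for the demo; a Redis- or JDBC-backed
        // ConcurrentMetadataStore would survive restarts.
        @Bean
        public ConcurrentMetadataStore metadataStore() {
            return new SimpleMetadataStore();
        }

        // Flags a message as a duplicate when its key has been seen before.
        @Bean
        public IdempotentReceiverInterceptor idempotentReceiverInterceptor(ConcurrentMetadataStore store) {
            MetadataStoreSelector selector = new MetadataStoreSelector(
                    message -> message.getHeaders().get("myMessageId", String.class), store);
            IdempotentReceiverInterceptor interceptor = new IdempotentReceiverInterceptor(selector);
            interceptor.setDiscardChannelName("nullChannel"); // silently drop duplicates
            return interceptor;
        }
    }

The interceptor can then be applied to a handler method with Spring Integration's @IdempotentReceiver annotation.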
AFAIK the Spring Cloud Stream project is based on Spring Integration. Hence I was wondering whether there is a nice way to resequence a subset of inbound messages before the StreamListener handler is triggered. Or do I need to assemble the whole IntegrationFlow from scratch using XML or Java DSL config from Spring Integration?
My use case is as follows. Most of the time I process inbound messages on a Kafka topic as they come. However, a few events have to be resequenced based on CORRELATION_ID, SEQUENCE_NUMBER, and SEQUENCE_SIZE headers. In other words I'd like to keep using StreamListener as much as possible and simply plug in resequencing strategy for some events.
Yes, you would need to use Spring Integration for it. In fact, Spring Cloud Stream is effectively only a binding framework. It binds message handlers to the message brokers via binders. The message handlers themselves are provided by the users.
The @StreamListener annotation is pretty much an equivalent of Spring Integration's @ServiceActivator, with a few extra features (e.g., conditional routing), but other than that it is just a message handler.
Now, as you alluded to, you can use Spring Integration (SI) to implement a message handler or an internal SI flow, and that is normal and recommended for complex cases.
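For instance, a minimal Java DSL sketch of such an internal flow, resequencing on the standard correlation headers before handing off to the binding (channel names are made up):

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.integration.dsl.IntegrationFlow;
    import org.springframework.integration.dsl.IntegrationFlows;

    @Configuration
    public class ResequencerFlowConfig {

        // Reorders messages using the CORRELATION_ID, SEQUENCE_NUMBER and
        // SEQUENCE_SIZE headers before they reach the downstream handler.
        @Bean
        public IntegrationFlow resequencingFlow() {
            return IntegrationFlows.from("rawInput")
                    .resequence(r -> r.releasePartialSequences(true)) // emit in-order prefixes as they complete
                    .channel("resequencedInput")
                    .get();
        }
    }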
That said, we do provide out-of-the-box apps that implement certain EIP components, and we do have, for example, an aggregator app which you can use as a starting point for implementing a resequencer. Furthermore, given that we have an aggregator app and not a resequencer, we would be glad to accept a contribution for it if you're interested.
I hope this answers your question.
Currently I am working on a Spring Integration application which has the following scenario.
There is a Transformer which transforms an incoming message into a particular object type.
Once the transformation is done, we need to write it to a log file and to a database table, and then finally send it to a JMS outbound adapter.
I was reading the Spring Integration reference and found that there are two ways we can approach this scenario:
Introduce a pub-sub channel as the output channel of the above-mentioned transformer and have the File-outbound, DB-outbound and JMS-outbound adapters as the subscribers.
Introduce a Recipient List Router just after the transformer and specify the File-outbound, DB-outbound and JMS-outbound adapters as the recipients.
When it comes to Enterprise Integration Patterns, what is the best way to handle this scenario? Any new suggestions and improvements are welcome.
Thanks,
Keth
There is no "best way" - both solutions are equivalent and there is little difference at runtime. So it's your preference; I generally use pub/sub for the simple case and an RLR if the recipients are conditional (with selectors).
We use an <int-amqp:publish-subscribe-channel/> as a kind of event bus in our service-based application. The send method as well as the message handler are based on the Message class from spring-messaging (as of spring-integration 4.0+). Events are changes to entities that need to be picked up by other services.
The problem is: the spring-messaging Message class is treated as an arbitrary object payload by spring-amqp, as it is not recognized as a spring-amqp Message. This causes the following problems:
The default message format is serialized Java objects. spring-amqp serializes not only our original payload object, but also the wrapping spring-messaging Message, which is not compatible between Spring Framework 4.0 and 4.1.
Configuring a message converter for JSON (Jackson2JsonMessageConverter, to be exact) doesn't solve the problem, as it also converts the Message instance - which is spring-integration's GenericMessage, and that can't be instantiated from JSON as it lacks an appropriate constructor.
We need to mix Spring versions, as we have services implemented with Grails 2.4 (based on Spring 4.0) and with current Spring Boot (relies on Spring 4.1).
Is there any way out of this, preferably an idiomatic spring-integration way? Is there maybe another abstraction instead of, or in addition to, the PublishSubscribeAmqpChannel? Or any other means of message conversion that we could apply?
Instead of using an amqp-backed channel, use an outbound-channel-adapter to send and an inbound-channel-adapter to receive.
The channel holds the entire message (serialized) whereas the adapters transport the payload as the message body and (optionally) map headers to/from amqp headers.
You will need to configure a fanout exchange for pub/sub (the channel would create one called si.fanout.<channelName> by default). You can then bind a queue for each recipient.
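A hedged Java-config sketch of the sending side under these assumptions (the exchange name "events.fanout" and the channel "eventBusOut" are invented):

    import org.springframework.amqp.core.FanoutExchange;
    import org.springframework.amqp.rabbit.connection.ConnectionFactory;
    import org.springframework.amqp.rabbit.core.RabbitTemplate;
    import org.springframework.amqp.support.converter.Jackson2JsonMessageConverter;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.integration.amqp.outbound.AmqpOutboundEndpoint;
    import org.springframework.integration.annotation.ServiceActivator;

    @Configuration
    public class EventBusProducerConfig {

        @Bean
        public FanoutExchange eventsExchange() {
            return new FanoutExchange("events.fanout");
        }

        @Bean
        public RabbitTemplate jsonRabbitTemplate(ConnectionFactory connectionFactory) {
            RabbitTemplate template = new RabbitTemplate(connectionFactory);
            template.setMessageConverter(new Jackson2JsonMessageConverter()); // JSON body instead of Java serialization
            return template;
        }

        // Only the payload travels as the AMQP body; headers are mapped to
        // AMQP headers rather than serializing the whole Message envelope.
        @Bean
        @ServiceActivator(inputChannel = "eventBusOut")
        public AmqpOutboundEndpoint amqpOut(RabbitTemplate jsonRabbitTemplate) {
            AmqpOutboundEndpoint endpoint = new AmqpOutboundEndpoint(jsonRabbitTemplate);
            endpoint.setExchangeName("events.fanout");
            return endpoint;
        }
    }

On the receiving side, an AmqpInboundChannelAdapter listening on a service-specific queue bound to the same exchange completes the pub/sub pair.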