I am reading a message, transforming it, and outputting it on a JMS channel. The JMS channel uses a WorkManager task executor to read and process the messages.
Even though we configured the WorkManager in the application server with 10 threads, only one thread is being used.
<si:chain id="prenotifchain" input-channel="preNotificationChannel" output-channel="notificationJMSChannel">
<si:transformer id="prenotif" method="transformRequest" ref="notificationTransformer"/>
</si:chain>
<si-jms:channel id="notificationJMSChannel" queue="notificationQueue"
    connection-factory="queueConnectionFactory" transaction-manager="txManager"
    task-executor="notificationTaskExecutor" />
<jee:jndi-lookup id="notificationQueue" jndi-name="jms/notifqueue"/>
<bean id="notificationTaskExecutor"
class="org.springframework.scheduling.commonj.WorkManagerTaskExecutor">
<property name="workManagerName" value="notifWM" />
<property name="resourceRef" value="true" />
</bean>
Are we missing any configuration, or is there another way to read with multiple threads?
Use the concurrency attribute:
<xsd:attribute name="concurrency" type="xsd:string">
<xsd:annotation>
<xsd:documentation><![CDATA[
The number of concurrent sessions/consumers to start for each listener.
Can either be a simple number indicating the maximum number (e.g. "5")
or a range indicating the lower as well as the upper limit (e.g. "3-5").
Note that a specified minimum is just a hint and might be ignored at runtime.
Default is 1; keep concurrency limited to 1 in case of a topic listener
or if message ordering is important; consider raising it for general queues.
]]></xsd:documentation>
</xsd:annotation>
</xsd:attribute>
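For example, applied to the channel above (a minimal sketch; the value 10 is illustrative, chosen here to match the WorkManager's thread count):
<si-jms:channel id="notificationJMSChannel" queue="notificationQueue"
    connection-factory="queueConnectionFactory" transaction-manager="txManager"
    task-executor="notificationTaskExecutor" concurrency="10" />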
We have an application which can consume around 300 JMS messages per minute. We need to increase the speed to 3000 messages per minute.
I created a simple test program that reads messages from the queue and logs them. No processing is involved, so I expected high speed. However, the logging still happens at a rate of around 400 messages per minute.
Below are excerpts from my program:
<int-jms:message-driven-channel-adapter id="testJmsInboundAdapter"
auto-startup="true"
destination="testQueueDestination"
connection-factory="testConnectionFactory"
channel="messageTransformerChannel" />
<int:channel id="messageTransformerChannel" />
<int:service-activator
id="loggerActivator"
input-channel="messageTransformerChannel"
method="log"
ref="logger" />
The logger method simply logs the message:
public void log(final GenericMessage<Object> object) {
    LOGGER.info("Logging message: " + object);
}
Any advice on where I should look for the bottleneck? Is there any limitation on the number of messages that can be consumed per minute using Spring Integration's message-driven-channel-adapter?
Pay attention to these options:
<xsd:attribute name="concurrent-consumers" type="xsd:string">
<xsd:annotation>
<xsd:documentation>
Specify the number of concurrent consumers to create. Default is 1.
Specifying a higher value for this setting will increase the standard
level of scheduled concurrent consumers at runtime: This is effectively
the minimum number of concurrent consumers which will be scheduled
at any given time. This is a static setting; for dynamic scaling,
consider specifying the "maxConcurrentConsumers" setting instead.
Raising the number of concurrent consumers is recommendable in order
to scale the consumption of messages coming in from a queue. However,
note that any ordering guarantees are lost once multiple consumers are
registered.
</xsd:documentation>
</xsd:annotation>
</xsd:attribute>
<xsd:attribute name="max-concurrent-consumers" type="xsd:string">
<xsd:annotation>
<xsd:documentation>
Specify the maximum number of concurrent consumers to create. Default is 1.
If this setting is higher than "concurrentConsumers", the listener container
will dynamically schedule new consumers at runtime, provided that enough
incoming messages are encountered. Once the load goes down again, the number of
consumers will be reduced to the standard level ("concurrentConsumers") again.
Raising the number of concurrent consumers is recommendable in order
to scale the consumption of messages coming in from a queue. However,
note that any ordering guarantees are lost once multiple consumers are
registered.
</xsd:documentation>
</xsd:annotation>
</xsd:attribute>
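Applied to the adapter above, that would look something like this (a sketch; the consumer counts are illustrative and should be tuned against your broker):
<int-jms:message-driven-channel-adapter id="testJmsInboundAdapter"
    auto-startup="true"
    destination="testQueueDestination"
    connection-factory="testConnectionFactory"
    channel="messageTransformerChannel"
    concurrent-consumers="5"
    max-concurrent-consumers="20" />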
I have an int-kafka:outbound-channel-adapter which I am using to send messages to Kafka, and I then receive messages using an int-kafka:inbound-channel-adapter. Communication seems to work fine; I am able to send and receive messages, but the format is a bit strange. I am sending individual messages separately to my outbound adapter, but when I receive them, I get one message back with all of the messages aggregated into the payload of that one message.
This is how the message looks when I receive it:
[payload={mytopic={0=[string message 1, string message 2, string message 3, string message 4, string message 5, ...........]}}, headers={id=3934de02-1f42-ab90-6aa5-9c15f3cd0b6e, timestamp=1439260669762}]
The receiving integration flow looks like this:
<int-kafka:inbound-channel-adapter
id="kafkaInboundAdapter" kafka-consumer-context-ref="consumerContext"
auto-startup="true" channel="inputFromKafka">
<int:poller fixed-delay="10" time-unit="MILLISECONDS"
max-messages-per-poll="5" />
</int-kafka:inbound-channel-adapter>
<int:channel id="inputFromKafka" />
<int:service-activator id="kafkaMessageHandler"
input-channel="inputFromKafka">
<bean class="com...broker.MessageHandler"></bean>
</int:service-activator>
Is there any reason why I am receiving all messages aggregated into one Spring Integration message, instead of separate messages as they were sent to Kafka?
The KafkaHighLevelConsumerMessageSource is designed like many other polling MessageSource<?> implementations: it gets the data in one poll and returns it all at once.
In this case we have this result from reading the Kafka stream:
Message<Map<String, Map<Integer, List<Object>>>>
where the payload is a Map of Kafka topics to maps of partitions and the messages in them.
If you use only one topic in the consumerContext, you can simply transform the top-level Map down to its partitions map. Or even go further and transform down to just the list of payloads, if you have only one partition, and finally end up with a splitter.
If you'd like to receive messages from the topic one by one, as soon as they appear there, you should take a look at the <int-kafka:message-driven-channel-adapter>.
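For example, a chain like the following would unwrap the payload and split it (a minimal sketch assuming exactly one topic and one partition; the output channel name is illustrative):
<int:chain input-channel="inputFromKafka" output-channel="singleMessages">
    <!-- unwrap Map<topic, Map<partition, List<message>>> down to the List<message>;
         assumes a single topic and a single partition -->
    <int:transformer expression="payload.values().iterator().next().values().iterator().next()" />
    <!-- emit one Spring Integration message per Kafka record -->
    <int:splitter />
</int:chain>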
My problem statement: read a CSV file with 10 million records and store it in the database, in as little time as possible.
I had implemented it using a simple multi-threaded executor in Java, and the logic is almost identical to Spring Batch's chunking: read a preconfigured number of records from the CSV file, then create a thread and pass the data to it; the thread validates the data and writes it to a file, with several such threads running in parallel. Once all the tasks are done, I call SQL*Loader to load each file. Now I want to move this code to Spring Batch (I'm a newbie to Spring Batch).
Here are my questions:
1. Within a step, is it possible to make the ItemReader-to-ItemWriter flow multi-threaded (as I read the file, create a new thread to process the data before the thread writes it)? If not, I need to create two steps: a first, single-threaded step that reads the file, and a second, multi-threaded step that writes to the individual files. But then how do I pass the list of data from one step to the next?
2. If there is a failure in a single thread, how can I stop the whole batch job?
3. How can I retry the batch job after a certain interval in case of failure? I know there is a retry option for failures, but I could not find an option to retry the task after a certain interval. I'm not talking about a scheduler here, because the batch job already runs under a scheduler; on failure, it has to be re-run after 3 minutes or so.
Here is how I solved the problem.
Read the file and chunk it (split the file) using buffered FileChannel readers and writers (the fastest way to do file read/write; even Spring Batch uses the same). I implemented it so that this runs before the job is started (however, it could also run as a job step using a method invoker).
Start the job with the directory location as a job parameter.
Use a MultiResourcePartitioner, which takes the directory location; for each file, a slave step is created in a separate thread (sketched below).
In the slave step, get the file passed from the partitioner and use Spring Batch's ItemReader to read the file.
Use a database item writer (I'm using the MyBatis batch ItemWriter) to push the data to the database.
It's better to use a split count equal to the commit-interval of the step.
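A minimal sketch of the partitioner and slave-step reader described above (assuming a job parameter named inputDir and the partitioner's default "fileName" partition key; the bean names and the pass-through line mapper are illustrative):
<bean id="filePartitioner" scope="step"
    class="org.springframework.batch.core.partition.support.MultiResourcePartitioner">
    <!-- one partition, and thus one slave step execution, per matched file -->
    <property name="resources" value="file:#{jobParameters['inputDir']}/*.csv" />
</bean>

<bean id="slaveReader" scope="step"
    class="org.springframework.batch.item.file.FlatFileItemReader">
    <!-- MultiResourcePartitioner exposes each file in the step context as "fileName" -->
    <property name="resource" value="#{stepExecutionContext['fileName']}" />
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.PassThroughLineMapper" />
    </property>
</bean>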
About multi-threaded reading: the How to set up multi-threading in Spring Batch? answer will point you in the right direction. Also, in this sample there are some considerations about restart for CSV files.
The job should automatically fail if there is an error on a thread: I have never tried it, but this should be the default behaviour.
Spring Batch How to set time interval between each call in a Chunk tasklet can be a start. Also, from the official doc about Backoff Policies:
When retrying after a transient failure it often helps to wait a bit before trying again, because usually the failure is caused by some problem that will only be resolved by waiting. If a RetryCallback fails, the RetryTemplate can pause execution according to the BackoffPolicy in place.
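As a sketch of that mechanism (assuming Spring Retry is on the classpath; 180000 ms matches the 3-minute interval from the question, and wiring the template into the job is left out):
<bean id="retryTemplate" class="org.springframework.retry.support.RetryTemplate">
    <property name="backOffPolicy">
        <bean class="org.springframework.retry.backoff.FixedBackOffPolicy">
            <!-- wait 3 minutes (180000 ms) before each retry attempt -->
            <property name="backOffPeriod" value="180000" />
        </bean>
    </property>
</bean>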
Let me know if this helps, or how you end up solving the problem, because I'm interested for my own (future) work!
I hope my indications can be helpful.
You can split your input file into many files, then use a Partitioner to load the small files with threads; but on error, you must restart the whole job after the DB has been cleaned.
<batch:job id="transformJob">
    <batch:step id="deleteDir" next="cleanDB">
        <batch:tasklet ref="fileDeletingTasklet" />
    </batch:step>
    <batch:step id="cleanDB" next="split">
        <batch:tasklet ref="countThreadTasklet" />
    </batch:step>
    <batch:step id="split" next="partitionerMasterImporter">
        <batch:tasklet>
            <batch:chunk reader="largeCSVReader" writer="smallCSVWriter"
                commit-interval="#{jobExecutionContext['chunk.count']}" />
        </batch:tasklet>
    </batch:step>
    <batch:step id="partitionerMasterImporter" next="partitionerMasterExporter">
        <batch:partition step="importChunked" partitioner="filePartitioner">
            <batch:handler grid-size="10" task-executor="taskExecutor" />
        </batch:partition>
    </batch:step>
</batch:job>
Full example code (on Github).
Hope this helps.
My scheduler's application context defines this trigger:
<bean id="myTrigger" class="org.springframework.scheduling.quartz.CronTriggerBean">
<property name="jobDetail" ref="myJob"/>
<property name="cronExpression" value="0 0 * * * ?"/>
</bean>
Does it fire every day at 00:00? Or every hour?
I'd say the latter, but the documentation of this project says otherwise...
Can you help me out?
Are there different kinds of expressions?
Spring is using Quartz for scheduling jobs, and Quartz cron expressions start with a seconds field, so it is definitely every hour: "0 0 * * * ?" means second 0, minute 0 of every hour, every day.
Please check:
http://www.quartz-scheduler.org/documentation/quartz-2.x/tutorials/tutorial-lesson-06
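For comparison, a trigger that fires every day at 00:00 needs a zero in the hours field as well (a sketch reusing the bean above; note the Quartz field order is seconds, minutes, hours, day-of-month, month, day-of-week):
<bean id="myDailyTrigger" class="org.springframework.scheduling.quartz.CronTriggerBean">
    <property name="jobDetail" ref="myJob"/>
    <!-- fires at 00:00:00 every day -->
    <property name="cronExpression" value="0 0 0 * * ?"/>
</bean>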
I am trying to configure Nutch for multi-threaded crawling.
However, I am facing an issue: I am not able to run the crawl with multiple threads. I have modified nutch-site.xml to use 25 threads, but I can still see only 1 thread running.
<property>
<name>fetcher.threads.fetch</name>
<value>25</value>
<description>The number of FetcherThreads the fetcher should use.
This also determines the maximum number of requests that are
made at once (each FetcherThread handles one connection).</description>
</property>
<property>
<name>fetcher.threads.per.host</name>
<value>25</value>
<description>This number is the maximum number of threads that
should be allowed to access a host at one time.</description>
</property>
I always get values like:
activeThreads=25, spinWaiting=24, fetchQueues.totalSize=some value.
What is the meaning of this? Can you please explain what the issue is and how I can solve it?
I would highly appreciate your help.
Thanks,
Sumit
I think your issue is related to a known bug with the new Nutch fetcher. See NUTCH-721.
You can try using OldFetcher (if you have Nutch 1.0) to see if that solves your problem.
-- Ken