I have a Spring Batch application that uses Azure SQL Server as its backend, and I am using Hibernate to update the database.
I am reading the data from a CSV file using FlatFileItemReader and writing it into Azure SQL Server using an ItemWriter, as shown below:
public class StoreWriter implements ItemWriter<List<Store>> {
Logger logger = Logger.getLogger(StoreWriter.class);
private HibernateItemWriter<Store> hibernateItemWriter;
public StoreWriter(HibernateItemWriter<Store> hibernateItemWriter) {
this.hibernateItemWriter = hibernateItemWriter;
}
@Override
public void write(List<? extends List<Store>> items) throws Exception {
for (List<Store> stores : items) {
hibernateItemWriter.write(stores);
}
logger.info(String.format("Store Processing Completed %s", new LocalDateTime()));
}
}
Below is my Hibernate configuration
<bean id="transactionManager" class="org.springframework.orm.hibernate5.HibernateTransactionManager" lazy-init="true">
<property name="sessionFactory" ref="sessionFactory" />
</bean>
<tx:annotation-driven transaction-manager="transactionManager"/>
<bean id="hibernateProperties" class="org.springframework.beans.factory.config.PropertiesFactoryBean">
<property name="properties">
<props>
<prop key="hibernate.dialect">org.hibernate.dialect.SQLServer2012Dialect</prop>
<prop key="hibernate.show_sql">false</prop>
<prop key="hibernate.format_sql">false</prop>
<!-- <prop key="hibernate.hbm2ddl.auto">update</prop> -->
</props>
</property>
</bean>
<bean class="org.springframework.batch.core.scope.StepScope" />
<!-- DATA SOURCE -->
<bean id="demoDataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
<property name="driverClassName" value="com.microsoft.sqlserver.jdbc.SQLServerDriver" />
<property name="url" value="jdbc:sqlserver://demo.database.windows.net:1433;database=sqldb;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;" />
<property name="username" value="user1" />
<property name="password" value="p#ssword1" />
</bean>
What I observe is that it processes only 360 records per minute. Is there a way to improve the performance?
Here are the Hibernate stats:
For 1 record
For 3 records
I have a module, s3-puller, which pulls files from AWS S3. In production I am facing an issue when I try to create a stream. On a local single node it works fine, and a local cluster with 3 nodes and 1 admin node also works fine.
Below is my application context
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:int="http://www.springframework.org/schema/integration"
xmlns:int-aws="http://www.springframework.org/schema/integration/aws"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration.xsd
http://www.springframework.org/schema/integration/aws http://www.springframework.org/schema/integration/aws/spring-integration-aws-1.0.xsd">
<int:poller fixed-delay="${fixed-delay}" default="true"/>
<bean id="credentials" class="org.springframework.integration.aws.core.BasicAWSCredentials">
<property name="accessKey" value="${accessKey}"/>
<property name="secretKey" value="${secretKey}"/>
</bean>
<bean
class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
<property name="location">
<value>dms-aws-s3-nonprod.properties</value>
</property>
</bean>
<bean id="clientConfiguration" class="com.amazonaws.ClientConfiguration">
<property name="proxyHost" value="${proxyHost}"/>
<property name="proxyPort" value="${proxyPort}"/>
<property name="preemptiveBasicProxyAuth" value="false"/>
</bean>
<bean id="s3Operations" class="org.springframework.integration.aws.s3.core.CustomC1AmazonS3Operations">
<constructor-arg index="0" ref="credentials"/>
<constructor-arg index="1" ref="clientConfiguration"/>
<property name="awsEndpoint" value="s3.amazonaws.com"/>
<property name="temporaryDirectory" value="${temporaryDirectory}"/>
<property name="awsSecurityKey" value="${awsSecurityKey}"/>
</bean>
<!-- aws-endpoint="https://s3.amazonaws.com" -->
<int-aws:s3-inbound-channel-adapter aws-endpoint="s3.amazonaws.com"
bucket="${bucket}"
s3-operations="s3Operations"
credentials-ref="credentials"
file-name-wildcard="${file-name-wildcard}"
remote-directory="${remote-directory}"
channel="splitChannel"
local-directory="${local-directory}"
accept-sub-folders="false"
delete-source-files="true"
archive-bucket="${archive-bucket}"
archive-directory="${archive-directory}">
</int-aws:s3-inbound-channel-adapter>
<int:splitter input-channel="splitChannel" output-channel="output"
expression="T(org.apache.commons.io.FileUtils).lineIterator(payload)"/>
<int:channel id="output"/>
My Application.java:
package com.capitalone.api.dms.main;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.builder.SpringApplicationBuilder;
import org.springframework.context.annotation.ImportResource;
@SpringBootApplication
@ImportResource("classpath:config/applicationContext.xml")
public class Application {
public static void main(String[] args) throws Exception {
new SpringApplicationBuilder(Application.class)
.web(false)
.showBanner(false)
.properties("security.basic.enabled=false")
.run(args);
}
}
I am getting the exception below when I try to create a basic stream:
module upload --file aws.jar --name aws-s3-options --type source
stream create feedTest91 --definition "aws-s3-options | log" --deploy
I get the following exception:
DeploymentStatus{state=failed,error(s)=org.springframework.beans.factory.BeanDefinitionStoreException: Invalid bean definition with name 'objectNameProperties' defined in null: Could not resolve placeholder 'xd.module.sequence' in string value "${xd.module.sequence}"; nested exception is java.lang.IllegalArgumentException: Could not resolve placeholder 'xd.module.sequence' in string value "${xd.module.sequence}" at org.springframework.beans.factory.config.PlaceholderConfigurerSupport.doProcessProperties(PlaceholderConfigurerSupport.java:211) at org.springframework.beans.factory.config.PropertyPlaceholderConfigurer.processProperties(PropertyPlaceholderConfigurer.java:222) at org.springframework.beans.factory.config.PropertyResourceConfigurer.postProcessBeanFactory(PropertyResourceConfigurer.java:86) at org.springframework.context.support.PostProcessorRegistrationDelegate.invokeBeanFactoryPostProcessors(PostProcessorRegistrationDelegate.java:265)
From the source code, I see that the property is used in XD's JMX MBean exporter file, which is loaded by the Java file below:
https://github.com/spring-projects/spring-xd/blob/6923ee8705bd9c2c58ad73120724b8b87c5ba37d/spring-xd-dirt/src/main/resources/META-INF/spring-xd/plugins/jmx/mbean-exporters.xml
https://github.com/spring-projects/spring-xd/blob/e9ce8e897774722c1e61038817ebd55c5cf0befc/spring-xd-dirt/src/main/java/org/springframework/xd/dirt/plugins/MBeanExportingPlugin.java
Solution:
I am planning to inject them from my S3 module. Is this the right way to do it? Please let me know what the values should be.
<context:mbean-export />
<int-jmx:mbean-export object-naming-strategy="moduleObjectNamingStrategy" />
<util:properties id="objectNameProperties">
<prop key="group">${xd.group.name}</prop>
<prop key="label">${xd.module.label}</prop>
<prop key="type">${xd.module.type}</prop>
<prop key="sequence">${xd.module.sequence}</prop>
</util:properties>
<bean id="moduleObjectNamingStrategy"
class="org.springframework.xd.dirt.module.jmx.ModuleObjectNamingStrategy">
<constructor-arg value="xd.${xd.stream.name:${xd.job.name:}}" />
<constructor-arg ref="objectNameProperties" />
</bean>
That property should be automatically set up by the ModuleInfoPlugin.
This is the second time someone has said that property is missing somehow.
I have opened a JIRA Issue.
I have a Spring Integration SFTP flow which I load as a child context within my overall application context. This is based on the dynamic FTP SI example. My integration flow has nothing about Reactor or streams in it. It is a simple flow with one direct channel connected to an sftp-outbound-gateway to transfer files to an SFTP server. I can even run unit tests and the flow works fine (it is able to transfer files), but when I run an integration test which loads the full parent application and then initializes the child context with this SFTP flow loaded in it, it throws an error for not being able to find the reactor/StringUtils class.
The reason seems to be that spring-integration-sftp loads the Reactor jars as transitive dependencies, but since my parent application already has a different version of Reactor on the classpath, I excluded reactor-core from the Spring Integration dependency. If I don't exclude reactor-core from spring-integration, there are version conflicts, so I would like to keep it excluded.
reactorVersion = 2.0.0.M2
compile("io.projectreactor:reactor-core:$reactorVersion")
compile "io.projectreactor.spring:reactor-spring-context:$reactorVersion"
compile("org.springframework.integration:spring-integration-sftp") {
exclude module: "reactor-core"
}
Initializing the SI flow
context = new ClassPathXmlApplicationContext(new String[] { "classpath:adapters/"
+ sink.getConfigurationFile() }, false);
setEnvironment(context, sink);
context.setParent(parentContext);
context.refresh();
context.registerShutdownHook();
The error when I ran the integration test:
org.springframework.beans.factory.BeanDefinitionStoreException: Unexpected exception parsing XML document from class path resource [adapters/sftp.xml]; nested exception is java.lang.NoClassDefFoundError: reactor/util/StringUtils
at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadBeanDefinitions(XmlBeanDefinitionReader.java:414)
at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:336)
at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:304)
at org.springframework.beans.factory.support.AbstractBeanDefinitionReader.loadBeanDefinitions(AbstractBeanDefinitionReader.java:181)
Finally the SI flow
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:context="http://www.springframework.org/schema/context"
xmlns:task="http://www.springframework.org/schema/task" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:int="http://www.springframework.org/schema/integration"
xmlns:int-sftp="http://www.springframework.org/schema/integration/sftp"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
http://www.springframework.org/schema/task http://www.springframework.org/schema/task/spring-task.xsd
http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration.xsd
http://www.springframework.org/schema/integration/sftp http://www.springframework.org/schema/integration/sftp/spring-integration-sftp.xsd">
<import resource="common.xml" />
<bean id="sftpSessionFactory"
class="org.springframework.integration.file.remote.session.CachingSessionFactory">
<constructor-arg ref="defaultSftpSessionFactory" />
</bean>
<bean id="defaultSftpSessionFactory"
class="org.springframework.integration.sftp.session.DefaultSftpSessionFactory">
<property name="host" value="${sink.host}" />
<property name="port" value="22" />
<property name="privateKey" value="${sink.private.key}" />
<property name="privateKeyPassphrase" value="${sink.private.key.phrase}" />
<property name="user" value="${sink.user}" />
<property name="password" value="${sink.pass}" />
</bean>
<int:channel id="input" />
<int-sftp:outbound-channel-adapter
auto-startup="true" session-factory="sftpSessionFactory" channel="input"
remote-directory="${sink.path}" remote-filename-generator-expression="headers['remote_file_name']">
<int-sftp:request-handler-advice-chain>
<bean
class="org.springframework.integration.handler.advice.ExpressionEvaluatingRequestHandlerAdvice">
<property name="onSuccessExpression" value="payload" />
<property name="successChannel" ref="successChannel" />
<property name="onFailureExpression" value="payload" />
<property name="failureChannel" ref="failureChannel" />
<property name="trapException" value="true" />
</bean>
</int-sftp:request-handler-advice-chain>
</int-sftp:outbound-channel-adapter>
<int:channel id="successChannel" />
<int:service-activator input-channel="successChannel"
ref="completionHandler" method="handle" />
<int:channel id="failureChannel" />
<int:service-activator input-channel="failureChannel"
ref="failureHandler" method="handle" />
Updating to add my reactor configuration
@Configuration
@EnableReactor
public class ReactorConfiguration {
static {
Environment.initializeIfEmpty().assignErrorJournal();
}
@Bean
public EventBus eventBus() {
return EventBus.config().env(Environment.get()).dispatcher(Environment.SHARED).get();
}
@Bean
public IdGenerator randomUUIDGenerator() {
return new IdGenerator() {
@Override
public UUID generateId() {
return UUIDUtils.random();
}
};
}
}
Gradle currently doesn't do a great job excluding dependencies of transitive dependencies.
It's not spring-integration-sftp that pulls in reactor, it's spring-integration-core.
Try explicitly excluding via that.
We have removed the hard transitive reactor dependency in SI 4.2 (but it's not released yet).
The Spring team has created a Gradle plugin that might help.
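When chasing a NoClassDefFoundError like the one above, it can help to check from inside the failing application whether the class is visible at all and which jar supplies it. The following is only a diagnostic sketch in plain Java; the class name ClasspathCheck and its methods are hypothetical helpers, not part of Spring Integration, Reactor, or Gradle.

```java
// Diagnostic sketch: report whether a class is visible and where it was loaded from.
// ClasspathCheck is a hypothetical helper written for this illustration.
final class ClasspathCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the jar or directory the class came from, or a short marker when unavailable. */
    static String locationOf(String className) {
        try {
            Class<?> type = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            java.security.CodeSource source = type.getProtectionDomain().getCodeSource();
            // JDK classes usually have no code source, hence the "unknown" fallback
            return source == null ? "unknown" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }
}
```

Calling ClasspathCheck.locationOf("reactor.util.StringUtils") in the integration test would show whether any Reactor jar on the test classpath still contains that class, which is a quick way to confirm that the exclusion removed the only jar that provided it.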
Is there a possibility to override the Tomcat 7 connectionTimeout property already set in /tomcat/conf/server.xml? I mean setting a property in my application-context.xml file like this:
<bean id="dataSourceC3p0" class="com.mchange.v2.c3p0.ComboPooledDataSource" destroy-method="close">
<property name="driverClass" value="${jdbc.driverClassName}" />
<property name="jdbcUrl" value="${jdbc.url}" />
<property name="user" value="${jdbc.username}" />
<property name="password" value="${jdbc.password}" />
<property name="maxPoolSize" value="20" />
<property name="maxStatements" value="0" />
<property name="minPoolSize" value="5" />
<property name="numHelperThreads" value="5"/>
<property name="connectionTimeout" value="200000"/>
</bean>
However, the last line throws an error:
org.springframework.beans.NotWritablePropertyException: Invalid property 'connectionTimeout'
All the other properties are fine when I comment out the last property.
Does NotWritablePropertyException just tell me that there is no other way to set this value?
Thanks in advance
The property name is wrong.
Look at this doc:
http://www.databaseskill.com/4369778/
<!-- When the connection pool is exhausted, the time a client calling getConnection()
waits to acquire a new connection before a SQLException is thrown;
if set to 0, wait indefinitely. Unit: milliseconds. Default: 0 -->
<property name="checkoutTimeout">100</property>
Hope that helps
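To illustrate why the container rejects an unknown property name: XML property injection needs a matching public setter on the bean class, so a name without a setter cannot be written. The sketch below checks this with plain reflection. PoolSettings is a hypothetical stand-in written for this example (it is not the real c3p0 class), and hasSetter is a simplified approximation of what the container does, not Spring's actual implementation.

```java
import java.lang.reflect.Method;

// Sketch: a property is only "writable" if the bean class has a public setter for it.
final class WritablePropertyCheck {

    /** Hypothetical stand-in for a pool class; it exposes checkoutTimeout only. */
    static final class PoolSettings {
        private int checkoutTimeout;

        public int getCheckoutTimeout() {
            return checkoutTimeout;
        }

        public void setCheckoutTimeout(int millis) {
            this.checkoutTimeout = millis;
        }
    }

    /** Returns true if the type declares or inherits a public one-argument setter for the property. */
    static boolean hasSetter(Class<?> type, String property) {
        String setter = "set" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        for (Method method : type.getMethods()) {
            if (method.getName().equals(setter) && method.getParameterCount() == 1) {
                return true;
            }
        }
        return false;
    }
}
```

With this stand-in, hasSetter(PoolSettings.class, "checkoutTimeout") is true and hasSetter(PoolSettings.class, "connectionTimeout") is false, which mirrors the NotWritablePropertyException in the question.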
In a Spring Batch job I am trying to read a CSV file, assign each row to a separate thread, and process it. I tried to achieve this using a TaskExecutor, but all the threads pick up the same row at the same time. I also tried to implement it using a Partitioner, and the same thing happens there. Please see my configuration XML below.
Step Description
<step id="Step2">
<tasklet task-executor="taskExecutor">
<chunk reader="reader" processor="processor" writer="writer" commit-interval="1" skip-limit="1">
</chunk>
</tasklet>
</step>
<bean id="reader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="file:cvs/user.csv" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<!-- split it -->
<property name="lineTokenizer">
<bean
class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="names" value="userid,customerId,ssoId,flag1,flag2" />
</bean>
</property>
<property name="fieldSetMapper">
<!-- map to an object -->
<bean
class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="user" />
</bean>
</property>
</bean>
</property>
</bean>
<bean id="taskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor">
<property name="concurrencyLimit" value="4"/>
</bean>
I have tried different types of task executor, but they all behave the same way. How can I assign each row to a separate thread?
FlatFileItemReader is not thread-safe. In your example you can try splitting the CSV file into smaller CSV files and then use a MultiResourcePartitioner to process each of them. This can be done in 2 steps: one for splitting the original file (into, say, 10 smaller files) and another for processing the split files. This way you won't have any issues, since each file will be processed by one thread.
Example:
<batch:job id="csvsplitandprocess">
<batch:step id="step1" next="step2master">
<batch:tasklet>
<batch:chunk reader="largecsvreader" writer="csvwriter" commit-interval="500">
</batch:chunk>
</batch:tasklet>
</batch:step>
<batch:step id="step2master">
<partition step="step2" partitioner="partitioner">
<handler grid-size="10" task-executor="taskExecutor"/>
</partition>
</batch:step>
</batch:job>
<batch:step id="step2">
<batch:tasklet>
<batch:chunk reader="smallcsvreader" writer="writer" commit-interval="100">
</batch:chunk>
</batch:tasklet>
</batch:step>
<bean id="taskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
<property name="corePoolSize" value="10" />
<property name="maxPoolSize" value="10" />
</bean>
<bean id="partitioner"
class="org.springframework.batch.core.partition.support.MultiResourcePartitioner">
<property name="resources" value="file:cvs/extracted/*.csv" />
</bean>
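As a rough illustration of what the splitting stage does, here is a plain-Java sketch that cuts one large file into numbered part files. It is not the Spring Batch reader and writer configuration from the example; CsvSplitter, the part-N.csv naming, and the UTF-8 assumption are all choices made for this sketch. It splits on physical lines, so quoted CSV fields containing line breaks would need a real CSV parser.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Sketch: split a large line-based file into smaller files, one per partition.
final class CsvSplitter {

    /** Writes at most linesPerFile lines into each part file and returns the parts in order. */
    static List<Path> split(Path source, Path targetDir, int linesPerFile) throws IOException {
        if (linesPerFile < 1) {
            throw new IllegalArgumentException("linesPerFile must be >= 1");
        }
        Files.createDirectories(targetDir);
        List<Path> parts = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            BufferedWriter writer = null;
            try {
                String line;
                int linesInPart = 0;
                while ((line = reader.readLine()) != null) {
                    // Start a new part file at the beginning and whenever the current one is full
                    if (writer == null || linesInPart == linesPerFile) {
                        if (writer != null) {
                            writer.close();
                        }
                        Path part = targetDir.resolve("part-" + (parts.size() + 1) + ".csv");
                        parts.add(part);
                        writer = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                        linesInPart = 0;
                    }
                    writer.write(line);
                    writer.newLine();
                    linesInPart++;
                }
            } finally {
                if (writer != null) {
                    writer.close();
                }
            }
        }
        return parts;
    }
}
```

The resulting directory can then be matched by a resource pattern such as the file:cvs/extracted/*.csv value used for the partitioner above.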
An alternative to partitioning might be a custom thread-safe reader that creates a thread for each line, but partitioning is probably your best choice.
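One simple reading of the thread-safe reader idea is a synchronized wrapper: several worker threads share one underlying reader, and the lock ensures each line is handed to exactly one of them. The sketch below uses plain java.io and java.util.concurrent rather than Spring Batch types; SynchronizedLineReader and readConcurrently are hypothetical names, and a fixed pool of workers is used instead of one thread per line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: a reader whose read() is synchronized so concurrent callers never get the same line.
final class SynchronizedLineReader {
    private final BufferedReader delegate;

    SynchronizedLineReader(BufferedReader delegate) {
        this.delegate = delegate;
    }

    /** Returns the next line, or null at end of input; only one thread reads at a time. */
    synchronized String read() throws IOException {
        return delegate.readLine();
    }

    /** Drains the text with the given number of workers and returns every line delivered. */
    static List<String> readConcurrently(String text, int threads) throws InterruptedException {
        SynchronizedLineReader reader =
                new SynchronizedLineReader(new BufferedReader(new StringReader(text)));
        List<String> delivered = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                try {
                    String line;
                    while ((line = reader.read()) != null) {
                        delivered.add(line); // stands in for per-row processing
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return delivered;
    }
}
```

The order in which lines are processed is not deterministic with this approach, and restartability becomes harder because the reader position no longer maps to a single thread's progress.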
Your problem is that your reader is not step-scoped.
That means all your threads share the same input stream (the resource file).
To have each thread process one row, you need to:
Be sure that all threads read the file from the start to the end (each thread should open the stream and close it for each execution context).
Have the partitioner inject the start and end position into each execution context.
Have your reader read the file using these positions.
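The points above can be sketched in plain Java as a reader that opens its own stream, skips to a start position, and stops after an end position. This is only an illustration of the idea, not the MouvementItemReader used in this answer; RangeLineReader is a hypothetical name, and positions are taken to be 1-based line numbers matching the fromId and toId values.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Sketch: each partition opens its own stream and keeps only lines fromId..toId.
final class RangeLineReader {

    /** Returns lines fromId..toId (1-based, inclusive); the source is closed when done. */
    static List<String> readRange(Reader source, int fromId, int toId) throws IOException {
        if (fromId < 1 || toId < fromId) {
            throw new IllegalArgumentException("invalid range " + fromId + ".." + toId);
        }
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            int lineNumber = 0;
            // Stop as soon as the end position is reached or the input runs out
            while (lineNumber < toId && (line = reader.readLine()) != null) {
                lineNumber++;
                if (lineNumber >= fromId) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }
}
```

Because every call gets its own Reader, no state is shared between threads. The cost is that each partition rereads the file from the beginning up to its start position.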
I wrote some code and this is the output:
Code of the com.test.partitioner.RangePartitioner class:
public Map<String, ExecutionContext> partition(int gridSize) {
Map<String, ExecutionContext> result = new HashMap<String, ExecutionContext>();
int range = 1;
int fromId = 1;
int toId = range;
for (int i = 1; i <= gridSize; i++) {
ExecutionContext value = new ExecutionContext();
log.debug("\nStarting : Thread" + i);
log.debug("fromId : " + fromId);
log.debug("toId : " + toId);
value.putInt("fromId", fromId);
value.putInt("toId", toId);
// give each thread a name, thread 1,2,3
value.putString("name", "Thread" + i);
result.put("partition" + i, value);
fromId = toId + 1;
toId += range;
}
return result;
}
--> Look at the console output:
Starting : Thread1
fromId : 1
toId : 1
Starting : Thread2
fromId : 2
toId : 2
Starting : Thread3
fromId : 3
toId : 3
Starting : Thread4
fromId : 4
toId : 4
Starting : Thread5
fromId : 5
toId : 5
Starting : Thread6
fromId : 6
toId : 6
Starting : Thread7
fromId : 7
toId : 7
Starting : Thread8
fromId : 8
toId : 8
Starting : Thread9
fromId : 9
toId : 9
Starting : Thread10
fromId : 10
toId : 10
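The output above uses a fixed range of one line per partition. If the file has more lines than partitions, the ranges could be computed from the total line count instead. The following is a minimal sketch under that assumption; LineRanges is a hypothetical helper, and the total number of lines is assumed to be known before partitioning.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: divide lines 1..totalLines into at most gridSize contiguous ranges.
final class LineRanges {

    /** Returns {from, to} pairs (1-based, inclusive) whose sizes differ by at most one. */
    static List<int[]> split(int totalLines, int gridSize) {
        if (totalLines < 0 || gridSize < 1) {
            throw new IllegalArgumentException("totalLines must be >= 0 and gridSize >= 1");
        }
        List<int[]> ranges = new ArrayList<>();
        int base = totalLines / gridSize;      // minimum size of every range
        int remainder = totalLines % gridSize; // the first 'remainder' ranges get one extra line
        int from = 1;
        for (int i = 0; i < gridSize && from <= totalLines; i++) {
            int size = base + (i < remainder ? 1 : 0);
            int to = from + size - 1;
            ranges.add(new int[] {from, to});
            from = to + 1;
        }
        return ranges;
    }
}
```

With 10 lines and a grid size of 10 this yields the same ten one-line ranges as the output above; each pair would go into an ExecutionContext as fromId and toId.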
Look at the configuration below:
http://www.springframework.org/schema/batch/spring-batch-2.2.xsd
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-3.2.xsd">
<import resource="../config/context.xml" />
<import resource="../config/database.xml" />
<bean id="mouvement" class="com.test.model.Mouvement" scope="prototype" />
<bean id="itemProcessor" class="com.test.processor.CustomItemProcessor" scope="step">
<property name="threadName" value="#{stepExecutionContext[name]}" />
</bean>
<bean id="xmlItemWriter" class="com.test.writer.ItemWriter" />
<batch:job id="mouvementImport" xmlns:batch="http://www.springframework.org/schema/batch">
<batch:listeners>
<batch:listener ref="myAppJobExecutionListener" />
</batch:listeners>
<batch:step id="masterStep">
<batch:partition step="slave" partitioner="rangePartitioner">
<batch:handler grid-size="10" task-executor="taskExecutor" />
</batch:partition>
</batch:step>
</batch:job>
<bean id="rangePartitioner" class="com.test.partitioner.RangePartitioner" />
<bean id="taskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor" />
<batch:step id="slave">
<batch:tasklet>
<batch:listeners>
<batch:listener ref="stepExecutionListener" />
</batch:listeners>
<batch:chunk reader="mouvementReader" writer="xmlItemWriter" processor="itemProcessor" commit-interval="1">
</batch:chunk>
</batch:tasklet>
</batch:step>
<bean id="stepExecutionListener" class="com.test.listener.step.StepExecutionListenerCtxInjecter" scope="step" />
<bean id="myAppJobExecutionListener" class="com.test.listener.job.MyAppJobExecutionListener" />
<bean id="mouvementReaderParent" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<property name="resource" value="classpath:XXXXX/XXXXXXXX.csv" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value="|" />
<property name="names"
value="id,numen,prenom,grade,anneeScolaire,academieOrigin,academieArrivee,codeUsi,specialiteEmploiType,natureSupport,dateEffet,modaliteAffectation" />
</bean>
</property>
<property name="fieldSetMapper">
<bean class="com.test.mapper.MouvementFieldSetMapper" />
</property>
</bean>
</property>
</bean>
<!-- <bean id="itemReader" scope="step" autowire-candidate="false" parent="mouvementReaderParent">-->
<!-- <property name="resource" value="#{stepExecutionContext[fileName]}" />-->
<!-- </bean>-->
<bean id="mouvementReader" class="com.test.reader.MouvementItemReader" scope="step">
<property name="delegate" ref="mouvementReaderParent" />
<property name="parameterValues">
<map>
<entry key="fromId" value="#{stepExecutionContext[fromId]}" />
<entry key="toId" value="#{stepExecutionContext[toId]}" />
</map>
</property>
</bean>
<!-- <bean id="xmlItemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">-->
<!-- <property name="resource" value="file:xml/outputs/Mouvements.xml" />-->
<!-- <property name="marshaller" ref="reportMarshaller" />-->
<!-- <property name="rootTagName" value="Mouvement" />-->
<!-- </bean>-->
<bean id="reportMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>com.test.model.Mouvement</value>
</list>
</property>
</bean>
TODO: Change my reader to one that reads by position (start and end position), for example with the Scanner class in Java.
Hope this helps.
You can split your input file into many files, then use a Partitioner and load the small files with threads, but on error you must restart the whole job after cleaning the DB.
<batch:job id="transformJob">
<batch:step id="deleteDir" next="cleanDB">
<batch:tasklet ref="fileDeletingTasklet" />
</batch:step>
<batch:step id="cleanDB" next="split">
<batch:tasklet ref="countThreadTasklet" />
</batch:step>
<batch:step id="split" next="partitionerMasterImporter">
<batch:tasklet>
<batch:chunk reader="largeCSVReader" writer="smallCSVWriter" commit-interval="#{jobExecutionContext['chunk.count']}" />
</batch:tasklet>
</batch:step>
<batch:step id="partitionerMasterImporter" next="partitionerMasterExporter">
<partition step="importChunked" partitioner="filePartitioner">
<handler grid-size="10" task-executor="taskExecutor" />
</partition>
</batch:step>
Full working example code (on GitHub)
Hope this helps.