Gridgain: how to redistribute partitioned-mode cache when one node brought down - gridgain

I am new to gridgain and we are doing a POC using gridgain. We did some simple examples using partitioned cache, it works well however we found that when we bring a node down, cache from that node was gone. so my questions is: if we keep using patitioned mode, is there any way to re-distributed cache when a node (or several nodes) is undeployed. if not, is there any good way to do it? Thanks!
configuration Code:
<context:component-scan base-package="com.test" />
<bean id="hostGrid" class="org.gridgain.grid.GridSpringBean">
<property name="configuration">
<bean class="org.gridgain.grid.GridConfiguration">
<property name="localHost" value="127.0.0.1"/>
<property name="peerClassLoadingEnabled" value="false"/>
<property name="marshaller">
<bean class="org.gridgain.grid.marshaller.optimized.GridOptimizedMarshaller">
<property name="requireSerializable" value="false"/>
</bean>
</property
<property name="cacheConfiguration">
<list>
<bean class="org.gridgain.grid.cache.GridCacheConfiguration">
<property name="name" value="CACHE"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="store" >
<bean class="com.test.CacheJdbcPOCStore"></bean>
</property>
</bean>
</list>
</property>
</bean>
</property>
</bean>
We deployed the same war (using above configuration) to 3 tomcat 7 server. we did not specify number of backup so it should be 1 by default.
follow up
I solved this problem by putting backups= 1 in configuration. looks like previously it did not create backup copy. however it should make 1 copy since it is by default. also, when i tried to bring down 2 nodes at one time, i saw part of cache was gone, so I set backups=2 and found no cache loss this time. so it looks like if in a very bad case where all nodes except for the main node crash, we need to have # of nodes -1 backups to prevent data loss. but if I do so then it is just like replicated mode and replicated mode has less restriction on query and transactions. So my question is : if we need to take the advantage of parallel computation and at mean time want to prevent data loss when nodes crash what is the best practice?
Thanks!

Number of backups is 0 by default. The documentation has been fixed.
You are right about REPLICATED mode. If you are worried about any data loss, the REPLICATED mode is the only way to guarantee it. The disadvantage here is that writes will get slower, as all the nodes in the cluster will be updated. The advantage is that the data is available on every node, so you can easily access it from your computations without worrying which node to send them to.

Related

When should I use List of Data objects versus a ListData object?

Assume that I have AnimalData and AnimalListData:
<bean class="com.chang.data.AnimalData">
<property name="name" type="java.lang.String"/>
<property name="weight" type="java.lang.Integer"/>
</bean>
<bean class="com.chang.data.AnimalListData">
<property name="animals" type="java.util.List<com.chang.data.AnimalData>"/>
</bean>
If I create ZooData with a list of AnimalData, is it a better to use a List of Data objects (Approach #1) or a ListData object (Approach #2)? When should I use Approach #1 or Approach #2? How about for WsDTOs? Is the case the same?
Approach #1:
<bean class="com.chang.data.ZooData">
<property name="animals" type="java.util.List<com.chang.data.AnimalData>"/>
</bean>
Approach #2:
<bean class="com.chang.data.ZooData">
<property name="animals" type="com.chang.data.AnimalListData"/>
</bean>
It doesn't make a great deal of difference in my opinion, so some of this is personal preference. The thing I would think about is: in approach 2 you have to do zooData.getAnimals().getAnimals(), whereas in approach 1 it is zooData.getAnimals(). The same applies for the DTOs point. So approach 1 seems more intuitive to me .....
Approach #2.
Think from a real-world mapping perspective. Do you actually have a real-world object AnimalListData?AnimalData, on the other hand, is actually a real-world object that can directly be mapped to get detail of the animal. The list is just an abstract construct that you have created to simplify your programming otherwise it does not exist.

Spring integration multithreading

Referring to my earlier question at URL - Spring integration multithreading requirement - I think I may have figured out the root cause of the issue.
My requirement in brief -
Poll the database after a fixed delay of 1 sec and then publish very limited data to Tibco EMS queue. Now from this EMS queue I have to do the following tasks all in multithreaded fashion :- i) consume the messages, ii) fetch the full data now from the database and iii) converting this data into json format.
My design -
`<int:channel id="dbchannel"/>
<int-jdbc:inbound-channel-adapter id="dbchanneladapter"
channel="dbchannel" data-source="datasource"
query="${selectquery}" update="${updatequery}"
max-rows-per-poll="1000">
<int:poller id="dbchanneladapterpoller"
fixed-delay="1000">
<int:transactional transaction-manager="transactionmanager" />
</int:poller>
</int-jdbc:inbound-channel-adapter>
<int:service-activator input-channel="dbchannel"
output-channel="publishchannel" ref="jdbcmessagehandler" method="handleJdbcMessage" />
<bean id="jdbcmessagehandler" class="com.citigroup.handler.JdbcMessageHandler" />
<int:publish-subscribe-channel id="publishchannel"/>
<int-jms:outbound-channel-adapter id="publishchanneladapter"
channel="publishchannel" jms-template="publishrealtimefeedinternaljmstemplate" />
<int:channel id="subscribechannel"/>
<int-jms:message-driven-channel-adapter
id="subscribechanneladapter" destination="subscriberealtimeinternalqueue"
connection-factory="authenticationconnectionfactory" channel="subscribechannel"
concurrent-consumers="5" max-concurrent-consumers="5" />
<int:service-activator input-channel="subscribechannel"
ref="subscribemessagehandler" method="logJMSMessage" />
<bean id="subscribemessagehandler" class="com.citigroup.handler.SubscribeJMSMessageHandler" />
</beans>
<bean id="authenticationconnectionfactory"
class="org.springframework.jms.connection.UserCredentialsConnectionFactoryAdapter">
<property name="targetConnectionFactory" ref="connectionFactory" />
<property name="username" value="test" />
<property name="password" value="test123" />
</bean>
<bean id="connectionFactory" class="org.springframework.jndi.JndiObjectFactoryBean">
<property name="jndiTemplate">
<ref bean="jndiTemplate" />
</property>
<property name="jndiName" value="app.jndi.testCF" />
</bean>
<bean id="subscriberealtimeinternalqueue" class="org.springframework.jndi.JndiObjectFactoryBean">
<property name="jndiTemplate">
<ref bean="jndiTemplate" />
</property>
<property name="jndiName"
value="app.queue.testQueue" />
</bean>
<bean id="jndiTemplate" class="org.springframework.jndi.JndiTemplate">
<property name="environment">
<props>
<prop key="java.naming.factory.initial">com.tibco.tibjms.naming.TibjmsInitialContextFactory
</prop>
<prop key="java.naming.provider.url">tibjmsnaming://test01d.nam.nsroot.net:7222</prop>
</props>
</property>
</bean>`
Issue -
Using message-driven-channel with concurrent consumers value set to 5. However, it looks like just one consumer thread (container-2) is created and is picking up the messages from EMS queue. Please find below the log4j log -
16 Aug 2018 11:31:12,077 INFO SubscribeJMSMessageHandler [subscribechanneladapter.container-2][]: Total count of records read from Queue at this moment is 387
record#1:: [ID=7694066395] record#2:: [ID=7694066423] .. .. .. record#387:: [ID=6147457333]
Probable root cause here -
May be its the first step in the configuration where I am polling the database to fetch the data after a fixed-delay that's causing this multithreading issue. Referring to the logs above, my assumption here is since the number of records fetched is 387 and all these are bundled into a List object (List> message), it is being considered as just 1 message/payload instead of 387 different messages and that's why just one thread/container/consumer is picking up this bundled message. Reason for this assumption is the logs below -
GenericMessage [payload=[{"ID":7694066395},{"ID":7694066423},{"ID":6147457333}], headers={json__ContentTypeId__=class org.springframework.util.LinkedCaseInsensitiveMap, jms_redelivered=false, json__TypeId__=class java.util.ArrayList, jms_destination=Queue[app.queue.testQueue], id=e034ba73-7781-b62c-0307-170099263068, priority=4, jms_timestamp=1534820792064, contentType=application/json, jms_messageId=ID:test.21415B667C051:40C149C0, timestamp=1534820792481}]
Question -
Is my understanding of the root cause correct? If yes then what can be done to treat these 387 messages as individual messages (and not one List object of messages) and publish them one by one without impacting the transaction management??
I had discussed this issue with https://stackoverflow.com/users/2756547/artem-bilan in my earlier post on stackoverflow and I had to check this design by replacing Tibco EMS with ActiveMQ. However, ActiveMQ infrastructure is is still being analysed by our architecture team and so can't be used till its approved.
Oh! Now I see what is your problem. The int-jdbc:inbound-channel-Adapter indeed returns a list of records it could select from the DB. And this whole list is sent as a single message to the JMS. That’s the reason how you see only one thread in the consumer side: there is just only one message to get from the queue.
If you would like to have separate messages for each pulled record, you need to consider to use a <splitter> in between JDBC polling operation and sending to JMS.

Spring Integration with RedisLockRegistry example

We are implementing a flow where a <int-sftp:inbound-streaming-channel-adapter/> polls a directory for a file and when found it passes the stream to a service activator.
The issue is we will have multiple instances of the app running and we would like to lock the process so that only one instance can pick up the file.
Looking at the documentation, Redis Lock Registry looks to be the solution, is there an example of this being used in xml?
All I can find is a few references to it and the source code for it.
http://docs.spring.io/spring-integration/reference/html/redis.html point 24.1
Added info:
Ive added the RedisMetaDataStore and SftpSimplePatternFileListFilter. It does work but it does have one oddity, when sftpInboundAdapter is activated by the poller it adds an entry for each file in the metadatastore. Say there are 10 files, there would be 10 entries in the datastore, but it does not process all 10 files in "1 go", only 1 file is processed per poll from the adapter, which would be fine, but in a multi instance environment if the server which picked up the files went down after processing 5 files, another server doesn't seem to pick up the remaining 5 files unless the files are "touched".
Is the behaviour of picking up 1 file per poll correct or should it process all valid files during one poll.
Below is my XML
<int:channel id="sftpInbound"/> <!-- To Java -->
<int:channel id="sftpOutbound"/>
<int:channel id="sftpStreamTransformer"/>
<int-sftp:inbound-streaming-channel-adapter id="sftpInboundAdapter"
channel="sftpInbound"
session-factory="sftpSessionFactory"
filter="compositeFilter"
remote-file-separator="/"
remote-directory="${sftp.directory}">
<int:poller cron="${sftp.cron}"/>
</int-sftp:inbound-streaming-channel-adapter>
<int:stream-transformer input-channel="sftpStreamTransformer" output-channel="sftpOutbound"/>
<bean id="compositeFilter"
class="org.springframework.integration.file.filters.CompositeFileListFilter">
<constructor-arg>
<list>
<bean
class="org.springframework.integration.sftp.filters.SftpSimplePatternFileListFilter">
<constructor-arg value="Receipt*.txt" />
</bean>
<bean id="SftpPersistentAcceptOnceFileListFilter" class="org.springframework.integration.sftp.filters.SftpPersistentAcceptOnceFileListFilter">
<constructor-arg ref="metadataStore" />
<constructor-arg value="ReceiptLock_" />
</bean>
</list>
</constructor-arg>
</bean>
<bean id="redisConnectionFactory"
class="org.springframework.data.redis.connection.jedis.JedisConnectionFactory">
<property name="port" value="${redis.port}" />
<property name="password" value="${redis.password}" />
<property name="hostName" value="${redis.host}" />
</bean>
No; you need to use a SftpPersistentAcceptOnceFileListFilter (docs here) with a Redis (or some other) metadata store, not a lock registry.
EDIT
Regarding your comment below.
Yes, it's a known issue; in the next release we've added a max-fetch-size for exactly this reason - so the instances can each retrieve some of the files rather than the first instance grabbing them all.
(The inbound adapter works by first copying files found, that are not already in the store, to the local disk, and then emits them one at a time).
5.0 only available as a milestone right now M2 at the time of writing, but the current version and milestone repo can be found here; it won't be released for a few more months.
Another alternative would be to use outbound gateways - one to LS the files and one to GET individual files; your app would have to use the metadata store itself, though, to determine which file(s) can be fetched.

Infinispan 7.1 Cassandra cache-store

I have a very simple cache of type String, Long in Infinispan. I would like to persist this cache to cassandra.
I have installed Infinispan 7.1 Server and I have a cassandra instance running.
I've looked at http://infinispan.org/docs/cachestores/cassandra/ which lists two xml excerpts. Since I am completely new to Infinispan, I have no idea where to add the listed xml.
Is there an example Infinispan 7.1 server installation somewhere where this is setup and working?
EDIT 1:
I am able to use file based persistence by defining myCache as follows:
<subsystem xmlns="urn:infinispan:server:core:7.1" default-cache-container="clustered">
<cache-container name="clustered" default-cache="default" statistics="true">
<transport executor="infinispan-transport" lock-timeout="60000"/>
<distributed-cache name="myCache" mode="SYNC" start="EAGER">
<file-store
shared="false" preload="true"
fetch-state="true"
read-only="false"
purge="false"
path="${java.io.tmpdir}">
<write-behind flush-lock-timeout="15000" thread-pool-size="5" />
</file-store>
</distributed-cache>
</cache-container>
</subsystem>
The schema for the xml defines a "store" element that extends "custom-store".
<xs:element name="store" type="tns:custom-store">
Sooo... I guess I need to study its schema to see how I can create a Cassandra store.
I have a constant feeling I'm doing something wrong because I can find no examples of anyone using this xml...
EDIT 2
Replaced file-store with the following:
<store
name="cassandraStore"
class="org.infinispan.loaders.cassandra.CassandraCacheStore"
shared="true" preload="false" passivation="false"
fetch-state="true">
<property name="host">localhost</property>
<property name="keySpace">mykeyspace</property>
<property name="entryColumnFamily">mytable</property>
<property name="expirationColumnFamily">mytableExpiration</property>
<property name="sharedKeyspace">false</property>
<property name="readConsistencyLevel">ONE</property>
<property name="writeConsistencyLevel">ONE</property>
<property name="configurationPropertiesFile">cassandrapool.properties</property>
<property name="keyMapper">org.infinispan.loaders.keymappers.DefaultTwoWayKey2StringMapper</property>
</store>
When I start infinispan server I get
"JBAS010292: org.infinispan.loaders.cassandra.CassandraCacheStore is not a valid cache store"
Looking at http://mvnrepository.com/artifact/org.infinispan/infinispan-cachestore-cassandra I see the latest version is 6.0.0.Alpha1 from July of 2013
I think the conclusion here is: Cassandra is not supported by Infinispan.

Memory leak in Spring library : SimpleMessageStore

3 of the webservices that I am working on uses Springs, SimpleMessageStore for storing the messages. For some reason it is causing memory leak in production env and I am unable to reproduce it in the lower environments. I am new to spring integration and need help in understanding what might be causing this.
the spring config code looks like this:
<!-- MESSAGE STORES -->
<bean id="monitoringHeaderRequestMsgStore" class="org.springframework.integration.store.SimpleMessageStore"/>
<bean id="gbqHeaderRequestMsgStore" class="org.springframework.integration.store.SimpleMessageStore"/>
<bean id="bondAgreementResponseMsgStore" class="org.springframework.integration.store.SimpleMessageStore"/>
<bean id="bondWIthRulesRequestMsgStore" class="org.springframework.integration.store.SimpleMessageStore"/>
<bean id="ProcessVariableMessageStores" class="com.aviva.uklife.investment.impl.ProcessVariableMessageStores">
<property name="_monitoringHeaderRequestMsgStore" ref="monitoringHeaderRequestMsgStore"/>
<property name="_gbqHeaderRequestMsgStore" ref="gbqHeaderRequestMsgStore"/>
<property name="_bondWIthRulesRequestMsgStore" ref="bondWIthRulesRequestMsgStore"/>
<property name="_bondAgreementResponseMsgStore" ref="bondAgreementResponseMsgStore"/>
</bean>
<!-- Retrieve stored MonitoringHeaderRequest -->
<int:transformer expression="headers.get('#{T(.....Constants).MONITORING_HEADER_REQUEST_CLAIM_CHECK_ID}')"/>
<int:claim-check-out message-store="monitoringHeaderRequestMsgStore" remove-message="false"/>
<!-- Store HeaderRequest -->
<int:gateway request-channel="header-req-store-channel"/>
<!-- PROCESS VARIABLES STORAGE IN STORE CHANNELS WITH KEY OR CLAIMCHECK ID -->
<int:chain input-channel="monitoring-header-req-store-channel">
<int:claim-check-in message-store="monitoringHeaderRequestMsgStore"/>
<int:header-enricher>
<int:header name="#{T(....Constants).MONITORING_HEADER_REQUEST_CLAIM_CHECK_ID}" expression="payload"/>
</int:header-enricher>
<int:claim-check-out message-store="monitoringHeaderRequestMsgStore" remove-message="false"/>
</int:chain>
thank you
To be honest, it isn't recommended to use SimpleMessageStore in the production environment. That's because of memory-leak, as you noticed. If you don't clear the MessageStore periodically.
Right, there are might be some cases, when you need to keep messages in the MessageStore for the long time. So consider to replace SimpleMessageStore with some persistent MessageStore.
From other side we need to have more info on the matter to provide better help.
Maybe you just have several aggregators and don't use expire-groups-upon-completion = "true"...

Resources