Is it possible to combine OutOfMemoryHandler and MapStore in Hazelcast - hazelcast

I know it is possible to persist every entry of a Hazelcast Map to a backing data store by implementing the MapStore interface, but we don't want that in our application. We only want that, if the application is in danger of an OutOfMemoryError, it evicts a certain percentage of the data in memory (following the LRU principle), stores the evicted entries to the data store during eviction, and loads them again if an evicted key is requested later.
I know that the OutOfMemoryHandler interface exists to manage OutOfMemory situations, that a 25% eviction policy exists, and that MapStore does too.
What I don't know is whether I can combine them all.
Thx for answers...

You can use IMap.putTransient() to put your entries into the IMap without triggering MapStore.store().
IMap.evict() (or automatic eviction) will not trigger a MapStore removal, i.e. it will just remove the entry from the IMap.
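A minimal sketch of those two calls, assuming the Hazelcast 3.x IMap API (map name, key and value are placeholders):

    import java.util.concurrent.TimeUnit;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IMap;

    public class TransientPutExample {
        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();
            IMap<String, String> map = hz.getMap("cache");

            // Stored in memory only; MapStore.store() is not called for this entry.
            map.putTransient("key-1", "value-1", 0, TimeUnit.SECONDS);  // ttl of 0 means no expiry

            // Removes the entry from memory only; a configured MapLoader.load() would be
            // used to bring it back on a later map.get("key-1").
            map.evict("key-1");

            hz.shutdown();
        }
    }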
However, I don't suggest this approach, because once the JVM gets an OutOfMemoryError it can become unstable, so applying such business logic may not be possible. Also, the OutOfMemoryHandler may not be called even if the JVM throws an OutOfMemoryError, because the error may be thrown from an external (user) thread and Hazelcast may not be informed about it.

Related

None value for paho_mqtt::create_options::CreateOptionsBuilder persistence

The documentation for the CreateOptionsBuilder persistence method indicates that setting this value to None will improve performance but result in a less reliable system.
Could someone please elaborate on this? Under which circumstances should I consider setting this to None?
The Eclipse Paho MQTT Rust Client Library is a "safe wrapper around the Paho C Library". The persistence options are mapped to values accepted by the C library with None becoming MQTTCLIENT_PERSISTENCE_NONE. The docs for the C client provide a more detailed explanation of the options:
persistence_type The type of persistence to be used by the client:
MQTTCLIENT_PERSISTENCE_NONE: Use in-memory persistence. If the device or system on which the client is running fails or is switched off, the current state of any in-flight messages is lost and some messages may not be delivered even at QoS1 and QoS2.
MQTTCLIENT_PERSISTENCE_DEFAULT: Use the default (file system-based) persistence mechanism. Status about in-flight messages is held in persistent storage and provides some protection against message loss in the case of unexpected failure.
MQTTCLIENT_PERSISTENCE_USER: Use an application-specific persistence implementation. Using this type of persistence gives control of the persistence mechanism to the application. The application has to implement the MQTTClient_persistence interface.
The upshot is that calling persistence(None) means that messages will be held in memory rather than being written to disk (assuming QOS1/2). This has the potential to improve performance (writing to disk can be expensive) but, because the info is only stored in memory, messages may be lost if your application shuts down without completing delivery.
A quick example might help (simplifying things a little). Let's say you publish a message with QoS=1 and a network issue means the broker does not receive it. When the connection is re-established (a failed delivery will generally mean the connection drops), the client resends the message, because it has not processed an acknowledgment from the broker. With the default (file system) persistence, the message will be retransmitted even if the failure was a power outage affecting the server your app was running on (obviously this only happens once power is restored and your app restarts); with persistence(None) that message would be lost.
The appropriate setting is going to depend on your needs, and other options may have an impact (e.g. if Clean Start/CleanSession is true, there is unlikely to be any benefit to persisting to disk).
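For illustration only, here is the same trade-off expressed with the Eclipse Paho Java client rather than the Rust crate the question is about: passing a MemoryPersistence instance plays the role of persistence(None), while a file-based store plays the role of the default. The broker URL, client ids and storage directory are placeholder assumptions.

    import org.eclipse.paho.client.mqttv3.MqttClient;
    import org.eclipse.paho.client.mqttv3.MqttException;
    import org.eclipse.paho.client.mqttv3.persist.MemoryPersistence;
    import org.eclipse.paho.client.mqttv3.persist.MqttDefaultFilePersistence;

    public class PersistenceChoice {
        public static void main(String[] args) throws MqttException {
            // In-memory persistence: faster, but in-flight QoS 1/2 state is lost if the process dies.
            MqttClient inMemory = new MqttClient("tcp://broker.example:1883",
                    "client-mem", new MemoryPersistence());

            // File-based persistence: in-flight QoS 1/2 state survives a restart of the application.
            MqttClient onDisk = new MqttClient("tcp://broker.example:1883",
                    "client-disk", new MqttDefaultFilePersistence("/tmp/mqtt-persist"));
        }
    }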
Set it to None when you don't care whether all messages are received, e.g. when you are using only QoS 0 messages.

Is there any limit on the size of data that can be stored in Spring Integration message header

I have a requirement where I need to store a list of records in a Spring Integration message header so that it can be used later in the flow. This list can grow to up to 100,000 records.
I would like to know whether there is a limit on the size of data that can be stored in a Spring Integration header.
Also, is there any alternative approach that can be taken to fulfil this requirement, for example using a claim check?
Thanks.
If you don't do any persistence in between and don't propagate the message over messaging middleware (Kafka, JMS, etc.), then all the data stays in memory and you are limited only by what you have dedicated to the JVM heap. So, if keeping those huge objects in memory is the problem, then a Claim Check is indeed a good pattern to follow. This way the whole message is serialized to an external MessageStore and can be restored later via the returned claim-check key.
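A minimal sketch of the claim-check idea with the Spring Integration Java DSL, assuming hypothetical channel names and a MessageStore bean (for example a JdbcMessageStore) defined elsewhere:

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.integration.dsl.IntegrationFlow;
    import org.springframework.integration.dsl.IntegrationFlows;
    import org.springframework.integration.store.MessageStore;

    @Configuration
    public class ClaimCheckConfig {

        @Bean
        public IntegrationFlow claimCheckFlow(MessageStore messageStore) {
            return IntegrationFlows.from("recordsInChannel")
                    .claimCheckIn(messageStore)   // the 100,000-record payload is stored; the message now carries only a key
                    .channel("lightweightProcessingChannel")
                    .claimCheckOut(messageStore)  // the original payload is restored where it is actually needed
                    .channel("recordsOutChannel")
                    .get();
        }
    }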

Spring Integration Feed Inbound Channel Adapter duplicate entries

I am using Spring Integration to consume RSS feeds using its inbound channel adapter and writing the feeds to a database table.
To prevent duplicate entries when the process is stopped/started, I have enabled the PropertiesPersistingMetadataStore. As a secondary measure, on the database table, I also have a unique constraint across the feed id/feed entry link columns.
This seems to be working fine, but I have noticed on some restarts (not all the time) that I am getting database exceptions where it tries to insert the same RSS feed item again.
Under what conditions would I be getting these duplicate errors, and is there any way I can get around them?
The PropertiesPersistingMetadataStore only persists its state on a normal application shutdown (when the bean is destroy()ed by the application context).
However, it implements Flushable, so one option is to call flush() on it in your flow after persisting.
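A small sketch of that explicit-flush option, assuming the metadata store is injected as a Flushable and a hypothetical channel carries each feed entry after it has been written to the database:

    import java.io.Flushable;
    import java.io.IOException;
    import org.springframework.integration.annotation.ServiceActivator;
    import org.springframework.messaging.Message;

    public class MetadataFlusher {

        private final Flushable metadataStore;   // the PropertiesPersistingMetadataStore bean

        public MetadataFlusher(Flushable metadataStore) {
            this.metadataStore = metadataStore;
        }

        @ServiceActivator(inputChannel = "feedPersistedChannel")
        public void flush(Message<?> message) throws IOException {
            metadataStore.flush();   // writes the in-memory "already processed" keys to the properties file
        }
    }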
Alternatively, you could use transaction synchronization to flush the store after the database transaction commits, with the after-commit expression #metadataStore.flush().
Or, you could use a more robust persistent store, such as Redis, which persists on each update.

Preventing duplicate entries in Multi Instance Application Environment

I am writing an application that works with the Facebook APIs (share, like, etc.). I am keeping all the shared objects from my application in a database, and I do not want to share the same object if it has already been shared.
Considering that I will deploy the application on different servers, there could be a case where both instances try to insert the same object into the table.
How can I manage this concurrency problem without blocking the applications completely? I mean, two threads trying to insert the same object must synchronize, but they should not block a third thread that is inserting a totally different object.
If there is a way to derive the primary key of a data entry from the data itself, the database will resolve such concurrency issues by itself: the second insert will fail with a primary-key constraint violation. Perhaps the data supplied by the Facebook API already has some unique ID?
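A minimal JDBC sketch of that approach, assuming a primary-key or unique constraint on the Facebook object id (table and column names are made up, and the exact exception subclass depends on the JDBC driver):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.sql.SQLIntegrityConstraintViolationException;

    public class ShareDao {

        public boolean insertShareIfAbsent(Connection connection, String fbObjectId) throws SQLException {
            String sql = "INSERT INTO shared_objects (fb_object_id) VALUES (?)";
            try (PreparedStatement ps = connection.prepareStatement(sql)) {
                ps.setString(1, fbObjectId);
                ps.executeUpdate();
                return true;                          // this instance won the race
            } catch (SQLIntegrityConstraintViolationException duplicate) {
                return false;                         // already inserted by another instance or thread
            }
        }
    }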
Or, you can consider a distributed lock solution, for example one based on Hazelcast or a similar data grid. This would allow record state to be shared between different JVMs, making it possible to avoid unneeded INSERTs.
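A rough sketch of the lock idea using the Hazelcast 3.x API, where the lock name is derived from the object id so unrelated objects never block each other (the DB helper methods are placeholders):

    import java.util.concurrent.TimeUnit;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.ILock;

    public class ShareCoordinator {

        private final HazelcastInstance hazelcast;

        public ShareCoordinator(HazelcastInstance hazelcast) {
            this.hazelcast = hazelcast;
        }

        public void shareOnce(String fbObjectId) throws InterruptedException {
            // One cluster-wide lock per object id: other objects use other locks and are not blocked.
            ILock lock = hazelcast.getLock("share-" + fbObjectId);
            if (lock.tryLock(2, TimeUnit.SECONDS)) {
                try {
                    if (!alreadyShared(fbObjectId)) {   // placeholder for the DB lookup
                        insertShare(fbObjectId);        // placeholder for the DB insert
                    }
                } finally {
                    lock.unlock();
                }
            }
        }

        private boolean alreadyShared(String id) { return false; }  // placeholder
        private void insertShare(String id) { }                     // placeholder
    }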

Hazelcast - OperationTimeoutException

I am using Hazelcast version 3.3.1.
I have a 9-node cluster running on AWS using c3.2xlarge servers.
I am using a distributed executor service and a distributed map.
Distributed executor service uses a single thread.
The distributed map is configured with no replication and no near cache, and stores about 1 million objects of 1-2 KB each, using the Kryo serializer.
My use case is as follows:
All 9 nodes constantly execute a synchronous remote operation on the distributed executor service, generating about 20k hits per second (~2k per node).
Invocations are executed using the Hazelcast API com.hazelcast.core.IExecutorService#executeOnKeyOwner.
Each operation accesses the distributed map on the node owning the partition, does some calculation using the stored object, and stores the object back into the map (for that I use the get and set APIs of the IMap object).
Every once in a while Hazelcast encounters a timeout exception such as:
com.hazelcast.core.OperationTimeoutException: No response for 120000 ms. Aborting invocation! BasicInvocationFuture{invocation=BasicInvocation{ serviceName='hz:impl:mapService', op=GetOperation{}, partitionId=212, replicaIndex=0, tryCount=250, tryPauseMillis=500, invokeCount=1, callTimeout=60000, target=Address[172.31.44.2]:5701, backupsExpected=0, backupsCompleted=0}, response=null, done=false} No response has been received! backups-expected:0 backups-completed: 0
In some cases I see map partitions start to migrate, which makes things even worse: nodes constantly leave and re-join the cluster, and the only way I can overcome the problem is by restarting the entire cluster.
I am wondering what may cause Hazelcast to block a map-get operation for 120 seconds?
I am pretty sure it's not network related since other services on the same servers operate just fine.
Also note that the servers are mostly idle (~70%).
Any feedback on my use case would be highly appreciated.
Why don't you make use of an EntryProcessor? It is also sent to the machine owning the partition, and the load, modify, store cycle is done automatically and atomically, so there are no race problems. It will probably outperform the current approach significantly, since there is less remoting involved.
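A minimal sketch of what that could look like with the Hazelcast 3.x API (the value type and the update logic are placeholders for whatever calculation the question actually performs):

    import java.util.Map;
    import com.hazelcast.map.AbstractEntryProcessor;

    public class RecalculateProcessor extends AbstractEntryProcessor<String, Long> {

        @Override
        public Object process(Map.Entry<String, Long> entry) {
            Long current = entry.getValue();
            entry.setValue(current == null ? 1L : current + 1);  // read-modify-write on the owning partition thread
            return null;
        }
    }

    // Caller side, replacing executorService.executeOnKeyOwner(task, key) plus IMap.get()/set():
    //   map.executeOnKey(key, new RecalculateProcessor());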
The fact that map.get is not returning for 120 seconds is indeed very confusing. If you switch to Hazelcast 3.5, we added some logging/debugging support for this: the slow operation detector (executing side) and the slow invocation detector (caller side) should give you some insight into what is happening.
Do you see any Health monitor logs being printed?
