Hazelcast Jet: How to prevent 'event dropped'?

I am getting 'Event dropped, late by 5051 ms.'
How should I build my pipeline so that all events are processed, regardless of how late they arrive?
I have tried several approaches. Basically, what I tried was:
Without windowing, where I didn't get late events; but this is not workable for me, because with parallel execution the values in the sink get overridden instead of merged.
Therefore I used windowing, which solved the overriding problem but caused late events.
Next, I tried windowing without timestamps, which threw an exception saying that a timestamp must be defined.
Basically I have two problems here: 1) how to merge new events into the existing ones in the sink, 2) how to do so without dropping or overriding events.
Code:
WindowDefinition customWindow = WindowDefinition.sliding(60000, 30000);
customWindow.setEarlyResultsPeriod(1000);
StreamStage<Map.Entry<...>> updatedState = p
        .drawFrom(<source>)
        .withIngestionTimestamps()
        .groupingKey(...)
        .window(customWindow)
        .aggregate(AggregateOperations.toCollection(ArrayList::new))
        .mapUsingIMap(...)
        .sink(...);
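For reference, one common way to tolerate late arrivals is to extract the event's own timestamp instead of using ingestion time, and give the watermark a generous allowed lag, so events arriving up to that far behind the newest one seen so far are still assigned to their window instead of being dropped. The following is only a sketch against the Jet 3.x API; the Event type, its accessors, the 10-second lag, and the sink map name are illustrative assumptions, not taken from the question.

import java.util.ArrayList;
import com.hazelcast.jet.aggregate.AggregateOperations;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.StreamSource;
import com.hazelcast.jet.pipeline.WindowDefinition;

// Hypothetical payload standing in for the question's events.
class Event implements java.io.Serializable {
    String key;
    long timestamp;
    String getKey() { return key; }
    long getTimestamp() { return timestamp; }
}

Pipeline buildPipeline(StreamSource<Event> source) {
    Pipeline p = Pipeline.create();
    p.drawFrom(source)
     // Use the event's own time and tolerate events arriving up to 10 s
     // behind the newest event seen so far (the "allowed lag").
     .withTimestamps(Event::getTimestamp, 10_000)
     .groupingKey(Event::getKey)
     .window(WindowDefinition.sliding(60_000, 30_000))
     .aggregate(AggregateOperations.toCollection(ArrayList::new))
     .drainTo(Sinks.map("aggregated"));
    return p;
}

The trade-off is latency: a window's result becomes final only after the watermark passes its end plus the allowed lag, and events later than the lag are still dropped. For the merge-in-sink problem, note that Jet also provides Sinks.mapWithMerging(...), which combines a new value with the existing map value instead of overwriting it.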

Related

Tf package extrapolation to the past

When I look up a transform like below,
(trans, rot) = self.listener.lookupTransform(
        self.robot_link, target_link, rospy.Time(0))
an error pops up: Lookup would require extrapolation into the past.
However, when I change this line to
(trans, rot) = self.listener.lookupTransform(
        self.robot_link, target_link, rospy.Time.now())
I get: Error processing request: Lookup would require extrapolation into the future.
How can I solve this problem?
You can wait for the transformation to become available for a specific duration as described in the tf and Time ROS tutorial:
now = rospy.Time.now()
listener.waitForTransform(self.robot_link, target_link, now, rospy.Duration(4.0))
(trans, rot) = listener.lookupTransform(self.robot_link, target_link, now)
This blocks for up to the specified duration, waiting until the transform you want to query becomes available.
You might also want to handle the case where the transform is not published within the wait duration, as sketched below.
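A minimal sketch of such handling, assuming it runs inside the same class as the question's code (the 4-second timeout and the warning text are arbitrary choices):

import rospy
import tf

try:
    now = rospy.Time.now()
    self.listener.waitForTransform(self.robot_link, target_link, now, rospy.Duration(4.0))
    (trans, rot) = self.listener.lookupTransform(self.robot_link, target_link, now)
except tf.Exception as e:
    # tf.Exception is raised when the transform is still unavailable after the
    # wait duration (lookup/connectivity/extrapolation errors derive from it).
    rospy.logwarn("Transform %s -> %s unavailable: %s", self.robot_link, target_link, e)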

Hazelcast EntryListener: Why is set() returning an oldValue different than null?

I joined a new project about a year ago and have been doing some minor tasks with Hazelcast, including the creation of MapStores and EntryListeners for our IMaps.
From the beginning I have been aware of the difference between set() and put(), with the latter carrying the cost of deserializing and returning the old value. That is why I would use put() when we needed to access the oldValue in the EntryListeners, and set() otherwise.
However, over the past weeks my team started to report occurrences where map insertions done with set() would trigger entryUpdated with a populated oldValue, which "breaks" some of our current logic.
Now I don't know whether this was some recent change released by Hazelcast (we are currently using version 3.12.1) or whether I have been doing something wrong from the beginning. Shouldn't I expect set() to always trigger the listener with a null oldValue?
There is always an old value; the writer and the listener are independently configurable as to whether they receive it.
On the map, the writer can use V IMap.put(K, V) to receive the old value,
or void IMap.set(K, V) to skip receiving it.
On a listener, use include-value=true to receive the old and new values, and include-value=false not to. On an insert, the old value will be null; on a delete, the new value will be null.
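A minimal sketch against the Hazelcast 3.12 API (the map name, types, and values are made up) showing that an update done with set() still delivers the old value to a listener registered with include-value = true:

import com.hazelcast.core.EntryEvent;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;
import com.hazelcast.map.listener.EntryUpdatedListener;

public class SetOldValueDemo {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, String> map = hz.getMap("demo");
        // Second argument = include-value: deliver old and new values in events.
        map.addEntryListener((EntryUpdatedListener<String, String>) event ->
                System.out.println("old=" + event.getOldValue() + " new=" + event.getValue()),
                true);
        map.set("k", "v1");  // insert: fires entryAdded, oldValue would be null
        map.set("k", "v2");  // update: fires entryUpdated with oldValue "v1",
                             // even though set() itself returns void
    }
}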

Right way to delete and then reindex ES documents

I have a python3 script that attempts to reindex certain documents in an existing Elasticsearch index. I can't update the documents because I'm changing from an autogenerated id to an explicitly assigned id.
I'm currently attempting to do this by deleting existing documents using delete_by_query and then indexing once the delete is complete:
self.elasticsearch.delete_by_query(
    index='%s_*' % base_index_name,
    doc_type='type_a',
    conflicts='proceed',
    wait_for_completion=True,
    refresh=True,
    body={}
)
However, the index is massive, and so the delete can take several hours to finish. I'm currently getting a ReadTimeoutError, which is causing the script to crash:
WARNING:elasticsearch:Connection <Urllib3HttpConnection: X> has failed for 2 times in a row, putting on 120 second timeout.
WARNING:elasticsearch:POST X:9200/base_index_name_*/type_a/_delete_by_query?conflicts=proceed&wait_for_completion=true&refresh=true [status:N/A request:140.117s]
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='X', port=9200): Read timed out. (read timeout=140)
Is my approach correct? If so, how can I make my script wait long enough for the delete_by_query to complete? There are two timeout parameters that can be passed to delete_by_query - search_timeout and timeout - but search_timeout defaults to no timeout (which I think is what I want), and timeout doesn't seem to do what I want. Is there some other parameter I can pass to delete_by_query to make it wait as long as the delete takes to finish? Or do I need to make my script wait some other way?
Or is there some better way to do this using the Elasticsearch API?
You should set wait_for_completion to False. In that case you'll get the task details back and will be able to track the task's progress using the corresponding Tasks API: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#docs-delete-by-query-task-api
Just to expand, in code form, on what Random explained, for an ES/Python newbie like me:
from elasticsearch import Elasticsearch

ES = Elasticsearch(['http://localhost:9200'])
query = {'query': {'match_all': {}}}
# With wait_for_completion=False the call returns immediately with a task handle.
response = ES.delete_by_query(index='index_name', doc_type='sample_doc',
                              wait_for_completion=False, body=query, ignore=[400, 404])
task_id = response['task']
response_task = ES.tasks.get(task_id=task_id)  # check on the task
is_completed = response_task['completed']  # True once the task has finished
One can write a custom check that polls at some interval in a while loop until the task is completed, for example (continuing the snippet above; the 10-second interval is an arbitrary choice):
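import time

# Poll the Tasks API until the delete-by-query task reports completion.
while not ES.tasks.get(task_id=task_id)['completed']:
    time.sleep(10)  # wait between checks so we don't hammer the cluster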
I used Python 3.x and Elasticsearch 6.x.
You can use the request_timeout global param. This overrides the connection's timeout setting.
For example -
es.delete_by_query(index=<index_name>, body=<query>, request_timeout=300)
Or set it at the connection level, for example:
es = Elasticsearch(**get_es_connection_parms(), timeout=60)

Missing ETW EventSource table in Azure SDK 2.6

I'm trying to use ETW for logging with several custom EventSource classes in Azure SDK 2.6.
When testing locally with the compute/storage emulator, three of my custom WADMyEventXYZ tables show up; however, the final expected table "WADMyDataSets" never seems to be created. How should I determine what is causing this problem? I see no errors from the compute emulator when the debugger is attached and stepping through the code in the debugger shows that WriteEntry on the EventSource is definitely called. The other tables show up in SchemasTable in the developer storage account, but there is no entry there for WADMyDataSets.
I exported WADDiagnosticInfrastructureLogsTable into CSV, examined it in Excel, and saw the following messages that reference "MyDataSets":
Validating table MyDataSets; DiskMB:451; RequiredQuota:451 RetentionSeconds:7776000 Pri:2 MinQuotaMB:0 RunningTotal:3757
Table does not exist
table C:\Users\Caleb\AppData\Local\dftmp\Resources\b316f531-f673-4db3-ac1c-e4649e289871\WAD0104\Tables\MyDataSets does not exist, CreationDisposition = 4
Table MyDataSets does not exist, will create a new one
Delaying the creation of table MyDataSets until the schema is known
Later on:
Converted EventSource provider name "MyDataSets" to {74a2b9c9-0bd8-547f-6cad-453da47055be}
Matched task with query id MyDataSetsQuery and regex ^MyDataSets$ to source table MyDataSets
Registering query MyDataSetsQuery_MyDataSets_XTableWadAccount:
Adding standard PkRk (MA) fields to 'MyDataSetsQuery_MyDataSets'
Successfully compiled the query 'MyDataSetsQuery_MyDataSets'
Added task MyDataSetsQuery_MyDataSets_WADMyDataSets_PT1M_XTableWadAccount from MyDataSets - Partitions:-1 Pri:normal TSPolicy:start StoreType:Central Repeat:2147483647 Timeout:3600s Deadline:300s DelayRange:0.00
Later on:
No checkpoint found for task MyDataSetsQuery_MyDataSets_WADMyDataSets_PT1M_XTableWadAccount after time 2015-05-13T00:44:21.000Z; retry time out is 3600 seconds
First scheduled task for MyDataSetsQuery_MyDataSets_WADMyDataSets_PT1M_XTableWadAccount is at 2015-05-13T01:44:00.000Z (plus a delay of 20s)
Later on:
Increasing query delay of task MyDataSetsQuery_MyDataSets_WADMyDataSets_PT1M_XTableWadAccount from 20 to 40 seconds to introduce randomness to the upload schedule
Later on:
Starting scheduled task MyDataSetsQuery_MyDataSets_WADMyDataSets_PT1M_XTableWadAccount from 2015-05-13T01:43:00.000Z to 2015-05-13T01:44:00.000Z; query delay 40 seconds
Table C:\Users\Caleb\AppData\Local\dftmp\Resources\b316f531-f673-4db3-ac1c-e4649e289871\WAD0104\Tables\MyDataSets does not exist
Ending scheduled task MyDataSetsQuery_MyDataSets_WADMyDataSets_PT1M_XTableWadAccount from 2015-05-13T01:43:00.000Z to 2015-05-13T01:44:00.000Z in 1ms
Update
The EventSource in question had one event on it:
[Event(1)]
public void DataSetLoaded(string traceActivityId, string userId, string reportCode, long timeToLoadMs)
Removing the fourth parameter "timeToLoadMs" resulted in the WAD event table showing up as expected. I tried changing the last parameter to a string, and it failed to show up again. Is there a documented limit on the number of parameters for an event method? I'm pretty sure I've seen samples that have four parameters.
I upgraded my web project to .NET 4.5.1 and now the WAD table shows up as expected (I had been running on just .NET 4.5 before this).
It would seem that there might be a bug with having 4 parameters on an EventSource event when using .NET 4.5.0.
As a side note, with 4.5.1, I now have the System.Diagnostics.Tracing.EventSource.SetCurrentThreadActivityId method which will let me get rid of manually including the CorrelationManager.ActivityId in my event output.
The video at https://channel9.msdn.com/Series/ConnectOn-Demand/240, released today, says there is now full support for Azure table logging for ETW EventSources.

BPEL process stopped on a second receive

I am new to writing BPEL. I have implemented the simple process below:
receive1
|
|
invoke1
|
|
receive2
|
|
invoke2
The problem is that the process runs correctly up to "receive2", but when I invoke, via soapUI, the operation associated with "receive2", nothing happens. I have read other posts about BPEL, but none of them matches this question. Below are the real activities involved (I omitted the Assign ones).
<bpel:receive name="receiveInput" partnerLink="client"
portType="tns:HealthMobility"
operation="initiate" variable="input"
createInstance="yes"/>
<bpel:invoke name="getTreatmentOptions"
partnerLink="treatmentProviderPL" operation="getTreatmentOptions"
inputVariable="getTreatmentOptionsReq" outputVariable="getTreatmentOptionsResp">
</bpel:invoke>
<bpel:receive name="bookMobility" partnerLink="client" operation="bookMobility"
variable="bookMobilityReq" portType="tns:HealthMobility"/>
<bpel:invoke name="getTripOptions" partnerLink="mobilityMultiProvidersPL"
operation="getTripOptions" inputVariable="getTripOptionsReq"
outputVariable="getTripOptionsResp"></bpel:invoke>
I tried to debug by simply deleting the receive and statically initializing the input variable required by the getTripOptions invoke. In that case everything works fine, so it necessarily means that the process keeps waiting on the receive even when I invoke bookMobility via soapUI. My question is: why? Am I missing something?
Thanks
You need to define a correlation set for the second receive. Each message sent to the operation connected to the first receive activity creates a new process instance, which means you may have multiple instances running in parallel. When these instances have reached the second receive, they are waiting for the second message, but in your example there is no means to distinguish which message is targeted at which process instance. I assume your BPEL engine also logged that it could not route the message to a target instance.
To solve this problem, you need to find an identifier in the payload of the message and initialize a correlation set with that value. Then, when the same correlation set is used on the second receive, every message containing the same identifier will be routed to that particular process instance. For further information about correlation sets, I recommend reading the BPEL primer, section 4.2.4.
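For illustration, a sketch of how the two receives could share a correlation set. The property patientId and its WSDL propertyAlias (not shown) are assumptions; in practice, pick an identifier that occurs in the payload of both messages:

<bpel:correlationSets>
    <bpel:correlationSet name="PatientCS" properties="tns:patientId"/>
</bpel:correlationSets>

<!-- The first receive initiates the set from the inbound message. -->
<bpel:receive name="receiveInput" partnerLink="client" portType="tns:HealthMobility"
        operation="initiate" variable="input" createInstance="yes">
    <bpel:correlations>
        <bpel:correlation set="PatientCS" initiate="yes"/>
    </bpel:correlations>
</bpel:receive>

<!-- The second receive uses the already-initiated set, so the engine can
     route the bookMobility message to the right running instance. -->
<bpel:receive name="bookMobility" partnerLink="client" operation="bookMobility"
        portType="tns:HealthMobility" variable="bookMobilityReq">
    <bpel:correlations>
        <bpel:correlation set="PatientCS" initiate="no"/>
    </bpel:correlations>
</bpel:receive>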
