How can I get information about which node is the primary node of a distributed queue and which is the backup node? - GridGain

I add some data into a distributed queue and I am wondering how I can find out which node is the primary node for the queue and which is the backup node.
Thanks,
Bill

A GridGain cache queue is distributed, i.e. different elements can be stored on different nodes. If your cache has backups, each element will be duplicated on two or more nodes, so there is no way to determine a single primary or backup node for a non-collocated queue.
If the queue is collocated, all its items will be stored on one node (this is useful if you have many small queues instead of one large queue). In this case you can get the primary and backup nodes for the queue by passing the queue name to the cache affinity, like this:
// Create collocated queue (3rd parameter is true).
GridCacheQueue<String> queue = cache.dataStructures().queue("MyQueue", 100, true, true);
// Get primary and backup nodes using cache affinity.
Iterator<GridNode> nodes = cache.affinity().mapKeyToPrimaryAndBackups("MyQueue").iterator();
// First element in collection is always the primary node.
GridNode primary = nodes.next();
// Other nodes in collection are backup nodes.
GridNode backup1 = nodes.next();
GridNode backup2 = nodes.next();
You don't see anything when iterating over the cache because queue elements are stored as internal entries, so they are accessible only via the GridCacheQueue API, not via the GridCache API. Here is an example:
// Create or get a queue.
GridCacheQueue<String> queue = cache.dataStructures().queue("MyQueue", 100, false, true);
for (String item : queue)
    System.out.println(item);

As far as I know, the distributed queue is based on a GridGain cache. However, when I run the following code I get an empty cache.
GridCache<Object, Object> cache = grid.cache("partitioned_tx");
GridCacheDataStructures dataStruct = cache.dataStructures();
GridCacheQueue<String> queue = dataStruct.queue("myQueueName", 0, false, true);

for (int i = 0; i < 20; i++) {
    queue.add("Value-" + i);
}

GridCacheAffinity<Object> affinity = cache.affinity();
int part;
Collection<GridNode> nodes;

for (Object key : cache.keySet()) {
    System.out.println("key=" + key.toString());
    part = affinity.partition(key);
    nodes = affinity.mapPartitionToPrimaryAndBackups(part);
    for (GridNode node : nodes) {
        System.out.println("key of " + key.toString() + " is primary: " + affinity.isPrimary(node, key));
        System.out.println("key of " + key.toString() + " is backup: " + affinity.isBackup(node, key));
    }
}

Related

Can't confirm any actors are being created

In Service Fabric I am trying to call an ActorService and get a list of all actors. I'm not getting any errors, but no actors are returned. It's always zero.
This is how I add actors:
ActorProxy.Create<IUserActor>(
    new ActorId(uniqueName),
    "fabric:/ECommerce/UserActorService");
And this is how I try to get a list of all actors:
var proxy = ActorServiceProxy.Create(new Uri("fabric:/ECommerce/UserActorService"), 0);
ContinuationToken continuationToken = null;
CancellationToken cancellationToken = new CancellationTokenSource().Token;
List<ActorInformation> activeActors = new List<ActorInformation>();

do
{
    PagedResult<ActorInformation> page = await proxy.GetActorsAsync(continuationToken, cancellationToken);
    activeActors.AddRange(page.Items.Where(x => x.IsActive));
    continuationToken = page.ContinuationToken;
}
while (continuationToken != null);
But no matter how many users I've added, the page object will always have zero items. What am I missing?
The second argument int in ActorServiceProxy.Create(Uri, int, string) is the partition key (you can find out more about actor partitioning here).
The issue here is that your code checks only one partition (partitionKey = 0).
So the solution is quite simple - you have to iterate over all partitions of your service. Here is an answer with a code sample to get the partitions and iterate over them.
UPDATE 2019.07.01
I didn't spot this the first time, but the reason you aren't getting any actors returned is that you aren't creating any actors - you are only creating proxies!
The reason for the confusion is that Service Fabric actors are virtual, i.e. from the user's point of view an actor always exists, but in reality Service Fabric manages the actor object's lifetime automatically, persisting and restoring its state as needed.
Here is a quote from the documentation:
An actor is automatically activated (causing an actor object to be constructed) the first time a message is sent to its actor ID. After some period of time, the actor object is garbage collected. In the future, using the actor ID again, causes a new actor object to be constructed. An actor's state outlives the object's lifetime when stored in the state manager.
In your example you never send any messages to the actors!
Here is a code example I wrote in Program.cs of a newly created Actor project:
// Please don't forget to replace "fabric:/Application16/Actor1ActorService" with your actor service name.
ActorRuntime.RegisterActorAsync<Actor1>(
    (context, actorType) => new ActorService(context, actorType)).GetAwaiter().GetResult();

var actor = ActorProxy.Create<IActor1>(
    ActorId.CreateRandom(),
    new Uri("fabric:/Application16/Actor1ActorService"));

_ = actor.GetCountAsync(default).GetAwaiter().GetResult();

ContinuationToken continuationToken = null;
var activeActors = new List<ActorInformation>();
var serviceName = new Uri("fabric:/Application16/Actor1ActorService");

using (var client = new FabricClient())
{
    var partitions = client.QueryManager.GetPartitionListAsync(serviceName).GetAwaiter().GetResult();

    foreach (var partition in partitions)
    {
        var pi = (Int64RangePartitionInformation)partition.PartitionInformation;
        var proxy = ActorServiceProxy.Create(new Uri("fabric:/Application16/Actor1ActorService"), pi.LowKey);
        var page = proxy.GetActorsAsync(continuationToken, default).GetAwaiter().GetResult();

        activeActors.AddRange(page.Items);
        continuationToken = page.ContinuationToken;
    }
}

Thread.Sleep(Timeout.Infinite);
Pay special attention to the line:
_ = actor.GetCountAsync(default).GetAwaiter().GetResult();
This is where the first message to the actor is sent.
Hope this helps.

How to get Azure EventHub Depth

My Event Hub ingests millions of messages every day. I'm processing those messages in an Azure Function and printing the offset and sequence number values in the logs.
public static async Task Run([EventHubTrigger("%EventHub%", Connection = "EventHubConnection", ConsumerGroup = "%EventHubConsumerGroup%")]EventData eventMessage,
[Inject]ITsfService tsfService, [Inject]ILog log)
{
log.Info($"PartitionKey {eventMessage.PartitionKey}, Offset {eventMessage.Offset} and SequenceNumber {eventMessage.SequenceNumber}");
}
Log output
PartitionKey , Offset 78048157161248 and SequenceNumber 442995283
Questions
Why is the PartitionKey value blank? I have 2 partitions in that Event Hub.
Is there any way to check the backlog? At some point in time I want to know how many messages my function still needs to process.
Yes, you can include the PartitionContext object as part of the signature, which will give you some additional information,
public static async Task Run([EventHubTrigger("HubName",
Connection = "EventHubConnectionStringSettingName",
ConsumerGroup = "Consumer-Group-If-Applicable")] EventData[] messageBatch, PartitionContext partitionContext, ILogger log)
Edit your host.json and set enableReceiverRuntimeMetric to true, e.g.
"version": "2.0",
"extensions": {
"eventHubs": {
"batchCheckpointFrequency": 100,
"eventProcessorOptions": {
"maxBatchSize": 256,
"prefetchCount": 512,
"enableReceiverRuntimeMetric": true
}
}
}
You now get access to RuntimeInformation on the PartitionContext, which has some information about the LastSequenceNumber, and your current message has its own sequence number, so you could use the difference between the two to calculate a metric, e.g. something like:
public class EventStreamBacklogTracing
{
    private static readonly Metric PartitionSequenceMetric =
        InsightsClient.Instance.GetMetric("PartitionSequenceDifference", "PartitionId", "ConsumerGroupName", "EventHubPath");

    public static void LogSequenceDifference(EventData message, PartitionContext context)
    {
        var messageSequence = message.SystemProperties.SequenceNumber;
        var lastEnqueuedSequence = context.RuntimeInformation.LastSequenceNumber;
        var sequenceDifference = lastEnqueuedSequence - messageSequence;

        PartitionSequenceMetric.TrackValue(sequenceDifference, context.PartitionId, context.ConsumerGroupName,
            context.EventHubPath);
    }
}
I wrote an article on Medium that goes into a bit more detail and shows how you might consume the data in Grafana:
https://medium.com/#dylanm_asos/azure-functions-event-hub-processing-8a3f39d2cd0f
PartitionKey value blank? I have 2 partitions in that EventHub
The partition key is not the same as the partition IDs. When you publish an event to Event Hubs, you can set the partition key. If that partition key is not set, it will be null when you go to consume the event.
The partition key is for events where you don't care which partition they end up in, only that events with the same key end up in the same partition.
An example would be if you had hundreds of IoT devices transmitting telemetry data. You don't care what partition these IoT devices publish their data to, as long as it always ends up in the same partition. You may set the partition key to the serial number of the IoT device.
When that device publishes its event data with that key, the Event Hubs service will calculate a hash for that partition key, map it to a specific Event Hub partition, and will route any events with that key to the same partition.
The documentation from "Event Hubs Features: Publishing an Event" depicts it pretty well.
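To make the routing behaviour concrete, here is a minimal publishing sketch. It uses the Azure Event Hubs Java client (azure-messaging-eventhubs) purely as an illustration, since the same option exists in the other SDKs; the connection string, hub name and key value are placeholders.
import com.azure.messaging.eventhubs.EventData;
import com.azure.messaging.eventhubs.EventDataBatch;
import com.azure.messaging.eventhubs.EventHubClientBuilder;
import com.azure.messaging.eventhubs.EventHubProducerClient;
import com.azure.messaging.eventhubs.models.CreateBatchOptions;

public class PartitionKeyPublishSketch {
    public static void main(String[] args) {
        // Placeholder connection string and hub name.
        EventHubProducerClient producer = new EventHubClientBuilder()
                .connectionString("<event-hubs-connection-string>", "<event-hub-name>")
                .buildProducerClient();

        // All events with the same partition key (here, a device serial number)
        // are hashed to the same partition by the Event Hubs service.
        CreateBatchOptions options = new CreateBatchOptions().setPartitionKey("device-serial-0001");
        EventDataBatch batch = producer.createBatch(options);
        batch.tryAdd(new EventData("telemetry payload"));

        producer.send(batch);
        producer.close();
    }
}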

How do I configure Hazelcast read-through Map when only part of the nodes are able to populate the Map data?

Let's say I have two types of Hazelcast nodes running in a cluster:
"Leader" nodes – these are able to load and populate the Hazelcast map M. Leaders will also update values in M from time to time (based on an external resource).
"Follower" nodes – these will need to read from M.
My intent is for Follower nodes to trigger loading missing elements into M (loading thus needs to be done on the Leader side).
Roughly, the steps to get an element from the map could look like this:
IMap m = hazelcastInstance.getMap("M");
if (!m.containsKey(k)) {
    if (iAmLeader()) {
        Object fresh = loadByKey(k); // loading from external resource
        return m.put(k, fresh);
    } else {
        makeSomeLeaderPopulateValueForKey(k);
    }
}
return m.get(k);
What approach could you suggest?
Notes
I want Followers to act as nodes, not just clients, because there are going to be far more Follower instances than Leaders and I would like them to participate in load distribution.
I could just build another service layer that would run only on Leader nodes and provide an interface to populate the map with requested keys. But that would mean adding an extra layer of communication and configuration, and I was hoping that the requirements stated above could be solved within a single Hazelcast cluster.
I think I may have found an answer in the form of MapLoader (EDIT: since originally posting, I have confirmed this is indeed the way to do it).
final Config config = new Config();
config.getMapConfig("MY_MAP_NAME").setMapStoreConfig(
    new MapStoreConfig().setImplementation(new MapLoader<KeyType, ValueType>() {
        @Override
        public ValueType load(final KeyType key) {
            // When a client asks for data for a key of type KeyType
            // that isn't already loaded, this method is invoked and
            // gives you a chance to load the value and return it.
            ValueType rv = ...;
            return rv;
        }

        @Override
        public Map<KeyType, ValueType> loadAll(final Collection<KeyType> keys) {
            // Similar to MapLoader#load(KeyType), except this is a
            // batched version of it for performance gains.
            // It gets called on first access to the cache, where
            // MapLoader#loadAllKeys() is called to get the keys
            // parameter for this function.
            Map<KeyType, ValueType> rv = new HashMap<>();
            keys.forEach(key -> {
                rv.put(key, /*figure out what key means*/);
            });
            return rv;
        }

        @Override
        public Set<KeyType> loadAllKeys() {
            // Pre-populate all the keys. My understanding is that this is
            // an initialization step, giving you a chance to load data on
            // startup so an initial set of data is available to anyone
            // using the cache. Any keys returned here are sent to
            // MapLoader#loadAll(Collection).
            Set<KeyType> rv = new HashSet<>();
            // Figure out what keys need to be in the return value
            // to load a key into the cache at first access to this map,
            // named "MY_MAP_NAME" in this example.
            return rv;
        }
    }));

config.getGroupConfig().setName("MY_INSTANCE_NAME").setPassword("my_password");

final HazelcastInstance hazelcast = Hazelcast.getOrCreateHazelcastInstance(config);
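With the MapLoader configured, read-through is transparent to callers: any member (or client) that asks for a missing key triggers the loader on the member that owns that key's partition. A minimal usage sketch, assuming the config built above and the same hypothetical KeyType/ValueType:
// Any member started with the config above joins the cluster and shares the map.
final HazelcastInstance follower = Hazelcast.getOrCreateHazelcastInstance(config);
final IMap<KeyType, ValueType> map = follower.getMap("MY_MAP_NAME");

// A plain get() is enough: if the key is absent, Hazelcast invokes
// MapLoader#load(key) on the member owning that key's partition and
// caches the returned value before handing it back to the caller.
KeyType someKey = ...; // hypothetical key
ValueType value = map.get(someKey);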

Data tracking in DocumentDB

I am trying to keep a history of the data (at least one step back) in DocumentDB.
For example, say I have a property called Name in a document with the value "Pieter". If I now change it to "Sam", I have to maintain the history that it was "Pieter" previously.
As of now I am thinking of a pre-trigger. Any other solutions?
Cosmos DB (formerly DocumentDB) now offers change tracking via Change Feed. With Change Feed, you can listen for changes on a particular collection, ordered by modification within a partition.
Change feed is accessible via:
Azure Functions
DocumentDB (SQL) SDK
Change Feed Processor Library
For example, here's a snippet from the Change Feed documentation, on reading from the Change Feed, for a given partition (full code example in the doc here):
IDocumentQuery<Document> query = client.CreateDocumentChangeFeedQuery(
    collectionUri,
    new ChangeFeedOptions
    {
        PartitionKeyRangeId = pkRange.Id,
        StartFromBeginning = true,
        RequestContinuation = continuation,
        MaxItemCount = -1,
        // Set reading time: only show change feed results modified since StartTime
        StartTime = DateTime.Now - TimeSpan.FromSeconds(30)
    });

while (query.HasMoreResults)
{
    FeedResponse<dynamic> readChangesResponse = query.ExecuteNextAsync<dynamic>().Result;

    foreach (dynamic changedDocument in readChangesResponse)
    {
        Console.WriteLine("document: {0}", changedDocument);
    }

    checkpoints[pkRange.Id] = readChangesResponse.ResponseContinuation;
}
If you're trying to make an audit log, I'd suggest looking into Event Sourcing. Building your domain from events ensures a correct log. See https://msdn.microsoft.com/en-us/library/dn589792.aspx and http://www.martinfowler.com/eaaDev/EventSourcing.html
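To make the Event Sourcing suggestion concrete, here is a tiny, purely illustrative Java sketch (all names are hypothetical): instead of overwriting Name, you append one event per change and read the latest event for the current value, so the full history is preserved by construction.
import java.time.Instant;
import java.util.List;

public class NameHistorySketch {
    // One immutable event per change to the Name property.
    record NameChanged(String documentId, String newName, Instant at) {}

    public static void main(String[] args) {
        // Hypothetical event stream for a single document.
        List<NameChanged> history = List.of(
                new NameChanged("doc-1", "Pieter", Instant.parse("2020-01-01T00:00:00Z")),
                new NameChanged("doc-1", "Sam", Instant.parse("2020-06-01T00:00:00Z")));

        // The current value is the latest event; every earlier event
        // remains available as the audit trail.
        String currentName = history.get(history.size() - 1).newName();
        System.out.println("Current name: " + currentName + " (history entries: " + history.size() + ")");
    }
}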

How to determine which keys are locked with Hazelcast IMap

I am using the Hazelcast IMap interface to lock items in a distributed way. Instead of putting the item in the map, I just call the lock method, which seems to be working, but I don't know how to query which items are currently locked since the items are not available in the map. Is there a way to query Hazelcast about locked keys?
Here is the example code:
public void testMap_DistributedLock() {
    final Config hazelcastConfig = new Config();
    int numberOfRecords = 100;
    final HazelcastInstance instance1 = Hazelcast.newHazelcastInstance(hazelcastConfig);
    //monitorCluster(instance1);
    IMap<Integer, Integer> myMap = instance1.getMap("myMap");

    System.err.println("starting lock");

    int index = 0;
    while (index < numberOfRecords) {
        myMap.lock(index++);
    }

    System.err.println("After locking index is: " + index);
    System.err.println("myMap.size()=" + myMap.size());
}
and the output is:
starting lock
After locking index is: 100
myMap.size()=0
PS: Using Java 7 with Hazelcast 3.6.
There is no API like IMap::getLocks, but you can iterate through the keys you know about, check each one with IMap::isLocked, and collect the keys that are still locked. If you really want a getLocks method, please go ahead and file a feature request on GitHub.
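For example, building on the test above (reusing its myMap and numberOfRecords, and assuming you know the candidate key range), a minimal sketch that collects the currently locked keys could look like this:
// Check each candidate key with isLocked() and collect the locked ones.
// Note: keys that were locked but never put() do not appear in keySet(),
// so you need to iterate the keys you know about, not the map contents.
List<Integer> lockedKeys = new ArrayList<>();
for (int key = 0; key < numberOfRecords; key++) {
    if (myMap.isLocked(key)) {
        lockedKeys.add(key);
    }
}
System.err.println("Currently locked keys: " + lockedKeys);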
