Is there a way to listen for operations / change events at the container level rather than the individual DDS level in the Fluid Framework? - fluid-framework

Scenario:
I have a service running that is keeping a global search or query index up to date for all containers in my “system”. This service is notified any time a container is opened by a client, and it opens its own reference to that container to listen for changes to content in that container so it can update the “global container index” storage. The container is potentially large and partitioned into many individual DDSes, and I would like to avoid loading every DDS in the container just to listen for changes in each of them.
Ideally I would be able to listen for any “operations / changes” at the container level and dynamically load the impacted DDS to be able to transcribe the information that was updated into this external index storage.

I originally left this as a comment to SamBroner's response but it got too long.
The ContainerRuntime raises an "op" event for every op, so you can listen to that to implement something similar to #1. This is missing in the docs currently so it's not obvious.
I think interpreting ops without loading the DDS code itself might be possible for DDSes with simpler merge logic, like SharedMap, but very challenging for SharedSequence, for example.
I guess it depends on the granularity of information you're trying to glean from the ops with general purpose code. Knowing just that a DDS was edited may be feasible, but knowing its resulting state... more difficult.
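For instance, a minimal sketch of that "which DDS was edited" level of granularity might look like the following. It assumes the runtime wraps data store ops in an envelope whose contents carry an address field identifying the data store; the exact payload shape varies by Fluid version, so treat these property names as assumptions rather than a documented contract.
// Hypothetical: containerRuntime obtained from your ContainerRuntimeFactory.
const touchedDataStores = new Set();
containerRuntime.on("op", (op) => {
    // Only sequenced messages of type "op" carry data changes.
    if (op.type === "op" && op.contents && op.contents.address) {
        // Assumption: the envelope's address identifies the data store whose DDS changed.
        touchedDataStores.add(op.contents.address);
    }
});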

There are actually two questions here: 1. How do I listen to container-level operations? 2. How do I load just one DDS?
How do I listen to operations?
This is certainly possible, but is not included as a capability in the reference implementation of the service. There are multiple ways of architecting a solution to this.
Use the EventEmitter on the Container object itself. The sequencedDocumentMessage will have a type property. When type === "op", the message contents will include metadata about a change to data. You can use the snippet below to get a feel for this.
const container = await getTinyliciousContainer(documentId, DiceRollerContainerRuntimeFactory, createNew);
// Wire up a listener on the container's EventEmitter.
container.on("op", (sequencedDocumentMessage) => {
    if (sequencedDocumentMessage.type === "op") {
        console.log(sequencedDocumentMessage.contents);
    }
});
If you're looking for all of the message types and message interfaces, the enum and the generic interface for ISequencedDocumentMessage are both here.
Listen directly to the Total Order Broadcast with a bespoke lambda
If you're running the reference implementation of Fluid, you can just add a new lambda that listens directly to Kafka (the default Total Order Broadcast) and does this job. The lambdas that are already running are located here: server/routerlicious/packages/lambdas. Deli is actually doing a fairly similar job already by listening in on operations from Kafka and labeling them by their Container.
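As a rough illustration (see the existing lambdas for the real interface), a standalone consumer could be sketched with a Kafka client such as kafkajs. The broker address, topic name, and message shape below are placeholders, not the actual Routerlicious configuration:
const { Kafka } = require("kafkajs");
const kafka = new Kafka({ clientId: "container-indexer", brokers: ["localhost:9092"] });
const consumer = kafka.consumer({ groupId: "container-indexer" });
async function run() {
    await consumer.connect();
    // Placeholder topic name; use the topic your Routerlicious deployment writes sequenced deltas to.
    await consumer.subscribe({ topic: "deltas", fromBeginning: false });
    await consumer.run({
        eachMessage: async ({ message }) => {
            // Assumption: each Kafka message body is a JSON-serialized sequenced operation.
            const op = JSON.parse(message.value.toString());
            // Update the external index here, e.g. keyed by the container/document id carried on the op.
        },
    });
}
run().catch(console.error);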
Use the Foreman "lambda" in R11S specifically for spawning jobs
I'd prefer an architecture where the lambda is actually just a "job runner". This would give you something along the lines of "Fluid Lambdas", where these lambdas can just react to operations coming off the Kafka stream. Functionality like this is included, but poorly documented and tested, in the Foreman lambda.
Critically, listening to just the ops is not a very good way to know the current state of a Distributed Data Structure. The Distributed Data Structures manage the merging of new operations into their state. Therefore, the easiest way to get the current state of a DDS is to load the DDS.
How do I load just one DDS?
This is actually fairly straightforward, if not well documented. You'll need to provide a requestHandler that can fetch just the DDS from the container. Ultimately, the Container does have the ability to virtualize everything that isn't requested. You'll need to load the container, but just request the specific DDS.
In pseudocode...
const container = await loader.resolve({ url: "fluid://URIToContainer" });
const response = await container.request({ url: "/ddspaths/uuid" });
const dds = response.value;
dds.getIndexData();

Related

CreateContainerIfNotExistsAsync is slower than GetContainer?

I am using the Azure Cosmos DB SDK v3. As you know, the SDK supports CreateContainerIfNotExistsAsync, which creates a container if there is no container matching the provided container id. This is convenient.
But it pings Cosmos DB to check whether the container exists, whereas GetContainer doesn't, since GetContainer assumes the container exists. So CreateContainerIfNotExistsAsync would need one more round trip to Cosmos DB for most operations, if my understanding is correct.
So my question is: would it be better to avoid using CreateContainerIfNotExistsAsync as much as possible, from an API perspective? The API could have better latency and save bandwidth.
The difference is explained in the IntelliSense: GetContainer just returns a proxy object, one that simply gives you the ability to execute operations within that container; it performs no network requests. If, for example, you try to read an item (ReadItemAsync) on that proxy and the container does not exist (which also makes the item non-existent), you will get a 404 response.
CreateContainerIfNotExists is also not recommended for hot path operations as it involves a metadata or management plane operation:
Retrieve the names of your databases and containers from configuration or cache them on start. Calls like ReadDatabaseAsync or ReadDocumentCollectionAsync and CreateDatabaseQuery or CreateDocumentCollectionQuery will result in metadata calls to the service, which consume from the system-reserved RU limit. CreateIfNotExist should also only be used once for setting up the database. Overall, these operations should be performed infrequently.
See https://learn.microsoft.com/azure/cosmos-db/sql/best-practice-dotnet for more details
Bottom line: unless you expect the container to be deleted due to some logical pathway in your application, GetContainer is the right way; it gives you a proxy object that you can use to execute item operations without any network requests.
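The question is about the .NET SDK v3, but the same pattern translated to the JavaScript SDK (@azure/cosmos) looks roughly like this sketch; the database, container, and partition key names are made up:
const { CosmosClient } = require("@azure/cosmos");
const client = new CosmosClient({ endpoint: process.env.COSMOS_ENDPOINT, key: process.env.COSMOS_KEY });
// One-time setup (deployment script or application start), not the hot path.
async function ensureResources() {
    const { database } = await client.databases.createIfNotExists({ id: "mydb" });
    await database.containers.createIfNotExists({ id: "orders", partitionKey: "/customerId" });
}
// Hot path: database()/container() only build a proxy; no network call happens here.
const orders = client.database("mydb").container("orders");
async function readOrder(id, customerId) {
    // The first network request happens on the item operation itself.
    const { resource } = await orders.item(id, customerId).read();
    return resource;
}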

shared resources between scripts

Sorry for the noob question... I'm trying to figure out a way to have shared resources between my tf scripts, but I can't find anything, so probably I'm looking for the wrong keywords...
Let's say I have 3 scripts:
base/base.tf
one/one.tf
two/two.tf
base creates an AWS VPC and a network load balancer.
one and two are two ECS Fargate services. They create the task definitions and add the mappings to the network load balancer.
My goal is to have something that keeps track of the last mapped port on the load balancer, which one and two can read and update.
Something like
base sets last_port to 14000
one reads last_port, increases by 1 and updates the value
two reads last_port, increases by 1 and updates the value
Is it possible at all?
thanks
The general solution to this problem in Terraform is Data Sources, which are special resources that retrieve data from elsewhere rather than creating and managing objects themselves.
In order to use data sources, the data you wish to share must be published somewhere. For sharing between Terraform configurations, you need a data storage location that can be both written to and read from by Terraform.
Since you mentioned ECS Fargate I will assume you're using AWS, in which case a reasonable choice is to store your data in AWS SSM Parameter Store and then have other configurations read it out.
The configuration that creates the data would use the aws_ssm_parameter resource type to create a new parameter:
resource "aws_ssm_parameter" "foo" {
name = "port_number"
type = "String"
value = aws_lb_listener.example.port
}
The configurations that will make use of this data can then read it using the corresponding data source:
data "aws_ssm_parameter" "foo" {
name = "port_number"
}
However, your question talks about the possibility of one configuration reading the value, incrementing it, and writing the new value back into the same place. That is not possible with Terraform because Terraform is a declarative system that works with descriptions of a static desired state. Only one configuration can be managing each object, though many configurations can read an object.
So instead of dynamically allocating port numbers, Terraform will require one of two solutions:
Use some other system to manage the port allocations persistently such that once a port number is allocated for a particular caller it will always get the same port number. I don't know of any existing system that is built for this, so this may not be a tenable option in this case, but if such a system did exist then we'd model it in Terraform with a resource type representing a particular port allocation, which Terraform can eventually destroy along with all of the other infrastructure when the port is no longer needed.
Decide on a systematic way to assign consistent port numbers to each system such that each configuration can just know (either by hard-coding or by some deterministic calculation) which port number it should use.

Hazelcast and the need for custom serializers; works when creating the server but not when connecting to existing

We are using Hazelcast to store stuff in distributed maps. We are having a problem with remote servers and I need some feedback on what we can do to resolve the issue.
We create the server - WORKS
We create a new server (Hazelcast.newHazelcastInstance) inside our application's JVM. The Hazelcast Config object we pass in has a bunch of custom serializers defined for all the types we are going to put in the maps. Our objects are a mixture of Protobufs, plain Java objects, and a combination of the two. The server starts, we can put objects in the map and get objects back out later. We recently decided to start running Hazelcast in its own dedicated server, so we tried the scenario below.
Server already exists externally, we connect as a client - DOESN'T WORK
Rather than creating our Hazelcast instance we connect to a remote instance that is already running. We pass in a config with all the same serializers we used before. We successfully connect to Hazelcast and we can put stuff in the map (works as far as I can tell) but we don't get anything back out. No events get fired letting our listeners know objects were added to a map.
I want to be able to connect to a Hazelcast instance that is already running outside of our JVM. It is not working for our use case and I am not sure how it is supposed to work.
Does the JVM running Hazelcast externally need in its class loader all of the class types we might put into the map? It seems like that might be where the problem is but wouldn't that make it very limiting to use Hazelcast?
How do you typically manage those class loader issues?
Assuming the above is true, is there a way to tell Hazelcast we will serialize the objects before even putting them in the map? Basically, we would give Hazelcast an ID and a byte array, and that is all we would expect back in return. If so, that would avoid the entire class loader issue I think we are running into. We do not need to be able to search on objects based on their fields. We just need to know when objects come and go and what their IDs are.
@Jonathan, when using a client-server architecture, unless you use queries or other operations that require data to be deserialized on the cluster, members don't need to know anything about serialization. They just store already-serialized data and serve it. If the listeners that you mentioned are on the client app, it should be working fine.
Hazelcast has a feature called User Code Deployment, https://docs.hazelcast.org/docs/3.11/manual/html-single/index.html#member-user-code-deployment-beta, but it's mainly for user classes. Serialization-related config should be present on the members, or you should add it later and do a rolling restart.
If you can share some of the exceptions/setup etc, I can give specific answers as well.

DynamoDB Application Architecture

We are using DynamoDB with Node.js and Express to create REST APIs. We have decided to go with Dynamo on the backend for simplicity of operations.
We have started to use the DynamoDB Document SDK from AWS Labs to simplify usage, and make it easy to work with JSON documents. To instantiate a client to use, we need to do the following:
var AWS = require('aws-sdk');
var Doc = require("dynamodb-doc");
var Dynamodb = new AWS.DynamoDB();
var DocClient = new Doc.DynamoDB(Dynamodb);
My question is, where do those last two steps need to take place in order to ensure data integrity? I’m concerned about an object that is waiting for something to happen in Dynamo being taken over by another process and getting its data swapped, resulting in incorrect data being sent back to a client, or incorrect data being written to the database.
We have three parts to our REST API. We have the main server.js file, that starts express and the HTTP server, and assigns resources to it, sets up logging, etc. We do the first two steps of creating the connection to Dynamo, creating the AWS and Doc requires, at that point. Those vars are global in the app. We then, depending on the route being followed through the API, call a controller that parses up the input from the rest call. It then calls a model file, that does the interacting with Dynamo, and provides the response back to the controller, which formats the return package along with any errors, and sends it to the client. The model is simply a group of methods that essentially cover the same area of the app. We would have a user model, for instance, that covers things like login and account creation in an app.
I have done the last two steps above for creating the Dynamo objects in two places. One, I have simply placed them in one spot, at the top of each model file. I do not reinstantiate them in the methods below; I simply use them. I have also instantiated them within the methods, when we are preparing to make the call to Dynamo, making them entirely local to the method and passing them to a secondary function if needed. This second method has always struck me as the safest way to do it. However, under load testing, I have run into situations where we seem to have overwhelmed the outgoing network connections, and I start getting errors telling me that the DynamoDB endpoint is unavailable in the region I’m running in. I believe this is from the additional calls required to make the connections.
So, the question is: is creating those objects once per model file safe, or do they need to be created locally in the method that uses them? Any thoughts would be much appreciated.
You should be safe creating just one instance of those clients and sharing them in your code, but that isn't related to your underlying concern.
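For example, a minimal sketch of the shared-client approach, reusing the same modules from the question (the file and variable names here are just illustrative):
// dynamo.js - create the clients once and export them.
var AWS = require('aws-sdk');
var Doc = require("dynamodb-doc");
var dynamodb = new AWS.DynamoDB();
var docClient = new Doc.DynamoDB(dynamodb);
module.exports = { dynamodb: dynamodb, docClient: docClient };

// userModel.js - every model file requires the same cached instance.
var docClient = require('./dynamo').docClient;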
Concurrent access to various records in DynamoDB is still something you have to deal with. It is possible to have different requests attempt writes to the object at the same time. This is possible if you have concurrent requests on a single server, but is especially true when you have multiple servers.
Writes to DynamoDB are atomic only at the individual item level. This means that if your logic requires multiple updates to separate items, potentially in separate tables, there is no way to guarantee that all or none of those changes are made. It is possible that only some of them will be made.
DynamoDB natively supports conditional writes so it is possible to ensure specific conditions are met, such as specific attributes still have certain values, otherwise the write will fail.
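As a rough sketch of a conditional write, here using the standard AWS.DynamoDB.DocumentClient from the aws-sdk rather than the dynamodb-doc wrapper in the question (the table and attribute names are made up):
var AWS = require('aws-sdk');
var docClient = new AWS.DynamoDB.DocumentClient();
docClient.update({
    TableName: 'Users',
    Key: { userId: 'user-123' },
    // Only apply the update if the item is still in the state we expect.
    UpdateExpression: 'SET #status = :next',
    ConditionExpression: '#status = :expected',
    ExpressionAttributeNames: { '#status': 'status' },
    ExpressionAttributeValues: { ':next': 'ACTIVE', ':expected': 'PENDING' }
}, function (err, data) {
    if (err && err.code === 'ConditionalCheckFailedException') {
        // Another writer changed the item first; handle the conflict here.
    }
});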
With respect to making too many requests to DynamoDB... unless you are overwhelming your machine, there shouldn't be any way to overwhelm the DynamoDB API. If you are performing more reads/writes than you have provisioned, you will receive errors indicating provisioned throughput has been exceeded, but the API itself is still functioning as intended under these conditions.

Custom Logging mechanism: Master Operation with n-Operation Details or Child operations

I'm trying to implement a logging mechanism in a Service-Workflow hybrid application. The requirement for logging is that, instead of independent log actions, each log entry must be treated as a detail operation and placed against a parent/master operation. So it's a parent-child relationship that goes to database table(s). This is the primary reason NLog failed.
To help understand better, I'm diving into some generic detail. This is how the application flow goes:
Now, the main entry point of the application (normally called Program.cs) is Platform. It initializes an engine that is capable of listening for incoming calls from ISDN lines, VoIP, or web services. The interface is generic, so any call that reaches the Platform triggers OnConnecting(). OnConnecting() is a thread-safe event and can be triggered as many times as the system requires.
Within OnConnecting(), a new instance of our custom Workflow manager is launched and the context is a custom object called ProcessingInfo:
new WorkflowManager<ZeProcessingInfo>();
Where, ZeProcessingInfo:
var ZeProcessingInfo = new ProcessingInfo(this, new LogMaster());
As you can see, the ProcessingInfo is composed of Platform itself and a new instance of LogMaster. LogMaster is defined in an independent assembly.
Now this LogMaster is available throughout the WorkflowManager, all the Workflows it launches, and all the activities within any running Workflow, and it is passed on to external code called from within any Activity. When a new LogMaster is initialized, a Master Operation entry is created in the database, and this LogMaster object lives until the call is ended after a series of very serious roller coaster rides through different workflows. Upon every call of OnConnecting(), a new Master Operation is created and maintained.
The LogMaster allows for calling an AddDetail() method that adds a new child detail under the internally stored Master Operation (distinguished through a Guid primary key). The LogMaster is built upon Entity Framework.
And I'm able to log under the same Master Operation as many times as I require. But the application requirements are changing, and there is a need to log from other assemblies now. There is a Platform Server assembly, which is a Windows Service that acts as a server listening for web-service-based calls; once a client calls a method, OnConnecting() in Platform is triggered.
I need a mechanism to somehow retrieve the related LogMaster object so that I can add detail to the same Master Operation. But Platform Server is the one triggering OnConnecting() on the Platform and thus instantiating LogMaster. This creates a redundancy loop.
Also, failure scenarios are being considered. If LogMaster fails, we need to revert from database logging to event logging. If event logging fails (or is not allowed through unified configuration), we need to revert to file-based (XML) logging.
I hope I have given a rough idea. I don't expect code but I need some strategy for a very seamless plug-able configurable logging mechanism that supports Master-Child operations.
Thanks for reading. Any help would be much appreciated.
I've read this question a number of times and it was pretty hard to figure out what was going on. I don't think your diagram helps at all. If your question is about trying to retrieve the master log record when writing child log records, then I would forget about trying to create normalised data in the log tables. You will just slow down the transactional system in trying to do so. You want the log/audit records to write as fast as possible, and you can aggregate them later when you want to read them.
Create a de-normalised table for the log entries and use a single Guid in that table to track the session/parent log master. Yes, this will be a big table, but it will write fast.
As for guaranteed delivery of log messages to a destination, I would try not to create multiple destinations, as combining them later will be a nightmare; rather, use something like MSMQ to emit the audit logs as fast as possible and have another service pick them up and process them in a guaranteed-delivery manner. ETW (Event Logging) is not guaranteed under load and you will not know that it has failed.
