For an internal endpoint I am working on in a Node.js API, I have been asked to dynamically change the value of a property status based on the value of a property visibility on the same object, just before sending down the response.
So for example, let's say I have an object that represents a user's profile. The user can have visibility LIVE or HIDDEN, while status can be IDLE, CREATING, or UPDATING.
What's been asked of me is that when I send down the response containing those two properties, I override the status value based on the current value of visibility: if visibility is LIVE then I should set status to ACTIVE, and if visibility is HIDDEN then status should be INACTIVE (two status values that do not exist internally in the database or in the list of enums for this object). On top of that, if status is not IDLE I should change its value to BUSY.
So not only am I changing its value based on the value of visibility, I'm also changing its value based on its own value not being a particular value!
I am just wondering whether this is good practice for an API in any way (apart from adding a weird extra layer of complexity and a lot of inconsistency, since the client will later ask for the same object by status too, which means a reverse mapping)?
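For concreteness, the requested transformation amounts to something like this (a sketch in Python purely for illustration, since the mapping itself is language-agnostic; the requirement doesn't say which rule wins when both apply, so this applies the visibility rule first):

def presentation_status(obj):
    # Map the stored values onto the two presentation-only values.
    if obj["visibility"] == "LIVE":
        return "ACTIVE"
    if obj["visibility"] == "HIDDEN":
        return "INACTIVE"
    # The extra rule: anything other than IDLE is reported as BUSY.
    return "IDLE" if obj["status"] == "IDLE" else "BUSY"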
status doesn't mean the same thing for different users; having the same name may be confusing, but it's not a problem if well documented.
If the mapping becomes too complex, you can always persist the two values, but then you will have to keep them in sync.
Our application creates/updates database entries based on an external service's webhooks. The webhook sends the external id of the object so that we can fetch more data for processing. Processing a webhook, including the roundtrips to get more data, takes 400-1200 ms.
Sometimes, multiple hooks for the same object ID are sent within microseconds of each other. Here are timestamps of the most recent occurrence:
2020-11-21 12:42:45.812317+00:00
2020-11-21 20:03:36.881120+00:00 <-
2020-11-21 20:03:36.881119+00:00 <-
There can also be other objects sent for processing around this time as well. The issue is that concurrent processing of the two hooks highlighted above will create two new database entries for the same single object.
Q: What would be the best way to prevent concurrent processing of the two highlighted entries?
What I've Tried:
Currently, at the start of an incoming hook, I create a database entry in a Changes table which stores the object ID. Right before processing, the Changes table is checked for entries that were created for this ID within the last 10 seconds; if one is found, it quits to let the other process do the work.
In the case above, there were two database entries created, and because they were SO close in time, they both hit the detection spot at the same time, found each other, and quit, resulting in nothing being done.
I've thought of adding some jittered timeout before the check (which increases processing time), or locking the table (again, increasing processing time), but it all feels like I'm fighting the wrong battle.
Any suggestions?
Our API is Django 3.1 with a Postgres db
Okay, this might not be a very satisfactory answer, but it sounds to me like the root of your problem isn't necessarily with your own app, but with the webhook service you are receiving from.
Due to the inherent possibility of error in network communication, webhooks that guarantee delivery always use at-least-once semantics. A sender that encounters a failure that leaves receipt uncertain needs to try sending the webhook again, even if the webhook may have been received the first time, thus opening the possibility for a duplicate event.
By extension, all webhook sending services should offer some way of deduplicating an individual event. I help run our webhooks at Stripe, and if you're using those, every webhook sent will come with an event ID like evt_1CiPtv2eZvKYlo2CcUZsDcO6, which a receiver can use for deduplication.
So the right answer for your problem is to ask your sender for some kind of deduplication/idempotency key, because without one, their API is incomplete.
Once you have that, everything gets really easy: you'd create a unique index on that key in the database, and then use upsert to guarantee only a single entry. That would look something like:
CREATE UNIQUE INDEX index_object_changes_idempotency_key ON object_changes (idempotency_key);

INSERT INTO object_changes (idempotency_key, ...) VALUES ('received-key', ...)
    ON CONFLICT (idempotency_key) DO NOTHING;
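In Django 3.1 terms the same insert-or-ignore can be expressed through the ORM. A minimal sketch, assuming a model like the following (the model and field names are illustrative, not from your schema):

from django.db import models

class Change(models.Model):
    # The sender-provided deduplication key; unique=True creates the index
    # that backs ON CONFLICT DO NOTHING on Postgres.
    idempotency_key = models.CharField(max_length=255, unique=True)
    object_id = models.IntegerField()

# received_key / external_id would come from the webhook payload (illustrative).
# With ignore_conflicts=True, Django emits INSERT ... ON CONFLICT DO NOTHING,
# so a duplicate delivery simply inserts nothing.
Change.objects.bulk_create(
    [Change(idempotency_key=received_key, object_id=external_id)],
    ignore_conflicts=True,
)

Note that bulk_create doesn't tell you whether the row was actually inserted; if the handler needs to know (so it can skip processing duplicates), catching IntegrityError on a plain create(), or using get_or_create(), is a simpler fit.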
Second best
Absent an idempotency ID for deduping, all your solutions are going to be hacky, but you could still get something workable together. What you've already suggested of trying to round off the receipt time should mostly work, although it'll still have the possibility of losing two events that were different, but generated close together in time.
Alternatively, you could also try using the entire payload of a received webhook, or better yet, a hash of it, as an idempotency ID:
CREATE UNIQUE INDEX index_object_changes_payload_hash ON object_changes (payload_hash);

INSERT INTO object_changes (payload_hash, ...) VALUES ('<hash_of_webhook_payload>', ...)
    ON CONFLICT (payload_hash) DO NOTHING;
This should keep the field relatively small in the database, while still maintaining accurate deduplication, even for unique events sent close together.
You could also do a combination of the two: a rounded timestamp plus a hashed payload, just in case you were to receive a webhook with an identical payload somewhere down the line. The only thing this wouldn't protect against is two different events sending identical payloads close together in time, which should be a very unlikely case.
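A sketch of the hash variant in Django terms (the model, field, and function names here are made up for illustration; the unique constraint does the deduplication):

import hashlib

from django.db import IntegrityError, models

class WebhookReceipt(models.Model):
    # SHA-256 hex digest of the raw payload; unique constraint rejects repeats.
    payload_hash = models.CharField(max_length=64, unique=True)

def record_webhook(request):
    """Return True if this payload has not been seen before."""
    digest = hashlib.sha256(request.body).hexdigest()
    try:
        WebhookReceipt.objects.create(payload_hash=digest)
    except IntegrityError:
        return False  # duplicate delivery, skip processing
    return True

For the combined variant, you could prepend a rounded receipt timestamp (encoded to bytes) to request.body before hashing.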
If you look at the Acuity webhook docs, they supply a field called action, which is key to making your webhook handling idempotent. Here are the quotes I could salvage:
action: either scheduled, rescheduled, canceled, changed, or order.completed, depending on the action that initiated the webhook call
The different actions:
scheduled is called once when an appointment is initially booked
rescheduled is called when the appointment is rescheduled to a new time
canceled is called whenever an appointment is canceled
changed is called when the appointment is changed in any way. This includes when it is initially scheduled, rescheduled, or canceled, as well as when appointment details such as e-mail address or intake forms are updated.
order.completed is called when an order is completed
Based on the wording, I assume that scheduled, canceled, and order.completed are all unique per object_id, which means you can use a unique together constraint for those messages:
class AcquityAction(models.Model):
    id = models.CharField(max_length=17, primary_key=True)


class AcquityTransaction(models.Model):
    action = models.ForeignKey(AcquityAction, on_delete=models.PROTECT)
    object_id = models.IntegerField()

    class Meta:
        unique_together = [['object_id', 'action']]
You can swap the AcquityAction model for an enumeration field if you'd like, but I prefer having them in the DB.
I would ignore the changed event entirely, since it appears to trigger on every event, according to their vague definition. For the rescheduled event, I would create a model that allows you to use a unique constraint on the new date, so something like this:
class Reschedule(models.Model):
    schedule = models.ForeignKey(MyScheduleModel, on_delete=models.CASCADE)
    schedule_date = models.DateTimeField()

    class Meta:
        unique_together = [['schedule', 'schedule_date']]
Alternatively, you could have a task specifically for updating your schedule model with the rescheduled date; that way it remains idempotent.
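For example, a task along these lines stays idempotent because re-applying the same reschedule leaves the row unchanged (the task and field names are hypothetical; MyScheduleModel is the schedule model referenced above):

from celery import shared_task

@shared_task
def apply_reschedule(schedule_id, new_date):
    # Setting the same date twice is a no-op, so duplicate deliveries of the
    # same 'rescheduled' webhook are harmless.
    MyScheduleModel.objects.filter(pk=schedule_id).update(scheduled_for=new_date)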
Now in your view, you will do something like this:
from django.db import IntegrityError
from django.http import HttpResponse

ACQUITY_ACTIONS = {'scheduled', 'canceled', 'order.completed'}


def webhook_view(request):
    validate(request)
    action = get_action(request)

    if action in ACQUITY_ACTIONS:
        try:
            insert_transaction()
        except IntegrityError:
            # A row already exists for this (object_id, action), so this is a
            # duplicate delivery: acknowledge it and do nothing.
            return HttpResponse(status=200)
        webhook_task.delay()
    elif action == 'rescheduled':
        other_webhook_task.delay()
    ...
Is it possible to have a plugin intervene when someone is editing an optionset?
I would have thought CRM would prevent the removal of optionset values if there are entities that refer to them, but apparently this is not the case (there are a number of orphaned fields that refer to options that no longer exist). Is there a message/entity pair that I could use to check whether there are entities using the value that is about to be deleted/modified, and stop the operation if there are?
Not sure if this is possible, but you could attempt to create a plugin on the Execute method and check the input parameters in the context to determine which request type is being processed. Pretty sure you'll want to look for either UpdateAttributeRequest for local OptionSets, or potentially UpdateOptionSetRequest for both. Then you could run additional logic to determine which values are changing and ensure the database values are correct.
The big caveat to this is that if you have even a moderate amount of data, I'm guessing you'll hit the 2-minute limit for plugin execution and it will fail.
First of all, let me state that I am new to Command Query Responsibility Segregation and Event Sourcing (Message-Driven Architecture), but I'm already seeing some significant design benefits. However, there are still a few issues on which I'm unclear.
Say I have a Customer class (an aggregate root) that contains a property called postalAddress (an instance of the Address class, which is a value object). I also have an Order class (another aggregate root) that contains (among OrderItem objects and other things) a property called deliveryAddress (also an instance of the Address class) and a string property called status.
The customer places an order by issuing a PlaceOrder command, which triggers the OrderReceived event. At this point in time, the status of the order is "RECEIVED". When the order is shipped, someone in the warehouse issues a ShipOrder command, which triggers the OrderShipped event. At this point in time, the status of the order is "SHIPPED".
One of the business rules is that if a Customer updates their postalAddress before an order is shipped (i.e., while the status is still "RECEIVED"), the deliveryAddress of the Order object should also be updated. If the status of the Order were already "SHIPPED", the deliveryAddress would not be updated.
Question 1. Is the best place to put this "conditionally cascading address update" in a Saga (a.k.a., Process Manager)? I assume so, given that it is translating an event ("The customer just updated their postal address...") to a command ("... so update the delivery address of order 123").
Question 2. If a Saga is the right tool for the job, how does it identify the orders that belong to the user, given that an aggregate can only be retrieved by its unique ID (in my case a UUID)?
Continuing on, given that each aggregate represents a transactional boundary, if the system were to crash after the Customer's postalAddress was updated (the CustomerAddressUpdated event having been persisted to the event store) but before the Order's deliveryAddress could be updated (i.e., between the two transactions), the system would be left in an inconsistent state.
Question 3. How are such "violations" of consistency rules detected and rectified?
In most instances the delivery address of an order should be independent of any other data change, as a customer may want the order sent to an arbitrary address. That being said, I'll give my 2c on how you could approach this:
Is the best place to handle this in a process manager?
Yes. You should have an OrderProcess.
How would one get hold of the correct OrderProcess instance, given that it can only be retrieved by aggregate id?
There is nothing preventing one from adding an additional lookup mechanism that associates data to an aggregate id. In my experimental, going-live-soon mechanism called shuttle-recall I have an IKeyStore mechanism that associates any arbitrary key to an AR Id. So you would be able to associate something like [order-process]:customerId=CID-123; as a key to some aggregate.
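To make that concrete, here is a rough Python sketch of the idea (all class, method, and key names here are invented for illustration and are not shuttle-recall's API): the process manager maintains its own key-to-aggregate-id lookup and uses it when the address event arrives; the Order aggregate itself still enforces the "only while RECEIVED" rule.

from dataclasses import dataclass

@dataclass
class UpdateDeliveryAddress:
    order_id: str
    new_address: str

class OrderProcess:
    def __init__(self, key_store, command_bus):
        self.key_store = key_store      # associates arbitrary keys with aggregate ids
        self.command_bus = command_bus

    def on_order_received(self, event):
        # Keep the lookup current as orders are placed.
        self.key_store.add(f"order-process:customer={event.customer_id}", event.order_id)

    def on_customer_address_updated(self, event):
        key = f"order-process:customer={event.customer_id}"
        for order_id in self.key_store.get(key):
            # Already-shipped orders will simply reject or ignore this command.
            self.command_bus.send(UpdateDeliveryAddress(order_id, event.new_address))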
How are such "violations" of consistency rules detected and rectified?
In most cases they could be handled out-of-band, if possible. If I order something from Amazon and attempt to change my address after the order has shipped, the order is still going to the original address. In your case of linking the customer postal address to the active order address, you could notify the customer that n orders have had their addresses updated but that a recent order (within some tolerance) has not.
As for the system going down before processing, you should have some guaranteed delivery mechanism to handle this. I do not regard these domain events in the same way I regard system events in a messaging infrastructure such as a service bus.
Just some thoughts :)
In the article "Improve XPages Application Performance with JSON-RPC" Brad Balassaitis writes:
For example, if you have a repeat control with a collection named myRepeat and a property named myProperty, you could pass/retrieve it in client-side JavaScript with this syntax:
'#{javascript: myRepeat.myProperty}'
Then your call to the remote method would look like this:
myRpcService.setScopeVar('#{javascript: myRepeat.myProperty}');
If I look at the xp:repeat control where should I set this myProperty property?
My idea is to display values from another source within a repeat control. So for each entry in the repeat control I would like to make a call via the Remote Service control and add additional information received from the service.
Anyone achieved this before?
JSON-RPC is just a mechanism to allow you to trigger server-side code without needing a full partial refresh. myProperty is not an actual property, just as myRepeat would not, in practice, be the name of your repeat. It's an example.
Are you wanting the user to click on something in the row in order to load additional information? That's the only use case for going down the RPC route.
If you want to show additional information not available in the current entry but based on a property of that entry, just add a control and compute the value.
In terms of optimisation, unless you're displaying hundreds of rows at a time, or loading data from lots of different databases or views each based on a property in the current row, it should be pretty quick. I would recommend getting it working, then optimising it if you find server-side performance is an issue. view.isRenderingPhase() is a good option for optimising performance of read-only data within a repeat, as is custom language to minimise the amount of HTML pushed to the browser, and also using a dataContext to ensure you only do the lookup to e.g. another document once. But if the network requests to and from the server are slow, optimising the server-side code to shave a fraction of a second off processing will not have a very perceptible impact.
I have a background process receiving and applying changes to Core Data entities from a back-end server using RestKit, which works really well. All the entities have a version number property which updates when the back end accepts changes and publishes a new version. If an entity the user is viewing is changed I need to update the view with the latest version information.
Using KVO to observe version number for the current entity and refreshing the view when it changes works really well as long as version number is the last property.
That is, the 'column order' matters, and property updates are atomic. If the version number is the last property then when the observer is invoked all changes to all entity properties will have been applied.
If version number is not the last property defined, then when the observer is invoked the updated values of the properties after version will not have been applied.
The solution is to change the database and ensure that version number is always last. This works; however, I cannot find anything in the documentation to suggest that the sequence of property changes is guaranteed.
I assume the only way to get a water-tight non-atomic notification is to register for managed object context change notifications and then process those notifications looking for changes to objects of interest. My concern with this is that it is not fine-grained and there will be a lot of unnecessary processing to find relatively few things of interest.
Is this correct, or is there a way to ensure a non-atomic view of an object when using KVO?
If you wanted to use KVO you would need to layer some change management on top, such as when the managed object is saved you check the version number and change another (non-persistent) attribute that is being observed. You can be sure that everything has been updated in a logical set when the object is saved.
Generally the context save notification is the approved approach. So long as you aren't making thousands of changes or a few very large saves to the context, it shouldn't be an issue. You can also look at using predicates to filter the changes and/or a fetched results controller (which does the observation for you).