socket.io sync state across sockets - node.js

I'm creating a board game where 2 players can play and others can join as spectators (viewers).
When a spectator joins, they get the current state of the game, and from then on they only receive each move a player makes (to save data, obviously).
My question is: when the spectator first gets the state of the game from the server, how can I make sure it is actually synced? I don't really know when they will receive the state, and it might be captured a fraction of a second before something changes, in which case the deltas they get for every move made won't make sense.
Should I use some kind of interval? What would you suggest to make sure everything is synced?

Assuming that your state is the result of, and only of, user actions, you could store your state in a table-like format with an auto-incrementing integer ID.
In the move event, you pass the new ID and the previous ID. If the receiver's highest ID is less than the previous ID, it knows to ask the server for the missing actions.
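A minimal client-side sketch of that scheme with socket.io (the snapshot, move, and movesSince event names, the URL, and the GameState/Move shapes are all assumptions for illustration, not part of the question):
import { io } from "socket.io-client";

// Assumed shapes, purely for illustration.
type Move = { player: string; from: number; to: number };
type GameState = { board: number[]; nextPlayer: string };

const socket = io("https://example.com");     // placeholder URL
let state: GameState | null = null;
let lastSeq = 0;                              // highest move ID applied locally

socket.on("snapshot", (snap: { seq: number; state: GameState }) => {
  state = snap.state;                         // full state for a newly joined spectator
  lastSeq = snap.seq;
});

socket.on("move", (msg: { seq: number; prevSeq: number; move: Move }) => {
  const current = state;
  if (!current || msg.seq <= lastSeq) return; // no snapshot yet, or a duplicate move
  if (msg.prevSeq > lastSeq) {
    socket.emit("movesSince", lastSeq);       // there is a gap: ask the server for the missing moves
    return;
  }
  applyMove(current, msg.move);
  lastSeq = msg.seq;
});

// Hypothetical reducer that applies one move to the local state.
function applyMove(s: GameState, m: Move): void {
  s.board[m.to] = s.board[m.from];
  s.board[m.from] = 0;
  s.nextPlayer = m.player === "X" ? "O" : "X";
}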


Database doesn't update/write instantly

For context, I am making a multiplayer game using Node.js, and my DB is Postgres.
Players play one by one, and I save everything in the DB.
When the first user plays, they can't play again until the other player has played too.
What I am doing now is having a boolean on each player in the DB called "ableToPlay", which is true and then turns to false when it's not that user's turn.
The issue is that when a user spams the "play" button and my DB is on a remote server, it takes time to update from true to false, so the user plays multiple times, which then causes the app to crash.
I am using an AWS microservices architecture, so the server must be stateless.
Is there any way I can save the game progress so that it is accessible to all my microservices?
How do you check the turn? Is it something like:
select turn from db
if turn == X then
    // allow the turn
    do all the logic
    update the turn to Y
endif
So the "do all the logic" may be called several times as several requests will get turn=X.
This is a very common problem in programming; there are several approaches you could take.
Two key observations to address:
the same player should not do a turn twice in a row
while one player is making the turn, the other player must wait
The easiest way is to use a transaction in the DB while the turn is happening. For example, when player X is making the turn:
start transaction
update turn=X where turn=Y (Y is the other player)
if the update succeeded (exactly one record is updated)
    do all the logic
commit the transaction
In that approach, the update will wait for the previous one to finish, and the WHERE clause will make sure the same player won't take two or more turns in a row. The transaction isolation will avoid running the turn logic concurrently.
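A minimal sketch of that approach with node-postgres, matching the Node.js + Postgres setup from the question (the games table, the current_turn column, and the playTurn function are assumptions):
import { Pool } from "pg";

const pool = new Pool();   // connection settings come from the usual PG* environment variables

async function playTurn(gameId: number, player: string, otherPlayer: string): Promise<void> {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    // Only succeeds if it was actually the other player's turn; a spammed request
    // from the same player matches zero rows and is simply ignored.
    const res = await client.query(
      "UPDATE games SET current_turn = $1 WHERE id = $2 AND current_turn = $3",
      [player, gameId, otherPlayer]
    );
    if (res.rowCount === 1) {
      // do all the game logic here, inside the same transaction
    }
    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  } finally {
    client.release();
  }
}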
If you don't want to use the transaction, you could build a state machine, with states:
waitingForTurnX
makingTurnX
waitingForTurnY
makingTurnY
This would be a nice model to code, and the transitions could be handled without transactions:
update state=makingTurnX where state=waitingForTurnX
This approach also eliminates the race condition, because in the vast majority of databases, updates to a single record are atomic.
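A minimal sketch of one such transition with node-postgres (again, the games table and state column are assumptions); whichever request wins the update gets to run the turn logic:
import { Pool } from "pg";

const pool = new Pool();

async function tryBeginTurnX(gameId: number): Promise<boolean> {
  // Atomic on a single row: exactly one concurrent request can flip the state.
  const res = await pool.query(
    "UPDATE games SET state = 'makingTurnX' WHERE id = $1 AND state = 'waitingForTurnX'",
    [gameId]
  );
  return res.rowCount === 1;   // false means it was not this player's turn (or someone was faster)
}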

Do I need FIFO SQS for jira like board view app

Currently I am running a Jira-like board/stage/card management app on AWS ECS with 8 tasks. When a card is moved from one column/stage to another, I look up the current stage object for that card, remove the card from that stage, and add the card to the destination stage object. This has worked so far because I always look up the card's actual stage in the Postgres database, not the stage the frontend thinks the card belongs to.
Question:
Is it safe to say that even when multiple users move the same card to different stages, the queries will still happen one after the other and the data will not be corrupted (e.g. no duplicates)?
If there is still a chance the data can be corrupted, is it a good option to use SQS FIFO to send a message to a Lambda and handle each card movement in sequence?
Is there any other reason I should use SQS in this case? Or is SQS not applicable at all here?
The most important question here is: what do you want to happen?
Looking at the state of a card in the database, and acting on that is only "wrong" if it doesn't implement the behavior you want. It's true that if the UI can get out of sync with the database, then users might not always get the result they were expecting - but that's all.
Consider likelihood and consequences:
How likely is it that two or more people will update the same card, at the same time, to different stages?
And what is the consequence if they do?
If the board is being used by a 20 person project team, then I'd say the chances were 'low/medium', and if they are paying attention to the board they'll see the unexpected change and have a discussion - because clearly they disagree (or someone moved it to the wrong stage by accident).
So in that situation, I don't think you have a massive problem - as long as the system behavior is what you want (see my further responses below). On the other hand, if your board solution is being used to help operate a nuclear missile launch control system then I don't think your system is safe enough :)
Is it safe to say that even when multiple users move the same card to different stages, but query would still happen one after the other and data will not corrupt? (such as duplicates)
Yes, the queries will still happen one after the other, on the assumption:
that the database query looks up the card based on some stable identifier (e.g. CardID), and
that having successfully retrieved the card, your logic moves it to whatever destination stage is specified - implying there are no rules or state machine that might prohibit certain state transitions (e.g. moving from stage 1 to 2 is OK, but moving from stage 2 to 1 is not).
Regarding your second question:
If there is still a chance data can be corrupted.
It depends on what you mean by 'corruption'. Data corruption is when unintended changes occur in data, which usually make it unusable (un-processable, un-readable, etc.) or useless (processable but incorrect). In your case it's more likely that your system would work properly and the data would not be corrupted (it remains processable, and the resulting state of the data is exactly what the system intended it to be), but simply that the results the users see might not be what they were expecting.
Is it a good option to use SQS FIFO to send message to a lambda and handle each card movement in sequence?
A FIFO queue would only ensure that requests were processed in the order in which they were received by the queue. Whether or not this is "good" depends on the most important question (first sentence of this answer).
Assuming the assumptions I listed above hold (there is no state machine logic being enforced, and the card is found and processed via its ID), then all that will happen is that the last request determines the final state. E.g.:
Card State: Card.CardID = 001; Stage = 1.
3 requests then get lodged into the FIFO queue in this order:
User A - Move CardID 001 to Stage 2.
User B - Move CardID 001 to Stage 4.
User C - Move CardID 001 to Stage 3.
Resulting Card State: Card.CardID = 001; Stage = 3.
That's "good" if you want the most recent request to be the result.
Any other reason I should use SQS in this case? or is SQS not applicable at all here?
The only thing I can think of is that you would be able to store a "history", so that users could see all the recent changes to a card. This would do two things:
Prove that the system processed the requests correctly (according to what it was told to do, and its logic).
Allow users to see who did what, and discuss.
To implement that, you just need to record all relevant changes to the card, in the right order. The thing is, the database can probably do that on its own, so use of SQS is still debatable; all the queue will do is maybe help avoid deadlocks.
Update - RE Duplicate Cards
You'd have to check the documentation for SQS to see if it can evaluate queue items and remove duplicates.
Assuming it doesn't, you'll have to build something to handle that separately. All I can think of right now is to check for duplicates before adding them to the queue - because once they are there it's probably too late.
One idea:
Establish a component in your code which acts as the proxy/façade for the queue.
Make it smart in that it knows about recent card actions ("recent" is whatever you think it needs to be).
When a new card action comes in, it does a quick check to see if there are any other "recent" duplicate card actions, and if yes, decides what to do.
One approach would be a very simple in-memory collection, and cycle out old items as fast as you dare to. "Recent", in terms of the lifetime of items in this collection, doesn't have to be the same as how long it takes for items to get through the queue - it just needs to be long enough to satisfy yourself there's no obvious duplicate.
I can see such a set-up working, but potentially being quite problematic - so if you do it, keep it as simple as possible. ("Simple" meaning: functionally as narrowly-focused as possible).
Sizing will be a consideration - how many items are you processing a minute?
Operational considerations - if it's in-memory it'll be easy to lose (service restarts or whatever), so design the overall system in such a way that if that part goes down, or the list is flushed, items still get added to the queue and things keep working regardless.
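A minimal sketch of such an in-memory proxy in TypeScript (the CardAction shape, the time window, and sendToQueue are all assumptions; sendToQueue stands in for whatever SQS client call you already make):
// Assumed action shape and queue wrapper, purely for illustration.
type CardAction = { cardId: string; targetStage: number };
declare function sendToQueue(action: CardAction): void;

const RECENT_WINDOW_MS = 5_000;                // how long an action counts as "recent"
const recent = new Map<string, number>();      // dedup key -> time it was last enqueued

function enqueueCardAction(action: CardAction): void {
  const key = `${action.cardId}:${action.targetStage}`;
  const now = Date.now();

  // Drop anything that looks like a recent duplicate.
  const last = recent.get(key);
  if (last !== undefined && now - last < RECENT_WINDOW_MS) return;

  // Cycle out old entries so the map stays small; losing it entirely is fine,
  // the worst case is that a duplicate slips through to the queue.
  for (const [k, t] of recent) {
    if (now - t >= RECENT_WINDOW_MS) recent.delete(k);
  }

  recent.set(key, now);
  sendToQueue(action);
}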
While you are right that a FIFO queue would be best here, I think your design isn't ideal, or even workable in some situations.
Let's say user 1 has an application state where the card is in stage 1 and he moves it to stage 2. An SQS message will indicate "move the card from stage 1 to stage 2". User 2 has the same initial state where card 1 is in stage 1. User 2 wants to move the card to stage 3, so an SQS message will contain the instruction "move the card from stage 1 to stage 3". But this won't work since you can't find the card in stage 1 anymore!
In this use case, I think a classic API design is best where an API call is made to request the move. In the above case, your API should error out indicating that the card is no longer in the state the user expected it to be in. The application can then reload the current state for that card and allow the user to try again.
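A minimal sketch of that kind of check with node-postgres (the cards table, the stage_id column, and the error convention are assumptions): the move only succeeds if the card is still in the stage the client last saw.
import { Pool } from "pg";

const pool = new Pool();

async function moveCard(cardId: string, expectedStage: number, targetStage: number): Promise<void> {
  // Conditional update: no rows change if the card already left the expected stage.
  const res = await pool.query(
    "UPDATE cards SET stage_id = $1 WHERE id = $2 AND stage_id = $3",
    [targetStage, cardId, expectedStage]
  );
  if (res.rowCount === 0) {
    // The card is no longer where the client thought it was; report a conflict
    // so the UI can reload the card's current state and let the user retry.
    throw new Error("Conflict: the card has moved since it was last loaded");
  }
}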

Concurrency issue when processing webhooks

Our application creates/updates database entries based on an external service's webhooks. The webhook sends the external ID of the object so that we can fetch more data for processing. Processing a webhook, with round trips to get more data, takes 400-1200 ms.
Sometimes, multiple hooks for the same object ID are sent within microseconds of each other. Here are timestamps of the most recent occurrence:
2020-11-21 12:42:45.812317+00:00
2020-11-21 20:03:36.881120+00:00 <-
2020-11-21 20:03:36.881119+00:00 <-
There can be other objects sent for processing around this time as well. The issue is that concurrent processing of the two hooks highlighted above creates two new database entries for the same single object.
Q: What would be the best way to prevent concurrent processing of the two highlighted entries?
What I've Tried:
Currently, at the start of an incoming hook, I create a database entry in a Changes table which stores the object ID. Right before processing, the Changes table is checked for entries that were created for this ID within the last 10 seconds; if one is found, it quits to let the other process do the work.
In the case above, there were two database entries created, and because they were SO close in time, they both hit the detection spot at the same time, found each other, and quit, resulting in nothing being done.
I've thought of adding some jittered timeout before the check (increases processing time), or locking the table (again, increases processing time), but it all feels like I'm fighting the wrong battle.
Any suggestions?
Our API is Django 3.1 with a Postgres db
Okay, this might not be a very satisfactory answer, but it sounds to me like the root of your problem isn't necessarily in your own app, but in the webhook service you are receiving from.
Due to inherent possibility for error in network communication, webhooks which guarantee delivery always use at-least-once semantics. A sender that encounters a failure that leaves receipt uncertain needs to try sending the webhook again, even if the webhook may have been received the first time, thus opening the possibility for a duplicate event.
By extension, all webhook sending services should offer some way of deduplicating an individual event. I help run our webhooks at Stripe, and if you're using those, every webhook sent will come with an event ID like evt_1CiPtv2eZvKYlo2CcUZsDcO6, which a receiver can use for deduplication.
So the right answer for your problem is to ask your sender for some kind of deduplication/idempotency key, because without one, their API is incomplete.
Once you have that, everything gets really easy: you'd create a unique index on that key in the database, and then use upsert to guarantee only a single entry. That would look something like:
CREATE UNIQUE INDEX index_my_table_idempotency_key ON my_table (idempotency_key);
INSERT INTO object_changes (idempotency_key, ...) VALUES ('received-key', ...)
ON CONFLICT (idempotency_key) DO NOTHING;
Second best
Absent an idempotency ID for deduping, all your solutions are going to be hacky, but you could still get something workable together. What you've already suggested of trying to round off the receipt time should mostly work, although it'll still have the possibility of losing two events that were different, but generated close together in time.
Alternatively, you could also try using the entire payload of a received webhook, or better yet, a hash of it, as an idempotency ID:
CREATE UNIQUE INDEX index_my_table_payload_hash ON my_table (payload_hash);
INSERT INTO object_changes (payload_hash, ...) VALUES ('<hash_of_webhook_payload>', ...)
ON CONFLICT (payload_hash) DO NOTHING;
This should keep the field relatively small in the database, while still maintaining accurate deduplication, even for unique events sent close together.
You could also do a combination of the two: a rounded timestamp plus a hashed payload, just in case you were to receive a webhook with an identical payload somewhere down the line. The only thing this wouldn't protect against is two different events sending identical payloads close together in time, which should be a very unlikely case.
If you look at the acquity webhook docs, they supply a field called action, which is key to making your webhook idempotent. Here are the quotes I could salvage:
action: either scheduled, rescheduled, canceled, changed, or order.completed, depending on the action that initiated the webhook call
The different actions:
scheduled is called once when an appointment is initially booked
rescheduled is called when the appointment is rescheduled to a new time
canceled is called whenever an appointment is canceled
changed is called when the appointment is changed in any way. This includes when it is initially scheduled, rescheduled, or canceled, as well as when appointment details such as e-mail address or intake forms are updated.
order.completed is called when an order is completed
Based on the wording, I assume that scheduled, canceled, and order.completed are all unique per object_id, which means you can use a unique together constraint for those messages:
class AcquityAction(models.Model):
    id = models.CharField(max_length=17, primary_key=True)


class AcquityTransaction(models.Model):
    action = models.ForeignKey(AcquityAction, on_delete=models.PROTECT)
    object_id = models.IntegerField()

    class Meta:
        unique_together = [['object_id', 'action']]
You can replace the AcquityAction model with an enumeration field if you'd like, but I prefer having the actions in the DB.
I would ignore the changed event entirely, since it appears to trigger on every event, according to their vague definition. For the rescheduled event, I would create a model that allows you to use a unique constraint on the new date, so something like this:
class Reschedule(models.Model):
    schedule = models.ForeignKey(MyScheduleModel, on_delete=models.CASCADE)
    schedule_date = models.DateTimeField()

    class Meta:
        unique_together = [['schedule', 'schedule_date']]
Alternatively, you could have a task specifically for updating your schedule model with a rescheduled date, that way it remains idempotent.
Now in your view, you will do something like this:
from django.db import IntegrityError
from django.http import HttpResponse

ACQUITY_ACTIONS = {'scheduled', 'canceled', 'order.completed'}

def webhook_view(request):
    validate(request)
    action = get_action(request)
    if action in ACQUITY_ACTIONS:
        try:
            insert_transaction()
        except IntegrityError:
            # Duplicate webhook: the unique constraint already caught it, so just acknowledge.
            return HttpResponse(status=200)
        webhook_task.delay()
    elif action == 'rescheduled':
        other_webhook_task.delay()
    ...

What is the most effective way to handle multiple objects independent from all players when making a game with sockets?

For example, let's say I have a random game in which I have 500 independent objects and 10 players.
An independent object is one that moves in a specific direction on every update, regardless of what players do (players don't need to come into contact with these objects).
Now, if a player is shooting (let's say) a bullet, it is easier because the bullet belongs to a specific player, so it's easier to avoid in-game lag. Let's look at something simpler, though, for example a player updating their position. The typical thing I would do on the client and server side would be this:
client side : update the coords of the player + send a message to the server as socket X
server side : receives the message from socket X, updates the coords of the player on the server side +
sends a message with the coords of that same player to all other sockets
When you do the communication like this, everyone will receive the new coords of the player and there will be little to no lag. (It is also sufficient for objects like bullets, because they are created by a player's firing event.)
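A minimal server-side sketch of this flow with socket.io (the updatePosition/playerMoved event names and the coordinate shape are assumptions):
import { Server } from "socket.io";

const io = new Server(3000);

io.on("connection", (socket) => {
  socket.on("updatePosition", (coords: { x: number; y: number }) => {
    // Update the server-side copy for this player (stored per socket here for brevity),
    // then tell every other socket where that player now is.
    socket.data.coords = coords;
    socket.broadcast.emit("playerMoved", { id: socket.id, ...coords });
  });
});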
How do you handle 500+ independent objects that move in random directions at random speeds all across the map, and update them for all players efficiently? (Be aware that their velocity and speed can change upon contact with a player.) What I've tried so far:
1) Putting all of the movement + collision logic on the server side and notifying all clients with a setTimeout loop and io.emit.
Result: causes massive lag even with only 500+ objects and 4 connected players. All of the players receive the server's responses way too slowly.
2) Putting all of the movement + collision logic on the client side and notifying the server about every object's position.
Result: to be honest, I couldn't detect much lag, but I am not sure this is the correct idea, because every time an object moves I am literally sending a message to the server from each client to update that same object (the server is notified about the same object N times, where N is the number of connected clients). Handling this entirely on the client side is also a bad idea, because when a player switches tabs (goes inactive), no more JavaScript is executed in that player's browser and the whole logic breaks.
I've also noticed that games like agar.io, slither.io, diep.io, etc. don't really have hundreds of objects moving in various directions. In agar.io and slither.io you mainly have static objects (food) and players; in diep.io there are dynamic objects, but none of them move at very high speeds. How do people achieve this? Is there any smart way to do it with minimal lag?
Thanks in advance
Convert your user interactions to enumerated actions and forward those. Player A presses the left arrow which is interpreted by the client as "MOVE_LEFT" with possible additional attributes (how much, angle, whatever) as well as a timestamp indicating when this action took place from Player A's perspective.
The server receives this and validates it as a possible action and forwards it to all the clients.
Each client then interprets the action themselves and updates their own simulation with respect to Player A's action.
Don't send the entire game state to every client every tick; that's too bloated. The other side of this is being able to handle late or missing actions. One way of doing that is rollback, where you keep multiple sets of state and keep the game simulation going until a mismatch caused by a late or missing packet is found. You then revert to the "right" state and replay all the messages since then to bring the state back to correct. This is the idea behind GGPO.
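A minimal server-side sketch of forwarding enumerated actions with socket.io (the action event name, the Action shape, and the isValidAction helper are assumptions):
import { Server } from "socket.io";

type Action = {
  type: "MOVE_LEFT" | "MOVE_RIGHT" | "SHOOT";
  playerId: string;
  timestamp: number;   // when the action happened from the player's perspective
};

const io = new Server(3000);

io.on("connection", (socket) => {
  socket.on("action", (action: Action) => {
    if (!isValidAction(action)) return;   // assumed validation against the game rules
    // Forward only the small action, not the whole game state; every client
    // applies it to its own simulation.
    socket.broadcast.emit("action", action);
  });
});

// Hypothetical validator: accept only known action types.
function isValidAction(action: Action): boolean {
  return ["MOVE_LEFT", "MOVE_RIGHT", "SHOOT"].includes(action.type);
}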
I suggest also reading every article related to networking that Gaffer on Games goes into, especially What Every Programmer Needs To Know About Game Networking. They're very good articles.

Should latest event version be queried in event sourcing?

I am developing a simple DDD + Event sourcing based app for educational purposes.
In order to set the event version before storing it to the event store, I would have to query the event store, but my gut tells me this is wrong because it causes concurrency issues.
Am I missing something?
There are different answers to that, depending on what use case you are considering.
Generally, the event store is a dumb, domain agnostic appliance. It's superficially similar to a List abstraction -- it stores what you put in it, but it doesn't actually do any work to satisfy your domain constraints.
In use cases where your event stream is just a durable record of things that have happened (meaning: your domain model does not get a veto; recording the event doesn't depend on previously recorded events), then append semantics are fine, and depending on the kind of appliance you are using, you may not need to know what position in the stream you are writing to.
For instance: the API for GetEventStore understands ExpectedVersion.ANY to mean append these events to the end of the stream wherever it happens to be.
In cases where you do care about previous events (the domain model is expected to ensure an invariant based on its previous state), then you need to do something to ensure that you are appending the event to the same history that you have checked. The most common implementations of this communicate the expected position of the write cursor in the stream, so that the appliance can reject attempts to write to the wrong place (which protects you from concurrent modification).
This doesn't necessarily mean that you need to query the event store to get the position. You are allowed to count the number of events in the stream when you load it, and to remember how many more events you've added, and therefore where the stream "should" be if you are still synchronized with it.
What we're doing here is analogous to a compare-and-swap operation: we get a representation of the original state of the stream, create a new representation, and then compare and swap the reference to the original so that it points instead to our changes:
oldState = stream.get()
newState = domainLogic(oldState)
stream.compareAndSwap(oldState, newState)
But because a stream is a persistent data structure with append only semantics, we can use a simplified API that doesn't require duplicating the existing state.
events = stream.get()
changes = domainLogic(events)
stream.appendAt(count(events), changes)
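A minimal sketch of what an appendAt with an expected-version check could look like on top of a relational store (the events table, its columns, and the unique constraint on (stream_id, version) are assumptions, not a specific event store's API):
import { Pool } from "pg";

const pool = new Pool();

async function appendAt(streamId: string, expectedVersion: number, changes: object[]): Promise<void> {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    for (const [i, payload] of changes.entries()) {
      // The unique constraint on (stream_id, version) rejects a concurrent writer
      // that appended at the same position first, protecting the invariant check.
      await client.query(
        "INSERT INTO events (stream_id, version, payload) VALUES ($1, $2, $3)",
        [streamId, expectedVersion + i + 1, JSON.stringify(payload)]
      );
    }
    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;                      // a concurrent modification surfaces here
  } finally {
    client.release();
  }
}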
If the API of your appliance doesn't allow you to specify a write position, then yes - there's the danger of a data race when some other writer changes the position of the stream between your query for the position and your attempt to write. Data obtained in a query is always stale; unless you hold a lock you can't be sure that the data hasn't changed at the source while you are reading your local copy.
I guess you shouldn't think about the event version.
If you're talking about the position in the event stream, then in general there's no guaranteed way to determine it at creation time, only at processing time or inside the event store.
If it is really about the event version (see http://cqrs.nu/Faq, "How do I version/upgrade my events?"), then you have it hardcoded in your application. I mean the following use case:
First, you have an app generating some events. Later, you update the app and the events change (you add some fields or change the payload structure) but keep their logical meaning. So now you have old events in your event store and new events that differ significantly from the old ones. To distinguish one from the other, you use an event version, e.g. 0 and 1.
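A tiny sketch of that kind of versioning (the event name and fields are made up): version 0 stored a single name field, version 1 splits it into first and last name, and old events are upcast when read.
type PlayerRegisteredV0 = { version: 0; name: string };
type PlayerRegisteredV1 = { version: 1; firstName: string; lastName: string };

// Upcaster: bring any stored event up to the latest schema before handling it.
function upcast(event: PlayerRegisteredV0 | PlayerRegisteredV1): PlayerRegisteredV1 {
  if (event.version === 1) return event;
  const [firstName, ...rest] = event.name.split(" ");
  return { version: 1, firstName, lastName: rest.join(" ") };
}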
