I'm making a simple online game and I'm suffering from out-of-sync issues. The server is implemented with IOCP, and since the game will almost always be played on a LAN, the delay is relatively small.
The core algorithm of the networking can be described as below (there are 4 clients in a single game):
Clients send their actions and the elapsed time since the last frame to the server every frame, then wait until they get a response from the server.
The server collects all four clients' messages, concatenates them together, then sends the result to all four clients.
On receiving the response, clients update their game with the messages provided in the response.
Now, I can see that after some time the four games go out of sync. It can be observed that the game I'm controlling is different from the other three (which means the other three are the same as each other), and just walking around makes the problem happen.
Below is the code, in case it helps:
First, the server. Every message will be handled in a separate thread.
while(game_host[now_roomnum].Ready(now_playernum)) // wait for the last message to be taken away
{
    Sleep(1);
}
game_host[now_roomnum].SetMessage(now_playernum, recv_msg->msg);
game_host[now_roomnum].SetReady(now_playernum, true);
game_host[now_roomnum].SetUsed(now_playernum, false);
while(!game_host[now_roomnum].AllReady()) // wait for all clients' messages
{
    Sleep(1);
}
string all_msg = game_host[now_roomnum].GetAllMessage();
game_host[now_roomnum].SetUsed(now_playernum, true);
while(!game_host[now_roomnum].AllUsed()) // wait until all four responses are ready to send
{
    Sleep(1);
}
game_host[now_roomnum].SetReady(now_playernum, false); // ready to receive the next message
strcpy_s(ret.msg, all_msg.c_str());
And the clients' CGame::Update(float game_time) method:
CMessage msg = MakeMessage(game_time); // make a message with the actions taken in this frame (pushed into a queue) and the elapsed time between the two frames
CMessage recv = p_res_manager->m_Client._SendMessage(msg); // send the message and wait for the server's response
stringstream input(recv.msg);
int i;
rest_time -= game_time;
float game_times[MAX_PLAYER+1] = {0};
// analyze received operations
for(i=1; i<=MAX_PLAYER; i++)
{
    int n;
    input >> n;
    input >> game_times[i]; // read the number of actions n, and player[i]'s elapsed game time game_times[i]
    for(int oper_i = 1; oper_i <= n; oper_i++)
    {
        int now_event;
        UINT nchar;
        input >> now_event >> nchar;
        if(now_event == int(Event::KEY_UP))
            HandleKeyUpInUpdate(i, nchar);
        else // if(now_event == int(Event::KEY_DOWN))
            HandleKeyDownInUpdate(i, nchar);
    }
}
// update players
for(i=1; i<=MAX_PLAYER; i++)
{
    player[i].Update(game_times[i]); // something like s[i] = v[i] * game_time[i]
}
Thank you very much. I'll provide more detail if necessary.
Your general design is wrong; that's why you go out of sync at some point. The server should never, ever deal with the FPS of the clients. This is just a horrible design issue. In general, the server calculates everything based on the inputs the clients send to it, and the clients just request the current state of their surroundings from the server. That way you are FPS-independent on the server side, which means you can update the scene on the server as fast as possible, and the clients just retrieve the current state.
If you update the entities on the server per client, dependent on each client's FPS, you would have to keep a local copy of every entity for every client; otherwise it's impossible to handle the different delta times.
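Roughly, that first design looks like this (a sketch only; every type and helper here is made up, not from your code):
#include <queue>

// Sketch of the server-authoritative, fixed-timestep loop described above.
struct Input  { int player_id; int action; };

struct World
{
    void ApplyInput(const Input& in) { /* e.g. store the action for that player */ }
    void Step(float dt)              { /* advance every entity by the same dt  */ }
};

void RunServer(std::queue<Input>& pending, World& world)
{
    const float TICK = 1.0f / 60.0f;       // the server picks its own, fixed dt
    for (;;)
    {
        // apply whatever client inputs arrived since the last tick
        while (!pending.empty())
        {
            world.ApplyInput(pending.front());
            pending.pop();
        }

        world.Step(TICK);                  // only the server simulates

        // BroadcastState(world);          // clients just receive and render this state
        // SleepUntilNextTick();           // keep the loop at ~60 Hz
    }
}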
The other possible design is that your server just syncs the clients: every client calculates the scene on its own and then sends its state to the server. The server then distributes this to the other clients, and each client decides what to do with that information.
Any other design will lead to major problems; I highly recommend not using any other design.
If you have any further questions, feel free to ask.
Before everybody marks this as a duplicate, let me state that I know my fair share of network programming, and this question is my attempt to solve something that riddles me even after finding the "solution".
The setup
I've spent the last few weeks writing some glue code to incorporate a big industrial system into our current setup. The system is controlled by a Windows XP computer (PC A) which is controlled from an Ubuntu 14.04 system (PC B) by sending a steady stream of UDP packets at 2000 Hz. It responds with UDP packets containing the current state of the system.
Care was taken to ensure that the 2000 Hz rate was held, because there is a 3 ms timeout after which the system faults and returns to a safe state. This involves measuring and accounting for inaccuracies in std::this_thread::sleep_for. Measurements show that there is only a 0.1% deviation from the target rate.
The observation
Problems started when I started to receive the state response from the system. The controlling side on PC B looks roughly like this:
forever at 2000Hz {
send current command;
if ( socket.available() >= 0 ) {
receive response;
}
}
edit 2: Or in real code:
auto cmd_buf = ...
auto rsp_buf = ...
while (true) {
// prepare and send command buffer
cmd_buf = ...
socket.send(cmd_buf, endpoint);
if (socket.available() >= 0) {
socket.receive(rsp_buf);
// the results are then parsed and stored, nothing fancy
}
// time keeping
}
The problem is that, whenever the receiving portion of the code was present on PC B, PC A started to run out of memory within seconds when trying to allocate receive buffers. Additionally, it raised errors stating that the timeout was missed, which was probably due to packets not reaching the control software.
Just to highlight the strangeness: PC A is the PC sending UDP packets in this case.
Edit in response to EJP: this is the (now) working setup. It started out as:
forever at 2000Hz {
send current command;
receive response;
}
But by the time the (blocking) receive returned, the deadline was already missed. Hence the availability check.
Another thing that was tried was to receive in a separate thread:
// thread A
forever at 2000Hz {
send current command;
}
// thread B
forever {
receive response;
}
Which displays the same behavior as the first version.
The solution
The solution was to set the socket on PC B to non blocking mode. One line and all problems were gone.
I am pretty sure that even in blocking mode the deadline was met. There should be no performance difference between blocking and non-blocking mode when there is just one socket involved. Even if checking the socket for available data takes a few microseconds more than in non-blocking mode, it shouldn't make a difference when the overall deadline is met accurately.
Now ... what is happening here?
If I read your code correctly and referring to this code:
forever at 2000Hz {
send current command;
receive response;
}
Examine the difference between the blocking and the non-blocking socket. With a blocking socket you send the current command and then you are stuck waiting for the response. By that time, I would guess, you have already missed the 2 kHz goal.
With a non-blocking socket you send the current command and try to receive whatever is in the receive buffers, but if there is nothing there you return immediately and continue your tight 2 kHz send loop. This explains to me why your industrial control system works fine with the non-blocking code.
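For reference, here is a minimal sketch of that one-line change, assuming a plain Boost.Asio UDP socket (your socket object may well be a custom wrapper, so treat the names as illustrative):
#include <array>
#include <boost/asio.hpp>

int main()
{
    boost::asio::io_service io;
    boost::asio::ip::udp::socket socket(io);
    socket.open(boost::asio::ip::udp::v4());

    socket.non_blocking(true); // the "one line": receive() no longer waits

    std::array<char, 1500> rsp_buf;
    boost::system::error_code ec;
    socket.receive(boost::asio::buffer(rsp_buf), 0, ec);
    if (ec == boost::asio::error::would_block)
    {
        // nothing waiting yet -- carry on with the 2000 Hz send loop
    }
}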
I am trying out a simple NodeJS app so that I can understand its async nature.
But my problem is that as soon as I hit "/home" from the browser, it waits for the response, and when "/" is hit at the same time, it waits for "/home"'s response first and only then responds to the "/" request.
My concern is that if one of the requests needs heavy processing, we can't serve another one in parallel? Is this correct?
app.get("/", function(request, response) {
console.log("/ invoked");
response.writeHead(200, {'Content-Type' : 'text/plain'});
response.write('Logged in! Welcome!');
response.end();
});
app.get("/home", function(request, response) {
console.log("/home invoked");
var obj = {
"fname" : "Dead",
"lname" : "Pool"
}
for (var i = 0; i < 999999999; i++) {
for (var i = 0; i < 2; i++) {
// BS
};
};
response.writeHead(200, {'Content-Type' : 'application/json'});
response.write(JSON.stringify(obj));
response.end();
});
Good question,
Now, although Node.js has its asynchronous nature, this piece of code:
for (var i = 0; i < 999999999; i++) {
    for (var i = 0; i < 2; i++) {
        // BS
    };
};
is not asynchronous; it actually blocks the Node main thread. Therefore, all other requests have to wait until this big for loop ends.
In order to do some heavy calculations in parallel I recommend using setTimeout or setInterval to achieve your goal:
var i = 0;
var interval = setInterval(function() {
    if (i++ >= 999999999) {
        clearInterval(interval);
    }
    //do stuff here
}, 5);
For more information I recommend searching for "Node.js event loop"
As Stasel stated, code like that will block the event loop. Basically, whenever JavaScript is running on the server, nothing else is running. Asynchronous I/O events such as disk I/O might be processing in the background, but their handlers/callbacks won't be called until your synchronous code has finished running. As soon as it has finished, Node will check for pending events and call their handlers respectively.
You actually have a couple of choices to fix this problem.
Break the work into pieces and let the pending events be executed in between (see the sketch after these options). This is almost the same as Stasel's recommendation, except that 5 ms between single iterations is huge; for something like 999999999 items, that takes forever. I suggest processing the loop in batches and scheduling the next batch with setImmediate. setImmediate basically schedules the callback after the pending I/O events are handled, so if there is no new I/O event to handle (like no new HTTP requests) it will execute immediately, which is fast enough. The question then becomes how much processing to do per batch. I suggest first measuring manually how long the work takes on average, and scheduling about 50 ms of work per batch: for example, if you find that 1000 items take 100 ms, let each batch process 500 items, so it will be about 50 ms. You can break it down further, but the more you break it down, the more time it takes in total, so be careful. Also, since you are processing a huge number of items, try not to create too much garbage, so the garbage collector won't block things much. In this not-so-similar question, I've explained how to insert 10000 documents into MongoDB without blocking the event loop.
Use threads. There are actually a couple of nice thread implementations that won't let you shoot yourself in the foot. This is really a good idea for this case if you are looking for performance on huge processing jobs, since, as I said above, it is tricky to make a CPU-bound task play nicely with the other things happening in the same process; asynchronous events are perfect for I/O-bound tasks, not CPU-bound tasks. There's the nodejs-threads-a-gogo module you can use. You can also use node-webworker-threads, which is built on threads-a-gogo but exposes the Web Worker API. There's also nPool, which looks a bit nicer but is less popular. They all support thread pools, and implementing a work queue with them should be straightforward.
Use several processes instead of threads. This might be slower than threads, but for huge jobs it is still way better than iterating in the main process. There are different ways to do it. Using processes gives you a design that you can extend to multiple machines instead of just multiple CPUs. You can use a job queue (basically, a worker pulls the next job from the queue whenever it finishes one), a multi-process map-reduce or AWS Elastic MapReduce, or the Node.js cluster module. With the cluster module you can listen on a Unix domain socket in each worker and, for each job, just make a request to that socket; whenever the worker finishes processing the job, it writes back to that particular request. You can search for this stuff; many implementations and modules already exist. You can use 0MQ, RabbitMQ, Node's built-in IPC, Unix domain sockets, or a Redis queue for multi-process communication.
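To make the first option concrete, here is a minimal sketch of batch processing with setImmediate (the names processInBatches and processOne, and the batch size of 500, are just illustrative assumptions; measure your own workload and tune the batch size to roughly 50 ms):
// Process `items` in batches so pending I/O events can run in between.
// BATCH_SIZE is a placeholder -- tune it so one batch takes about 50 ms.
function processInBatches(items, processOne, done) {
    var BATCH_SIZE = 500;
    var index = 0;

    function runBatch() {
        var end = Math.min(index + BATCH_SIZE, items.length);
        for (; index < end; index++) {
            processOne(items[index]);
        }
        if (index < items.length) {
            setImmediate(runBatch); // let pending I/O be handled, then continue
        } else {
            done();                 // all items processed
        }
    }

    runBatch();
}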
I am using socket.io to send packets via websockets. They seem to disappear from time to time, so I have to implement some sort of acknowledgement system. My idea was to immediately respond to a packet with an ACK packet. If the server does not receive this ACK packet within a given time, it will resend the packet (up to 3 times, then disconnect the socket).
My first thought was to start a timer (setTimeout) after sending a packet. If the timeout fires, the packet has to be sent again. If the ACK arrives, the timeout is cleared. Quite easy and short.
var io = require('socket.io').listen(80);
// ... connection handling ...
function sendData(someData, socket) {
    // TODO: Some kind of counter to stop after 3 tries.
    socket.emit("someEvent", someData);
    var timeout = setTimeout(function(){ sendData(someData, socket); }, 2000);
    socket.on("ack", function(){
        // Everything went ok.
        clearTimeout(timeout);
    });
}
But I will have 1k-3k clients connected, with a lot of traffic. I can't imagine that 10k timers running at the same time are manageable by NodeJS. Even worse: I read that NodeJS will not fire the event if there is no time for it.
How to implement a good working and efficient packet acknowledge system?
If socket.io is not reliable enough for you, you might want to consider implementing your own websocket interface instead of adding a layer on top of socket.io. But to answer your question, I don't think running 10k timers is going to be a big deal. For example, the following code ran in under 3 seconds for me and printed out the expected result of 100000:
var x = 0;
for (var i = 0; i < 100000; i++) {
setTimeout(function() { x++; }, 1000);
}
setTimeout(function() { console.log(x); }, 2000);
There isn't actually that much overhead for a timeout; it essentially just gets put in a queue until it's time to execute it.
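For what it's worth, a capped-retry variant of the sendData from the question could look roughly like this (a sketch only; the limit of 3 tries and the disconnect come from the question's description):
function sendData(someData, socket, attempt) {
    attempt = attempt || 1;
    socket.emit("someEvent", someData);

    var timeout = setTimeout(function() {
        if (attempt >= 3) {
            socket.disconnect();                     // give up after 3 tries
        } else {
            sendData(someData, socket, attempt + 1); // resend and wait again
        }
    }, 2000);

    socket.once("ack", function() {
        clearTimeout(timeout);                       // acknowledged in time
    });
}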
I read that NodeJS will not fire the event if there is no time for it.
This is a bit of an exaggeration; Node.js timers are reliable. A timer set by setTimeout will fire at some point. It may be delayed if the process is busy at the exact scheduled time, but the callback will be called eventually.
Quoted from Node.js docs for setTimeout:
The callback will likely not be invoked in precisely delay milliseconds. Node.js makes no guarantees about the exact timing of when callbacks will fire, nor of their ordering. The callback will be called as close as possible to the time specified.
I have a question regarding server-client data transmission. The data is sent by the client after satisfying a simple protocol. But I found there is a delay on the server side. The client and the server are tested on the same PC, which has an i5 CPU with an SSD and 8 GB of RAM.
The way I measured the delay is: after the client says "Sending", both sides record the current system time in milliseconds. The data itself is the current system time sent by the client. The server checks how much it is delayed on the server side. It starts from 0 ms, increases up to 90 ms, and stabilizes at 40 ms. I wonder whether this delay is normal.
Here is the code of the server (multi-threaded):
....
while(!ScriptWillAcessHere){
    inputLine = in.readLine();
    //Greetings
    if(i==0)
    {
        outputLine = SIMONSAYS.processInput(inputLine);
        out.println(outputLine);
    }
    if(inputLine.equals("Sending")){
        i = 1;
    }
    if(i>=1){ //Javascript will access this block
        if(i==1){
            StartTime = System.currentTimeMillis();
            System.out.println(StartTime);
            i++;
        }
        Differences = System.currentTimeMillis() - Double.parseDouble(inputLine);
        saveSvr.write(Double.toString(Differences)+"\n");
        ...
        //Checking elapsed time below:
    }
}
Here is the code of the client (single-threaded):
....
if(Client.equals("Sending"))
{
    while(bTimer)
    {
        ins++;
        local_time = System.currentTimeMillis();
        out.println(local_time);
        if(ins >= 100000)
        {
            out.println("End of Message");
            break;
        }
    }
}
Thanks,
This code must be removed from the while() loop. It causes heavy CPU usage and delay on the server side.
Differences = System.currentTimeMillis() - Double.parseDouble(inputLine);
Instead, if anyone needs to compare the server's local time with the client's local time, have the two sides ping each other first, then save both local times at the beginning of the transmission on the server.
If there is no delay in the hub, the ping will indicate at most 1 ms of delay, and both local times should be nearly identical.
Of course, the client's local time must be adjusted according to the server time; that's why we need to save both local times at the beginning of the transmission to find the offset.
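A rough sketch of that offset estimation, reusing the question's in/out streams (the "time?" message and variable names are made up for illustration):
// Estimate the clock offset once, before the bulk transmission starts.
long t0 = System.currentTimeMillis();          // server time just before asking
out.println("time?");                          // assumed: client replies with its local time
long clientTime = Long.parseLong(in.readLine().trim());
long t1 = System.currentTimeMillis();          // server time just after the reply
long rtt = t1 - t0;                            // round trip, ~1 ms on a quiet LAN
long offset = t1 - (clientTime + rtt / 2);     // add `offset` to client timestamps later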
Also, if the server is doing other tasks at the same time, there will be some delay, around 10-15 ms. If the transmission itself doesn't add any delay, the maximum delay of this operation should equal the server's internal delay. I found the server was also running other tasks at the same time and had at most 15 ms of delay caused by them. So, the total delay on the server is:
Total delay = server internal delay on other tasks + server internal delay on transmission thread + transmission delay.
Environment: WebSphere 6, Solaris box, thick client, Java web app.
The number of requests can be between 400 and 600. On each request to the server, I create 15 threads (using Java ExecutorService) to call 15 different web services simultaneously, group all the response data together, and send it back to the user.
The load test fails at around 150-170 users. CPU and memory spikes are seen in the DB servicing these web services, and after a very short period of time the app server crashes too.
The response time of a web service is 10-12 seconds at most and 4-6 seconds at least. The DB connection pool size is 40.
I am assuming that 150 requests create 150*15 = 2250 threads, spiking the app server's resources and hence crashing it. So I want to use the app server's thread pool with a thread count of, say, 100 (which may not be a good number). One thing that troubles me is that with 100 threads I can process the first 6 requests (6*15 = 90 threads) plus 10 calls of the 7th request; the following requests have to wait 10-15 seconds to get threads back and then another 10-15 seconds for their own web service calls. Is this approach even good?
Another idea was the asynchronous beans provided in WebSphere. Which one suits my requirement?
Please suggest! Calling one web service after another takes a total of 15 * (let's say 4 seconds per request) = 60 seconds, which is really bad. So calling the web services together is what I want to do.
Managing your threads in application servers is not recommended. If you are using EJBs, the spec disallows that.
Why don't you use a caching solution to improve the performance? The first few requests will be slower, but once the cache is hot everything will be very fast.
If caching the data is not feasible, what about changing the client to make multiple requests to the server, instead of splitting one request into multiple threads? You would need to change your web application so that each method calls one web service. The client would call (in parallel) each method needed for the current page and assemble the final result (it may be possible to display partial results if you wish). By doing this you will do the work in parallel and won't violate the spec.
I assume you have something like this in your server:
public Result retriveData(Long id) {
    Result myResult = new Result();
    //...
    //do some stuff
    myResult.setSomeData(slowWebService1.retriveSomeData(id));
    myResult.setSomeOtherData(slowWebService2.retriveSomeOtherData(id));
    myResult.setData(slowWebService3.retriveData(id));
    return myResult;
}
In your client:
Result result = webApplication.retriveData(10);
//use the result
My proposal is to split the calls into multiple methods:
public SomeData retriveSomeData(Long id) {
    //do some stuff
    SomeData data = slowWebService1.retriveSomeData(id);
    //do more stuff
    return data;
}

public SomeOtherData retriveSomeOtherData(Long id) {
    //do some stuff
    SomeOtherData data = slowWebService2.retriveSomeOtherData(id);
    //do more stuff
    return data;
}

public Data retriveData(Long id) {
    //do some stuff
    Data data = slowWebService3.retriveData(id);
    //do more stuff
    return data;
}
In your client:
//Call these methods in parallel. If you were using Swing, this could be done with
//SwingWorker (I have no idea how to do it with Flash :)).
//You can either wait for all methods to return or show partial results.
callInBackground(webApplication.retriveSomeData(10), useDataWhenDone);
callInBackground(webApplication.retriveSomeOtherData(10), useDataWhenDone);
callInBackground(webApplication.retriveData(10), useDataWhenDone);
By doing this you are calling only your web application, just like before, so there shouldn't be any security issues.
I am not familiar with WebSphere, so I can't tell whether using its asynchronous beans is better than this, but IMHO you should avoid starting threads manually.