Skype Conference Procedure - p2p

I've been looking into skypes protocol or what people can make out since its a propriety protocol. I've read "An analysis of the skype peer-to-peer internet telephony protocol", though it is old it discusses a certain property which I'm looking to recreate in my own architecture. What I'm interested in is during video a conference, data is sent to one machine (the one most likely with the best bandwidth and processing power) which then redistributes to the other machines.
What is not explained is what happens when the machine receiving and sending the data has unexpectedly dropped out. Of course rather than drop the conference it would be best to find another machine to carry on receiving and distributing the data. Is there any documentation on how this performed on skype or a similar peer-to-peer VoIP?
Basically I'm looking for the fastest method to detect when a "super peer" unexpectedly drops out and quickly migrating operations to another machine.

You need to set a timeout (i.e., limit) and declare that if you don't receive communication within then, the communication is either dead (no path between the peers, reachability issue) or the remote peer is down. There is no other method.
If you have direct tcp or other connection to the super peer, you can catch events telling you the connection dies too. If your communication is relayed, and your framework automatically attempt to find a new route to your target peer, it will either find one or never find out. Hence, the necessity for a timeout.
If none hears about someone for some time, they are finally considered/declared dead.

Related

WebRTC Mobile - Audio not working unless on same wifi

I am using react-native-webrtc to handle the WebRTC portion of this.
I am using Websockets to signal and using ICE trickling to keep track of the ICE candidates.
I queue my ICE candidates until setLocalDescription has been called on the callee side. Then I addIceCandidate for each candidate in the queue.
On the caller side I am doing the same thing and not processing my ICE candidates until setRemoteDescription has been called.
I am only doing audio so no video being used.
When I test this with two mobile devices on the same network I have no issues.
But if I disconnect one device from the WiFi the calls still connect just fine except the audio cannot be heard on either device.
The onConnectionStateChange handler will still return "connected" and the onIceGatheringStateChanged will still return "complete".
I thought maybe I needed to use a TURN server to get this working so I started using Twilio's paid TURN/STUN server but the issue is still persisting.
Any ideas what to look into?
BACKGROUND
Ok, so you have to take some background on P2P connection on RTC platforms. And so, it begins (in very short version):
In order to establish connection you have to establish direct connection between two clients (how obviously, I know). In order to find this routes you need help on network servers.
And that's why you setup local SDP with setting, to which server we can access. ICE, TURN, STUN (you can find any information, for ex. this one). Now ICE candidates most obvious one, because this server endpoints within your local network and that's why your version is not working with different network.
Right, you have to use TURN/STUN to find NAT and correct routes between peers. Most TURN server are private and paid, but for less loaded application you might use public STUN servers, that would be more then enough.
You can find many available over there. One ex. is here.
stun.l.google.com:19302
stun1.l.google.com:19302
stun2.l.google.com:19302
SOLUTION
Now coming to your problem. If you think you have connected your devices with your signaling it doesn't mean you connected devices. (It's just to clarify, if you don't have media on your devices your RTC connection failed to establish, and it's not just audio).
The problem in using it's TURN/STUN servers on your devices, and you have to trace SDP which established during setRemoteDescription and check the servers were included. Furthermore there is always a Google demo which is working perfectly.
UPDATE
In order to trace how remote SDP will be set and connection establish oyu have to print candidates which will be used to setup. To do that, you have to print information which candidates gathered during setLocalDescription and setRemoteDescription.
In place where you are gathering candidates add logging to print information. You have to see, that STUN, TURN candidates will be there. Below ex in Java. Word ICE shouldn't bother you, because it's just means that candidates AFTER ICE traversing will be found.
// Listen for local ICE candidates on the local RTCPeerConnection
peerConnection.addEventListener('icecandidate', event => {
if (event.candidate) {
// Here should be your part where you are sending this candidate to your signaling channel
// Add logging to print entire candidate information. You should see some data related to ICE, TURN.
}
});

Do I need to concern security about my game server? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
I have a game server where clients can connect and communicate with via TCP. Any device can connect the server if it knows the IP and port.
I am wondering if I need to add some security to the server. For example,
(1) Add some encryption for the messages sent/received. (To prevent the protocol content is revealed)
(2) Add some key to the message so if the server cannot recognize the key after decryption, the message will be dropped. (To prevent unknown connections/messages flooding in)
Do you think these things are necessary and is there any other thing I should add for such a game server.
I would have rather posted this to the gamedev question but the mods there are apparently faster than here. Before you quote me, I'd like to point out that the following isn't based on 100% book-knowledge, nor do I have a degree in any of these topics. Please improve this answer if you know better, rather than comment and/or compete.
This is a pretty comprehensive list of client/server/security issues that I've gathered from research and/or experience:
Data
The "back-end" server contains everyone's username, password, credit card details, etc., and should be a fortress. This server is for authentication only and should be on a private subnet; it will communicate only with the login server, only when a well-formed login request is received, and will only reply with "allow" or "deny". If you take people's personal information, you are obligated to protect it, and it would be wise to off-load the liability of everything security-related to a professional or hosting company. There is no non-critical attack to this server; if it is breached, you are finished. Many/most/all companies now draw their pretty login screen on top of another companies' back-end credit card/billing system.
Login
Connections to the login server should be secure. The login server is just a message pump between the public login mechanism, the private data store, and the client/server connection state. For security purposes, any HTTP access to the login system should be hosted on a separate HTTP server; the WWW server crashing should not shut down your online game (my opinion).
World/UDP
Upon successful login and authentication, the server informs the client to begin listening for "bulk data" or to initiate an in-bound connection on a specific UDP port (could be random and per-connection-attempt). Either way, the server should remain silent and wait for the client to IDENT with some type of handshake to verify that the "alleged client" is actually your code. It is easier to guess when the server asks for input sequentially; instead rely on the client knowing the proper handshake when connecting to the world and drop those that don't. The correct handshake to use can be a function of the CPU clock-ticks or whatever. The TCP will be minimally used and/or disconnected from that point on. The initial bulk data is a good place to advertise the current server-side software revision so clients that are out-of-date can update. A common pool of UDP ports can be handed out among multiple servers and the clients can be load-balanced into the correct port/server. Within the game, "zone transfers" can mean a literal disconnect from one server/port and reconnection to a different server/port. In MMO's, this usually appears as a <2 second loading screen; enough time to disconnect, reconnect, start getting data, and synchronize to the new server clock, not to mention the actual content loading.
"World server" describes a single, multiple-client, state-pumping thread running on a single core of a single processor of a single blade. One, physical, server-of-worlds can have many worlds running on it at once. Worlds can be dynamically split/merged (in a quad-tree fashion), dividing the clients between them, again, for load-balancing; synchronization between the servers occurs at LAN speeds or better. The world server will probably only serve UDP connections and should have nothing to do except process state-changes to/from the UDP connections. UDP is "blind, deaf, and dumb", so-to-speak. Messages are sent with no flow control, no error checking, etc; they are basically assumed to be received as soon as they are sent and may actually arrive late, in the wrong order, or just never arrive. Using UDP, neither the server nor the client are ever stalled, hand-shaking, error-correcting, or waiting for data. Messages need time-stamps because they may arrive late and/or out-of-order. If a UDP channel gets clogged, switch valid clients dynamically to another (potentially random) port. The world server only initiates UDP connections with successfully authenticated clients and ignores all other traffic (world servers hosted separately from HTTP and everything else).
Overly simplified and, using only the position data as an example, each client tells the server "Time:Client###:(X, Y)" over and over. If the server doesn't hear, oh well. The server says "Time:listOfClients(X, Y)" over and over, to everyone at once. If one or more of the clients doesn't hear, oh well.
This implies using prediction/extrapolation on the client; the clients will need to "guess" what should be happening and then correct themselves to agree with the server when they start getting data again. Any time you get a packet with a "future" time, even if the packet doesn't make sense or isn't useful, you can at least advance the client clock to that point and discard any now-late packets, helping a lagging client to catch up.
Un-verified supposition:
Besides the existing security concerns, I don't see a reason why two or more clients could not maintain independent, but server-managed, UDP channels between each other. By notifying other clients within close game-proximity in addition to the server, the clients, themselves, can help to load-balance. The server should always verify that what the clients say happened could/should/would happen, and has the ability to undo all of it and reset both clients to it's own known-good state. The information that the clients are able to share, internally, should be extremely restricted; basically just the most-time-critical positional and/or state-data. Client's should probably not be allowed to request specific information and, again, rely only on "dumb" broadcasts. This begins to approach distributed/cloud computing, where the clients are actually doing a lot of the server work, while the server just watches and "referees," calling foul, when appropriate.
Client1 - "I fought Client2 and won"
Client2 - "I fought Client1 and won"
Server - "I watched and Client2 cheated. Client1 wins. (Client2 is forced to agree)"
The server doesn't necessarily even need to watch; if Client2 damages Client1 in an unusual/impossible way, Client1 can request arbitration from the server.
Side-effects
If the player moves around, but the data isn't getting to the server, the player experiences "rubber-banding", where the player appears to be moving on the client but, server-side, they are not. When the client gets the next server state, the client snaps the player back to where they were when the server stopped getting updates, creating the rubber-band effect.
This often manifests another way, too. If the server sees a player moving, then fails to receive the "stopped moving" message, the server will predict their continued path for all of the other clients. In MMO-RPG's, for example, you can see "lagging" players running directly into/at walls.
Holes
The last thing I can think of is just basic code security. This is especially important if your game is moddable. Mods are, by definition, a way for users to insert their own code into yours. If you are careless about the amount of "API" access you give away, inevitably, someone WILL feel the need to be malicious. Pay particular attention to string termination/handling if the language you are using requires it. Do not build your game from plain-text ASCII content files. If your game has even one "text box," someone WILL be trying to feed HTML/LUA/etc. code into it.
Lastly, paths should use appropriate system variables whenever possible to avoid platform shenanigans and/or access violations (x86/x64, no savegames in ProgramFiles, etc.)

Hard downsides of long polling?

For interactive web apps, things like Websockets are getting more popular. However, as the client, and proxy world is not always fully compliant, one usually use a complex framework like 'Socket.IO', hiding several different mechanisms for any case that may disable the other ones.
I just wonder what the downsides of a properly implemented long polling are, because with today's servers like node.js it is quite easy to implement and relies on old http technology which is well supported (despite the long polling behaveiour itself may break it).
From an high level view, long polling (despite some additional overhead, feasable for medium traffic apps) resembles a true push behaviour as WebSockets do, as the server actually send it's answer whenever he likes (despite some timeout / heartbeat mechanism).
So we have some more overhead due to the more TCP/IP acknowledgements I guess, but no constant traffic like frequent polling would do.
And using an event driven server, we would have no thread overhead to keep the connections blocked.
So is there any else hard downside that forces medium-traffic apps like chats to use WebSockets rather than long polling?
Overhead
It will create a new connection each time, so it will send the HTTP headers... including the cookie header that may be large.
Also, just "check if there is something new" is another connection for nothing. Connections implies the work of many items like firewalls, load balancers, web servers ... etc.. Probably, establish the connection is most time consuming thing as soon your IT infrastructure have several inspectors.
If you are using HTTPS, you are doing again and again the most expensive operation, the TLS handshake. TLS performance is good once the connection is established and the symmetric encryption is working, but the process of establishing the connection, key exchange and all that jazz is not fast.
Also, when connections are done, log entries are written somewhere, counters are incremented somewhere, memory is consumed, objects are created... etc... etc.. For example, the reason why we have different logging configurations when in production and in development, is because writing log entries also affect performance.
Presence
When is a long polling user connected or disconnected? If you check for this at a given moment of time... what would be the reliable amount of time you should wait to double check, to ensure it is disconnected or connected?
This may be totally irrelevant if your application just broadcast stuff, but it may be very relevant if your application is a game.
Not persistent
This is the big deal.
Since a new connection is created each time, if you have load balanced servers, in a round robbin scenario you cannot know in which server the next connection is going to fall.
When a user's server is known, like when using a WebSocket, you can push the events to that server straight away, and the server will relay them to the connection. If the user disconnects, the server can notify straight away that the user is not connected anymore, and when connect again can subscribe again.
If the server where the user is connected at the moment that an event for him is generated is unknown, you have to wait for the user to connect so then you can say "hey, user 123 is here, give me all the news since this timestamp", what make it a little bit more cumbersome. Long polling is not really push technology, but request-response, so if you plan for a EDA architecture, at some point you are going to have some level of impedance you have to address, like for example, you need a event aggregator that can give you all the events from a given timestamp (the last time that user connected to ask for news).
SignalR (I guess it is the equivalent in .NET to socket.io) for example, has a message bus named backplane, that relay all the messages to all the servers, as key for scaling out. Therefore, when a user connect to other server, "his" pending events are there "as well"(!) It is a "not too bad approach", but as you can guess, affects the throughput:
Limitations
Using a backplane, the maximum message throughput is lower than it is
when clients talk directly to a single server node. That's because the
backplane forwards every message to every node, so the backplane can
become a bottleneck. Whether this limitation is a problem depends on
the application. For example, here are some typical SignalR scenarios:
Server broadcast (e.g., stock ticker): Backplanes work well for this
scenario, because the server controls the rate at which messages are
sent.
Client-to-client (e.g., chat): In this scenario, the backplane might
be a bottleneck if the number of messages scales with the number of
clients; that is, if the rate of messages grows proportionally as more
clients join.
High-frequency realtime (e.g., real-time games): A backplane is not
recommended for this scenario.
For some projects, this may be a showstopper.
Some applications just broadcast general data, but others have a connection semantics, like for example a multiplayer game, and it is important to send the right events to the right connections.
IMHO
Long polling is a good solution for small projects, but became a big burden for high scalable apps that need high frecuency and/or very segmented event sending.
I implemented a Node.js Express server that supported long polling. The biggest mistake I made was not cleaning up the requests which caused slowing down the server. If your server doesn't support concurrency or threads, one of the essential tasks is to set the appropriate timeouts for the requests/responses to release them from the loop, which you have to do by yourself.
Edit: Also you need to keep in mind that browsers have their specific limit for the number of connections (i.e. 6 per hostname for Google Chrome). So if you have too many long polling connections at the same time, you will probably block yourself.

What is the need for the SIP RE-INVITE with regards to ICE?

I understand many of the fine details of NAT hole punching, ICE, and SIP VOIP calls. I've answered quite a few questions on SO on these topics. Now I have a question.
I am trying to understand the need for the RE-INVITE message that is documented for SIP+ICE after the call is already established.
Assume a topology of VOIP devices that signal over SIP and using ICE (with STUN/TURN) for establishing media connectivity. After the ICE connectivity checks are performed, both endpoints should have ascertained the best address candidate pairings (IP,port) and should be ready to stream media in both directions.
But my experience with SIP and plenty of documentation suggests that after the callee sends a 200 OK message to indicate he's in the answered state, the caller is to expected send a RE-INVITE with an SDP containing the specific address candidate selected by the connectivity checks.
Some links that describe the RE-INVITE with ICE are here and here (step 8). Rosenberg's tutorial (page 30) discusses that the RE-INVITE "ensures that middleboxes have the correct media address". I'm not sure why that's important.
Upon receiving a RE-INVITE, is the callee expected to reconfigure his ICE stack to switch sockets or addresses based on the new SDP received? Or is the RE-INVITE just a protocol formality to formally acknowledge the call has been established? If the RE-INVITE step was skipped and both sides started streaming media, what could go wrong?
The reason why I ask is because I am exploring using ICE over a signaling service that is not SIP. I'm trying to figure out if the RE-INVITE needs to be emulated.
One example of middlebox, which may be interested in the result of the ICE negotiation is a bandwidth manager.
Imagine an enterprise deployment, with endpoints inside the corporate firewall and others roaming around on the Internet, or behind private home firewalls. The deployment also includes a publicly accessible TURN server. Then let's imagine an endpoint inside the corporate firewall making a call. If the destination happens to be reachable on the same network, the call will go host to host and the TURN server will not be used (i.e. bindings will be de-allocated). This is a local network, and no bandwidth limitation needs to be imposed. On the other hand, if the call goes out to a roaming endpoint, then the TURN server will get involved, and data will flow through the corporate firewall, through what probably is a limited bandwidth uplink. We can very well imagine some policy that would want to limit the call's bandwidth. One way of doing it, is for the bandwidth manager to remain in the signaling path (using SIP record-routes) and look into the SDP of any INVITE going through, re-writing bandwidth values depending on media addresses.
I believe the general intention was, that legacy equipment of that type, which is not ICE aware, should keep working in a mixed environment introducing ICE endpoints. To keep this backwards compatibility, ICE should only introduce new processing (STUN checks, etc), but from the point of view of non-ICE-aware elements, it should not remove any existing rules (re-INVITEs are still needed to change the destination of a media stream).
In Rosenberg's tutorial I believe the re-INVITE is sent because the sockets chosen by ICE are different from those in the media and connection (m/c-lines) lines of the original SDP AND in order for any network elements that are in between the two user agents to be informed of the actual sockets that will be used RTP a re-INVITE is sent with the ICE selected socket(s) in the media and connection lines of the SDP.
If the ICE selected sockets were the same as the ones in the original INVITE request's SDP m/c lines then there would be no need for the re-INVITE request. Or if you know there are no "middleboxes" that need to be informed of the changes to the RTP sockets then there would also be no need to send the re-INVITE.

choose between tcp "long" connection and "short" connection for internal service

I got an app that web server re-direct some requests to backend servers, and the backend servers(Linux) will do complicated computations and response to web server.
For the tcp socket connection management between web server and backend server, i think there are two basic strategy:
"short" connection: that is, one connection per request. This seems very easy for socket management and simplify the whole program structure. After accept, we just get some thread to process the request and finally close this socket.
"long" connection: that is, for one tcp connection, there could be multi request one by one. It seems this strategy could make better use of socket resource and bring some performance improvement(i am not quite sure). BUT it seems this brings a lot of complexity than "short" connection. For example, since now socket fd may be used by multi-threads, synchronization must be involved. and there are more, socket failure process, message sequence...
Is there any suggestions for these two strategies?
UPDATE:, #SargeATM 's answer remind me that i should tell more about the backend service.
Each request is kind of context-free. Backend service can do calculation based on one single request message. It seems to be sth. stateless.
Without getting into the architecture of the backend which I think heavily influences this decision, I prefer short connections for stateless "quick" request/response type traffic and long connections for stateful protocols like a synchronization or file transfer.
I know there is some tcp overhead for establishing a new connection (if it isn't local host) but that has never been anything I have had to optimize in my applications.
Ok I will get a little into architecture since this is important. I always use threads not per request but by function. So I would have a thread that listened on the socket. Another thread that read packets off of all the active connections and another thread doing the backend calculations and a last thread saving to a database if needed. This keep things clean and simple. Easy to measure slow spots, maintain, and to optimize later when needed if needed.
What about a third option... no connection!
If your job description and job results are both of small size, UDP sockets may be a good idea. You have even less resources to manage, as there's no need to bound the request/response to a file descriptor, which give you some flexibility for the future. Imagine you have more backend services and would like to do some load balancing – a busy service can send the job to another one with UDP address of job submitter. The latter just waits for the result and doesn't care where you performed the task.
Obviously you'd have to deal with lost, duplicated and out of order packets, but as a reward you don't have to deal with broken connections. Out of order packets are probably not a big deal if you can fit the request and response in one UDP message, duplication can be taken care of by some job ids, and lost packet... well, they can be simply resent ;-)
Consider this!
Well, you are right.
The biggest problem with persistent connections will be making sure that app got "clean" connection from pool. Without any garbage left of data from another request.
There are a lot of ways to deal with that problem, but at the end it is better to close() tainted connection and open new one than trying to clean it...

Resources