I am using react-native-webrtc to handle the WebRTC portion of this.
I am using WebSockets for signaling and trickle ICE, so candidates are exchanged as they are gathered.
On the callee side I queue incoming ICE candidates until setLocalDescription has been called, then I call addIceCandidate for each candidate in the queue.
On the caller side I am doing the same thing and not processing remote ICE candidates until setRemoteDescription has been called.
I am only doing audio, so no video is being used.
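A simplified sketch of the queueing logic described above (names are illustrative, not my exact code):

// Queue remote candidates until the relevant description has been set
const pendingCandidates = [];
let descriptionSet = false;

function onRemoteIceCandidate(candidate) {
  if (descriptionSet) {
    peerConnection.addIceCandidate(candidate);
  } else {
    pendingCandidates.push(candidate);
  }
}

// Called once setLocalDescription (callee) / setRemoteDescription (caller) has resolved
function flushPendingCandidates() {
  descriptionSet = true;
  pendingCandidates.forEach(c => peerConnection.addIceCandidate(c));
  pendingCandidates.length = 0;
}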
When I test this with two mobile devices on the same network I have no issues.
But if I disconnect one device from the WiFi, the call still appears to connect fine, except the audio cannot be heard on either device.
The onConnectionStateChange handler still reports "connected" and onIceGatheringStateChanged still reports "complete".
I thought maybe I needed to use a TURN server to get this working, so I started using Twilio's paid TURN/STUN service, but the issue persists.
Any ideas what to look into?
BACKGROUND
OK, first some background on how a P2P connection is established on RTC platforms. So here it is (the very short version):
To establish a connection you have to establish a direct route between the two clients (obvious, I know). To find that route you need help from network servers.
That is why you set up your local SDP with the servers you can reach: ICE, TURN, STUN (you can find plenty of information on these, for example this one). Host ICE candidates are the most obvious ones, because those endpoints live inside your local network, which is exactly why your version does not work across different networks.
So yes, you have to use TURN/STUN to traverse NAT and find the correct routes between the peers. Most TURN servers are private and paid, but for a lightly loaded application public STUN servers are usually more than enough.
You can find many of them out there. One example is the set of public Google servers below (a minimal configuration sketch follows the list).
stun.l.google.com:19302
stun1.l.google.com:19302
stun2.l.google.com:19302
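Wiring those into the peer connection is just a matter of passing them as iceServers; a minimal sketch, assuming the standard RTCPeerConnection API (the commented TURN entry is a placeholder you would fill with your own credentials):

const peerConnection = new RTCPeerConnection({
  iceServers: [
    { urls: ['stun:stun.l.google.com:19302', 'stun:stun1.l.google.com:19302'] },
    // A TURN entry would look like this (placeholder values):
    // { urls: 'turn:turn.example.com:3478', username: 'user', credential: 'pass' }
  ]
});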
SOLUTION
Now to your problem. The fact that your signaling says the devices are connected does not mean the peer connection itself is established. (Just to clarify: if you have no media on your devices, the RTC connection failed to establish, and it is not just the audio.)
The problem is in how the TURN/STUN servers are used on your devices: you have to trace the SDP that is established during setRemoteDescription and check that server-derived candidates were included. Furthermore, there is always the Google demo, which works perfectly.
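For example, one rough way to check is to dump the candidate lines of the SDP you pass to setRemoteDescription and look for "typ srflx" (STUN) or "typ relay" (TURN) entries. This is only a sketch; with trickle ICE most candidates arrive separately, which is what the UPDATE below covers.

peerConnection.setRemoteDescription(remoteDescription).then(() => {
  remoteDescription.sdp
    .split('\r\n')
    .filter(line => line.startsWith('a=candidate'))
    .forEach(line => console.log(line)); // look for "typ srflx" / "typ relay"
});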
UPDATE
To trace how the remote SDP is set and the connection is established, you have to print the candidates that are used for the setup. To do that, print the information on the candidates gathered around setLocalDescription and setRemoteDescription.
In the place where you are gathering candidates, add logging to print the full candidate information. You should see STUN and TURN candidates in there. Below is an example in JavaScript. The word ICE shouldn't bother you; it just means these are the candidates found after ICE traversal.
// Listen for local ICE candidates on the local RTCPeerConnection
peerConnection.addEventListener('icecandidate', event => {
  if (event.candidate) {
    // Log the full candidate; look for "typ srflx" (STUN) and "typ relay" (TURN) entries
    console.log('local ICE candidate:', event.candidate.candidate);
    // This is also where you send the candidate to your signaling channel
  }
});
I have a game server that clients can connect to and communicate with via TCP. Any device can connect to the server if it knows the IP and port.
I am wondering if I need to add some security to the server. For example,
(1) Add some encryption for the messages sent/received (to prevent the protocol content from being revealed).
(2) Add some key to the messages, so that if the server cannot recognize the key after decryption, the message is dropped (to prevent unknown connections/messages from flooding in).
Do you think these things are necessary, and is there anything else I should add for such a game server?
I would have rather posted this to the gamedev site, but the mods there are apparently faster than here. Before you quote me, I'd like to point out that the following isn't based on 100% book knowledge, nor do I have a degree in any of these topics. Please improve this answer if you know better, rather than comment and/or compete.
This is a pretty comprehensive list of client/server/security issues that I've gathered from research and/or experience:
Data
The "back-end" server contains everyone's username, password, credit card details, etc., and should be a fortress. This server is for authentication only and should be on a private subnet; it will communicate only with the login server, only when a well-formed login request is received, and will only reply with "allow" or "deny". If you take people's personal information, you are obligated to protect it, and it would be wise to off-load the liability of everything security-related to a professional or hosting company. There is no non-critical attack to this server; if it is breached, you are finished. Many/most/all companies now draw their pretty login screen on top of another companies' back-end credit card/billing system.
Login
Connections to the login server should be secure. The login server is just a message pump between the public login mechanism, the private data store, and the client/server connection state. For security purposes, any HTTP access to the login system should be hosted on a separate HTTP server; the WWW server crashing should not shut down your online game (my opinion).
World/UDP
Upon successful login and authentication, the server informs the client to begin listening for "bulk data" or to initiate an in-bound connection on a specific UDP port (could be random and per-connection-attempt). Either way, the server should remain silent and wait for the client to IDENT with some type of handshake to verify that the "alleged client" is actually your code. It is easier to guess when the server asks for input sequentially; instead rely on the client knowing the proper handshake when connecting to the world and drop those that don't. The correct handshake to use can be a function of the CPU clock-ticks or whatever. The TCP will be minimally used and/or disconnected from that point on. The initial bulk data is a good place to advertise the current server-side software revision so clients that are out-of-date can update. A common pool of UDP ports can be handed out among multiple servers and the clients can be load-balanced into the correct port/server. Within the game, "zone transfers" can mean a literal disconnect from one server/port and reconnection to a different server/port. In MMO's, this usually appears as a <2 second loading screen; enough time to disconnect, reconnect, start getting data, and synchronize to the new server clock, not to mention the actual content loading.
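A rough sketch of that "drop anything that doesn't know the handshake" idea, using Node's dgram module (the handshake string and port are purely illustrative; derive yours however you like):

const dgram = require('dgram');

const server = dgram.createSocket('udp4');
const EXPECTED_HANDSHAKE = 'HELLO:rev42'; // illustrative; e.g. a function of revision/clock-ticks

server.on('message', (msg, rinfo) => {
  // Silently drop anything that does not open with the expected handshake
  if (!msg.toString().startsWith(EXPECTED_HANDSHAKE)) return;
  // ...mark rinfo.address:rinfo.port as a verified client and start pumping state
});

server.bind(41234); // example port handed out at login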
"World server" describes a single, multiple-client, state-pumping thread running on a single core of a single processor of a single blade. One, physical, server-of-worlds can have many worlds running on it at once. Worlds can be dynamically split/merged (in a quad-tree fashion), dividing the clients between them, again, for load-balancing; synchronization between the servers occurs at LAN speeds or better. The world server will probably only serve UDP connections and should have nothing to do except process state-changes to/from the UDP connections. UDP is "blind, deaf, and dumb", so-to-speak. Messages are sent with no flow control, no error checking, etc; they are basically assumed to be received as soon as they are sent and may actually arrive late, in the wrong order, or just never arrive. Using UDP, neither the server nor the client are ever stalled, hand-shaking, error-correcting, or waiting for data. Messages need time-stamps because they may arrive late and/or out-of-order. If a UDP channel gets clogged, switch valid clients dynamically to another (potentially random) port. The world server only initiates UDP connections with successfully authenticated clients and ignores all other traffic (world servers hosted separately from HTTP and everything else).
Overly simplified, and using only the position data as an example: each client tells the server "Time:Client###:(X, Y)" over and over. If the server doesn't hear it, oh well. The server says "Time:listOfClients(X, Y)" over and over, to everyone at once. If one or more of the clients doesn't hear it, oh well.
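A bare-bones sketch of that fire-and-forget exchange, again with Node's dgram (the sockets are created as in the previous sketch; the message format, tick rate, and names like player and knownClients are illustrative):

// Client side: blast position at the server on a fixed tick; no retries, no acks
setInterval(() => {
  const msg = `${Date.now()}:Client42:(${player.x},${player.y})`;
  client.send(Buffer.from(msg), SERVER_PORT, SERVER_HOST); // if it is lost, oh well
}, 50);

// Server side: broadcast the latest known state of everyone, to everyone, over and over
setInterval(() => {
  const state = `${Date.now()}:${JSON.stringify(clientPositions)}`;
  for (const { address, port } of knownClients) {
    server.send(Buffer.from(state), port, address); // if a client misses it, oh well
  }
}, 50);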
This implies using prediction/extrapolation on the client; the clients will need to "guess" what should be happening and then correct themselves to agree with the server when they start getting data again. Any time you get a packet with a "future" time, even if the packet doesn't make sense or isn't useful, you can at least advance the client clock to that point and discard any now-late packets, helping a lagging client to catch up.
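As a sketch of that clock-advance rule (serverClock, isUseful, and applyUpdate are placeholders):

function onPacket(packet) {
  if (packet.time < serverClock) return; // now-late packet: discard it
  serverClock = packet.time;             // even a not-useful packet advances the clock
  if (isUseful(packet)) applyUpdate(packet);
}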
Un-verified supposition:
Besides the existing security concerns, I don't see a reason why two or more clients could not maintain independent, but server-managed, UDP channels between each other. By notifying other clients within close game-proximity in addition to the server, the clients, themselves, can help to load-balance. The server should always verify that what the clients say happened could/should/would happen, and has the ability to undo all of it and reset both clients to its own known-good state. The information that the clients are able to share, internally, should be extremely restricted; basically just the most time-critical positional and/or state data. Clients should probably not be allowed to request specific information and, again, should rely only on "dumb" broadcasts. This begins to approach distributed/cloud computing, where the clients are actually doing a lot of the server work, while the server just watches and "referees," calling foul when appropriate.
Client1 - "I fought Client2 and won"
Client2 - "I fought Client1 and won"
Server - "I watched and Client2 cheated. Client1 wins. (Client2 is forced to agree)"
The server doesn't necessarily even need to watch; if Client2 damages Client1 in an unusual/impossible way, Client1 can request arbitration from the server.
Side-effects
If the player moves around, but the data isn't getting to the server, the player experiences "rubber-banding", where the player appears to be moving on the client but, server-side, they are not. When the client gets the next server state, the client snaps the player back to where they were when the server stopped getting updates, creating the rubber-band effect.
This often manifests another way, too. If the server sees a player moving, then fails to receive the "stopped moving" message, the server will predict their continued path for all of the other clients. In MMO-RPG's, for example, you can see "lagging" players running directly into/at walls.
Holes
The last thing I can think of is just basic code security. This is especially important if your game is moddable. Mods are, by definition, a way for users to insert their own code into yours. If you are careless about the amount of "API" access you give away, inevitably, someone WILL feel the need to be malicious. Pay particular attention to string termination/handling if the language you are using requires it. Do not build your game from plain-text ASCII content files. If your game has even one "text box," someone WILL be trying to feed HTML/LUA/etc. code into it.
Lastly, paths should use appropriate system variables whenever possible to avoid platform shenanigans and/or access violations (x86/x64, no savegames in ProgramFiles, etc.)
Sorry about my English. I have a problem with WebRTC. My application works correctly on the same network, but across different networks it does not.
Technologies that I use:
socket.io
node
coffeescript
gulp
zenserver
I pushed my code to this GitHub repository: github/oihi08/webrtc
I don't know why the application does not work across different networks. I uploaded it to a server and tried it, and nothing. But on the same network it works.
Thank you so much!!
It sounds like you aren't using a STUN/TURN server. There are a few steps to create a connection between two devices. One of these steps is to select one or more STUN/TURN servers (like "stun:stun.l.google.com:19302" for example). This server will be used to create a connection between peers, even when there is a firewall in the way on one or both ends.
When you set up one or more STUN/TURN servers, you will see that ICE candidates start being generated. The callback function peerConnection.onicecandidate will be called for every ICE candidate that is generated. When the library is done generating ICE candidates, it calls the callback one more time with null as the candidate; this flags the end of the list of candidates.
You need to get these ice candidates across to the other peer somehow, usually through the same signaling server you use to create the connection in the first place. When they arrive at the other side you need to call peerconnection.addIceCandidate.
If you do these steps, you will be able to get a proper connection, even across networks with strict NAT types.
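Putting those steps together, a minimal sketch might look like this (signalingChannel is a stand-in for whatever transport you use between the peers):

const pc = new RTCPeerConnection({
  iceServers: [{ urls: 'stun:stun.l.google.com:19302' }]
});

pc.onicecandidate = event => {
  if (event.candidate) {
    signalingChannel.send({ candidate: event.candidate }); // forward to the other peer
  } else {
    // null candidate: gathering for this negotiation is complete
  }
};

signalingChannel.onmessage = async message => {
  if (message.candidate) {
    await pc.addIceCandidate(message.candidate); // candidate received from the other peer
  }
};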
I understand many of the fine details of NAT hole punching, ICE, and SIP VOIP calls. I've answered quite a few questions on SO on these topics. Now I have a question.
I am trying to understand the need for the RE-INVITE message that is documented for SIP+ICE after the call is already established.
Assume a topology of VOIP devices that signal over SIP and using ICE (with STUN/TURN) for establishing media connectivity. After the ICE connectivity checks are performed, both endpoints should have ascertained the best address candidate pairings (IP,port) and should be ready to stream media in both directions.
But my experience with SIP and plenty of documentation suggest that after the callee sends a 200 OK message to indicate he's in the answered state, the caller is expected to send a RE-INVITE with an SDP containing the specific address candidate selected by the connectivity checks.
Some links that describe the RE-INVITE with ICE are here and here (step 8). Rosenberg's tutorial (page 30) discusses that the RE-INVITE "ensures that middleboxes have the correct media address". I'm not sure why that's important.
Upon receiving a RE-INVITE, is the callee expected to reconfigure his ICE stack to switch sockets or addresses based on the new SDP received? Or is the RE-INVITE just a protocol formality to formally acknowledge the call has been established? If the RE-INVITE step was skipped and both sides started streaming media, what could go wrong?
The reason why I ask is because I am exploring using ICE over a signaling service that is not SIP. I'm trying to figure out if the RE-INVITE needs to be emulated.
One example of a middlebox that may be interested in the result of the ICE negotiation is a bandwidth manager.
Imagine an enterprise deployment, with endpoints inside the corporate firewall and others roaming around on the Internet, or behind private home firewalls. The deployment also includes a publicly accessible TURN server. Then let's imagine an endpoint inside the corporate firewall making a call. If the destination happens to be reachable on the same network, the call will go host to host and the TURN server will not be used (i.e. bindings will be de-allocated). This is a local network, and no bandwidth limitation needs to be imposed. On the other hand, if the call goes out to a roaming endpoint, then the TURN server will get involved, and data will flow through the corporate firewall, through what probably is a limited bandwidth uplink. We can very well imagine some policy that would want to limit the call's bandwidth. One way of doing it, is for the bandwidth manager to remain in the signaling path (using SIP record-routes) and look into the SDP of any INVITE going through, re-writing bandwidth values depending on media addresses.
I believe the general intention was that legacy equipment of that type, which is not ICE-aware, should keep working in a mixed environment that introduces ICE endpoints. To keep this backwards compatibility, ICE should only introduce new processing (STUN checks, etc.), but from the point of view of non-ICE-aware elements it should not remove any existing rules (re-INVITEs are still needed to change the destination of a media stream).
In Rosenberg's tutorial I believe the re-INVITE is sent because the sockets chosen by ICE are different from those in the media and connection lines (m/c-lines) of the original SDP, and in order for any network elements sitting between the two user agents to be informed of the actual sockets that will be used for RTP, a re-INVITE is sent with the ICE-selected socket(s) in the media and connection lines of the SDP.
If the ICE selected sockets were the same as the ones in the original INVITE request's SDP m/c lines then there would be no need for the re-INVITE request. Or if you know there are no "middleboxes" that need to be informed of the changes to the RTP sockets then there would also be no need to send the re-INVITE.
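To make that concrete, a hypothetical audio offer might originally carry the host address in its c/m lines, and the re-INVITE would rewrite them to the ICE-selected (e.g. server-reflexive) address; all addresses and ports below are illustrative:

Original SDP (host address):
c=IN IP4 192.168.1.10
m=audio 49170 RTP/AVP 0

Re-INVITE SDP (ICE-selected address):
c=IN IP4 203.0.113.7
m=audio 61002 RTP/AVP 0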
I've been looking into Skype's protocol, or at least what people have been able to make out, since it is a proprietary protocol. I've read "An analysis of the Skype peer-to-peer internet telephony protocol"; though it is old, it discusses a certain property which I'm looking to recreate in my own architecture. What I'm interested in is that during a video conference, data is sent to one machine (the one most likely to have the best bandwidth and processing power), which then redistributes it to the other machines.
What is not explained is what happens when the machine receiving and sending the data drops out unexpectedly. Of course, rather than dropping the conference, it would be best to find another machine to carry on receiving and distributing the data. Is there any documentation on how this is done on Skype or on a similar peer-to-peer VoIP system?
Basically I'm looking for the fastest method to detect when a "super peer" unexpectedly drops out and quickly migrating operations to another machine.
You need to set a timeout (i.e., a limit) and declare that if you don't receive any communication within it, either the communication is dead (no path between the peers, a reachability issue) or the remote peer is down. There is no other method.
If you have a direct TCP or other connection to the super peer, you can also catch events telling you that the connection died. If your communication is relayed and your framework automatically attempts to find a new route to your target peer, it will either find one or never find out. Hence the necessity of a timeout.
If no one hears from a peer for some time, that peer is finally considered/declared dead.
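As a sketch of that timeout idea in JavaScript (the interval, threshold, and failover hook are placeholders you would tune):

const HEARTBEAT_TIMEOUT_MS = 5000; // illustrative value
let lastHeardFrom = Date.now();

// Call this whenever any packet arrives from the super peer
function onSuperPeerMessage() {
  lastHeardFrom = Date.now();
}

// Periodically check whether the super peer has gone silent
setInterval(() => {
  if (Date.now() - lastHeardFrom > HEARTBEAT_TIMEOUT_MS) {
    promoteNewSuperPeer(); // placeholder: migrate distribution to another machine
  }
}, 1000);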
I am working on a chat application (it needs to connect to a server) on iPhone. Sending packets from the iPhone shouldn't be a problem.
But I would like to know whether it is possible for the iPhone to keep an incoming socket connection to the server open continuously, or forever, in a mobile environment.
Or, what do I need to do to keep the connection alive? Do I need to send something over it to keep it alive?
Thanks.
Not sure why you want a chat app to hold a persistent connection... I would rather use an SMS-like model. Anyway, Cocoa's NSStream is socket-based and offers a lot of functionality. Take a look at it.
In response to the question, here is, in a nutshell, what I would do:
Get an authentication token from the server.
This will also take care of user presence if necessary, but now we are talking about state; once presence is known, the server may send out notifications to clients that are active and have the user on their contact list.
Get the user's contact list and each contact's presence state.
When a message is sent, handle it according to the addressee's state: if online, deliver it to the other user; if offline, queue it for later delivery or reject it.
Once the token expires, reject communication with an appropriate error and make the client request a new token.
Communication from server to client can be based on a pull or a push model. In the first case, the client periodically makes a request and fetches all messages. This may not sound great, but in reality, how often do users compose and send messages? Several times a minute? That's not too much. So fetching may happen every 5-10 seconds.
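A sketch of that pull model in JavaScript (the endpoint, token handling, and UI hook are illustrative; the shape of the loop is the same whichever HTTP client the device uses):

// Poll the server for new messages every few seconds
setInterval(async () => {
  const response = await fetch('/messages?since=' + lastSeenTimestamp, {
    headers: { Authorization: 'Bearer ' + authToken } // token obtained from the server earlier
  });
  if (response.status === 401) {
    await refreshToken(); // token expired: request a new one and retry next tick
    return;
  }
  const messages = await response.json();
  messages.forEach(displayMessage); // placeholder for your UI update
  if (messages.length) lastSeenTimestamp = messages[messages.length - 1].time;
}, 7000);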
For the push model, the client must be able to listen for and accept connections.
Finally, check out SIP, the Session Initiation Protocol. No need to use the full version of it though, just the basic stuff.
This is very rough and perhaps oversimplified; I don't know the target complexity of your chat system. For example, the simplest approach could also be that the server just enables client-to-client communication by distributing their endpoints, and the clients take care of everything themselves.
Good luck!
Super out of date response, but maybe it will help the next person.
I would use xmppframework and a jabber server.