P2P resources searching - p2p

Recently I had an interest in decentralised network, and protocols enable and utilise it, like bittorrent.
However as I currently understand the protocol only gives a cryptic long hash for a given resource, file, video etc. and the trackers, and better DHTs, gives you the IPs of the devices that owns some of this resources to download from. But for this as I said one must povide a human unfriendly hash.
So, you cannot search by a keyword for a resource, like on networks that more http-ish. Google search etc. Even the torrent apis like Torrent Project, Strike project or any other torrent websites are not bittorrent protocol powered services. They are still just centralised services.
By that one conclusion is that, bittorrent depends on http, or centralised services and networks.
How is it possible to have a p2p protocol that gives the flexiblity of the current http accessed networks, services like, google, SO, etc. in terms of search for content over the network and still stay decentralized?
A web browser like browser app that connects over this protocol and can search resources by human inputable means like a google search with keywords and find peers by using DHTs or something specific to its own and download them.

Related

Networking Unity game over Internet with Node.js

I have a game I want to network over the internet, and wanted to confirm my understanding to make sure i'm not missing any technical or security issues that may arise if I go down this path.
I've managed to network my Unity Android Application on a local network using the built-in Unity networking tools, and I'd like to be able to 'match-make' the clients together over the internet. My plan is to host a node.js server on Digital Ocean as a point of contact for the clients. Clients will connect to the server, which will 'pair up' clients by exchanging IP addresses, to form a direct connection between them, and then function the same as if they were on a local network. I like this method as it is low overhead on the server end, with its only role being as a point of contact for the clients, pairing them up, then disconnecting and waiting for more requests, which node seems particularly suited to.
I do not plan to store any data on the server, or perform any strenuous processing, other than possibly matchmaking players of similar skill level.
Does this sound achievable?
Thanks.
Use photon unity networking its easier and you dont have to worry about hosting.

Considerations regarding a p2p social network

While the are many social networks in the wild, most rely on data stored on a central site owned by a third party.
I'd like to build a solution, where data remains local on member's systems. Think of the project as an address book, which automagically updates contact's data as soon a a contact changes its coordinates. This base idea might get extended later on...
Updates will be transferred using public/private key cryptography using a central host. The sole role of the host is to be a store and forward intermediate. Private keys remain private on each member's system.
If two client are both online and a p2p connection could be established, the clients could transfer data telegrams without the central host.
Thus, sender and receiver will be the only parties which are able create authentic messages.
Questions:
Do exist certain protocols which I should adopt?
Are there any security concerns I should keep in mind?
Do exist certain services which should be integrated or used somehow?
More technically:
Use e.g. Amazon or Google provided services?
Or better use a raw web-server? If yes: Why?
Which algorithm and key length should be used?
UPDATE-1
I googled my own question title and found this academic project developed 2008/09: http://www.lifesocial.org/.
The solution you are describing sounds remarkably like email, with encrypted messages as the payload, and an application rather than a human being creating the messages.
It doesn't really sound like "p2p" - in most P2P protocols, the only requirement for central servers is discovery - you're using store & forward.
As a quick proof of concept, I'd set up an email server, and build an application that sends emails to addresses registered on that server, encrypted using PGP - the tooling and libraries are available, so you should be able to get that up and running in days, rather than weeks. In my experience, building a throw-away PoC for this kind of question is a great way of sifting out the nugget of my idea.
The second issue is that the nature of a social network is that it's a network. Your design may require you to store more than the data of the two direct contacts - you may also have to store their friends, or at least the public interactions those friends have had.
This may not be part of your plan, but if it is, you need to think it through early on - you may end up having to transmit the entire social graph to each participant for local storage, which creates a scalability problem....
The paper about Safebook might be interesting for you.
Also you could take a look at other distributed OSN and see what they are doing.
None of the federated networks mentioned on http://en.wikipedia.org/wiki/Distributed_social_network is actually distributed. What Stefan intends to do is indeed new and was only explored by some proprietary folks.
I've been thinking about the same concept for the last two years. I've finally decided to give it a try using Python.
I've spent the better part of last night and this morning writing a sockets communication script & server. I also plan to remove the central server from the equation as it's just plain cumbersome and there's no point to it when all the members could keep copies of their friend's keys.
Each profile could be accessed via a hashed string of someone's public key. My social network relies on nodes and pods. Pods are computers which have their ports open to the network. They help with relaying traffic as most firewalls block incoming socket requests. Nodes store information and share it with other nodes. Each node will get a directory of active pods which may be used to relay their traffic.
The PeerSoN project looks like something you might be interested in: http://www.peerson.net/index.shtml
They have done a lot of research and the papers are available on their site.
Some thoughts about it:
protocols to use: you could think exactly on P2P programs and their design
security concerns: privacy. Take a great care to not open doors: a whole system can get compromised 'cause you have opened some door.
services: you could integrate with the regular social networks through their APIs
People will have to install a program in their computers and remeber to open it everytime, like any P2P client. Leaving everything on a web-server has a smaller footprint / necessity of user action.
Somehow you'll need a centralized server to manage the searches. You can't just broadcast the internet to find friends. Or you'll have to rely uppon email requests to add somenone, and to do that you'll need to know the email in advance.
The fewer friends /contacts use your program, the fewer ones will want to use it, since it won't have contact information available.
I see that your server will be a store and forward, so the update problem is solved.

Chat program without a central server

I'm developing a chat application (in VB.Net). It will be a "secure" chat program. All traffic will be encrypted (I also need to find the best approach for this, but that's not the question for now).
Currently the program works. I have a server application and a client application. However I want to setup the application so that it doesn't need a central server for it to work.
What approach can I take to decentralize the network?
I think I need to develop the clients in a way so that they do also act as a server.
How would the clients know what server it needs to connect with / what happens if a server is down? How would the clients / servers now what other nodes there are in the network without having a central server?
At best I don't want the clients to know what the IP addresses are of the different nodes, however I don't think this would be possible without having a central server.
As stated the application will be written in VB.Net, but I think the language doesn't really matter at this point.
Just want to know the different approaches I can follow.
Look for example at the paper of the Kademlia protocol (you can find it here). If you just want a quick overview, look at the Wikipedia page http://en.wikipedia.org/wiki/Kademlia. The Kademlia protocol defines a way of node lookups in a network in a decentral way. It has been successfully applied in the eMule software - so it is tested to really work.
It should cause no serious problems to apply it to your chat software.
You need some known IP address for clients to initially get into a network. Once a client is part of a network, things can be more decentralized, but that first step needs something.
There are basically only two options - either the user provides one (for an existing node of the network - essentially how BitTorrent trackers work), or you hard-code in a gateway node (which is effectively a central server).
Maybe you can see uChat program. It's a program from uTorrent creator with chat without server in mind.
The idea is connect to a swarm from a magnetlink and use it to send an receive messages. This is as Amber answer, you need an access point, may it be a server, a know swarm, manual ip, etc.
Here is uChat presentation: http://blog.bittorrent.com/2011/06/30/uchat-we-just-need-each-other/

Accessing data in internal production databases from a web server in DMZ

I'm working on an external web site (in DMZ) that needs to get data from our internal production database.
All of the designs that I have come up with are rejected because the network department will not allow a connection of any sort (WCF, Oracle, etc.) to come inside from the DMZ.
The suggestions that have come from the networking side generally fall under two categories -
1) Export the required data to a server in the DMZ and export modified/inserted records eventually somehow, or
2) Poll from inside, continually asking a service in the DMZ whether it has any requests that need serviced.
I'm averse to suggestion 1 because I don't like the idea of a database sitting in the DMZ. Option 2 seems like a ridiculous amount of extra complication for the nature of what's being done.
Are these the only legitimate solutions? Is there an obvious solution I'm missing? Is the "No connections in from DMZ" decree practical?
Edit: One line I'm constantly hearing is that "no large company allows a web site to connect inside to get live production data. That's why they send confirmation emails". Is that really how it works?
I'm sorry, but your networking department are on crack or something like that - they clearly do not understand what the purpose of a DMZ is. To summarize - there are three "areas" - the big, bad outside world, your pure and virginal inside world, and the well known, trusted, safe DMZ.
The rules are:
Connections from outside can only get to hosts in the DMZ, and on specific ports (80, 443, etc);
Connections from the outside to the inside are blocked absolutely;
Connections from the inside to either the DMZ or the outside are fine and dandy;
Only hosts in the DMZ may establish connections to the inside, and again, only on well known and permitted ports.
Point four is the one they haven't grasped - the "no connections from the DMZ" policy is misguided.
Ask them "How does our email system work then?" I assume you have a corporate mail server, maybe exchange, and individuals have clients that connect to it. Ask them to explain how your corporate email, with access to internet email, works and is compliant with their policy.
Sorry, it doesn't really give you an answer.
I am a security architect at a fortune 50 financial firm. We had these same conversations. I don't agree with your network group. I understand their angst, and I understand that they would like a better solution but most places don't opt with the better choices (due to ignorance on their part [ie the network guys not you]).
Two options if they are hard set on this:
You can use a SQL proxy solution like greensql (I don't work for them, just know of them) they are just greensql dot com.
The approach they refer to that most "Large orgs" use is a tiered web model. Where you have a front end web server (accessed by the public at large), a mid-tier (application or services layer where the actual processes occurs), and a database tier. The mid-tier is the only thing that can talk to the database tier. In my opinion this model is optimal for most large orgs. BUT that being said, most large orgs will run into either a vendor provided product that does not support a middle tier, they developed without a middle tier and the transition requires development resources they dont have to spare to develop the mid-tier web services, or plain outright there is no priorty at some companies to go that route.
Its a gray area, no solid right or wrong in that regard, so if they are speaking in finality terms then they are clearly wrong. I applaud their zeal, as a security professional I understand where they are coming from. BUT, we have to enable to business to function securely. Thats the challange and the gauntlet I always try and throw down to myself. how can I deliver what my customer (my developers, my admins, my dbas, business users) what they want (within reason, and if I tell someone no I always try to offer an alternative that meets most of their needs).
Honestly it should be an open conversation. Here's where I think you can get some room, ask them to threat model the risk they are looking to mitigate. Ask them to offer alternative solutions that enable your web apps to function. If they are saying they cant talk, then put the onus on them to provide a solution. If they can't then you default to it working. Site that you open connections from the dmz to the db ONLY for the approved ports. Let them know that DMZ is for offering external services. External services are not good without internal data for anything more than potentially file transfer solutions.
Just my two cents, hope this comment helps. And try to be easy on my security brethren. We have some less experienced misguided in our flock that cling to some old ways of doing things. As the world evolves the threat evolves and so does our approach to mitigation.
Why don't you replicate your database servers? You can ensure that the connection is from the internal servers to the external servers and not the other way.
One way is to use the ms sync framework - you can build a simple windows service that can synchronize changes from internal database to your external database (which can reside on a separate db server) and then use that in your public facing website. Advantage is, your sync logic can filter out sensitive data and keep only things that are really necessary. And since the entire control of data will be in your internal servers (PUSH data out instead of pull) I dont think IT will have an issue with that.
The connection formed is never in - it is out - which means no ports need to be opened.
I'm mostly with Ken Ray on this; however, there appears to be some missing information. Let's see if I get this right:
You have a web application.
Part of that web application needs to display data from a different production server (not the one that normally backs your site).
The data you want/need is handled by a completely different application internally.
This data is critical to the normal flow of your business and only a limited set needs to be available to the outside world.
If I'm on track, then I would have to say that I agree with your IT department and I wouldn't let you directly access that server either.
Just take option 1. Have the production server export the data you need to a commonly accessible drop location. Have the other db server (one in the DMZ) pick up the data and import it on a regular basis. Finally, have your web app ONLY talk to the db server in the dmz.
Given how a lot of people build sites these days I would also be loath to just open a sql port from the dmz to the web server in question. Quite frankly I could be convinced to open the connection if I was assured that 1. you only used stored procs to access the data you need; 2. the account information used to access the database was encrypted and completely restricted to only running those procs; 3. those procs had zero dynamic sql and were limited to selects; 4. your code was built right.
A regular IT person would probably not be qualified to answer all of those questions. And if this database was from a third party, I would bet you might loose support if you were to start accessing it from outside it's normal application.
Before talking about your particular problem I want to deal with the Update that you provided.
I haven't worked for a "large" company - though large is hard to judge without a context, but I have built my share of web applications for the non profit and university department that I used to work for. In both situations I have always connected to the production DB that is on the internal Network from the Web server on the DMZ. I am pretty sure many large companies do this too; think for example of how Sharepoint's architecture is setup - back-end indexing, database, etc. servers, which are connected to by front facing web servers located in the DMZ.
Also the practice of sending confirmation e-mails, which I believe you are referring to confirmations when you register for a site don't usually deal with security. They are more a method to verify that a user has entered a valid e-mail address.
Now with that out of the way, let us look at your problem. Unfortunately, other than the two solutions you presented, I can't think of any other way to do this. Though some things that you might want to think about:
Solutions 1:
Depending on the sensitivity of the data that you need to work with, extracting it onto a server on the DMZ - whether using a service or some sort of automatic synchronization software - goes against basic security common sense. What you have done is move the data from a server behind a firewall to one that is in front of it. They might as well just let you get to the internal db server from the DMZ.
Solution 2:
I am no networking expert, so please correct me if I am wrong, but a polling mechanism still requires some sort of communication back from the web server to inform the database server that it needs some data back, which means a port needs to be open, and again you might as well tell them to let you get to the internal database without the hassle, because you haven't really added any additional security with this method.
So, I hope that this helps in at least providing you with some arguments to allow you to access the data directly. To me it seems like there are many misconceptions in your network department over how a secure database backed web application architecture should look like.
Here's what you could do... it's a bit of a stretch, but it should work...
Write a service that sits on the server in the DMZ. It will listen on three ports, A, B, and C (pick whatever port numbers make sense). I'll call this the DMZ tunnel app.
Write another service that lives anywhere on the internal network. It will connect to the DMZ tunnel app on port B. Once this connection is established, the DMZ tunnel app no longer needs to listen on port B. This is the "control connection".
When something connects to port A of the DMZ tunnel app, it will send a request over the control connection for a new DB/whatever connection. The internal tunnel app will respond by connecting to the internal resource. Once this connection is established, it will connect back to the DMZ tunnel app on port C.
After possibly verifying some tokens (this part is up to you) the DMZ tunnel app will then forward data back and forth between the connections it received on port A and C. You will effectively have a transparent TCP proxy created from two services running in the DMZ and on the internal network.
And, for the best part, once this is done you can explain what you did to your IT department and watch their faces as they realize that you did not violate the letter of their security policy, but you are still being productive. I tell you, they will hate that.
If all development solutions cannot be applied because of system engineering restriction in DMZ then give them the ball.
Put your website in intranet, and tell them 'Now I need inbound HTTP:80 or HTTPS:443 connections to that applications. Set up what you want : reverse proxies, ISA Server, protocols break, SSL... I will adapt my application if necessary.'
About ISA, I guess they got one if you have exchange with external connections.
Lot of companies are choosing this solution when a resource need to be shared between intranet and public.
Setting up a specific and intranet network with high security rules is the best way to make the administration, integration and deployment easier. What is easier is well known, what is known is masterized : less security breach.
More and more system enginers (like mines) prefer to maintain an intranet network with small 'security breach' like HTTP than to open other protocols and ports.
By the way, if they knew WCF services, they would have accepted this solution. This is the most secure solution if well designed.
Personnaly, I use this two methods : TCP(HTTP or not) Services and ISA Server.

Fully Decentralized P2P?

I’m looking at creating a P2P system. During initial research, I’m reading from Peer-to-Peer – Harnessing the Power of Disruptive Technologies. That book states “a fully decentralized approach to instant messaging would not work on today's Internet.” Mostly blaming firewalls and NATs. The copyright is 2001. Is this information old or still correct?
It's still largely correct. Most users still are behind firewalls or home routers that block incoming connections. Those can be opened easier today than in 2001 (using uPnP for example, requiring little user interaction and knowledge) but most commercial end-user-targeting applications - phone (Skype, VoIP), chat (the various Messengers), remote control - are centralized solutions to circumvent firewall problems.
I would say that it is just plain wrong, both now and then. Yes, you will have many nodes that will be firewalled, however, you will also have a significant number who are not. So, if end-to-end encryption is used to protect the traffic from snooping, then you can use non-firewalled clients to act as intermediaries between two firewalled clients that want to chat.
You will need to take care, however, to spread the load around, so that a few unfirewalled clients aren't given too much load.
Skype uses a similar idea. They even allow file transfers through intermediaries, though they limit the through-put so as not to over load the middle-men.
That being said, now in 2010, it is a lot easier to punch holes in firewalls than it was in 2001, as most routers will allow you to automate the opening of ports via UPNP, so you are likely to have a larger pool of unfirewalled clients to work with.
Firewalls and NATs still commonly disrupt direct peer-to-peer communication between home-based PCs (and also between home-based PCs and corporate desktops).
They can be configured to allow particular peer-to-peer protocols, but that remains a stumbling block for most unsavvy users.
I think the original statement is no longer correct. But the field of Decentralized Computing is still in its infancy, with little serious contenders.
Read this interesting post on ZeroTier (thanks to #joehand): The State of NAT Traversal:
NAT is Traversable
In reading the Internet chatter on this subject I've been shocked by how many people don't really understand this, hence the reason this post was written. Lots of people think NAT is a show-stopper for peer to peer communication, but it isn't. More than 90% of NATs can be traversed, with most being traversable in reliable and deterministic ways.
At the end of the day anywhere from 4% (our numbers) to 8% (an older number from Google) of all traffic over a peer to peer network must be relayed to provide reliable service. Providing relaying for that small a number is fairly inexpensive, making reliable and scalable P2P networking that always works quite achievable.
I personally know of Dat Project, a decentralized data sharing toolkit (based on their hypercore protocol for P2P streaming).
From their Dat - Distributed Dataset Synchronization And Versioning paper:
Peer Connections
After the discovery phase, Dat should have a list of
potential data sources to try and contact. Dat uses
either TCP, UTP, or HTTP. UTP is designed to not
take up all available bandwidth on a network (e.g. so
that other people sharing wifi can still use the Inter-
net), and is still based on UDP so works with NAT
traversal techniques like UDP hole punching.
HTTP is supported for compatibility with static file servers and
web browser clients. Note that these are the protocols
we support in the reference Dat implementation, but
the Dat protocol itself is transport agnostic.
Furthermore you can use it with Bittorrent DHT. The paper also contains some references to other technologies that inspired Dat.
For implementation of peer discovery, see: discovery-channel
Then there is also IPFS, or 'The Interplanetary File System' which is currently best positioned to become a standard.
They have extensive documentation on their use of DHT and NAT traversal to achieve decentralized P2P.
The session messenger seem to have solved the issue with a truly decentralized p2p messenger by using a incentivized mixnet to relay and store messages. Its a fork of the Signal messenger with a mixnet added in. https://getsession.org -- whitepaper: https://getsession.org/wp-content/uploads/2020/02/Session-Whitepaper.pdf
It's very old and not correct. I believe there is a product out called Tribler (news article) which enables BitTorrent to function in a fully decentralized way.
If you want to go back a few years (even before that document) you could look at Windows. Windows networking used to function in a fully decentralized way. In some cases it still does.
UPNP is also decentralized in how it determines available devices on your local network.
In order to be decentralized you need to have a way to locate other peers. This can be done proactively by scanning the network (time consuming) or by having some means of the clients announcing that they are available.
The announcements can be simple UDP packets that get broadcast every so often to the subnet which other peers listen for. Another mechanism is broadcasting to IIRC channels (most common for command and control of botnets), etc. You might even use twitter or similar services. Use your imagination here.
Firewalls don't really play a part because they almost always leave open a few ports, such as 80 (http). Obviously you couldn't browse the network if that was closed. Now if the firewall is configured to only allow connections that originated from internal clients, then you'd have a little more work to do. But not much.
NATs are also not a concern for similiar issues.

Resources