In the Dat protocol, if I install Dat and use it to share a folder, do I become a so-called peer and also store other peers' data? - node.js

I am studying some new emerging p2p protocols, and I found the Dat protocol.
In the Dat protocol, if I install Dat and use it to share folders, will I become a so-called peer and also store other peers' data?
I found no docs or FAQs explaining where and who these so-called peers are. Is it right that every PC on which a user installs Dat acts as a peer that shares and stores data for the others? And does only the Dat client do that, or does any other software do it too?

In the Dat protocol, if I install Dat and use it to share folders, will I become a so-called peer and also store other peers' data?
It's important to know that you control what data you download and share.
When you run the Dat CLI, you specify either a URL or the path of a folder on your computer.
If you give an archive's URL, you will download the files in the archive and then share them on the p2p network until you close the CLI tool.
If you give a folder path, you will create a new archive, and Dat will give you a URL to share.
In practice, this is similar to how BitTorrent works. Each archive is a set of files, so "swarming" an archive (that is, joining the network to exchange it) will only upload and download the files in that archive.
I found no docs or FAQs explaining where and who these so-called peers are. Is it right that every PC on which a user installs Dat acts as a peer that shares and stores data for the others? And does only the Dat client do that, or does any other software do it too?
Peers are people who possess the URL and who have told their Dat clients to swarm it. So far, there is no Dat client that tries to automatically download more than the URLs given explicitly by the user, but such a thing would be possible.
There are already multiple clients for Dat:
The Official Dat CLI
The Official Dat Desktop App
Science Fair, a tool for exchanging papers
Beaker Browser, a browser which uses dat:// as its protocol
The dat-node library is written in JavaScript and is relatively easy to use for creating custom clients.
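For example, here is a rough sketch of both flows using dat-node (the folder paths and the key are placeholders, and exact option names may differ between versions):

    var Dat = require('dat-node')

    // Share a local folder: create (or reuse) an archive and join the swarm.
    Dat('/path/to/folder', function (err, dat) {
      if (err) throw err
      dat.importFiles()   // add the folder's files to the archive
      dat.joinNetwork()   // start exchanging data with peers
      console.log('dat://' + dat.key.toString('hex'))
    })

    // Download someone else's archive into a local folder by its key.
    Dat('/path/to/download', { key: '<64-character-hex-key>' }, function (err, dat) {
      if (err) throw err
      dat.joinNetwork()   // you help host this archive for as long as the process runs
    })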

Related

InterPlanetary File System (IPFS) security question regarding illegal files

With IPFS being distributed p2p storage and sharing, isn't there then a chance that someone could store something illegal on your machine if you are an IPFS provider?
Is there some mechanism that IPFS systems use to prevent this? How would someone even know if illegal content is stored on their machine, especially if they are only storing a part of the file?
I want to run an IPFS node on my machine, but I am unsure if I have to worry about malicious actors using my IPFS node.
The law is very clear here: in almost all cases you are not responsible for caching partial or complete files or metadata.
distributed p2p storage and sharing
No, it works more like the BitTorrent protocol and less like in the TV series "Silicon Valley". You need to share a file, and somebody will need to find its hash to download it. (The difference is that in BitTorrent the .torrent file was mostly preferred over the magnet hash until around 2015, while in IPFS hashes are the only way. The system is also global: a strong hash function is used so that collisions are practically impossible (at least not within a billion years), which means hashes can be checked across the whole network, and duplicate chunks of the data from which the folder/file structure is reconstructed are not stored twice.)
The point is that, just as in BitTorrent, you store no files you did not request. And, just as in BitTorrent, you can use IPFS BitSwap to accelerate swarms; that is what cloudflare-ipfs.com, ipfs.infura.io and others do. Similar things exist in BitTorrent too, in particular clients that automatically attach to updated torrents which share the same piece hashes; that is very cool, but in IPFS it happens automatically. There are also servers that propagate the .torrent file (a.k.a. the magnet metadata) from just the magnet hash. I believe DHT crawlers such as BTDigg or https://btdb.eu/ also play some role, though not a big one; you can set up your own crawler (BTDigg is open source) that does precisely that, sharing torrent metadata, and it requires almost no resources. (You can even set up your own bootstrap supernode to create your OWN separate DHT.) It is a fun thing to do, as a lot of stuff can be found there. As I understand it, IPFS also does this by default, i.e. it stores some metadata to help data propagation. You can read further here:
https://discuss.ipfs.io/t/ipfs-propagation/4301
https://discuss.ipfs.io/t/how-fast-do-ipns-changes-propagate/311
https://docs.ipfs.io/concepts/bitswap/
There is also this: https://collab.ipfscluster.io/
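If it helps to make the "share a file, others fetch it by hash" idea concrete, here is a minimal sketch using the ipfs-http-client package against a locally running IPFS daemon (package choice and defaults are assumptions on my part; check the current docs):

    const { create } = require('ipfs-http-client')

    async function main () {
      const ipfs = create()                        // talks to the local daemon's HTTP API
      const { cid } = await ipfs.add('hello p2p')  // content is chunked and hashed
      console.log('anyone can fetch this with:', cid.toString())

      let data = ''
      for await (const chunk of ipfs.cat(cid)) {   // fetch the same content back by hash
        data += Buffer.from(chunk).toString()
      }
      console.log(data)
    }

    main().catch(console.error)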

Does IPFS host files you access by default?

I could not find a straight answer, so I am sorry if it has already been solved.
I was wondering, like with BitTorrent, when you download something using IPFS, does it automatically 'seed'/host it?
I also know you can stop seeding with torrents. If it automatically hosts the file after downloading it, what would be a way to stop hosting a specific file? All files?
Edit: if it doesn't automatically host files after a download, how would one go about hosting a file on IPFS?
Thanks a ton.
To understand IPFS, the best thing to do is take the time to read the white paper.
The protocol used for data distribution is inspired by BitTorrent and is named BitSwap. To protect against leeches (free-loading nodes that never share), BitSwap uses a credit-like system.
So, to answer your questions: yes, when you download some content it's automatically hosted (or at least part of it), and if you try to trick the protocol by not hosting the content, your credit will drop and you will not be able to participate in the network.
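As for the "how do I stop hosting a specific file" part of the question: on your own node you can unpin the content and garbage-collect it. A rough sketch with ipfs-http-client, assuming a local daemon and that someCid is the hash you no longer want to keep:

    const { create } = require('ipfs-http-client')

    async function stopHosting (someCid) {
      const ipfs = create()
      await ipfs.pin.rm(someCid)                 // stop pinning the content
      for await (const res of ipfs.repo.gc()) {  // drop unpinned blocks from the local repo
        if (res.err) console.error(res.err)
      }
    }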

Is a tracker related to the torrent or the downloader?

Background
I'm trying to add some active trackers to the transmission daemon to speed it up, as I have done before with aria2.
But all the resources I found are about how to add trackers to a torrent.
Question
So I'm wondering: what is a tracker related to, the torrent file or the downloader? If it is the torrent file, how do I add trackers in aria2? The only way I can imagine is that aria2 automatically adds trackers to the added torrent.
BTW, how do I add default trackers in the transmission daemon, just like in aria2?
Trackers can be centralized servers from which you can request a list of peers.
The torrent file and the download don't go through the tracking server; the tracking server simply tells you whom you can ask for pieces of the file.
If you add more trackers, you won't download faster; you'll just have a wider pool of peers to pick from. If you want the download to be faster, you'll have to increase the number of peers you are downloading from. (Tracking servers usually return around 80 peers at a time, I think, in any case.)
There are also decentralized means of doing this using a DHT (distributed hash table).
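To illustrate what "asking a tracker for peers" actually looks like on the wire, here is a rough sketch of an HTTP announce request (the tracker URL is a placeholder and the info hash is random here; a real client would use the SHA-1 of the torrent's info dictionary and then decode the bencoded reply):

    const http = require('http')
    const crypto = require('crypto')

    const infoHash = crypto.randomBytes(20)   // placeholder for the torrent's real info hash
    const peerId = Buffer.concat([Buffer.from('-XX0001-'), crypto.randomBytes(12)])

    // The 20 raw bytes must be percent-encoded byte by byte.
    const enc = buf => Array.from(buf).map(b => '%' + b.toString(16).padStart(2, '0')).join('')

    const url = 'http://tracker.example.com:6969/announce' +
      '?info_hash=' + enc(infoHash) +
      '&peer_id=' + enc(peerId) +
      '&port=6881&uploaded=0&downloaded=0&left=0&compact=1'

    // The tracker answers with a bencoded dictionary whose "peers" field lists
    // other downloaders; nothing about the file itself travels through it.
    http.get(url, res => res.pipe(process.stdout))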

Considerations regarding a p2p social network

While there are many social networks in the wild, most rely on data stored on a central site owned by a third party.
I'd like to build a solution where data remains local on members' systems. Think of the project as an address book which automagically updates a contact's data as soon as that contact changes its coordinates. This base idea might get extended later on...
Updates will be transferred using public/private key cryptography via a central host. The sole role of the host is to be a store-and-forward intermediary. Private keys remain private on each member's system.
If two clients are both online and a p2p connection can be established, the clients could transfer data telegrams without the central host.
Thus, sender and receiver will be the only parties able to create authentic messages.
Questions:
Are there existing protocols I should adopt?
Are there any security concerns I should keep in mind?
Are there existing services which should be integrated or used somehow?
More technically:
Use services provided by e.g. Amazon or Google?
Or better, use a raw web server? If yes: why?
Which algorithm and key length should be used?
UPDATE-1
I googled my own question title and found this academic project, developed in 2008/09: http://www.lifesocial.org/.
The solution you are describing sounds remarkably like email, with encrypted messages as the payload, and an application rather than a human being creating the messages.
It doesn't really sound like "p2p" - in most P2P protocols, the only requirement for central servers is discovery - you're using store & forward.
As a quick proof of concept, I'd set up an email server and build an application that sends emails to addresses registered on that server, encrypted using PGP - the tooling and libraries are available, so you should be able to get that up and running in days rather than weeks. In my experience, building a throw-away PoC for this kind of question is a great way of sifting out the nugget of the idea.
The second issue is that the nature of a social network is that it's a network. Your design may require you to store more than the data of the two direct contacts - you may also have to store their friends, or at least the public interactions those friends have had.
This may not be part of your plan, but if it is, you need to think it through early on - you may end up having to transmit the entire social graph to each participant for local storage, which creates a scalability problem...
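If it helps, here is a rough sketch of the sign-then-encrypt flow for one contact update, using Node's built-in crypto module (the key type, sizes, and payload are just illustrative; PGP tooling as suggested above would give you this, plus key management, for free):

    const crypto = require('crypto')

    // Each member has a key pair; the private key never leaves their machine.
    const alice = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 })
    const bob = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 })

    const update = Buffer.from(JSON.stringify({ from: 'alice', phone: '555-0100' }))

    // Alice signs the update with her private key and encrypts it for Bob.
    // (Small payload; larger telegrams would use hybrid encryption.)
    const signature = crypto.sign('sha256', update, alice.privateKey)
    const ciphertext = crypto.publicEncrypt(bob.publicKey, update)

    // Bob decrypts with his private key and verifies it really came from Alice.
    const plaintext = crypto.privateDecrypt(bob.privateKey, ciphertext)
    const authentic = crypto.verify('sha256', plaintext, alice.publicKey, signature)
    console.log(authentic, plaintext.toString())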
The paper about Safebook might be interesting for you.
Also you could take a look at other distributed OSN and see what they are doing.
None of the federated networks mentioned on http://en.wikipedia.org/wiki/Distributed_social_network is actually distributed. What Stefan intends to do is indeed new and was only explored by some proprietary folks.
I've been thinking about the same concept for the last two years. I've finally decided to give it a try using Python.
I've spent the better part of last night and this morning writing a sockets communication script and server. I also plan to remove the central server from the equation, as it's just plain cumbersome and there's no point to it when all the members could keep copies of their friends' keys.
Each profile could be accessed via a hashed string of someone's public key. My social network relies on nodes and pods. Pods are computers which have their ports open to the network; they help with relaying traffic, since most firewalls block incoming socket requests. Nodes store information and share it with other nodes. Each node will get a directory of active pods which may be used to relay its traffic.
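A tiny sketch of what such a profile address could look like (the key type and hash function are just my assumptions):

    const crypto = require('crypto')

    const { publicKey } = crypto.generateKeyPairSync('ed25519', {})
    const der = publicKey.export({ type: 'spki', format: 'der' })
    const profileId = crypto.createHash('sha256').update(der).digest('hex')
    console.log('profile address:', profileId)   // stable as long as the key pair is stable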
The PeerSoN project looks like something you might be interested in: http://www.peerson.net/index.shtml
They have done a lot of research and the papers are available on their site.
Some thoughts about it:
protocols to use: you could look at existing P2P programs and their design
security concerns: privacy. Take great care not to open doors: a whole system can get compromised because you opened some door.
services: you could integrate with the regular social networks through their APIs
People will have to install a program on their computers and remember to open it every time, like any P2P client. Leaving everything on a web server requires a smaller footprint and less user action.
Somehow you'll need a centralized server to manage the searches. You can't just broadcast to the whole internet to find friends. Or you'll have to rely upon email requests to add someone, and to do that you'll need to know the email address in advance.
The fewer friends/contacts use your program, the fewer people will want to use it, since it won't have contact information available.
I see that your server will be store-and-forward, so the update problem is solved.

Ensuring uploaded files are safe

My boss has come to me and asked how to ensure that a file uploaded through a web page is safe. He wants people to be able to upload PDFs and TIFF images (and the like), and his real concern is someone embedding a virus in a PDF that is then viewed/altered (and the virus executed). I just read something about a procedure that could be used to destroy steganographic information embedded in images by altering the least significant bits. Could a similar process be used to ensure that a virus isn't implanted? Does anyone know of any programs that can scrub files?
Update:
So the team argued about this a little bit, and one developer found a post about letting the file download to the file system and having the antivirus software that protects the network check the files there. The poster essentially said that it was too difficult to use the API or the command line for a couple of products. This seems a little kludgy to me, because we are planning on storing the files in the db, but I haven't had to scan files for viruses before. Does anyone have any thoughts or experience with this?
http://www.softwarebyrob.com/2008/05/15/virus-scanning-from-code/
I'd recommend running your uploaded files through antivirus software such as ClamAV. I don't know about scrubbing files to remove viruses, but this will at least allow you to detect and delete infected files before you view them.
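One way to wire that in from Node is to shell out to the ClamAV command-line scanner after the upload lands on disk; a rough sketch (assumes clamscan is installed and on the PATH, and the file path is a placeholder):

    const { execFile } = require('child_process')

    function scanUpload (path, cb) {
      // clamscan exits 0 when the file is clean, 1 when a virus is found, 2 on error.
      execFile('clamscan', ['--no-summary', path], (err) => {
        if (!err) return cb(null, true)
        if (err.code === 1) return cb(null, false)
        cb(err)
      })
    }

    scanUpload('/tmp/upload.pdf', (err, clean) => {
      if (err) throw err
      console.log(clean ? 'accepting file' : 'rejecting infected file')
    })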
Viruses embedded in image files are unlikely to be a major problem for your application. What will be a problem is JAR files. Image files with JAR trailers can be loaded from any page on the Internet as a Java applet, with same-origin bindings (cookies) pointing into your application and your server.
The best way to handle image uploads is to crop, scale, and transform them into a different image format. Images should have different sizes, hashes, and checksums before and after transformation. For instance, Gravatar, which provides the "buddy icons" for Stack Overflow, forces you to crop your image, and then translates it to a PNG.
Is it possible to construct a malicious PDF or DOC file that will exploit vulnerabilities in Word or Acrobat? Probably. But ClamAV is not going to do a very good job at stopping those attacks; those aren't "viruses", but rather vulnerabilities in viewer software.
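For the crop/scale/re-encode advice above, a rough sketch with the sharp package (the package choice, sizes, and paths are my assumptions):

    const sharp = require('sharp')

    // Re-encode the upload so the stored image never shares bytes, size, or hash
    // with what the user sent; anything hidden in the original encoding is discarded.
    sharp('/tmp/upload.jpg')
      .resize(512, 512, { fit: 'inside' })   // scale down, keep aspect ratio
      .png()                                 // always store as PNG
      .toFile('/var/uploads/image.png')
      .then(info => console.log('stored', info.width + 'x' + info.height, info.size, 'bytes'))
      .catch(err => console.error('rejected upload:', err.message))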
It depends on your company's budget but there are hardware devices and software applications that can sit between your web server and the outside world to perform these functions. Some of these are hardware firewalls with anti-virus software built in. Sometimes they are called application gateways or application proxies.
Here are links to an open source gateway that uses Clam-AV:
http://en.wikipedia.org/wiki/Gateway_Anti-Virus
http://gatewayav.sourceforge.net/faq.html
You'd probably need to chain an actual virus scanner to the upload process (the same way many virus scanners ensure that a file you download in your browser is safe).
In order to do this yourself, you'd have to keep it up to date, which means keeping libraries of virus definitions around, which is likely beyond the scope of your application (and may not even be feasible depending on the size of your organization).
Yes, ClamAV should scan the file regardless of the extension.
Use a reverse proxy setup such as
www <-> HAVP <-> webserver
HAVP (http://www.server-side.de/) is a way to scan HTTP traffic through ClamAV or any other commercial antivirus software. It will prevent users from downloading infected files.
If you need HTTPS or anything else, you can put another reverse proxy, or a web server in reverse-proxy mode, in front of HAVP to handle the SSL.
Nevertheless, it does not act at upload time, so it will not prevent the files from being stored on your servers, but it will prevent them from being downloaded and thus propagated. So use it together with regular file scanning (e.g. clamscan).
