Prevent bottleneck on bandwidth for mobile internet - linux

I am sure that this question has already been answered, but unfortunately I do not know the keywords. Therefore my search remained unsuccessful until now.
Scenario: I want to transmit a live stream over mobile Internet using a Raspberry Pi and, depending on the available bandwidth, downscale the stream and upscale it again when capacity returns.
My two questions for the network specialists among you:
I know I can actively measure the bandwidth, but how would you do this without interfering with the processes that are already transmitting? Should I commit a fixed bandwidth to those processes and then slowly determine the remaining bandwidth using a test tool? Or are there already practical solutions?
Can I detect, on the mobile Internet connection or at the network interface, when a bottleneck is reached?
Passive methods would be my preference, so that I don't have to load the link with test traffic. For example, I could compare how much bandwidth the stream sends with how much actually arrives. But how do I make sure there is enough capacity before I raise the bitrate?
Thanks for your wisdom ;)
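One passive approach to the "sent versus arrived" idea above is an additive-increase / multiplicative-decrease (AIMD) loop: raise the bitrate in small steps while everything arrives, and cut it sharply as soon as delivery falls behind. The sketch below is only an illustration under assumptions: the class name, the thresholds, and the step sizes are made up, and it presumes the receiver can report how many bytes arrived per interval (for example through some feedback channel you would have to provide).

```python
from dataclasses import dataclass


@dataclass
class AimdBitrateController:
    """Illustrative AIMD controller; all defaults are assumed values."""
    bitrate_kbps: float = 1000.0
    min_kbps: float = 200.0
    max_kbps: float = 4000.0
    step_kbps: float = 50.0       # small upward probe per healthy interval
    backoff: float = 0.7          # multiplicative cut when delivery falls behind
    loss_threshold: float = 0.02  # tolerated fraction of undelivered bytes

    def update(self, sent_bytes: int, received_bytes: int) -> float:
        """Feed one measurement interval and return the new target bitrate."""
        if sent_bytes <= 0:
            # Nothing was sent, so there is nothing to learn from this interval.
            return self.bitrate_kbps
        loss = max(0.0, 1.0 - received_bytes / sent_bytes)
        if loss > self.loss_threshold:
            # The link looks saturated: back off quickly.
            self.bitrate_kbps = max(self.min_kbps, self.bitrate_kbps * self.backoff)
        else:
            # Everything arrived: probe gently for more capacity.
            self.bitrate_kbps = min(self.max_kbps, self.bitrate_kbps + self.step_kbps)
        return self.bitrate_kbps
```

The small upward step is itself the capacity probe, so no separate test traffic is needed; the cost is that the stream briefly overshoots before each back-off.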

Related

Where to host my NodeJS powered socket.io API?

I'm currently working on an API for an app that I'm developing, which I don't want to say too much about. I'm a solo dev with no patents, so it's probably a good idea to keep it anonymous. It uses socket.io together with Node.js for two-way communication with clients. I might swap that out later for Elixir and its sockets, but that isn't relevant for now. Now I'm trying to look into cloud hosting, but I'm having a rough time finding a good service to use.
These are my requirements:
24/7 uptime
Low memory and performance necessary (at least to start with). About 1+ gig with 2+ cores will most likely suffice (need 2 threads or more for node to handle async programming well)
Preferably free for maybe even a year, or just really cheap, but that might be much to ask
Must somehow be able to run some sort of database. Haven't really settled on this yet, but I want to implement a custom currency at some point, and probably have the ability to add some cooldowns. So it can be fairly simple and small. If anybody has any tips on what database I should use, that would also be very welcome. I was thinking of Cassandra because of the blazing fast performance and expandability. But I also wanna look into remote databases, especially if I'm gonna go international with the product
Ability to keep open socket.io connections, as you've probably guessed :P
Low ping and decently high bandwidth. The socket.io connections are lightweight and not a lot of data has to be sent: mostly packets of a few kilobytes every now and then for all of the clients.
If this information is too vague or you want to know some other requirements I haven't thought of, let me know.
Check out Heroku (PaaS), they have a free version to start with

Do IOT devices provide real privacy of data?

We are a startup that has been doing most of the processing in the cloud, and we are looking at moving it onto the device itself so that device owners don't lose functionality once we decide to move on.
The question we are debating is:
Do IOT devices provide real privacy of data?
I know "real" is very subjective, but if we decide otherwise, please suggest
any supporting studies either way. I realize this is a broad question.
I think a lot of it depends on what data you are retrieving from these devices and how you handle it in the cloud.
I also think it depends on the device hardware and how secure it is.
This is way too broad. A large proportion of IoT devices are horribly insecure and also offer little in the way of privacy. So if you're talking about existing devices, then the answer to your question is no.
That doesn't mean that IoT is inherently insecure or privacy-invading, just that the vast majority of devices have chosen to make it so, undermining trust in all of it - look at all the stuff that Google and Amazon have been trying to get away with.
You can of course build your own, but when you say "once we decide to move on", it suggests that you want these devices to operate peer-to-peer without a cloud connection (i.e. when there's nobody paying for servers). This is entirely possible using things like tor and signal protocols, but it's not easy, and you're unlikely to find a comprehensive answer on Stack Overflow. You're going to need some good privacy- and security-aware developers to make that work, and they won't be cheap.

flow-based traffic classification for traffic shaping

I’m wondering if there are ways to achieve flow-based traffic shaping with linux.
Traditional traffic shaping approaches seem to be based on creating classes for specific protocols or types of packets (such as ssh, http, SYN or ACK) that need high throughput.
Here I want to see every TCP connection as a flow characterized by a certain data-rate.
There’ll be
quick flows such as interactive ssh or IRC chat and
slow flows (bulk data) such as scp or http file transfers
Now I’m looking for a way to characterize / classify an incoming packet to one of these classes, so I can run a tc based traffic shaper on it. Any hints?
Since you mention a dedicated machine I'll assume that you are managing from a network bridge and, as such, have access to the entirety of the packet for the lifetime it is in your system.
First and foremost: throttling at the receiving side of a connection is meaningless when you are speaking of link saturation. By the time you see the packet it has already consumed resources. This is true even if you are a bridge; you can only realistically do anything intelligent on the egress interface.
I don't think you will find an off-the-shelf product that is going to do exactly what you want. You are going to have to modify something like dummynet to be dynamic according to rules you derive during execution, or you are going to have to program a dynamic software router using some existing infrastructure. One I am familiar with is the Click modular router, but there are others. I really don't know how things like tc and ipfw will react to being configured/reconfigured at high frequency - I suspect poorly.
There are things that you should address ahead of time, however. Things that are going to make this task difficult regardless of the implementation. For instance,
How do you plan on differentiating between scp bulk and ssh interactive behavior? Will you monitor initial behavior and apply a rule based on that?
You mention HTTP-specific throttling; this implies DPI. Will you be able to support that on this bridge/router? How many classes of application traffic will you support?
How do you plan on handling contention? (you allot for 'bulk' flows to each get 30% of the capacity but get 10 'bulk' flows trying to consume)
Will you hard-code the link capacity or measure it? Is it fixed or will it vary?
In general, you can get a fairly rough idea of 'flow' by just hashing the networking 5-tuple. Once you start dealing with applications semantics, however, all bets are off and you need to plow through packet contents to get what you want.
If you had a more specific purpose it might render some of these points moot.
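As a rough illustration of the 5-tuple idea above, the sketch below keys each flow by its 5-tuple and labels it "bulk" or "interactive" from its measured byte rate over a fixed window. Everything here is an assumption made for the example: the `FlowKey` and `FlowClassifier` names, the 50 kB/s threshold, and the one-second window. It only produces a label; feeding that label into tc (for example by marking packets) is a separate step that is not shown.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The networking 5-tuple; hashable, so it can key a dict directly."""
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class FlowClassifier:
    """Labels flows by observed data rate (hypothetical threshold and window)."""

    def __init__(self, bulk_bytes_per_sec: float = 50_000.0, window_sec: float = 1.0):
        self.bulk_bytes_per_sec = bulk_bytes_per_sec
        self.window_sec = window_sec
        self._window_start = {}
        self._window_bytes = defaultdict(int)
        self._rate = {}

    def observe(self, key: FlowKey, size: int, now: float) -> str:
        """Account for one packet of `size` bytes seen at time `now` (seconds)."""
        start = self._window_start.setdefault(key, now)
        if now - start >= self.window_sec:
            # Close the window: store its average rate and start a new one.
            self._rate[key] = self._window_bytes[key] / (now - start)
            self._window_start[key] = now
            self._window_bytes[key] = 0
        self._window_bytes[key] += size
        return self.classify(key)

    def classify(self, key: FlowKey) -> str:
        # Flows without a completed window default to "interactive".
        rate = self._rate.get(key, 0.0)
        return "bulk" if rate >= self.bulk_bytes_per_sec else "interactive"
```

Because the label comes from the measured rate and not from the port, an scp transfer and an interactive ssh session to the same port 22 end up in different classes, which is the distinction the first bullet above asks about.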

How to protect a website from DoS attacks

What are the best methods for protecting a site from DoS attacks? Any idea how popular sites/services handle this issue?
What tools/services are there at the application, operating system, networking, and hosting levels?
It would be nice if someone could share their real-world experience of dealing with this.
Thanks
Sure you mean DoS not injections? There's not much you can do on a web programming end to prevent them as it's more about tying up connection ports and blocking them at the physical layer than at the application layer (web programming).
As for how most companies prevent them: a lot of companies use load balancing and server farms to spread the incoming traffic. Also, a lot of smart routers monitor activity from IPs and IP ranges to make sure there aren't too many requests coming in (and if there are, they block them before they hit the server).
Biggest intentional DoS I can think of is woot.com during a woot-off though. I suggest trying wikipedia ( http://en.wikipedia.org/wiki/Denial-of-service_attack#Prevention_and_response ) and see what they have to say about prevention methods.
I've never had to deal with this yet, but a common method involves writing a small piece of code to track IP addresses that are making a large number of requests in a short amount of time and denying them before processing actually happens.
Many hosting services provide this along with hosting, check with them to see if they do.
I implemented this once in the application layer. We recorded all requests served by our server farms through a service to which each machine in the farm could send request information. We then processed these requests, aggregated by IP address, and automatically flagged any IP address exceeding a threshold of a certain number of requests per time interval. Any request coming from a flagged IP got a standard Captcha response. If they failed too many times, they were banned forever (dangerous if you get a DoS from behind a proxy). If they proved they were human, the statistics related to their IP were "zeroed."
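The flag-by-IP scheme described above can be sketched roughly as follows. This is not the original implementation, only a minimal in-memory illustration: the `IpRateTracker` name and the default limits are assumptions, and a real farm would need shared storage so that every machine reports into the same counters.

```python
from collections import defaultdict, deque


class IpRateTracker:
    """Sliding-window request counter per IP (illustrative defaults)."""

    def __init__(self, max_requests: int = 100, interval_sec: float = 10.0):
        self.max_requests = max_requests
        self.interval_sec = interval_sec
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests
        self.flagged = set()

    def record(self, ip: str, now: float) -> bool:
        """Record one request at time `now`; return True if the IP is flagged."""
        hits = self._hits[ip]
        hits.append(now)
        # Forget requests that have fallen out of the time window.
        while hits and now - hits[0] > self.interval_sec:
            hits.popleft()
        if len(hits) > self.max_requests:
            self.flagged.add(ip)
        return ip in self.flagged

    def clear(self, ip: str) -> None:
        """Zero the statistics, e.g. after the client passes a Captcha."""
        self._hits.pop(ip, None)
        self.flagged.discard(ip)
```

A request handler would call `record()` first and serve the Captcha when it returns True; `clear()` corresponds to the "zeroed" step once the client proves to be human.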
Well, this is an old one, but people looking to do this might want to look at fail2ban.
http://go2linux.garron.me/linux/2011/05/fail2ban-protect-web-server-http-dos-attack-1084.html
That's more of a serverfault sort of answer, as opposed to building this into your application, but I think it's the sort of problem which is most likely better tackled that way. If the logic for what you want to block is complex, consider having your application just log enough info to base the banning policy action on, rather than trying to put the policy into effect.
Consider also that, depending on the web server you use, you might be vulnerable to things like a Slowloris attack, and there's nothing you can do about that at the web application level.

How many open udp or tcp/ip connections can a linux machine have?

There are limits imposed by available memory, bandwidth, CPU, and of course, the network connectivity. But those can often be scaled vertically. Are there any other limiting factors on linux? Can they be overcome without kernel modifications? I suspect that, if nothing else, the limiting factor would become the gigabit ethernet. But for efficient protocols it could take 50K concurrent connections to swamp that. Would something else break before I could get that high?
I'm thinking that I want a software udp and/or tcp/ip load balancer. Unfortunately nothing like that in the open-source community seems to exist, except for the http protocol. But it is not beyond my abilities to write one using epoll. I expect it would go through a lot of tweaking to get it to scale, but that's work that can be done incrementally, and I would be a better programmer for it.
The one parameter you will probably have some difficulty with is jitter. As you scale the number of connections per box, you will undoubtedly put strain on all the resources of that system. As a result, the jitter characteristics of the forwarding function will likely suffer.
Depending on your target requirements, that might or not be an issue: if you plan to support mainly elastic traffic (traffic which does not suffer much from jitter and latency) then it's ok. If the proportion of inelastic traffic is high (e.g. interactive voice/video), then this might be more of an issue.
Of course you can always over engineer in this case ;-)
If you intend to have a server which holds one socket open per client, then it needs to be designed carefully so that it can efficiently check for incoming data from 10k+ clients. This is known as the C10k problem.
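To illustrate the "efficiently check for incoming data" part, here is a minimal sketch using Python's standard `selectors` module, which picks epoll on Linux. One thread asks the kernel which sockets are ready instead of polling each one. The helper names are made up for the example, and `register_pairs` uses `socket.socketpair()` purely as a stand-in for accepted client connections.

```python
import selectors
import socket


def register_pairs(sel: selectors.BaseSelector, count: int) -> list:
    """Create `count` connected socket pairs; register one end, return the others."""
    clients = []
    for _ in range(count):
        server_side, client_side = socket.socketpair()
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ)
        clients.append(client_side)
    return clients


def serve_ready(sel: selectors.BaseSelector, timeout: float = 0.1) -> int:
    """Handle every socket that currently has data; return how many were handled."""
    handled = 0
    for key, _events in sel.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)
        if data:
            # Tiny reply, so sendall is fine here; a real server would buffer
            # writes and wait for EVENT_WRITE instead.
            conn.sendall(data.upper())
        else:
            # An empty read means the peer closed the connection.
            sel.unregister(conn)
            conn.close()
        handled += 1
    return handled
```

With `sel = selectors.DefaultSelector()` and `clients = register_pairs(sel, 50)`, sending on one client and calling `serve_ready(sel)` touches only that one socket; the idle connections cost nothing per loop iteration, which is the property that matters at 10k+ clients.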
Modern Linux kernels can handle a lot more than 10k connections, generally at least 100k. You may need some tuning, particularly of the many TCP timeouts (if using TCP), to avoid closing / stale sockets using up lots of resources if a lot of clients connect and disconnect frequently.
If you are using netfilter's conntrack module, that may also need tuning to track that many connections (this is independent of tcp/udp sockets).
There are lots of technologies for load balancing. The most well-known is LVS (Linux Virtual Server), which can act as the front end to a cluster of real servers. I don't know how many connections it can handle, but I think we use it with at least 50k in production.
To your question, you are only constrained by hardware limitations. This was the design philosophy for Linux systems. You have described exactly what your limiting factors would be.
Try HAProxy software load balancer:
http://haproxy.1wt.eu/

Resources