HTTP Live Streaming proxy setup

I'm currently deciding on what route to take for setting up load balancing and scaling for HLS.
My intuition is to have a custom component (that will auto-scale) and load-balance to it using sticky balancing based on the stream ID.
So basically one of two things can happen:
there is a proxy that already has the stream and is not overloaded (that one will be used)
all proxies that have the stream are overloaded, in which case the least-loaded proxy will be used and will fetch the stream from the source
There can be extra layers on top of this with topology awareness, but this is the basic idea.
However, I was told that the idea behind HLS is that you can just use normal HTTP proxies, which is an area I'm not familiar with.
Is it possible to configure an HTTP proxy in this content-aware way? If so, how would I go about it?
I'm really only familiar with setting up generic pools of proxies without additional configuration.

https://github.com/warren-bank/HLS-Proxy
I am not familiar with this area, but I would think a special-purpose proxy like the one above works better and may save you some headaches.

HAProxy (open source at haproxy.org, commercial at haproxy.com) can support streaming applications.
See "Extending HAProxy with the Stream Processing Offload Engine" and DigitalOcean's high-availability HAProxy setup guide.
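To make the content-aware idea concrete, here is a minimal sketch of the sticky part in Node.js/TypeScript: a thin router extracts a stream ID from the URL and hashes it so that a given stream always lands on the same backend, keeping that proxy's cache hot. The backend addresses and the /streams/<id>/ URL layout are assumptions for illustration, and the load-aware fallback from the question is omitted.

```typescript
import http from "http";
import crypto from "crypto";

// Hypothetical pool of HLS edge proxies (addresses are placeholders).
const backends = ["http://10.0.0.1:8080", "http://10.0.0.2:8080"];

// Assumed URL layout: /streams/<streamId>/... (playlist and segments).
function streamIdOf(url: string): string {
  const match = /^\/streams\/([^/]+)\//.exec(url);
  return match ? match[1] : url; // fall back to hashing the whole path
}

// Deterministic hash of the stream ID, so every request for a given
// stream is routed to the same backend.
function pickBackend(streamId: string): string {
  const digest = crypto.createHash("md5").update(streamId).digest();
  return backends[digest.readUInt32BE(0) % backends.length];
}

http.createServer((req, res) => {
  const target = new URL(pickBackend(streamIdOf(req.url ?? "/")));
  const upstream = http.request(
    { host: target.hostname, port: target.port, path: req.url, method: req.method, headers: req.headers },
    (upstreamRes) => {
      res.writeHead(upstreamRes.statusCode ?? 502, upstreamRes.headers);
      upstreamRes.pipe(res);
    }
  );
  upstream.on("error", () => { res.writeHead(502); res.end(); });
  req.pipe(upstream);
}).listen(3000);
```

For what it's worth, plain nginx can key upstream selection the same way (e.g. hash $arg_stream consistent; in an upstream block), so an ordinary HTTP proxy can indeed be configured in this content-aware fashion without custom code.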

Related

Communication Between Microservices

Say you have microservices A, B, and C, which all currently communicate over HTTP. Say service A sends a request to service B, which results in a response. The data returned in that response must then be sent to service C for some processing before finally being returned to service A. Service A can then display the results on the web page.
I know that latency is an inherent issue with implementing a microservice architecture, and I was wondering what are some common ways of reducing this latency?
Also, I have been doing some reading on how Apache Thrift and RPC's can help with this. Can anyone elaborate on that as well?
The goal of an RPC framework like Apache Thrift is to significantly reduce the manual programming overhead and to provide efficient serialization and transport mechanisms across all kinds of programming languages and platforms.
In other words, this allows you to send your data as a very compact, efficiently serialized packet over the wire, while most of the effort required to achieve this is provided by the framework.
Apache Thrift provides you with a pluggable transport/protocol stack that can quickly be adapted by plugging in different
transports (Sockets, HTTP, pipes, streams, ...)
protocols (binary, compact, JSON, ...)
layers (framed, multiplex, gzip, ...)
Additionally, depending on the target language, you get some server-side infrastructure as well, such as TNonblockingServer or TThreadPoolServer implementations.
So coming back to your initial question, such a framework can help to make communication easier and more efficient. But it cannot magically remove latency from other parts of the OSI stack.
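As a hedged sketch of that pluggable stack using the thrift npm package (the Calculator service and its add method stand in for whatever your IDL defines; check the package documentation for your version's exact API):

```typescript
import thrift from "thrift";

// Stand-in for code generated from a hypothetical calculator.thrift IDL
// via `thrift --gen js:node calculator.thrift`.
const Calculator = require("./gen-nodejs/Calculator");

// The pluggable stack: swapping the transport or protocol is a one-line change.
const connection = thrift.createConnection("localhost", 9090, {
  transport: thrift.TFramedTransport, // framed transport layer
  protocol: thrift.TCompactProtocol,  // compact binary wire protocol
});
connection.on("error", (err: Error) => console.error(err));

const client = thrift.createClient(Calculator, connection);

// Generated client methods return promises when no callback is given.
client.add(2, 3).then((sum: number) => {
  console.log(`2 + 3 = ${sum}`);
  connection.end();
});
```

Switching from, say, the compact protocol to the JSON protocol is a one-line change to the options object; the application code stays the same.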

Isn't Nginx load balancing like a proxy server?

Mainly, is there any difference between using nginx as a load balancer for a bunch of upstream servers, and using a small Node.js proxy server that acts as a proxy between a bunch of servers and one public host?
It may look obvious to you, but nginx is very new to me and I barely know anything about it.
I guess my question is: is there any performance advantage to using nginx as a proxy server that distributes load, versus running your own Node.js code that acts as a proxy in front of the other servers?
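For reference, by a small Node.js proxy I mean something on the order of this toy sketch (backend addresses are placeholders):

```typescript
import http from "http";

// Placeholder backend pool.
const upstreams = [
  { host: "127.0.0.1", port: 3001 },
  { host: "127.0.0.1", port: 3002 },
];
let next = 0;

// Naive round-robin; health checks, retries, buffering and graceful
// reloads (which nginx provides out of the box) are all missing here.
http.createServer((req, res) => {
  const target = upstreams[next++ % upstreams.length];
  const proxied = http.request(
    { host: target.host, port: target.port, path: req.url, method: req.method, headers: req.headers },
    (upstreamRes) => {
      res.writeHead(upstreamRes.statusCode ?? 502, upstreamRes.headers);
      upstreamRes.pipe(res);
    }
  );
  proxied.on("error", () => { res.writeHead(502); res.end("Bad gateway"); });
  req.pipe(proxied);
}).listen(8080);
```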
If introducing one more technology is the concern, I'd say keep the custom Node.js proxy as a short-term solution.
Long term, Nginx as a reverse proxy in front of an array of backends makes a lot of sense, for a number of technical and maintenance reasons. An application rarely stays the same: you add new features, replace legacy code, and deploy new components, so the way to go is to use the right tool for the right task. Nginx is proven, and it is the choice of many heavily loaded applications across the web. Its memory consumption and CPU utilisation are low and stable.
Most people use Nginx as a reverse proxy (the biggest reason to use Nginx, by the way) rather than anything else because it is so powerful and full-featured.
In the request-response life cycle, Nginx rotates between backends and resends the request if a given backend is dead, so not a single request is lost.
From a maintenance point of view, dynamic upstreams with a REST interface (part of the commercial version) look good enough. Even the open-source version makes it easy to roll out an upstream update plus a graceful reload (HUP signal). Nginx also supports zero-downtime binary upgrades (USR2+QUIT).

Integrating real-time components into REST backend

I am implementing a product that will be accessible via web and mobile clients, and am doing thorough research to make sure that I have chosen a good set of tools before I begin. For front-end, I am using AngularJS (Angularjs + angular-ui on web, ionic + cordova on mobile), and because I want to have a single backend serving all types of clients, I plan on implementing a RESTful service (likely one that accepts and returns JSON data). I am leaning towards using Mongo, Node, and Express to create this RESTful API, but am open to suggestions on that front.
But the sticking point for me right now is this: certain parts of the application (including, for example, a live chat/messaging section) need to be real-time. I am aware of the various technologies and protocols for implementing real-time web services (webhooks, websockets, long polling, etc.) and the libraries and frameworks that implement them and expose that functionality (SockJS, Socket.io, etc.) and I want to be clear that I am not asking one of those "what is the best framework" types of questions.
My question is rather about the correct way to implement these two kinds of services side-by-side. Should I be serving the chat separately from the rest of the application? Or is there a clean way to integrate these two different protocols into the same application?
The Express framework is quite modular, so it can sit side by side with a websocket module if you wish. The most common reason for doing this is to share authentication routines across HTTP and websockets by using the same session store in both modules.
For example, you would authenticate a user over HTTP with the Express framework when they log in, which grants access to your chat application. From then on you take advantage of the fast, real-time websocket protocol, and in your server code you check the cookie that the client sends with the socket message and verify that it corresponds to a previously authenticated session.
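A sketch of that pattern with Express, express-session, and Socket.IO follows; the package choices, the wrap-style middleware, and the userId field are illustrative (newer Socket.IO versions also expose io.engine.use(sessionMiddleware) for the same purpose):

```typescript
import express from "express";
import session from "express-session";
import { createServer } from "http";
import { Server } from "socket.io";

// One session middleware instance shared by Express and Socket.IO.
const sessionMiddleware = session({
  secret: "change-me", // placeholder secret
  resave: false,
  saveUninitialized: false,
});

const app = express();
app.use(sessionMiddleware);

// Hypothetical login route: a real one would check credentials first.
app.post("/login", (req, res) => {
  (req.session as any).userId = "alice";
  res.sendStatus(204);
});

const httpServer = createServer(app);
const io = new Server(httpServer);

// Run the same middleware during the websocket handshake, so the socket
// sees the session that was created over HTTP.
io.use((socket, next) => {
  sessionMiddleware(socket.request as any, {} as any, next);
});

io.on("connection", (socket) => {
  const sess = (socket.request as any).session;
  if (!sess?.userId) {
    socket.disconnect(true); // no authenticated session: reject the socket
    return;
  }
  socket.on("chat", (msg) => io.emit("chat", { from: sess.userId, msg }));
});

httpServer.listen(3000);
```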
Many websites use websockets for chat or other push updates, and a separate RESTful API over AJAX, delivered to the same page. There are great reasons to leave RESTful things as they are, particularly if caching is an issue--websockets won't benefit from web caches outside your servers. On any modern browser, websockets are better suited for chat, trading a small keep-alive for a reconnecting long-poll. So two separate interfaces add a little complexity that you may benefit from once scaling and cost-per-user are considered.
If your app grows enough to require this scaling, you'll find this actually simplifies things greatly--clients in the same chat groups can map to the same server, and a load balancer can distribute RESTful calls appropriately.
If you are looking for one communication protocol to serve both needs (calling the server from the client, as well as pushing data from the server), you might have a look at WAMP.
WAMP is an open WebSocket subprotocol that provides two application messaging patterns in one unified protocol: Remote Procedure Calls + Publish & Subscribe.
If you want to dig a little deeper, this describes the why, the motivation and the design. WAMP has multiple implementations in different languages.
Now, if you want to stick to REST, then you cannot integrate push at the protocol level (since REST simply does not have that), but only at "framework level". You need a 2nd protocol. The options are:
WebSocket
Server Sent Events (SSE)
HTTP Long-Poll
SSE in a way could be a good complement to REST. However, it's unsupported on IE (not even IE11), and it's unclear if it ever will be.
WebSocket obviously works, but then why not have it all running over WebSocket? (This line of thinking leads to WAMP).
So IMO the natural complement for REST would be some HTTP long-poll based mechanism for simulating push. You can make HTTP long-poll work robustly; you'll just have to live with the inefficiencies and limitations of HTTP for use cases like this.
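To illustrate, here is a minimal long-poll sketch in Express (route names, the 25-second window, and the single in-memory waiter list are assumptions; a real implementation would need per-user channels and buffering of missed messages):

```typescript
import express from "express";

const app = express();

type Waiter = { deliver: (msg: string) => void; timer: NodeJS.Timeout };
const waiting: Waiter[] = [];

// Long-poll endpoint: hold the request open until a message arrives,
// or answer 204 after 25 seconds so the client simply re-polls.
app.get("/poll", (req, res) => {
  const waiter: Waiter = {
    deliver(msg) {
      clearTimeout(waiter.timer);
      res.json({ msg });
    },
    timer: setTimeout(() => {
      waiting.splice(waiting.indexOf(waiter), 1);
      res.status(204).end();
    }, 25_000),
  };
  waiting.push(waiter);
  req.on("close", () => {
    clearTimeout(waiter.timer); // client went away; clean up
    const i = waiting.indexOf(waiter);
    if (i >= 0) waiting.splice(i, 1);
  });
});

// Publishing delivers to every client currently waiting.
app.post("/publish/:text", (req, res) => {
  while (waiting.length) waiting.shift()!.deliver(req.params.text);
  res.sendStatus(202);
});

app.listen(3000);
```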
You could use a hosted real-time messaging (and even storage) service and integrate it into your frontend apps (web and mobile). These services leverage the websocket protocol and normally include HTTP Comet fallbacks.
The cool thing is that you don't need to manage the underlying infrastructure in terms of high availability and unlimited scalability, and can focus only on developing a great app.
I work for Realtime so I'm a bit biased, but I think the Realtime Framework could help you. More at http://framework.realtime.co

How to use mod_security as standalone?

I've seen the module named "standalone" in the ModSecurity package, but I'm not sure how to use it after making and installing it.
Are there any good resources for getting started?
It does not appear to be possible, based on what the ModSecurity website says about its modes of operation:
Reverse proxies are effectively HTTP routers, designed to stand between web servers and their clients. When you install a dedicated Apache reverse proxy and add ModSecurity to it, you get a "proper" network web application firewall, which you can use to protect any number of web servers on the same network. Many security practitioners prefer having a separate security layer. With it you get complete isolation from the systems you are protecting. On the performance front, a standalone ModSecurity will have resources dedicated to it, which means that you will be able to do more (i.e., have more complex rules). The main disadvantage of this approach is the new point of failure, which will need to be addressed with a high-availability setup of two or more reverse proxies.
They consider it separate because you create a dedicated host that is used for proxying to internal hosts.
That works, but it's technically not standalone.
I also filed a bug, and it was confirmed by Felipe Zimmerle:
Standalone is a wrapper around Apache internals that allows ModSecurity to be executed. That wrapper still demands Apache pieces. It is true that you can extend your application using the Standalone version; however, you will need some Apache pieces.
As you have noted, ModSecurity is an add-on to an existing web server: originally an Apache module (hence the name), but now also available for Nginx and IIS.
You can run it in embedded mode (i.e. as part of your main web server), or in reverse proxy mode (which is basically the same, except you set up a separate web server, run it on that, and direct all traffic through it).
To be perfectly honest, I've never found much point in the reverse proxy method. I guess it does mean you could use it with unsupported web servers (i.e. if you are not using Apache, Nginx, or IIS), and it would reduce the load on your main web server, but other than that it seems like extra steps and infrastructure for no real gain. Some people might prefer to do the ModSecurity checks in front of several web servers, but I would argue that if you have several web servers, it is likely for performance and resiliency reasons, so why not spread ModSecurity to that level too, rather than creating a single point of failure, and possibly a bottleneck, in front of it? The only other reason would be to apply session-level rules (e.g. detecting people changing session IDs) across sessions that might be spread between different web servers, but I've never been convinced that those rules are that great anyway.
When I build ModSecurity I get a mod_security2.so library, but no separate standalone file(s), so I presume you're just seeing this from hunting through the source (I do see a standalone folder there)? I'd say that just because there is a "standalone" folder in the source is no guarantee that it can run as a completely separate, standalone piece.
I'd question why you want to run this as a standalone app even if you could. Web servers have a lot of functionality in them, and making ModSecurity handle not just web security (what it was written for) but also all the other things a web server does (being fast, understanding the HTTP protocol, gzipping and ungzipping, etc.) would needlessly stretch what ModSecurity has to take care of. So why not use a web server for that part and let ModSecurity do what it's good at?
If you are using ModSecurity then I guess you have web apps (presumably with a web server), so why not use it through that?
Finally, is there any problem with installing this through Apache (or Nginx or IIS)? It's free software that's well supported and easy to set up.
I guess ultimately I don't understand the reason for your question. Is there a particular problem you are trying to solve, or is this more just curiosity?

Using Node.js only vs. using Node.js with Apache/Nginx

In what cases should one prefer to use Node.js only as a server in real deployment?
When one does not want to use Node.js only, what plays better with Node.js? Apache or Nginx?
There are several good reasons to stick another webserver in front of Node.js:
Not having to worry about privileges/setuid for the Node.js process. Only root can bind to port 80 typically. If you let nginx/Apache worry about starting as root, binding to port 80, and then relinquishing its root privileges, it means your Node app doesn't have to worry about it.
Serving static files like images, CSS, JS, and HTML. Node may be less efficient than a proper static-file web server (Node may also be faster in select scenarios, but that is unlikely to be the norm). Beyond files being served more efficiently, you won't have to worry about handling ETags or cache-control headers the way you would if you were serving things out of Node. Some frameworks may handle this for you, but you would want to be sure. Regardless, it will still probably be slower.
As Matt Sergeant mentioned in his answer, you can more easily display meaningful error pages or fall back onto a static site if your node service crashes. Otherwise users may just get a timed out connection.
Running another web server in front of Node may help to mitigate security flaws and DoS attacks against Node. For a real-world example, CVE-2013-4450 is prevented by running something like Nginx in front of Node.
I'll caveat the second bullet point by saying you should probably be serving your static files via a CDN, or from behind a caching server like Varnish. If you're doing this it doesn't really matter if the origin is Node or Nginx or Apache.
Caveat with nginx specifically: if you're using websockets, make sure to use a recent version of nginx (>= 1.3.13), since it only just added support for upgrading a connection to use websockets.
Just to add one more reason to pauljz's answer: I use a front-end server so that it can serve up 502 error pages when I'm restarting the backend server or it crashes for some reason. This means your users never see a raw "unable to establish a connection" error.
It is my belief that using Node to serve static files is fine in all circumstances as long as you know what you're doing. It is certainly a new paradigm to use the application server to serve static files as so many (every?) competing technologies (PHP, Ruby, Python, etc) require a web server like HTTPD or Nginx in front of the application server(s).
Every objective reason I have ever read against serving static files with Node revolves around the idea of using what you know best or using what is perceived as better-tested / more stable. These are very valid reasons practically speaking, but have little purely technical relevance.
Unless you find a feature that is possible with a classic web server that is not possible with Node (and I doubt you will), choose what you know best or what you'd prefer to work with as either approach is fine.
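For what it's worth, serving static files from Node is a one-liner with Express, which also sets ETag and Cache-Control headers for you (directory name and cache lifetime below are placeholders):

```typescript
import express from "express";
import path from "path";

const app = express();

// Serve everything under ./public with a one-day client cache.
app.use(express.static(path.join(__dirname, "public"), { maxAge: "1d" }));

app.listen(8080);
```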
As for Nginx vs Apache -- they will "play" with Node the same. You should compare them without regard to Node.
Using Node.js only
Node.js can do all the tasks of a web server: serve static files, respond to API calls, run a server over HTTPS... There are also a lot of packages that provide extra functionality, like logging requests, compressing responses, setting cookies, preventing XSS attacks... A lack of functionality is unlikely to be the reason for using another web server (Apache/Nginx/etc.) alongside Node.js. In other words, for a simple application that does not need to scale, you don't need to add an extra layer in front of Node.js; it just complicates the problem.
Using Node.js with another webserver
Each web server has its own advantages. For example, Apache allows additional per-directory configuration via the .htaccess file, Nginx is known for its performance when serving static files or acting as a reverse proxy, and Node.js provides a huge benefit when dealing with I/O-heavy systems... Sometimes we need to combine the strengths of different web servers to satisfy the system's requirements.
Example: for an enterprise-level application that might scale up in the future, setting up Nginx as a reverse proxy in front of the Node.js application has some advantages:
Nginx can act as a load balancer to dispatch traffic to your Node.js instances if you have more than one.
Nginx can handle HTTPS, caching, and compression for you. Encryption and compression are computationally heavy operations that Node.js is not good at, so using Nginx gives you better performance.
Nginx can serve static content, which reduces the load on Node.js.
Separation of concerns: Nginx takes care of the "configuration" part, and Node.js focuses on the application logic (see the sketch after this list for the Node side).
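On the Node.js side, sitting behind a reverse proxy usually requires one extra setting so the app sees real client addresses instead of the proxy's; a minimal sketch (the port and trust level are assumptions):

```typescript
import express from "express";

const app = express();

// Behind a reverse proxy, trust the first hop so req.ip and req.protocol
// reflect the X-Forwarded-For / X-Forwarded-Proto headers Nginx sets.
app.set("trust proxy", 1);

app.get("/", (req, res) => {
  res.send(`client ip: ${req.ip}, protocol: ${req.protocol}`);
});

// Bind to loopback: only the local Nginx should reach the app directly.
app.listen(3001, "127.0.0.1");
```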
Placing NGINX in front of Node helps handle high connection volumes better. NGINX offers (to name a few) caching, load balancing, and rate limiting (using the leaky-bucket algorithm), and can help mitigate attacks if paired with a banning service like Fail2ban.
For production applications, you can run your application server behind NGINX as a reverse proxy, coupled with a caching server like Redis, all of which can sit behind a content delivery network as another line of defense against exposing your IPv4/IPv6 addresses.
One more point: a reverse proxy is also important if you need to run, for example, a WebSocket server on the same port, or to mix technologies (answering some requests with Node.js and others with PHP, or whatever).
