Use "apt" or compile from scratch for a web service? - linux

For the first time, I am writing a web service that will call upon external programs to process requests in batch. The front-end will accept file uploads and then place them in a queue. The workers on the backend will take that file, run it through ffmpeg and the rest of my pipeline, and send an email when the process is complete.
I have my backend process working on my computer (Ubuntu 10.04). The question is: should I try to re-create that pipeline using binaries that I've compiled from scratch? Or is it okay to use apt when configuring in The Real World?
Not all hosting services use Ubuntu, and not all give me root access. (I haven't chosen a host yet.) However, they will let me upload binaries to execute, and many give me shell access with gcc.
Usually this would be a no-brainer and I'd compile it all from scratch. But doing so - not to mention trying to figure out how to create a platform-independent .tar.gz binary - will be quite a task which ultimately doesn't really help me ship my product.
Do you have any thoughts on the best way to set up my stack so that I'm not tied to a specific hosting provider? Should I try creating my own .deb, which contains Ubuntu's version of ffmpeg (and other tools) with the configurations I need?
Short of a setup where I manage my own servers/VMs (which may very well be what I have to do), how might I accomplish this?

The question is: should I try to re-create that pipeline using binaries that I've compiled from scratch? Or is it okay to use apt when configuring in The Real World?
It's the reverse: it is not okay to deploy unpackaged software in The Real World, IMHO.
and not all give me root access
How would you deploy a .deb without root access? Chroot jails?
But doing so - not to mention trying to figure out how to create a platform-independent .tar.gz binary - will be quite a task which ultimately doesn't really help me ship my product.
+1 You answer your own question. Don't meddle unless you have to.
Do you have any thoughts on the best way to set up my stack so that I'm not tied to a specific hosting provider?
Only depend on well-packaged standard libs (such as ffmpeg). Otherwise, include them in your own deployment. This problem has been solved by tens of thousands of Linux applications over decades now, so it would probably be feasible for you too.
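If you do end up bundling your own build, a minimal sketch of a relocatable tarball might look like the following. All names here (myapp, pipeline.sh, dist/) are placeholders, and the result is still tied to a compatible glibc, so "platform-independent" is only approximate:

    # Stage the tools and their shared libraries into one tree
    mkdir -p dist/bin dist/lib
    cp pipeline.sh dist/bin/
    cp /usr/local/bin/ffmpeg dist/bin/      # only if you compiled it yourself

    # Copy the shared libraries the binary actually links against
    ldd dist/bin/ffmpeg | awk '/=> \// {print $3}' | xargs -I{} cp {} dist/lib/

    # Wrapper so the bundled libs are found at runtime
    cat > dist/bin/run-pipeline <<'EOF'
    #!/bin/sh
    DIR=$(dirname "$(readlink -f "$0")")
    export LD_LIBRARY_PATH="$DIR/../lib:$LD_LIBRARY_PATH"
    exec "$DIR/pipeline.sh" "$@"
    EOF
    chmod +x dist/bin/run-pipeline

    tar czf myapp.tar.gz -C dist .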
Out of the box:
Look at RightScale and other cloud providers/agents that have specialized images/tool chains, especially for video encoding.
A 'regular' VPS provider (with Xen or Virtuozzo) will not normally be happy with these kinds of workload, but EC2, Rackspace and their lot will be absolutely fine with that.
In general, I wouldn't expect a cloud infrastructure provider that doesn't grant root access to allow computationally intensive workloads. $0.02

Related

node.js on shared server space

I am playing with node.js and Angular for the first time. I'm on shared hosting space and trying to get some node.js tests running.
cPanel seems to provide an interface to deploy Node applications. Example:
application url: myurl.com
application root: node-hello-world
application startup file: app.js
This seems to create directory and some artifacts in
/home/myurl/nodevenv/node-hello-world/6/bin
I have (limited?) shell access through cPanel's terminal emulation; however, I get an error on the source command.
source activate
which results in: error: jailshell: fork: Cannot allocate memory
Does this mean node.js is installed and ready to run? Do I have to upload my project as well? Where to? I'm trying to find more info on the process of deploying to this type of server, if it's possible at all.
Sorry for the noob question.
Googling your error message, I came across this thread -- which admittedly is very old, but it's from cPanel and has the following comment from an administrator at the time:
Jailshell is a constrained environment by design. It is not meant to be a replacement for a full-featured, unrestricted shell environment, such as is provided by Bash. If your users need such full-featured environments then perhaps they need full shell access, or another method whereby they can accomplish their goal.
That answer was given in 2006 (yes, 13 years ago) but I have to imagine the spirit of that response is still true.
To be perfectly honest, I'd be afraid to use any shared hosting provider that would give you more than a very limited shell -- it opens the door to many security vulnerabilities, and if multiple customers share the same runtime (i.e. shared hosting) it could be catastrophic. Maybe your host does allow this, or maybe what you're describing isn't actually the same thing I'm referring to... you didn't offer a lot of details on this point.
Back to your question: Does this mean node.js is installed and ready to run? Do I have to upload my project as well? Where to?
If I had to guess, Node probably isn't installed (it isn't in most shared hosting providers) -- but I can't say for sure based on the information you provided. My recommendation would be to call their customer support. Or pay for a dedicated hosting account where you get root access. Or just use something like Heroku.
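Before calling support, here are a few things you could try from the jailshell to see what is actually there. This is a hedged sketch: the paths follow the cPanel layout from your question, and the last line hints at why fork failed:

    command -v node && node --version   # is Node on the PATH at all?
    ls ~/nodevenv/*/*/bin 2>/dev/null   # cPanel-style per-app Node environments
    ulimit -a                           # the memory caps behind "fork: Cannot allocate memory"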

How does RunKit make their virtual servers?

There are many websites providing cloud coding, such as Cloud9 and repl.it. They must use server virtualisation technologies. For example, Cloud9's workspaces are powered by Docker Ubuntu containers. Every workspace is a fully self-contained VM (see details).
I would like to know if there are other technologies for making sandboxed environments. For example, RunKit seems to have a lightweight solution:
It runs a completely standard copy of Node.js on a virtual server created just for you. Every one of npm's 300,000+ packages are pre-installed, so try it out
Does anyone know how RunKit achieves this?
You can see more in "Tonic is now RunKit - A Part of Stripe!" (see discussion):
we attacked the problem of time traveling debugging not at the application level, but directly on the OS by using the bleeding edge virtualization tools of CRIU on top of Docker.
The details are in "Time Traveling in Node.js Notebooks"
we were able to take a different approach thanks to an ambitious open source project called CRIU (which stands for checkpoint and restore in user space).
The name says it all. CRIU aims to give you the same checkpointing capability for a process tree that virtual machines give you for an entire computer.
This is no small task: CRIU incorporates a lot of lessons learned from earlier attempts at similar functionality, and years of discussion and work with the Linux kernel team. The most common use case of CRIU is to allow migrating containers from one computer to another
The next step was to get CRIU working well with Docker
Part of that setup has been open-sourced, as mentioned in this HackerNews thread.
It uses linux containers, currently powered by Docker.
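To make the CRIU-on-Docker idea concrete, here is the raw checkpoint/restore primitive those posts describe. It requires CRIU installed and the Docker daemon running in experimental mode, and the container name "demo" is made up for illustration; RunKit's notebook time-travel builds a lot of machinery on top of this:

    docker run -d --name demo busybox \
        sh -c 'i=0; while true; do i=$((i+1)); echo $i; sleep 1; done'

    docker checkpoint create demo cp1    # freeze the whole process tree to disk
    docker start --checkpoint cp1 demo   # resume exactly where it stopped
    docker logs demo | tail              # the counter continues rather than restarting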

What is the safest way to deliver an Application to novice Linux users?

My customers are novice Linux users, and so am I.
When I gave them my app packaged with Ansible, they ran into Ansible problems; when I gave them manual steps, they screwed those up too. I now have three remaining options: a Perl/Bash script, a snap/deb/rpm package, or Linux containers. Can anyone share their experience on the safest way to see fewer problems when installing my app (written in C)?
This depends on the nature of your application. Debs, rpms etc. are all fine but depend on which distro you're using.
If it's a C application, it might make sense to make it a static binary. That way, your users only have to download a single file and just click on it to make it run. It will be big, but it should work fine regardless of what else is on the system. Otherwise, you'll have to worry about dependencies etc.
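A minimal sketch of that route, where myapp.c stands in for your real sources. One caveat: a fully static link against glibc has known rough edges (NSS/DNS lookups); linking against musl, e.g. via musl-gcc, avoids most of them:

    gcc -O2 -static -o myapp myapp.c
    file myapp    # should report "statically linked"
    ldd myapp     # should report "not a dynamic executable"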
As was commented above, it depends on how you deploy the product.
In general, if you have dependencies (previous packages that you assume were already installed) or your installation is complex - use rpm or deb.
However, if you target multiple distros, bear in mind you will have at least two releases (one rpm and one deb...).
If configuration or installation is easier you can just give them an install script.
If your application requires a specific environment with specific configuration/packages, I'd consider containers, although I've never done that personally.
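For the container route, delivery could look roughly like this. Everything named here (myapp, the 1.0 tag) is hypothetical, and the customer needs nothing installed except Docker:

    cat > Dockerfile <<'EOF'
    FROM debian:stable-slim
    COPY myapp /usr/local/bin/myapp
    ENTRYPOINT ["/usr/local/bin/myapp"]
    EOF
    docker build -t myapp:1.0 .
    docker save myapp:1.0 | gzip > myapp-1.0.tar.gz   # ship this single file

    # On the customer's machine:
    gunzip -c myapp-1.0.tar.gz | docker load
    docker run --rm myapp:1.0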

NodeJS Managed Hostings vs VPS [closed]

There are a bunch of managed cloud-based hosting services for Node.js out there which seem relatively new, and some are still in beta.
Yet another path to host a Node.js app is setting up a stack on a VPS like Linode.
I'm wondering what's the basic difference here between these two kinds of deployment.
Which factors should one consider in choosing one over another?
Which one is more suitable for production, considering how young these services are?
To be clear, I'm not asking about choosing a provider, but about deciding whether to host on a managed Node.js-specific hosting service or on an old-fashioned self-setup VPS.
Using one of the services is for the most part hands-off - you write your code and let them worry about managing the box, keeping your process up, creating the publishing channel, patching the OS, etc...
In contrast, having your own VM gives you more control, but with more up-front and ongoing time investment.
Another consideration is some hosters and cloud providers offer proprietary or distinct variations on technologies. They have reasons for them and they offer value but it does mean that if you want to switch cloud providers, it might mean you have to rewrite code, deployment scripts etc... On the other hand using VMs with standard OS as the baseline is pretty generic. If you automate/script/document the configuration of your VMs and your code stays generic, then your options stay open. If you do take a dependency on a proprietary cloud technology then it would be good to abstract it away behind an interface so it's a decoupled component and not sprinkled throughout your code.
I've done both. I did the VM path recently mostly because I wanted the learning experience. I had to:
get the VM from the cloud provider
update and patch the OS
install and configure git as a publishing channel
write some scripts and use things like forever to keep it running
configure the reverse http-proxy to get it to run multiple sites
configure DNS with the cloud provider, open ports for git, etc.
The list goes on. In the end, it cost me more up front time not coding but I learned about a lot more things. If those are important to you, then give it a shot. If you want to focus on writing your code, then a node hosting provider may be for you.
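To make that list concrete, here is a condensed, hypothetical replay of those steps on a fresh Ubuntu VM; the package names and app.js are illustrative, so adjust for your distro and app:

    sudo apt-get update && sudo apt-get upgrade -y   # patch the OS
    sudo apt-get install -y git nginx nodejs npm     # publishing channel, reverse proxy, runtime
    sudo npm install -g forever                      # process supervisor
    forever start app.js                             # keep the app running across crashes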
At the end of it, I also had more options - I wanted to add a second site. I added an entry to my reverse proxy, appended a line to my startup script to launch another app with forever, and voilà - another site. More control. After that, I wanted to try out MongoDB - simple - I installed it.
Cost-wise, they're about the same, but if you start hosting multiple sites with many other packages like databases etc., then the VM can start getting cheaper.
Nodejitsu has open-sourced their tools, which also makes things easier if you roll your own.
If you do it yourself, here are some links that may help you:
Keeping the server up:
https://github.com/nodejitsu/forever/
http://blog.nodejitsu.com/keep-a-nodejs-server-up-with-forever
https://github.com/bryanmacfarlane/svchost
Upstart and Monit
generic auto start and restart through monitoring
http://howtonode.org/deploying-node-upstart-monit
Cluster Node
Runs one process per core
http://nodejs.org/docs/latest/api/cluster.html
Reverse Proxy
https://github.com/nodejitsu/node-http-proxy
https://github.com/nodejitsu/node-http-proxy/issues/232
http://blog.nodejitsu.com/http-proxy-middlewares
https://github.com/nodejitsu/node-http-proxy/issues/168#issuecomment-3289492
http://blog.argteam.com/coding/hardening-node-js-for-production-part-2-using-nginx-to-avoid-node-js-load/
Script the install
https://github.com/bryanmacfarlane/svcinstall
Exit Shell Script Based on Process Exit Code (see the sketch just after this list)
Publish Site
Using git to publish to a website
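On the exit-code item above, a minimal defensive pattern for a deploy/install script; healthcheck.js and app.js are hypothetical names:

    set -e                            # abort on the first failing command
    git pull
    npm install --production
    if ! node healthcheck.js; then    # any non-zero exit stops the deploy
        echo "health check failed, not restarting" >&2
        exit 1
    fi
    forever restart app.js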
IMHO the biggest drawback of setting up your own stack is that you need to manage things like making Node.js run forever, starting it as a daemon, putting it behind a reverse proxy such as Nginx, and so on ... the great thing about Node.js - making firing up a web server a one-liner - is one of its biggest drawbacks when it comes to production-ready systems.
Plus, you've got all the issues of managing and updating and securing your server yourself.
This is so much easier with the hosters: Usually it's a git push and that's it. Scaling? Easy. Replication? Easy. ...? Easy. All within a few clicks.
The drawback with the hosters is that you cannot adjust the environment. Okay, you can probably choose which version of Node.js and/or npm to run, but that's it. You have no control over what third-party software is installed. You've got no control over the OS. You've got no control over where the servers are located. And so on ...
Of course, some hosters allow you access to some of these things, but there is rarely a hoster that supports all.
So, basically the question regarding Node.js is the same as with any other technology: it's a weighing of pros and cons around individualism, pricing, scalability, reliability, knowledge, ...
I personally chose to go with a hoster: the time and effort I save easily outweigh the disadvantages. Mind you: for me, personally.
This question needs to be answered individually.
Using Docker is another way to simplify the setup on a single Linux VPS. With Docker, both development and production setups are faster, more robust, and more secure.
The setup is faster and more robust because you deploy a ready-made Node.js image in one step, without running any installation scripts. And it is more secure because internal dependencies, such as the database, can be hidden from the outside world completely and made accessible only from Docker's internal network. On top of that, Docker significantly simplifies the upgrade process for the underlying OS and the Node.js runtime.
There are two ways to set up a Node.js Docker environment. The first: follow the instructions published here on how to dockerize your application and deploy it with Docker, along with databases when needed. The guide gives instructions for the development setup; the production setup will be similar.
The other way is to deploy the official Node.js Docker image and mount your application code as a volume into it. That allows you to update the Node.js image going forward without rebuilding and redeploying the application, which addresses the long-standing problem of security-patching Docker images.
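A sketch of that second approach; the paths, the port, and the lts tag are illustrative rather than prescriptive:

    docker run -d --name myapp \
        -v "$PWD/app":/usr/src/app \
        -w /usr/src/app \
        -p 3000:3000 \
        node:lts node app.js

    # Upgrading the runtime later means swapping the image, not rebuilding the app:
    docker rm -f myapp
    docker pull node:lts
    docker run -d --name myapp -v "$PWD/app":/usr/src/app -w /usr/src/app \
        -p 3000:3000 node:lts node app.js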
To help with the setup of Docker on a single machine, you can use Abberit Admin Panel. It will set up a Node.js environment for you with the click of a button, including databases if you need them. The tool is free, and you can turn it off after you have completed the initial setup. On the other hand, if you later decide to reduce the maintenance tax of production, you can migrate to a managed service without any changes to the app.
Disclaimer: I am one of the founders of Abberit.

Remote software update on Linux machines

We develop a Linux-based networking application which will run on multiple servers. We need a solution for remote application updates.
All I can think of right now is using rpm/deb packages, but we'd prefer not to lock this to a distro-specific solution. Besides copying files over SSH with some Bash script, what would you recommend?
Thanks.
Distros vary so much in setup and dependencies that I would actually recommend you create distro-specific packages and integrate with each distro's update tool - in the end it normally saves you a ton of trouble.
With the ease of virtualization, it's rather easy to spin up a VMware/VirtualBox image for the various distros to create and test packaging for each of them.
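One tool that can ease the multi-distro packaging burden is fpm (https://github.com/jordansissel/fpm), which emits both formats from a single staging tree. A hedged sketch, with the app name and version as placeholders:

    mkdir -p staging/usr/local/bin
    cp myapp staging/usr/local/bin/

    fpm -s dir -t deb -n myapp -v 1.0.0 -C staging .   # Debian/Ubuntu
    fpm -s dir -t rpm -n myapp -v 1.0.0 -C staging .   # RHEL/Fedora/SUSE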
How about Puppet?
Check out Blueprint and Blueprint I/O. Blueprint is a tool that detects all of the packages, file modifications and source installs on a server. It packages them up in a reusable format called a blueprint that can be applied to another server. Blueprint I/O is a tool for pushing blueprints to and pulling them from another server. Both are open-source. Hope this helps.
https://github.com/devstructure/blueprint (Blueprint on GitHub)
https://github.com/devstructure/blueprint-io (Blueprint I/O on GitHub)
I'm eight years late, but check Ansible.
Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy. Avoid writing scripts or custom code to deploy and update your applications - automate in a language that approaches plain English, using SSH, with no agents to install on remote systems.
Also, you can check this guide.
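For the use case in the question (pushing an updated binary to many servers over SSH), a minimal ad-hoc Ansible run could look like this. The inventory file "hosts", the binary path, and the service name are all placeholders:

    ansible all -i hosts -m copy \
        -a "src=./myapp dest=/usr/local/bin/myapp mode=0755" --become
    ansible all -i hosts -m service \
        -a "name=myapp state=restarted" --become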
