Which web servers meet at least part of the following requirements:
multithreading
can receive scattered request headers
can call an external DLL
capable of processing large data (50 gigabytes or more)
embedded
Some of these requirements are met by the Mongoose web server; are there any other similar technologies?
We are building a website for visualizing large sets of data. The datasets are loaded on the server (NodeJS and Express) using crossfilter.js and, by exposing different endpoints, the data is sent to the website where the visualizations are built.
So far the server has only been able to provide visualizations for the dataset that is loaded when the server starts. To change the dataset, the server needs to be restarted. We are now trying to allow the user to change the dataset that he is visualizing. The problem is that we don't know the best approach. The necessary steps would be the following:
The user provides the new dataset
The server loads this dataset without altering the datasets that other users are using
The server is able to provide each user with the right dataset.
Basically, our uncertainty is about how to load multiple datasets into memory. The server might become too overloaded.
Any advice?
You have a couple of choices here for memory management. You can allocate enough memory for everything and overflow to swap. Or, more complicated, store the files on disk and have the datasets follow a cache pattern. Using a fixed-size LRU cache will allow as many datasets as you have disk space for, with the caveat that returning users may have to wait a second or two while their dataset is read off disk back into memory.
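A minimal sketch of that cache pattern in Node.js, assuming a placeholder loadDatasetFromDisk() that stands in for however you parse the file and build the crossfilter; the size limit and names are purely illustrative:

```javascript
// Rough sketch of a fixed-size LRU cache for datasets (not production code).
const fs = require('fs').promises;

const MAX_DATASETS = 3;      // tune to available RAM
const cache = new Map();     // Map preserves insertion order, so it can act as an LRU

// Placeholder: read the file and build whatever in-memory structure you need
// (e.g. a crossfilter instance) from it.
async function loadDatasetFromDisk(path) {
  const raw = await fs.readFile(path, 'utf8');
  return JSON.parse(raw);
}

async function getDataset(path) {
  if (cache.has(path)) {
    const data = cache.get(path);
    cache.delete(path);      // re-insert to mark as most recently used
    cache.set(path, data);
    return data;
  }
  const data = await loadDatasetFromDisk(path);   // returning users pay this disk cost
  cache.set(path, data);
  if (cache.size > MAX_DATASETS) {
    const oldest = cache.keys().next().value;     // least recently used entry
    cache.delete(oldest);
  }
  return data;
}

module.exports = { getDataset };
```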
As a Node.js beginner coming from enterprise IT, I am unable to comprehend one aspect of Node.js usage. I am framing my question in two parts.
Question 1) Strictly from a scalability standpoint, how can an I/O-heavy web application scale using Node.js unless we also scale the back-end I/O resources it consumes?
A database server can serve only "X" concurrent users. Even if a Node-based HTTP server is able to handle more incoming requests, overall throughput is going to be dictated by the number of concurrent connections the DB can handle.
The same applies to other enterprise resources like content retrieval from file servers or invocation of legacy APIs. I understand that we would be less worried about cloud resources, which can scale elastically and are not in our direct purview.
Question 2) If the answer to the above question is "Node is not a one-size-fits-all solution", how are companies like PayPal, Walmart, LinkedIn et al. able to gain scale using Node? They too would integrate with their existing system landscape, and are not purely network-based applications (or are they?).
Node.js is typically used as an orchestration layer in SOA. It is mainly used as a front end for backend services. It is true that throughput is going to be dictated by the number of concurrent connections the DB can handle, but there is also the time it takes the presentation layer to present the content.
Web technologies like JSP and Ruby on Rails are designed to build the content on the server and serve it to the client as a single page, and they are not well suited for an orchestration layer. Today we need services that handle mobile clients, where there are lots of API calls that each retrieve a small amount of data (see the sketch below). This is where Node.js reduces response times and improves the user experience.
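For illustration only, here is a rough sketch of what such an orchestration layer can look like with Express: one endpoint fans out to two hypothetical backend services in parallel and returns only the small combined JSON payload a mobile client needs. The service URLs and field names are made up, and the global fetch() assumes Node 18+.

```javascript
// Illustrative only: Express endpoint acting as an orchestration layer.
// The backend URLs and response shapes are hypothetical.
const express = require('express');
const app = express();

app.get('/api/dashboard/:userId', async (req, res) => {
  try {
    // Fan out to backend services in parallel instead of calling them sequentially.
    const [profileRes, ordersRes] = await Promise.all([
      fetch(`http://profile-service.internal/users/${req.params.userId}`),
      fetch(`http://order-service.internal/orders?user=${req.params.userId}`),
    ]);
    const [profile, orders] = await Promise.all([profileRes.json(), ordersRes.json()]);

    // Return only the small slice the mobile client actually displays.
    res.json({ name: profile.name, recentOrders: orders.slice(0, 5) });
  } catch (err) {
    res.status(502).json({ error: 'upstream failure' });
  }
});

app.listen(3000);
```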
Look at the video by Eran Hammer at http://nodejs.org/video/ to understand how Node.js is being used at Walmart.
We have an embedded box. The CPU is medium speed (600 MHz) and the RAM is between 512 MB and 2 GB depending on the configuration. The system consists of a data layer that processes incoming data from the hardware, which needs to be displayed both remotely and on an HDMI output.
Since the remote aspect is as important as the local display, we have architected a client-server solution. The server just needs to respond to requests for data. The data comes from the internals of another process (IPC interaction) and is returned formatted for the client.
For the server we are thinking of using node.js. The data is spec'ed to be formatted as JSON messages, so using JavaScript and JSON is simple.
However, are there any better/other server options to consider?
The main requirement is that the server can be extended to interact with the data process and is able to process and respond to requests.
We could write it ourselves, but feel that there must be usable tech to leverage already.
Thanks.
I am assuming that you need output as a webpage only.
It depends.
If your team knows Java/servlets well, you can do it with a simple Jetty/servlet/JSP combination.
But if you have a team that is good with JavaScript/Node.js, go with it. I am not sure about the stability requirements you have, though: Node.js is quite stable, but it hasn't reached 1.0 yet.
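If you do go with Node.js, here is a minimal sketch of the shape such a server could take. Purely as an assumption for illustration, the data process is reached over a Unix domain socket at a made-up path and speaks newline-delimited JSON:

```javascript
// Sketch only: HTTP front end that forwards requests to a local data process.
// The socket path and message framing are assumptions, not part of the question.
const http = require('http');
const net = require('net');

const DATA_SOCKET = '/tmp/data-process.sock';   // hypothetical IPC endpoint

function queryDataProcess(request) {
  return new Promise((resolve, reject) => {
    const sock = net.createConnection(DATA_SOCKET, () => {
      sock.write(JSON.stringify(request) + '\n');
    });
    let buf = '';
    sock.on('data', chunk => {
      buf += chunk;
      if (buf.includes('\n')) {                 // one newline-delimited JSON reply
        sock.end();
        resolve(JSON.parse(buf.trim()));
      }
    });
    sock.on('error', reject);
  });
}

http.createServer(async (req, res) => {
  try {
    const data = await queryDataProcess({ path: req.url });
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(data));
  } catch (err) {
    res.writeHead(502);
    res.end();
  }
}).listen(8080);
```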
I'm reading about the different approaches to scaling request-handling capabilities on a single machine taken by Node.js, Ruby, Jetty and company.
Being an application developer, i.e. having very little understanding of kernel/networking internals, I'm curious about the different approaches taken by each implementation (kernel select, polling the socket for connections, event-based I/O, and so on).
Please note that I'm not asking about special handling features (such as Jetty continuations (request->wait->request), a pattern which is typical for AJAX clients) but more generally: if you wanted to implement a server that can respond with "Hello World" to the maximum number of concurrent clients, how would you do it, and why?
Information / References to reading material would be great.
Take a look at The C10K problem page.
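As a concrete point of comparison, the event-driven approach Node.js takes boils down to a single-threaded event loop multiplexing all connections (libuv uses epoll, kqueue, or IOCP under the hood) rather than one thread per client. A minimal "Hello World" along those lines:

```javascript
// "Hello World" with a single event loop handling every connection.
// Node.js delegates readiness notifications to libuv, which uses
// epoll (Linux) / kqueue (BSD, macOS) / IOCP (Windows) under the hood.
const http = require('http');

http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello World\n');
}).listen(8080, () => {
  console.log('listening on :8080');
});
```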
How would you go about describing the architecture of a "system" that splits a sensitive file into smaller pieces on different servers in order to protect the file?
Would we translate the file into bytes, and then distribute those bytes onto different servers? How would you even go about getting all the pieces back together in order to call the original file back (if you have the correct permissions)?
This is a theoretical problem that I do not know how to approach. Any hints at where I should start?
Not an authoritative answer, but you will get many replies here that provide partial answers to your question. It may just give you some ideas.
My guess is, you would be creating a custom file system.
Take a look at various filesystems like
GmailFS: http://richard.jones.name/google-hacks/gmail-filesystem/gmail-filesystem.html
pyfilesystem: http://code.google.com/p/pyfilesystem/
A distributed file system in python: http://pypi.python.org/pypi/koboldfs
Hence, architecturally it will be very similar to the way a typical distributed filesystem is implemented.
It should be a client/server architecture in master/slave mode. You will have to create a custom protocol for their communication.
The master process is what you talk to for retrieving/writing your files.
The slave filesystems would be distributed across different servers; each keeps tagged files that contain partial pieces of a file.
The master filesystem contains a per-file entry that records the sequence of tagged pieces distributed across the various slave servers.
You can get redundancy by storing a tagged piece on multiple servers.
The communication protocol will have to allow multiple servers to respond to a request for tagged data; in the simplest case the master simply picks one response and ignores the others (a rough sketch of the split/reassemble flow is further below).
The usual security requirements need to be respected for storing and communicating this information across servers.
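Here is a rough sketch of just the split/reassemble part in Node.js. The master's index and the "slave" chunk stores are plain in-memory objects here; placement across servers, the transport protocol and encryption are deliberately left out, and the file names are made up:

```javascript
// Sketch only: split a file into fixed-size chunks and reassemble them from an index.
// In the real system each chunk would live on a different slave server and the
// index would be held by the master; here both are local for illustration.
const fs = require('fs');
const crypto = require('crypto');

const CHUNK_SIZE = 1024 * 1024;   // 1 MiB per piece

function splitFile(path) {
  const data = fs.readFileSync(path);
  const index = [];               // master's per-file entry: ordered list of chunk ids
  const chunks = {};              // stand-in for the slave servers
  for (let offset = 0; offset < data.length; offset += CHUNK_SIZE) {
    const piece = data.subarray(offset, offset + CHUNK_SIZE);
    const id = crypto.createHash('sha256').update(piece).digest('hex');
    chunks[id] = piece;           // in practice: send to a slave, to several for redundancy
    index.push(id);
  }
  return { index, chunks };
}

function reassemble(index, chunks) {
  // Master looks up each chunk id in order and concatenates the pieces.
  return Buffer.concat(index.map(id => chunks[id]));
}

// Hypothetical usage with made-up file names.
const { index, chunks } = splitFile('secret.bin');
fs.writeFileSync('restored.bin', reassemble(index, chunks));
```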
You will be most interested in a secure distributed filesystem implemented in Python: Tahoe-LAFS.
http://tahoe-lafs.org/~warner/pycon-tahoe.html
http://tahoe-lafs.org/trac/tahoe-lafs