I am creating a novel website that integrates web feeds from around the internet. I want to build a backend that does CPU intensive analysis of the web data on a regular basis, which will eventually add the results continuously into a database.
This database will be accessable by the website through a normal asp.net backend that will server the page up to the client.
Is it advisable, and best practice, to build the complex CPU operations in c# binaries that run continuously on the server?
Sounds like you want a .NET executable that either runs on a schedule (cronjob-style) or that schedules itself. In any case it's wise to have it completely separate to your website process. It sounds like data-generation and data-serving are separate concerns, so they should be kept separate. This also means that you can move it off the web-serving machine if load becomes an issue. If you're updating a live database remember to take transactions into account.
Related
I've been thinking about nodejs lately and i'm wondering if i can use node.js for a huge Point Of Sales application project but it got me thinking if is it a good idea to use node.js for this job? can a POS system considered as a blocking application?
I think NodeJs is a good choice for POS because this system is not CPU intensive, instead, its data intensive. This means that you do not need to perform heavy processing and complex formulas on transactions, instead your application will receive a transaction, will perform some basic operations, and will place it in the database. NodeJS is good for that. If you use NodeJs with Quasar it will save a lot of your development time because you only have to write your application one time and then using Quasar you can make Cross-Platform Desktop App from it (using ElectronJS), Mobile app (using Cordova), and a Web app.
I am planning to build a nodejs application as back end of a web application.
This server application will provide a rest api, websockets service and process to scrape some sites with a headless navigation (like zombie.js) each n hours to feed a database.
I would to ask if it's a good approache build all the things in one nodejs instance or if it's better use several nodejs applications for every task.
If you are having small size application (which doesnot require scale in future), you can keep all the stuff Rest, Socket, scraping on the same project).
Note: After scraping if you are processing HTML content, it will take some time to process that in Sync manner. So at that time event loop will be blocked. So Rest API will not handle any request. In this scenario, you can keep Rest and Socket combinely in one project and Scraping thing in another project.
If you are planning to scale in near future, I suggest to Keep Seperate Instance, considering Benefit of SOA in scaling.
In my opinion better way is using different Node.js applications. First one will server your API. Another one will work as a socket server on different port.
About scraper it can be PHP (as well as nodejs) script which will run as a cron job. How to setup cron job you can check this question:
https://askubuntu.com/questions/2368/how-do-i-set-up-a-cron-job
or this tutorial for Ubuntu server: https://help.ubuntu.com/community/CronHowto
I think it will be best approach.
I'm currently in the planning stages of a new project which is composed of a storefront, a highly reactive user dashboard, and the individual products being offered via the storefront being highly interactive mini-apps. We're trying to get away with making the entire platform a SPA and design the entire thing on a Flux architecture with React for the front-end views.
One issue, as with most SPAs, is SEO. I've prototyped an isomorphic solution based on the este.js dev stack. One issue is that our app consumes pretty much all of its data from a RESTful server, which is separate from the web server serving up the SPA. This means that the web server would need to fetch a considerable amount of data from the RESTful server, to isomorphically generate the HTML snapshot.
I've considered having a separate crawler process of my own crawl the entire storefront periodically and isomorphically generate HTML snapshots of the pages that could be served up when the web server encounters a search engine crawler. I'm not sure if this is a good approach though, as it would likely introduce additional maintenance and, frankly, seems a bit fragile. I could just have the web server isomorphically generate the HTML on the fly, but I fear bogging the server down for ordinary users as the server would be pulling considerable data from the REST API...
Is there a better way to handle such a case?
Check out Yahoo's Fetchr, an open-source library that allows you to isomorphically hit your API. It ties into Facebook's Flux architecture, so you need to have Stores, but at the very least you can glean some code and concepts from it. If you're in the planning phase, you might even consider going with the Flux or Fluxible.
http://fluxible.io/guides/data-services.html
https://github.com/yahoo/fetchr
I am conducting some research on emerging web technologies and have created a very simple Azure website which makes use of web sockets and mongo db as the database. I have managed to get all the components working together and now must perform load testing on the application.
The main criteria is the maximum user load that the app can support, at the moment there is 1 web role instance, so probably I would need to test the max user load for that instance, then try with 2 instances and so on.
I found some solutions online such as Loadstorm, however I cannot afford to pay to use these services so I need to be able to do this from my own development machine OR from another cloud service.
I have come across Visual Studio Load Tests and they seem quite useful, however it seems they require VS Ultimate and an active msdn subscription - the prerequisites are listed here. Also, from this video which shows the basics of load tests, it seems like these load tests are created completely separately from the actual web project, so does that mean I can only see metrics related to the user? i.e. I cannot see the amount of RAM being used, processor etc.
Any suggestions?
You might create a Linux virtual machine in Azure itself or another hosting provider and use ApacheBench (ab) or JMeter to do simple load testing on your application. Be aware that in such a setup your benchmark servers may be a bottleneck themselves.
Another approach is to use online load testing services wich allow some free usage, such as:
loader.io, by SendGrid Labs
LoadStorm
Blazemeter
Blitz
Neotys
Loadimpact
For load-testing, LoadStorm is very reasonably priced, especially compared to on-premises software (and has a free tier with up to 25 virtual clients). You can install code such as jmeter, but you'll still need machines (or vm's) to host and run it from, and you need to make sure that the load-generator machines aren't the bottleneck in your tests.
When you run your tests, you may want to consider separating your web tier from MongoDB. MongoDB will consume as much memory as possible (as that's what gives MongoDB its speed). In a real-world scenario, you'll likely have MongoDB in its own environment. So for your tests, I'd consider offloading MongoDB to its own instance(s), and 10gen has a Worker Role setup that's fairly straightforward to install.
Also remember that NIC bandwidth is 100Mbps per core, which could be a limiting factor on your tests, depending on how much load you're driving.
One alternative to self-hosting MongoDB: Offload MongoDB to a hoster such as MongoLab. This will allow you to test the capacity of your web app without worrying about the details around MongoDB setup, configuration, optimization, etc. Currently MongoLab offers their free tier hosted in Azure, US West and US East data centers.
Editing my response, didnt read the question carefully.
Check out this thread for various tools and links:
Open source Tool for Stress, Load and Performance testing
If you are interested in finding the performance counters of the application under test you can revisit some of the latest features added to Visual Load Cloud base load test.
http://blogs.msdn.com/b/visualstudioalm/archive/2014/04/07/get-application-performance-data-during-load-runs-with-visual-studio-online.aspx
To get more info on Visual Studio Cloud Load Testing solution - https://www.visualstudio.com/features/vso-cloud-load-testing-vs
I'm hoping someone can validate or correct my conclusions here.
I'm looking into writing a small side project. I want to create a desktop application for taking notes that will synchronise to a web-server so that multiple installations can be kept in step and data shared and also so that it can be accessed via a browser if necessary.
I've kind of been half-listening to the noises about CouchDB and I've heard mention of "offline functionality", of desktop-couchdb and of moves to utilise its ability to handle intermittent communications to enable distributed applications in the mobile market. This all led me to believe that it might be an interesting option to look at for providing my data storage and also handling my synchronisation needs, but after spending some time looking around for info on how to get started my conclusion is that I've got completely the wrong end of the stick and the reality is that:
There's no way of packaging up a CouchDB instance, distributing it as part of a desktop application and running it in the context of that application to provide local storage and synchronisation to a central database.
Am I correct here? If so is there any technology out there that does this sort of thing or am I left just rolling my own local storage and maybe still using CouchDB on the server?
Update (2012/05): check out the new TouchDB projects from Couchbase if you are targeting Mac OS X and/or iOS or Android. These actually use SQLite under the hood (at least for now) but can replicate to/from a "real" CouchDB server. Another clientside alternative that is finally starting to mature is PouchDB, which runs in IndexedDB capable browser engines. Using these or using them to inspire similar port to another desktop platform is now becoming a better-trod path.
Original answer:
There's no way of packaging up a
CouchDB instance, distributing it as
part of a desktop application and
running it in the context of that
application to provide local storage
and synchronisation to a central
database.
At this point in time, your statement is practically correct although it is possible to include CouchDB in an app — for an example see CouchDBX.app which is a thin wrapper around a prefixed bundle of CouchDB and all its dependencies.
The easiest way to build a CouchDB app is to assume that the user will already have a CouchDB server running. This is easier than it sounds, especially with Couchone's hosting or a prebuilt app like CouchDBX on OS X or DesktopCouch on Ubuntu. This latter is especially interesting, because if I understand correctly it is included by default with Ubuntu these days, and automatically spins up a CouchDB server per-user when you query its port via D-Bus. Something similar could (and should) be done on OS X using launchd and Bonjour.
So as you write, you either would design your app to store data in a local format and optionally sync with a CouchDB service you provide or you'd have to build and bundle all of Erlang, SpiderMonkey and CouchDB together with your app along with some scripts to make sure it was running when needed. This is possible but obviously neither of these are ideal, and believe me you're not the only one wanting a simpler solution for desktop-oriented apps!