Coding a caching server - programming-languages

I have been researching and have a few ideas about a distributed caching system for an in-memory key-value store, with replication and all the jazz associated with it. So I wanted to know from the community what the best language/framework/technology mix is that I should go for.

Surely you know there's stuff like memcached out there? It's powering some of the busiest sites on the web. No need to reinvent the wheel here.
If you're going to write your own anyway, you want to make it as fast as possible, so I'd choose C or C++. Fast, widely supported, and easy to write bindings for other languages.
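If you do roll your own, the core of such a server is surprisingly small. Here is a rough sketch of the request loop (in Python for brevity rather than the C/C++ suggested above; the text protocol, port, and command names are made up, and replication and eviction are left out entirely):

    import socketserver

    store = {}  # the in-memory key-value data (CPython dict ops are atomic enough for a sketch)

    class CacheHandler(socketserver.StreamRequestHandler):
        # Made-up line protocol: "SET key value", "GET key", "DEL key"
        def handle(self):
            for raw in self.rfile:
                parts = raw.decode().strip().split(" ", 2)
                cmd = parts[0].upper() if parts else ""
                if cmd == "SET" and len(parts) == 3:
                    store[parts[1]] = parts[2]
                    reply = "OK"
                elif cmd == "GET" and len(parts) == 2:
                    reply = store.get(parts[1], "NOT_FOUND")
                elif cmd == "DEL" and len(parts) == 2:
                    store.pop(parts[1], None)
                    reply = "OK"
                else:
                    reply = "ERR"
                self.wfile.write((reply + "\n").encode())

    if __name__ == "__main__":
        # One thread per client connection; a real server would add eviction,
        # memory accounting, and replication to peers.
        with socketserver.ThreadingTCPServer(("127.0.0.1", 11311), CacheHandler) as srv:
            srv.serve_forever()

The interesting engineering is everything this sketch omits: eviction policy, memory limits, the replication protocol, and failure handling, which is exactly why the memcached suggestion above is usually the pragmatic answer.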

Related

Purpose built light-weight alternative to SSL/TLS?

Target hardware is a rather low-powered MCU (ARM Cortex-M3 @ 72 MHz, with just about 64KB SRAM and 256KB flash), so I'm walking a thin line here. My board does have Ethernet, and I will eventually get lwIP (the lightweight FOSS TCP/IP suite) running on it (currently struggling). However, I also need some kind of super lightweight alternative to SSL/TLS. I am aware of the multiple GPL'd SSL/TLS implementations for such MCUs, but their footprint is still fairly significant. While they do fit in, given everything else they don't leave much room for anything else.
My traffic is not HTTP, so I don't have to worry about HTTPS, and my client/server communication can be completely proprietary, so a non-standard solution is okay. I'm looking for suggestions on a minimalistic yet robust (weak security is worthless) alternative that helps me:
Encrypt my communication (C->S & S->C)
Do 2-way authentication (C->S & S->C)
Avoid man-in-the-middle attacks
I won't be able to optimize the library at the ARMv7 assembly level, and thus bank entirely on my programming skills and the GNU ARM compiler's optimizations. Given the above, any pointers on what might be the best options?
If any of those small TLS implementations allow you to disable all X.509 and ASN.1 functionality and just use TLS with pre-shared keys, you'd have quite a small footprint. That's because only symmetric ciphers and hashes are used.
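To illustrate the shape of that symmetric-only design, here is a desktop Python sketch (not MCU code) of protecting each record with a pre-shared key and an AEAD cipher; it assumes the third-party cryptography package, and key provisioning, nonce management, and session setup are glossed over:

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    psk = AESGCM.generate_key(bit_length=128)  # in practice: provisioned into both devices ahead of time
    aead = AESGCM(psk)

    def seal(plaintext, sequence_number):
        # Fresh random nonce per record; binding the sequence number as associated
        # data gives basic protection against reordering and replay within a session.
        nonce = os.urandom(12)
        header = sequence_number.to_bytes(8, "big")
        return nonce + header + aead.encrypt(nonce, plaintext, header)

    def open_record(record):
        nonce, header, ciphertext = record[:12], record[12:20], record[20:]
        return aead.decrypt(nonce, ciphertext, header)  # raises if the record was tampered with

    msg = seal(b"sensor reading: 42", sequence_number=1)
    assert open_record(msg) == b"sensor reading: 42"

Because both sides must hold the key to produce valid records, this also gives you implicit two-way authentication, which covers the three requirements in the question as long as the key never leaks.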
There's CurveCP. It's meant to completely replace SSL.
It's fairly new, and still undergoing development, but its author is a well-known expert in the field, and has been carefully working toward it during the past decade. A lot of careful research and design has been put into it.
My immediate reaction would be to consider Kerberos. It's been heavily studied, so the chances of major holes at this point are fairly remote. At the same time, it's fairly minimalist, so unless you restrict what you need to do, you're probably not going to be able to use anything much more lightweight without compromising security.
If that's still too heavy for your purposes, you're probably going to need to impose some restrictions on what you want done.
You might have a look at MST (Minimal Secure Transport) https://github.com/DiplIngFrankGerlach/MST.
It provides the same security assurances as TLS, but requires a pre-shared key. Also, it is very small (less than 1000 LoC, without AES) and can therefore easily be reviewed by an expert.
I know this comes as a variant of an answer about two years later, yet... "PolarSSL's memory footprint can get as small as 30k and averages below 110k." https://polarssl.org/features

Which programming language suits web critical application development?

According to this page, it seems that Perl, PHP, and Python are about 50 times slower than C/C++/Java.
Thus, I think Perl, PHP, and Python could not handle critical applications (such as >100 million users, >xx million requests every second) well. But exceptions exist, e.g. Facebook (it is said Facebook is written entirely in PHP) and Wikipedia. Moreover, I heard Google uses Python extensively.
So why? Does faster hardware fill the big speed gap between C/C++/Java and Perl/PHP/Python?
Thanks.
Computational code is the least of my concerns in most heavy-usage web applications.
The bottlenecks in a typical high-availability web application are (not necessarily in this order, but most likely):
Database (IO and CPU)
File IO
Network Bandwidth
Memory on the Application Server
Your Java / C++ / PHP / Python code
Your main concerns to make your application scalable are:
Reduce access to the database (caching, with clustering in mind, and smart querying; see the sketch below)
Distribute your application (clustering)
Eliminate useless synchronization locks between threads (see commons-pool 1.3)
Create the correct DB indexes, data model, and replication to support many users
Reduce the size of your responses, using incremental updates (AJAX)
Only after all of the above are implemented, optimize your code
Please feel free to add more to the list if I missed something
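To make the first point concrete, here is a minimal sketch of read-through caching; the names are made up, and a real deployment would put the entries in a shared cache such as memcached so that every node in the cluster sees them:

    import time

    class TTLCache:
        """Tiny in-process cache mapping a key to (expiry time, value)."""
        def __init__(self, ttl_seconds=60):
            self.ttl = ttl_seconds
            self.store = {}

        def get_or_compute(self, key, compute):
            now = time.monotonic()
            hit = self.store.get(key)
            if hit is not None and hit[0] > now:
                return hit[1]                       # still fresh: skip the database
            value = compute()                       # miss or expired: go to the database
            self.store[key] = (now + self.ttl, value)
            return value

    cache = TTLCache(ttl_seconds=60)

    def load_profile(user_id):
        # stand-in for a real database round trip
        return {"id": user_id, "name": "user-%d" % user_id}

    profile = cache.get_or_compute(("profile", 42), lambda: load_profile(42))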
The page you are linking only tells half the truth. Of course native languages are faster than dynamic ones, but this matters mainly for applications with high computing requirements. For most web applications it is not so important. A web request is usually served quickly. It is more important to have an efficient framework that manages resources properly and starts new threads to serve requests quickly. Also, timing behaviour is not the only critical aspect. Reliable and error-free applications are probably better achieved with dynamic languages.
And no, faster hardware isn't a solution. In fact Google is famous for using a cluster of inexpensive machines.
(such as >100 million users, >xx million requests every second)
To achieve that sort of performance, you are going to HAVE to design and implement the web site / application as a scalable multi-tier system with replication across (probably) all tiers. At this point, the fact that one programming language is faster / slower than another probably only affects the number of machines you need in your processor farm. The design of the system architecture is far more significant.
There is no JIT compiler in PHP that compiles the code into machine code.
Another big reason is PHP's dynamic typing. A dynamically typed language is always going to be slower.
For more, see:
What makes PHP slower than Java or C#?
C is easily the fastest language out there. It's so fast we write other languages in it. Nobody seriously writes web sites in C. Why? It's very easy to screw up in C in ways that are very difficult to detect, and it does almost nothing to help you. In short, it eats programmers and generates bugs.
Building a robust, fast application is not about picking the fastest language; it's about A) maintainability and B) scalability.
Maintainability means it doesn't have a lot of bugs. It means you can quickly add new features and modify existing ones. You want a language that does as much of the work as possible for you and doesn't get in the way. This is why things like Perl, Python, PHP and Ruby are so popular. They were all written with the programmer's convenience in mind over raw performance or tidiness. C was written for raw performance. Java was written for conceptual tidiness.
Scalability means you can go from 10 users to 10,000 users without rewriting the whole thing. That used to mean you wrote the tightest code you could manage, but highly optimized code is usually hard-to-maintain code. It usually means doing things for the benefit of the computer, not the human and the business. That sacrifices maintainability, and you have to tell your boss it's going to take 3 months to add a new feature.
Scalability these days is mostly achieved by throwing hardware at it and parallelizing. How many processes and processors and machines can you farm your work out to? If you can achieve that, you can just fire up another cheap cloud computer as you need it. Of course you're going to want to optimize some, but at this scale you get so much more out of implementing a better algorithm than tightening up your code.
For example, I took a sluggish PHP app that was struggling to handle 50 users at a time and switched from Apache with mod_php to lighttpd with load-balanced, remote FastCGI processes, allowing parallelization with a minimum of code change. Some basic profiling revealed that the PHP framework they used to prototype was dog slow, so it was stripped out. Profiling also suggested a few indexes to make the database queries run faster. The end result was a system that could handle thousands of users, and more capacity could be added as needed, while leaving most of the code implementing the business logic untouched. It took a few weeks, and I don't really know PHP well.
It may be beneficial to reimplement small, sharp pieces in a very fast language, but usually that's already been done for you in the form of an optimized library or tool. For example, your web server. For the complexity and ever-changing needs of business logic the important thing is ease of maintenance and how good your programmers are.
You will find that most of the web is written in PHP, Perl and Python because they are easy to write in, with small, sharp bits written in things like C, Java and exotics like Scala (for example, Twitter). Wikia, for example, is a modified MediaWiki, which is written in PHP, but it stays performant (among other reasons) by doing a heroic amount of caching.
Google is using Python for GAE and Windows Azure is providing PHP. The LAMP architecture is great for application scalability.
I also think that the programming language is not that important regarding performance. The most important thing is to look at the architecture of your app.
I hope it helps
To serve a web page, you need to:
Receive and parse the request.
Decide what you wish to do with the request.
Read/write persistent data (database, cache, file system)
Output HTML data.
The "speed" of the server side language only applies to steps two and four. Given that most scripts strive to keep step 2 as short as possible, and that most web languages (including PHP) optimize step 4 as much as they can, in any serious web site most of the request processing time will be spent in step 3.
And the time spent on step 3 is independent of the server-side language you use ... unless you implement your own database and distributed cache.
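A quick way to convince yourself of that is to time the steps separately. The sketch below fakes the four steps with SQLite and string formatting; the shape of the measurement is the point, not the numbers:

    import sqlite3
    import time
    from contextlib import contextmanager

    timings = {}

    @contextmanager
    def timed(step):
        start = time.perf_counter()
        yield
        timings[step] = timings.get(step, 0.0) + time.perf_counter() - start

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE items (name TEXT)")
    db.executemany("INSERT INTO items VALUES (?)", [("widget %d" % i,) for i in range(1000)])

    def handle_request(raw_query):
        with timed("steps 1-2: parse request, decide"):
            term = raw_query.strip()
        with timed("step 3: read/write persistent data"):   # the part that usually dominates
            rows = db.execute("SELECT name FROM items WHERE name LIKE ?",
                              ("%" + term + "%",)).fetchall()
        with timed("step 4: output HTML"):
            return "<ul>" + "".join("<li>%s</li>" % r[0] for r in rows) + "</ul>"

    handle_request("widget 5")
    print(timings)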
For PHP, there are a lot of things you can do to increase performance. For example:
A PHP accelerator (opcode cache)
Caching queries
Optimizing queries
Using a profiler to find the slower parts and optimize them (see the sketch below)
These things would certainly help reduce the gap with lower-level languages. So to answer your question: there are other things you can do inside the code to optimize it and make it run faster.
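The profiler point generalizes beyond PHP. The sketch below shows the same workflow (measure first, then optimize the hot spots) using Python's standard-library profiler, with a deliberately wasteful made-up handler standing in for a real page:

    import cProfile
    import pstats

    def render_page():
        # stand-in for real request handling
        return sum(len(str(i)) for i in range(200000))

    profiler = cProfile.Profile()
    profiler.enable()
    render_page()
    profiler.disable()

    # Show the functions where the most cumulative time was spent.
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)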
I agree with luc. It's the architecture that really matters, not the programming language.

Domain repository for requirements management - build or buy?

In my organisation, we have some very inefficient processes around managing requirements: tracking what was actually delivered in which versions, whether subsequent releases break previous functionality, and so on; it's currently all managed manually. The requirements are spread over several documents and issue trackers, and the implementation details are in code in Subversion, Jira, and TestLink. I'm trying to put together a system that consolidates the requirements info, so that it is sourced from a single, authoritative source, is accessible via standard interfaces (web services, browsers, etc.), and can be automatically validated against. The actual domain knowledge is not that complicated but is highly proprietary and non-standard (i.e., not just customers with addresses, emails, etc.), and is relational: customers have certain functionalities, features switched on/off, specific datasources hooked up, all on specific versions. So modelling this should be straightforward.
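For what it's worth, the entities described above boil down to something like this rough sketch (all names hypothetical, just to show how small the model is):

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Feature:
        name: str
        enabled: bool = False

    @dataclass
    class DataSource:
        name: str
        connection: str = ""

    @dataclass
    class CustomerVersion:
        """What a given customer has, on a specific release version."""
        customer: str
        version: str
        features: List[Feature] = field(default_factory=list)
        datasources: List[DataSource] = field(default_factory=list)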
Can anyone advise the best approach for this? I am certain that I can develop a system from scratch that matches the requirements exactly, in say Ruby on Rails, Grails, or some RAD framework. But I'm having difficulty getting management buy-in; they would feel safer with an off-the-shelf solution.
Can anyone recommend such a system? Or am I better off building it from scratch, as I feel I am? I'm afraid a bought system would take just as long to deploy, and would not meet our requirements.
Thanks for any advice.
I believe that you are describing two different problems. The first is getting everyone to standardize and the second is selecting a good tool for requirements management. I wouldn't worry so much about the tool as I would the process and the people. Having the best tool in the world won't help if your various project managers don't want to share.
So, my suggestion is to start simple. Grab Redmine or Trac and take on the challenge of getting everyone to standardize. Once you have everyone in the right mindset then you can improve the tools you use for storage.
{disclaimer - mentioning my employer's product}
The brief experiments I made with a commercial tool, RequisitePro, seemed pretty good to me. It allowed one to annotate existing Word docs and create a real-time linked database of the identified requirements, then perform lots of analysis and tracking on them.
Sometimes when I see a commercial product I think "Oh, well, nice glossy bits, but the fundamentals I could knock up in Perl in a weekend." That's not the case with this stuff. I would certainly look at commercial products in this space and experiment with a couple (ReqPro has a free trial, and I guess the competition will too) before spending time on my own development.
Thanks a million for the reply. I will take a look at RequisitePro; at least I'll be following the "Nobody ever got fired for buying IBM" strategy ;) You're right, and I kind of knew it: in these situations, buy is better. It is tempting when I can visualise throwing it together quickly, but there are other trade-offs and risks with that approach.
Thanks,
Justin
While RequisitePro enforces a standard, and that can certainly help you in your task, I'd second Mark on trying to standardize the input by agreement with personnel and using a more flexible tool like Trac or Redmine (both of which have incredibly fast deploy and setup times, especially if you host them from a VM), or even a custom one if you can get management to endorse your project.

Excel as the Backend to a Website?

A third-party developer my boss has brought in designed a "better" system than the ASP.NET + MS SQL Server 2005 website we're using now.
Here are the relevant specs:
Excel + ODBC as the data store
Built using old school ASP, not ASP.NET
Is there any glaring problem with his solution short of the ancient tech? Thread safety etc?
Let me put it this way: what can I tell my boss (who's only partially technical) to blow this code out of the water?
Thank you,
Vindictive Developer :)
Excel should never be used as a data store:
It is not a database
It will not handle multiple users at once at all
There is no support for transactions, so if an error occurs in the middle of an ODBC call the Excel file could end up trashed. (Even Access would be better than using Excel, and that isn't saying much.)
Excel is a spreadsheet, designed for analyzing data, not for storing data.
Straight from Microsoft: http://support.microsoft.com/kb/195951
IMPORTANT: Though ASP/ADO applications support multi-user access, an Excel spreadsheet does not. Therefore, this method of querying and updating information does not support multi-user concurrent access.
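If part of the counter-proposal is showing how little work it is to put the data somewhere sane, a rough one-off migration from the spreadsheet into a real database might look like the sketch below (it assumes the third-party openpyxl package, an .xlsx file with a single sheet and a header row, and made-up file and table names; in this case the real target would be the existing SQL Server 2005 instance rather than SQLite):

    import sqlite3
    from openpyxl import load_workbook

    ws = load_workbook("legacy_data.xlsx", read_only=True).active
    rows = ws.iter_rows(values_only=True)
    header = next(rows)                      # first row holds the column names

    db = sqlite3.connect("migrated.db")
    columns = ", ".join('"%s" TEXT' % name for name in header)
    db.execute("CREATE TABLE IF NOT EXISTS legacy (%s)" % columns)

    placeholders = ", ".join("?" for _ in header)
    db.executemany("INSERT INTO legacy VALUES (%s)" % placeholders, rows)
    db.commit()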
Allain, as well as the great technical reasons that have come out here, I think you need to ask yourself "why did the boss do this?"
Perception is reality, and if your boss is only partially technical, then purely technical reasoning may not get through.
Apart from the glaring architectural weaknesses, is there some functionality in this monster that makes it more appealing to your boss? Generally people don't do stupid things on purpose; it may serve you well to consider where your boss is coming from before you go making a CLM.
Ummm.... it lacks scalability: You could only have a few users. Is the data important?
Here's what you can tell him: Remind him of the nightmares that happen when two or more people need to edit the same spreadsheet at the same time. Now tell him to imagine that multiplied by a hundred people who can't call each other to tell them to "close the spreadsheet so I can update it". That's what it will be like.
Syncing issues dealing with a separate XLS data store and SQL Server 2005? On our IIS server, classic ASP pages are prohibited by default. Maybe that's a sign, lol.
How about terrible performance, since Excel is not designed to be used as a database? Tell your boss Excel isn't even a single-user database (that's what MS Access is), let alone a multi-user database designed for high, concurrent performance.
And of course, by using classic ASP you're losing access to all of the libraries in the .NET Framework (which is what all library developers in the MS ecosystem are focusing on). But you asked for one reason, and the first is the better one.
I would go with the mantra that those are the wrong tools for the job (assuming they are in your case). It'd be like using a screwdriver as a hammer. For one nail, it might work with a lot of sweat and tears. For a real project, though, this is likely doomed.
I'd boast about the tools you are familiar with--how much better the tooling is in terms of performance, security, maintenance (esp. maintenance cost).
You could say something like: he's paying someone to write a new app with decade-old technology which may not be supported for much longer (if it still is...).
Ummm...row limit?
Is an Excel spreadsheet even going to handle concurrent transactions correctly? It wasn't designed for this kind of thing, and I wouldn't hold it responsible if it did something bad (like only letting one ODBC connection in at a time, or not properly locking concurrent updates).
That excel file's going to get corrupted in a hurry with many people hitting it at the same time. The scalability of Excel as a backend datastore is almost non-existent. It has a hard enough time keeping data integrity with its native Shared Workbook feature...
BTW- Is this third party a relative of your boss??? Yikes...
The ancient tech is itself a glaring problem. Will you be around forever? It will be very difficult for the boss to find new developers to maintain something like this. The tech world has moved on.

best language / framework for a web CRUD app with roles on Linux

I have a Linux web server and I'd like to make some database tables (currently in Access) available on the web for CRUD. There needs to be role-based security. What's the quickest path to developing this?
Also, which database would be best? I already have MySQL running on that box, if it makes any difference.
I agree with Chuck, the question shouldn't really be about the language, but about the framework you choose.
I did something similar to you a while back, and ended up using Ruby on Rails, and the activescaffold plugin (http://www.activescaffold.com/) to provide a pretty front end. The actual code I ended up writing was extremely minimal. There are other plugins for Rails which provide role based security too (which I didn't bother with, I just had "you're either logged in and have write access, or you're not logged in and you don't") and which also mean you don't have to write much stuff yourself.
So put me in the camp for Rails come the religious war ;)
Edit: MySQL is a perfect database to use, so you don't have to worry there.
This will turn into a religious war between the Ruby on Rails camp and the Python camp, with a good smattering of PHP and Perl. You should evaluate the languages yourself and decide what is best for you. There are, of course, other choices; however, listing those would just elicit more religious battles. Although, I would say all of those I listed would be reasonable choices. You can usually create a good design in spite of any shortcomings your chosen language may have.
I don't think language is the question you should be asking. There's no language particularly well-suited to CRUD Web apps. There are many frameworks designed for that sort of thing, though, in many different languages, and those are probably what you'll want to look at.
I think Rails is the best in general, and that's what I use for most projects. It's very well-suited to CRUD apps (to the point where it allows you to create a simple one without writing any code at all). But if there really were an undisputed "best" choice, you'd probably already know about it. Instead, some apps are made with Rails, some with Django, some with Cake, and so on and so forth.
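Whichever framework wins the religious war, the role-based part usually reduces to a check in front of each CRUD route. A rough sketch of that idea, using Flask and in-memory data purely for illustration (all names made up; a real app would read the role from the session and the rows from MySQL):

    from functools import wraps
    from flask import Flask, abort, jsonify, request

    app = Flask(__name__)
    items = {}          # stand-in for a MySQL table
    next_id = 1

    def role_required(role):
        """Reject the request unless the caller presents the required role."""
        def decorator(view):
            @wraps(view)
            def wrapped(*args, **kwargs):
                # Illustration only: take the role from a header instead of a session.
                if request.headers.get("X-Role") != role:
                    abort(403)
                return view(*args, **kwargs)
            return wrapped
        return decorator

    @app.get("/items")
    @role_required("reader")
    def list_items():
        return jsonify(items)

    @app.post("/items")
    @role_required("editor")
    def create_item():
        global next_id
        items[next_id] = request.get_json()
        next_id += 1
        return jsonify({"id": next_id - 1}), 201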
If you want a solid, clean, stable CRUD web app that can be maintained and expanded for years to come, stick with the standards: PHP, Perl, JavaScript, CSS, and HTML. Learn those roots languages well. Take the time to do it right and focus on good coding habits like clarity, consistency, and organization. Practice good reuse of code, good naming, good commenting, and good database design. Test, document, and refactor. Take pride in the craftsmanship of your CRUD app. Learn it inside and out. Set the stage so you can later maintain and expand it. Your goal is to build something that will work well, last a long time, and make a great return on the business investment. Someone once said that it takes 10 years to become a good coder.
As for frameworks, plugins, and external libraries, that's wonderful icing to put on your cake. But never confuse the icing with the cake. If you want to learn to code, take the time to learn it right. If you're not comfortable coding a simple CRUD app, you'll be even less comfortable trying to navigate the framework-generated code. Coding is a wonderful gig. But never mistake the sizzle for the steak.

Resources