Redis to permanent storage migration

Currently, my web application keeps all of its data in Redis. It needs more than 4 GB of RAM, which costs me a lot.
I want to migrate part of the application to a permanent storage database (SQL, MongoDB, ...).
Can anyone tell me which is the best choice (SQL, MongoDB, ...)?
Technology stack of my application:
nodejs(express)
angularjs
redis

It really depends on your design. Is your data highly relational? Redis is considered a NoSQL technology, so I guess MongoDB would be somewhat similar, but its storage is file-based and document-oriented rather than an in-memory key-value set. If your data sets need strong relationships between them, the SQL family is designed for exactly that, but it takes a lot of work to design the tables first and then split the data across them.

Related

Azure: Redis vs Table Storage for web cache

We currently use Redis as the persistent cache for our web application, but given its limited memory and its cost I'm starting to consider whether Table storage is a viable option.
The data we store is fairly basic JSON with a clear two-part key, which we'd use as the partition key and row key in Table storage, so I'm hoping that would mean fast querying.
I appreciate that one is in memory and the other is not, so Table storage will be a bit slower. But as we scale, I believe there is only one CPU serving data from a Redis cache, whereas with Table storage we wouldn't have that issue, as throughput would be down to the number of web servers we have running.
Does anyone have any experience of using Table storage in this way, or any comparisons between the two?
I should add that we use Redis in a very minimalist way: get/set and nothing more. We evict our own data and, failing that, leave the eviction to Redis when it runs out of space.

This is a fairly broad/opinion-soliciting question. But from an objective perspective, these are the attributes you'll want to consider when deciding which to use:
Table Storage is a durable, key/value store. As such, content doesn't expire. You'll be responsible for clearing out data.
Table Storage scales to 500TB.
Redis is scalable horizontally across multiple nodes (or, scalable via Redis Service). In contrast, Table Storage will provide up to 2,000 transactions / sec on a partition, 20,000 transactions / sec across the storage account, and to scale beyond, you'd need to utilize multiple storage accounts.
Table Storage will have a significantly lower cost footprint than a VM or Redis service.
Redis provides features beyond Azure Storage tables (such as pub/sub, content eviction, etc).
Both Table Storage and Redis Cache are accessible via an endpoint, with many language-specific SDK wrappers around the APIs.
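
Since Table Storage never expires content on its own, a cache built on it has to carry its own expiry. The sketch below is only an illustration of that idea: `TableCache` is a hypothetical in-memory stand-in keyed by (partition key, row key), not the Azure SDK, and the `expires_at` property is an assumed convention rather than a Table Storage feature.

```python
import json
import time


class TableCache:
    """Hypothetical stand-in for a table client: rows keyed by (partition_key, row_key)."""

    def __init__(self, clock=time.time):
        self.rows = {}
        self.clock = clock  # injectable so expiry can be tested without sleeping

    def set(self, partition_key, row_key, value, ttl_seconds):
        # Store the JSON payload together with an explicit expiry timestamp.
        self.rows[(partition_key, row_key)] = {
            "payload": json.dumps(value),
            "expires_at": self.clock() + ttl_seconds,
        }

    def get(self, partition_key, row_key):
        row = self.rows.get((partition_key, row_key))
        if row is None:
            return None
        if row["expires_at"] <= self.clock():
            # The store keeps rows forever, so the caller deletes stale ones.
            del self.rows[(partition_key, row_key)]
            return None
        return json.loads(row["payload"])
```

A real deployment would probably also need a periodic sweep for rows that are never read again, since they would otherwise stay in the table indefinitely.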

I found some materials about Azure Redis and Table storage; I hope they help. There is a video about Azure Redis that also includes a demo comparing Table storage and Redis, starting at about the 50th minute.
Perhaps it can serve as a reference, but the detailed performance depends on your application, your data records, and so on.
The pricing of Table storage depends on the capacity used; please refer to the pricing details. It is much cheaper than Redis.

There are many differences you might care about, including price, performance, feature set, persistence of data, and data consistency.
Because Redis is an in-memory data store, it is pretty expensive. That is the price of low latency. Check out Azure's planning FAQ here for a general understanding of Redis performance in terms of throughput.
Azure Redis planning FAQ
Redis does have an optional persistence feature that you can turn on if you want your data persisted and restored after the servers' rare downtime. But it doesn't offer a strong consistency guarantee.
Azure Table Storage is not a caching solution. It's a persistent storage solution that saves the data permanently on some kind of disk. Historically (disclaimer: I have not looked for the latest and greatest performance numbers) it has had much higher read and write latency. It is also strictly a key-value store model (with two-part keys). Values can have properties, but with many strict limitations on the size of the objects you can store, the length of properties, and so on. These limitations are inflexible and painful if your app runs up against them.
Redis has a larger feature set. It can do key-value, but it also has a bunch of other data structures such as sets and lists, and many apps can find ways to benefit from that added flexibility.
See 'Introduction to Redis' (Redis docs).
CosmosDB could be yet another alternative to consider if you're leaning primarily towards Azure technologies. It is pretty expensive, but quite fast and feature-rich, while also being primarily intended as a persistent store.

store the temporary data in couchbase or redis

I have a Node.js project that uses Couchbase as its database.
I wonder whether I should store the temporary data in
1. Redis
or in
2. Couchbase directly.
As far as I know there is a socket delay with Couchbase, so I think storing the temporary data in Redis and the permanent data in Couchbase is better.
Does anyone have experience with this?
Your comments are welcome.

I'm a big Redis fan, but in this situation I would use Couchbase only.
Couchbase is rather efficient, and comparable to the performance of memcached when the working set of your data fits in memory. Most of the time, an extra caching layer on top of Couchbase is not useful.
That said, if you really need a caching layer, or simply some storage for temporary data, you can simply create a memcached bucket hosted in the Couchbase cluster. So you would have an "eventually persistent" bucket for your persistent data, and a memcached bucket for the temporary data.
The bucket types are described here:
http://docs.couchbase.com/couchbase-manual-2.5/cb-admin/#data-storage
In that context, adding Redis as an extra storage layer does not really make sense.

Couchbase has a managed cache built into it, even for Couchbase buckets. So it already has a caching layer and adding another one on top just sounds superfluous.
I am not sure what you mean by a socket delay in Couchbase. Can you perhaps explain more about that? It is not something I have ever seen before, and it strikes me as suspect. I would troubleshoot it and figure out what it is before adding Redis to the mix and ending up with yet another layer to manage and code against. Without knowing more about the socket delay, it is difficult to make further recommendations.

It's an old question, but I'll give my take on it as well, if only for the people coming across it via Google, just as I did.
I agree with the accepted answer, in that Couchbase keeps the most recently used documents in RAM. In that respect, it does the same as Redis. The advantage of Couchbase is of course that the data can reliably spill over the RAM limit, and the server disk limit, automatically, by adding more nodes.
However, I have a project where I am considering using Redis alongside Couchbase. It's basically meant as a caching server, but for "calculated" items such as HTML snippets. Couchbase is a fantastic document store, but building lists and other structures doesn't come that easily, especially not without a lot of views. So I'm thinking of using Redis as a temporary datastore for the ad hoc data manipulation needed, and Couchbase as the main datastore.
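
Caching "calculated" items next to a main datastore usually follows a cache-aside pattern. The sketch below is an assumption about how that could look, not this poster's code: `get_snippet`, `load_document` and `render` are hypothetical names, and plain dicts stand in for Redis (the cache) and Couchbase (the document store).

```python
def get_snippet(doc_id, cache, load_document, render):
    """Cache-aside lookup: return the cached snippet, or compute and cache it."""
    snippet = cache.get(doc_id)
    if snippet is None:
        # Cache miss: read the document from the main store and render it once.
        snippet = render(load_document(doc_id))
        cache[doc_id] = snippet
    return snippet


# Stand-ins: a dict as the document store, another dict as the cache.
documents = {"doc:1": {"title": "Hello"}}
snippet_cache = {}
render_calls = []


def render_html(document):
    render_calls.append(document["title"])  # record how often rendering happens
    return "<h1>" + document["title"] + "</h1>"


first = get_snippet("doc:1", snippet_cache, documents.__getitem__, render_html)
second = get_snippet("doc:1", snippet_cache, documents.__getitem__, render_html)
```

The second call is served from the cache, so the relatively expensive render step runs only once per document until the cached entry is removed.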

Architecture for Redis cache & Mongo for persistence

The Setup:
Imagine a 'twitter like' service where a user submits a post, which is then read by many (hundreds, thousands, or more) users.
My question is regarding the best way to architect the cache & database to optimize for quick access & many reads, but still keep the historical data so that users may (if they want) see older posts. The assumption here is that 90% of users would only be interested in the new stuff, and that the old stuff will get accessed occasionally. The other assumption here is that we want to optimize for the 90%, and its ok if the older 10% take a little longer to retrieve.
With this in mind, my research seems to strongly point in the direction of using a cache for the 90%, and then also storing the posts in another, longer-term persistent system. So my idea thus far is to use Redis for the cache. The advantage is that Redis is very fast, and it also has built-in pub/sub, which would be perfect for publishing posts to many people. I was then considering MongoDB as a more permanent data store for the same posts, which will be accessed there once they expire from Redis.
Questions:
1. Does this architecture hold water? Is there a better way to do this?
2. Regarding the mechanism for storing posts in both the Redis & MongoDB, I was thinking about having the app do 2 writes: 1st - write to Redis, it then is immediately available for the subscribers. 2nd - after successfully storing to Redis, write to MongoDB immediately. Is this the best way to do it? Should I instead have Redis push the expired posts to MongoDB itself? I thought about this, but I couldn't find much information on pushing to MongoDB from Redis directly.

It is actually sensible to associate Redis and MongoDB: they are good team players. You will find more information here:
MongoDB with redis
One critical point is the resiliency level you need. Both Redis and MongoDB can be configured to achieve an acceptable level of resiliency, and these considerations should be discussed at design time. It may also constrain your deployment options: if you want master/slave replication for both Redis and MongoDB, you need at least 4 boxes (Redis and MongoDB should not be deployed on the same machine).
Now, it may be a bit simpler to keep Redis for queuing, pub/sub, etc., and store the user data in MongoDB only. The rationale is that you do not have to design similar data access paths (the difficult part of this job) for two stores featuring different paradigms. Also, MongoDB has built-in horizontal scalability (replica sets, auto-sharding, etc.), while Redis has only do-it-yourself scalability.
Regarding the second question, writing to both stores would be the easiest way to do it. There is no built-in feature to replicate Redis activity to MongoDB. Designing a daemon listening to a Redis queue (where activity would be posted) and writing to MongoDB is not that hard though.
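
A daemon of this kind can be sketched as a worker draining a queue. This is a minimal illustration under stated assumptions: `DictStore` is a hypothetical in-memory stand-in for both the Redis and the MongoDB client, and Python's `queue.Queue` stands in for the Redis queue where activity would be posted.

```python
import queue
import threading


class DictStore:
    """Hypothetical in-memory stand-in for a real Redis or MongoDB client."""

    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value


def publish_post(post_id, post, cache, backlog):
    # First write: the cache, so the post is available to readers right away.
    cache.put(post_id, post)
    # Then record the activity on the queue for the persistence worker.
    backlog.put((post_id, post))


def persistence_worker(backlog, durable):
    # Drain the queue and copy every post into the durable store.
    while True:
        item = backlog.get()
        if item is None:  # sentinel value: stop the worker
            break
        post_id, post = item
        durable.put(post_id, post)


cache, durable = DictStore(), DictStore()
backlog = queue.Queue()
worker = threading.Thread(target=persistence_worker, args=(backlog, durable))
worker.start()
publish_post("post:1", {"author": "alice", "text": "hello"}, cache, backlog)
backlog.put(None)
worker.join()
```

Compared with two synchronous writes from the application, the queue keeps the publishing path short, at the cost of a window in which a post exists only in the cache and on the queue.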

Rate limiting - using CouchDB with Redis or CouchDB on its own

I've written an application with a CouchDB backend. I have invested a lot of time into CouchDB and so I'm reluctant to move everything over to a different NoSQL database (like Redis).
The problem is that I now need to implement a rate limiting (based on IP address) feature.
There are plenty of examples on how good Redis is for this kind of task, however because I don't want to drop CouchDB for other tasks this means I would essentially be running (and supporting) two databases (1 for most data, 1 for rate limiting) and so...
Is running CouchDB in tandem with Redis unheard of?
Is CouchDB itself suitable for handling rate limiting?

Is running CouchDB in tandem with Redis unheard of?
Redis is commonly used to complement other storage solutions (MySQL, PostgreSQL, MongoDB, CouchDB, etc.). Like many other NoSQL solutions, Redis is not suited to all kinds of workloads or situations. The authors of Redis are pragmatic and open people, and they routinely suggest using other solutions rather than Redis when those are better suited to the situation.
Redis is therefore a good team player, and it is generally easy to integrate in an existing infrastructure.
Here is an example of usage of Redis with CouchDB.
Is CouchDB itself suitable for handling rate limiting?
CouchDB has a number of useful features to implement the rate limiting strategy described in Chris O'Hara's article. For instance, it supports bulk operations on several documents (with optional atomicity). A "bucket span" can be stored in a single document. In-place incrementation of counters can be covered by using update handlers.
IMO, the main missing feature would be automatic item expiration (which CouchDB does not provide AFAIK). So you would have to design a clever mechanism to get rid of obsolete data on top of CouchDB.
The main problem is that CouchDB is not really designed for this kind of workload: it is a log-structured, document-oriented database. Each time a counter has to be incremented, it would involve JSON unpacking/packing operations, the execution of some JavaScript code, and writing a new revision of the whole document to append-only files. You can find a good article describing how CouchDB stores its data here.
I suspect a rate limiting strategy implemented on top of CouchDB would not scale very well (too many I/Os, too much CPU consumption, inefficient network protocol). For instance, CouchDB is a RESTful server; I would not feel comfortable initiating client HTTP operations (REST queries to CouchDB) to rate limit each incoming HTTP query of my system.
Redis is much better suited to this kind of workload (fast, in-memory, no I/O, efficient client protocol, no JSON parsing/formatting, increments are native atomic operations, etc.).
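
To make the counter-plus-expiration idea concrete, here is a minimal fixed-window sketch. It is a simpler variant than the bucket-span strategy in the article mentioned above, and it is an assumption for illustration only: the `FixedWindowLimiter` class is hypothetical, and a plain dict stands in for the store. With Redis, the increment would typically map to `INCR` and the cleanup to a key expiry set with `EXPIRE`.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per IP address in each time window."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so tests do not need to sleep
        self.counters = {}  # (ip, window index) -> number of hits

    def allow(self, ip):
        bucket = int(self.clock() // self.window)
        # Drop counters from earlier windows; this is the manual cleanup
        # that a store with automatic key expiration would do for us.
        self.counters = {k: v for k, v in self.counters.items() if k[1] >= bucket}
        key = (ip, bucket)
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key] <= self.limit
```

A fixed window lets a client burst up to twice the limit across a window boundary, which is one reason finer-grained buckets are often preferred.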

You can do rate limiting with Memcached - it has a nice counter increment command as you mention, plus obsolete data is automatically purged from the cache in due course, so it has all the benefits of Redis for this application without the annoying duplication of capability (and complexity) that running Redis on top of CouchDB would bring.
http://simonwillison.net/2009/jan/7/ratelimitcache/
You could add memcached to your own setup easily enough, or you could investigate Couchbase, whose current server product integrates a CouchDB-derived database with Memcached compatibility baked in:
http://www.couchbase.com/memcached
Personally I dislike the way Couchbase forked from CouchDB, but for your application it might be a perfect fit.

.Net 4.0 Memory-Mapped Files versus RDBMS Storage

I'm interested in people's thoughts comparing storing data in a traditional SQL based Database or utilising a Memory-Mapped File such as the one in the new .Net 4.0 runtime. The data in question would be arrays of simple structures.
Obvious pros and cons:
SQL Database Pros
Adhoc query support
SQL Management Tools
Schema changes (adding more columns and setting default values)
Memory-Mapped Pros
Lighter overhead? (this is an assumption on my part)
Shareable between process threads
Any others?
Is it worth it for performance gains?

You could try out MongoDB, and get a mixture of both worlds (database-like features over a memory mapped store).
"MongoDB bridges the gap between key-value stores (which are fast and highly scalable) and traditional RDBMS systems (which provide rich queries and deep functionality)."
Here's a good article that can walk you through installing and coding to MongoDB:
Going NoSQL with MongoDB

SQL Server can use memory-mapped files if you choose "SharedMemory" as the protocol. Otherwise it'll use Pipes, TCP or VIA.
Regarding pros and cons: to me they are almost not comparable. SQL has the whole query/multiuser/transaction infrastructure built in. If you store your data with MMFs, you are on your own regarding all of that. On the other hand, MMFs are built into the OS, so there is no need for a server or service.
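
To show what "on your own" means for arrays of simple structures, here is a rough sketch using Python's `mmap` and `struct` modules rather than the .NET 4.0 API from the question. The record layout (a 32-bit id followed by a 64-bit float) and the helper names are assumptions made for illustration.

```python
import mmap
import os
import struct
import tempfile

# Fixed-size record: little-endian int32 id + float64 value, 12 bytes, no padding.
RECORD = struct.Struct("<id")


def write_records(path, records):
    # Lay the records out back to back, like an array of structs.
    with open(path, "wb") as f:
        for rec_id, value in records:
            f.write(RECORD.pack(rec_id, value))


def read_record(path, index):
    # Map the file and decode only the record at `index`; the offset
    # arithmetic replaces what a database query would do for us.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return RECORD.unpack_from(mm, index * RECORD.size)


path = os.path.join(tempfile.mkdtemp(), "records.bin")
write_records(path, [(1, 0.5), (2, 1.5), (3, 2.5)])
third = read_record(path, 2)
```

Random access by index is cheap, but anything resembling an ad hoc query, a schema change or concurrent writers has to be coded by hand, which is the trade-off described above.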
