ServiceStack cache size

In ServiceStack, when using the in-memory cache, is there a way to find the actual size of the cached objects in bytes?

The in-memory cache just stores everything in a ConcurrentDictionary; there's no built-in way to count the bytes.
One solution would be to create your own fork of it and add whatever instrumentation you need on each write.

Related

Redis caches - when can large evictions be triggered?

We have an Azure Redis Cache (Standard, 2.5 GB). We observe the following behaviour:
Every now and then, we observe large drops in memory usage. It appears that lots of resources are being evicted.
Things to note:
Eviction policy is LRU
Available cache size is 2.5 GB
No application code that would evict such large amounts of memory (the largest objects are ~80 KB and most are significantly smaller)
Observed memory drops represent tens of thousands of keys
We seldom use explicit expiry dates on cached objects, and when we do they are always < 1 hour.
My question is: apart from application logic explicitly evicting keys, are there any other circumstances in which Redis would evict large numbers of keys?
The memory cleanup may not represent evictions.
You say "it appears" that lots of resources are being evicted, but if you are just relying on the reclaimed memory for that appearance, you may be chasing ghosts. Have you checked how this graph overlays with the Total Keys metric available in the Azure Portal? Overlaying the two series should allow you to see whether or not the memory reclamation really is due to eviction or if it's due to another process like Azure perhaps calling MEMORY PURGE periodically on the cache instance to clean up dirty pages?
Can you change your Redis eviction policy to noeviction and see if that addresses the problem? Doing so means you will have to manage all content yourself. https://redis.io/topics/lru-cache has more details.
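To confirm whether evictions are actually happening, you can watch the evicted_keys counter that Redis reports in the stats section of INFO and see whether it jumps when the memory drops occur. A minimal sketch, assuming the ioredis package and placeholder connection details for an Azure cache:

// Check Redis eviction/expiration counters (ioredis assumed;
// hostname and access key below are placeholders for your instance).
const Redis = require("ioredis");

const redis = new Redis({
  host: "your-cache.redis.cache.windows.net", // placeholder
  port: 6380,
  password: "your-access-key", // placeholder
  tls: {},
});

async function checkEvictions() {
  const stats = await redis.info("stats"); // INFO stats section
  const evicted = /evicted_keys:(\d+)/.exec(stats);
  const expired = /expired_keys:(\d+)/.exec(stats);
  console.log("evicted_keys:", evicted && evicted[1]);
  console.log("expired_keys:", expired && expired[1]);
  redis.disconnect();
}

checkEvictions().catch(console.error);

If evicted_keys stays flat across a memory drop, the reclamation is coming from something other than LRU eviction, for example mass expiration (visible in expired_keys) or internal memory cleanup.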

More than 1 MB of data in memcached with Node.js

I am using the memcached package with Node.js. Since the default max size of data per key is 1 MB, I am facing a problem when the data for a particular key is more than 1 MB.
One workaround would be to raise the default max item size above 1 MB in memcached.conf using
-I 2M
and setting maxValue in the code:
var memcached = new Memcached('localhost:11211', {maxValue: 2097152});
What would be the proper way to stay within the 1 MB limit? I have read suggestions about splitting data into multiple keys. How can I achieve multi-key splitting with JSON data using the memcached package?
Options available:
1/ Make sure you are using compression while storing values in memcached; your Node.js memcached driver should support gzip compression.
2/ Split the data into multiple keys (see the sketch after this list).
3/ Increase the max object size to more than 1 MB (but that may increase fragmentation and decrease performance, depending on your cache usage).
4/ Use Redis as the cache instead of memcached if your objects are usually large. The Redis string data type supports values up to 512 MB, available through the direct get/set interface of any standard Node.js Redis driver.
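For option 2/, here is a minimal sketch of multi-key splitting with JSON data, assuming the memcached npm package. The chunk size, TTL, and key-naming scheme (chunk count under the base key, chunks under "key:n") are arbitrary conventions of this sketch, not part of the package's API:

// Split a large JSON payload across multiple memcached keys.
const Memcached = require("memcached");
const memcached = new Memcached("localhost:11211");

const CHUNK_SIZE = 1000 * 1024; // characters; for multi-byte data, chunk a Buffer instead
const TTL = 3600; // seconds

function setLarge(key, obj, cb) {
  const json = JSON.stringify(obj);
  const chunks = [];
  for (let i = 0; i < json.length; i += CHUNK_SIZE) {
    chunks.push(json.slice(i, i + CHUNK_SIZE));
  }
  // Store the chunk count under the base key, then each chunk under "key:n".
  memcached.set(key, chunks.length, TTL, (err) => {
    if (err) return cb(err);
    let pending = chunks.length;
    chunks.forEach((chunk, n) => {
      memcached.set(key + ":" + n, chunk, TTL, (err) => {
        if (err) return cb(err);
        if (--pending === 0) cb(null);
      });
    });
  });
}

function getLarge(key, cb) {
  memcached.get(key, (err, count) => {
    if (err || count === undefined) return cb(err);
    const keys = [];
    for (let n = 0; n < count; n++) keys.push(key + ":" + n);
    memcached.getMulti(keys, (err, data) => {
      if (err) return cb(err);
      // If any chunk expired or was evicted, treat the whole key as a miss.
      if (keys.some((k) => data[k] === undefined)) return cb(null, undefined);
      cb(null, JSON.parse(keys.map((k) => data[k]).join("")));
    });
  });
}

One caveat with this design: chunks can expire or be evicted independently of their siblings, which is why getLarge treats any missing chunk as a miss of the whole value.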

Difference between Cassandra Row caching and Partition key caching

What is the difference between the row cache and the partition key cache? Do I need to use both for good performance?
I have already read the basic definitions from the DataStax website:
The partition key cache is a cache of the partition index for a Cassandra table. Using the key cache instead of relying on the OS page cache saves CPU time and memory. However, enabling just the key cache results in disk (or OS page cache) activity to actually read the requested data rows.
The row cache is similar to a traditional cache like memcached. When a row is accessed, the entire row is pulled into memory, merging from multiple SSTables if necessary, and cached, so that further reads against that row can be satisfied without hitting disk at all.
Can anyone elaborate on the use cases? Do I need to implement both?
TL;DR: You want to use the key cache, and you most likely do NOT want the row cache.
The key cache helps C* know where a particular partition begins in the SSTables. This means that C* does not have to read anything to determine the right place to seek to in the file to begin reading the row. This is good for almost all use cases because it speeds up reads considerably by potentially removing the need for an IOP in the read path.
The row cache has a much more limited use case. It pulls entire partitions into memory, and if any part of a partition has been modified, the entire cache entry for that partition is invalidated. For large partitions this means the cache can be frequently caching and invalidating big pieces of memory. Because you really need mostly static partitions for this to be useful, for most use cases it is recommended that you do not use the row cache.
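Whether each cache applies is configurable: globally in cassandra.yaml (key_cache_size_in_mb, row_cache_size_in_mb; the row cache stays off unless you give it a size) and per table via the caching property. A minimal sketch of the per-table setting, assuming the cassandra-driver npm package and placeholder keyspace, table, and cluster details:

// Keep the partition key cache on and the row cache off for one table.
const cassandra = require("cassandra-driver");

const client = new cassandra.Client({
  contactPoints: ["127.0.0.1"], // placeholder
  localDataCenter: "datacenter1", // placeholder
});

async function configureCaching() {
  // 'keys': 'ALL' uses the partition key cache; 'rows_per_partition': 'NONE'
  // leaves the row cache disabled, per the recommendation above.
  await client.execute(
    "ALTER TABLE my_keyspace.my_table " +
    "WITH caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}"
  );
  await client.shutdown();
}

configureCaching().catch(console.error);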

Difference between a flush request and emptying the cache in Elasticsearch

What is the difference between issuing a flush request and emptying the cache in Elasticsearch? Does a restart of Elasticsearch achieve either of these?
If you mean the difference between the flush and clear cache APIs, it is pretty big.
Flush issues a Lucene commit and empties the Elasticsearch transaction log. As a result it gives durability at the Lucene index level (which is why the translog can be emptied). Flush is called automatically under the hood at regular intervals that adapt to how many documents you index, how big they are, and when the last flush was. You don't normally call flush unless you are doing maintenance on the indices.
Clear cache empties the Elasticsearch caches that are used to make search faster, for instance when executing the same filters or the same facets repeatedly. There are different types of caches, but at this time they are all stored in memory (Java heap).
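For reference, here is what calling the two operations explicitly looks like. A minimal sketch, assuming the official @elastic/elasticsearch Node.js client and a placeholder index name:

// Flush vs. clear cache, called explicitly.
const { Client } = require("@elastic/elasticsearch");
const client = new Client({ node: "http://localhost:9200" }); // placeholder

async function flushAndClear() {
  // Flush: Lucene commit plus truncation of the transaction log.
  // Normally automatic; explicit calls are for index maintenance.
  await client.indices.flush({ index: "my-index" });

  // Clear cache: drops the in-memory search caches, so expect the
  // next searches against the index to be temporarily slower.
  await client.indices.clearCache({ index: "my-index" });
}

flushAndClear().catch(console.error);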

On-disk lookup table with Node.js bindings

For a project I am creating a queuing library that stores URLs in a set (it's actually an object where I set keys to true, but one can treat it as a set), so the queue only takes each URL once. This works really well; however, I am facing the problem that there are many URLs, so RAM usage becomes really high.
Therefore I want to use an on-disk key-value store (actually only keys are required; no idea whether there is some different approach) with the following requirements:
No need to load the whole data set into RAM
Speedy lookups
Node.js bindings
It doesn't have to be too safe (losing data once in a while isn't a huge problem; low RAM requirements are more important), and even though I use Node.js in this scenario, the lookup doesn't necessarily need to run async.
A side question would be whether there is some better approach than an on-disk key-value store. A search term would be nice; "lookup tables" somehow always leads me to data sets (IPs, ZIP codes, etc.).
I'd use a SQL table with a single column (to store the URL), as sketched after this list. It gives better control over memory usage than Redis (which pretty much stores everything in memory), and it is:
easy to check if there is already the same value
easy to insert
easy to remove one element
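A minimal sketch of that approach, assuming the better-sqlite3 package (which is synchronous, matching the note that the lookup does not need to run async); the file name and table name are placeholders:

// Single-column on-disk URL set; SQLite keeps only hot pages in RAM.
const Database = require("better-sqlite3");
const db = new Database("urls.db"); // placeholder file name

db.exec("CREATE TABLE IF NOT EXISTS urls (url TEXT PRIMARY KEY)");

const insert = db.prepare("INSERT OR IGNORE INTO urls (url) VALUES (?)");
const remove = db.prepare("DELETE FROM urls WHERE url = ?");

// Returns true only the first time a URL is seen, so it doubles
// as the "should this be queued?" check.
function addIfNew(url) {
  return insert.run(url).changes === 1;
}

console.log(addIfNew("https://example.com/a")); // true: newly added
console.log(addIfNew("https://example.com/a")); // false: already present
remove.run("https://example.com/a"); // remove one element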
If it really "doesn't have to be too safe", another design would be to keep storing everything in memory but limit the number of URLs you store, for example by using an LRU cache.
You could either use a cache in Node.js (easy to find via Google) or use a separate memcached server, possibly on the same machine.
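A minimal sketch of the LRU variant, assuming the lru-cache npm package (recent versions export the class by name); the maximum entry count is an arbitrary placeholder:

// Bounded in-memory URL set: RAM is capped by entry count, and the
// least recently seen URLs are silently dropped when the cap is hit.
const { LRUCache } = require("lru-cache");

const seen = new LRUCache({ max: 500000 }); // placeholder cap

function shouldEnqueue(url) {
  if (seen.has(url)) return false; // already queued recently
  seen.set(url, true);
  return true;
}

The trade-off: once a URL is evicted from the LRU it can be queued again, which matches the stated tolerance for occasionally losing data.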
