CouchDB view generation performance

How do I avoid slow requests on a frequently updated view in CouchDB when it is not important that the returned data is fully up to date? What I'm describing is probably caching, and I'm wondering whether there is any out-of-the-box solution that doesn't involve third-party software like an nginx cache.
What I've tried is setting the compaction fragmentation thresholds to 0%:
[{db_fragmentation, "0%"}, {view_fragmentation, "0%"}]
yet the views sometimes take 30+ seconds to become available to the consumer.

Adding &update=false to the end of the view URL seems to do the job.
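For example (a minimal sketch assuming CouchDB 2.x or later, where views accept update=true|false|lazy; older releases used stale=ok instead, and the database/design-doc/view names below are placeholders):

// Query a view without waiting for it to be rebuilt (Node 18+, global fetch).
const COUCH = "http://localhost:5984";

async function readViewStale(db: string, ddoc: string, view: string) {
  // update=false returns whatever index already exists instead of
  // reindexing on this request; update=lazy also schedules a rebuild.
  const url = `${COUCH}/${db}/_design/${ddoc}/_view/${view}?update=false`;
  const res = await fetch(url);
  if (!res.ok) throw new Error(`view request failed: ${res.status}`);
  return res.json();
}

// The rows may be slightly out of date, which is exactly the trade-off wanted here.
readViewStale("mydb", "points", "by_date").then(body =>
  console.log(body.rows.length, "rows (possibly stale)"));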
I am "relaxed" again

Related

A way to quickly purge a very large list of URLs from Varnish

My Varnish server caches a map tile server, which is updated in real time from OpenStreetMap every minute. Frequently, an entire area of the map needs to be invalidated, i.e. 10,000 or even 100,000 tiles at once. Each tile is a single URL (no variances).
Is there an efficient way to run such large scale Varnish invalidation? Ideally objects should remain in cache (so that grace period would continue to work unless a URL flag of nograce is passed in), but marked as no longer valid. Ideally this tight loop would be implemented in VCL itself.
Example URL: http://example.org/tile/10/140/11.pbf (no variance, no query portion) where the numbers are {zoom}/{x}/{y}, and the list of those numbers (i.e. 100,000 at a time) is generated externally every minute and stored in a file. BTW, most likely the majority of those URLs won't even be in the cache.
The answer depends a lot on what those URLs look like. Options are:
Using multiple soft purges [1] (beware of the 'soft' part; you'll need the purge VMOD for that), triggered by an external loop (sorry, you cannot do that in VCL). Soft purges set the TTL to 0 instead of fully removing objects from the storage. A sketch of such a loop follows the links below.
Using a simple ban [2]. However, bans will completely (and lazily) remove matching objects from the storage (i.e. there is no 'soft' flavour of bans).
Using the xkey VMOD [3]. The VMOD provides a 'soft' invalidation option, but I'm not sure whether a surrogate index would help in your use case.
[1] https://varnish-cache.org/docs/trunk/reference/vmod_purge.html
[2] https://varnish-cache.org/docs/trunk/users-guide/purging.html#bans
[3] https://github.com/varnish/varnish-modules/blob/master/docs/vmod_xkey.rst
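For illustration, here is a minimal sketch of the external loop from the first option, in TypeScript on Node 18+ (global fetch). It assumes your VCL already translates the PURGE request method into purge.soft() as shown in [1], that Varnish is reachable at the address below, and that the tile list file contains one {zoom}/{x}/{y} triple per line; those names are assumptions, not part of the question.

// Read the per-minute list of tiles and send one soft-purge request per URL.
// Misses are cheap: Varnish simply reports that the object wasn't in cache.
import { readFile } from "node:fs/promises";

const VARNISH = "http://varnish.internal";   // assumed address

async function softPurgeTiles(listFile: string): Promise<void> {
  const lines = (await readFile(listFile, "utf8")).split("\n").filter(Boolean);
  for (const zxy of lines) {                 // zxy looks like "10/140/11"
    // If your runtime's fetch rejects custom methods, node:http works too.
    const res = await fetch(`${VARNISH}/tile/${zxy}.pbf`, { method: "PURGE" });
    if (!res.ok && res.status !== 404) {
      console.warn(`soft purge failed for ${zxy}: ${res.status}`);
    }
  }
}

softPurgeTiles("tiles-to-invalidate.txt").catch(console.error);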

Performance drop due to NotesDocument.closeMIMEEntities()

After moving my XPages application from one Domino server to another (both version 9.0.1 FP4 and with similar hardware), the application's performance dropped sharply. Benchmarks revealed that the execution of
doc.closeMIMEEntities(false,"body")
which takes ~0.1 ms on the old server, now on average takes >10 ms on the new one. This difference wouldn't matter if it were only a few documents, but when initializing the application I read more than 1000 documents, so the initialization time goes from less than 1 second to more than 10 seconds.
In the code, I use the line above to close the MIME entity without saving any changes after reading from it (NO writing). The function always returns true on both servers. Still, it now takes more than 100x longer even though nothing in the entity has changed.
The fact that both servers have more or less the same hardware, and that the replicas of my application contain the same design and data on both, leads me to believe that the problem has something to do with the settings of the Domino server.
Can anybody help me with this?
PS: I always use session.setConvertMime(false) before opening the NotesDocument, i.e. the conversion from MIME to RichText should not be what causes the problem.
PPS: The HTTPJVMMaxHeapSize is the same on both servers (1024M) and there are several hundred MB of free memory. I just mention this in case someone thinks the problem might be related to running out of memory.
The problem is related to the "ImportConvertHeaders bug" in Domino 9.0.1 FP4. It has already been solved with Interim Fix 1 (as pointed out by Knut Herrmann here).
It turned out that the old Domino server had Interim Fix 1 installed, while the "new" one did not. After applying the fix to the new Domino server, performance is back to normal and everything works as expected.

How to implement node-lru-cache?

I've developed a real-time app with Node.js, Socket.io and MongoDB. It has a requirement that when a user loads a specific page, some 20,000 points with x and y coordinates, falling between two specific dates, are fetched from MongoDB and rendered on a map for the client. If the user reloads, the process is repeated. I'm confused about how to insert these points into a cache, and with what key, so that when the user reloads, the values can easily be fetched from the cache by that key.
Any suggestions? Thanks!
You could:
completely write your own caching layer;
use an existing caching library, e.g. the lru-cache module by isaacs, which is probably the most popular in this field;
use Redis as a cache (it gives you the ability to set a TTL on inserted docs); there is already a mongoose-redis-cache module, maybe that helps;
and potentially x other solutions. It depends on the scale of your data, the number of requests and so on. A usage sketch with the lru-cache option follows below.
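As a rough illustration of the lru-cache option (a sketch only: the date-range key format and the fetchPointsFromMongo function are invented here, and the option names assume lru-cache v7+; older releases used a default export and maxAge):

// Sketch: cache the ~20,000-point result of a date-range query so a page
// reload can be served from memory. The cache key is just the two dates.
import { LRUCache } from "lru-cache";

type Point = { x: number; y: number };

const cache = new LRUCache<string, Point[]>({
  max: 50,          // keep at most 50 distinct date ranges
  ttl: 5 * 60_000,  // drop entries after 5 minutes
});

// Stand-in for the real MongoDB query (hypothetical helper).
declare function fetchPointsFromMongo(from: Date, to: Date): Promise<Point[]>;

export async function getPoints(from: Date, to: Date): Promise<Point[]> {
  const key = `${from.toISOString()}|${to.toISOString()}`; // key = date range
  const hit = cache.get(key);
  if (hit) return hit;                        // reload served from memory
  const points = await fetchPointsFromMongo(from, to);
  cache.set(key, points);
  return points;
}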
The caching is something your database does for you in this case. MongoDB relies on the operating system's memory-mapped I/O for storage. A general purpose OS will usually keep the most frequently used pages in memory. If you still want to use an additional cache, the obvious key to use for coordinates would be a Geohash.
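If you go the geohash route, the idea is just to quantize each coordinate pair into a short string and use it (or a prefix of it) as the cache key. A rough sketch; encodeGeohash is a placeholder for whatever geohash library you pick (the ngeohash package is one option):

// Sketch: derive a cache key for a point by truncating its geohash.
// encodeGeohash is a stand-in for a real geohash implementation;
// precision 6 corresponds to a cell of roughly a kilometre.
declare function encodeGeohash(lat: number, lon: number, precision: number): string;

function pointCacheKey(lat: number, lon: number): string {
  // Nearby points share a prefix, so one key covers a whole grid cell
  // and lookups for neighbouring points hit the same cached bucket.
  return encodeGeohash(lat, lon, 6);
}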
The runtime-memcache library implements LRU and a few other caching schemes in JavaScript. It works with Node.js and is written in TypeScript.
It uses a modified doubly linked list to achieve O(1) get, set and remove.
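For reference, the classic way to get O(1) for all three operations is to pair a hash map with a doubly linked list: the map finds the node for a key in O(1), and the list keeps recency order so the least recently used entry can be evicted in O(1). A compact sketch of that structure (an illustration of the technique, not runtime-memcache's actual code):

// LRU cache backed by a hash map plus a doubly linked list.
class Node<K, V> {
  constructor(public key: K, public value: V,
              public prev: Node<K, V> | null = null,
              public next: Node<K, V> | null = null) {}
}

class LRU<K, V> {
  private map = new Map<K, Node<K, V>>();
  private head: Node<K, V> | null = null; // most recently used
  private tail: Node<K, V> | null = null; // least recently used

  constructor(private capacity: number) {}

  get(key: K): V | undefined {
    const node = this.map.get(key);
    if (!node) return undefined;
    this.moveToFront(node);               // touching an entry refreshes it
    return node.value;
  }

  set(key: K, value: V): void {
    const existing = this.map.get(key);
    if (existing) {
      existing.value = value;
      this.moveToFront(existing);
      return;
    }
    const node = new Node(key, value);
    this.map.set(key, node);
    this.attachAtFront(node);
    if (this.map.size > this.capacity && this.tail) {
      this.map.delete(this.tail.key);     // evict least recently used
      this.detach(this.tail);
    }
  }

  remove(key: K): void {
    const node = this.map.get(key);
    if (!node) return;
    this.map.delete(key);
    this.detach(node);
  }

  private moveToFront(node: Node<K, V>): void {
    if (node === this.head) return;
    this.detach(node);
    this.attachAtFront(node);
  }

  private attachAtFront(node: Node<K, V>): void {
    node.prev = null;
    node.next = this.head;
    if (this.head) this.head.prev = node;
    this.head = node;
    if (!this.tail) this.tail = node;
  }

  private detach(node: Node<K, V>): void {
    if (node.prev) node.prev.next = node.next; else this.head = node.next;
    if (node.next) node.next.prev = node.prev; else this.tail = node.prev;
    node.prev = node.next = null;
  }
}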

Is there any modern review of solutions to the 10000 client/sec problem

(Commonly called the C10K problem)
Is there a more contemporary review of solutions to the C10K problem (the original page was last updated 2 Sept 2006), specifically focused on Linux (epoll, signalfd, eventfd, timerfd...) and libraries like libev or libevent?
Something that discusses all the solved and still unsolved issues on a modern Linux server?
The C10K problem generally assumes you're trying to optimize a single server, but as your referenced article points out "hardware is no longer the bottleneck". Therefore, the first step to take is to make sure it isn't easiest and cheapest to just throw more hardware in the mix.
If we've got a $500 box serving X clients per second, it's a lot more efficient to just buy another $500 box to double our throughput instead of letting an employee gobble up who knows how many hours and dollars trying to figure out how to squeeze more out of the original box. Of course, that's assuming our app is multi-server friendly, that we know how to load balance, etc., etc...
Coincidentally, just a few days ago, Programming Reddit or maybe Hacker News mentioned this piece:
Thousands of Threads and Blocking IO
In the early days of Java, my C programming friends laughed at me for doing socket IO with blocking threads; at the time, there was no alternative. These days, with plentiful memory and processors it appears to be a viable strategy.
The article is dated 2008, so it pulls your horizon up by a couple of years.
To answer the OP's question, you could say that today the equivalent document is not about optimizing a single server for load, but about optimizing your entire online service for load. From that perspective, the number of combinations is so large that what you are looking for is not a document, it is a live website that collects such architectures and frameworks. Such a website exists and it's called www.highscalability.com.
Side Note 1:
I'd argue against the belief that throwing more hardware at it is a long-term solution:
Perhaps the cost of an engineer who "gets" performance is high compared to the cost of a single server. But what happens when you scale out? Let's say you have 100 servers. A 10 percent improvement in server capacity can save you 10 servers a month.
Even if you have just two machines, you still need to handle performance spikes. The difference between a service that degrades gracefully under load and one that breaks down is that someone spent time optimizing for the load scenario.
Side note 2:
The subject of this post is slightly misleading. The C10K document does not try to solve the problem of 10k clients per second. (The number of clients per second is irrelevant unless you also define a workload along with sustained throughput under bounded latency. I think Dan Kegel was aware of this when he wrote that doc.) Look at it instead as a compendium of approaches to building concurrent servers, and micro-benchmarks for the same. Perhaps what has changed between then and now is that at one point in time you could assume the service was a website serving static pages. Today the service might be a NoSQL datastore, a cache, a proxy or one of hundreds of network infrastructure software pieces.
You can also take a look at this series of articles:
http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3
He shows a fair amount of performance data and the OS configuration work he had to do in order to support 10K and then 1M connections.
It seems like a system with 30 GB of RAM could handle 1 million connected clients in a sort of social-network type of simulation, using a libevent frontend to an Erlang-based app server.
libev publishes some benchmarks of itself against libevent...
I'd recommend reading Zed Shaw's "poll, epoll, science, and superpoll" [1]. It covers why epoll isn't always the answer, why sometimes it's even better to go with poll, and how to get the best of both worlds.
[1] http://sheddingbikes.com/posts/1280829388.html
Have a look at the RAMCloud project at Stanford: https://ramcloud.atlassian.net/wiki/display/RAM/RAMCloud
Their goal is 1,000,000 RPC operations/sec/server. They have numerous benchmarks and commentary on the bottlenecks that are present in a system which would prevent them from reaching their throughput goals.

Should AspBufferLimit ever need to be increased from the default of 4 MB?

A fellow developer recently requested that the AspBufferLimit in IIS 6 be increased from the default value of 4 MB to around 200 MB for streaming larger ZIP files.
Having left the Classic ASP world some time ago, I was scratching my head as to why you'd want to buffer a BinaryWrite and simply suggested setting Response.Buffer = false. But is there any case where you'd really need to make it 50x the default size?
Obviously, memory consumption would be the biggest worry. Are there other concerns with changing this default setting?
Increasing the buffer like that is a supremely bad idea. You would allow every visitor to your site to use up to that amount of RAM. If your BinaryWrite/Response.Buffer=false solution doesn't appease him, you could also suggest that he call Response.Flush() now and then. Either would be preferable to increasing the buffer size.
In fact, unless you have a very good reason, you shouldn't even pass this through the ASP processor. Write it to a special place on disk set aside for such things and redirect there instead.
One of the downsides of turning off the buffer (you could use Flush, but I really don't get why you'd do that in this scenario) is that the client doesn't learn the content length at the start of the download. Hence the browser's download dialog at the other end is less meaningful; it can't tell how much progress has been made.
A better (IMO) alternative is to write the desired content to a temporary file (perhaps using a GUID for the file name) and then send a Redirect to the client pointing at this temporary file. A rough sketch of this pattern appears after the lists below.
There are a number of reasons why this approach is better:
The client gets good progress info in the save dialog or application receiving the data
Some applications can make good use of byte range fetches which only work well when the server is delivering "static" content.
The temporary file can be re-used to satisfy requests from other clients
There are a number of downsides though:
If it takes some time to create the file content, writing to a temporary file first adds latency before any data is received, increasing the overall download time.
If strong security is needed on the content, having a static file lying around may be a concern, although the use of a random GUID filename mitigates that somewhat.
There is a need for some housekeeping of old temporary files.
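For what it's worth, here is the write-then-redirect pattern sketched in Node/TypeScript rather than Classic ASP, purely to illustrate the flow; the directory, URL prefix and buildZip helper are invented for the example:

// Write the generated ZIP to a statically served directory under a
// hard-to-guess GUID name, then redirect so the static file handler
// serves it (Content-Length, progress bars and byte ranges all work).
import { randomUUID } from "node:crypto";
import { createServer } from "node:http";
import { writeFile } from "node:fs/promises";
import { join } from "node:path";

declare function buildZip(): Promise<Buffer>;  // stand-in for the real ZIP builder

const TEMP_DIR = "/var/www/static/downloads";  // served as static content

createServer(async (req, res) => {
  const name = `${randomUUID()}.zip`;
  await writeFile(join(TEMP_DIR, name), await buildZip());
  res.writeHead(302, { Location: `/downloads/${name}` });
  res.end();
}).listen(8080);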
