What's the better way to retrieve complex data from ArangoDB: a single big query with all the collection joins and graph traversals, or multiple smaller queries, one for each piece of data?
I think it depends on several aspects, e.g. the operation(s) you want to perform, the scenario in which the query (or queries) will be executed, and whether you favor performance over maintainability.
AQL lets you write a single non-trivial query that may span the entire dataset and perform complex operations. Splitting a big query into multiple smaller ones may improve maintainability and code readability, but on the other hand issuing a separate query for each piece of data can hurt performance because of the network latency associated with each request. You should also consider whether the scenario allows you to work with partial results returned from the database while the remaining queries are still being processed.
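To make the trade-off concrete, here is a minimal sketch assuming the python-arango driver and hypothetical collections users, orders, and an edge collection knows:

```python
# Sketch only: collection names, fields and credentials are made up.
from arango import ArangoClient

db = ArangoClient().db("example", username="root", password="")

# Option A: one AQL query that joins and traverses in a single round trip.
single_query = """
FOR u IN users
  FILTER u.active == true
  LET userOrders = (FOR o IN orders FILTER o.userId == u._key RETURN o)
  LET friends = (FOR f IN 1..2 OUTBOUND u knows RETURN f.name)
  RETURN { user: u.name, orders: userOrders, friends: friends }
"""
combined = list(db.aql.execute(single_query))

# Option B: several small queries -- simpler to read, test and maintain,
# but each one pays its own network round trip.
users = list(db.aql.execute("FOR u IN users FILTER u.active == true RETURN u"))
for u in users:
    u["orders"] = list(db.aql.execute(
        "FOR o IN orders FILTER o.userId == @key RETURN o",
        bind_vars={"key": u["_key"]},
    ))
```

Option A pays a single round trip at the cost of a harder-to-maintain query; option B is easier to reason about but issues one request per user.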
I am looking for a compaction strategy for data with the following characteristics:
We don't need the data after 60-90 days; in extreme scenarios, maybe 180 days.
Ideally only inserts happen and updates never do, but it is realistic to expect duplicate events, which cause updates.
It is effectively time-series data: events that arrive first are stored first, and once an event is stored it is almost never modified unless duplicate events are published.
Which strategy will be best for this case?
TimeWindowCompactionStrategy is only suitable for time-series use cases, and that is the only reason you'd choose TWCS.
LeveledCompactionStrategy fits only a very limited set of edge cases, and the time I spend helping users troubleshoot LCS because it doesn't suit their needs is hardly worth the supposed benefits.
Unless you have some very specific requirements, SizeTieredCompactionStrategy is almost always the right choice, which is why it is the default compaction strategy. Cheers!
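For reference, a minimal sketch of declaring the compaction strategy explicitly, assuming the cassandra-driver Python client; the keyspace, table, columns, window size and TTL are all made up:

```python
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("demo_ks")

# TWCS with ~90-day retention via default TTL -- only if the workload really
# is append-only time series with a fixed retention window.
session.execute("""
CREATE TABLE IF NOT EXISTS events (
    source text,
    ts timestamp,
    payload text,
    PRIMARY KEY ((source), ts)
) WITH default_time_to_live = 7776000
  AND compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '7'
  }
""")

# Otherwise, keeping the default SizeTieredCompactionStrategy just means
# omitting the compaction clause, or naming it explicitly:
#   compaction = {'class': 'SizeTieredCompactionStrategy'}
```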
How do I implement a caching mechanism for search results, like the one on Stack Overflow?
How do Elasticsearch and Lucene deal with caching?
As of now, you can cache in two different ways within Elasticsearch:
Filter cache - if you offload as many constraints as possible into filters (i.e. constraints that don't take part in scoring), Elasticsearch can cache each such filter at the segment level. Together with the warmer API, this provides a decent amount of in-memory caching for the applied filters alone.
Shard request cache* - you can cache the results (other than hits) at the query level. This is a fairly new feature and should provide a good amount of caching, but the _source still needs to be fetched from the shards.
Within Elasticsearch you can exploit these features to attain a good amount of caching; a quick sketch of both follows below.
You can also explore caching options external to Elasticsearch, such as memcached or other in-memory caches.
* previously called the shard query cache

Note: warmers have been removed (Elasticsearch 5.4+); there have been significant improvements to the index that make warmers unnecessary.
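A minimal sketch of both ideas, assuming the elasticsearch-py 8.x client (the index, field and aggregation names are made up): non-scoring constraints go into the filter context, which makes them eligible for the node-level query/filter cache, and request_cache=True opts the request into the shard request cache, which by default only caches the non-hits portion, hence size=0.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

resp = es.search(
    index="posts",
    request_cache=True,   # opt in to the shard request cache
    size=0,               # the cached portion excludes hits by default
    query={
        "bool": {
            "must": [{"match": {"title": "caching"}}],    # scored, not cached
            "filter": [                                    # non-scoring, cacheable
                {"term": {"status": "published"}},
                {"range": {"created": {"gte": "now-30d/d"}}},
            ],
        }
    },
    aggs={"by_tag": {"terms": {"field": "tags"}}},
)
print(resp["aggregations"]["by_tag"]["buckets"])
```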
In general, say you have a small (<16 MB) table in a database running on the same machine as your server. If you need to do lots of querying into this table (100% reads), is it better to:
Get the entire table and do all the searching/querying in the server code, or
Make lots of queries against the local database?
If the database is local, can I take advantage of the DBMS's highly efficient internal data structures for querying, or is the round-trip delay such that it's faster to map the tables returned by the database into my own data structures?
Thanks.
This is going to depend heavily on what kind of searches you're doing.
If your data is all ID lookups, it's probably faster to have it in RAM.
If your data is all full scans (no indexes), it's probably faster to have it in RAM.
If your data uses indexes, it's probably faster to have it in the DB.
Of course, much of the appeal of a database is indexes and a common query interface, so you have to weigh how valuable those are versus raw speed.
There's no way to really answer this without knowing exactly the nature of the data and queries to be done on it. Over-the-wire time has its cost, as does BSON <-> native marshalling, but indexed searches can be O(log n) as opposed to a dumb O(n) (or worse) search over a simple in-memory data structure.
Have you tried benchmarking?
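If you do benchmark, a minimal sketch along these lines is enough to get a first number. Python's built-in sqlite3 stands in here for "a local database"; the table, column names and sizes are made up:

```python
import random
import sqlite3
import timeit

# Build a small indexed table (PRIMARY KEY gives us the index).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(i, f"value-{i}") for i in range(100_000)])
conn.commit()

# Pull the whole table into a plain dict, as in option 1.
in_memory = {row[0]: row[1] for row in conn.execute("SELECT id, value FROM items")}
ids = [random.randrange(100_000) for _ in range(10_000)]

def via_db():
    for i in ids:
        conn.execute("SELECT value FROM items WHERE id = ?", (i,)).fetchone()

def via_dict():
    for i in ids:
        in_memory[i]

print("db  :", timeit.timeit(via_db, number=1))
print("dict:", timeit.timeit(via_dict, number=1))
```

Swap in your real database, table and query patterns; the point is only to measure your actual workload rather than guess.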
What would be a good or recommended way to model an SVG DOM tree in Google's Realtime API? Specifically, should I stringify the SVG DOM tree and use a collaborative string model, or is there a better way? Thanks.
It depends on what you want to do with it. If all you want to do is display something, without it being editable, then I would just store it as a blob, e.g. a static string.
If you want to be able to edit it, a collaborative string is problematic, as it's hard to guarantee that the result of merging different collaborators' actions will be well-formed XML.
Instead, you could use custom objects to model the various nodes in the tree. You could do this either with a generic DOM-like model where nodes have arbitrary attributes, or with specific classes for the different element types. I think the latter would be the most powerful way to deal with it, and the nicest to work with, but also the most work to set up.
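For illustration only, here is a sketch of the two model shapes in plain Python (the Realtime API itself is JavaScript-based and its custom-object registration is not shown; all names are made up):

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Generic DOM-like model: every node is a tag plus arbitrary attributes.
@dataclass
class SvgNode:
    tag: str
    attrs: Dict[str, str] = field(default_factory=dict)
    children: List["SvgNode"] = field(default_factory=list)

# Specific classes per element type: more setup, but richer and safer editing.
@dataclass
class Circle:
    cx: float
    cy: float
    r: float
    fill: str = "black"

@dataclass
class Group:
    children: List[object] = field(default_factory=list)

drawing = Group(children=[Circle(cx=10, cy=10, r=5, fill="red")])
```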
I'm trying to implement a system using node.js in which a number of sites contain JS loaded from a common host, and an action is triggered when a user has visited n or more of the sites.
I suppose a NoSQL solution storing a mapping of IP address => array of sites visited would be preferable to an RDBMS, both in terms of performance and simplicity. The operations I need are "add to the array if not already there" and getting the array's length. Also, I wouldn't like it all to sit in memory all the time, since the DB might get large some day.
What's a system that fits these requirements the best? MongoDB seems like a nice option given $addToSet exists, but maybe there's something better in terms of RAM usage?
When I hear about working with lists or sets, my first choice is Redis.
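A minimal sketch of the two operations you list, assuming the redis-py client (the key naming scheme and the threshold are made up):

```python
import redis

r = redis.Redis(host="localhost", port=6379)

def record_visit(ip: str, site: str, threshold: int = 3) -> bool:
    key = f"visits:{ip}"
    r.sadd(key, site)                  # SADD is a no-op if the member already exists
    return r.scard(key) >= threshold   # SCARD returns the set's cardinality

# Example: trigger the action once the same IP has hit 3 distinct sites.
if record_visit("203.0.113.7", "site-a.example"):
    print("threshold reached")
```

SADD and SCARD map directly onto "add to the set if not already there" and "get the set's length" from the question.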