With regard to the Key/Value model of ArangoDB, does anyone know the maximum size per Value? I have spent hours searching the Internet for this information but to no avail; you would think this is classified information. Thanks in advance.
The answer depends on different things, like the storage engine and whether you mean theoretical or practical limit.
In the case of the MMFiles engine, the maximum document size is determined by the startup option wal.logfile-size if wal.allow-oversize-entries is turned off. If it's on, then there's no immediate limit.
In the case of the RocksDB engine, it might be limited by some of the server startup options, such as rocksdb.intermediate-commit-size, rocksdb.write-buffer-size, rocksdb.total-write-buffer-size or rocksdb.max-transaction-size.
When using arangoimport to import a 1 GB JSON document, you will run into the default batch-size limit. You can increase it, but it appears to max out at 805306368 bytes (0.75 GB). The HTTP API seems to have the same limitation (/_api/cursor with bindVars).
What you should keep in mind: mutating the document is potentially a slow operation because of the append-only nature of the storage layer. In other words, a new copy of the document with a new revision number is persisted, and the old revision will be compacted away some time later (I'm not familiar with all the technical details, but I think this is fair to say). For a 500 MB document it seems to take a few seconds to update or copy it using RocksDB on a rather strong system. It's much better to have many small documents.
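To make the "many small documents" point concrete, here is a rough sketch of splitting one large value into small chunk documents with the ArangoDB Java driver. The host, credentials, database and collection names, and the 512 KB chunk size are assumptions for illustration only; double-check the method names against the driver version you use.

```java
import com.arangodb.ArangoCollection;
import com.arangodb.ArangoDB;
import com.arangodb.ArangoDatabase;
import com.arangodb.entity.BaseDocument;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

public class ChunkedStore {
    // Arbitrary chunk size: small enough that rewriting one chunk is cheap.
    static final int CHUNK_SIZE = 512 * 1024; // 512 KB

    public static void main(String[] args) {
        ArangoDB arango = new ArangoDB.Builder()
                .host("localhost", 8529)        // assumed local server
                .user("root").password("")      // assumed credentials
                .build();
        ArangoDatabase db = arango.db("example");          // assumed database name
        ArangoCollection chunks = db.collection("chunks"); // assumed collection name

        byte[] bigValue = new byte[10 * 1024 * 1024]; // stand-in for one large value

        List<BaseDocument> docs = new ArrayList<>();
        int n = (bigValue.length + CHUNK_SIZE - 1) / CHUNK_SIZE;
        for (int i = 0; i < n; i++) {
            int from = i * CHUNK_SIZE;
            int to = Math.min(from + CHUNK_SIZE, bigValue.length);
            BaseDocument doc = new BaseDocument("bigvalue-" + i);
            doc.addAttribute("parent", "bigvalue");
            doc.addAttribute("seq", i);
            // Base64 keeps the binary chunk JSON-safe for this sketch.
            doc.addAttribute("data",
                    Base64.getEncoder().encodeToString(Arrays.copyOfRange(bigValue, from, to)));
            docs.add(doc);
        }
        chunks.insertDocuments(docs); // one batch insert of many small documents
        arango.shutdown();
    }
}
```

Updating a single property of the large value then only rewrites one small chunk document instead of persisting a new copy of the whole thing.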
This question is similar to the one asked here. However, the answer there does not help me clearly understand what user memory in Spark actually is.
Can you help me understand with an example? For instance, an example to understand execution and storage memory would be: in c = a.join(b, a.id==b.id); c.persist(), the join operation (shuffle etc.) uses execution memory, and the persist uses storage memory to keep c cached. Similarly, can you please give me an example of user memory?
From the official documentation, one thing I understand is that it stores UDFs. Storing UDFs does not warrant even a few MBs of space, let alone the default of 25% that is actually reserved in Spark. What kind of heavy objects might get stored in user memory that one should be careful of and take into consideration when deciding how to set the parameter (spark.memory.fraction) that sets the bounds of user memory?
That's a really great question, to which I won't be able to give a fully detailed answer (I'll be following this question to see if better answers pop up), but I've been snooping around in the docs and found out some things.
I wasn't sure whether I should post this as an answer, because it ends with a few questions of my own but since it does answer your question to some degree I decided to post this as an answer. If this is not appropriate I'm happy to move this somewhere else.
Spark configuration docs
From the configuration docs, you can see the following about spark.memory.fraction:
Fraction of (heap space - 300MB) used for execution and storage. The lower this is, the more frequently spills and cached data eviction occur. The purpose of this config is to set aside memory for internal metadata, user data structures, and imprecise size estimation in the case of sparse, unusually large records. Leaving this at the default value is recommended. For more detail, including important information about correctly tuning JVM garbage collection when increasing this value, see this description.
So we learn it contains:
Internal metadata
User data structures
Imprecise size estimation in case of sparse, unusually large records
Spark tuning docs: memory management
Following the link in the docs, we get to the Spark tuning page. In there, we find a bunch of interesting info about the storage vs execution memory, but that is not what we're after in this question. There is another bit of text:
spark.memory.fraction expresses the size of M as a fraction of the (JVM heap space - 300MiB) (default 0.6). The rest of the space (40%) is reserved for user data structures, internal metadata in Spark, and safeguarding against OOM errors in the case of sparse and unusually large records.
and also
The value of spark.memory.fraction should be set in order to fit this amount of heap space comfortably within the JVM’s old or “tenured” generation. See the discussion of advanced GC tuning below for details.
So, this is a similar explanation and also a reference to garbage collection.
Spark tuning docs: garbage collection
When we go to the garbage collection page, we see a bunch of information about classical GC in Java. But there is a section that discusses spark.memory.fraction:
In the GC stats that are printed, if the OldGen is close to being full, reduce the amount of memory used for caching by lowering spark.memory.fraction; it is better to cache fewer objects than to slow down task execution. Alternatively, consider decreasing the size of the Young generation. This means lowering -Xmn if you’ve set it as above. If not, try changing the value of the JVM’s NewRatio parameter. Many JVMs default this to 2, meaning that the Old generation occupies 2/3 of the heap. It should be large enough such that this fraction exceeds spark.memory.fraction.
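For completeness, this is roughly how such a change would be applied. The example below uses a local SparkSession purely for illustration; on a cluster you would typically pass the setting via spark-submit --conf before the executors start, since it sizes regions of the executor heap.

```java
import org.apache.spark.sql.SparkSession;

public class MemoryFractionDemo {
    public static void main(String[] args) {
        // Lowering spark.memory.fraction shrinks the unified execution/storage
        // region and leaves more of the (heap - 300MB) for user memory.
        SparkSession spark = SparkSession.builder()
                .appName("memory-fraction-demo")
                .master("local[*]")
                .config("spark.memory.fraction", "0.5")        // default is 0.6
                .config("spark.memory.storageFraction", "0.5") // share of M protected from eviction
                .getOrCreate();

        System.out.println(spark.conf().get("spark.memory.fraction"));
        spark.stop();
    }
}
```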
What do I gather from this
As you have already said, the default spark.memory.fraction is 0.6, so 40% is reserved for this "user memory". That is quite large. Which objects end up in there?
This is where I'm not sure, but I would guess the following:
Internal metadata
I don't expect this to be huge?
User data structures
This might be large (just intuition speaking here, not sure at all), and I would hope that someone with more knowledge about this would be able to give some good examples here.
If you make intermediate structures during a map operation on a dataset, do they end up in user memory or in execution memory? (See the sketch after this list for the kind of object I mean.)
Imprecise size estimation in the case of sparse, unusually large records
It seems like this is only triggered in special cases; it would be interesting to know where/how this gets decided.
Elsewhere in the docs this is described as "safeguarding against OOM errors in the case of sparse and unusually large records", so it might be that this is more of a safety buffer than anything else?
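To illustrate the "user data structures" item above, here is a minimal sketch of the kind of object I have in mind: a plain HashMap built inside mapPartitions. Spark's memory manager does not account for it, so it has to fit into the user-memory portion of the executor heap. All names and sizes here are made up for the example.

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SparkSession;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class UserMemoryDemo {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("user-memory-demo").master("local[*]").getOrCreate();
        JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) ids.add(i);
        JavaRDD<Integer> rdd = jsc.parallelize(ids, 8);

        JavaRDD<String> enriched = rdd.mapPartitions((Iterator<Integer> it) -> {
            // This HashMap is an ordinary user object on the executor heap.
            // Spark's memory manager does not track it, so it must fit into
            // "user memory" (the 1 - spark.memory.fraction part of the heap).
            Map<Integer, String> lookup = new HashMap<>();
            for (int i = 0; i < 100_000; i++) lookup.put(i, "label-" + i);

            List<String> out = new ArrayList<>();
            while (it.hasNext()) {
                int id = it.next();
                out.add(id + ":" + lookup.getOrDefault(id % 100_000, "unknown"));
            }
            return out.iterator();
        });

        System.out.println(enriched.count());
        spark.stop();
    }
}
```

If that lookup map were large (say, hundreds of MB per task), it would be competing for exactly the 40% slice discussed above, which is my best guess for why the default reservation is as generous as it is.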
I have a question regarding Cosmos DB and how to deal with the costs and RUs. Let's say I have a JSON document that is around 100 KB. Now when I want to update a property and use an upsert, Cosmos will do a replace, which is essentially a delete and create, and which will result in relatively high RU consumption, right?
My first question is: can I reduce the number of RUs for updating by splitting the document into smaller parts, say 10 x 10 KB documents? A smaller document then has to be replaced, which needs less CPU etc.
So this would be the case for upserts. But now there is a game changer for Cosmos DB called partial update.
How would it be in this case? Would smaller documents lead to a decrease in RU consumption? In the background, Cosmos still has to parse the document and insert the new property, so does a bigger document mean more to parse and therefore more RU consumption?
My last question is: will the split into more documents lead to higher RU consumption because I have to make 10 requests instead of one?
I'm going to preface my answer with the comment that anything performance-related is something that users need to test themselves, because the benefits or trade-offs can vary widely. There is, however, some general guidance around this.
In scenarios where you have very large documents with frequent updates to a small number of properties, it is often better to shred that document into one document that has the frequently updated properties and another that has the static properties. Smaller documents consume fewer RUs to update and also reduce the load on the client and the network payload.
Partial updates provide zero RU benefit over Update or Upsert, regardless of whether you shred the document or not. The service still needs to patch and merge the entire document. They only reduce CPU consumption and network payload due to the smaller amount of data sent.
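Since these costs depend heavily on your documents and indexing policy, the most reliable approach is to measure them yourself. As a minimal sketch (azure-cosmos Java SDK v4; the account, database, container, item id, and partition key are placeholders), you can issue a partial update, read the RU charge from the response, and compare it against a full replace of the same document:

```java
import com.azure.cosmos.CosmosClient;
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.cosmos.CosmosContainer;
import com.azure.cosmos.models.CosmosItemResponse;
import com.azure.cosmos.models.CosmosPatchOperations;
import com.azure.cosmos.models.PartitionKey;

public class PatchChargeDemo {
    public static void main(String[] args) {
        CosmosClient client = new CosmosClientBuilder()
                .endpoint("https://<your-account>.documents.azure.com:443/") // placeholder
                .key("<your-key>")                                           // placeholder
                .buildClient();

        CosmosContainer container = client
                .getDatabase("demo")      // placeholder database
                .getContainer("orders");  // placeholder container

        // Partial update: send only the changed property instead of the whole 100 KB document.
        CosmosPatchOperations patch = CosmosPatchOperations.create()
                .replace("/status", "shipped");

        CosmosItemResponse<Object> response = container.patchItem(
                "order-1",                        // placeholder item id
                new PartitionKey("customer-42"),  // placeholder partition key value
                patch,
                Object.class);

        // The response reports the RU charge, so you can compare patch vs. full replace yourself.
        System.out.println("Request charge: " + response.getRequestCharge() + " RU");
        client.close();
    }
}
```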
We are trying to do a POC to change the way we store content in a Geode region. We operate on sketches (sizes can vary from 1 GB to 30 GB) and currently break them into parcels and store the parcels in the region. We then read these parcels and merge them to create a complete sketch for our processing. We are seeing some inconsistencies in the data due to cache eviction, and are trying to come up with an approach of storing the complete object in the region instead of storing the parts.
I was looking at the Geode documentation but could not find a size limit for an entry in a region, so I wanted to reach a broader group in case anyone has done anything similar or has some insights into it.
Thanks for your response in advance.
Best Regards,
Amit
According to what I've been investigating, the maximum object size is set at 1 GB; you can have a look at GEODE-478 and commit 1e3f89ddcd for further details. It's worth mentioning, as a side note, that objects that big might cause problems with GC, so you might want to stay away from that.
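If you end up staying with parcels, here is a minimal sketch of what the write side could look like with the Geode Java client, keeping each entry comfortably below that 1 GB limit. The locator address, region name, and the 64 MB parcel size are assumptions for illustration.

```java
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;

import java.util.Arrays;

public class SketchParcels {
    // Keep every entry well below the 1 GB object limit (and GC-friendly).
    static final int PARCEL_SIZE = 64 * 1024 * 1024; // 64 MB, arbitrary

    public static void main(String[] args) {
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("localhost", 10334)   // assumed locator
                .create();
        Region<String, byte[]> region = cache
                .<String, byte[]>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("sketches");                  // assumed region name

        byte[] sketch = new byte[256 * 1024 * 1024];  // stand-in for one large sketch

        int parcels = (sketch.length + PARCEL_SIZE - 1) / PARCEL_SIZE;
        for (int i = 0; i < parcels; i++) {
            int from = i * PARCEL_SIZE;
            int to = Math.min(from + PARCEL_SIZE, sketch.length);
            region.put("sketch-42:part-" + i, Arrays.copyOfRange(sketch, from, to));
        }
        // Record the parcel count so readers know how many parts to fetch and merge.
        region.put("sketch-42:parts", Integer.toString(parcels).getBytes());

        cache.close();
    }
}
```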
Cheers.
Is there a way to calculate how many RUs I would need if a DocumentDB database is expected to have roughly 800 writes a second and 1500 reads a second?
Each read is a simple retrieval based on the index, and each item will have about 15 small data fields (a few bools, short strings, and short doubles).
Each write will be an update of most of the data values for the record.
The documentation states 1 RU = a 1 KB GET. Each GET in this instance should be less than 1 KB, I suspect, so the reads would be about 1500 RU/s, but I have no idea how to calculate the writes; any help would be greatly appreciated.
There's a simple-to-use capacity planning tool available online. You can simply upload a sample JSON document, specify how many reads and writes per second you expect, and it will estimate your required RU/s throughput.
As David so eloquently pointed out, this should only be used as a starting point to give you a ballpark of what your minimum RU cost might be. If your primary read pattern was simply retrieving documents directly by their Id then it might be relatively accurate. In reality, RU is calculated based on the complexity of your queries. So once you have your baseline it's important to do proper analysis of your query patterns and get a feel for their RU cost.
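As a very rough back-of-envelope before you even open the planner: commonly cited figures in the Cosmos DB docs are about 1 RU for a point read of a ~1 KB item and roughly 5 RUs for a write of a ~1 KB item with default indexing, so 1500 reads/s × 1 RU + 800 writes/s × 5 RU ≈ 5,500 RU/s as a starting ballpark. Treat that strictly as a lower-bound estimate; indexing policy, document size, and consistency level can move the real numbers considerably.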
Luckily, the ease and speed with which you can scale Cosmos in response to load is one of its most compelling features, in my opinion. In my experience, adding or removing RU throughput is done within a matter of seconds, so you can definitely add a layer of intelligent database tuning within your application to optimize your cost and usage.
I am a new user of ArangoDB and I am currently evaluating it for my project.
Can someone please tell me what the maximum number of databases you can create in ArangoDB is?
Thanks.
As far as I know, there are virtually no limits to the number of databases in ArangoDB.
The only thing you have to keep in mind is the resources that are needed for databases and their collections.
Some of those resources, for each Database / Collection, are:
Files on disk: space and file descriptors needed.
Memory: each database / collection will also take up memory when loaded (in addition to the space it uses on disk).
For a collection, the number of file descriptors needed at any time depends on the journal size defined for it. If the journal size is big, fewer files are needed, and therefore fewer file descriptors (and their associated resources).
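For illustration, creating a database is just an API call; what you actually budget for is the resources listed above. A minimal sketch with the ArangoDB Java driver (host, credentials, and database names are placeholders; check the exact method names against your driver version):

```java
import com.arangodb.ArangoDB;

public class ManyDatabases {
    public static void main(String[] args) {
        ArangoDB arango = new ArangoDB.Builder()
                .host("localhost", 8529)    // placeholder server
                .user("root").password("")  // placeholder credentials
                .build();

        // Creating databases is just an API call; the practical limit is the
        // disk space, memory and file descriptors each one ends up consuming.
        for (int i = 0; i < 100; i++) {
            String name = "tenant_" + i;
            if (!arango.getDatabases().contains(name)) {
                arango.createDatabase(name);
            }
        }
        System.out.println("databases: " + arango.getDatabases().size());
        arango.shutdown();
    }
}
```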
There is also a nice blog post on disk space usage, here. It is a bit older, and might not be accurate now, but it should give you a general idea.
https://www.arangodb.com/2012/07/collection-disk-usage-arangodb/
Regarding journal-sizes and performance, you should also look at this:
https://www.arangodb.com/2012/09/performance-different-journal-sizes/