Is there a limit to the number of document sets you can create in one document library?
A list can have up to 30 million items and a library can have up to 30 million files and folders. Views can have up to 12 lookup columns.
Reference here.
That looks like a large limit; however, there is a downside. Even if you have, say, 20,000 files, a list view will only show up to 5,000 of them, usually the latest 5,000 files.
Reference here.
There is a way to increase the threshold; you can check here.
----------Updates----------
As mentioned here, a document set is a group of related documents treated as a single entity, so in theory it should share the same limits as regular files. Also, from the first reference: file size is limited to less than 15 GB per file, and files attached to list items can be up to 250 MB in size. These limits should apply to document sets as well.
There is no limit to the number of document sets you can create in a document library.
Please use this link as a reference
For example, your NSFetchedResultsController (FRC) fetches a news feed and groups the articles into sections by date of publication.
You then want to limit each section to at most 10 articles.
One option I’ve considered is having separate NSFetchedResultsControllers for each day and setting a fetch limit. But that seems unnecessary as the UI only really needs a single FRC (not to mention that the number of days is unbounded).
Edit:
I’m using a diffable data source snapshot.
If it were me, I'd leave the NSFetchedResultsController alone for this and handle it in the table view. Implement tableView(_:numberOfRowsInSection:) so that it never returns a value greater than 10. Then the table will never ask for more than 10 rows in a section, and your UI will be as you want.
Since I'm using a diffable data source snapshot, I am able to take the snapshot I receive in the FRC delegate callback and use it to create a new snapshot, keeping only the first K items in each section.
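Roughly, the trimming step looks like this (sketched in plain Java just to show the logic; the actual code rebuilds an NSDiffableDataSourceSnapshot in Swift, and the section/item types here are placeholders):

```java
// Illustration only: cap every section of an already-grouped, ordered collection
// at `limit` items, preserving section order and item order within each section.
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SectionTrimmer {

    static <S, T> Map<S, List<T>> trim(Map<S, List<T>> sections, int limit) {
        Map<S, List<T>> trimmed = new LinkedHashMap<>();
        for (Map.Entry<S, List<T>> entry : sections.entrySet()) {
            List<T> items = entry.getValue();
            // Keep only the first `limit` items of this section.
            trimmed.put(entry.getKey(),
                    new ArrayList<>(items.subList(0, Math.min(limit, items.size()))));
        }
        return trimmed;
    }
}
```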
I am trying to ingest about 13k JSON documents into Azure Search, but the index stops at around 6k documents without any indexer error. The index storage size is 7.96 MB, and it never surpasses this limit no matter what I do.
I have tried using smaller batches of 3k/indexer and after that 1k/indexer, but I got the same result.
In my JSON I have around 10 simple fields and 20 complex fields (which contain other nested complex fields, up to 5 levels deep).
Do you have any idea whether there is a size limit per index, and where I can configure it?
As for the SLA/pricing tier, I think we are using the S1 plan (based on the limits we have: 50 indexers, and so on).
Thanks
It's really hard to help without seeing it, but I remember facing a problem like this in the past. In my case, it was caused by duplicate values in the key field.
I also recommend using smaller batches (~500 documents).
PS: Take a look at whether your nested JSON objects are too big (especially if they are marked as retrievable).
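If you are pushing documents yourself, a rough sketch of the smaller-batch idea using the azure-search-documents Java SDK could look like the following (the endpoint, key, and index name are placeholders, not from your setup); if a pull indexer does the ingestion instead, the equivalent knob is the indexer's batchSize parameter:

```java
// Hedged sketch: upload documents in batches of ~500 instead of one huge request.
import java.util.List;
import java.util.Map;

import com.azure.core.credential.AzureKeyCredential;
import com.azure.search.documents.SearchClient;
import com.azure.search.documents.SearchClientBuilder;

public class BatchedUpload {

    public static void uploadInBatches(List<Map<String, Object>> docs) {
        SearchClient client = new SearchClientBuilder()
                .endpoint("https://<your-service>.search.windows.net")
                .credential(new AzureKeyCredential("<admin-key>"))
                .indexName("<index-name>")
                .buildClient();

        int batchSize = 500;
        for (int i = 0; i < docs.size(); i += batchSize) {
            List<Map<String, Object>> batch =
                    docs.subList(i, Math.min(i + batchSize, docs.size()));
            // Documents that share the same key overwrite each other rather than
            // being added, which can make the final document count look low.
            client.uploadDocuments(batch);
        }
    }
}
```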
Since there is a size limit in Cosmos DB for a single entry of data, how can I add data larger than 2 MB as a single entry?
The 2MB limit is a hard-limit, not expandable. You'll need to work out a different model for your storage. Also, depending on how your data is encoded, it's likely that the actual limit will be under 2MB (since data is often expanded when encoded).
If you have content within an array (the typical reason why a document would grow so large), consider refactoring this part of your data model (perhaps store references to other documents, within the array, vs the subdocuments themselves). Also, with arrays, you have to deal with an "unbounded growth" situation: even with documents under 2MB, if the array can keep growing, then eventually you'll run into a size limit issue.
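As a rough illustration of the "store references instead of subdocuments" idea (the Order/OrderLine names and fields below are made up, not your actual model), the refactor could look something like this:

```java
import java.util.List;

// Before: the parent document embeds every line item, so the array (and the document)
// can grow without bound and eventually hit the 2 MB limit.
class OrderWithEmbeddedLines {
    public String id;
    public String customerId;
    public List<OrderLine> lines;       // unbounded array inside one document
}

// After: each line item is its own small document, and the parent only keeps references.
class Order {
    public String id;
    public String customerId;
    public List<String> lineItemIds;    // references, not subdocuments
}

class OrderLine {
    public String id;
    public String orderId;              // could also serve as the partition key
    public String productId;
    public int quantity;
}
```

Keeping the line items in the same partition as their order (for example by partitioning on orderId) lets you still read them together with a single query, while each document stays well under the size limit.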
While performing a simple Get operation by Id where a single document is returned (not an array with one document) I get the following x-ms-resource-usage:
x-ms-resource-usage:documentSize:0;documentsSize:288;collectionSize=307;
Questions:
Why is documentSize 0?
What is the unit of measure? Bytes?
What is the difference between documentSize and documentsSize? Please note the query only returns one document.
What is the collectionSize? Is that the total number of documents in the collection?
What is the difference between x-ms-resource-usage and x-ms-resource-quota?
I'm fairly sure the numbers are as follows, and all in KB:
documentSize: Size of the document
documentsSize: Combined size of all documents in collection
collectionSize: Combined size of all documents in collection, along with overhead such as indexes
x-ms-resource-usage is about consumed resources within the collection, while x-ms-resource-quota is going to give you your limits. So with quota, you'll see documentsSize and collectionSize both set to something like 10485760, which is 10 GB (10,485,760 KB).
documentSize and documentsSize are the same value; the first is in MB and the second in KB (which is why a 288 KB document shows documentSize as 0). Apparently, documentSize is being deprecated.
collectionSize = documentsSize + metadata (in KB)
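If you need the numbers programmatically, here is a small sketch of parsing the header value, assuming the semicolon-separated pairs shown in the question (it accepts both ':' and '=' as separators, since the sample uses both):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ResourceUsageHeader {

    // Parse e.g. "documentSize:0;documentsSize:288;collectionSize=307;" into name -> value.
    static Map<String, Long> parse(String headerValue) {
        Map<String, Long> usage = new LinkedHashMap<>();
        for (String pair : headerValue.split(";")) {
            if (pair.isEmpty()) continue;
            String[] kv = pair.split("[:=]", 2);
            if (kv.length == 2) {
                usage.put(kv[0].trim(), Long.parseLong(kv[1].trim()));
            }
        }
        return usage;
    }

    public static void main(String[] args) {
        Map<String, Long> usage = parse("documentSize:0;documentsSize:288;collectionSize=307;");
        // Per the answers above, documentsSize and collectionSize are reported in KB.
        System.out.println("documentsSize = " + usage.get("documentsSize") * 1024 + " bytes");
    }
}
```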
I have created a search project based on Lucene 4.5.1.
There are about 1 million documents, each of them a few KB, and I index them with the fields docname (stored), lastmodified, and content. The overall size of the index folder is about 1.7 GB.
I used one document (the original one) as a sample and queried its content against the index. The problem now is that each query comes back slowly. After some tests, I found that my queries are too large even though I removed stopwords, but I have no idea how to reduce the query string size. Plus, the smaller the query string is, the less accurate the results become.
This is not limited to a specific file; I also tested with other original files, and search performance is relatively slow (often 1-8 seconds).
Also, I have tried copying the entire index directory to a RAMDirectory for searching; that didn't help.
In addition, I share a single IndexSearcher across multiple threads, but in testing I only used one thread as a benchmark; the expected response time should be a few ms.
So, how can I improve search performance in this case?
Hint: I'm retrieving the top 1000 hits.
If the number of fields is large, a nice solution is to not store them individually but instead serialize the whole object into a single binary stored field.
The plus is that when projecting the object back out after a query, you read a single field rather than many. getField(name) iterates over the document's whole field list (O(n/2) on average) before you even get the values and set your object's fields; with one field you just read it and deserialize.
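A minimal sketch of that, assuming Lucene 4.5 and plain Java serialization (the field name "blob" and the article type are placeholders, not from your code):

```java
import java.io.*;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.util.BytesRef;

public class BlobFieldExample {

    // Serialize the whole object into one binary stored field instead of many stored fields.
    static Document toLuceneDoc(Serializable article) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(article);
        }
        Document doc = new Document();
        doc.add(new StoredField("blob", bytes.toByteArray()));
        // Indexed-but-not-stored fields for searching (e.g. a TextField for "content")
        // would still be added here.
        return doc;
    }

    // Read the single field back and deserialize, instead of calling getField() many times.
    static Object fromLuceneDoc(Document doc) throws IOException, ClassNotFoundException {
        BytesRef ref = doc.getBinaryValue("blob");
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(ref.bytes, ref.offset, ref.length))) {
            return in.readObject();
        }
    }
}
```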
Second, it might be worth looking at something like a MoreLikeThis query. See https://stackoverflow.com/a/7657757/277700
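MoreLikeThis builds a compact query from a document's most significant terms instead of sending the whole document text back as a query, which keeps the query small without hand-pruning it. A hedged sketch for Lucene 4.5 (the index path, the "content" field name, the doc id, and the tuning values are assumptions about your setup):

```java
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queries.mlt.MoreLikeThis;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class MoreLikeThisExample {
    public static void main(String[] args) throws Exception {
        IndexReader reader = DirectoryReader.open(FSDirectory.open(new File("/path/to/index")));
        IndexSearcher searcher = new IndexSearcher(reader);

        MoreLikeThis mlt = new MoreLikeThis(reader);
        mlt.setAnalyzer(new StandardAnalyzer(Version.LUCENE_45));
        mlt.setFieldNames(new String[] { "content" });
        mlt.setMinTermFreq(2);      // drop terms that occur only once in the source doc
        mlt.setMinDocFreq(5);       // drop terms that are too rare across the index
        mlt.setMaxQueryTerms(25);   // cap the query size, which is what keeps it fast

        // Internal doc id of the "original" document; the content field needs term
        // vectors (or to be stored) for like(docId) to extract terms from it.
        Query query = mlt.like(0);
        TopDocs hits = searcher.search(query, 1000);
        System.out.println("total hits: " + hits.totalHits);

        reader.close();
    }
}
```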