MongoEngine: measure document size to not exceed the 16 MB limit

I am creating this post because I am looking for a way to measure a MongoEngine object's size in MB within a Python program, before adding anything to the object and saving it to the DB. My idea is simply to make sure the object's size does not exceed the MongoDB document limit (16 MB) before appending anything else to one of its list fields.
Do you have any advice?
Thanks in advance.
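A minimal sketch of one common approach, using pymongo's bson module (installed alongside MongoEngine) to serialize the document exactly as the driver would on save; the Log class, its field, and the appended payload are hypothetical:

from mongoengine import Document, ListField, StringField, connect
import bson  # ships with pymongo, which MongoEngine depends on

MAX_BSON_SIZE = 16 * 1024 * 1024  # MongoDB's hard per-document limit

class Log(Document):  # hypothetical document with a growing list field
    entries = ListField(StringField())

def bson_size_bytes(doc):
    # to_mongo() yields the SON dict MongoEngine would send to the server;
    # encoding it with the driver's bson module gives the real stored size.
    return len(bson.BSON.encode(doc.to_mongo()))

connect("mydb")
log = Log(entries=["first entry"]).save()
new_entry = "some payload"
# Keep some headroom for BSON field overhead before appending and saving.
if bson_size_bytes(log) + len(new_entry) + 64 < MAX_BSON_SIZE:
    log.entries.append(new_entry)
    log.save()

Note that re-encoding on every append costs time proportional to the document's size, so for documents near the limit you may prefer to track a running size estimate instead.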

Related

Is there a way to increase the attachment file size on Azure DevOps Services Wiki

I have tried finding a setting but can't find an option to increase the attachment size from 18 MB.
I am afraid there is no way to increase the attachment size beyond 18 MB.
You could refer to this doc: File naming conventions
Restriction type        Restriction
-----------------------------------------------------------
File size               Must not exceed the maximum of 18 MB
Attachment file size    Must not exceed the maximum of 19 MB
This is a limitation of Azure DevOps, so there is no way to increase it for the time being.
But the requirement makes sense; you could refer to the following suggestion ticket on our UserVoice site: File size limits for wiki are restricting
You can also create a new suggestion ticket based on your requirement.

Can retrieve only 20 documents from a folder

I have a SpringCM folder that has thousands of small files in it. I'm retrieving them this way:
GET /v201411/folders/{id}/documents
but when it executes, I get back only 20 files. The sum of all their sizes is 1.8 MB, and the Content-Length in the response headers is only 3.8 MB.
I didn't find anything in their documentation that mentions a limit on retrieving documents via the API.
Is this really a limitation of SpringCM?
From the documentation on API Collections:
Limit (integer): The maximum number of elements retrieved per request.
Default limit is 20. Maximum limit is 100
When there are more items in the collection than the specified limit, the application can page through the collection, retrieving the objects in chunks by specifying the limit and/or offset on the query string when the collection is requested. The first, previous, next, and last properties are added as a convenience by appending the appropriate limit and offset to the URI; a GET request can be made to the URIs specified by these properties to navigate the collection.
To minimize the number of sequential calls you need to make, you can adjust the limit property up to the max, 100.
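As a sketch of that paging loop using Python's requests library; the base URL, the auth header, and the Items/Total fields in the response body are assumptions about the collection shape, not verified SpringCM specifics:

import requests

BASE = "https://api.springcm.com/v201411"    # hypothetical base URL
HEADERS = {"Authorization": "Bearer <token>"}

def fetch_all_documents(folder_id):
    # Page through the collection 100 items at a time (the documented max).
    items, offset, limit = [], 0, 100
    while True:
        resp = requests.get(f"{BASE}/folders/{folder_id}/documents",
                            params={"limit": limit, "offset": offset},
                            headers=HEADERS)
        resp.raise_for_status()
        page = resp.json()
        items.extend(page.get("Items", []))
        if offset + limit >= page.get("Total", 0):
            return items  # collection exhausted
        offset += limit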

Liferay: huge DLFileRank table

I have a Liferay 6.2 server that has been running for years and is starting to take a lot of database space, despite limited actual content.
Table           Size   Number of rows
--------------------------------------
DLFileRank      5 GB   16 million
DLFileEntry     90 MB  60,000
JournalArticle  2 GB   100,000
The size of the DLFileRank table seems abnormally large to me (if it is actually normal, please let me know).
While the file ranking feature of Liferay is nice to have, we would not really mind resetting it if it halves the size of the database.
Question: Would a DELETE FROM DLFileRank be safe? (Stop Liferay, run that SQL command, maybe set dl.file.rank.enabled=false in portal-ext.properties, then start Liferay again.)
Is there any better way to do it?
Bonus if there is a way to keep recent ranking data and throw away only the old data (not a strong requirement).
Wow. According to the documentation here (Ctrl-F rank), I'd not have expected the number of entries to be so high - did you configure those values differently?
Set the interval in minutes on how often CheckFileRankMessageListener will run to check for and remove file ranks in excess of the maximum number of file ranks to maintain per user per file. Defaults:
dl.file.rank.check.interval=15
Set this to true to enable file ranks for document library files. Defaults:
dl.file.rank.enabled=true
Set the maximum number of file ranks to maintain per user per file. Defaults:
dl.file.rank.max.size=5
And according to the implementation of CheckFileRankMessageListener, it should be enough to trigger DLFileRankLocalServiceUtil.checkFileRanks() yourself (e.g. through the scripting console, as sketched below). Why you accumulate that large a number of file ranks is beyond me...
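For instance, a minimal sketch of that call from the script console, which in Liferay 6.2 also accepts Python (Jython); the package path is the 6.2 one and is worth verifying on a test instance first:

# Server Administration > Script console, language set to Python (Jython)
from com.liferay.portlet.documentlibrary.service import DLFileRankLocalServiceUtil

# Removes file ranks in excess of dl.file.rank.max.size per user per file,
# the same cleanup CheckFileRankMessageListener runs on its interval.
DLFileRankLocalServiceUtil.checkFileRanks()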
As you might know, I will never be quoted as saying that direct database manipulation is the way to go; in fact, I refuse to think about the problem that way.

How do I compute the memory required to create an IgniteRDD

Can anyone please help me understand how the memory calculation is done to store a String, a long variable, and an int variable in an IgniteRDD?
I have been reading forums and finding different answers, and I am totally confused about how to calculate the memory requirement for my application.
I am trying to size 48 billion records with 2 Strings, 1 int, and 2 long variables each.
Any help will be greatly appreciated.
Regards,
D V Nithin
See https://apacheignite.readme.io/docs/capacity-planning
Also, for estimating memory size, I would recommend loading data of various sizes several times and measuring the memory it consumes. That will help you understand the approximate memory you need. The memory metrics will help you with this: https://apacheignite.readme.io/docs/memory-metrics
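Before measuring, a back-of-envelope sketch can give a first-pass number; every per-field and overhead figure below is an assumption, since the real cost depends on the JVM, the Ignite version, and your configuration:

# Rough estimate for 48 billion records of (2 Strings, 1 int, 2 longs).
records = 48_000_000_000

avg_string_payload = 32   # assumed average bytes of content per String
string_overhead = 40      # assumed JVM object + char[] header cost per String
int_bytes = 4
long_bytes = 8
entry_overhead = 200      # assumed Ignite cache-entry + key overhead

per_record = (2 * (avg_string_payload + string_overhead)
              + int_bytes + 2 * long_bytes + entry_overhead)
total_gib = records * per_record / 2**30
print(f"~{per_record} B/record, ~{total_gib:,.0f} GiB in total")
# The capacity planning guide also suggests extra headroom (e.g. ~30%)
# for indexes and internal structures.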

Couchdb views crashing for large documents

CouchDB keeps crashing whenever I try to build the index of the views of a design document that emits values for large documents. The total size of the database is 40 MB, and I guess the documents are about 5 MB each. We're talking about large JSON documents without any attachments.
What concerns me is that I have 2.5 GB of free RAM before trying to access the views, but as soon as I try to access them, the CPU usage rises to 99% and all the free RAM gets eaten by erl.exe before the indexing fails with exit code 1.
Here is the log:
[info] 2016-11-22T22:07:52.263000Z couchdb#localhost <0.212.0> -------- couch_proc_manager <0.15603.334> died normal
[error] 2016-11-22T22:07:52.264000Z couchdb#localhost <0.15409.334> b9855eea74 rexi_server throw:{os_process_error,{exit_status,1}} [{couch_mrview_util,get_view,4,[{file,"src/couch_mrview_util.erl"},{line,56}]},{couch_mrview,query_view,6,[{file,"src/couch_mrview.erl"},{line,244}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,139}]}]
Views that skip these documents can be accessed without issue. What general guidelines could you give me to help with this kind of situation? I am using CouchDB 2.0 on Windows.
Many thanks
Update: I tried limiting the number of view server instances to 1 and varying the max RAM allowed for couchjs, but it keeps crashing. I also noticed that even though CouchDB is supposed to pass only one document at a time to the view server, erl.exe keeps eating all the available RAM (3 GB used to update three 5 MB docs...). Initially I thought this could be because of the multiple couchjs instances, but apparently this isn't the case.
Update: I made some progress; now the indexing progresses well for just under 10 minutes, then erl.exe crashes. I have posted the dump here (to clarify, "well" means 99% CPU usage and a completely frozen screen).
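One general guideline for views over documents this large, not a confirmed fix for this crash, is to emit only the small fields you need and fetch the full document on demand with include_docs, since large emitted values inflate what the view server and indexer must hold. A sketch against CouchDB's HTTP API; the database URL, credentials, and the date field are hypothetical:

import requests

DB = "http://localhost:5984/mydb"    # hypothetical database URL
AUTH = ("admin", "password")         # hypothetical credentials

# Emit a small scalar per document instead of large values.
design_doc = {
    "_id": "_design/reports",
    "views": {
        "by_date": {
            "map": "function (doc) { if (doc.date) { emit(doc.date, null); } }"
        }
    },
}
requests.put(f"{DB}/{design_doc['_id']}", json=design_doc, auth=AUTH).raise_for_status()

# Retrieve full documents on demand instead of emitting them in the view:
rows = requests.get(f"{DB}/_design/reports/_view/by_date",
                    params={"include_docs": "true", "limit": 10},
                    auth=AUTH).json()["rows"]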
