Microsoft.WindowsAzure.Storage.Table.CloudTable.ExecuteBatchAsync() truncates message - azure

When I call this method with a large EntityProperty (around 17 KB of text), it truncates the string.
I know that Azure Table storage has a limit of 64 KB per column (property) and 1 MB per entire row (entity).
Any insights?

Apart from those size restrictions, there is also a size restriction on the entity group transaction that the ExecuteBatchAsync method performs, namely:
The transaction can include at most 100 entities, and its total
payload may be no more than 4 MB in size.
Ref: http://msdn.microsoft.com/en-us/library/azure/dd894038.aspx
Please ensure that your payload size is less than 4 MB.
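For illustration only, here is a minimal C# sketch of splitting operations into entity group transactions of at most 100 entities before calling ExecuteBatchAsync. The helper name and the use of InsertOrReplace are my own choices, and a real implementation would also need to estimate the serialized entity sizes to stay under the 4 MB payload limit:
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Table;

// Illustrative helper: all entities in one batch must share the same partition key,
// as required by an entity group transaction.
async Task InsertInChunksAsync(CloudTable table, IEnumerable<ITableEntity> entities)
{
    var batch = new TableBatchOperation();
    foreach (var entity in entities)
    {
        batch.InsertOrReplace(entity);
        if (batch.Count == 100)           // batch is full: flush it and start a new one
        {
            await table.ExecuteBatchAsync(batch);
            batch = new TableBatchOperation();
        }
    }
    if (batch.Count > 0)                  // flush the remaining operations
    {
        await table.ExecuteBatchAsync(batch);
    }
}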

Related

Can I increase the size of the column "Statement" in Azure Log Analytics

I have a KQL statement used to extract some data from Azure Log Analytics. My problem is that Azure Log Analytics seems to truncate SQL statements longer than 4,000 characters. For the audited server, many of the queries written by users are longer than 4,000 characters. Can I increase the size of the column "Statement" somehow?
Thank you
Can I increase the size of the column "Statement" somehow?
Azure Log Analytics has a limit on the size of collected log messages.
If you want to add a large message as a custom property, you can use TrackTrace; a custom property value can be up to 8,192 characters long.
// logs a message of more than 4,000 characters to the Application Insights logs
// (messageWithMoreThan4000Characters is an illustrative variable name)
telemetryClient.TrackTrace(messageWithMoreThan4000Characters);
The maximum allowed length of the trace message itself is 32,768 characters, while items in the properties collection are limited to 8,192 characters per value (max key length: 150 characters; max value length: 8,192 characters).
Refer to the Microsoft documentation for the length limits of the Application Insights data model for each type of telemetry.
Reference
Follow the steps given by @cijothomas to add messages longer than 8 K characters to Application Insights.
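As a rough sketch (assuming an already-configured TelemetryClient; the variable names and the substring handling are illustrative, not part of any documented API):
using System;
using System.Collections.Generic;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.DataContracts;

TelemetryClient telemetryClient = new TelemetryClient(); // assumes instrumentation is already configured
string longStatement = new string('x', 10000);           // stand-in for the audited SQL text

// The trace message itself may be up to 32,768 characters.
telemetryClient.TrackTrace(longStatement);

// Values in the custom properties collection are limited to 8,192 characters
// (and keys to 150), so longer text is better carried in the message itself.
telemetryClient.TrackTrace(
    "Audited SQL statement",
    SeverityLevel.Information,
    new Dictionary<string, string>
    {
        ["Statement"] = longStatement.Substring(0, Math.Min(longStatement.Length, 8192))
    });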

Can retrieve only 20 documents from a folder

I have a SpringCM folder that has thousands of small files in it. I'm retrieving them this way:
GET /v201411/folders/{id}/documents
but when it executes, I get back only 20 files. The sum of all of their sizes is 1.8 MB, and the Content-Length in the response headers is only 3.8 MB.
I didn't find anything in their documentation that mentions a limit on retrieving documents via the API.
Is that really a limitation of SpringCM?
From the documentation on API Collections:
Limit (integer): The maximum number of elements retrieved per request.
Default limit is 20. Maximum limit is 100
When there are more items in the collection than the specified limit,
the application can page through the collection, retrieving the
objects in chunks by specifying the limit and/or offset on the query
string when the collection is requested. The first, previous, next,
and last properties are added as a convenience by appending the
appropriate limit and offset to the URI, and a GET request can be made
to the URIs specified by these properties to navigate the collection.
To minimize the number of sequential calls you need to make, you can adjust the limit property up to the max, 100.
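A minimal paging sketch in C# (the base address, authentication, and the ParseDocuments helper are assumptions; the actual response shape would come from SpringCM's documentation):
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

// Pages through a folder 100 documents at a time by adjusting limit/offset.
async Task<List<string>> GetAllDocumentsAsync(HttpClient client, string folderId)
{
    var results = new List<string>();
    const int limit = 100;                            // maximum allowed per request
    for (int offset = 0; ; offset += limit)
    {
        var url = $"/v201411/folders/{folderId}/documents?limit={limit}&offset={offset}";
        var json = await client.GetStringAsync(url);
        var page = ParseDocuments(json);              // hypothetical JSON parsing helper
        results.AddRange(page);
        if (page.Count < limit)                       // fewer than 'limit' items: last page
            break;
    }
    return results;
}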

Cassandra batch prepared statement size warning

I see this warning continuously in Cassandra's debug.log:
WARN [SharedPool-Worker-2] 2018-05-16 08:33:48,585 BatchStatement.java:287 - Batch of prepared statements for [test, test1] is of size 6419, exceeding specified threshold of 5120 by 1299.
In this message:
6419 - input (batch) payload size
5120 - threshold size
1299 - number of bytes above the threshold
As per this Cassandra-related ticket, https://github.com/krasserm/akka-persistence-cassandra/issues/33, I see that it is due to an increase in the input payload size, so I increased commitlog_segment_size_in_mb in cassandra.yaml to 60 MB and we are no longer seeing this warning.
Is this warning harmful? Will increasing commitlog_segment_size_in_mb affect performance in any way?
This is not directly related to the commit log size, and I wonder why changing it made the warning disappear...
The batch size threshold is controlled by the batch_size_warn_threshold_in_kb parameter, which defaults to 5 KB (5120 bytes).
You can increase this parameter to a higher value, but you really need a good reason for using batches in the first place - it would be nice to understand the context in which they are used...
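For reference, the threshold lives in cassandra.yaml; an illustrative excerpt (raising it only silences the warning, it does not make large batches any cheaper):
# cassandra.yaml
# default: 5 KB (5120 bytes); the batch in the warning above was 6419 bytes
batch_size_warn_threshold_in_kb: 5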
commitlog_segment_size_in_mb represents the block size for commit log archiving or point-in-time backup. These are only active if you have configured archive_command or restore_command in your commitlog_archiving.properties file.
The default size is 32 MB.
As per the Expert Apache Cassandra Administration book:
you must ensure that the value of commitlog_segment_size_in_mb is twice the value of max_mutation_size_in_kb.
You can take this as a reference:
Mutation of 17076203 bytes is too large for the maximum size of 16777216
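An illustrative cassandra.yaml excerpt showing that relationship (max_mutation_size_in_kb defaults to half of the commit log segment size, which is why a 17,076,203-byte mutation fails against the 16,777,216-byte limit of a 32 MB segment):
# cassandra.yaml
commitlog_segment_size_in_mb: 32        # default; 32 MB / 2 = 16 MB max mutation
# max_mutation_size_in_kb: 16384        # defaults to half the segment size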

Liferay: huge DLFileRank table

I have a Liferay 6.2 server that has been running for years and is starting to take a lot of database space, despite limited actual content.
Table            Size     Number of rows
----------------------------------------
DLFileRank       5 GB     16 million
DLFileEntry      90 MB    60,000
JournalArticle   2 GB     100,000
The size of the DLFileRank table seems abnormally large to me (if it is actually normal, please let me know).
While the file ranking feature of Liferay is nice to have, we would not really mind resetting it if it halves the size of the database.
Question: Would a DELETE FROM DLFileRank be safe? (stop Liferay, run that SQL command, maybe set dl.file.rank.enabled=false in portal-ext.properties, start Liferay again)
Is there any better way to do it?
Bonus if there is a way to keep recent ranking data and throw away only the old data (not a strong requirement).
Wow. According to the documentation here (Ctrl-F "rank"), I would not have expected the number of entries to be so high - did you configure those values differently?
Set the interval in minutes on how often CheckFileRankMessageListener
will run to check for and remove file ranks in excess of the maximum
number of file ranks to maintain per user per file. Defaults:
dl.file.rank.check.interval=15
Set this to true to enable file rank for document library files.
Defaults:
dl.file.rank.enabled=true
Set the maximum number of file ranks to maintain per user per file.
Defaults:
dl.file.rank.max.size=5
And according to the implementation of CheckFileRankMessageListener, it should be enough to just trigger DLFileRankLocalServiceUtil.checkFileRanks() yourself (e.g. through the scripting console). Why you accumulate such a large number of file ranks is beyond me...
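If it helps, a minimal sketch of both steps (the fully qualified class name is the Liferay 6.2 one; run the first line from the Control Panel's script console, e.g. in Groovy):
// Script console (Groovy): trigger the cleanup that CheckFileRankMessageListener would run
com.liferay.portlet.documentlibrary.service.DLFileRankLocalServiceUtil.checkFileRanks()

# portal-ext.properties: stop recording new file ranks afterwards
dl.file.rank.enabled=false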
As you might know, I will never be quoted as saying that direct database manipulation is the way to go - in fact, I refuse to think about the problem that way.

Cassandra 2.0 eating disk space

I am using Cassandra in my app and it started eating up disk space much faster than I expected, and much faster than the manual suggests. Consider this very simple example:
CREATE TABLE sizer (
  id ascii,
  time timestamp,
  value float,
  PRIMARY KEY (id, time)
) WITH compression = {'sstable_compression': ''};
I am turning off compression on purpose to see how many bytes each record takes.
Then I insert a few values, run nodetool flush, and check the size of the data file on disk to see how much space it took.
The results show a huge waste of space: each record takes 67 bytes, and I am not sure how that is possible.
My id is 13 bytes long and is saved only once in the data file, since it is always the same for testing purposes.
According to: http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/architecture/architecturePlanningUserData_t.html
Size should be:
timestamp should be 8 bytes
value as column name takes 6 bytes
column value float takes 4 bytes
column overhead 15 bytes
TOTAL: 33 bytes
For testing's sake, my id is always the same, so I actually have only 1 row, if I understood correctly.
So my question is: how do I end up using 67 bytes instead of 33?
The data file size is consistent - I tried inserting 100, 1,000 and 10,000 records, and the size is always 67 bytes per record.
There are three overheads discussed in the linked documentation. One is the column overhead, which you have accounted for. The second is the row overhead. And if you have a replication_factor greater than 1, there is overhead for that as well.
