I am running an AWS Lambda function that fetches data from MariaDB and returns the fetched rows as a JSON object.
The JSON array contains a total of 64K items.
I am getting this error:
{ "error": "body size is too long" }
Is there a way to send all 64K rows by making a configuration change to Lambda?
You cannot send all 64K rows (which exceed the 6 MB body payload size limit) by making configuration changes to Lambda. A few alternative options:
Query the data and build a JSON file with all the rows in the /tmp directory inside Lambda (up to 512 MB), upload it to S3, and return a CloudFront signed URL to access the data (see the sketch after this list).
Split the dataset into multiple pages and do multiple queries.
Use an EC2 instance or ECS instead of Lambda.
Note: Depending on the purpose of the queried data, its size, and so on, different mechanisms can be used to make efficient use of other AWS services.
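Here is a minimal sketch of the first option, assuming the AWS SDK for JavaScript v2 available in the Lambda runtime. For brevity it returns an S3 presigned URL instead of the CloudFront signed URL mentioned above; the bucket name and key are placeholders.

```javascript
// Minimal sketch: write rows to /tmp, upload to S3, return a time-limited URL.
const fs = require('fs');
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

async function storeRowsAndGetUrl(rows) {
  const filePath = '/tmp/rows.json';                 // Lambda /tmp (up to 512 MB)
  fs.writeFileSync(filePath, JSON.stringify(rows));

  const params = { Bucket: 'my-export-bucket', Key: 'exports/rows.json' }; // placeholders
  await s3.upload({ ...params, Body: fs.createReadStream(filePath) }).promise();

  // Return a short-lived URL the caller can use to download the full result set.
  return s3.getSignedUrlPromise('getObject', { ...params, Expires: 900 });
}
```

The Lambda response then contains only the URL, which stays far below the 6 MB limit.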
This error indicates that your response exceeds the maximum (6 MB), which is the largest payload AWS Lambda can return.
http://docs.aws.amazon.com/lambda/latest/dg/limits.html
It seems that you're hitting the hard limit of a maximum 6 MB response size. As it's a hard limit, there's unfortunately no way to increase it.
You'll need to set up your Lambda to send at most 6 MB per invocation and paginate through the rows you need to retrieve across multiple invocations until you've fetched all 64K.
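For illustration, here is a minimal sketch of a paginated handler, assuming the mysql2 client and a page query-string parameter; the table, column, and environment variable names are placeholders.

```javascript
const mysql = require('mysql2/promise');

const PAGE_SIZE = 1000; // tune so that one page stays well under 6 MB

exports.handler = async (event) => {
  const page = Number((event.queryStringParameters || {}).page || 0);
  const conn = await mysql.createConnection({
    host: process.env.DB_HOST,
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD,
    database: process.env.DB_NAME,
  });
  try {
    const [rows] = await conn.query(
      'SELECT * FROM my_table ORDER BY id LIMIT ? OFFSET ?',
      [PAGE_SIZE, page * PAGE_SIZE]
    );
    // The caller requests the next page until fewer than PAGE_SIZE rows come back.
    return {
      statusCode: 200,
      body: JSON.stringify({ page, rows, hasMore: rows.length === PAGE_SIZE }),
    };
  } finally {
    await conn.end();
  }
};
```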
Sources:
https://docs.aws.amazon.com/lambda/latest/dg/limits.html#limits-list
https://forums.aws.amazon.com/thread.jspa?threadID=230229
I am trying to develop a Node.js function on Google Cloud Functions that reads a CSV file on Cloud Storage, converts its character encoding, and writes the encoded CSV file back to the storage.
The function generates the file on the storage successfully when the target file is small (15 KB). However, if the file is large (>100 MB), the function generates nothing.
Is there an upper limit to the file size when reading and writing files with Node.js on Google Cloud Functions?
If you know how to deal with this problem, I would appreciate it if you could let me know.
Though Google Sheets is completely free from Google, it still has some limitations.
The Google help page lists the limitations but not the file size limit itself; from these limits you can work out that the file size limit is somewhere around 20 MB, though it varies depending on the field data types you have (string, integer, and so on).
Google Sheets limitations as of 26-Jul-2021:
Up to 5 million cells or 18,278 columns (column ZZZ) for spreadsheets that are created in or converted to Google Sheets.
Up to 5 million cells or 18,278 columns for spreadsheets imported from Microsoft Excel. The limits are the same for Excel and CSV imports.
If any one cell has more than 50,000 characters, that single cell will not be uploaded.
With a file size of more than 100 MB, you would have already crossed some of these limits. Google Cloud sometimes returns such errors for your requests; you might notice them in the Google Sheets responses to your batch requests.
From this documentation it is clear that Google Cloud Functions has a quota limitation of 10 MB for data sent to HTTP functions in an HTTP request, data sent from HTTP functions in an HTTP response, and data sent in events to background functions. This Stack Overflow thread shows how you can compress the files to avoid this limit.
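As one illustration of the compression idea, here is a minimal sketch of an HTTP Cloud Function that gzips its response body with Node's built-in zlib; the function name and payload are placeholders, and it assumes the caller accepts gzip-encoded responses.

```javascript
const zlib = require('zlib');

// HTTP Cloud Function (Express-style req/res signature).
exports.encodeCsv = (req, res) => {
  const payload = JSON.stringify({ rows: [] }); // placeholder for the real data

  // Gzip the body so the HTTP response stays well under the 10 MB quota.
  const compressed = zlib.gzipSync(payload);
  res.set('Content-Type', 'application/json');
  res.set('Content-Encoding', 'gzip');
  res.status(200).send(compressed);
};
```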
Recently I have been using DynamoDB to build my service. I use provisioned mode for my DynamoDB table.
In order to test how DynamoDB reacts, I set both the read capacity units and write capacity units to only 1. In addition, I inserted 20 items, totalling about 27 KB, into my table. I use the Scan method with the ReturnConsumedCapacity parameter. When I test it with Postman, the result shows that it consumes 2.5 capacity units!
Why does DynamoDB not reject my request? I only assigned 1 to both the RCU and WCU! Doesn't that mean it should only be able to read at most 4 KB of data in one second?
This is a screenshot of the Postman result.
Reference -
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html#HowItWorks.ProvisionedThroughput.Manual
One read request unit represents one strongly consistent read request, or two eventually consistent read requests, for an item up to 4 KB in size. Transactional read requests require 2 read request units to perform one read for items up to 4 KB. If you need to read an item that is larger than 4 KB, DynamoDB needs additional read request units. The total number of read request units required depends on the item size, and whether you want an eventually consistent or strongly consistent read. For example, if your item size is 8 KB, you require 2 read request units to sustain one strongly consistent read, 1 read request unit if you choose eventually consistent reads, or 4 read request units for a transactional read request.
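For illustration, here is a minimal sketch of that arithmetic in Node.js; the function name and the rounding into 4 KB blocks are my own framing, while the multipliers come from the quoted documentation.

```javascript
// Read request units = ceil(itemSizeKB / 4) blocks, scaled by the read type.
function readRequestUnits(itemSizeKB, readType = 'strong') {
  const blocks = Math.ceil(itemSizeKB / 4);
  const multiplier = { eventual: 0.5, strong: 1, transactional: 2 }[readType];
  return blocks * multiplier;
}

console.log(readRequestUnits(8, 'strong'));        // 2
console.log(readRequestUnits(8, 'eventual'));      // 1
console.log(readRequestUnits(8, 'transactional')); // 4
```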
I have a requirement to read multiple files (105 files) from ADLS (Azure Data Lake Storage), parse them, and then add the parsed data directly to multiple collections in Azure Cosmos DB for MongoDB API. All this needs to be done in one request. The average file size is 120 KB.
The issue is that after multiple documents are added, an error is raised: "request size limit too large".
Please let me know if anyone has any input on this.
It's unclear how you're performing multi-document inserts but... You can't increase the maximum request size. You'll need to perform individual inserts, or insert in smaller batches.
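A minimal sketch of the smaller-batches approach, assuming the MongoDB Node.js driver and an already-parsed array of documents; the batch size of 100 is an arbitrary example, not a Cosmos DB rule.

```javascript
const { MongoClient } = require('mongodb');

async function insertInBatches(uri, dbName, collectionName, docs, batchSize = 100) {
  const client = new MongoClient(uri);
  try {
    await client.connect();
    const collection = client.db(dbName).collection(collectionName);
    // Insert the documents in smaller batches instead of one oversized request.
    for (let i = 0; i < docs.length; i += batchSize) {
      await collection.insertMany(docs.slice(i, i + batchSize));
    }
  } finally {
    await client.close();
  }
}
```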
I am querying Google Cloud data using BigQuery.
When I run the query, it returns about 8 million rows.
But it throws this error:
Response too large to return
How can I get all 8 million records? Can anybody help?
1. What is the maximum size of a BigQuery response?
As mentioned in the quota policy, the maximum query response size is 128 MB compressed (unlimited when returning large query results).
2. How do we select all the records in a query request rather than via the export method?
If you plan to run a query that might return larger results, you can set allowLargeResults to true in your job configuration (a sketch follows the list below).
Queries that return large results will take longer to execute, even if the result set is small, and are subject to additional limitations:
You must specify a destination table.
You can't specify a top-level ORDER BY, TOP or LIMIT clause. Doing so negates the benefit of using allowLargeResults, because the query output can no longer be computed in parallel.
Window functions can return large query results only if used in conjunction with a PARTITION BY clause.
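A minimal sketch of such a job configuration, assuming the @google-cloud/bigquery Node.js client; the dataset, table, and query are placeholders, and useLegacySql is set because allowLargeResults applies to legacy SQL jobs.

```javascript
const { BigQuery } = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

async function runLargeResultQuery() {
  const [job] = await bigquery.createQueryJob({
    query: 'SELECT * FROM [my_dataset.my_table]',           // legacy SQL syntax
    useLegacySql: true,
    allowLargeResults: true,                                 // opt in to >128 MB results
    destination: bigquery.dataset('my_dataset').table('large_results'), // required
  });
  await job.promise(); // wait for completion; results land in the destination table
  return job;
}
```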
Read more about how to paginate through the results here, and also see the BigQuery Analytics book, starting from page 200, where it is explained how Jobs::getQueryResults works together with the maxResults parameter and its blocking mode.
Update:
Query Result Size Limitations: When you run a normal query in BigQuery, the response size is limited to 128 MB of compressed data. Sometimes, it is hard to know what 128 MB of compressed data means. Does it get compressed 2x? 10x? The results are compressed within their respective columns, which means the compression ratio tends to be very good. For example, if you have one column that is the name of a country, there will likely be only a few different values. When you have only a few distinct values, this means that there isn't a lot of unique information, and the column will generally compress well. If you return encrypted blobs of data, they will likely not compress well because they will be mostly random. (This is explained in the book linked above, on page 220.)
Try this:
Under the query window, there is a button 'Show Options'. Click it and you will see some options:
select or create a new destination table;
check 'Allow Large Results';
run your query and see whether it works.
I'm working with the Node.js AWS SDK for S3. I have a zip file output stream that I'd like to upload to an S3 bucket. It seems simple enough reading the docs, but I noticed there are optional part size and queue size parameters. I was wondering: what exactly are these? Should I use them? If so, how do I determine appropriate values? Much appreciated.
This is a late response.
Multiple parts can be queued and sent in parallel; the size of these parts is controlled by the partSize parameter.
The queueSize parameter is how many parts can be processed concurrently.
The max memory usage is partSize * queueSize, so I think the values you are looking for depend on the memory available on your machine.
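A minimal sketch of how these options are passed, assuming the AWS SDK for JavaScript v2 and a readable zip stream; the 10 MB part size and queue size of 4 are example values, not recommendations, and peak memory would be roughly partSize * queueSize (about 40 MB here).

```javascript
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

function uploadZipStream(zipStream, bucket, key) {
  return s3.upload(
    { Bucket: bucket, Key: key, Body: zipStream },
    { partSize: 10 * 1024 * 1024, queueSize: 4 } // multipart tuning options
  ).promise();
}
```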