Node JS - Big Query update table with a request object - node.js

Based on below request object I need to update the table
var reqBody = {
"Name":"testing11",
"columns":[
{
"fieldExistsIn":"BOTH",
"columnWidth":5,
"hide":false
},
{
"fieldExistsIn":"BOTH",
"columnWidth":10,
"hide":false
}
],
"Range":{
"startDate":"20-Oct-2022",
"endDate":"26-Oct-2022"
}
}
UPDATE table_name
SET requestData = reqBody
WHERE requestData.Name = reqBody.oldName;
I am doing the insertion using the below query
await bigquery
.dataset(datasetId)
.table(tableId)
.insert(reqBody);
For the table schema, you can refer to the question below:
Node JS - Big Query insert to a request object fully into a record data type

As per the table schema and sample data you have provided, I tried to replicate it on my end.
Table schema:
Following your requirement, I modified the code based on the queryParamsStructs and query samples from the Google BigQuery Node.js client API. To update multiple columns in a table from a JSON object through the BigQuery client, write an UPDATE query and pass it to the client. Pass the JSON object in params and reference it in the UPDATE query as a named parameter, as shown below:
const {BigQuery} = require('@google-cloud/bigquery');
const bigquery = new BigQuery();
async function query() {
// Updates the nested reqData record using the struct passed in params.
const query = `UPDATE
\`ProjectID.DatasetID.TableID\`
SET reqData.columns = ARRAY(
SELECT AS STRUCT * FROM UNNEST(@reqData.columns)
),
reqData.Range.startDate = CAST(@reqData.Range.startDate AS DATE),
reqData.Range.endDate = CAST(@reqData.Range.endDate AS DATE)
WHERE reqData.Name = @reqData.Name`;
// For all options, see https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query
const options = {
query: query,
// Location must match that of the dataset(s) referenced in the query.
location: 'US',
params: {
"reqData": {
"Name": "testing",
"columns": [
{
"fieldExistsIn": "One",
"columnWidth": 1,
"hide": true
},
{
"fieldExistsIn": "four",
"columnWidth": 4,
"hide": true
}
],
"Range":{
"startDate": "2021-1-11",
"endDate": "2022-2-22"
}
}
},
};
// Run the query as a job
const [job] = await bigquery.createQueryJob(options);
console.log(`Job ${job.id} started.`);
// Wait for the query to finish
const [rows] = await job.getQueryResults();
// Print the results
console.log('Rows:');
rows.forEach(row => console.log(row));
}
query();
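One thing to watch: the request body in the question carries dates such as 20-Oct-2022, while the query above casts the parameter strings to DATE. Here is a minimal sketch, assuming the incoming dates always use the DD-Mon-YYYY shape shown in the question, that converts them to YYYY-MM-DD before they go into params. The names toIsoDate and normalizeRange are illustrative helpers, not part of the BigQuery client.

```javascript
// Hypothetical helpers (not part of @google-cloud/bigquery).
const MONTHS = {
  jan: '01', feb: '02', mar: '03', apr: '04', may: '05', jun: '06',
  jul: '07', aug: '08', sep: '09', oct: '10', nov: '11', dec: '12',
};

// Converts "20-Oct-2022" to "2022-10-20"; throws on any other shape.
function toIsoDate(value) {
  const match = /^(\d{1,2})-([A-Za-z]{3})-(\d{4})$/.exec(value);
  if (!match) {
    throw new Error(`Unexpected date format: ${value}`);
  }
  const [, day, mon, year] = match;
  const month = MONTHS[mon.toLowerCase()];
  if (!month) {
    throw new Error(`Unknown month in date: ${value}`);
  }
  return `${year}-${month}-${day.padStart(2, '0')}`;
}

// Returns a copy of the request body with both Range dates normalized.
function normalizeRange(reqBody) {
  return {
    ...reqBody,
    Range: {
      startDate: toIsoDate(reqBody.Range.startDate),
      endDate: toIsoDate(reqBody.Range.endDate),
    },
  };
}
```

You would then pass normalizeRange(reqBody) as the reqData value in params instead of the raw request body.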
Initial Data:
Updated Result:
Note: If rows were written to the table recently with a streaming insert, they cannot be modified with UPDATE, DELETE or MERGE for up to 30 minutes. Refer to this limitations doc for more information. If you try to update such rows within that window, you will get the error below:
UnhandledPromiseRejectionWarning: Error: UPDATE or DELETE statement over table projectID.datasetID.tableID would affect rows in the streaming buffer, which is not supported
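Since that error surfaces as an unhandled rejection, it may help to catch it explicitly. This is a rough sketch, assuming the error message always contains the phrase "streaming buffer" as in the output above. isStreamingBufferError and tryUpdate are made-up names, and runQuery stands for whatever function actually submits your UPDATE.

```javascript
// Hypothetical helper: recognizes the streaming-buffer error by its message text.
function isStreamingBufferError(err) {
  return Boolean(
    err && typeof err.message === 'string' && err.message.includes('streaming buffer')
  );
}

// Runs the supplied async function and reports a streaming-buffer failure
// as a result object instead of an unhandled rejection.
// Any other error is rethrown unchanged.
async function tryUpdate(runQuery) {
  try {
    await runQuery();
    return {updated: true};
  } catch (err) {
    if (isStreamingBufferError(err)) {
      return {updated: false, reason: 'rows are still in the streaming buffer'};
    }
    throw err;
  }
}
```

The caller can then decide whether to retry later or report the problem to the user.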

Related

How to get data from DynamoDB sorted by timestamp ? - NodeJs

I want to get data from DynamoDB, sorted by timestamp. Can anyone help? My code is given below.
const AWS = require("aws-sdk");
const dynamoDbClient = new AWS.DynamoDB.DocumentClient();
const USERS_TABLE = process.env.USERS_TABLE;
const getNews = async (req, res) => {
try {
//dynamodb params
const params = {
TableName: USERS_TABLE,
FilterExpression: "PK = :this",
ExpressionAttributeValues: { ":this": "newsTable" },
};
//get dynamodb data
const data = await dynamoDbClient.scan(params).promise();
res.status(200).send({ data: data });
} catch (e) {
return res.status(400).send({ message: e.message });
}
};
module.exports = { getNews };
Option 1: Keep Scan; Sort client-side
Works for small tables only. A single Scan call reads at most the first 1 MB of data in the table.
If you use the Scan operation as in your code example, DynamoDB cannot return the results sorted. The only way to sort them is client-side, after you have downloaded all the data from the database.
Replace:
res.status(200).send({ data: data });
With:
// assumes each item has a numeric date attribute (e.g. epoch milliseconds)
res.status(200).send({data: data.Items.sort((a, b) => b.date - a.date)});
However, this is not recommended, since a Scan without pagination reads only the first 1 MB of data in your table, so you could get partial results. Possible solutions are:
Option 2: (recommended) Don't use Scan; Use Query; Sort by sort key
This works if your timestamp is in the sort key of the table.
Don't use Scan; use Query. That way you can sort your data by SK (sort key) by passing ScanIndexForward: false to get the most recent results first.
Assuming you have a table schema like this, where a timestamp is in the sort key:
PK          SK           email
newsTable   2022-01-01   some-1@example.com
newsTable   2022-02-01   some-2@example.com
newsTable   2022-03-01   some-3@example.com
You can change your code from:
const params = {
TableName: USERS_TABLE,
FilterExpression: 'PK = :this',
ExpressionAttributeValues: {':this': 'newsTable'},
};
//get dynamodb data
const data = await dynamoDbClient.scan(params).promise();
To:
const params = {
TableName: USERS_TABLE,
KeyConditionExpression: 'PK = :this',
ExpressionAttributeValues: {':this': 'newsTable'},
ScanIndexForward: false,
};
//get dynamodb data
const data = await dynamoDbClient.query(params).promise();
And it will return results already sorted by the database.
If you don't have a timestamp in your sort key and cannot add one, you can add a Local Secondary Index (only possible when the table is created) or a Global Secondary Index.
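For the index case, the Query params only need an extra IndexName. Below is a small sketch, assuming the index keeps PK as its partition key and uses the timestamp attribute as its sort key. buildNewestFirstQuery and the index name in the usage line are illustrative, not something from your table.

```javascript
// Hypothetical helper: builds DocumentClient query params that read one
// partition newest-first, optionally through a secondary index.
function buildNewestFirstQuery({tableName, partitionValue, indexName, limit}) {
  const params = {
    TableName: tableName,
    KeyConditionExpression: 'PK = :pk',
    ExpressionAttributeValues: {':pk': partitionValue},
    ScanIndexForward: false, // descending order of the sort key
  };
  if (indexName) {
    params.IndexName = indexName; // query the index instead of the base table
  }
  if (limit) {
    params.Limit = limit; // cap the number of items evaluated per call
  }
  return params;
}

// Usage (index name is an assumption):
// const data = await dynamoDbClient
//   .query(buildNewestFirstQuery({
//     tableName: USERS_TABLE,
//     partitionValue: 'newsTable',
//     indexName: 'byTimestamp',
//   }))
//   .promise();
```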
Option 3: Keep Scan, but paginate; Sort client-side
Works if you cannot change DB schema and cannot switch your code to the Query operation.
Beware: it will be much more expensive and much slower. The larger the table, the slower it gets.
If you absolutely need to use Scan, you need to paginate through all the pages of the Scan operation and then sort the results in your JS code, as described above. I've developed a handy library that scans in parallel and supports pagination.
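If you prefer not to pull in a library, the pagination loop can be sketched by hand. It relies on the LastEvaluatedKey / ExclusiveStartKey pair from the Scan API. The names scanAll and sortNewestFirst are my own, and the sort assumes the timestamp attribute holds ISO date strings or numbers, so that plain comparison gives chronological order.

```javascript
// Collects every page of a Scan. `client` is expected to look like
// AWS.DynamoDB.DocumentClient (v2): scan(params).promise().
async function scanAll(client, params) {
  const items = [];
  let lastKey;
  do {
    // Only send ExclusiveStartKey once DynamoDB has handed us one.
    const pageParams = lastKey ? {...params, ExclusiveStartKey: lastKey} : params;
    const page = await client.scan(pageParams).promise();
    items.push(...(page.Items || []));
    lastKey = page.LastEvaluatedKey; // undefined on the final page
  } while (lastKey);
  return items;
}

// Returns a new array sorted newest-first by the given attribute.
function sortNewestFirst(items, field) {
  return [...items].sort((a, b) => {
    if (a[field] < b[field]) return 1;
    if (a[field] > b[field]) return -1;
    return 0;
  });
}

// Usage inside the handler from the question (attribute name is an assumption):
// const items = await scanAll(dynamoDbClient, params);
// res.status(200).send({data: sortNewestFirst(items, 'timestamp')});
```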

knex js query many to many

I'm having trouble with Node and knex.js.
I'm trying to build a mini blog with posts, and to add the ability to attach multiple tags to a post.
I have a POST model with following properties:
id SERIAL PRIMARY KEY NOT NULL,
name TEXT,
Second I have Tags model that is used for storing tags:
id SERIAL PRIMARY KEY NOT NULL,
name TEXT
And I have many to many table: Post Tags that references post & tags:
id SERIAL PRIMARY KEY NOT NULL,
post_id INTEGER NOT NULL REFERENCES posts ON DELETE CASCADE,
tag_id INTEGER NOT NULL REFERENCES tags ON DELETE CASCADE
I have managed to insert tags and create a post with tags,
but I'm having trouble when I want to fetch Post data with the Tags attached to that post.
Here is the problem:
const data = await knex.select('posts.name as postName', 'tags.name as tagName')
.from('posts')
.leftJoin('post_tags', 'posts.id', 'post_tags.post_id')
.leftJoin('tags', 'tags.id', 'post_tags.tag_id')
.where('posts.id', id)
Following query returns this result:
[
{
postName: 'Post 1',
tagName: 'Youtube',
},
{
postName: 'Post 1',
tagName: 'Funny',
}
]
But I want the result to be formatted and returned like this:
{
postName: 'Post 1',
tagName: ['Youtube', 'Funny'],
}
Is that even possible with a query, or do I have to format the data manually?
One way of doing this is to use some kind of aggregate function. If you're using PostgreSQL:
const data = await knex.select('posts.name as postName', knex.raw('ARRAY_AGG (tags.name) tags'))
.from('posts')
.innerJoin('post_tags', 'posts.id', 'post_tags.post_id')
.innerJoin('tags', 'tags.id', 'post_tags.tag_id')
.where('posts.id', id)
.groupBy("postName")
.orderBy("postName")
.first();
->
{ postName: 'post1', tags: [ 'tag1', 'tag2', 'tag3' ] }
For MySQL:
const data = await knex.select('posts.name as postName', knex.raw('GROUP_CONCAT(tags.name) as tags'))
.from('posts')
.innerJoin('post_tags', 'posts.id', 'post_tags.post_id')
.innerJoin('tags', 'tags.id', 'post_tags.tag_id')
.where('posts.id', id)
.groupBy("postName")
.orderBy("postName")
.first()
.then(res => Object.assign(res, { tags: res.tags.split(',')}))
MySQL has no array type, and GROUP_CONCAT just concatenates all tags into one string, so we need to split it manually.
->
RowDataPacket { postName: 'post1', tags: [ 'tag1', 'tag2', 'tag3' ] }
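Two edge cases are worth guarding in that .then step. With the inner joins, a post without tags produces no row at all, so .first() resolves to undefined. A tag that itself contains a comma would also be split in the wrong place. Here is a sketch of a more defensive version. parseTags is a made-up name, and the '||' separator in the usage line is an assumption that only works if you also pass the same string to GROUP_CONCAT through its SEPARATOR clause.

```javascript
// Hypothetical helper: turns the GROUP_CONCAT string into an array without
// mutating the row, and tolerates a missing row or a NULL tags column.
function parseTags(row, separator = ',') {
  if (!row) {
    return null; // nothing matched the query
  }
  const tags = row.tags ? String(row.tags).split(separator) : [];
  return {...row, tags};
}

// Usage (sketch):
//   knex.raw("GROUP_CONCAT(tags.name SEPARATOR '||') as tags")
//   ...
//   .first()
//   .then(row => parseTags(row, '||'));
```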
The result is correct as that is how SQL works - it returns rows of data. SQL has no concept of returning anything other than a table (think CSV data or Excel spreadsheet).
There are some interesting things you can do with SQL that can convert the tags to strings that you concatenate together but that is not really what you want. Either way you will need to add a post-processing step.
With your current query you can simply do something like this:
function formatter (result) {
let set = {};
result.forEach(row => {
if (set[row.postName] === undefined) {
set[row.postName] = row;
set[row.postName].tagName = [set[row.postName].tagName];
}
else {
set[row.postName].tagName.push(row.tagName);
}
});
return Object.values(set);
}
// ...
query.then(formatter);
This shouldn't be slow as you're only looping through the results once.
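A variant of the same single-pass idea, in case you would rather not mutate the rows returned by knex. It keeps the postName / tagName shape from the question and, because the original query uses left joins, treats a null tagName as "post without tags". groupTagsByPost is an illustrative name. It groups by postName like the formatter above, so it assumes post names are unique; selecting posts.id and grouping on that would be safer.

```javascript
// Groups flat (postName, tagName) rows into one object per post.
function groupTagsByPost(rows) {
  const byPost = new Map();
  for (const {postName, tagName} of rows) {
    if (!byPost.has(postName)) {
      byPost.set(postName, {postName, tagName: []});
    }
    // A left join yields tagName = null for posts that have no tags.
    if (tagName !== null && tagName !== undefined) {
      byPost.get(postName).tagName.push(tagName);
    }
  }
  return [...byPost.values()];
}

// With the rows from the question:
const grouped = groupTagsByPost([
  {postName: 'Post 1', tagName: 'Youtube'},
  {postName: 'Post 1', tagName: 'Funny'},
]);
console.log(grouped); // [ { postName: 'Post 1', tagName: [ 'Youtube', 'Funny' ] } ]
```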

Dynamo DB Query Filter Node.js

Running a Node.js serverless backend through AWS.
Main objective: to filter and list all LOCAL jobs (table items) that included the available services and zip codes provided to the filter.
I'm passing in multiple zip codes and multiple available services.
data.radius would be an array of zip codes, something like this: [ '93901', '93902', '93905', '93906', '93907', '93912', '93933', '93942', '93944', '93950', '95377', '95378', '95385', '95387', '95391' ]
data.availableServices would also be an array, something like this: ['Snow removal', 'Ice Removal', 'Salting', 'Same Day Response']
I am trying to make an API call that returns only items that have a matching zipCode from the array of zip codes provided by data.radius, and the packageSelected has a match of the array data.availableServices provided.
API CALL
import * as dynamoDbLib from "./libs/dynamodb-lib";
import { success, failure } from "./libs/response-lib";
export async function main(event, context) {
const data = JSON.parse(event.body);
const params = {
TableName: "jobs",
FilterExpression: "zipCode = :radius, packageSelected = :availableServices",
ExpressionAttributeValues: {
":radius": data.radius,
":availableServices": data.availableServices
}
};
try {
const result = await dynamoDbLib.call("query", params);
// Return the matching list of items in response body
return success(result.Items);
} catch (e) {
return failure({ status: false });
}
Do I need to map the array of zip codes and available services first for this to work?
Should I be using comparison operators?
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LegacyConditionalParameters.QueryFilter.html
Is a sort key value or partition key required to query and filter? (The table has a sort key and a partition key, but I would like to avoid using them in this call.)
I'm not 100% sure how to go about this, so if anyone could point me in the right direction, that would be wonderful and greatly appreciated!
I'm not sure what your dynamodb-lib refers to but here's an example of how you can scan for attribute1 in a given set of values and attribute2 in a different set of values. This uses the standard AWS JavaScript SDK, and specifically the high-level document client.
Note that you cannot use an equality (=) test here; you have to use an inclusion (IN) test. And you cannot use query, because a query requires the partition key value; you must use scan.
const AWS = require('aws-sdk');
let dc = new AWS.DynamoDB.DocumentClient({'region': 'us-east-1'});
const data = {
radius: [ '93901', '93902', '93905', '93906', '93907', '93912', '93933', '93942', '93944', '93950', '95377', '95378', '95385', '95387', '95391' ],
availableServices: ['Snow removal', 'Ice Removal', 'Salting', 'Same Day Response'],
};
// These hold ExpressionAttributeValues
const zipcodes = {};
const services = {};
data.radius.forEach((zipcode, i) => {
zipcodes[`:zipcode${i}`] = zipcode;
})
data.availableServices.forEach((service, i) => {
services[`:services${i}`] = service;
})
// These hold FilterExpression attribute aliases
const zipcodex = Object.keys(zipcodes).toString();
const servicex = Object.keys(services).toString();
const params = {
TableName: "jobs",
FilterExpression: `zipCode IN (${zipcodex}) AND packageSelected IN (${servicex})`,
ExpressionAttributeValues : {...zipcodes, ...services},
};
dc.scan(params, (err, data) => {
if (err) {
console.log('Error', err);
} else {
for (const item of data.Items) {
console.log('item:', item);
}
}
});
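The placeholder-building part can be pulled into a small reusable function. This is only a sketch: buildInClause and buildJobsScanParams are names I made up, and the upper bound of 100 reflects my understanding of the documented limit on the number of values in an IN list, so please check it against the current DynamoDB docs.

```javascript
// Hypothetical helper: builds "attr IN (:p0, :p1, ...)" plus the matching
// ExpressionAttributeValues entries for one attribute.
function buildInClause(attribute, prefix, values) {
  if (!Array.isArray(values) || values.length === 0 || values.length > 100) {
    throw new RangeError(`${attribute}: expected between 1 and 100 values`);
  }
  const attributeValues = {};
  values.forEach((value, i) => {
    attributeValues[`:${prefix}${i}`] = value;
  });
  const placeholders = Object.keys(attributeValues).join(', ');
  return {expression: `${attribute} IN (${placeholders})`, attributeValues};
}

// Combines both IN tests into scan params for the jobs table.
function buildJobsScanParams(data) {
  const zip = buildInClause('zipCode', 'zip', data.radius);
  const svc = buildInClause('packageSelected', 'svc', data.availableServices);
  return {
    TableName: 'jobs',
    FilterExpression: `${zip.expression} AND ${svc.expression}`,
    ExpressionAttributeValues: {...zip.attributeValues, ...svc.attributeValues},
  };
}
```

Keep in mind that a filter is applied after the items are read, so the scan still reads the whole table; the filter only trims what is returned.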

How to pass query statement to bigquery in node.js environment

When running a BigQuery query, I want to insert a function's parameters
into the SQL statement as @variable names.
However, I could not find a method that supports this in Node.js.
For Python, there are methods like the following example.
You can use the function's parameters as @variable names.
query = """
SELECT word, word_count
FROM `bigquery-public-data.samples.shakespeare`
WHERE corpus = @corpus
AND word_count >= @min_word_count
ORDER BY word_count DESC;"""
query_params = [
bigquery.ScalarQueryParameter('corpus', 'STRING', 'romeoandjuliet'),
bigquery.ScalarQueryParameter('min_word_count', 'INT64', 250)]
job_config = bigquery.QueryJobConfig()
job_config.query_parameters = query_params
related document:
https://cloud.google.com/bigquery/docs/parameterized-queries#bigquery-query-params-python
I would like to ask for advice.
BigQuery node.js client supports parameterized queries when you pass them with the params key in options. Just updated the docs to show this. Hope this helps!
Example:
const sqlQuery = `SELECT word, word_count
FROM \`bigquery-public-data.samples.shakespeare\`
WHERE corpus = @corpus
AND word_count >= @min_word_count
ORDER BY word_count DESC`;
const options = {
query: sqlQuery,
// Location must match that of the dataset(s) referenced in the query.
location: 'US',
params: {corpus: 'romeoandjuliet', min_word_count: 250},
};
// Run the query
const [rows] = await bigquery.query(options);
let ip_chunk = "'1.2.3.4', '2.3.4.5', '10.20.30.40'"
let query = `
SELECT
ip_address.ip as ip,
instance.zone as zone,
instance.name as vmName,
instance.p_name as projectName
FROM
\`${projectId}.${datasetId}.${tableId}\` instance,
UNNEST(field_x.DATA.some_info) ip_address
WHERE ip_address.networkIP IN (${ip_chunk})`
Use WHERE ip_address.networkIP IN (${ip_chunk})
instead of WHERE ip IN (${ip_chunk}).
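Interpolating the IP list into the SQL string works, but the same lookup can also go through params with an array parameter and IN UNNEST(...), which avoids building quoted strings by hand. This is a sketch that only builds the options object. buildIpLookupOptions is an illustrative name, the table and field names are copied from the snippet above, and it assumes a non-empty array, since the client cannot infer the element type of an empty one without an explicit types entry.

```javascript
// Hypothetical helper: builds query options that pass the IPs as an
// array parameter instead of splicing them into the SQL text.
function buildIpLookupOptions({projectId, datasetId, tableId, ips}) {
  if (!Array.isArray(ips) || ips.length === 0) {
    throw new Error('ips must be a non-empty array');
  }
  const query = `
SELECT
  ip_address.ip AS ip,
  instance.zone AS zone,
  instance.name AS vmName,
  instance.p_name AS projectName
FROM
  \`${projectId}.${datasetId}.${tableId}\` instance,
  UNNEST(field_x.DATA.some_info) ip_address
WHERE ip_address.networkIP IN UNNEST(@ips)`;
  return {
    query,
    // Location must match that of the dataset(s) referenced in the query.
    location: 'US',
    params: {ips},
  };
}

// Usage (sketch):
// const [rows] = await bigquery.query(buildIpLookupOptions({
//   projectId, datasetId, tableId,
//   ips: ['1.2.3.4', '2.3.4.5', '10.20.30.40'],
// }));
```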
It is worth adding that you can create a stored procedure and pass parameters the same way as the accepted answer shows.
const { BigQuery } = require('@google-cloud/bigquery');
async function testProc() {
const bigquery = new BigQuery();
const sql = "CALL `my-project.my-dataset.getWeather`(@dt);";
const options = {
query: sql,
params: {dt: '2022-09-01'},
location: 'US'
};
// Run the query; any error propagates to the caller's catch handler
const [rows] = await bigquery.query(options);
console.log(rows);
return rows;
}
testProc().catch((err) => { console.error(err.message); });

how to get one page data list and total count from database with knex.js?

I have a user table with some records(such as 100), how can I get one page data and total count from it when there are some where conditions?
I tried the following:
var model = knex.table('users').where('status',1).where('online', 1)
var totalCount = await model.count();
var data = model.offset(0).limit(10).select()
return {
totalCount: totalCount[0]['count'],
data: data
}
but I get
{
"totalCount": "11",
"data": [
{
"count": "11"
}
]
}
How can I get the data list without writing the where conditions twice? I don't want to do it like this:
var totalCount = await knex.table('users').where('status',1).where('online', 1).count();
var data = await knex.table('users').where('status',1).where('online', 1).offset(0).limit(10).select()
return {
totalCount: totalCount[0]['count'],
data: data
}
Thank you :)
You should probably use a higher-level library like Objection.js, which already has a convenience method for getting pages and the total count.
You can do it like this with knex:
// the query builder is mutable, so calling a method changes the builder's internal state
const query = knex('users').where('status',1).where('online', 1);
// by cloning original query you can reuse common parts of the query
const total = await query.clone().count();
const data = await query.clone().offset(0).limit(10);
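Wrapped into a reusable function, the clone approach could look like the sketch below. paginate is a made-up name. It takes any knex query builder that already carries the where conditions, and it assumes the count comes back under a count key, which the '* as count' alias is meant to guarantee. The value is passed through Number because PostgreSQL returns it as a string, as the "11" in the question shows.

```javascript
// Hypothetical helper: returns one page of rows plus the total count,
// reusing the same base query for both.
async function paginate(baseQuery, page, pageSize) {
  const offset = (page - 1) * pageSize; // page numbers start at 1
  const [countRows, data] = await Promise.all([
    baseQuery.clone().count('* as count'),
    baseQuery.clone().offset(offset).limit(pageSize),
  ]);
  return {totalCount: Number(countRows[0].count), data};
}

// Usage (sketch):
// const base = knex('users').where('status', 1).where('online', 1);
// const {totalCount, data} = await paginate(base, 1, 10);
```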

Resources