Most performant way to insert or read (if the record already exists) in Google Cloud Spanner - node.js

Assuming I have a cars table where vin is the primary key.
I want to insert a record (in a transaction) or read the record (if one already exists with the same PK).
What's the most performant way to insert the record or read it if one already exists with the same PK?
This is my current approach:
Case A: Record does not exist
1. Insert record
2. Return record
Case B: Record already exists
1. Attempt to insert record (fails, since a record with the same PK exists)
2. Check if the error is due to the record already existing
3. Read the existing record
4. Return record
const car = { vin: '123', make: 'honda', model: 'accord' };
const carsTable = spannerDatabase.table('cars');

spannerDatabase.runTransactionAsync(async (databaseTransaction) => {
  try {
    // Try to insert car
    await databaseTransaction.insert('cars', car);
    await databaseTransaction.commit();
    return car;
  } catch (error) {
    await databaseTransaction.end();
    // Spanner "row already exists" error (code 6, ALREADY_EXISTS):
    // the insert failed because there is already a record with the same vin (PK)
    if (error.code === 6) {
      // Since the record already exists, I want to read it and return it.
      // What's the most performant way to do this?
      const [existingRecords] = await carsTable.read({
        columns: ['vin', 'make', 'model'],
        keys: [car.vin],
        json: true,
      });
      return existingRecords[0];
    }
  }
});

As @skuruppu mentioned in the comment above, your current example is mostly fine for what you are describing. It does, however, implicitly assume a couple of things, as you are not executing the read and the insert in the same transaction. That means that the two operations together are not atomic, and other transactions might update or delete the record between your two operations.
Also, your approach assumes that scenario A (record does not exist) is the most probable. If that is not the case, and it is just as probable that the record does exist, then you should execute the read in the transaction before the write.
You should also do that if there are other processes that might delete the record. Otherwise, another process might delete the record after you try to insert it, but before you try to read it (outside the transaction).
The above is only really a problem if there are other processes that might delete or alter the record. If that is not the case, and also won't be in the future, this is only a theoretical problem.
So to summarize:
1. Your example is fine if scenario A is the most probable and no other process will ever delete any records in the cars table.
2. You should execute the read before the write, using the same read/write transaction for both operations, if any of the conditions in 1 are not true (see the sketch below).
3. The read operation that you are using in your example is the most efficient way to read a single row from a table.
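For reference, here is a minimal sketch of that read-before-write variant, assuming the same @google-cloud/spanner client, spannerDatabase, and car values as in the question:

const result = await spannerDatabase.runTransactionAsync(async (transaction) => {
  // Read first, inside the read/write transaction, so no other process
  // can delete or alter the row between the read and the write.
  const [rows] = await transaction.read('cars', {
    columns: ['vin', 'make', 'model'],
    keys: [car.vin],
    json: true,
  });
  if (rows.length > 0) {
    // The record already exists: release the transaction and return it.
    await transaction.end();
    return rows[0];
  }
  // The record does not exist: buffer the insert mutation and commit.
  transaction.insert('cars', car);
  await transaction.commit();
  return car;
});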

Related

What happens in CouchDB when I create an index repeatedly?

To implement sorting, in CouchDB we have to create an index (otherwise the corresponding Mango query fails). I haven't found a way to do this in Fauxton (if I have missed something, please comment on GitHub), so I've decided to create it programmatically. As I'm using couchdb-nano, I've added:
this.clientAuthPromise.then(async () => {
  try {
    await this.client.use('test_polling_storage').createIndex({
      index: {
        fields: [
          'isoDate',
        ],
      },
      name: 'test_polling_storage--time_index',
    })
    console.log('index created?')
  } catch (error) {
    console.log(`failed to create index:`, error)
  }
})
into the storage class constructor, where
this.clientAuthPromise = this.client.auth(connectionParams.auth.user, connectionParams.auth.password)
Now, on each run of the server, I'm getting index created?, so the createIndex method (which presumably POSTs to /db/_index) doesn't fail (and sorting works, too). But as I haven't found an index viewer in Fauxton either, I wonder what actually happens on each call of createIndex: does it create a new index? Does it rebuild the index? Or does it see that an index with that name already exists and do nothing? It's annoying to deal with this blindly, so please clarify or suggest a way to find out.
OK, as the docs suggest that the response will contain "created" or "exists", I've tried:
const result = await this.client.use('test_polling_storage').createIndex({
  ...
})
console.log('index created?', result.result)
This printed index created? exists, so I concluded that if the index was created before, it won't be re-created. It's not clear what will happen if I try to change the index, but at least now I have a means to find out.
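For completeness, the full check might look like this (a sketch, assuming the same couchdb-nano client and index definition as above):

const result = await this.client.use('test_polling_storage').createIndex({
  index: {
    fields: ['isoDate'],
  },
  name: 'test_polling_storage--time_index',
})
// result.result is 'created' on the first run and 'exists' on later
// runs, so repeated calls with the same name don't recreate the index.
console.log('index created?', result.result)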

TypeORM query builder update: get the updated result

I'm running a query-builder that updates multiple users based on last logged in date, meaning I don't know which users are getting updated. Here is my code, which does not provide information about which users it updated.
await getConnection()
  .createQueryBuilder()
  .update(User)
  .set({
    isDeactivated: true
  })
  .where('lastConnected < :someTimeAgo', { someTimeAgo })
  .andWhere('isDeactivated = :isDeactivated', { isDeactivated: false })
  .execute()
  .then(result => {
    // Result: UpdateResult { generatedMaps: [], raw: undefined }
  })
How can I access the updated data? Database is SQL Server.
Normally you cannot find out which rows were updated by an UPDATE statement in SQL, hence TypeORM cannot tell you.
Before going ahead, ask WHY you need to know, and whether you REALLY need to know. If, after careful consideration, you find you do need to know which rows were updated, there are several solutions:
In your code, find the users to be deactivated first, then update them one at a time, logging info on each one as you go.
Create a table in the database containing the user ids to be deactivated. Populate this table first: INSERT INTO deactivatedusers (userid) SELECT userid FROM users WHERE ... then run UPDATE users SET isDeactivated = 1 WHERE userid IN (SELECT userid FROM deactivatedusers) then to find which users were deactivated: SELECT userid FROM deactivatedusers and finally clear deactivatedusers ready for next time, either with DELETE FROM deactivatedusers or TRUNCATE TABLE deactivatedusers.
Since you are using MS SQL Server, it provides the OUTPUT clause (and OUTPUT INTO) specifically to do what you are asking (non-standard SQL, so it only works with this DBMS). If you decide to use this approach, you should write a stored procedure to do the update and return the updated data back to the caller, then call this stored proc from TypeORM. A raw-query sketch of the OUTPUT idea follows below.
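As an illustration only, here is a minimal sketch of the OUTPUT idea through a raw TypeORM query, assuming the users/userid names from the solution above and the @0 positional parameter style of the mssql driver (the stored-procedure route recommended above would wrap similar SQL):

// UPDATE ... OUTPUT returns the affected rows directly to the client,
// without a separate SELECT.
const deactivatedUsers = await getConnection().query(
  `UPDATE users
   SET isDeactivated = 1
   OUTPUT INSERTED.userid, INSERTED.lastConnected
   WHERE lastConnected < @0 AND isDeactivated = 0`,
  [someTimeAgo]
);
console.log(deactivatedUsers); // one object per deactivated user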

How to perform multiple inserts in a knex transaction with dependent validity checks between inserts?

I'm writing a multi-step knex transaction that needs to insert multiple records into a table. Each of these records needs to pass a validity check written in Node before it is inserted.
I would batch perform the validity checks then insert all at once except that two records might invalidate each other. I.e. if I insert record 1, record 2 might no longer be valid.
Unfortunately, I can't seem to query the database in a transaction-aware fashion. During the validity check of the second record, my queries (used for the validity checks) do not show that the first insert exists.
I'm using the trx (transaction) connection object rather than the base knex object. I expected this would fix it since the transaction connection object is supposed to be promise aware, but alas it did not.
await knex.transaction(async trx => {
  /* Perform validity checks and insert if valid. */
  /* Check each record slated to be created
   * against the rules to ensure validity. */
  const relationshipValidityChecks = await Promise.all(
    relationshipsToCreate.map(async r => {
      const obj = {
        fromId: r.fromId,
        toId: r.toId,
        type: r.type,
        ...(await relationshipIsValid( // returns an object with a validity boolean
          r.fromId,
          r.toId,
          r.type,
          trx // I can specify which connection object to use (trx/knex)
        ))
      };
      if (obj.valid) {
        await trx.raw(
          `
          insert into t1.relationship
            (from_id, to_id, type) values
            (?, ?, ?)
          returning relationship_key;
          `,
          [r.fromId, r.toId, r.type]
        );
      }
      return obj;
    })
  );
});
When I feed in two records that are valid by themselves but invalidate each other, the first record should be inserted and the second record should return an invalid error. The relationshipIsValid is somewhat complicated so I left it out, but I'm certain it works as expected because if I feed the aforementioned two records in separately (i.e. in two different endpoint calls) the second will return the invalid error.
Any help would be greatly appreciated. Thanks!
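One thing worth noting here: Promise.all over relationshipsToCreate.map starts every validity check before any insert has executed, so the second check cannot see the first insert regardless of the transaction. A minimal sketch of a sequential restructuring, assuming the same relationshipIsValid helper and trx object as above:

await knex.transaction(async trx => {
  const relationshipValidityChecks = [];
  // Process the records one at a time so that each validity check
  // runs after the inserts for all previous records.
  for (const r of relationshipsToCreate) {
    const obj = {
      fromId: r.fromId,
      toId: r.toId,
      type: r.type,
      ...(await relationshipIsValid(r.fromId, r.toId, r.type, trx))
    };
    if (obj.valid) {
      await trx.raw(
        'insert into t1.relationship (from_id, to_id, type) values (?, ?, ?) returning relationship_key;',
        [r.fromId, r.toId, r.type]
      );
    }
    relationshipValidityChecks.push(obj);
  }
});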

Node.js avoid db race condition with cluster/pm2

I have a Node application which runs in cluster mode with pm2.
I also have a function which checks if a specific row is in a db table. If the row is missing it creates the row otherwise a value is set and saved.
I only need one row for each combination of userId and groupId.
function someFunction(userId, groupId) {
  return Activation.findOne({ where: { userId: userId, groupId: groupId } })
    .then(activationObject => {
      if (!activationObject) {
        return Activation.create({ userId: userId, groupId: groupId, activationTime: sequelize.fn('NOW') })
      } else {
        activationObject.activationTime = sequelize.fn('NOW');
        return activationObject.save()
      }
    })
}
How can I avoid race conditions when running node in cluster mode?
Currently if first worker checks the row is available and the second checks at the same time both get no result and in the end we have two newly created rows instead of one.
I know that Sequelize provides a findOrCreate() method, but I wanted an easily understandable example.
The easiest way would be to add a UNIQUE constraint for the combination of userId and groupId with an ON CONFLICT REPLACE clause, and always create a new row instead of updating. This will cause a newly inserted row with the new activationTime to replace the old row.
You can additionally check the number of rows inserted to tell whether the insert succeeded or not.
Example: UNIQUE (userId, groupId) ON CONFLICT REPLACE
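Expressed in Sequelize, the same idea might look like the sketch below. This is only a sketch under assumptions: ON CONFLICT REPLACE is SQLite-specific syntax, so it uses a composite unique index plus upsert(), which Sequelize translates into the dialect's atomic insert-or-replace/on-conflict statement:

const { DataTypes } = require('sequelize');

const Activation = sequelize.define('Activation', {
  userId: DataTypes.INTEGER,
  groupId: DataTypes.INTEGER,
  activationTime: DataTypes.DATE,
}, {
  // The composite unique index lets the database itself reject
  // duplicate (userId, groupId) combinations.
  indexes: [{ unique: true, fields: ['userId', 'groupId'] }],
});

function someFunction(userId, groupId) {
  // upsert() is a single atomic statement, so two workers racing on
  // the same (userId, groupId) cannot create two rows.
  return Activation.upsert({ userId, groupId, activationTime: sequelize.fn('NOW') });
}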

How do I see output of SQL query in node-sqlite3?

I read all the documentation and this seemingly simple operation seems completely ignored throughout the entire README.
Currently, I am trying to run a SELECT query and console.log the results, but it is simply returning a database object. How do I view the results from my query in Node console?
exports.runDB = function() {
  db.serialize(function() {
    console.log(db.run('SELECT * FROM archive'));
  });
  db.close();
}
run does not have retrieval capabilities. You need to use all, each, or get.
According to the documentation for all:
Note that it first retrieves all result rows and stores them in memory. For queries that have potentially large result sets, use the Database#each function to retrieve all rows or Database#prepare followed by multiple Statement#get calls to retrieve a previously unknown amount of rows.
As an illustration:
db.all('SELECT url, rowid FROM archive', function(err, table) {
  console.log(table);
});
That will return all entries in the archive table as an array of objects.
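And for the row-at-a-time alternative the docs mention, a short sketch using Database#each with the same table:

// Each row is passed to the first callback as it is retrieved, so a
// large result set is never buffered in memory all at once.
db.each('SELECT url, rowid FROM archive', function(err, row) {
  if (err) throw err;
  console.log(row); // one row object per invocation
}, function(err, count) {
  console.log('done, rows retrieved: ' + count);
});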
