Using Node shortid / nanoid and checking for collisions in DB - node.js

I am working on a self-education project building a URL shortener in Node. I was going to use shortid, but that's been deprecated, so I switched to nanoid. My concern is the eventual possibility of a collision between a generated ID and an existing ID in the DB (via Knex). The concern isn't exactly
"crypto-grade", more of a functional issue with the app crashing due to a new short URL ID already existing in the DB. I have come up with the following hypothetical solution (not my actual code!). Does it seem like the right (best, efficient, beautiful, etc.) way of doing this?
// hypothetical, and assumes this runs inside an async function
let shortId = nanoid();
while (await knex('urls').where('shortID', shortId).first()) {
    shortId = nanoid();
}

You can create an ID based on a timestamp; this is roughly how MongoDB constructs its ObjectIds (they begin with a timestamp).
new Date().valueOf() //1606597014945
You can play with it and add letters, or shrink it with a different encoding.
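A rough sketch of that idea (not production code): encode the millisecond timestamp in base 36 and append a few random characters so two IDs generated in the same millisecond don't clash.
function timestampId() {
    var time = Date.now().toString(36);                 // current timestamp in base 36 (about 8 characters today)
    var noise = Math.random().toString(36).slice(2, 6); // 4 random base-36 characters
    return time + noise;
}
console.log(timestampId());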
Hope it helped you :)

Related

Getting database names from server

I want to do a simple thing: get the database names on a RavenDB server. Looks straightforward according to the docs (https://ravendb.net/docs/article-page/4.1/csharp/client-api/operations/server-wide/get-database-names), however I'm facing a chicken-and-egg problem.
The problem comes because I want to get the database names without knowing them in advance. The code in the docs works great, but it requires an active connection to a DocumentStore. And to get an active connection to a DocumentStore, it's mandatory to select a valid database. Otherwise I can't execute the GetDatabaseNamesOperation.
That makes me think that I'm missing something. Is there any way to get the database names without having to know at least one of them?
A database isn't mandatory to open a store. The following code works with no problems:
using (var store = new DocumentStore
{
    Urls = new[] { "http://live-test.ravendb.net" }
})
{
    store.Initialize();
    var dbs = store.Maintenance.Server.Send(new GetDatabaseNamesOperation(0, 25));
}
We send GetDatabaseNamesOperation to the ServerStore, which is common for all databases and holds common data (like database names).

Generating a unique key for dynamodb within a lambda function

DynamoDB does not have the option to automatically generate a unique key for you.
In examples I see people creating a UID out of a combination of fields, but is there a way to create a unique ID for data which does not have any combination of values that can act as a unique identifier? My question is specifically aimed at Lambda functions.
One option I see is to create a UUID based on the timestamp with a counter at the end, insert it (or check if it exists), and in case of duplication retry with an incremented counter until success. But this would mean that I could potentially run over the execution time limit of the Lambda function without creating an entry.
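A hedged sketch of that retry idea: a conditional write lets DynamoDB do the existence check and the insert in one call, so a collision just fails the condition instead of overwriting. The table name MyTable and the id scheme here are placeholders, not from the question.
var AWS = require('aws-sdk');
var documentClient = new AWS.DynamoDB.DocumentClient();

async function putWithUniqueId(item, attempts) {
    for (var i = 0; i < (attempts || 5); i++) {
        var id = Date.now() + '-' + i; // timestamp plus retry counter, as described above
        try {
            await documentClient.put({
                TableName: 'MyTable',
                Item: Object.assign({ id: id }, item),
                ConditionExpression: 'attribute_not_exists(id)' // fail rather than overwrite on a duplicate key
            }).promise();
            return id;
        } catch (err) {
            if (err.code !== 'ConditionalCheckFailedException') throw err; // only retry on key collisions
        }
    }
    throw new Error('no free id found within the retry budget');
}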
If you are using Node.js 8.x, you can use the uuid module.
var AWS = require('aws-sdk'),
    uuid = require('uuid'),
    documentClient = new AWS.DynamoDB.DocumentClient();
[...]
Item: {
    "id": uuid.v1(),
    "Name": "MyName"
},
If you are using Node.js 10.x, you can use awsRequestId without the uuid module.
var AWS = require('aws-sdk'),
    documentClient = new AWS.DynamoDB.DocumentClient();
[...]
Item: {
    "id": context.awsRequestId,
    "Name": "MyName"
},
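Putting either fragment together, a complete handler might look roughly like this; the table name MyTable and the Name attribute are assumptions for illustration.
var AWS = require('aws-sdk');
var uuid = require('uuid');
var documentClient = new AWS.DynamoDB.DocumentClient();

exports.handler = async (event, context) => {
    var item = {
        id: uuid.v4(), // or context.awsRequestId on Node.js 10.x+
        Name: 'MyName'
    };
    // Write the item; the generated id becomes the partition key value.
    await documentClient.put({ TableName: 'MyTable', Item: item }).promise();
    return item;
};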
The UUID package available on NPM does exactly that.
https://www.npmjs.com/package/uuid
You can choose between 4 different generation algorithms:
V1 Timestamp
V3 Namespace
V4 Random
V5 Namespace (again)
This will give you:
"A UUID [that] is 128 bits long, and can guarantee uniqueness across
space and time." - RFC4122
The generated UUID will look like this: 1b671a64-40d5-491e-99b0-da01ff1f3341
If it's too long, you can always encode it in Base64 to get G2caZEDVSR6ZsAAA2gH/Hw, but you'll lose the ability to manipulate your data through the timing and namespace information contained in the raw UUID (which might not matter to you).
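For illustration, that Base64 trick is just re-encoding the UUID's 16 bytes; a sketch (the helper name is made up):
var uuid = require('uuid');

function shortUuid() {
    var hex = uuid.v4().replace(/-/g, ''); // 32 hex characters = 16 bytes
    return Buffer.from(hex, 'hex')
        .toString('base64')
        .replace(/=+$/, '');               // strip the padding, leaving a 22-character string
}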
awsRequestId looks like it's actually a V4 (random) UUID; code snippet below:
exports.handler = function(event, context, callback) {
    console.log('remaining time =', context.getRemainingTimeInMillis());
    console.log('functionName =', context.functionName);
    console.log('AWSrequestID =', context.awsRequestId);
    callback(null, context.functionName);
};
In case you want to generate this yourself, you can still use https://www.npmjs.com/package/uuid or Ulide (slightly better in performance) to generate different versions of UUID based on RFC-4122
For Go developers, you can use these packages from Google's UUID, Pborman, or Satori. Pborman is better in performance, check these articles and benchmarks for more details.
More Info about Universal Unique Identifier Specification could be found here.
We use the idgen npm package to create IDs. You can increase or decrease the ID length depending on how many IDs you expect to generate.
https://www.npmjs.com/package/idgen
We prefer this over UUIDs or GUIDs, since those are limited to hex digits; in DynamoDB the id is stored as a string either way, and with idgen each character is drawn from a larger alphabet, so you get more possible IDs (and fewer collisions) from fewer characters.
Hope it helps.
EDIT1:
Note! As of idgen 1.2.0, IDs of 16+ characters will include a 7-character prefix based on the current millisecond time, to reduce likelihood of collisions.
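A minimal usage sketch, assuming idgen exposes a single function that takes an optional length (which the note above about 16+ character IDs suggests):
var idgen = require('idgen');

var shortId = idgen();    // default-length id
var longerId = idgen(16); // 16+ characters, so it includes the millisecond-time prefix mentioned above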
If you are using the Node.js runtime, you can use this (crypto.randomUUID() is available since Node.js 14.17):
const crypto = require("crypto")
const uuid = crypto.randomUUID()
or
import { randomUUID } from 'crypto'
const uuid = randomUUID()
Here is a better solution.
This logic can be built without any library, because importing a Lambda layer can get difficult sometimes. Below you can find the link to code that generates the unique IDs and saves them in an SQS queue, rather than a DB, which would incur costs for writing, fetching, and deleting the IDs.
A CloudFormation template is also provided, which you can deploy in your account to set up the whole application. A detailed explanation is provided in the link.
Please refer to the link below.
https://github.com/tanishk97/UniqueIdGeneration_AWS_CFT/wiki

Can you use CouchDB 'document update handlers' with replication?

I am replicating docs from DB A to DB B. Every time a doc from DB A arrives in DB B, I want to run a 'stored procedure' to remove most of the fields that came from DB A (DB A is private, but has attachments that I want to make publicly available).
So far I've seen that this might be achieved using the _changes feed (continuous) and then running an 'update' handler on each document.
The document update handlers doc: https://wiki.apache.org/couchdb/Document_Update_Handlers
This seems like something that CouchDB would implement for me... (and I'm not really sure yet how to do the above).
Is there something like a 'hook' that can be run on every document that enters the database?
== EDIT ==
It seems that I would want to somehow include the update handler command in the replication trigger?
It sounds like, with some changes to how you're storing documents, you may be able to benefit from CouchDB's filtered replication. You'd need to store the attachments in documents that could be copied unmodified between the two databases.
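For illustration, a filtered replication setup might look roughly like this; the design document name, filter name, and public flag are all placeholders:
// Hypothetical filter stored in a design document on DB A.
var designDoc = {
    _id: '_design/replication',
    filters: {
        // Only documents explicitly flagged as public get replicated to DB B.
        public_only: "function (doc, req) { return doc.public === true; }"
    }
};
// Replication then references that filter, e.g. via the _replicate endpoint:
// POST /_replicate  { "source": "a", "target": "b", "filter": "replication/public_only" }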
If that's not an option, then you could potentially use the transform-pouch plugin plus PouchDB's .replicate.from() method to manage the replication.
Some quick pseudo-code for this idea looks a bit like this:
var PouchDB = require('pouchdb');
PouchDB.plugin(require('transform-pouch'));

var dbA = new PouchDB('a'); // "a" could be a URL to CouchDB or Cloudant
var dbB = new PouchDB('b');

dbB.transform({
    incoming: function (doc) {
        // do something to the document before storage
        return doc;
    }
});

dbB.replicate.from(dbA);
In theory, that (or something like it) should do what you're wanting... or at least give you the framework in which to do it. ^_^
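As a concrete (hypothetical) example of that incoming hook for this use case, you could keep only the attachment-related fields and drop everything else that came from the private DB A:
dbB.transform({
    incoming: function (doc) {
        // Keep only the id, revision, and attachments; the field choices are illustrative.
        return {
            _id: doc._id,
            _rev: doc._rev,
            _attachments: doc._attachments
        };
    }
});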
Hope that helps!

Delete multiple couchbase entities having common key pattern

I have a use case where I have to remove a subset of entities stored in couchbase, e.g. removing all entities with keys starting with "pii_".
I am using NodeJS SDK but there is only one remove method which takes one key at a time: http://docs.couchbase.com/sdk-api/couchbase-node-client-2.0.0/Bucket.html#remove
In some cases thousands of entities need to be deleted, and it takes a very long time if I delete them one by one, especially because I don't keep a list of the keys in my application.
I agree with @ThinkFloyd when he says: a delete on the server should be a delete on the server, rather than requiring three steps like getting the data from the server, iterating over it on the client side, and finally firing a delete on the server again for each record.
In this regard, I think old-fashioned RDBMSs were better: all you need to do is DELETE FROM table WHERE something = something.
Fortunately, something similar to SQL is available in Couchbase, called N1QL (pronounced 'nickel'). I am not sure about the JavaScript (and other language) syntax, but this is how I did it in Python.
Query to be used: DELETE FROM <bucketname> b WHERE META(b).id LIKE "<prefix>%"
from couchbase.n1ql import N1QLQuery
from couchbase.exceptions import CouchbaseError

layer_name_prefix = cb_layer_key + "|" + "%"
try:
    query = N1QLQuery('DELETE from `test-feature` b where META(b).id LIKE $1', layer_name_prefix)
    cb.n1ql_query(query).execute()
except CouchbaseError as e:
    logger.exception(e)
To achieve the same thing, an alternate query could be the one below, if you are storing a 'type' and/or other metadata like 'parent_id':
DELETE from <bucket_name> where type='Feature' and parent_id=8;
But I prefer the first version of the query, as it operates on the key, and I believe Couchbase has internal indexes that make operating/querying on the key (and other metadata) faster.
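Since the question is about the Node SDK, the rough JavaScript equivalent (SDK 2.x; bucket name, connection string, and prefix are placeholders) would be:
var couchbase = require('couchbase');
var N1qlQuery = couchbase.N1qlQuery;

var cluster = new couchbase.Cluster('couchbase://localhost');
var bucket = cluster.openBucket('test-feature');

// Parameterized N1QL DELETE on the key prefix, mirroring the Python example above.
var query = N1qlQuery.fromString('DELETE FROM `test-feature` b WHERE META(b).id LIKE $1');
bucket.query(query, ['pii_%'], function (err, rows) {
    if (err) throw err;
    console.log('deleted rows:', rows);
});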
The best way to accomplish this is to create a Couchbase view keyed by document id and then run a range query over that view from your Node.js code, deleting the results.
http://docs.couchbase.com/admin/admin/Views/views-querySample.html
http://docs.couchbase.com/couchbase-manual-2.0/#couchbase-views-writing-querying-selection-partial
http://docs.couchbase.com/sdk-api/couchbase-node-client-2.0.8/ViewQuery.html
For example, your Couchbase view could look like the following:
function (doc, meta) {
    emit(meta.id, null);
}
Then in your NodeJS code, you could have something that looks like this:
var couchbase = require('couchbase');
var ViewQuery = couchbase.ViewQuery;

var query = ViewQuery.from('designdoc', 'by_id');
// End the range at a high sentinel character so every key starting with "pii_" falls inside it.
query.range("pii_", "pii_" + "\ufff0", false);

var myBucket = myCluster.openBucket();
myBucket.query(query, function(err, results) {
    for (var i in results) {
        // Delete code in here
    }
});
Of course your Couchbase design document and view will be named differently than the example that I gave, but the important part is the ViewQuery.range function that was used.
All document ids prefixed with pii_ would be returned, in which case you can loop over them and start deleting.
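The delete step inside that callback could be filled in roughly like this (minimal error handling; each view row's id is the document key emitted above):
myBucket.query(query, function(err, results) {
    if (err) throw err;
    results.forEach(function(row) {
        // Remove each returned document by its key.
        myBucket.remove(row.id, function(removeErr) {
            if (removeErr) console.error('failed to delete', row.id, removeErr);
        });
    });
});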
Best,

node.js + mongoose connection and creation issue

I just want to know: when I set up a mongoose connection and define some models (previously adding their appropriate requires in app.js, or whatever), will the model, if it doesn't exist, be created automatically the first time I run node app.js?
Is this kind of logic correct?
If not, do I have to create before my mongoDB collections, models and so on?
I was thinking of an automatic creation of the MongoDB collection when I first run app.js.
Thanks!
Michele Prandina
Schemas (and models) are a client-side (node.js) manifestation of your data model. A few things, like the indexes you've defined, are created upon first use (when saving a document, for example). Nearly everything else is lazily created, including collections.
If you want consistent behavior regarding your models (and their associated schemas), you'll need to make sure they're loaded prior to any access of the associated database. It doesn't really matter where you put them, as long as they are created/executed prior to usage. You might for example:
app.js
models/Cheese.js
models/Cracker.js
Then, in app.js:
var Cheese = require('./models/Cheese.js');
var Cracker = require('./models/Cracker.js');
Assuming, of course, you've exported the models:
module.exports = mongoose.model('Cheese',
    new mongoose.Schema({
        name: String,
        color: String
    })
);
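For example, nothing shows up in MongoDB until the model is actually used; a sketch (connection string and field values are placeholders) showing the collection being created lazily on the first save:
var mongoose = require('mongoose');
var Cheese = require('./models/Cheese.js');

mongoose.connect('mongodb://localhost/test', function (err) {
    if (err) throw err;

    // The "cheeses" collection does not exist yet; saving the first document
    // (and building any declared indexes) creates it on demand.
    new Cheese({ name: 'Brie', color: 'white' }).save(function (saveErr) {
        if (saveErr) throw saveErr;
        console.log('collection created on first save');
    });
});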
