Firebase order by key with numerical string keys - node.js

I saw that the realtime database keys must be strings.
I want to use numbers as keys and do a start at and end at to fetch chunks of data e.g.
admin.database().ref('PARENT').orderByKey().startAt('107').endAt('1032').once('value');
will this return the data in numerical order, e.g. 107,108,109,110,111,112..., or in string order, e.g. 1,11,111,2,22,222?
Thank you

Firebase Realtime Database keys are strings and should always be treated as such.
The exception to this happens when fetching a parent node that contains child keys that look like mostly sequential numbers. In that case, the client receives a list or array of results where the string keys become the numeric indexes of the array. However, it's generally not recommended to deal with lists of data this way (and I strongly recommend reading the linked blog post). In particular, this part:
If all of the keys are integers, and more than half of the keys between 0 and the maximum key in the object have non-empty values, then Firebase will render it as an array.
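As a hypothetical illustration of that rule: a node stored as { "0": "a", "1": "b", "3": "d" } has three of the four keys from 0 to its maximum present, which is more than half, so a client such as the JavaScript SDK renders it as an array with a null filling the gap:
["a", "b", null, "d"]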

Keys are returned in lexicographical order, so you'll get 1,11,111,2,22,222.
If you want to get them in numerical order, use keys that sort the same lexicographically as they do numerically, by padding with 0's at the start:
{
  "001": { ... },
  "002": { ... },
  "011": { ... },
  "022": { ... },
  "111": { ... },
  "222": { ... }
}
The number of characters you use should be picked to allow for the maximum numeric value you ever want to support. In the above that maximum value would be 999, but you may have a different requirement and thus pick a different number of characters.
You can then also prefix the keys with a short non-numeric string, to ensure you never hit the array coercion that Doug mentions in his answer:
{
  "key001": { ... },
  "key002": { ... },
  "key011": { ... },
  "key022": { ... },
  "key111": { ... },
  "key222": { ... }
}
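Putting the two ideas together, key generation and the original range query might look something like the sketch below (the padKey helper, the 4-digit width, and the "key" prefix are illustrative assumptions, not Firebase APIs):
// Pad a numeric id to 4 digits and add a non-numeric prefix, so keys
// sort lexicographically in numeric order (ids up to 9999) and the
// parent node can never be coerced into an array.
function padKey(n) {
  var s = String(n);
  while (s.length < 4) s = '0' + s;
  return 'key' + s;
}
// The range query from the question, rewritten against the padded keys:
admin.database().ref('PARENT')
  .orderByKey()
  .startAt(padKey(107))  // "key0107"
  .endAt(padKey(1032))   // "key1032"
  .once('value');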

Related

Aliasing fields in Apollo GraphQL Server

Aliasing is very handy and works great when aliasing a specific resolver. For instance:
{
  admins: users(type: "admin") {
    username
  }
  moderators: users(type: "moderators") {
    username
  }
}
I'm not sure how to handle aliasing of the fields themselves though. For example:
{
  site_stats {
    hits: sum(field: "hits")
    bounces: sum(field: "bounces")
  }
}
If the resolver returns any sum value, the same value is aliased to both hits and bounces (which makes sense, since only a single sum value could even be returned). If I make the resolver use the alias names as the field names when returning the results, hits and bounces both become null.
I could simply break those fields out into separate resolvers, but that complicates integration for the front end devs. We would also lose a ton of efficiency benefits, since I can aggregate all the data needed in a single query to our data source (we're using ElasticSearch).
Any help from you geniuses would be greatly appreciated!
Using aliases on a single field has very limited usability. Instead, you can use complex filters (input params), e.g. a list of the keys to be returned along with their associated params:
[{ name: "hits", range: "month" },
 { name: "bounces", range: "year" }]
with a query of the expected structure:
{
  stats {
    name
    sum
    average
  }
}
The set of required fields may vary, e.g. only name and sum. The resolver can then return an array of objects:
{ stats: [
  { name: "hits",
    sum: 12345,
    average: 456 }
] }
Aliases can still be useful here to choose different data sets, e.g. name and sum for hits, but additionally average for bounces. Arguably, this is more declarative.
PS. There is nothing here that "complicates integration for the front end devs". The result is JSON; it can be converted/transformed/adapted after fetching (client side) when needed.
It sounds like you're putting all your logic inside the root-level resolver (site_stats) instead of providing a resolver for the sum field. In other words, if your resolvers look like this:
const resolvers = {
  Query: {
    site_stats: () => {
      ...
      return { sum: someValue }
    },
  },
}
you should instead do something like:
const resolvers = {
  Query: {
    site_stats: () => {
      return {} // empty object
    },
  },
  SiteStats: {
    sum: () => {
      ...
      return someValue
    },
  },
}
This way you're not passing down the value for sum from the parent and relying on the default resolver -- you're explicitly providing the value for sum inside its resolver. Since the sum resolver will be called separately for each alias with the arguments specific to that alias, each alias will resolve accordingly.
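For instance, a sketch of that second shape with arguments wired in might look like this (fetchSum is a hypothetical data-source helper, not part of the original answer):
const resolvers = {
  Query: {
    // Return an empty object so the SiteStats field resolvers take over.
    site_stats: () => ({}),
  },
  SiteStats: {
    // Called once per alias; `args` holds that alias's own arguments,
    // so hits and bounces each resolve with their own field value.
    sum: (parent, args) => fetchSum(args.field), // hypothetical helper
  },
}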

Why is there no 'position' argument in Relay+GraphQL connections?

GraphQL and Relay have a robust pagination algorithm that enables easy pagination for the end user, even over unbounded and order-independent results.
However, I have a use case that I'm not really sure how to handle in GraphQL and Relay, and it seems simple enough that I'm sure I just missed something.
How do I, for example, get the 5th item (and only the 5th item), if my list is ordered (by, say, an orderBy argument)?
This is not very well documented, but here's how to do it.
query {
  allPeople(first: 5, last: 1) {
    edges {
      node {
        name
      }
    }
  }
}
First you select first: 5 to get the first 5 people in the list. Then last: 1 takes the last person from that subset. In other words: the fifth person.
If you do (first: 5, last: 2) you would get the 4th and the 5th person in the list.
You can verify this against a demo server: run the query above, then run it again without first and last to see the whole list, and you'll see that Leia is 5th.
If you have an ordered list at the backend and you want to get the element at a particular position, just specify the position value as an argument for the query field. The code for the query field looks like the following:
employee: {
  type: EmployeeType,
  args: {
    position: {
      type: new GraphQLNonNull(GraphQLInt)
    },
    ...args,
  },
  resolve: async (source, { position, ...args }) => {
    // Get the ordered list of employees, probably from cache.
    // Pick the employee with the requested position in the list.
    // Return the employee.
  },
},
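Filling in those comments, the resolver body might look something like this sketch (getOrderedEmployees is a hypothetical data-access helper, and position is assumed to be 1-based):
resolve: async (source, { position, ...args }) => {
  // Get the ordered list of employees, probably from cache.
  const employees = await getOrderedEmployees(args); // hypothetical helper
  // Pick the employee with the requested (1-based) position, if any.
  return employees[position - 1] || null;
},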

CouchDB reducing sums with date filter

I'm pretty new to couchdb and map/reduce in general. I have the following view:
{
  "_id": "_design/keys",
  "views": {
    "keys": {
      "map": "function(doc) { for (var thing in doc) { if (doc.created_at != null) { emit([thing, doc.created_at], 1); } } }",
      "reduce": "function(key, values) { return sum(values); }"
    }
  }
}
This works well to give me a count of all document keys in the database when using the proper group_level:
.../_design/keys/_view/keys?group_level=1
{"rows":[
{"key":["_id"],"value":2},
{"key":["_rev"],"value":2},
{"key":["created_at"],"value":2},
{"key":["testing"],"value":2}
]}
Now what I want to do is reduce these mapped documents by date, which is an ISO 8601 string:
{"rows":[
{"key":["_id","2015-11-25T21:13:58Z"],"value":1},
{"key":["_id","2015-11-25T21:14:39Z"],"value":1},
{"key":["_rev","2015-11-25T21:13:58Z"],"value":1},
{"key":["_rev","2015-11-25T21:14:39Z"],"value":1},
{"key":["created_at","2015-11-25T21:13:58Z"],"value":1},
{"key":["created_at","2015-11-25T21:14:39Z"],"value":1},
{"key":["testing","2015-11-25T21:13:58Z"],"value":1},
{"key":["testing","2015-11-25T21:14:39Z"],"value":1}
]}
But I still want the results grouped by the first part of the key. That is, I want to specify a start time of 2015-11-25T21:13:57Z and an end time of 2015-11-25T21:13:59Z, and get back everything with the time stamp of 2015-11-25T21:13:58Z, like so:
{"rows":[
{"key":["_id"],"value":1},
{"key":["_rev"],"value":1},
{"key":["created_at"],"value":1},
{"key":["testing"],"value":1}
]}
How can I do this?
You should use your view function to emit the date component of the timestamp (which as you note is conveniently already in hierarchical structure) as a complex key:
Instead of "2015-11-26T...", emit the key as [2015, 11, 26, 21, 13, 58]
Then you can range query on the complex keys down to different levels (year, month, date, time). Note that if you use times other than Zulu time, you may need to have the view function read the timezone and emit in Zulu time so that everything sorts correctly.
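A revised map function along these lines might look like the following sketch, which keeps the original thing prefix and expands the timestamp into numeric UTC components (it assumes created_at parses cleanly with new Date):
function (doc) {
  if (doc.created_at != null) {
    // Split the ISO 8601 timestamp into numeric UTC components so that
    // range queries can target year, month, day, or time granularity.
    var d = new Date(doc.created_at);
    for (var thing in doc) {
      emit([thing,
            d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate(),
            d.getUTCHours(), d.getUTCMinutes(), d.getUTCSeconds()], 1);
    }
  }
}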
I had a similar problem just a few days ago and found that List Functions are a pretty easy way to solve this. You could simply use the date as the key and the things as values, do the counting in the list function, and still use all the regular view features to define start and end keys.
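As a rough sketch of that idea (assuming a map function that emits created_at as the key and the thing name as the value), the list function could tally counts per thing over whatever startkey/endkey range the request supplies:
function (head, req) {
  // Count one per emitted value ("thing") within the requested key range.
  var counts = {};
  var row;
  while ((row = getRow())) {
    counts[row.value] = (counts[row.value] || 0) + 1;
  }
  send(JSON.stringify(counts));
}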

In a MongoDB will an index help when a field is just being tested on its length?

I am creating a routine to check for interrupted processing and to carry on. During startup I'm performing the following search:
.find({"DocumentsPath": {$exists: true, $not: {$size: 0}}})
I want it to be as fast as possible. However, the documentation suggests that indexes are for searching within the data, and I never need to search within "DocumentsPath", just use it if it's there. Creating an index seems like overhead I don't want, yet having the index might speed up the size test.
My question is whether this field should be indexed within the DB?
Thought of commenting but this does deserve an answer. Should this be indexed? Well probably, but for other purposes. Does this make a difference here? No it does not.
The big point to make is that your query terms are redundant (or could be better) in this case. Let's look at the example:
{ "DocumentsPath": { "$exists": true } }
That will tell you if there is actually an element in a document that matches the property specified. No, it does not and cannot use an index. You can use a "sparse" index though, and then you don't even need that clause.
{ "DocumentsPath": { "$not": { "$size" : 0 } } }
This is a cute one. Yes, it tests the length of an array, but what you are really asking here is "I don't want the array to be empty".
So for the better solution.
Use a "sparse" index:
db.collection.ensureIndex({ "DocumentsPath": 1 }, { "sparse": true })
Query for the zeroth element of the array:
{ "DocumentsPath.0": { "$exists": true } }
Still no index for the "matching" really, but at least the "sparse" index sorted out some of that by excluding documents, and the "dot notation" form here is actually more efficient than evaluating via $size.

Randomly generating a value not in mongodb?

I'm trying to generate session ids in Node and store them in MongoDB. For my application I only want to generate a random 6 digit number (this number is used to sync two sessions, so it's kept short for users to easily copy). I want to generate a number that is not already in my sessions collection in MongoDB. My first thought, which will probably work 99% of the time, is this:
function GetSessionKey(cb) {
  var key = parseInt(Math.random() * 1000000).toString();
  while (key.length < 6) key = '0' + key;
  db.sessions.find({ Key: key }, function (err, docs) {
    if (!err && docs.length == 0) {
      cb(key);
    } else {
      GetSessionKey(cb);
    }
  });
}
But there is a slim chance that the generated key keeps hitting existing keys in Mongo, or the possibility that all keys are already in use (which I do not expect to happen).
I know Node is asynchronous, but does that mean that a recursive async call like this will fill up the stack if it keeps getting called? Or do async calls not get put on the stack? Is there a better way to generate short unique keys? Old keys will be removed and can be reused, so I don't think it'll be a ticking time bomb. I'm just worried that the random aspect may cause trouble.
An alternative is to use MongoDB to enforce uniqueness. For example, create a unique index on the Key field in the Mongo shell:
db.sessions.ensureIndex( { Key: 1 }, { unique: true } )
Modify the Node.js code to insert the generated random value and test the err variable for duplicate key errors. If there is no error, you're good to go. Otherwise, regenerate the value and try again. You can test for duplicate key errors like this:
...
db.sessions.insert({ Key: key }, function (err, docs) {
  ...
  if (err) {
    if (err.code && err.code == 11000) {
      // Duplicate key error. Generate a new random value and insert again.
    } else {
      throw err;
    }
  }
  ...
If you can store the generated key in the _id field instead of Key, then you can skip ensureIndex(...), as MongoDB automatically creates a unique index on _id.
Recursive async calls will not fill up the stack. They'll just spin the CPU over and over until they find a key or some kind of error happens.
I suggest setting a reasonable maximum on the number of guesses, though. And yes, base64 (as suggested above) or base36 (that's simply .toString(36)) will increase the number of keys available, reducing the collision chance if you're worried about that.
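A bounded-retry version of the insert approach might look like the following sketch (the base-36 key generator, the cap of 10 attempts, and the error-first callback are illustrative choices, not part of the answers above):
// Generate a random 6-character base-36 key, e.g. "a3k9z1".
function randomKey() {
  var chars = '0123456789abcdefghijklmnopqrstuvwxyz';
  var key = '';
  for (var i = 0; i < 6; i++) {
    key += chars[Math.floor(Math.random() * chars.length)];
  }
  return key;
}
function getSessionKey(cb, attempts) {
  attempts = attempts || 0;
  if (attempts >= 10) return cb(new Error('could not allocate a session key'));
  var key = randomKey();
  // Storing the key as _id relies on MongoDB's built-in unique index.
  db.sessions.insert({ _id: key }, function (err) {
    if (!err) return cb(null, key);      // success: the key was unused
    if (err.code && err.code == 11000) { // duplicate key: try again
      return getSessionKey(cb, attempts + 1);
    }
    cb(err);                             // some other error
  });
}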
Edit: there is indeed a better way - sequential keys.
