IndexedDB getAll() ordering - google-chrome-extension

I'm using the getAll() method to get all items from the db.
db.transaction('history', 'readonly').objectStore('history').getAll().onsuccess = ...
My ObjectStore is defined as:
db.createObjectStore('history', { keyPath: 'id', autoIncrement: true });
Can I count on the ordering of the items I get? Will they always be sorted by primary key id?
(or is there a way to specify sort explicitly?)
I could not find any info about ordering in the official docs.

If the docs don't help, consult the specs:
getAll refers to "steps for retrieving multiple referenced values"
the retrieval steps refer to "first count records in index"
the specification of index contains the following paragraph:
The records in an index are always sorted according to the record's
key. However unlike object stores, a given index can contain multiple
records with the same key. Such records are additionally sorted
according to the index's record's value (meaning the key of the record
in the referenced object store).
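Reading backwards: an index is sorted, and getAll retrieves the first N records of that index, i.e. it is order-preserving. Therefore the result itself should retain the sort order. The same reasoning applies to an object store, whose records the spec keeps sorted by key in ascending order.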
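On the second part of the question (specifying the sort explicitly): getAll() traditionally takes only a query and a count, so one option for a different order is a cursor opened with a direction. The following is a minimal sketch, not the only way to do it; the helper name getAllInOrder is made up for illustration, and it assumes an already opened db with the 'history' store from the question.

```javascript
// Hypothetical helper: collects all values of an object store in key order.
// direction is 'next' (ascending keys) or 'prev' (descending keys).
function getAllInOrder(db, storeName, direction = 'next') {
  return new Promise((resolve, reject) => {
    const items = [];
    const request = db
      .transaction(storeName, 'readonly')
      .objectStore(storeName)
      .openCursor(null, direction);

    // onsuccess fires once per record, and a final time with a null cursor.
    request.onsuccess = () => {
      const cursor = request.result;
      if (cursor) {
        items.push(cursor.value);
        cursor.continue();
      } else {
        resolve(items);
      }
    };
    request.onerror = () => reject(request.error);
  });
}
```

Calling getAllInOrder(db, 'history', 'prev') would then resolve with the newest ids first, since the auto-incremented id is the primary key.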

Related

When are Keys Not Sortable in Sort Merge Join in Spark?

The articles I've read on Sort Merge Join say it is the preferred join strategy in Spark after Broadcast join, but only if the joining keys are sortable. My question is: when can a joining key be unsortable? It seems any datatype can be sorted. Could you help me understand a scenario where keys may not be sortable?
See https://www.waitingforcode.com/apache-spark-sql/sort-merge-join-spark-sql/read. Excellent site.
Not all types can be sorted, e.g. CalendarIntervalType.
Quoting:
"for not sortable keys the sort merge join" should "not be used" in {
import sparkSession.implicits._
// Here we explicitly define the schema. Thanks to that we can show
// the case when sort-merge join won't be used, i.e. when the key is not sortable
// (there are other cases - when broadcast or shuffle joins can be chosen over sort-merge
// but it's not shown here).
// Globally, a "sortable" data type is:
// - NullType, one of AtomicType
// - StructType having all fields sortable
// - ArrayType typed to sortable field
// - User Defined DataType backed by a sortable field
// The method checking sortability is org.apache.spark.sql.catalyst.expressions.RowOrdering.isOrderable
// As you see, CalendarIntervalType is not included in any of above points,
// so even if the data structure is the same (id + login for customers, id + customer id + amount for orders)
// with exactly the same number of rows, the sort-merge join won't be applied here.
This is an old post; since v3 a comparison can be made:
https://spark.apache.org/docs/3.0.0/api/scala/org/apache/spark/sql/types/CalendarIntervalType.html
But it demonstrates the point.
Also, what about non-equi joins?
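To see why the algorithm needs orderable keys at all, here is a minimal sort-merge join sketch in plain JavaScript. It is not Spark's implementation; the function names and the sample rows (modelled on the customers/orders shape mentioned in the quote) are made up for illustration.

```javascript
// Three-way comparison. The whole algorithm depends on this being a total
// order over the key type.
function compare(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Inner equi-join of two arrays via sort-merge.
// leftKey / rightKey extract the join key from a row.
function sortMergeJoin(left, right, leftKey, rightKey) {
  // Sort phase: order both sides by join key (copies, inputs untouched).
  const l = [...left].sort((a, b) => compare(leftKey(a), leftKey(b)));
  const r = [...right].sort((a, b) => compare(rightKey(a), rightKey(b)));

  // Merge phase: walk both sorted sides with two pointers.
  const out = [];
  let i = 0;
  let j = 0;
  while (i < l.length && j < r.length) {
    const c = compare(leftKey(l[i]), rightKey(r[j]));
    if (c < 0) {
      i++;
    } else if (c > 0) {
      j++;
    } else {
      // Equal keys: pair every left row with every right row of this key.
      const key = leftKey(l[i]);
      let jEnd = j;
      while (jEnd < r.length && compare(rightKey(r[jEnd]), key) === 0) jEnd++;
      while (i < l.length && compare(leftKey(l[i]), key) === 0) {
        for (let m = j; m < jEnd; m++) out.push([l[i], r[m]]);
        i++;
      }
      j = jEnd;
    }
  }
  return out;
}

const customers = [{ id: 2, login: 'b' }, { id: 1, login: 'a' }];
const orders = [
  { customerId: 1, amount: 10 },
  { customerId: 3, amount: 1 },
  { customerId: 2, amount: 5 },
  { customerId: 1, amount: 7 },
];
const joined = sortMergeJoin(customers, orders, (c) => c.id, (o) => o.customerId);
console.log(joined.map(([c, o]) => `${c.login}:${o.amount}`).join(' '));
```

Both phases rely on compare: for a key type with no defined ordering there is nothing to sort or merge by, so a planner has to pick another join strategy.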

Azure CosmosDB: how to ORDER BY id?

Using a vanilla CosmosDB collection (all default), adding documents like this:
{
"id": "3",
"name": "Hannah"
}
I would like to retrieve records ordered by id, like this:
SELECT c.id FROM c
ORDER BY c.id
This gives me the error: Order-by item requires a range index to be defined on the corresponding index path.
I expect this is because /id is hash indexed and not range indexed. I've tried to change the Indexing Policy in various ways, but any change I make which would touch / or /id gets wiped when I save.
How can I retrieve documents ordered by ID?
The best way to do this is to store a duplicate property, e.g. id2, that has the same value as id and is indexed using a range index, then use that for sorting, i.e. query for SELECT * FROM c ORDER BY c.id2.
PS: The reason this is not supported is that id is part of a composite index (on partition key and row key; id is the row key part). The Cosmos DB team is working on a change that will allow sorting by id.
EDIT: new collections now support ORDER BY c.id as of 7/12/19
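For older collections, the duplicate-property workaround can be applied just before a document is written. A minimal sketch, assuming documents are plain objects and that /id2 is range indexed; the helper name withSortableId is hypothetical, and id2 is simply the property name suggested above.

```javascript
// Returns a copy of the document with `id` mirrored into `id2`,
// so that queries can use ORDER BY c.id2 instead of ORDER BY c.id.
function withSortableId(doc) {
  return { ...doc, id2: doc.id };
}

const doc = withSortableId({ id: '3', name: 'Hannah' });
console.log(JSON.stringify(doc)); // {"id":"3","name":"Hannah","id2":"3"}
```

On collections created after the change mentioned in the EDIT, this extra property should no longer be needed.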
I found this page, CosmosDB Indexing Policies, which has the Note below that may be helpful:
Azure Cosmos DB returns an error when a query uses ORDER BY but
doesn't have a Range index against the queried path with the maximum
precision.
Some other information from elsewhere in the document:
Range supports efficient equality queries, range queries (using >, <,
>=, <=, !=), and ORDER BY queries. ORDER BY queries by default also
require maximum index precision (-1). The data type can be String or
Number.
Some guidance on types of queries assisted by Range queries:
Range: Range over /prop/? (or /) can be used to serve the following
queries efficiently:
SELECT FROM collection c WHERE c.prop = "value"
SELECT FROM collection c WHERE c.prop > 5
SELECT FROM collection c ORDER BY c.prop
And a code example from the docs also:
var rangeDefault = new DocumentCollection { Id = "rangeCollection" };
// Override the default policy for strings to Range indexing and "max" (-1) precision
rangeDefault.IndexingPolicy = new IndexingPolicy(new RangeIndex(DataType.String) { Precision = -1 });
await client.CreateDocumentCollectionAsync(UriFactory.CreateDatabaseUri("db"), rangeDefault);
Hope this helps,
J

DynamoDB equivalent to Find({}).toArray

I'm looking to export an entire table in DynamoDB as an array of objects. I recently switched from MongoDB, where I'd use .find({}).toArray( (err,res)=> {...} ). I'm having a bit of trouble finding an equivalent in DynamoDB.
You can use Scan.
The Scan operation returns one or more items and item attributes by
accessing every item in a table or a secondary index. To have DynamoDB
return fewer items, you can provide a FilterExpression operation.
The data from the Scan operation is returned in JSON format, which has an Items element:
Items
An array of item attributes that match the scan criteria. Each element
in this array consists of an attribute name and the value for that
attribute.
Type: array of String to AttributeValue object maps
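One caveat when exporting an entire table: a single Scan call returns at most 1 MB of data, and signals that more is available through LastEvaluatedKey, which is passed back as ExclusiveStartKey on the next call. A minimal sketch of that loop follows. scanPage is a hypothetical function you supply that performs one Scan request with whichever SDK you use and resolves with the raw response; no particular SDK signature is assumed here.

```javascript
// Collects the Items of every page of a Scan into one array.
// scanPage(params) must return a Promise of a Scan response object.
async function scanAll(scanPage, params) {
  const items = [];
  let startKey; // undefined on the first request
  do {
    const page = await scanPage({ ...params, ExclusiveStartKey: startKey });
    items.push(...(page.Items || []));
    // Present only when there are more pages to read.
    startKey = page.LastEvaluatedKey;
  } while (startKey);
  return items;
}
```

The resolved array plays the role of the MongoDB toArray() result. For large tables this reads every item and consumes read capacity accordingly.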

Index multiple MongoDB fields, make only one unique

I've got a MongoDB database of metadata for about 300,000 photos. Each has a native unique ID that needs to be unique to protect against duplicate insertions. It also has a time stamp.
I frequently need to run aggregate queries to see how many photos I have for each day, so I also have a date field in the format YYYY-MM-DD. This is obviously not unique.
Right now I only have an index on the id property, like so (using the Node driver):
collection.ensureIndex(
{ id:1 },
{ unique:true, dropDups: true },
function(err, indexName) { /* etc etc */ }
);
The group query for getting the photos by date takes quite a long time, as one can imagine:
collection.group(
{ date: 1 },
{},
{ count: 0 },
function ( curr, result ) {
result.count++;
},
function(err, grouped) { /* etc etc */ }
);
I've read through the indexing strategy, and I think I need to also index the date property. But I don't want to make it unique, of course (though I suppose it's fine to make it unique in combination with the unique id). Should I do a regular compound index, or can I chain the .ensureIndex() function and only specify uniqueness for the id field?
MongoDB does not have "mixed" type indexes which can be partially unique. On the other hand, why don't you use _id instead of your id field, if possible? It's already indexed and unique by definition, so it will prevent you from inserting duplicates.
Mongo can only use a single index in a query clause, which is important to consider when creating indexes. For this particular query and these requirements, I would suggest a separate unique index on the id field, which you get for free if you use _id. Additionally, you can create a non-unique index on the date field only. If you run a query like this:
db.collection.find({"date": "01/02/2013"}).count();
Mongo will be able to answer the query from the index alone (a covered index query), which is the best performance you can get.
Note that Mongo won't be able to use a compound index on (id, date) if you are searching by date only. Your query has to match the index prefix first, i.e. if you search by id then the (id, date) index can be used.
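The two separate indexes can be sketched as follows, using createIndex (the current name for ensureIndex in the Node driver) in its promise form. The function name ensurePhotoIndexes is made up for illustration, and collection is assumed to be the photo collection from the question.

```javascript
// Creates the two independent indexes discussed above.
async function ensurePhotoIndexes(collection) {
  // Unique index: rejects a second photo with the same native id.
  await collection.createIndex({ id: 1 }, { unique: true });
  // Plain, non-unique index: supports lookups and counts by date.
  await collection.createIndex({ date: 1 });
}
```

If _id is used to hold the native id instead, the first call becomes unnecessary because _id is always uniquely indexed.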
Another option is to pre-aggregate in the schema itself: keep a counter per date and increment it whenever you insert a photo. This way you don't need to run any aggregation jobs. You can also run some tests to determine whether this approach is more performant than aggregation.
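A minimal sketch of the pre-aggregation idea with the Node driver, assuming a second collection that holds one counter document per date. The names insertPhoto and dailyCounts are hypothetical; the photo is assumed to carry the YYYY-MM-DD date field from the question.

```javascript
// Inserts a photo and bumps the counter for its date.
// photos and dailyCounts are MongoDB collections.
async function insertPhoto(photos, dailyCounts, photo) {
  // If this throws (for example on a duplicate id), the counter is not touched.
  await photos.insertOne(photo);
  // upsert creates the counter document the first time a date is seen.
  await dailyCounts.updateOne(
    { _id: photo.date },
    { $inc: { count: 1 } },
    { upsert: true }
  );
}
```

The per-day totals are then a plain read of dailyCounts. The two writes are not atomic together, so a crash between them could leave a counter one short.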

Get all fields and values of hash key using redis in node

In Redis, using a hash, I need to store a hash key with multiple fields and values.
I tried as below:
client.hmset("Table1", "Id", "9324324", "ReqNo", "23432", redis.print);
client.hmset("Table1", "Id", "9324325", "ReqNo", "23432", redis.print);
var arrrep = new Array();
client.hgetall("Table1", function(err, rep){
console.log(rep);
});
Output is: { Id: '9324325', ReqNo: '23432' }
I am getting only one value. How can I get all fields and values stored under the hash key? Please correct me if I'm doing something wrong, ideally with code. Thanks.
You are getting one value because you overwrite the previous value.
client.hmset("Table1", "Id", "9324324", "ReqNo", "23432", redis.print);
This adds Id, ReqNo to the Table1 hash object.
client.hmset("Table1", "Id", "9324325", "ReqNo", "23432", redis.print);
This overwrites Id and ReqNo for the Table1 hash object. At this point, you still have only two fields in the hash.
Actually, your problem comes from the fact that you are trying to map a relational database model to Redis. You should not. With Redis, it is better to think in terms of data structures and access paths.
You need to store one hash object per record. For instance:
HMSET Id:9324324 ReqNo 23432 ... and some other properties ...
HMSET Id:9324325 ReqNo 23432 ... and some other properties ...
Then, you can use a set to store the IDs:
SADD Table1 9324324 9324325
Finally to retrieve the ReqNo data associated to the Table1 collection:
SORT Table1 BY NOSORT GET # GET Id:*->ReqNo
If you want to also search for all the IDs which are associated to a given ReqNo, then you need another structure to support this access path:
SADD ReqNo:23432 9324324 9324325
So you can get the list of IDs for ReqNo 23432 by using:
SMEMBERS ReqNo:23432
In other words, do not try to transpose a relational model: just create your own data structures supporting your use cases.
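Translated to the callback-style Node client used in the question, the one-hash-per-record layout could look roughly like this. It is a sketch only: saveRecord and loadAll are made-up helper names, and just the hmset, sadd, smembers and hgetall commands already discussed are used.

```javascript
// Stores one hash per record under "Id:<id>" and tracks the id in a set
// named after the logical table.
function saveRecord(client, table, id, reqNo, done) {
  client.hmset('Id:' + id, 'ReqNo', reqNo, (err) => {
    if (err) return done(err);
    client.sadd(table, id, done);
  });
}

// Reads every id in the set, then fetches the hash of each record.
function loadAll(client, table, done) {
  client.smembers(table, (err, ids) => {
    if (err) return done(err);
    if (ids.length === 0) return done(null, []);
    const records = new Array(ids.length);
    let pending = ids.length;
    let failed = false; // makes sure done is called only once on error
    ids.forEach((id, i) => {
      client.hgetall('Id:' + id, (hashErr, fields) => {
        if (failed) return;
        if (hashErr) {
          failed = true;
          return done(hashErr);
        }
        records[i] = { Id: id, ...fields };
        pending -= 1;
        if (pending === 0) done(null, records);
      });
    });
  });
}
```

After saving ids 9324324 and 9324325 into Table1, loadAll(client, 'Table1', cb) would hand cb both records rather than only the last one written. The SORT ... GET command shown earlier does the same in a single round trip.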

Resources