Best way to retrieve an item from DynamoDB using an attribute which is not the partition key - node.js

I am new to DynamoDB and need some suggestions from experienced people here. There is a table created with the model below:
orderId - partition key
stockId
orderDetails
There is a new requirement to fetch all the orderIds that include a particular stockId. An item in the table looks like this:
{
  "orderId": "ord_12234",
  "stockId": [
    123221,
    234556,
    123231
  ],
  "orderDetails": {
    "createdDate": "",
    "dateOfDel": ""
  }
}
Given that stockId is an array of IDs, it can't be made a GSI. Performing a scan would be heavy, as the table has a large number of records and keeps growing. What would be the best option here? How can the existing table be modified to achieve this efficiently?

You definitely want to avoid scanning the table. One option is to modify your schema to a Single Table Design where you have order items and order/stock items.
For example:
| pk              | sk              | orderDetails                     | stockId | ... |
|-----------------|-----------------|----------------------------------|---------|-----|
| order#ord_12234 | order#ord_12234 | {createdDate:xxx, dateOfDel:yyy} |         | ... |
| order#ord_12234 | stock#123221    |                                  | 123221  | ... |
| order#ord_12234 | stock#234556    |                                  | 234556  | ... |
| order#ord_12234 | stock#123231    |                                  | 123231  | ... |
You can then issue the following queries, as needed:
get the order details with a query on pk=order#ord_12234, sk=order#ord_12234
get the stocks for a given order with a query on pk=order#ord_12234, sk begins_with stock# (see the sketch after this list)
get everything associated with the order with a query on pk=order#ord_12234
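For illustration, a minimal sketch of the second query using the AWS SDK for JavaScript DocumentClient; the table name 'orders' is an assumption:

const AWS = require('aws-sdk');
const docClient = new AWS.DynamoDB.DocumentClient();

// Fetch all stock items for one order: pk matches exactly, sk begins with "stock#"
docClient.query({
  TableName: 'orders', // hypothetical table name
  KeyConditionExpression: 'pk = :pk AND begins_with(sk, :skPrefix)',
  ExpressionAttributeValues: {
    ':pk': 'order#ord_12234',
    ':skPrefix': 'stock#'
  }
}, (err, data) => {
  if (err) console.error(err);
  else console.log(data.Items); // one item per stockId in the order
});

Dropping the sk condition gives the third query, and an exact sk match gives the first.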

Related

Loop through a list in Python to query DynamoDB for each item

I have a list of items and would like to use each item as the pk (primary key) to query DynamoDB, using Python.
I have tried using a for loop but I don't get any results. If I try the same query with an actual value from the group_id list, it does work, which means my query statement is correct.
group_name_query = []
for i in group_id:
    group_name_query = config_table.query(
        KeyConditionExpression=Key('pk').eq(i) & Key('sk').eq('GROUP')
    )
Here is a sample group_id = ['GROUP#6501e5ac-59b2-4d05-810a-ee63d2f4f826', 'GROUP#6501e5ac-59b2-4d05-810a-ee63d2sfdgd']
Not answering your issue directly, but a suggestion: if you're querying the base table with pk and sk (rather than querying a GSI), I would suggest the BatchGetItem API to get multiple items in one call:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/example_dynamodb_BatchGetItem_section.html
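A minimal sketch of that suggestion in boto3, matching the question's Python code; the table name 'config' is a placeholder, and BatchGetItem accepts at most 100 keys per call:

import boto3

dynamodb = boto3.resource('dynamodb')

# One key per group id; 'config' is a hypothetical table name
keys = [{'pk': gid, 'sk': 'GROUP'} for gid in group_id]

response = dynamodb.batch_get_item(
    RequestItems={'config': {'Keys': keys}}
)
items = response['Responses']['config']
# In production, retry anything returned in response['UnprocessedKeys']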

TypeORM count grouping with different left joins each time

I am using NestJS with TypeORM and PostgreSQL. I have a queryBuilder which joins other tables based on the provided array of relations.
const query = this.createQueryBuilder('user');
if (relations.includes('relation1')) {
  query.leftJoinAndSelect('user.relation1', 'r1');
}
if (relations.includes('relation2')) {
  query.leftJoinAndSelect('user.relation2', 'r2');
}
if (relations.includes('relation3')) {
  query.leftJoinAndSelect('user.relation3', 'r3');
}
// 6 more relations
Following that I select a count on another table.
query
  .leftJoin('user.relation4', 'r4')
  .addSelect('COUNT(case when r4.value > 10 then r4.id end)', 'user_moreThan')
  .addSelect('COUNT(case when r4.value < 10 then r4.id end)', 'user_lessThan')
  .groupBy('user.id, r1.id, r2.id, r3.id ...')
And lastly I use one of the counts (depending on the request) for ordering the result with orderBy.
Now, of course, based on the relations parameter, the requirements for the groupBy query change. If I join all tables, TypeORM expects all of them to be present in groupBy.
I initially had the count query separated, but that was before I wanted to use the result for ordering.
Right now I plan to just dynamically create the groupBy string, but this approach somehow feels wrong, and I am wondering whether it is in fact the way to go or if there is a better approach to achieving what I want.
You can add the GROUP BY clause conditionally:
if (relations.includes('relation1')) {
  query.addGroupBy('r1.id');
}
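To avoid repeating the if blocks, both the joins and the group-by columns can be driven from the relations array. A sketch, assuming the aliases follow the r1/r2/r3 pattern from the question:

// Hypothetical relation-to-alias map matching the question's aliases
const aliases: Record<string, string> = {
  relation1: 'r1',
  relation2: 'r2',
  relation3: 'r3',
  // ...the 6 other relations
};

for (const rel of relations) {
  const alias = aliases[rel];
  if (alias) {
    query.leftJoinAndSelect(`user.${rel}`, alias);
    query.addGroupBy(`${alias}.id`); // group by exactly the tables that were joined
  }
}

This keeps the GROUP BY list in sync with the joins by construction, which is what the dynamically built groupBy string was trying to achieve.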

Limit relationships

I have two tables, i.e. Order and OrderTransaction, where Order to OrderTransaction is a one-to-many relationship. I have to fetch OrderTransaction as a relationship on the Order table. How can I limit the number of OrderTransaction rows to a specific number, e.g. 3, rather than pulling all the transactions for the order?
class Order(Base):
    orderTransactions = relationship(
        "OrderTransaction", uselist=True,
        lazy="joined"
    )
When I fetch an order by its primary key order id, I need to fetch only 3 transactions in the orderTransactions relationship. How can I achieve that?
The fix for the above is to use dynamic lazy loading:
class Order(Base):
    orderTransactions = relationship(
        "OrderTransaction", uselist=True,
        lazy="dynamic"
    )
As I need to fetch only 3 order transactions, whenever I need them I do this:
transactions = order.orderTransactions[0:3]
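Slicing works here because with lazy="dynamic" the relationship returns a query object rather than a list, so the slice is translated into LIMIT/OFFSET in SQL instead of loading every row first. An equivalent, slightly more explicit form:

# orderTransactions is a query object under lazy="dynamic",
# so this emits LIMIT 3 at the SQL level
transactions = order.orderTransactions.limit(3).all()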

Hazelcast Jet - Group By Use Case

We have a requirement to group by multiple fields in a dynamic way on a huge data set. The data is stored in a Hazelcast Jet cluster. Example: the Person class contains 4 fields: age, name, city and country. We first need to group by city and then by country, and then we may group by name based on conditional parameters.
We already tried using a distributed collection and it is not working. Even when we tried using the Pipeline API it throws an error.
Code:
IMap res = client.getMap("res"); // res is a distributed map
Pipeline p = Pipeline.create();
p.drawFrom(Sources.<Person>list("inputList"))
 .aggregate(AggregateOperations.groupingBy(Person::getCountry))
 .drainTo(Sinks.map(res));
JobConfig jobConfig = new JobConfig();
jobConfig.addClass(Person.class);
jobConfig.addClass(HzJetListClientPersonMultipleGroupBy.class);
Job job = client.newJob(p, jobConfig);
job.join();
Then we read from the map in the client and destroy it.
Error Message on the server:
Caused by: java.lang.ClassCastException: java.util.HashMap cannot be cast to java.util.Map$Entry
groupingBy aggregates all the input items into a HashMap where the key is extracted using the given function. In your case it aggregates a stream of Person items into a single HashMap<String, List<Person>> item.
You need to use this:
p.drawFrom(Sources.<Person>list("inputList"))
 .groupingKey(Person::getCountry)
 .aggregate(AggregateOperations.toList())
 .drainTo(Sinks.map(res));
This will populate the res map with a list of persons for each country.
Remember, without groupingKey() the aggregation is always global. That is, all items in the input will be aggregated to one output item.
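For illustration, a sketch of the client-side read-then-destroy step the question describes, using the same names as the snippets above:

// Each key is a country; each value is the list of persons for that country
IMap<String, List<Person>> result = client.getMap("res");
for (Map.Entry<String, List<Person>> entry : result.entrySet()) {
    System.out.println(entry.getKey() + " -> " + entry.getValue().size() + " persons");
}
result.destroy(); // discard the intermediate map once it has been read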

Index multiple MongoDB fields, make only one unique

I've got a MongoDB database of metadata for about 300,000 photos. Each has a native unique ID that needs to be unique to protect against duplicate insertions. It also has a time stamp.
I frequently need to run aggregate queries to see how many photos I have for each day, so I also have a date field in the format YYYY-MM-DD. This is obviously not unique.
Right now I only have an index on the id property, like so (using the Node driver):
collection.ensureIndex(
  { id: 1 },
  { unique: true, dropDups: true },
  function(err, indexName) { /* etc etc */ }
);
The group query for getting the photos by date takes quite a long time, as one can imagine:
collection.group(
  { date: 1 },
  {},
  { count: 0 },
  function (curr, result) {
    result.count++;
  },
  function(err, grouped) { /* etc etc */ }
);
I've read through the indexing strategy docs, and I think I need to also index the date property. But I don't want to make it unique, of course (though I suppose it's fine to make it unique in combination with the unique id). Should I do a regular compound index, or can I chain the .ensureIndex() function and only specify uniqueness for the id field?
MongoDB does not have "mixed" type indexes which can be partially unique. On the other hand, why don't you use _id instead of your id field if possible? It's already indexed and unique by definition, so it will prevent you from inserting duplicates.
Mongo can only use a single index in a query clause, which is important to consider when creating indexes. For this particular query and these requirements, I would suggest a separate unique index on the id field, which you get for free if you use _id. Additionally, you can create a non-unique index on the date field only. If you run a query like this:
db.collection.find({"date": "01/02/2013"}).count();
Mongo will be able to use the index alone to answer the query (a covered index query), which is the best performance you can get.
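A sketch of that non-unique date index, in the same Node driver style as the question (unique: false is the default and is shown only for clarity):

collection.ensureIndex(
  { date: 1 },
  { unique: false },
  function(err, indexName) { /* etc etc */ }
);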
Note that Mongo won't be able to use a compound index on (id, date) if you are searching by date only. Your query has to match the index prefix first, i.e. if you search by id then the (id, date) index can be used.
Another option is to pre-aggregate in the schema itself: keep a counter per day and increment it whenever you insert a photo. This way you don't need to run any aggregation jobs. You can also run some tests to determine whether this approach is more performant than aggregation.
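A minimal sketch of that pre-aggregation idea using an upsert; the dailyCounts collection name is an assumption:

// 'dailyCounts' is a hypothetical collection with one counter document per day
dailyCounts.update(
  { date: photo.date },
  { $inc: { count: 1 } },
  { upsert: true },
  function(err, result) { /* etc etc */ }
);

Reading a day's count then becomes a single indexed lookup instead of a group job.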
