Relationship between collections in mongodb

How can we do relational update queries in MongoDB?
We have the following scenario:
A bank manages its clients' deposits. Once a fixed term has elapsed, the bank returns the deposit to the client plus the accrued interest. A client can have several deposits.
We create a clients collection holding each client's name and the amount available to withdraw, and a deposits collection holding the amount and the interest rate. Each deposit is linked to its client through a clientId field.
Every 24 hours the bank updates all user accounts. If a deposit is less than two years old, its accrued interest is updated. If it is exactly two years old, a new field is added to the deposit (expired: true), and the client's available field is increased by the deposit amount plus the interest accumulated so far.
To solve this, I fetch all the deposits, store them in an array, and iterate over it with map. Inside the map callback I try to update the clients that have expired deposits; however, only the last element processed by map ends up being updated.
What would be the best solution to this problem? I am using Mongoose and Node.js.
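
The daily pass described above can be sketched as pure logic, independent of the database driver. This is only an illustration in Python, not Mongoose code: the Deposit class, the daily_update function, the 730-day term and the simple (non-compounding) interest formula are all assumptions, since the question does not say how interest accrues. The point it shows is that payouts are summed per client before anything is written, so a client with several expired deposits receives one combined increment rather than a series of overwrites.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

TERM_DAYS = 730  # assumption: "two years" is treated as 730 days


@dataclass
class Deposit:  # hypothetical shape of a document in the deposits collection
    client_id: str
    amount: float
    annual_rate: float  # e.g. 0.05 for 5 % per year
    created: date
    interest: float = 0.0
    expired: bool = False


def daily_update(deposits, today):
    """Refresh accrued interest and return {client_id: amount to add to `available`}."""
    payouts = defaultdict(float)
    for d in deposits:
        if d.expired:
            continue  # already paid out on an earlier run
        age = (today - d.created).days
        # assumption: simple interest, accrued per day and capped at the term
        d.interest = d.amount * d.annual_rate * min(age, TERM_DAYS) / 365
        if age >= TERM_DAYS:
            d.expired = True
            # accumulate per client so several expired deposits add up
            payouts[d.client_id] += d.amount + d.interest
    return dict(payouts)
```

Each entry of the returned dictionary would then map to a single increment of that client's available field (for example with MongoDB's $inc operator), issued and awaited once per client.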

Related

How to display whether a customer has a single transaction or multiple transactions

I would like to ask how to correctly display "single" or "multiple" when a user has more than one purchase from different stores in a day. If a user bought once it should display "single", and if the user bought more than once it should display "multiple".
For reference, column F determines whether the user bought more than once: 0 = first transaction and 1 = other transactions. If a user has only a 0 and no 1s, it is considered a single transaction. I tried using this formula in column H:
=IF(COUNTIF(A$2:A2,A2)=1,
IF(COUNTIFS(A$2:A2,A2,D$2:D2,D2)>1,
OFFSET(A$2,MATCH(A2&D2,$A$2:A2&A2&$D$2:D2,0)-1,1),
MAX(IF($A1:A$2=A2,$F1:F$2))+1))
but it does not show the result I want.

If I understand your question correctly, it should be possible to use SUMPRODUCT to accomplish this.
Try: =IF(SUMPRODUCT(--(A:A=A2),--(D:D=D2))>1,"multiple","single")
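
For readers who want to check the logic outside the spreadsheet, here is a small Python sketch of what the formula computes: it counts the rows that share the same values in columns A and D and labels each row accordingly. The function name label_transactions and the reading of column A as the customer and column D as the day are assumptions; the question does not spell out what those columns hold.

```python
from collections import Counter


def label_transactions(rows):
    """rows: one (column_a, column_d) tuple per transaction row."""
    counts = Counter(rows)  # how many rows share the same (A, D) pair
    return ["multiple" if counts[r] > 1 else "single" for r in rows]
```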

How to query the database after a limited time?

There is a SQL table that contains bids.
When the first bid is inserted into the table, a countdown starts. After some time, for instance 5 minutes, I must aggregate all the data and find the max price across the bids.
I wonder how to trigger this event and send a message to the Node service that should handle it.
Another option is for the service to poll the DB every second, compare startDate and endDate, and aggregate by sum.
Which approach should I choose?
What about creating a UNIX cron task when a bid is inserted into the DB?
So, bidding stays open for the time configured in the script; in my case that is 5 minutes. After that, no one can submit a bid.
Then I need to select all participants who made bids and aggregate the max price across them.

Mongoose update same column with different values in single query

I am trying to write a Mongoose query that modifies two records at once, but with a different updated value for each.
Scenario: increase User A's balance while decreasing User B's balance.
Any ideas?

Listing Stripe refunds between dates

Using the Stripe API, I'm struggling to work out how to list refunds between two dates.
I can list charges between two dates, but since a refund can come significantly later than the charge date, that does not help.
Thanks to anyone who can point me in the right direction!

A dictionary with dates is working for me even though the API docs don't show this as an option for Refund.list:
epoch_dict = {'lt': 1496505600, 'gte': 1496502000}
refunds = stripe.Refund.list(limit=100, created=epoch_dict)
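
The epoch values in that dictionary are Unix timestamps in seconds. As a hedged sketch, they can be derived from timezone-aware datetimes with the standard library; the helper name created_filter is hypothetical and not part of the Stripe library.

```python
from datetime import datetime, timezone


def created_filter(start, end):
    """Build a filter dict from aware datetimes (start inclusive, end exclusive)."""
    return {"gte": int(start.timestamp()), "lt": int(end.timestamp())}


# The same one-hour window as the hard-coded values above, expressed in UTC
window = created_filter(
    datetime(2017, 6, 3, 15, 0, tzinfo=timezone.utc),
    datetime(2017, 6, 3, 16, 0, tzinfo=timezone.utc),
)
```

The resulting window can be passed as created=window in place of epoch_dict.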

I would use the List All Balance History API and pass type: "refund" to list only the refunds, along with the created hash for the dates you want to limit the search to. You should then get all the balance transactions associated with refunds created between those two dates.

Azure Tables - Partition Key and Row Key - Correct Choice

I am new to Azure Tables and have read a lot of articles, but I would like some reassurance on the above, given that it's fundamental.
I have data which is similar to this:
CustomerId, GUID
TripId, GUID
JourneyStep, GUID
Time, DateTime
AverageSpeed, int
Based on what I have read, is CustomerId a good PartitionKey? Where I get stuck is that the combination of CustomerId and TripId does not make a unique row. My justification for TripId as the RowKey is that every query will be for a dataset based on CustomerId and TripId.
Just for context, the CustomerId is clearly unique, the TripId represents one journey in a vehicle, and within that journey the JourneyStep represents a unit within that trip, which may have 10 steps or 1000.
The intention is to aggregate the data into further tables, with each level being used for a different purpose. At the most aggregated level, the customer will be given some scores.
The amount of data will obviously be huge, so I need to think about query performance from the outset.
Updated:
As requested, the solution is for vehicle telematics, so think of yourself in your own car: a black box ships data to a server, which in turn passes it to Azure Tables. In relational DB terms, I would have a customer table and a trip table with a foreign key back to the customer table.
The TripId is auto-generated by the black box. TripId does not need to be stored by date/time from a query point of view; however, that may be relevant from a query performance point of view.
Queries will be split into two:
Display a map of a single journey for each customer, so filter by customer and then Trip to then iterate each row (journeystep) to a map.
Per customer, I will score each trip and then retrieve trips for, let's say, the last month to aggregate a score. I do have SQL Database to enrich data with client records etc but for the volume data (the trip data) I wish to use Azure Tables.
The aggregates from the second query will probably be stored in a separate table, so if someone made 10 trips in one month, I would run the second query which would score each trip, then produce a score for all trips that month and store both answers so potentially a table of trip aggregates and a table of monthly aggregates.

The thing about the Partition Key is that it represents a logical grouping; for example, you cannot insert data spanning multiple partition keys in a single batch operation. Similarly, rows with the same partition key are likely to be stored on the same server, making it quick to retrieve all the data for a given partition key.
As such, it is important to look at your domain and figure out what aggregate you are likely to work with.
If I understand your domain model correctly, I would actually be tempted to use the TripId as the Partition Key and the JourneyStep as the Row Key.
You will separately need a table that lists all the Trip IDs that belong to a given Customer, which makes sense anyway, as you probably want to store some data such as "trip name" in such a table.

Your design has to follow your queries. You can filter your data based on two columns, PartitionKey and RowKey. PartitionKey is the most important column, since your queries will hit it first.
In your case CustomerId should be your PartitionKey, since most of the time you will reach your data by customer. (You may also need to keep another table for your client list.)
Now, RowKey can be your tripId or time. If I were you, I would probably use a RowKey in yyyyMMddHHmm|tripId format, which will let you query based on startWith and endWith options.
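
A minimal Python sketch of why that key format helps: because the timestamp is zero-padded and comes first, plain string comparison of RowKeys matches chronological order, so a time window becomes a simple lower/upper bound on the key. The functions row_key and keys_in_range are hypothetical helpers written for illustration, and the filtering here runs in memory rather than against Azure Tables.

```python
from datetime import datetime


def row_key(start, trip_id):
    """Compose a RowKey in yyyyMMddHHmm|tripId form."""
    return f"{start:%Y%m%d%H%M}|{trip_id}"


def keys_in_range(keys, start, end):
    """Return the keys whose timestamp prefix falls in [start, end), in order."""
    lo = f"{start:%Y%m%d%H%M}"
    hi = f"{end:%Y%m%d%H%M}"
    # lexicographic comparison is enough because the prefix is fixed-width
    return [k for k in sorted(keys) if lo <= k < hi]
```

A storage query would express the same two bounds as "RowKey greater than or equal to lo and RowKey less than hi" within a single PartitionKey.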

Adding to @Frans's answer:
One thing you could do is create a separate table for each customer. So you could have a table named like Customer. That way each customer's data is nicely segregated into different tables. Then you could use TripId as the PartitionKey and JourneyStep as the RowKey, as suggested by @Frans. For storing some metadata about the trip, instead of using a separate table, I would still use the same table, but keep the RowKey empty and put the other information about the trip in that row.

I would suggest considering the following approach to your PK/RK design. I believe it would yield the best performance for your outlined queries:
PartitionKey: combination of CustomerId and TripId.
string.Format("{0}_{1}", customerId.ToString(), tripId.ToString())
RowKey: combination of the DateTime.MaxValue.Ticks - Time.Ticks formatted to a large 0-padded string with the JourneyStep.
string.Format("{0}_{1}", (DateTime.MaxValue.Ticks - Time.Ticks).ToString("00000000000000000"), JourneyStep.ToString())
This combination will allow you to run the following queries "quickly".
Get data by CustomerId only. Example: context.Trips.Where(n => string.Compare(id + "_00000000-0000-0000-0000-000000000000", n.PartitionKey) <= 0 && string.Compare(id + "_zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzzz", n.PartitionKey) >= 0).AsTableServiceQuery(context);
Get data by CustomerId and TripId. Example: context.Trips.Where(n => n.PartitionKey == string.Format("{0}_{1}", customerId, tripId)).AsTableServiceQuery(context);
Get last X amount of journey steps if you were to search by either CustomerId or CustomerId/TripId by using the "Take" function
Get data via date-range queries by translating timestamps into Ticks
Save data into a trip with a single storage transaction (assuming you have less than 100 steps)
If you can guarantee uniqueness of Times of Steps within each Trip, you don't even have to put JourneyStep into the RowKey as it is somewhat inconvenient
The only downside to this schema is not being able to retrieve a particular single journey step without knowing its Time and Id. However, unless you have very specific use cases, downloading all of the steps inside a trip and then picking a particular one from the list should not be so bad.
HTH
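
To make the key scheme above concrete, here is a hedged Python sketch of the same composition. It assumes naive datetimes and reproduces .NET ticks (100-nanosecond intervals since 0001-01-01) by hand; the helper names to_ticks, partition_key and row_key are illustrative only. The tick part is padded to 19 digits here, which is an assumption on my side so that the width matches DateTime.MaxValue.Ticks. Subtracting from the maximum means newer steps sort first.

```python
from datetime import datetime

DOTNET_MAX_TICKS = 3155378975999999999  # DateTime.MaxValue.Ticks in .NET


def to_ticks(dt):
    """Ticks (100 ns units) elapsed since 0001-01-01 for a naive datetime."""
    delta = dt - datetime(1, 1, 1)
    return (delta.days * 86400 + delta.seconds) * 10_000_000 + delta.microseconds * 10


def partition_key(customer_id, trip_id):
    """CustomerId and TripId joined with an underscore."""
    return f"{customer_id}_{trip_id}"


def row_key(time, journey_step):
    """Inverted, zero-padded ticks followed by the JourneyStep id."""
    return f"{DOTNET_MAX_TICKS - to_ticks(time):019d}_{journey_step}"
```

With these keys, a step recorded at 16:00 produces a smaller RowKey than one recorded at 15:00 on the same day, so the most recent steps are returned first within a partition.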

Table storage design aims to optimize two major capabilities of Azure Tables:
Scalability
Search performance
As @Frans already pointed out, Azure Tables uses the PartitionKey to decide how to scale out your data across multiple storage server nodes. Because of this, I would advise against having unique PartitionKeys, since in theory you will have Azure spreading your data across storage nodes that can each serve only one customer. I say "in theory" because, in practice, Azure uses smart algorithms to identify patterns in your PartitionKeys and group them (for example, if your ids are consecutive numbers). You don't want to fall into this scenario, because the scalability of your storage will be unpredictable and in the hands of obscure algorithms that make those decisions. See HERE for more information about scalability.
Regarding performance, the fastest way to search is to hit both PartitionKey and RowKey in your search queries. Unlike Amazon DynamoDB, Azure Tables does not support secondary column indexes. If your search queries filter on attributes stored in columns other than those two, Azure will need to do a full table scan.
I faced a situation similar to yours, where the design of the partition/row keys was not trivial. In the end, we expanded our data model to include more information so that we could design our table in such a way that ~80% of all search queries can be matched to partition + row keys, while the remaining 20% require a table scan. We decided to include the user's location, so our PartitionKey is the user's country and the RowKey is a unique customer ID. This meant expanding our data model to include the user's country, which was not a big deal. Maybe you can do the same thing? Group your customers by segment, by location, or by email address SMTP domain?
