How can I save an array of objects in Cassandra?
I'm using a Node.js application with cassandra-driver to connect to a Cassandra database. I want to save records like the one below:
{
"id" : "5f1811029c82a61da4a44c05",
"logs" : [
{
"conversationId" : "e9b55229-f20c-4453-9c18-a1f4442eb667",
"source" : "source1",
"destination" : "destination1",
"url" : "https://asdasdas.com",
"data" : "data1"
},
{
"conversationId" : "e9b55229-f20c-4453-9c18-a1f4442eb667",
"source" : "source2",
"destination" : "destination2",
"url" : "https://afdvfbwadvsffd.com",
"data" : "data2"
}
],
"conversationId" : "e9b55229-f20c-4453-9c18-a1f4442eb667"
}
In the above record, I can use the type "text" for the columns "id" and "conversationId", but I'm not sure how to define the schema and save data for the field "logs".
With Cassandra, you'll want to store the data in the same way that you want to query it. As you mentioned querying by conversationid, that's going to influence how the PRIMARY KEY definition should look. Given this, conversationid should make a good partition key. As for the clustering columns, I had to make some guesses as to cardinality. source looked like it could be used to identify a log entry within a conversation, so I went with that next.
I thought about using id as the final clustering column, but it looks like all entries with the same conversationid would also have the same id. It might be a good idea to give each entry its own unique identifier, to help ensure uniqueness:
{
"uniqueid": "e53723ca-2ab5-441f-b360-c60eacc2c854",
"conversationId" : "e9b55229-f20c-4453-9c18-a1f4442eb667",
"source" : "source1",
"destination" : "destination1",
"url" : "https://asdasdas.com",
"data" : "data1"
},
This makes the final table definition look like this:
CREATE TABLE conversationlogs (
id TEXT,
conversationid TEXT,
uniqueid UUID,
source TEXT,
destination TEXT,
url TEXT,
data TEXT,
PRIMARY KEY (conversationid,source,uniqueid));
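As a rough sketch of how the nested record could be written to a table like the one above from Node.js, the logs array can be flattened into one insert per entry. The function name buildLogInserts is made up for illustration, and generating uniqueid in the application with crypto.randomUUID() is an assumption, not something the question specifies. The column named source is taken from the sample JSON.

```javascript
const { randomUUID } = require('crypto');

// CQL for one row per log entry; column names follow the table above.
const INSERT_LOG =
  'INSERT INTO conversationlogs ' +
  '(conversationid, source, uniqueid, id, destination, url, data) ' +
  'VALUES (?, ?, ?, ?, ?, ?, ?)';

// Turn the nested record into one { query, params } pair per log entry.
function buildLogInserts(record) {
  return record.logs.map((log) => ({
    query: INSERT_LOG,
    params: [
      log.conversationId,
      log.source,
      randomUUID(), // assumption: the application generates uniqueid
      record.id,
      log.destination,
      log.url,
      log.data,
    ],
  }));
}

// Hypothetical usage with an already connected cassandra-driver Client:
//   await client.batch(buildLogInserts(record), { prepare: true });
```

Since every entry in one record shares the same conversationid, all rows land in the same partition, which is the case where a batch is a reasonable fit. Running each insert separately with client.execute(query, params, { prepare: true }) should work as well.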
You have a few options depending on how you want to query this data.
The first is to stringify the JSON in the logs field, save that string to the database, and parse it back into JSON after querying the data.
The second option is similar to the first, but instead of stringifying the whole array, you store the entries as a list in the database.
The third option is to define a new table for the logs, with the conversation as the partition key and a clustering key for each element of the logs. This lets you look up a single row by the full primary key, or query by just the partition key and retrieve all the rows for that conversation.
CREATE TABLE conversationlogs (
conversationid uuid,
logid timeuuid,
...
PRIMARY KEY ((conversationid), logid));
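The first option above can be sketched in a few lines of Node.js. This assumes a hypothetical table with three text columns named id, conversationid and logs. The helper names toRow and fromRow are illustrative only.

```javascript
// Convert the application record into a row whose logs column is JSON text.
function toRow(record) {
  return {
    id: record.id,
    conversationid: record.conversationId,
    logs: JSON.stringify(record.logs),
  };
}

// Rebuild the application record from a row read back from the table.
function fromRow(row) {
  return {
    id: row.id,
    conversationId: row.conversationid,
    logs: JSON.parse(row.logs),
  };
}
```

The trade-off is that Cassandra only sees an opaque string, so individual log entries cannot be filtered or updated on the server; the whole value has to be read and rewritten.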
Related
In Node.js with MS SQL, I want to retrieve data from two or three tables as an array of objects
Hello there, my name is Shaziya, please help me out.
I learned Node.js from YouTube.
I want to learn Node.js with MS SQL. Do you or any friends know of a course for more advanced understanding,
like how to join 4 or 5 tables and show the data as an array of objects,
and how to write nested queries or subqueries?
For example, if I want to match two tables, Product and Order,
joining the Order table on the product ID:
data {
productId : 1,
productName : "abc",
orders : [{
orderId : 1,
orderName : "xyz"
},
{
orderId : 2,
orderName : "pqr"
}
]
}
I'd at least like a course or a solution for the part where I'm stuck.
I want to store data in the following structure:
"id" : 100, -- primary key
"data" : [
{
"imei" : 862304021502870,
"details" : [
{
"start" : "2018-07-24 12:34:50",
"end" : "2018-07-24 12:44:34"
},
{
"start" : "2018-07-24 12:54:50",
"end" : "2018-07-24 12:56:34"
}
]
}
]
So how do I create a table schema in Cassandra for this?
Thanks in advance.
There are several approaches to this, depending on the requirements regarding data access/modification - for example, do you need to modify individual fields, or do you update everything at once:
Declare the map of imei/details as a user-defined type (UDT), and then declare the table like this:
create table tbl (
id int primary key,
data set<frozen<details_udt>>);
But this is relatively hard to support in the long term, especially if you add more nested objects with different types. Plus, you can't really update individual fields of the frozen values that you must use for nested collections/UDTs; with this table structure you need to replace the complete record inside the set.
Another approach - just do explicit serialization/deserialization of data into/from JSON or other format, and have table structure like this:
create table tbl(
id int primary key,
data text);
The type of the data field depends on the format you use; you can use blob as well to store binary data. But in this case you'll need to update/fetch the complete field. You can simplify things if you use the Java driver's custom codecs, which take care of the conversion between your data structure in Java and the desired format. See the example in the documentation for conversion to/from JSON.
Please look at the following example:
Insert
INSERT INTO my_keyspace.my_table (id, name, my_info) VALUES (
3464546,
'Sumit',
{ birthday : '1990-01-01', height : '6.2 feet', weight : '74 kg' }
);
Second Insert
INSERT INTO my_keyspace.my_table (id, name, my_info) VALUES (
3464546,
'Sumit',
{ birthday : '1990-01-01', height : '6.2 feet', weight : null }
);
Consider "id" as the Primary Key.
In the second insert, the "weight" attribute inside the "my_info" UDT is set to null. Does this create a tombstone? How is a null inside a UDT stored in the Cassandra database?
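Yes. Setting a column to NULL is the same as writing a tombstone in some cases. For a non-frozen UDT, each field is stored as its own cell, so writing null to weight writes a tombstone for that field. A frozen UDT is stored as a single value, so the null is serialized inside that value rather than written as a separate tombstone.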
I'm a bit of a noob with MongoDB, so would appreciate some help with figuring out the best solution/format/structure in storing some data.
Basically, the data that will be stored will be updated every second with a name, value and timestamp for a certain meter reading.
For example, one possibility is water level and temperature in a tank. The tank will have a name and then the level and temperature will be read and stored every second. Overall, there will be hundreds of items (i.e. tanks), each with millions of timestamped values.
From what I've learnt so far (and please correct me if I'm wrong), there are a few options for how to structure the data:
A slightly RDBMS approach:
This would consist of two collections, Items and Values
Items : {
_id : "id",
name : "name"
}
Values : {
_id : "id",
item_id : "item_id",
name : "name", // temp or level etc
value : "value",
timestamp : "timestamp"
}
The more document db denormalized method:
This method involves one collection of items each with an array of timestamped values
Items : {
_id : "id",
name : "name",
values : [{
name : "name", // temp or level etc
value : "value",
timestamp : "timestamp"
}]
}
A collection for each item
Save all the values in a collection named after that item.
ItemName : {
_id : "id",
name : "name", // temp or level etc
value : "value",
timestamp : "timestamp"
}
The majority of read queries will be to retrieve the timestamped values for a specified time period of an item (i.e. tank) and display in a graph. And for this, the first option makes more sense to me as I don't want to retrieve the millions of values when querying for a specific item.
Is it even possible to query for values between specific timestamps for option 2?
I will also need to query for a list of items, so maybe a combination of the first and third option with a collection for all the items and then a number of collections to store the values for each of those items?
Any feedback on this is greatly appreciated.
Don't store a separate timestamp if you are not modifying the ObjectId, as the ObjectId itself has a timestamp in it.
You will save a lot of space this way.
MongoDB Id Documentation
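To illustrate the point about the embedded timestamp: the first 4 bytes of an ObjectId hold its creation time in seconds since the Unix epoch, so it can be read back from the 24-character hex form. The helper below is a minimal sketch with a made-up name, and the sample id in the comment is only an example value.

```javascript
// The first 4 bytes (8 hex characters) of an ObjectId are the creation
// time in seconds since the Unix epoch.
function objectIdToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new TypeError('expected a 24-character hex ObjectId string');
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Example call with a sample id:
//   objectIdToDate('5f1811029c82a61da4a44c05').toISOString();
```

With the official driver, the ObjectId class offers getTimestamp() for the same purpose. Note that this timestamp has one-second resolution and reflects when the id was generated, which may not be precise enough if several readings per item arrive within the same second.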
In case you don't require the previous data, you can use an update query in MongoDB to update the fields every second instead of storing new documents.
If you want to keep each reading, then instead of updating, store it in a flat structure.
{ "_id" : ObjectId("XXXXXX"),
"name" : "ItemName",
"value" : "ValueOfItem",
"created_at" : "timestamp"
}
Edit 1: Added timestamp as per the comments
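With the flat structure above, reading the values of one item for a time period comes down to a range filter on created_at. The sketch below only builds the filter object; it assumes created_at is stored as a Date rather than a string, and buildRangeFilter is a hypothetical helper name.

```javascript
// Build a find() filter for one item's readings in the half-open
// interval [from, to).
function buildRangeFilter(name, from, to) {
  if (!(from instanceof Date) || !(to instanceof Date) || from >= to) {
    throw new RangeError('from and to must be Dates with from < to');
  }
  return { name, created_at: { $gte: from, $lt: to } };
}

// Hypothetical usage with the official mongodb driver:
//   const docs = await collection
//     .find(buildRangeFilter('ItemName', from, to))
//     .sort({ created_at: 1 })
//     .toArray();
```

A compound index on name and created_at would probably be needed for this to stay fast with millions of readings per item.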
In Redis, using a hash, I need to store a hash key with multiple fields and values.
I tried as below:
client.hmset("Table1", "Id", "9324324", "ReqNo", "23432", redis.print);
client.hmset("Table1", "Id", "9324325", "ReqNo", "23432", redis.print);
var arrrep = new Array();
client.hgetall("Table1", function(err, rep){
console.log(rep);
});
Output is: { Id: '9324325', ReqNo: '23432' }
I am getting only one value. How to get all fields and values in the hash key? Kindly help me if I am wrong and let me get the code. Thanks.
You are getting one value because you overwrite the previous value.
client.hmset("Table1", "Id", "9324324", "ReqNo", "23432", redis.print);
This adds Id, ReqNo to the Table1 hash object.
client.hmset("Table1", "Id", "9324325", "ReqNo", "23432", redis.print);
This overwrites Id and ReqNo for the Table1 hash object. At this point, you only have two fields in the hash.
Actually, your problem comes from the fact that you are trying to map a relational database model to Redis. You should not. With Redis, it is better to think in terms of data structures and access paths.
You need to store one hash object per record. For instance:
HMSET Id:9324324 ReqNo 23432 ... and some other properties ...
HMSET Id:9324325 ReqNo 23432 ... and some other properties ...
Then, you can use a set to store the IDs:
SADD Table1 9324324 9324325
Finally to retrieve the ReqNo data associated to the Table1 collection:
SORT Table1 BY NOSORT GET # GET Id:*->ReqNo
If you want to also search for all the IDs which are associated to a given ReqNo, then you need another structure to support this access path:
SADD ReqNo:23432 9324324 9324325
So you can get the list of IDs for ReqNo 23432 by using:
SMEMBERS ReqNo:23432
In other words, do not try to transpose a relational model: just create your own data structures supporting your use cases.
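The layout above (one hash per record, a set of all IDs, and a set per ReqNo) can be sketched in Node.js as a function that produces the command list for each record. This is only an illustration: buildRecordCommands is a made-up name, and the sketch assumes each record carries at least one field besides Id.

```javascript
// Build the Redis commands for one record: a hash per record, plus two
// sets that act as indexes (all ids in the table, and ids per ReqNo).
function buildRecordCommands(table, record) {
  const { Id, ...fields } = record;
  const fieldArgs = Object.entries(fields).flat().map(String);
  const commands = [
    ['HMSET', `Id:${Id}`, ...fieldArgs],
    ['SADD', table, String(Id)],
  ];
  if (fields.ReqNo !== undefined) {
    commands.push(['SADD', `ReqNo:${fields.ReqNo}`, String(Id)]);
  }
  return commands;
}

const commands = [
  { Id: '9324324', ReqNo: '23432' },
  { Id: '9324325', ReqNo: '23432' },
].flatMap((record) => buildRecordCommands('Table1', record));
```

Each inner array is a command name followed by its arguments, so it can be sent with whichever Redis client is in use. With the callback-style client from the question, passing the array to client.multi(commands).exec(callback) should send them as one transaction, though that depends on the client version.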