I'm using UPSERT:
FOR doc IN temp_collection
UPSERT { _key: doc._key }
INSERT
{...}
UPDATE
{
seen: OLD.seen + doc.seen,
x: (OLD.seen + doc.seen) > 10 ? 0 : 1
}
IN main_collection
I'd like to replace (OLD.seen + doc.seen) in x with the new, not yet committed value of seen. But I can't use the pseudo-variable NEW here, and I can't use LET inside UPDATE. What else can I do if I want to get rid of the duplication?
Thank you in advance!
AQL currently does not support anything to avoid this "duplication".
You may request this as a feature, but it may not be easy to implement because of language restrictions that exist in order to allow certain optimizations.
Hi, I would like to insert random test data into an edge collection called Transaction, with the fields _id, Amount and TransferType. I have written the code below, but it shows a syntax error.
FOR i IN 1..30000
INSERT {
_id: CONCAT('Transaction/', i),
Amount:RAND(),
Time:Rand(DATE_TIMESTAMP),
i > 1000 || u.Type_of_Transfer == "NEFT" ? u.Type_of_Transfer == "IMPS"
} INTO Transaction OPTIONS { ignoreErrors: true }
Your code has multiple issues:
When creating a new document, you can either omit the _key attribute, in which case Arango will create one for you, or specify one as a string. An _id attribute will be ignored.
RAND() produces a random number between 0 and 1, so it needs to be multiplied to bring it into the range you want. You might also need to round it if you need integer values.
DATE_TIMESTAMP is a function, and you have passed it as a parameter to RAND(), which takes no parameters. Because DATE_TIMESTAMP generates a numerical timestamp (milliseconds since 1970-01-01 00:00 UTC), it isn't actually needed. All you need is the random number shifted into a range that makes sense (i.e. not in the 1970s).
The i > 1000 ... line is one whose intent I could only guess. The key for the JSON object is missing, and you reference a variable u that is not defined anywhere. I see the first two parts of a ternary operator expression (cond ? true_value : false_value), but the : is missing. My best guess is that you wanted to create a Type_of_Transfer key with the value "NEFT" when i > 1000 and "IMPS" when i <= 1000.
So I rewrote your AQL and tested it:
FOR i IN 1..30000
INSERT {
_key: TO_STRING(i),
Amount: RAND()*1000,
Time: ROUND(RAND()*100000000+1603031645000),
Type_of_Transfer: i > 1000 ? "NEFT" : "IMPS"
} INTO Transaction OPTIONS { ignoreErrors: true }
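To check the value ranges outside the database, the same document shape can be sketched in plain JavaScript. This only illustrates the arithmetic in the query above: the function name makeTransaction is hypothetical, and Math.random() stands in for RAND().

```javascript
// Hypothetical helper mirroring the document built by the AQL query above.
function makeTransaction(i) {
  return {
    _key: String(i), // document keys are strings
    Amount: Math.random() * 1000, // 0 <= Amount < 1000
    Time: Math.round(Math.random() * 100000000 + 1603031645000), // ms since the epoch
    Type_of_Transfer: i > 1000 ? 'NEFT' : 'IMPS',
  };
}

const sample = [1, 1000, 1001, 30000].map(makeTransaction);
console.log(sample.map((t) => `${t._key}: ${t.Type_of_Transfer}`).join(', '));
// prints "1: IMPS, 1000: IMPS, 1001: NEFT, 30000: NEFT"
```

The random offset spans 100000000 ms, so every Time value falls within a window of a little under 28 hours after the base timestamp.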
I have a small database that is not going to grow much more. I'm trying to retrieve a single document by its id through an API that I implemented in Python. However, it is impractical to make the user type in one of the random ids from the db. All I need is a function that modifies each document by setting an auto-incrementing id field. As I said, the database is not going to grow much, so performance isn't really an issue here.
So far what I've been able to do is this:
var i = 0
db.MyCollection.update({},
{$set : {"new_field":1}},
{upsert:false,
multi:true}
i ++;),
I managed to set an id field, but it gets the same number in every document (the total document count). So if the db has 10 docs, it sets the id to 10 in each of them.
A find-and-modify operation returns the updated document (in its state before or after the update, depending on the returnDocument setting). You can use this with $inc to implement a counter. Ruby example, where c['foo'] is a collection:
irb(main):005:0> c['foo'].insert_one(counter:true,count:1)
=> #<Mongo::Operation::Insert::Result:0x8040 documents=[{"n"=>1, "opTime"=>{"ts"=>#<BSON::Timestamp:0x00005609f260b7e0 @seconds=1594961771, @increment=2>, "t"=>1}, "electionId"=>BSON::ObjectId('7fffffff0000000000000001'), "ok"=>1.0, "$clusterTime"=>{"clusterTime"=>#<BSON::Timestamp:0x00005609f260b538 @seconds=1594961771, @increment=2>, "signature"=>{"hash"=><BSON::Binary:0x8060 type=generic data=0x0000000000000000...>, "keyId"=>0}}, "operationTime"=>#<BSON::Timestamp:0x00005609f260b290 @seconds=1594961771, @increment=2>}]>
irb(main):011:0> c['foo'].find_one_and_update({counter:true},{'$inc':{count:1}})
=> {"_id"=>BSON::ObjectId('5f112f6b2c97a6281f63f575'), "counter"=>true, "count"=>1}
irb(main):012:0> c['foo'].find_one_and_update({counter:true},{'$inc':{count:1}})
=> {"_id"=>BSON::ObjectId('5f112f6b2c97a6281f63f575'), "counter"=>true, "count"=>2}
irb(main):013:0> c['foo'].find_one_and_update({counter:true},{'$inc':{count:1}})
=> {"_id"=>BSON::ObjectId('5f112f6b2c97a6281f63f575'), "counter"=>true, "count"=>3}
irb(main):014:0> c['foo'].find_one_and_update({counter:true},{'$inc':{count:1}})
=> {"_id"=>BSON::ObjectId('5f112f6b2c97a6281f63f575'), "counter"=>true, "count"=>4}
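The counter pattern above can be tried without a server. The following is a minimal in-memory sketch, not the driver API: findOneAndIncrement is a hypothetical stand-in for find_one_and_update with $inc, and it returns the document as it was before the update, matching what the irb session shows.

```javascript
// Hypothetical in-memory stand-in for a collection holding one counter document.
const store = [{ counter: true, count: 1 }];

// Mimics find-and-modify with $inc: bumps `field` by `step` on the first match
// and returns a copy of the document as it was BEFORE the update.
function findOneAndIncrement(docs, matches, field, step) {
  const doc = docs.find(matches);
  if (!doc) return null;
  const before = { ...doc };
  doc[field] += step;
  return before;
}

// Each call hands out the next id; no two calls see the same value.
function nextId() {
  return findOneAndIncrement(store, (d) => d.counter === true, 'count', 1).count;
}

const ids = [nextId(), nextId(), nextId()];
console.log(ids); // [ 1, 2, 3 ]
```

In a real deployment the guarantee comes from the server performing the read and the increment as one operation; this sketch only shows the return-value convention and how it yields sequential ids.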
Why not just use this logic? Instead of updating everything with one query, launch multiple queries one by one. Mongo will do this pretty fast even if you have more than 1M docs in the database (and you said you have a small database), because of the built-in index on the _id field.
This is JavaScript code, but I guess you'll understand the logic:
// toArray() materializes the cursor so that .length and index access work
let all_documents = db.MyCollection.find({}).toArray();
for (let i = 0; i < all_documents.length; i++) {
  // give each document its own sequential value
  db.MyCollection.update({_id: all_documents[i]._id }, {$set : {"new_field": i}}, {upsert:false})
}
I am trying to count the number of documents that are in each possible state in a particular Arango collection.
This should be possible in one pass over all of the documents using a bucket-sort-like strategy: iterate over all documents, and if the value for the state hasn't been seen before, add a counter with a value of 1 to a list; if you have seen that state before, increment its counter. Once you've reached the end, you'll have a counter for each possible state in the DB that indicates how many documents are currently stored with that state.
I can't seem to figure out how to write this type of logic in AQL to submit as a query. Current strategy is like this:
Loop over all documents, filtering only docs of a particular state.
Loop over all documents, filtering only docs of a different particular state.
...
All states have been filtered.
Return size of each set
This works, but I'm sure it's much slower than it should be. This also means that if we add a new state, we have to update the query to loop over all docs an additional time, filtering based on the new state. A bucket-sort like query would be quick, and would need no updating as new states are created as well.
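The one-pass counting idea described above can be sketched in a few lines of JavaScript. This is an illustration of the strategy only, not AQL; the function name countByField and the attribute name state are assumptions.

```javascript
// Count documents per value of `field` in a single pass.
function countByField(docs, field) {
  const counts = {};
  for (const doc of docs) {
    const value = doc[field];
    // First sighting starts the counter at 1, later sightings increment it.
    counts[value] = (counts[value] || 0) + 1;
  }
  return counts;
}

// Five sample documents with a hypothetical `state` attribute.
const docs = [{ state: 'A' }, { state: 'B' }, { state: 'B' }, { state: 'C' }, { state: 'A' }];
console.log(countByField(docs, 'state')); // { A: 2, B: 2, C: 1 }
```

Because the counters are keyed by whatever values occur, a newly introduced state needs no change to the code.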
If these were the documents:
{A}
{B}
{B}
{C}
{A}
Then I'd like the result to be
{ A:2, B:2, C:1 }
where A, B, and C are values for a particular field. The current strategy filters like so:
LET docsA = (
FOR doc in collection
FILTER doc.state == A
RETURN doc
)
Then I manually construct the return object by calling LENGTH on each list of docs.
Any help or additional info would be greatly appreciated.
What about using the COLLECT operation? (see docs here)
FOR doc IN collection
COLLECT s = doc.state WITH COUNT INTO c
RETURN { state: s, count: c }
This would return something like:
[
{ state: 'A', count: 23 },
{ state: 'B', count: 2 },
{ state: 'C', count: 45 }
]
Would that accomplish what you are after?
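The query returns one row per state rather than the single object shown in the question. If that exact shape is needed, the rows can be folded into one object on the client. A minimal sketch, assuming the rows have already been fetched into a JavaScript array (toCountObject is a hypothetical name):

```javascript
// Rows in the shape returned by the COLLECT query above (sample values).
const rows = [
  { state: 'A', count: 23 },
  { state: 'B', count: 2 },
  { state: 'C', count: 45 },
];

// Fold [{ state, count }, ...] into { state: count, ... }.
function toCountObject(resultRows) {
  return Object.fromEntries(resultRows.map((row) => [row.state, row.count]));
}

console.log(toCountObject(rows)); // { A: 23, B: 2, C: 45 }
```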
I want to know whether it is possible to use the data of the document being updated when we run an update.
collection.update({_id: id},
{$set:
{'tmp.$.data': (function(){
return this.a + this.b})()}},
{multi:true});
In the $set operation, I tried to calculate a value with an IIFE and the 'this' keyword,
but I can't get the expected result, because the IIFE's scope lies outside the update's scope.
(a and b are fields in that object.)
That's what I want to do.
If we could work with the document's data while doing an update, I think it would be really useful for manipulating collections.
Does anyone have an idea about this?
Thanks in advance.
P.S. I updated this question to the JS version.
Whenever I run into strangeness around @ variables, I just make references outside of that closure. So from this:
collection.update (_id: id),
  ($set:
    'tmp.$.data': do ->
      return @a + @b),
  (multi:true)
To this:
a = @a
b = @b
collection.update(
  _id: id
  $set:
    'tmp.$.data': do ->
      return a + b
  multi:true
)
It seems like things can get goofed up when a function expects a function as an argument and gets something else. I'm also not sure whether that do in the tmp.$.data hash is really necessary, but I don't know Meteor well enough to say yes or no.
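One way to see why the original attempt cannot work: the IIFE runs once, before update is even called, so it never sees the document being updated. A common workaround is to read each document first and build the modifier from its own fields. The sketch below simulates that with an in-memory array; buildModifier, applyComputedSet and the field name sum are hypothetical, and a real collection would need a find followed by one update per document.

```javascript
// In-memory stand-in for a collection (sample data).
const collection = [
  { _id: 1, a: 1, b: 2 },
  { _id: 2, a: 10, b: 20 },
];

// Builds a $set-style modifier from the document itself, so `a` and `b`
// are read per document instead of from the caller's `this`.
function buildModifier(doc) {
  return { $set: { sum: doc.a + doc.b } };
}

// Applies the computed modifier to every document matching the predicate.
function applyComputedSet(docs, matches) {
  for (const doc of docs) {
    if (matches(doc)) Object.assign(doc, buildModifier(doc).$set);
  }
  return docs;
}

applyComputedSet(collection, () => true);
console.log(collection.map((d) => d.sum)); // [ 3, 30 ]
```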
I'm stumbling a bit with my CouchDB knowledge.
I have a database of content that is tagged with an array of tags and has a created date.
I want to create a view that pulls a limited number of newest stories tagged with a specific tag.
For example, the newest 6 stories tagged "Business."
Ran across this question, which seems to get me almost to where I need to go, but I'm missing one key element, which I think is how to craft the query string to sort by one key while searching by the other.
Here's my map function.
function(doc) {
if (doc.published == "yes" && doc.type == "news") {
for (var i = 0; i < doc.tags.length; i++) {
if (doc.tags[i]) {
emit([doc.created, doc.tags[i]], doc);
}
}
}
}
So how do I query that view for the newest documents (based on created) tagged "Business"?
The created attribute is in a sortable date format.
First, I would switch the order of your emit:
emit([doc.tags[i], doc.created]);
(Leave out doc as well; you can just add include_docs=true to get the entire document, and your view won't take up so much disk space in the process.)
Now you can query for all the stories tagged as "Business" by using the following query string:
startkey=["Business"]&endkey=["Business",{}]
You'll get all the documents with the tag business, and they'll be sorted by date.
This takes advantage of view collation, which is basically the set of rules governing how indexes are sorted and queried. For complex keys like this, the sorting is done for each item of the array in turn (i.e. the first element is sorted first, the second element second, etc.). This is why the order matters: you must always move from left to right when querying a view index.
If you want the 6 most recent, your querystring will need to change:
descending=true&limit=6&endkey=["Business"]&startkey=["Business",{}]
Note: you need to swap the startkey/endkey values because of how the descending parameter works. See the View reference page on the wiki for further explanation.
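The range queries above can be illustrated with a small in-memory model. This is a simplified sketch, not CouchDB's implementation: it covers only the type ordering (null, booleans, numbers, strings, arrays, objects) and element-wise array comparison, and it compares strings with plain JavaScript ordering where CouchDB uses ICU collation. The names collate and queryView and the sample rows are hypothetical.

```javascript
// Rank of a value's type in a simplified version of view collation.
function typeRank(v) {
  if (v === null) return 0;
  if (typeof v === 'boolean') return 1;
  if (typeof v === 'number') return 2;
  if (typeof v === 'string') return 3;
  if (Array.isArray(v)) return 4;
  return 5; // objects sort after everything else
}

// Compare two keys: by type first, then arrays element by element, left to right.
function collate(a, b) {
  const ra = typeRank(a);
  const rb = typeRank(b);
  if (ra !== rb) return ra - rb;
  if (ra === 4) {
    const n = Math.min(a.length, b.length);
    for (let i = 0; i < n; i++) {
      const c = collate(a[i], b[i]);
      if (c !== 0) return c;
    }
    return a.length - b.length; // a shorter prefix sorts first
  }
  if (ra === 5) return 0; // objects are not compared further in this sketch
  if (ra === 1) return Number(a) - Number(b); // false before true
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

// Sort rows by key, optionally reverse, keep the startkey..endkey range, apply limit.
function queryView(rows, { startkey, endkey, descending = false, limit = Infinity }) {
  const sorted = [...rows].sort((x, y) => collate(x.key, y.key));
  if (descending) sorted.reverse();
  const dir = descending ? -1 : 1;
  return sorted
    .filter((r) => dir * collate(r.key, startkey) >= 0 && dir * collate(r.key, endkey) <= 0)
    .slice(0, limit);
}

const rows = [
  { key: ['Business', '2011-01-03'], id: 'c' },
  { key: ['Sports', '2011-01-02'], id: 'b' },
  { key: ['Business', '2011-01-01'], id: 'a' },
  { key: ['Business', '2011-01-05'], id: 'd' },
];

const oldestFirst = queryView(rows, { startkey: ['Business'], endkey: ['Business', {}] });
const newestTwo = queryView(rows, {
  startkey: ['Business', {}],
  endkey: ['Business'],
  descending: true,
  limit: 2,
});
console.log(oldestFirst.map((r) => r.id)); // [ 'a', 'c', 'd' ]
console.log(newestTwo.map((r) => r.id)); // [ 'd', 'c' ]
```

The second query shows why the keys are swapped: with descending=true the index is walked from the high end, so the scan has to start at ["Business", {}] and stop at ["Business"].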
OK, I think I figured this out, but I'm not quite certain I fully understand it.
I found this story about complex keys and searching and sorting.
My map function looks like this:
function(doc) {
if (doc.published == "yes" && doc.type == "news") {
for (var i = 0; i < doc.tags.length; i++) {
if (doc.tags[i]) {
emit([doc.tags[i], doc.created], doc);
}
}
}
}
And to query and sort using it, the query looks like this.
http://localhost:5984/database/_design/story/_view/tagged?limit=10&startkey=["Business"]&endkey=["Business",{}]&descending=false
I'm getting the results I want, but I'm not entirely certain I understand it all.