Error : Writing in a non-empty collection - python-3.x

I am having trouble writing JSON to my Cosmos DB collection. I am able to read data, but writing fails.
I applied the version of the Cosmos DB connector described here:
https://docs.azuredatabricks.net/spark/latest/data-sources/azure/cosmosdb-connector.html
I tried different versions too, but the issue persists.
RawFilePath="/mnt/ADLS/Users/test/CosmosDB/testfile.json"
DFRead=spark.read.json(RawFilePath)
DFNew = DFRead.selectExpr("activity", "partition AS xfactor","response", "source", "type")
writeConfig = {
"Endpoint" : "{End Point}",
"Masterkey" : "{MasterKey}",
"Database" : "{DB Name}",
"Collection" : "{Connection name}",
"Upsert" : "true"
}
DFNew.write.format("com.microsoft.azure.cosmosdb.spark").options(**writeConfig).save()
I get the following error:
Error : java.lang.UnsupportedOperationException: Writing in a non-empty collection.
I expect the JSON to be written into the Cosmos DB collection, but I cannot resolve the non-empty collection error in Databricks. I would really appreciate your help.
Thank you.

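Try specifying the save mode. Without an explicit mode, Spark falls back to its default (fail if the target already exists), which would explain the error on a non-empty collection: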
DFNew.write.format("com.microsoft.azure.cosmosdb.spark").mode("append").options(**writeConfig).save()


how to create a new collection in $out in mongodb based on field names?

I am trying to create a new collection in $out whose name comes from a field of the document.
I tried the command below, but it didn't work:
{ $out: "$fieldName" }
The output is: MongoServerError: PlanExecutor error during aggregation :: caused by :: error with target namespace: Invalid collection name: $fieldName.
One of my documents looks like this:
{
"_id" : {
"age" : "24",
"gender" : "female"
},
"fieldName" : "engineer",
"name" : "",
"value" : NumberInt(1)
}
I don't think you can do this.
The $out stage specifies a single place to direct the output of the entire pipeline to. There is no guarantee that fieldName holds the same value in every document (and my guess is that you expect it doesn't). We can see in the documentation that the coll (and db) parameters take string values as input, rather than expressions which resolve to strings.
If I attempt to run the command in the comments, it fails with an error about the wrong type:
test> db.version()
6.0.1
test> db.foo.aggregate([{ $out: {db: "db_name", coll: { $getField: "x" }} }])
MongoServerError: wrong type for field (coll) object != string
Interestingly, the playground doesn't fail. But that appears to be because it is not actually evaluating or executing the $out stage: I can put a random parameter name into the stage and it still "works", whereas the server normally validates all of the parameters. So I think the playground is misleading here.
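Since $out only accepts a literal collection name, one possible workaround is to run a separate pipeline for each distinct fieldName value. The following is a minimal Python sketch that only builds the pipeline definitions. The helper name build_out_pipelines is hypothetical, it assumes every value is a valid collection name, and actually running each pipeline through a driver is left out:

```python
def build_out_pipelines(field_values, field="fieldName"):
    """Return {collection_name: pipeline}, one pipeline per distinct value.

    Hypothetical helper: each pipeline keeps the documents with one value
    and writes them to a collection named after that value.
    """
    pipelines = {}
    for value in sorted(set(field_values)):
        pipelines[value] = [
            {"$match": {field: value}},
            {"$out": value},  # a literal string, which is what $out accepts
        ]
    return pipelines


# Example values; in practice they would come from the source collection.
pipelines = build_out_pipelines(["engineer", "teacher", "engineer"])
for name, pipeline in pipelines.items():
    print(name, pipeline)
```

This costs one aggregation per distinct value, so it is only a reasonable fit when the number of distinct values is small.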

Just like an Upsert operation from Azure COSMOS SDK, do we have PATCH + INSERT operation

Existing document
{
"property1" : "value1",
"property2" : "value2",
"property3" : "value3",
}
Now, "property2" needs to be changed to "newValue2".
With UpsertItemAsync method,
Container.UpsertItemAsync<T>(T, Nullable<PartitionKey>, ItemRequestOptions, CancellationToken)
document is updated as
{
"property2" : "newValue2" //"property1" & "property3" are removed.
}
what I want is
{
"property1" : "value1",
"property2" : "newValue2",
"property3" : "value3",
}
PatchItemAsync works
Container.PatchItemAsync<T>(String, PartitionKey, IReadOnlyList<PatchOperation>, PatchItemRequestOptions, CancellationToken)
but it returns 404 NOT FOUND if the document doesn't already exist.
My question: is there a way I could do PATCH + INSERT?
Microsoft.Azure.Cosmos (3.23.0)
At least looking at the docs, it seems something similar to Upsert is not possible with the Patch endpoint.
And the Cosmos client didn't seem to support anything like that either.
If you compare it with the Create document operation, it mentions an x-ms-documentdb-is-upsert header that makes it an upsert if the document already exists.
You could patch the document and, if that fails because the document does not exist, create it.
This does use exceptions for control flow so I'm not super happy with it :\
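The patch-then-create fallback can be sketched as follows. This is an illustration in Python against an in-memory stand-in, not the Cosmos SDK: NotFoundError, patch_item, create_item and patch_or_create are all hypothetical names, and a real client would raise its own 404 exception type instead.

```python
class NotFoundError(Exception):
    """Stand-in for the 404 error a real client would raise (hypothetical)."""


store = {}  # in-memory substitute for the container


def patch_item(item_id, changes):
    """Update only the given properties; fail if the item is missing."""
    if item_id not in store:
        raise NotFoundError(item_id)
    store[item_id].update(changes)
    return store[item_id]


def create_item(item_id, document):
    """Insert a new item."""
    store[item_id] = dict(document)
    return store[item_id]


def patch_or_create(item_id, changes, default_document):
    """Try the patch first; create the item when it does not exist yet."""
    try:
        return patch_item(item_id, changes)
    except NotFoundError:
        # The created document is the defaults with the changes applied.
        return create_item(item_id, {**default_document, **changes})


defaults = {"property1": "value1", "property2": "value2", "property3": "value3"}
created = patch_or_create("doc1", {"property2": "newValue2"}, defaults)
patched = patch_or_create("doc1", {"property2": "newerValue2"}, {})
print(created["property1"], patched["property2"])
```

Against a real service the two calls are not atomic, so if two writers race, the create may fail because the document now exists; retrying the patch in that case is one way to handle it.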

How to insert an Object into MongoDB

So I have an object that uses a dictionary to store the products a user has added to the cart in a shopping cart application. I am taking this object and attempting to insert it into MongoDB with zero luck.
The piece of data I am attempting to insert looks like this:
products: '{"rJUg4uiGl":{"productPrice":"78.34","count":2},"BJ_7VOiGg":{"productPrice":"3","count":2}}' }
My process of attempting to insert it into mongoDB looks like this:
db.orders.insert("products":{"rJUg4uiGl":{"productPrice":"78.34","count":2},"BJ_7VOiGg":{"productPrice":"3","count":2}});
Currently with this approach I get the following error:
2016-12-15T18:11:43.862-0500 E QUERY [thread1] SyntaxError: missing ) after argument list #(shell):1:27
This implies there is some sort of formatting issue with the insert. I have moved quotation marks and parentheses around plenty of times, only to get either the above error or a ... response from MongoDB, implying that it is waiting for me to enter something more.
Is there any chance someone could give some guidance on the best way to store this object in MongoDB?
My true question should probably have been about the mongoose schema needed to store this data format. I hoped that learning how to insert it into MongoDB would be enough, but the way the data is being saved has me a bit confused. I know this is a bit of an awful question, but could I get some assistance with setting up my schema for this as well?
"products" : {
"rJUg4uiGl" : {
"productPrice" : "78.34",
"count" : 2
},
"BJ_7VOiGg" : {
"productPrice" : "3",
"count" : 2
}
}
This is what the data looks like when it is stored in mongo. I think what is confusing me is the "rJUg4uiGl" portion of the data. I am unsure how exactly that is supposed to look in a mongoose schema. Here are a few of my rather poor attempts:
products: {
productId: {
productPrice: Number,
count: Number
}
}
Above simply doesn't store anything in the database
products: {
productId: [{
productPrice: Number,
count: Number
}]
}
Above gives:
"products" : {
"productId" : [ ]
}
Again, I know that this is quite specific but any help at all would be extremely appreciated.
You need to wrap your insert data in {} so that it is passed as a single document:
db.orders.insert({"products":{"rJUg4uiGl":{"productPrice":"78.34","count":2},"BJ_7VOiGg":{"productPrice":"3","count":2}}});

Storing a complex Query within MongoDb Document [duplicate]

This is the case: a webshop in which I want to configure which items should be listed in the shop based on a set of parameters.
I want this to be configurable, because that allows me to experiment with different parameters and also change their values easily.
I have a Product collection that I want to query based on multiple parameters.
A couple of these are found here:
within product:
"delivery" : {
"maximum_delivery_days" : 30,
"average_delivery_days" : 10,
"source" : 1,
"filling_rate" : 85,
"stock" : 0
}
but also other parameters exist.
An example of such query to decide whether or not to include a product could be:
"$or" : [
{
"delivery.stock" : 1
},
{
"$or" : [
{
"$and" : [
{
"delivery.maximum_delivery_days" : {
"$lt" : 60
}
},
{
"delivery.filling_rate" : {
"$gt" : 90
}
}
]
},
{
"$and" : [
{
"delivery.maximum_delivery_days" : {
"$lt" : 40
}
},
{
"delivery.filling_rate" : {
"$gt" : 80
}
}
]
},
{
"$and" : [
{
"delivery.delivery_days" : {
"$lt" : 25
}
},
{
"delivery.filling_rate" : {
"$gt" : 70
}
}
]
}
]
}
]
Now to make this configurable, I need to be able to handle boolean logic, parameters and values.
So I got the idea, since such a query is itself JSON, to store it in Mongo and have my Java app retrieve it.
The next step is to use it as the filter (e.g. in find, or whatever) and work on the corresponding selection of products.
The advantage of this approach is that I can actually analyse the data and the effectiveness of the query outside of my program.
I would store it by name in the database. E.g.
{
"name": "query1",
"query": { the thing printed above starting with "$or"... }
}
using:
db.queries.insert({
"name" : "query1",
"query": { the thing printed above starting with "$or"... }
})
Which results in:
2016-03-27T14:43:37.265+0200 E QUERY Error: field names cannot start with $ [$or]
at Error (<anonymous>)
at DBCollection._validateForStorage (src/mongo/shell/collection.js:161:19)
at DBCollection._validateForStorage (src/mongo/shell/collection.js:165:18)
at insert (src/mongo/shell/bulk_api.js:646:20)
at DBCollection.insert (src/mongo/shell/collection.js:243:18)
at (shell):1:12 at src/mongo/shell/collection.js:161
But I CAN STORE it using Robomongo, but not always. Obviously I am doing something wrong. But I have NO IDEA what it is.
If it fails, and I create a brand new collection and try again, it succeeds. Weird stuff that goes beyond what I can comprehend.
But when I try updating values in the "query", changes are not going through. Never. Not even sometimes.
I can however create a new object and discard the previous one. So, the workaround is there.
db.queries.update(
{"name": "query1"},
{"$set": {
... update goes here ...
}
}
)
doing this results in:
WriteResult({
"nMatched" : 0,
"nUpserted" : 0,
"nModified" : 0,
"writeError" : {
"code" : 52,
"errmsg" : "The dollar ($) prefixed field '$or' in 'action.$or' is not valid for storage."
}
})
This seems pretty close to the other message above.
Needless to say, I am pretty clueless about what is going on here, so I hope some of the wizards here are able to shed some light on the matter.
I think the error message contains the important info you need to consider:
QUERY Error: field names cannot start with $
Since you are trying to store a query (or part of one) in a document, you'll end up with attribute names that contain mongo operator keywords (such as $or, $ne, $gt). The mongo documentation actually references this exact scenario - emphasis added
Field names cannot contain dots (i.e. .) or null characters, and they must not start with a dollar sign (i.e. $)...
I wouldn't trust 3rd party applications such as Robomongo in these instances. I suggest debugging/testing this issue directly in the mongo shell.
My suggestion would be to store an escaped version of the query in your document so that it does not interfere with reserved operator keywords. You can use JSON.stringify(my_obj) to encode your partial query into a string, and then decode it with JSON.parse(escaped_query_string_from_db) when you retrieve it later on.
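The same encode-then-decode round trip can be sketched in Python with the standard json module. The variable names are illustrative, and the filter is only a shortened part of the one from the question:

```python
import json

# Shortened filter from the question; the "$"-prefixed keys are what the
# server refuses to store as field names.
query = {
    "$or": [
        {"delivery.stock": 1},
        {
            "$and": [
                {"delivery.maximum_delivery_days": {"$lt": 60}},
                {"delivery.filling_rate": {"$gt": 90}},
            ]
        },
    ]
}

# Store the query as one opaque string instead of a nested document.
stored_document = {"name": "query1", "query": json.dumps(query)}

# Later, after loading the document again, turn the string back into a filter.
restored_query = json.loads(stored_document["query"])
print(restored_query == query)
```

The trade-off is that the stored query is just a string to the database, so individual conditions inside it cannot be matched or updated in place; the whole string has to be replaced.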
Your approach of storing the query as a JSON object in MongoDB is not viable.
You could potentially store your query logic and fields in MongoDB, but you have to have an external app build the query with the proper MongoDB syntax.
MongoDB queries contain operators, and some of those have special characters in them.
There are rules for MongoDB field names. These rules do not allow for special characters.
Look here: https://docs.mongodb.org/manual/reference/limits/#Restrictions-on-Field-Names
The probable reason you can sometimes successfully create the doc using Robomongo is that Robomongo is transforming your query into a string and properly escaping the special characters as it sends it to MongoDB.
This also explains why your attempt to update them never works. You tried to create a document, but instead created something that is a string object, so your update conditions are probably not retrieving any docs.
I see two problems with your approach.
In following query
db.queries.insert({
"name" : "query1",
"query": { the thing printed above starting with "$or"... }
})
Valid JSON expects key-value pairs. Here, in "query", you are storing an object without a key. You have two options: either store the query as text, or create another key inside the curly braces.
The second problem is that you are storing query values without wrapping them in quotes. All string values must be wrapped in quotes.
So your final document should appear as:
db.queries.insert({
"name" : "query1",
"query": 'the thing printed above starting with "$or"... '
})
Now try, it should work.
Obviously my attempt to store a query in mongo the way I did was foolish, as became clear from the answers from both #bigdatakid and #lix. So what I finally did was this: I altered the names of the fields to comply with the mongo requirements.
E.g. instead of $or I used _$or etc., and instead of using a . inside the name I used a #. I replace both of these in my Java code.
This way I can still easily try and test the queries outside of my program. In my Java program I just change the names and use the query. Using just 2 lines of code. It simply works now. Thanks guys for the suggestions you made.
String documentAsString = query.toJson().replaceAll("_\\$", "\\$").replaceAll("#", ".");
Object q = JSON.parse(documentAsString);
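A variation on the same renaming idea, sketched in Python: instead of replacing text across the whole JSON string, it walks the structure and renames keys only, so a # or _$ inside a value is left alone. The function names encode_key, decode_key and rename_keys are hypothetical, and the sketch assumes the original keys contain no # of their own.

```python
def encode_key(key):
    """Make a key storable: prefix a leading "$" with "_", swap "." for "#"."""
    if key.startswith("$"):
        key = "_" + key
    return key.replace(".", "#")


def decode_key(key):
    """Reverse encode_key to get the real operator or field path back."""
    if key.startswith("_$"):
        key = key[1:]
    return key.replace("#", ".")


def rename_keys(node, rename):
    """Apply rename to every dict key in a nested dict/list structure."""
    if isinstance(node, dict):
        return {rename(k): rename_keys(v, rename) for k, v in node.items()}
    if isinstance(node, list):
        return [rename_keys(item, rename) for item in node]
    return node  # scalars are returned unchanged


query = {"$or": [{"delivery.stock": 1},
                 {"delivery.filling_rate": {"$gt": 90}}]}
storable = rename_keys(query, encode_key)
restored = rename_keys(storable, decode_key)
print(storable)
print(restored == query)
```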

Marklogic|NodeJS API - Query on a specific category "properties"

I have a json document in my DB that looks like this :
{
"uri" : "/me/myself/and/bd1e0f91656bfc713eb6560eeaad7ad1.json",
"category" : "content",
"format" : "json",
"versionId" : "14697362595356370",
"contentType" : "application/json",
"contentLength" : "1938",
"collections" : ["http://me.myself.com/collectionA"],
"properties" : {
"relatives" : ["/me/myself/and/B.json", "/me/myself/and/A.json"]
},
"content":{}
}
I'm trying to get all documents that have a specific relative in the properties:
qb.where(
qb.scope(
qb.property('relatives'),
qb.word("/me/myself/and/B.json"),
qb.fragmentScope('properties')
))
But I keep getting a large set of documents that don't fit the query.
Any idea how to do this using the MarkLogic Node.js API?
I see two things that look like they might be problems. The first is qb.fragmentScope('properties'). This tells MarkLogic to look in the document's properties, rather than the document's content. That doesn't look like what you meant, given your sample JSON document.
The second problem is the word query -- "/me/myself/and/B.json" is likely being broken up into its constituent words (me, myself, and, B, json), which are then matching in other documents. You want to match exactly what's there, so try a value query:
qb.where(
qb.scope(
qb.properties('properties'),
qb.value('relatives', '/me/myself/and/B.json')
)
)
Note that the qb.scope and the qb.properties are to restrict the search to just match the value when it appears in relatives under a properties JSON property. This is different from the JSON property-versus-content point made above.
qb.where(
qb.propertiesFragment(
qb.term('/me/myself/and/B.json')
)
)
This worked for me.
