Synchronous paging against Azure Table Storage - azure

I am new to working with Azure Table Storage, but I have been able to put together the code below, which lets my query accept a filterQuery (string) specified by the user, for example (Amount le 5000.00), and retrieves all rows (entities) matching the criteria.
Dim sBuilder As New System.Text.StringBuilder
Dim query = MyBase.CreateQuery(Of cData)("CustomerData")
Dim userQuery = String.Format("(PartitionKey eq '{0}' and {1})", AppID, filterQuery)
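' Append rather than AppendFormat: the filter is already formatted, and braces in user input would be misread as format placeholders
sBuilder.Append(userQuery)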
query = query.AddQueryOption("$filter", sBuilder.ToString).AsTableServiceQuery().Take(50)
Dim results As List(Of cData) = query.Execute.ToList
I should point out that letting the user specify the filter string is key for me, since I am using a generic class with a dictionary inside it so that my caller can pass in any number of elements to store in a given entity. This approach lets the user decide how the query should search, and my code does not have to 'know' anything about his custom fields.
Now I need to add pagination. My understanding is that the 'Execute' method I am using handles the pagination for you so if there are 7,000 records matching the criteria, my code will sit until all the entries are retrieved/returned. However, I want to instead allow my user to specify how many entities he wants returned at a time (max results) and allow him to then make subsequent calls using continuation tokens to get the next 'batch' of matching entities.
Any thoughts on how I can achieve this without losing my ability to allow the user to specify his search criteria in a simple string?

I think you can just do query.EndExecuteSegmented(query.BeginExecuteSegmented(...))
Check out the code for SmarxToDo: http://blog.smarx.com/posts/todo-list-app-using-asp-net-mvc-and-windows-azure-tables
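To make the segmented idea concrete, here is a minimal Python sketch of the paging pattern itself. It is not the Azure SDK: ROWS, fetch_segment and the integer token are hypothetical stand-ins, and the real service hands back opaque continuation values rather than an offset. The shape of the loop is the point: ask for at most max_results entities, return them with a token, and let the caller pass the token back to get the next batch.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, object]

# Hypothetical in-memory rows standing in for the CustomerData table.
ROWS: List[Row] = [
    {"PartitionKey": "app1", "RowKey": str(i), "Amount": i * 1000.0}
    for i in range(12)
]


def fetch_segment(
    matches: Callable[[Row], bool],
    max_results: int,
    token: Optional[int] = None,
) -> Tuple[List[Row], Optional[int]]:
    """Return one batch of matching rows and the token for the next batch.

    The token is None once there is nothing left to fetch.
    """
    start = 0 if token is None else token
    selected = [row for row in ROWS if matches(row)]
    page = selected[start:start + max_results]
    next_start = start + len(page)
    next_token = next_start if next_start < len(selected) else None
    return page, next_token


def amount_at_most_5000(row: Row) -> bool:
    # Stand-in for the user-supplied filter "(Amount le 5000.00)".
    return row["Amount"] <= 5000.0


if __name__ == "__main__":
    token = None
    while True:
        page, token = fetch_segment(amount_at_most_5000, 4, token)
        print([row["RowKey"] for row in page])
        if token is None:
            break
```

The caller keeps the filter as a plain string or predicate and only has to carry the token between calls, so the user-driven search criteria are not affected by paging.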

You may want to implement your query using the REST API (http://msdn.microsoft.com/en-us/library/dd179421.aspx). You will get an XML response back from the storage service, which you can parse to create the collection of objects.

Related

How to check for inequality between fields in same document in Azure Cognitive search?

We have an index set up in Azure cognitive search that has two string fields (hash1 & hash2) containing separate hashes. We would like to query the index for documents where the two hashes within a document aren't equal.
I tried applying the filter: $filter=hash1 ne hash2, expecting the query to return all documents with mismatched hashes. Instead, I was greeted with the following error message:
"Invalid expression: Comparison must be between a field, range variable or function call and a literal value.\r\nParameter name: $filter"
From what I can gather there seems to be some kind of limitation preventing comparisons between fields. Would it be possible to perform this type of query in Azure cognitive search using a different technique?
I would use content enrichment in this case. Even if comparing two hashes with a query was supported, it would be inefficient compared to pre-calculating the value using a content enrichment technique.
Introduce a new boolean property called something like HasEqualHashes
Populate that property with an appropriate boolean value
Use a $filter to filter your content as you wish
search=whatever&$filter=HasEqualHashes
Note that two different scenarios determine how you can enrich your content.
CONTENT SUBMITTED VIA SDK
When you use the SDK to submit content, you can enrich your items any way you want using regular code. Populating your HasEqualHashes property is a trivial one-liner in C#.
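As a rough illustration of that enrichment step, here is the same idea in Python rather than C#. The enrich function is a hypothetical helper, and the documents are plain dictionaries; uploading them to the index is left out.

```python
from typing import Any, Dict


def enrich(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the document with the precomputed HasEqualHashes flag."""
    enriched = dict(doc)
    enriched["HasEqualHashes"] = doc["hash1"] == doc["hash2"]
    return enriched


if __name__ == "__main__":
    print(enrich({"id": "1", "hash1": "abc", "hash2": "abd"}))
```

With the flag stored in the index, the mismatched documents from the question should be reachable with a filter such as $filter=HasEqualHashes eq false.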
CONTENT SUBMITTED USING BUILT-IN INDEXERS
If you use one of the built-in indexers, you have to learn and understand the concept of skillsets.
https://learn.microsoft.com/en-us/azure/search/cognitive-search-working-with-skillsets

How can I retrieve the id of a document I added to a Cosmosdb collection?

I have a single collection into which I am inserting documents of different types. I use the type parameter to distinguish between different datatypes in the collection. When I am inserting a document, I have created an Id field for every document, but Cosmosdb has a built-in id field.
How can I insert a new document and retrieve the id of the created Document all in one query?
The CreateDocumentAsync method returns the created document so you should be able to get the document id.
Document created = await client.CreateDocumentAsync(collectionLink, order);
I think you just need to call the .getResource() method to get the created document object.
Please refer to the Java code:
DocumentClient documentClient = new DocumentClient(END_POINT,
MASTER_KEY, ConnectionPolicy.GetDefault(),
ConsistencyLevel.Session);
Document document = new Document();
document.set("name","aaa");
document = documentClient.createDocument("dbs/db/colls/coll",document,null,false).getResource();
System.out.println(document.toString());
//then do your business logic with the document.....
C# code:
Parent p = new Parent
{
FamilyName = "Andersen.1",
FirstName = "Andersen",
};
Document doc = client.CreateDocumentAsync("dbs/db/colls/coll",p,null).Result.Resource;
Console.WriteLine(doc);
Hope it helps you.
Sure, you could always fetch the id from the creation method's response in your favorite API, as already shown in other answers. You may have reasons for delegating key assignment to DocumentDB, but to be frank, I don't see any good ones.
If the inserted document has no id set, DocumentDB generates a GUID for you. There is no notable difference compared to simply generating a new GUID yourself and assigning it to the id field before saving. Assigning the identity yourself lets you simplify your code a bit and lets you use the identity not only after persisting but also BEFORE, which could simplify a lot of scenarios you have now or may run into in the future.
Also, note that you don't have to use GUIDs as the id; you can use any unique value you already have. Since you mentioned you have an Id field (which, by its name, I assume to be a primary key), you should consider reusing it instead of introducing another set of keys.
A self-assigned non-GUID key is usually a better choice, since it can be designed to match your data and application needs better than a GUID. For example, in addition to being unique, it may also be a natural key, narrower, human-readable, ordered, etc.
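A minimal Python sketch of assigning the id on the client before saving. The with_id helper and the field names are hypothetical, and the actual save call is omitted; the only assumption about the store is that id is a string.

```python
import uuid
from typing import Any, Dict, Optional


def with_id(doc: Dict[str, Any], key_field: Optional[str] = None) -> Dict[str, Any]:
    """Return a copy of the document whose id is known before it is saved.

    If key_field is given, the existing unique value is reused as the id;
    otherwise a new GUID is generated unless the document already has an id.
    """
    out = dict(doc)
    if key_field is not None:
        out["id"] = str(doc[key_field])
    else:
        out.setdefault("id", str(uuid.uuid4()))
    return out


if __name__ == "__main__":
    order = with_id({"type": "order", "Id": 42}, key_field="Id")
    print(order["id"])  # usable before the document is persisted
```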

How to check for duplication before creating a new document in CouchDB/Cloudant?

We want to check whether a document with the same fields and values as the new object we are trying to save already exists in the database, to prevent duplicate items.
Note: This question is not about updating documents or about duplicated document IDs, we only check the data to prevent saving a new document with the same data of an existing one.
Preferably we'd like to accomplish this with Mango/Cloudant queries and not rely on views.
The idea so far is:
1) Scan the data that we are trying to save and dynamically create a selector that matches that document's structure. (We can't hardcode the selectors because we have many types of documents.)
2) Query the DB for any documents matching that selector to see whether a document matching those criteria already exists.
However, I wonder about the performance of this approach, since many of the selector fields will not be indexed.
I would also much rather follow best practices than create something out of the blue, but I haven't been able to find any known solutions for this specific scenario.
If you happen to know of any, please share.
Option 1 - Define a meaningful ID for your documents
The ID could be a logical composition or a computed hash of the values that should be unique.
If you want to check if a document ID already exists you can use the HEAD method
HEAD /db/docId
which returns 200 OK if the docId exists in the database.
If you would like to check whether the new document has the same content as the previous one, you may use the Validate Document Update Function, which lets you compare both documents.
function(newDoc, oldDoc, userCtx, secObj) {
...
}
Option 2 - Use content hash computed outside CouchDB
Before creating or updating a document, compute a hash from the values of the attributes that should be unique.
Include the hash in the document in a new attribute, e.g. "key_hash".
Create a Mango index on the "key_hash" attribute.
When a new doc is about to be inserted, compute its hash and use a Mango expression to look for documents with the same hash value before inserting the doc.
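The steps of Option 2 can be sketched roughly as follows. This is only an illustration: key_hash, with_key_hash and duplicate_query are hypothetical helper names, SHA-256 over canonical JSON is one possible hashing choice, and sending the query body to the database's _find endpoint is left out.

```python
import hashlib
import json
from typing import Any, Dict, Iterable


def key_hash(doc: Dict[str, Any], unique_fields: Iterable[str]) -> str:
    """Hash only the fields that must be unique, in a canonical order."""
    subset = {field: doc.get(field) for field in sorted(unique_fields)}
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def with_key_hash(doc: Dict[str, Any], unique_fields: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of the document carrying its hash in "key_hash"."""
    out = dict(doc)
    out["key_hash"] = key_hash(doc, unique_fields)
    return out


def duplicate_query(doc: Dict[str, Any], unique_fields: Iterable[str]) -> Dict[str, Any]:
    """Build a Mango query body that finds an existing doc with the same hash."""
    return {"selector": {"key_hash": key_hash(doc, unique_fields)}, "limit": 1}


if __name__ == "__main__":
    new_doc = {"type": "customer", "email": "a@example.com", "note": "ignored"}
    print(json.dumps(duplicate_query(new_doc, ["type", "email"])))
```

Because the selector only touches the indexed "key_hash" attribute, the lookup does not depend on the other, unindexed fields of the document.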
Option 3 - Compute hash in a View
Define a view that emits the computed hash of each document as the key.
CouchDB's JavaScript support does not include hashing functions, so this could be difficult to include in a design document.
Use Erlang to define the map function, where you have access to Erlang's support for hashing.
Before creating a new document, compute its hash and query the view using that hash.
One solution would be to take Juanjo's and Alexis's comment one step further.
Select the keys you wish to keep unique
Put the values in a string and generate a hash
Set the document's _id to that hash
PUT the document on the database.
Check the response for failure
If another document already exists on the database with the same _id value, the PUT request will fail.
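A small Python sketch of that flow, using an in-memory stand-in instead of a real database. FakeDb and doc_with_hashed_id are hypothetical; the stand-in only mimics the relevant behaviour of answering 201 for a new _id and 409 when the _id already exists.

```python
import hashlib
import json
from typing import Any, Dict, Iterable


def doc_with_hashed_id(doc: Dict[str, Any], unique_keys: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of the document whose _id is a hash of the unique values."""
    values = json.dumps([doc.get(key) for key in sorted(unique_keys)])
    out = dict(doc)
    out["_id"] = hashlib.sha256(values.encode("utf-8")).hexdigest()
    return out


class FakeDb:
    """In-memory stand-in for the database, for illustration only."""

    def __init__(self) -> None:
        self.docs: Dict[str, Dict[str, Any]] = {}

    def put(self, doc: Dict[str, Any]) -> int:
        # Mimic a conflict response when the _id is already taken.
        if doc["_id"] in self.docs:
            return 409
        self.docs[doc["_id"]] = doc
        return 201


if __name__ == "__main__":
    db = FakeDb()
    first = doc_with_hashed_id({"sku": "A1", "color": "red"}, ["sku", "color"])
    again = doc_with_hashed_id({"color": "red", "sku": "A1"}, ["sku", "color"])
    print(db.put(first), db.put(again))
```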

Can I create azure cosmos db document with a custom key other than Id

I am using Azure Cosmos DB for saving and editing my session information. Currently I am not using the id in my documents; instead, every document has another unique field. How can I update my query to get documents?
You can use whatever property you want, for your custom key (just make sure you don't remove its index). By default, all properties are indexed unless you explicitly set up a custom index policy that removes certain properties from being indexed.
You cannot eliminate the built-in id property though; if you don't set it explicitly, it will just be set to a guid.
If you are doing queries, this really shouldn't matter, functionality-wise. Just search on whatever properties you want. However: If you are doing point-reads (a read is more efficient, RU-wise, than a query, when retrieving a single document) you can only perform a point-read by specifying the id property, not your custom property. If you must use a custom property and you need to do point-reads, you can consider storing your custom property as id as well (as long as it's guaranteed to be unique per document).

How to select only needed fields of objects?

I am using the Pimcore API to fetch objects.
$myObjects = new Object\MyObject\Listing();
$myObjects->load();
$myObjects->getObjects();
Works as expected. Now I want to select only a specific field of my objects, e.g. the name field.
How can I tell Pimcore to select only the fields that I want? Is it even possible through the API, or do I need to use custom SQL? If so, how can I do that?
Best regards
A Pimcore listing always returns the complete set of objects matching your listing condition.
If you want a fast and easy way to select only one field of your objects, I recommend using the Pimcore db class:
$db = \Pimcore\Db::get();
$fieldsArray = $db->fetchCol("SELECT `YOUR_FIELD` FROM `object_query_CLASS-ID`");
This will return an array with all 'YOUR_FIELD' values from the object query table of your class.
To get the class ID for your query dynamically, you should use:
$classId = \Pimcore\Model\Object\MyObject::classId();
Edit:
To get more than one field column, you need to use 'fetchAll' instead of 'fetchCol':
$fieldsArray = $db->fetchAll("SELECT `YOUR_FIELD`, `YOUR_FIELD_2` FROM `object_query_CLASS-ID`");

Resources