I am querying Azure DocumentDb (cosmos) for a document that is present in the container:
try
{
    doc = await client.ReadDocumentAsync(
        GetDocumentUri("tenantString-campaignId"),
        new RequestOptions
        {
            PartitionKey = new PartitionKey(tenantString)
        });
}
catch (Exception e)
{
    Console.WriteLine(e);
}
for this document:
tenantString-campaignId is the id you can see here, and tenantString alone is the partition key it sits under. The tenant was originally passed in as a string and that worked, but since changing to passing a Tenant object and parsing the required string from it, I am not getting the document back.
I have tried a few different variations of tenantString and id, and I can generate either a DocumentClientException ("Id does not exist") or a silent failure: no exception, and control returns to the calling method, where a NullReferenceException follows because no document was returned.
As far as I can make out from debugging through this I am constructing all my data correctly and yet no document is returned. Does anyone have any idea what I can try next?
This syntax is not correct for the .NET SDK v2. ReadDocumentAsync() should look like this:
var response = await client.ReadDocumentAsync(
    UriFactory.CreateDocumentUri(databaseName, collectionName, "SalesOrder1"),
    new RequestOptions { PartitionKey = new PartitionKey("Account1") });
You can see more v2 samples here
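If the document can legitimately be missing, it also helps to catch the not-found case explicitly instead of swallowing every exception as the snippet in the question does. A minimal sketch, assuming the same client, databaseName, collectionName and tenantString variables, plus a hypothetical campaignId variable (and using System.Net for HttpStatusCode):

try
{
    var response = await client.ReadDocumentAsync(
        UriFactory.CreateDocumentUri(databaseName, collectionName, $"{tenantString}-{campaignId}"),
        new RequestOptions { PartitionKey = new PartitionKey(tenantString) });
    var document = response.Resource;
    // ... use document ...
}
catch (DocumentClientException ex) when (ex.StatusCode == HttpStatusCode.NotFound)
{
    // The id / partition key pair did not match any document in the container.
    Console.WriteLine($"Document {tenantString}-{campaignId} not found in partition {tenantString}");
}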
I have created a console app to delete documents from CosmosDB based on a partition that we have set up.
using (var client = new DocumentClient(new Uri(Config["CosmosEndpoint"]), Config["CosmosAuthkey"]))
{
    var db = Options.Env.ToUpper().Equals("CI") ? Config["CiDatabase"] : Config["QaDatabase"];
    var tenantString = $"{Options.Tenant}-{Options.Language}";
    Log.Information($"Deleting {tenantString} from {db}-{Config["CosmosCollection"]}");

    var query = client.CreateDocumentQuery<Document>(
        UriFactory.CreateDocumentCollectionUri(db, Config["CosmosCollection"]),
        "SELECT * FROM c",
        new FeedOptions()
        {
            PartitionKey = new PartitionKey(tenantString)
        }
    ).ToList();

    Log.Information($"Found {query.Count} records for Tenant: {tenantString}");
    if (query.Count > 0)
    {
        foreach (var doc in query)
        {
            Log.Information($"Deleting document {doc.Id}");
            await client.DeleteDocumentAsync(doc.SelfLink,
                new RequestOptions { PartitionKey = new PartitionKey(tenantString) });
            Log.Information($"Deleted document {doc.Id}");
        }
    }
    else
    {
        Log.Information($"Tenant: {tenantString}, No Search Records Found");
    }
}
This never reaches the line Log.Information($"Deleted document {doc.Id}"); but it also does not seem to throw an exception. I have wrapped the call in a try/catch for DocumentClientException and ArgumentNullException in an attempt to see what is happening, but it just bombs on the delete call. It does, however, always delete exactly one document.
This tells me that my config must be correct, as I am connecting, querying and even deleting, just not for all documents in the query. Stranger still, I copied this code from another application I wrote earlier, where it works.
Is there an upper limit on connections, meaning I need to add a delay in my loop?
Why do I not see an exception when using a try/catch?
Or is there another reason that I only seem to be able to delete one document at a time with this code?
I have implemented an Azure Function that is triggered by an HttpRequest. A parameter called name is passed as part of the HttpRequest. In the Integration section, I have used the following query to retrieve data from CosmosDB (as an input):
SELECT * FROM c.my_collection pm
WHERE
Contains(pm.first_name,{name})
As you can see, I am sending the 'name' without sanitizing it. Is there any SQL injection concern here?
I searched and noticed that parameterization is available, but that is not something I can control here.
When the binding occurs (the data from the HTTP trigger gets sent to the Cosmos DB input binding), it is passed through a SqlParameterCollection that handles sanitization.
Please view this article:
Parameterized SQL provides robust handling and escaping of user input, preventing accidental exposure of data through “SQL injection”
This will cover any attempt to inject SQL through the name property.
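For reference, a sketch of roughly what the equivalent binding looks like in a C# class-library function using the Functions 3.x Cosmos DB extension; the database, collection and connection-setting names here are assumptions, and {name} comes from the HTTP route just as it does in the Integration section:

[FunctionName("GetPeopleByName")]
public static IActionResult Run(
    [HttpTrigger(AuthorizationLevel.Function, "get", Route = "people/{name}")] HttpRequest req,
    [CosmosDB(
        databaseName: "MyDatabase",
        collectionName: "my_collection",
        ConnectionStringSetting = "CosmosDBConnection",
        // {name} is bound from the route and passed as a SQL parameter, not concatenated into the query text.
        SqlQuery = "SELECT * FROM pm WHERE CONTAINS(pm.first_name, {name})")] IEnumerable<dynamic> matches,
    ILogger log)
{
    return new OkObjectResult(matches);
}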
If you're using Microsoft.Azure.Cosmos instead of Microsoft.Azure.Documents:
public class MyContainerDbService : IMyContainerDbService
{
    private Container _container;

    public MyContainerDbService(CosmosClient dbClient)
    {
        this._container = dbClient.GetContainer("MyDatabaseId", "MyContainerId");
    }

    public async Task<IEnumerable<MyEntry>> GetMyEntriesAsync(string queryString, Dictionary<string, object> parameters)
    {
        if ((parameters?.Count ?? 0) < 1)
        {
            throw new ArgumentException("Parameters are required to prevent SQL injection.");
        }

        var queryDef = new QueryDefinition(queryString);
        foreach (var parm in parameters)
        {
            queryDef.WithParameter(parm.Key, parm.Value);
        }

        var query = this._container.GetItemQueryIterator<MyEntry>(queryDef);

        List<MyEntry> results = new List<MyEntry>();
        while (query.HasMoreResults)
        {
            var response = await query.ReadNextAsync();
            results.AddRange(response.ToList());
        }
        return results;
    }
}
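A hypothetical call site, where the @name placeholder in the query text matches the key supplied in the dictionary (cosmosClient and userSuppliedName are assumed to exist in the calling code):

IMyContainerDbService dbService = new MyContainerDbService(cosmosClient);
var entries = await dbService.GetMyEntriesAsync(
    "SELECT * FROM c WHERE CONTAINS(c.first_name, @name)",
    new Dictionary<string, object> { { "@name", userSuppliedName } });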
Graph API Paging explains that the response would contain a field #odata.nextLink which would contain a skiptoken pointing to the next page of contents.
When I test the API, I'm getting a fully-qualified MS Graph URL which contains the skiptoken as a query param. E.g. Below is the value I got for the field #odata.nextLink in the response JSON.
https://graph.microsoft.com/v1.0/users?$top=25&$skiptoken=X%27445370740200001E3A757365723134406F33363561702E6F6E6D6963726F736F66742E636F6D29557365725F31363064343831382D343162382D343961372D383063642D653136636561303437343437001E3A7573657235407368616C696E692D746573742E31626F74322E696E666F29557365725F62666639356437612D333764632D343266652D386335632D373639616534303233396166B900000000000000000000%27
Is it safe to assume we'll always get the full URL and not just the skiptoken? Because if it's true, it helps avoid parsing the skiptoken and then concatenating it to the existing URL to form the full URL ourselves.
EDIT - Compared to MS Graph API, response obtained from Azure AD Graph API differs in that the JSON field #odata.nextLink contains only the skipToken and not the fully-qualified URL.
If you would like to have all users in a single list, you can achieve that using the code that follows:
public static async Task<IEnumerable<User>> GetUsersAsync()
{
    var graphClient = GetAuthenticatedClient();
    List<User> allUsers = new List<User>();

    var users = await graphClient.Users.Request().Top(998)
        .Select("displayName,mail,givenName,surname,id")
        .GetAsync();

    while (users.Count > 0)
    {
        allUsers.AddRange(users);
        if (users.NextPageRequest != null)
        {
            users = await users.NextPageRequest.GetAsync();
        }
        else
        {
            break;
        }
    }
    return allUsers;
}
I am using the Graph client library.
Yes. In Microsoft Graph you can assume that you'll always get the fully qualified URL for #odata.nextLink. You can simply use the next link to get the next page of results, and clients should treat the nextLink as opaque (this is described in both OData v4 and in the Microsoft REST API guidelines here: https://github.com/Microsoft/api-guidelines/blob/master/Guidelines.md#98-pagination).
This is different from the AAD Graph API (which is not OData v4), which doesn't return the fully qualified next link and so requires some more complicated manipulation to get the next page of results.
Hence Microsoft Graph should make this simpler for you.
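If you are not using the client library, the same rule applies when paging by hand: request whatever @odata.nextLink contains, verbatim, until it stops appearing. A rough sketch, assuming an HttpClient that already carries a valid bearer token:

using System.Text.Json;

static async Task<List<JsonElement>> GetAllUsersAsync(HttpClient http)
{
    var results = new List<JsonElement>();
    string url = "https://graph.microsoft.com/v1.0/users?$top=25";

    while (url != null)
    {
        using var doc = JsonDocument.Parse(await http.GetStringAsync(url));

        // Accumulate this page of results (Clone so the elements outlive the parsed document).
        foreach (var item in doc.RootElement.GetProperty("value").EnumerateArray())
            results.Add(item.Clone());

        // Treat @odata.nextLink as an opaque, fully qualified URL.
        url = doc.RootElement.TryGetProperty("@odata.nextLink", out var next)
            ? next.GetString()
            : null;
    }
    return results;
}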
Hope this helps,
The above code did not work for me without adding a call to 'CurrentPage' on the last line.
Sample taken from here.
var driveItems = new List<DriveItem>();
var driveItemsPage = await graphClient.Me.Drive.Root.Children.Request().GetAsync();
driveItems.AddRange(driveItemsPage.CurrentPage);

while (driveItemsPage.NextPageRequest != null)
{
    driveItemsPage = await driveItemsPage.NextPageRequest.GetAsync();
    driveItems.AddRange(driveItemsPage.CurrentPage);
}
I followed Tracy's answer and I was able to fetch all the messages in one go.
public List<Message> GetMessages()
{
    var messages = new List<Message>();

    var pages = Client.Users[_email]
        .Messages
        .Request(QueryOptions)
        // Fetch the emails with attachments directly instead of downloading them later.
        .Expand("attachments")
        .GetAsync()
        .Result;

    messages.AddRange(pages.CurrentPage);
    while (pages.NextPageRequest != null)
    {
        pages = pages.NextPageRequest.GetAsync().Result;
        messages.AddRange(pages.CurrentPage);
    }
    return messages;
}
I just started playing with Azure DocumentDB and my excitement has turned into confusion. This thing is weird. It seems like everything (databases, collections, documents) needs to be accessed not by its id, but by its 'SelfLink'. For example:
I create a database:
public void CreateDatabase()
{
    using (var client = new DocumentClient(new Uri(endpoint), authKey))
    {
        Database db = new Database()
        {
            Id = "TestDB",
        };
        client.CreateDatabaseAsync(db).Wait();
    }
}
Then later sometime I want to create a Collection:
public void CreateCollection()
{
    using (var client = new DocumentClient(new Uri(endpoint), authKey))
    {
        DocumentCollection collection = new DocumentCollection()
        {
            Id = "TestCollection",
        };
        client.CreateDocumentCollectionAsync(databaseLink: "???", documentCollection: collection).Wait();
    }
}
The API wants a 'databaseLink' when what I'd really prefer to give it is my database Id. I don't have the 'databaseLink' handy. Does DocumentDB really expect me to pull down a list of all databases and go searching through it for the databaseLink every time I want to do anything?
This problem goes all the way down. I can't save a document to a collection without having the collection's 'link'.
public void CreateDocument()
{
    using (var client = new DocumentClient(new Uri(endpoint), authKey))
    {
        client.CreateDocumentAsync(documentCollectionLink: "???", document: new { Name = "TestName" }).Wait();
    }
}
So to save a document I need the collection's link. To get the collections link I need the database link. To get the database link I have to pull down a list of all databases in my account and go sifting through it. Then I have to use that database link that I found to pull down a list of collections in that database that I then have to sift through looking for the link of the collection I want. This doesn't seem right.
Am I missing something? Am I not understanding how to use this? Why am I assigning ids to all my resources when DocumentDB insists on using its own link scheme to identify everything? My question is 'how do I access DocumentDB resources by their Id?'
The information posted in other answers from 2014 is now somewhat out of date. Direct addressing by Id is possible:
Although _selflinks still exist and can be used to access resources, Microsoft has since added a much simpler way to locate resources by their Ids that does not require you to retain the _selflink:
UriFactory
UriFactory.CreateDocumentCollectionUri(databaseId, collectionId);
UriFactory.CreateDocumentUri(databaseId, collectionId, "document id");
This enables you to create a safe Uri (allowing, for example, for whitespace) which is functionally identical to the resource's _selflink; the example given in the Microsoft announcement is shown below:
// Use UriFactory to build the DocumentLink
Uri docUri = UriFactory.CreateDocumentUri("SalesDb", "Catalog", "prd123");
// Use this constructed Uri to delete the document
await client.DeleteDocumentAsync(docUri);
The announcement, from August 13th 2015, can be found here:
https://azure.microsoft.com/en-us/blog/azure-documentdb-bids-fond-farewell-to-self-links/
I would recommend you look at the code samples here, in particular the DocumentDB.Samples.ServerSideScripts project.
In the Program.cs you will find the GetOrCreateDatabaseAsync method:
/// <summary>
/// Get or create a Database by id
/// </summary>
/// <param name="id">The id of the Database to search for, or create.</param>
/// <returns>The matched, or created, Database object</returns>
private static async Task<Database> GetOrCreateDatabaseAsync(string id)
{
    Database database = client.CreateDatabaseQuery()
        .Where(db => db.Id == id).ToArray().FirstOrDefault();
    if (database == null)
    {
        database = await client.CreateDatabaseAsync(
            new Database { Id = id });
    }
    return database;
}
To answer your question, you can use this method to find your database by its id, and other resources (collections, documents etc.) using their respective Create[ResourceType]Query() methods.
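For example, a sketch of the same pattern for collections, using the sample's static client together with UriFactory so you never need the database's self-link:

private static async Task<DocumentCollection> GetOrCreateCollectionAsync(string databaseId, string collectionId)
{
    DocumentCollection collection = client.CreateDocumentCollectionQuery(
            UriFactory.CreateDatabaseUri(databaseId))
        .Where(c => c.Id == collectionId)
        .ToArray()
        .FirstOrDefault();

    if (collection == null)
    {
        collection = await client.CreateDocumentCollectionAsync(
            UriFactory.CreateDatabaseUri(databaseId),
            new DocumentCollection { Id = collectionId });
    }
    return collection;
}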
Hope that helps.
The create database call returns the database object:
var database = client.CreateDatabaseAsync(new Database { Id = databaseName }).Result.Resource;
And then you can use that to create your collection
var spec = new DocumentCollection { Id = collectionName };
spec.IndexingPolicy.IndexingMode = IndexingMode.Consistent;
spec.IndexingPolicy.Automatic = true;
spec.IndexingPolicy.IncludedPaths.Add(new IndexingPath { IndexType = IndexType.Range, NumericPrecision = 6, Path = "/" });
var options = new RequestOptions
{
    ConsistencyLevel = ConsistencyLevel.Session
};
var collection = client.CreateDocumentCollectionAsync(database.SelfLink, spec, options).Result.Resource;
The client.Create... methods return the objects which have the self links you are looking for:
Database database = await client.CreateDatabaseAsync(
    new Database { Id = "Foo" });

DocumentCollection collection = await client.CreateDocumentCollectionAsync(
    database.SelfLink, new DocumentCollection { Id = "Bar" });

Document document = await client.CreateDocumentAsync(
    collection.SelfLink, new { property1 = "Hello World" });
To delete a document in a partitioned collection, use this format:
result = await client.DeleteDocumentAsync(selfLink, new RequestOptions
{
    PartitionKey = new PartitionKey(partitionKey)
});
I'm working with Windows Azure Table Storage and have a simple requirement: add a new row, overwriting any existing row with that PartitionKey/RowKey. However, saving the changes always throws an exception, even if I pass in the ReplaceOnUpdate option:
tableServiceContext.AddObject(TableName, entity);
tableServiceContext.SaveChangesWithRetries(SaveChangesOptions.ReplaceOnUpdate);
If the entity already exists it throws:
System.Data.Services.Client.DataServiceRequestException: An error occurred while processing this request. ---> System.Data.Services.Client.DataServiceClientException: <?xml version="1.0" encoding="utf-8" standalone="yes"?>
<error xmlns="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
<code>EntityAlreadyExists</code>
<message xml:lang="en-AU">The specified entity already exists.</message>
</error>
Do I really have to manually query for the existing row first and call DeleteObject on it? That seems very slow. Surely there is a better way?
As you've found, you can't just add another item that has the same row key and partition key, so you will need to run a query to check to see if the item already exists. In situations like this I find it helpful to look at the Azure REST API documentation to see what is available to the storage client library. You'll see that there are separate methods for inserting and updating. The ReplaceOnUpdate only has an effect when you're updating, not inserting.
While you could delete the existing item and then add the new one, you could just update the existing one (saving you one round trip to storage). Your code might look something like this:
var existsQuery = from e in tableServiceContext.CreateQuery<MyEntity>(TableName)
                  where e.PartitionKey == objectToUpsert.PartitionKey
                     && e.RowKey == objectToUpsert.RowKey
                  select e;

MyEntity existingObject = existsQuery.FirstOrDefault();

if (existingObject == null)
{
    tableServiceContext.AddObject(TableName, objectToUpsert);
}
else
{
    existingObject.Property1 = objectToUpsert.Property1;
    existingObject.Property2 = objectToUpsert.Property2;
    tableServiceContext.UpdateObject(existingObject);
}

tableServiceContext.SaveChangesWithRetries(SaveChangesOptions.ReplaceOnUpdate);
EDIT: While correct at the time of writing, with the September 2011 update Microsoft has updated the Azure Table API to include two upsert commands, Insert or Replace Entity and Insert or Merge Entity.
To operate on an existing object NOT managed by the TableContext, with either Delete or SaveChanges with the ReplaceOnUpdate option, you need to call AttachTo to attach the object to the TableContext, instead of calling AddObject, which instructs the TableContext to attempt to insert it.
http://msdn.microsoft.com/en-us/library/system.data.services.client.dataservicecontext.attachto.aspx
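A minimal sketch of that approach, assuming the same tableServiceContext and an entity whose PartitionKey and RowKey are already set; attaching with the wildcard "*" ETag makes the subsequent update unconditional:

// Attach the entity (not previously tracked by this context) with a wildcard ETag,
// then mark it as updated and push it with ReplaceOnUpdate.
tableServiceContext.AttachTo(TableName, entity, "*");
tableServiceContext.UpdateObject(entity);
tableServiceContext.SaveChangesWithRetries(SaveChangesOptions.ReplaceOnUpdate);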
In my case it was not possible to remove the existing entity first, so I do it like this. This results in one transaction to the server, which first removes the existing object and then adds the new one, removing the need to copy property values:
var existing = from e in _ServiceContext.AgentTable
               where e.PartitionKey == item.PartitionKey
                  && e.RowKey == item.RowKey
               select e;

_ServiceContext.IgnoreResourceNotFoundException = true;

var existingObject = existing.FirstOrDefault();
if (existingObject != null)
{
    _ServiceContext.DeleteObject(existingObject);
}

_ServiceContext.AddObject(AgentConfigTableServiceContext.AgetnConfigTableName, item);
_ServiceContext.SaveChangesWithRetries();
_ServiceContext.IgnoreResourceNotFoundException = false;
Insert or Merge and Insert or Replace were added to the API in September 2011. Here is an example using the Storage API 2.0, which is easier to understand than the way it is done in the 1.7 API and earlier.
public void InsertOrReplace(ITableEntity entity)
{
    retryPolicy.ExecuteAction(
        () =>
        {
            try
            {
                TableOperation operation = TableOperation.InsertOrReplace(entity);
                cloudTable.Execute(operation);
            }
            catch (StorageException e)
            {
                string message = "InsertOrReplace entity failed.";
                if (e.RequestInformation.HttpStatusCode == 404)
                {
                    message += " Make sure the table is created.";
                }
                // do something with message
            }
        });
}
The Storage API does not allow more than one operation per entity (delete+insert) in a group transaction:
An entity can appear only once in the transaction, and only one operation may be performed against it.
see MSDN: Performing Entity Group Transactions
So in fact you need to read first and decide on insert or update.
You may use the UpsertEntity and UpsertEntityAsync methods of the official Microsoft Azure.Data.Tables TableClient.
A fully working example is available at https://github.com/Azure-Samples/msdocs-azure-data-tables-sdk-dotnet/blob/main/2-completed-app/AzureTablesDemoApplicaton/Services/TablesService.cs:
public void UpsertTableEntity(WeatherInputModel model)
{
    TableEntity entity = new TableEntity();
    entity.PartitionKey = model.StationName;
    entity.RowKey = $"{model.ObservationDate} {model.ObservationTime}";

    // The other values are added like items to a dictionary
    entity["Temperature"] = model.Temperature;
    entity["Humidity"] = model.Humidity;
    entity["Barometer"] = model.Barometer;
    entity["WindDirection"] = model.WindDirection;
    entity["WindSpeed"] = model.WindSpeed;
    entity["Precipitation"] = model.Precipitation;

    _tableClient.UpsertEntity(entity);
}
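Note that UpsertEntity defaults to merge semantics; pass TableUpdateMode.Replace if you want the stored entity to be fully replaced. A short sketch, with the connection string and table name as placeholders:

using Azure.Data.Tables;

// Create the client once and reuse it; CreateIfNotExists avoids a 404 on first use.
var tableClient = new TableClient(connectionString, "WeatherData");
tableClient.CreateIfNotExists();

// Replace rather than merge, so properties missing from 'entity' are removed from the stored row.
tableClient.UpsertEntity(entity, TableUpdateMode.Replace);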