Is there a way to find out how many RUs were consumed when executing a query using the Cassandra API on Cosmos DB?
(My understanding is that the core SQL API returns this in an HTTP response header, but that obviously does not work with CQL as the wire protocol.)
The only way I know to get the request charge for specific CQL queries in Cosmos DB is to turn on diagnostic logging. Each query you run will then produce a diagnostic log entry like this:
{ "time": "2020-03-30T23:55:10.9579593Z", "resourceId": "/SUBSCRIPTIONS/<your_subscription_ID>/RESOURCEGROUPS/<your_resource_group>/PROVIDERS/MICROSOFT.DOCUMENTDB/DATABASEACCOUNTS/<your_database_account>", "category": "CassandraRequests", "operationName": "QuerySelect", "properties": {"activityId": "6b33771c-baec-408a-b305-3127c17465b6","opCode": "<empty>","errorCode": "-1","duration": "0.311900","requestCharge": "1.589237","databaseName": "system","collectionName": "local","retryCount": "<empty>","authorizationTokenType": "PrimaryMasterKey","address": "104.42.195.92","piiCommandText": "{"request":"SELECT key from system.local"}","userAgent": """"}}
For details on how to configure diagnostic logging in Cosmos DB, see Monitor Azure Cosmos DB data by using diagnostic settings in Azure.
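If you want to read those request charges back out of the logs programmatically rather than browsing them in the portal, a minimal Java sketch using the azure-monitor-query and azure-identity libraries might look like the one below. It assumes the diagnostics are routed to a Log Analytics workspace with resource-specific tables enabled (the CDBCassandraRequests table); if you use the legacy AzureDiagnostics table, the table and column names in the KQL will differ, so treat the query text and workspace ID as placeholders to adjust.

import com.azure.identity.DefaultAzureCredentialBuilder;
import com.azure.monitor.query.LogsQueryClient;
import com.azure.monitor.query.LogsQueryClientBuilder;
import com.azure.monitor.query.models.LogsQueryResult;
import com.azure.monitor.query.models.LogsTableRow;
import com.azure.monitor.query.models.QueryTimeInterval;

public class CassandraRequestCharges {
    public static void main(String[] args) {
        // Placeholder: the Log Analytics workspace that receives the Cosmos DB diagnostics.
        String workspaceId = "<log-analytics-workspace-id>";

        LogsQueryClient client = new LogsQueryClientBuilder()
                .credential(new DefaultAzureCredentialBuilder().build())
                .buildClient();

        // Resource-specific table for Cassandra API requests; adjust if you log to AzureDiagnostics instead.
        String query = "CDBCassandraRequests "
                + "| project TimeGenerated, OperationName, RequestCharge "
                + "| order by TimeGenerated desc";

        LogsQueryResult result = client.queryWorkspace(workspaceId, query, QueryTimeInterval.LAST_DAY);

        // Print each logged request with its RU charge.
        for (LogsTableRow row : result.getTable().getRows()) {
            row.getRow().forEach(cell -> System.out.print(cell.getValueAsString() + "\t"));
            System.out.println();
        }
    }
}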
Hope this is helpful.
As mentioned in the Microsoft documentation, there is support for increasing/decreasing the provisioned RUs of Cosmos containers using the Cosmos DB Java SDK, but when I try to perform the steps I get the error below:
com.azure.cosmos.CosmosException: {"innerErrorMessage":"\"Operation 'PUT' on resource 'offers' is not allowed through Azure Cosmos DB endpoint. Please switch on such operations for your account, or perform this operation through Azure Resource Manager, Azure Portal, Azure CLI or Azure Powershell\"\r\nActivityId: 86fcecc8-5938-46b1-857f-9d57b7, Microsoft.Azure.Documents.Common/2.14.0, StatusCode: Forbidden","cosmosDiagnostics":{"userAgent":"azsdk-java-cosmos/4.28.0 MacOSX/10.16 JRE/1.8.0_301","activityId":"86fcecc8-5938-46b1-857f-9d57b74c6ffe","requestLatencyInMs":89,"requestStartTimeUTC":"2022-07-28T05:34:40.471Z","requestEndTimeUTC":"2022-07-28T05:34:40.560Z","responseStatisticsList":[],"supplementalResponseStatisticsList":[],"addressResolutionStatistics":{},"regionsContacted":[],"retryContext":{"statusAndSubStatusCodes":null,"retryCount":0,"retryLatency":0},"metadataDiagnosticsContext":{"metadataDiagnosticList":null},"serializationDiagnosticsContext":{"serializationDiagnosticsList":null},"gatewayStatistics":{"sessionToken":null,"operationType":"Replace","resourceType":"Offer","statusCode":403,"subStatusCode":0,"requestCharge":"0.0","requestTimeline":[{"eventName":"connectionAcquired","startTimeUTC":"2022-07-28T05:34:40.472Z","durationInMicroSec":1000},{"eventName":"connectionConfigured","startTimeUTC":"2022-07-28T05:34:40.473Z","durationInMicroSec":0},{"eventName":"requestSent","startTimeUTC":"2022-07-28T05:34:40.473Z","durationInMicroSec":5000},{"eventName":"transitTime","startTimeUTC":"2022-07-28T05:34:40.478Z","durationInMicroSec":60000},{"eventName":"received","startTimeUTC":"2022-07-28T05:34:40.538Z","durationInMicroSec":1000}],"partitionKeyRangeId":null},"systemInformation":{"usedMemory":"71913 KB","availableMemory":"3656471 KB","systemCpuLoad":"empty","availableProcessors":8},"clientCfgs":{"id":1,"machineId":"uuid:248bb21a-d1eb-46a5-a29e-1a2f503d1162","connectionMode":"DIRECT","numberOfClients":1,"connCfg":{"rntbd":"(cto:PT5S, nrto:PT5S, icto:PT0S, ieto:PT1H, mcpe:130, mrpc:30, cer:false)","gw":"(cps:1000, nrto:PT1M, icto:PT1M, p:false)","other":"(ed: true, cs: false)"},"consistencyCfg":"(consistency: Session, mm: true, prgns: [])"}}}
at com.azure.cosmos.BridgeInternal.createCosmosException(BridgeInternal.java:486)
at com.azure.cosmos.implementation.RxGatewayStoreModel.validateOrThrow(RxGatewayStoreModel.java:440)
at com.azure.cosmos.implementation.RxGatewayStoreModel.lambda$toDocumentServiceResponse$0(RxGatewayStoreModel.java:347)
at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:106)
at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:74)
at reactor.core.publisher.FluxPeek$PeekSubscriber.onNext(FluxPeek.java:200)
at reactor.core.publisher.FluxHandle$HandleSubscriber.onNext(FluxHandle.java:119)
The message says to switch on such operations for your account, but I could not find any page to do that. Can I use Azure Functions to do the same thing at a specific time?
Code snippet:
// Read the container's current throughput settings
CosmosAsyncContainer container = client.getDatabase("DatabaseName").getContainer("ContainerName");
ThroughputProperties autoscaleContainerThroughput = container.readThroughput().block().getProperties();
// Replace the autoscale max throughput; this is the call that fails with the 403 above
container.replaceThroughput(ThroughputProperties.createAutoscaledThroughput(newAutoscaleMaxThroughput)).block();
This is because disableKeyBasedMetadataWriteAccess is set to true on the account. You will need to contact either your subscription owner or someone with the DocumentDB Account Contributor role to modify the throughput using PowerShell or the Azure CLI (both have samples linked in the docs). You can also do this by redeploying the ARM template or Bicep file used to create the account (be sure to do a GET on the resource first so you don't accidentally change something).
If you are looking for a way to automatically scale resources up and down on a schedule, please refer to this sample: Scale Azure Cosmos DB throughput by using Azure Functions Timer trigger.
To learn more about the disableKeyBasedMetadataWriteAccess property and its impact on control plane operations from the data plane SDKs, see Preventing changes from the Azure Cosmos DB SDKs.
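If you do go the Azure Functions route, a minimal sketch of a timer-triggered function in Java (roughly the pattern the linked sample uses, but with the Java SDK) could look like the following. The endpoint, key, names, schedule, and throughput value are placeholders, and it still changes throughput through the data plane SDK, so it will hit the same 403 unless key-based metadata writes are allowed on the account.

import com.azure.cosmos.CosmosClient;
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.cosmos.CosmosContainer;
import com.azure.cosmos.models.ThroughputProperties;
import com.microsoft.azure.functions.ExecutionContext;
import com.microsoft.azure.functions.annotation.FunctionName;
import com.microsoft.azure.functions.annotation.TimerTrigger;

public class ScaleCosmosThroughput {

    // Runs every day at 08:00 UTC; adjust the NCRONTAB expression to your schedule.
    @FunctionName("ScaleUpCosmosContainer")
    public void run(
            @TimerTrigger(name = "timerInfo", schedule = "0 0 8 * * *") String timerInfo,
            final ExecutionContext context) {

        // Endpoint and key come from app settings; database/container names are placeholders.
        CosmosClient client = new CosmosClientBuilder()
                .endpoint(System.getenv("COSMOS_ENDPOINT"))
                .key(System.getenv("COSMOS_KEY"))
                .buildClient();

        CosmosContainer container = client.getDatabase("DatabaseName").getContainer("ContainerName");

        // Replace the autoscale max throughput; this call is rejected with 403
        // while disableKeyBasedMetadataWriteAccess is true on the account.
        container.replaceThroughput(ThroughputProperties.createAutoscaledThroughput(8000));

        context.getLogger().info("Throughput updated by timer: " + timerInfo);
        client.close();
    }
}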
I am trying to build an Azure Stream Analytics job in VS Code using the Azure Stream Analytics Tools extension. I have added an Event Hub as an input and a Data Lake Storage Gen2 account as an output, and I can successfully run the job in VS Code using "Use Live Input and Live Output".
The issue I'm having is that when I try to set the output to an Azure Cosmos DB (DocumentDB) instead, I get the error "Failed to convert output 'cosmosdb' : Unsupported data source type.." when trying to use live input and output. I can, however, successfully run the job using "Live input and local output".
Is this a limitation of the VS Code extension, in that you can't debug live output against Cosmos DB? Or have I set something up incorrectly in my Cosmos DB output? See the Cosmos DB output configuration below:
{
  "Name": "cosmosdb",
  "DataSourceType": "DocumentDB",
  "DocumentDbProperties": {
    "AccountId": "cosmosdb-dev-eastau-001",
    "AccountKey": null,
    "Database": "cosmosdb_db",
    "ContainerName": "container1",
    "DocumentId": ""
  },
  "DataSourceCredentialDomain": "xxxxxxxxxxxxxxxxxxxxxxxxxxxx.StreamAnalystics",
  "ScriptType": "Output"
}
For Live Input to Live Output mode, the only supported output adapters (for now) are Event Hub, Storage Account, and Azure SQL. https://learn.microsoft.com/en-us/azure/stream-analytics/visual-studio-code-local-run-all#local-run-modes
I have set up an external metrics server in AKS (Azure Kubernetes Service). I can see the metric when querying the external metrics API server:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/queuemessages" | jq .
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/queuemessages"
  },
  "items": [
    {
      "metricName": "queuemessages",
      "metricLabels": null,
      "timestamp": "2020-04-09T14:04:08Z",
      "value": "0"
    }
  ]
}
How can I delete this metric from the external metrics server?
It looks like you are interested in the Service Bus queue metrics.
I found this issue, which is still open, about a long delay before the queue messages metric gets populated:
https://github.com/Azure/azure-k8s-metrics-adapter/issues/63
The way custom-metric adapters work is that they query metrics from external services and make them available over a custom API on the Kubernetes API server using an APIService resource.
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-metrics-apis
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/#autoscaling-on-metrics-not-related-to-kubernetes-objects
The adapter implements a query against the external service (Service Bus in your case).
Based on the spec, the get-metrics call should never fail, so receiving a null could be because you don't have a valid connection or because there are no metrics available yet.
https://github.com/kubernetes-sigs/custom-metrics-apiserver/blob/master/docs/getting-started.md#writing-a-provider
First, there's a method for listing all metrics available at any point in time. It's used to populate the discovery information in the API, so that clients can know what metrics are available. It's not allowed to fail (it doesn't return any error), and it should return quickly, so it's suggested that you update it asynchronously in real-world code.
Could you explain why you want to delete the metric? In the end, I don't think it is possible, since the adapter is only there to fetch and report.
I am using an Azure Logic App with an Azure Blob Storage trigger.
When a blob is created or modified in Azure Storage, I pull the content of that blob, do some transformations on the data, and push it back to Azure Storage as new blob content using the Logic App's "Create content" Azure Blob Storage action.
With a large number of blobs inserted or updated in Blob Storage (for example 10,000 files), the Logic App gets triggered with multiple runs as expected, but the subsequent Azure Blob actions fail with the following error:
{
  "statusCode": 429,
  "message": "Rate limit is exceeded. Try again in 16 seconds."
}
Has anyone faced a similar issue in Logic Apps? If so, can you suggest the possible reason and a probable fix?
Thanks
It seems like you are hitting the rate limits on the Azure Blob managed API connection.
Please refer to Jörgen Bergström's blog about this: http://techstuff.bergstrom.nu/429-rate-limit-exceeded-in-logic-apps/
Essentially he says you can set up multiple API connections that do the same thing and then randomize which connection is used in the Logic App code view, which spreads the calls out and avoids exceeding the rate limit.
As an example (I was using SQL connectors), see below the API connections I set up for my Logic App. You can do the same with a Blob Storage connection and use a similar naming convention, e.g. blob_1, blob_2, blob_3, and so on. You can create as many as you would like; I created 10 for mine.
In your Logic App code view you would then replace all of your current blob connection references, e.g.
@parameters('$connections')['blob']['connectionId']
Where "blob" is your current blob api connection with the following:
@parameters('$connections')[concat('blob_', rand(1, 11))]['connectionId']
(Note that rand's upper bound is exclusive, so rand(1, 11) picks a connection from blob_1 through blob_10.)
And then make sure to add all your "blob_" connections at the end of your code:
"blob_1": {
"connectionId": "/subscriptions/.../resourceGroups/.../providers/Microsoft.Web/connections/blob-1",
"connectionName": "blob-1",
"id": "/subscriptions/.../providers/Microsoft.Web/locations/.../managedApis/blob"
},
"blob_2": {
"connectionId": "/subscriptions/.../resourceGroups/.../providers/Microsoft.Web/connections/blob-2",
"connectionName": "blob-2",
"id": "/subscriptions/.../providers/Microsoft.Web/locations/.../managedApis/blob"
},
...
The Logic App will then pick one of the connections at random during each run, eliminating the 429 rate limit error.
Please check this doc: https://learn.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-request-limits
For each Azure subscription and tenant, Resource Manager allows up to 12,000 read requests per hour and 1,200 write requests per hour.
You can check the remaining quota with:
response.Headers.GetValues("x-ms-ratelimit-remaining-subscription-reads").GetValue(0)
or
response.Headers.GetValues("x-ms-ratelimit-remaining-subscription-writes").GetValue(0)
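If you are calling the Resource Manager REST API from Java rather than .NET, a rough equivalent that inspects the same headers could look like the sketch below; the URL and bearer token are placeholders, and ARM returns the read counter on read responses and the write counter on write (PUT/PATCH/POST/DELETE) responses.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RateLimitCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder ARM GET call; supply your own subscription ID and access token.
        String url = "https://management.azure.com/subscriptions/<subscription-id>/resourceGroups?api-version=2021-04-01";
        String bearerToken = "<access-token>";

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Bearer " + bearerToken)
                .GET()
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // Remaining read quota for the subscription; the corresponding
        // x-ms-ratelimit-remaining-subscription-writes header shows up on write operations.
        System.out.println("Remaining reads: "
                + response.headers().firstValue("x-ms-ratelimit-remaining-subscription-reads").orElse("n/a"));
    }
}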
How can you fetch data from an HTTP REST endpoint as an input for a Data Factory?
My use case is to fetch new data hourly via an HTTP GET against a REST endpoint and upsert it into a DocumentDB database in Azure.
Can you just create a linked service like this and put in the REST endpoint?
{
  "name": "OnPremisesFileServerLinkedService",
  "properties": {
    "type": "OnPremisesFileServer",
    "description": "",
    "typeProperties": {
      "host": "<host name which can be either UNC name e.g. \\\\server or localhost for the same machine hosting the gateway>",
      "gatewayName": "<name of the gateway that will be used to connect to the shared folder or localhost>",
      "userId": "<domain user name e.g. domain\\user>",
      "password": "<domain password>"
    }
  }
}
And what kind of component do I add to create the data transformation job? I see that there are a bunch of options like HDInsight, Data Lake, and Batch, but I am not sure what the differences are, or which service would be appropriate to simply upsert the new data set into Azure DocumentDB.
I think the simplest way would be to use Azure Logic Apps.
You can make a call to any RESTful service using the HTTP connector among the Azure Logic Apps connectors.
So you can do GET and POST/PUT calls, etc., in a flow based on a schedule or on some other listener:
Here is the documentation for it:
https://azure.microsoft.com/en-us/documentation/articles/app-service-logic-connector-http/
To do this with Azure Data Factory, you will need to utilize Custom Activities.
Similar question here:
Using Azure Data Factory to get data from a REST API
If Azure Data Factory is not an absolute requirement, Aram's suggestion of utilizing Logic Apps might serve you better.
Hope that helps.
This can be achieved with Data Factory. It is especially good if you want to run batches on a schedule and have a single place for monitoring and management. There is sample code in our GitHub repo for an HTTP loader to blob here: https://github.com/Azure/Azure-DataFactory. Then the act of moving data from the blob to DocumentDB will do the insert for you using our DocumentDB connector. There is a sample on how to use this connector here: https://azure.microsoft.com/en-us/documentation/articles/data-factory-azure-documentdb-connector/. Here are the brief steps you will take to fulfill your scenario:
Create a custom .NET activity to get your data to blob.
Create a linked service of type DocumentDb.
Create linked service of type AzureStorage.
Use input dataset of type AzureBlob.
Use output dataset of type DocumentDbCollection.
Create and schedule a pipeline that includes your custom activity and a Copy Activity that uses BlobSource and DocumentDbCollectionSink; schedule the activities to the required frequency and availability of the datasets.
Aside from that, choosing where to run your transforms (HDInsight, Data Lake, Batch) will depend on your I/O and performance requirements. You can choose to run your custom activity on Azure Batch or HDInsight in this case.