I am able to use the JSON below through Postman to run my Databricks notebook.
I want to be able to give a name to the cluster that is created through the "new_cluster" options.
Is there any such option available?
{
"tasks": [
{
"task_key": "Job_Run_Api",
"description": "To see how the run and trigger api works",
"new_cluster": {
"spark_version": "9.0.x-scala2.12",
"node_type_id": "Standard_E8as_v4",
"num_workers": "1",
"custom_tags": {
"Workload": "Job Run Api"
}
},
"libraries": [
{
"maven": {
"coordinates": "net.sourceforge.jtds:jtds:1.3.1"
}
}
],
"notebook_task": {
"notebook_path": "/Shared/POC/Job_Run_Api_POC",
"base_parameters": {
"name": "Junaid Khan"
}
},
"timeout_seconds": 2100,
"max_retries": 0
}
],
"job_clusters": null,
"run_name": "RUN_API_TEST",
"timeout_seconds": 2100
}
When the above API call is made, the cluster that gets created has a name like "job-5975-run-2", which is not very descriptive.
I tried adding a "cluster_name" field inside the "new_cluster" object, but I got an error saying I can't do that, like this:
{
"error_code": "INVALID_PARAMETER_VALUE",
"message": "Cluster name should not be provided for jobs."
}
Appreciate any help here.
Cluster names for jobs are automatically generated and can't be changed. If you want to track specific jobs somehow, use tags (one way to filter on them is sketched below).
P.S. If you want more "advanced" tracking capability, look into the Overwatch project.
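For example, since the payload above already carries a custom tag ("Workload": "Job Run Api"), you can find the clusters a run spawned by listing clusters and filtering on that tag client-side. A minimal PowerShell sketch, assuming a workspace URL and a personal access token of your own (both are placeholders here, not values from the question):
# List clusters and keep only those carrying the custom tag set in the run payload
$workspaceUrl = "https://<databricks-instance>"                  # placeholder workspace URL
$headers = @{ Authorization = "Bearer $env:DATABRICKS_TOKEN" }   # assumes a PAT in this environment variable
$resp = Invoke-RestMethod -Method Get -Uri "$workspaceUrl/api/2.0/clusters/list" -Headers $headers
$resp.clusters | Where-Object { $_.custom_tags.Workload -eq "Job Run Api" } |
    Select-Object cluster_id, cluster_name, state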
I am trying to update a batch of jobs to use some instance pools with the Databricks API, but when I use the update endpoint the job just does not update. The call reports that it executed without errors, but when I check the job, it was not updated.
What am I doing wrong?
What I did to update the job:
I used the get endpoint with the job_id to retrieve my job settings.
I updated the resulting data with the values that I needed and executed the call to update the job:
'custom_tags': {'ResourceClass': 'Serverless'},
'driver_instance_pool_id': 'my-pool-id',
'driver_node_type_id': None,
'instance_pool_id': 'my-other-pool-id',
'node_type_id': None
I used this documentation: https://docs.databricks.com/dev-tools/api/latest/jobs.html#operation/JobsUpdate
Here is my payload:
{
"created_time": 1672165913242,
"creator_user_name": "email@email.com",
"job_id": 123123123123,
"run_as_owner": true,
"run_as_user_name": "email@email.com",
"settings": {
"email_notifications": {
"no_alert_for_skipped_runs": false,
"on_failure": [
"email1@email.com",
"email2@email.com"
]
},
"format": "MULTI_TASK",
"job_clusters": [
{
"job_cluster_key": "the_cluster_key",
"new_cluster": {
"autoscale": {
"max_workers": 4,
"min_workers": 2
},
"aws_attributes": {
"availability": "SPOT_WITH_FALLBACK",
"ebs_volume_count": 0,
"first_on_demand": 1,
"instance_profile_arn": "arn:aws:iam::XXXXXXXXXX:instance-profile/instance-profile",
"spot_bid_price_percent": 100,
"zone_id": "us-east-1a"
},
"cluster_log_conf": {
"s3": {
"canned_acl": "bucket-owner-full-control",
"destination": "s3://some-bucket/log/log_123123123/",
"enable_encryption": true,
"region": "us-east-1"
}
},
"cluster_name": "",
"custom_tags": {
"ResourceClass": "Serverless"
},
"data_security_mode": "SINGLE_USER",
"driver_instance_pool_id": "my-driver-pool-id",
"enable_elastic_disk": true,
"instance_pool_id": "my-worker-pool-id",
"runtime_engine": "PHOTON",
"spark_conf": {...},
"spark_env_vars": {...},
"spark_version": "..."
}
}
],
"max_concurrent_runs": 1,
"name": "my_job",
"schedule": {...},
"tags": {...},
"tasks": [{...},{...},{...}],
"timeout_seconds": 79200,
"webhook_notifications": {}
}
}
I tried the update endpoint and read the docs for information, but I found nothing related to the issue.
I finally got it.
I was using the partial update endpoint and found that this does not work for the whole job payload.
So I changed the call to use the full update endpoint (reset) and it worked.
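For reference, here is a minimal PowerShell sketch of the get-then-reset flow that worked for me. The workspace URL, token and job id are placeholders, and the Add-Member lines are only an illustration of where I swapped in the pool ids:
$workspaceUrl = "https://<databricks-instance>"                  # placeholder
$headers = @{ Authorization = "Bearer $env:DATABRICKS_TOKEN" }   # assumes a PAT in this environment variable

# 1. Fetch the current job definition
$job = Invoke-RestMethod -Method Get -Uri "$workspaceUrl/api/2.1/jobs/get?job_id=123123123123" -Headers $headers

# 2. Edit the settings, e.g. point the job cluster at the instance pools
$newCluster = $job.settings.job_clusters[0].new_cluster
$newCluster | Add-Member -NotePropertyName instance_pool_id -NotePropertyValue "my-other-pool-id" -Force
$newCluster | Add-Member -NotePropertyName driver_instance_pool_id -NotePropertyValue "my-pool-id" -Force

# 3. Reset (full update) with the complete settings object
$body = @{ job_id = $job.job_id; new_settings = $job.settings } | ConvertTo-Json -Depth 32
Invoke-RestMethod -Method Post -Uri "$workspaceUrl/api/2.1/jobs/reset" -Headers $headers -Body $body -ContentType "application/json"
Unlike update, reset replaces the whole settings object, which is why passing back the complete payload from get works here.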
I'm a newbie with the Azure DevOps API, and for a future migration case, I want to create new users in my Azure DevOps organization. The users are Azure Active Directory users.
So I tried to do it with this documentation: https://learn.microsoft.com/en-us/rest/api/azure/devops/graph/users/create?view=azure-devops-rest-6.0
The body of my API request looks like this:
{
"principalName": "test_user@company.com"
}
It returns a status 201 (Created) with this information (for security reasons I've put dots on some lines):
{
"subjectKind": "user",
"metaType": "member",
"directoryAlias": "test_user",
"domain": "....",
"principalName": "test_user@company.com",
"mailAddress": "test_user@company.com",
"origin": "aad",
"originId": "....",
"displayName": "test user",
"_links": {
"self": {
"href": "....."
},
"memberships": {
"href": "....."
},
"membershipState": {
"href": "...."
},
"storageKey": {
"href": "...."
},
"avatar": {
"href": "...."
}
},
"url": "....",
"descriptor": "....."
}
But when I look at the organization's users, I don't see the user that was created.
Did I miss something? When I list users through the API, it doesn't appear either...
Thanks in advance for your help.
P.S.: It works well in the graphical UI.
OK, I've finally solved my own problem...
The parameter groupDescriptors is mandatory in the HTTP request in order to activate the account.
The request should look like this:
https://vssps.dev.azure.com/{{COMPANY}}/_apis/graph/users?groupDescriptors=vssgp.{{GROUPDESCRIPTORS}}&api-version=6.0-preview.1
If you don't add the user to a group when you create them, they will not be able to connect.
Get the group descriptor:
https://vssps.dev.azure.com/{{COMPANY}}/_apis/graph/groups?api-version=5.1-preview.1
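Putting the two calls together, a minimal PowerShell sketch (organization, PAT and group name are placeholders of my own, not values from the docs):
$org = "COMPANY"
$pat = "<personal access token>"
$headers = @{ Authorization = "Basic " + [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(":$pat")) }

# 1. Find the descriptor of the group the new user should land in
$groups = Invoke-RestMethod -Method Get -Uri "https://vssps.dev.azure.com/$org/_apis/graph/groups?api-version=5.1-preview.1" -Headers $headers
$descriptor = ($groups.value | Where-Object { $_.displayName -eq "My Group" }).descriptor

# 2. Create (materialize) the AAD user and add them to that group in one call
$body = '{ "principalName": "test_user@company.com" }'
Invoke-RestMethod -Method Post -Uri "https://vssps.dev.azure.com/$org/_apis/graph/users?groupDescriptors=$descriptor&api-version=6.0-preview.1" -Headers $headers -Body $body -ContentType "application/json"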
Hope this helps someone else on the internet.
I am uploading an item to OneDrive using the Graph API. I am also setting the properties of the item after it has been uploaded successfully. I am able to set "lastModifiedDateTime" but am not able to set "createdBy" and "createdDateTime".
"createdBy" is always set to the Azure AD application I created for OAuth, and in the OneDrive UI it always shows "Modified By" as "SharePoint App".
And the "createdDateTime" is always the current time (the time of upload). Is there any way I can set these properties correctly?
The JSON I am using to patch the item properties:
{"createdDateTime":"2020-12-28T12:25:39Z",
"lastModifiedDateTime":"2020-12-28T12:25:39Z",
"createdBy":
{
"user":{
"email":"AlexW@vx2.onmicrosoft.com"}
},
"lastModifiedBy":{
"user":{
"email":"AlexW@vx2.onmicrosoft.com"}
},
"fileSystemInfo":{
"lastModifiedDateTime":"2020-12-28T12:25:39Z",
"createdDateTime":"2020-12-28T12:25:39Z"},
"file":{"mimeType":"image/jpeg"}
}
Please find the properties (queried from Graph Explorer) after the upload and the above patch request:
{
"createdDateTime": "2020-12-28T12:28:09Z",
"lastModifiedDateTime": "2020-12-28T12:25:39Z",
"createdBy":
{
"application": {
"displayName": "ConsoleApp"}
},
"fileSystemInfo": {
"createdDateTime": "2020-12-28T12:28:09Z",
"lastModifiedDateTime": "2020-12-28T12:25:39Z"
},
"file": {
"mimeType": "image/jpeg",
"hashes": {
"quickXorHash": "4EQEGnBnLd04VXEmYqGHHIeZ2po="
}
}
}
As you can see, the user name has been replaced by the Azure AD app name, and the created time is the time the upload was done, not the time specified in the patch request.
Please let me know if anyone has any idea about this.
If you refer to the article below, under the Properties section: https://learn.microsoft.com/en-us/graph/api/resources/driveitem?view=graph-rest-1.0
These are read-only fields, meaning you will not be able to manually configure their values.
Workaround:
Having said that, while this cannot be achieved through the Graph API, you can make use of the SharePoint API to update the same fields:
ValidateUpdateListItem()
For modifying Created By, Modified By and Created, a sample body would be as below:
{ "formValues": [
{
"FieldName": "Editor",
"FieldValue": "[{'Key':'i:0#.w|AlexW@vx2.onmicrosoft.com'}]"
},
{
"FieldName": "Author",
"FieldValue": "[{'Key':'i:0#.w|AlexW@vx2.onmicrosoft.com'}]"
},
{
"FieldName": "Created",
"FieldValue": "02/18/2020 11:25 PM"
}
],
"bNewDocumentUpdate": true
}
Request URL:
https://SPOURL/_api/web/Lists/GetbyTitle('Library Name')/items(1)/ValidateUpdateListItem
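A minimal PowerShell sketch of that call, reusing the same field values as the body above. The site URL, library name, item id and the way you obtain the bearer token are assumptions you would adapt:
$siteUrl = "https://<tenant>.sharepoint.com/sites/<site>"   # placeholder site URL
$headers = @{
    Authorization = "Bearer $accessToken"                   # assumes a SharePoint access token obtained beforehand
    Accept        = "application/json;odata=verbose"
}
$body = @{
    formValues = @(
        @{ FieldName = "Editor";  FieldValue = "[{'Key':'i:0#.w|AlexW@vx2.onmicrosoft.com'}]" },
        @{ FieldName = "Author";  FieldValue = "[{'Key':'i:0#.w|AlexW@vx2.onmicrosoft.com'}]" },
        @{ FieldName = "Created"; FieldValue = "02/18/2020 11:25 PM" }
    )
    bNewDocumentUpdate = $true
} | ConvertTo-Json -Depth 5

Invoke-RestMethod -Method Post -Uri "$siteUrl/_api/web/Lists/GetByTitle('Library Name')/items(1)/ValidateUpdateListItem" `
    -Headers $headers -Body $body -ContentType "application/json;odata=verbose"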
I am considering using the Microsoft Monitoring Agent to collect log records from log files on the system and send them to a Log Analytics workspace.
Is there a way to specify the target files (custom log files) the agent should listen to and stream directly to the Azure workspace?
I know this is possible to do through the Azure portal by adding an additional data source in the workspace (as described in https://learn.microsoft.com/en-us/azure/azure-monitor/platform/data-sources-custom-logs).
I am looking for a way to configure these data sources from C# code or a PowerShell script (possibly an API or SDK that I am not aware of).
To add custom logs, use New-AzOperationalInsightsCustomLogDataSource.
Here are the other PowerShell cmdlets that can be handy for querying and creating Log Analytics data sources:
Get-AzOperationalInsightsDataSource
New-AzOperationalInsightsApplicationInsightsDataSource
New-AzOperationalInsightsAzureActivityLogDataSource
New-AzOperationalInsightsComputerGroup
New-AzOperationalInsightsCustomLogDataSource
New-AzOperationalInsightsLinuxPerformanceObjectDataSource
New-AzOperationalInsightsLinuxSyslogDataSource
New-AzOperationalInsightsSavedSearch
New-AzOperationalInsightsStorageInsight
New-AzOperationalInsightsWindowsEventDataSource
New-AzOperationalInsightsWindowsPerformanceCounterDataSource
https://learn.microsoft.com/en-us/powershell/module/az.operationalinsights/get-azoperationalinsightsdatasource?view=azps-2.7.0
Also see the links for the Log Analytics REST APIs, which can be used easily from C# code:
https://learn.microsoft.com/en-us/rest/api/loganalytics/
https://learn.microsoft.com/en-us/rest/api/loganalytics/datasources/createorupdate
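For instance, the DataSources - Create Or Update call from the second link can be driven from any HTTP client. A hedged PowerShell sketch (subscription, resource group, workspace, data source name and api-version are assumptions to check against the docs; the payload mirrors the PowerShell and ARM samples below):
$token = "<ARM bearer token>"                                # e.g. obtained via Get-AzAccessToken
$uri = "https://management.azure.com/subscriptions/<subId>/resourceGroups/<rg>" +
       "/providers/Microsoft.OperationalInsights/workspaces/<workspace>" +
       "/dataSources/sampleCustomLog1?api-version=2020-08-01"

$body = @{
    kind       = "CustomLog"
    properties = @{
        customLogName = "sampleCustomLog1"
        description   = "Example custom log datasource"
        inputs        = @(@{
            location        = @{ fileSystemLocations = @{ windowsFileTypeLogPaths = @("e:\iis5\*.log") } }
            recordDelimiter = @{ regexDelimiter = @{ pattern = "\n"; matchIndex = 0 } }
        })
        extractions   = @(@{
            extractionName       = "TimeGenerated"
            extractionType       = "DateTime"
            extractionProperties = @{ dateTimeExtraction = @{ regex = $null } }
        })
    }
} | ConvertTo-Json -Depth 10

Invoke-RestMethod -Method Put -Uri $uri -Headers @{ Authorization = "Bearer $token" } -Body $body -ContentType "application/json"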
PowerShell
Custom log to collect.
Link: https://learn.microsoft.com/en-us/azure/azure-monitor/platform/powershell-workspace-configuration
$CustomLog = @"
{
"customLogName": "sampleCustomLog1",
"description": "Example custom log datasource",
"inputs": [
{
"location": {
"fileSystemLocations": {
"windowsFileTypeLogPaths": [ "e:\\iis5\\*.log" ],
"linuxFileTypeLogPaths": [ "/var/logs" ]
}
},
"recordDelimiter": {
"regexDelimiter": {
"pattern": "\\n",
"matchIndex": 0,
"matchIndexSpecified": true,
"numberedGroup": null
}
}
}
],
"extractions": [
{
"extractionName": "TimeGenerated",
"extractionType": "DateTime",
"extractionProperties": {
"dateTimeExtraction": {
"regex": null,
"joinStringRegex": null
}
}
}
]
}
"@
# Custom Logs
New-AzOperationalInsightsCustomLogDataSource -ResourceGroupName $ResourceGroup -WorkspaceName $WorkspaceName -CustomLogRawJson "$CustomLog" -Name "Example Custom Log Collection"
ARM Template
The ARM template format for the custom logs is as below. See the detailed link: https://learn.microsoft.com/en-us/azure/azure-monitor/platform/template-workspace-configuration
{
"apiVersion": "2015-11-01-preview",
"type": "dataSources",
"name": "[concat(parameters('workspaceName'), parameters('customlogName'))]",
"dependsOn": [
"[concat('Microsoft.OperationalInsights/workspaces/', parameters('workspaceName'))]"
],
"kind": "CustomLog",
"properties": {
"customLogName": "[parameters('customlogName')]",
"description": "this is a description",
"extractions": [
{
"extractionName": "TimeGenerated",
"extractionProperties": {
"dateTimeExtraction": {
"regex": [
{
"matchIndex": 0,
"numberedGroup": null,
"pattern": "((\\d{2})|(\\d{4}))-([0-1]\\d)-(([0-3]\\d)|(\\d))\\s((\\d)|([0-1]\\d)|(2[0-4])):[0-5][0-9]:[0-5][0-9]"
}
]
}
},
"extractionType": "DateTime"
}
],
"inputs": [
{
"location": {
"fileSystemLocations": {
"linuxFileTypeLogPaths": null,
"windowsFileTypeLogPaths": [
"[concat('c:\\Windows\\Logs\\',parameters('customlogName'))]"
]
}
},
"recordDelimiter": {
"regexDelimiter": {
"matchIndex": 0,
"numberedGroup": null,
"pattern": "(^.*((\\d{2})|(\\d{4}))-([0-1]\\d)-(([0-3]\\d)|(\\d))\\s((\\d)|([0-1]\\d)|(2[0-4])):[0-5][0-9]:[0-5][0-9].*$)"
}
}
}
]
}
}
I have connected Eclipse Hono with Eclipse Ditto using the Connectivity API. When I set it up, this works fine. However, after some time the forwarding connection fails. When I retrieve the metrics, I get the following response:
{
"?": {
"?": {
"type": "connectivity.responses:aggregatedResponse",
"status": 200,
"connectionId": "<connectionId>",
"responsesType": "connectivity.responses:retrieveConnectionMetrics",
"responses": {
"connectivity-7cc7b5dc4c-6nn59": {
"type": "connectivity.responses:retrieveConnectionMetrics",
"status": 200,
"connectionId": "<connectionId>",
"connectionMetrics": {
"connectionStatus": "open",
"connectionStatusDetails": "Connected at 2019-03-19T08:28:53.211Z",
"inConnectionStatusSince": "2019-03-19T08:28:53.211Z",
"clientState": "CONNECTED",
"sourcesMetrics": [],
"targetsMetrics": [
{
"addressMetrics": {
"gw/{{ thing:namespace }}/{{ thing:id }}": {
"status": "failed",
"statusDetails": "Producer closed at 2019-03-19T21:00:16.466Z",
"messageCount": 2048,
"lastMessageAt": "2019-03-19T21:00:05.361Z"
}
},
"publishedMessages": 4070
}
]
}
}
}
}
}
}
I've been checking the logs around the time mentioned, but I'm not getting any errors. The logs I'm posting here are the last one before and the first one after the mentioned timestamp (2019-03-19T21:00:16.466Z).
2019-03-19 21:00:11,771 DEBUG [ID:AMQP_NO_PREFIX:TelemetrySenderImpl-42872] o.e.d.s.c.m.a.AmqpPublisherActor akka://ditto-cluster/system/sharding/connection/7/tenant_aloxy_consumer-aloxy-forward/pa/$a/c1/amqpPublisherActor3
- Message JmsTextMessage { org.apache.qpid.jms.provider.amqp.message.AmqpJmsTextMessageFacade#9bc051af } sent successfully.
2019-03-19 21:01:11,733 DEBUG [ID:AMQP_NO_PREFIX:TelemetrySenderImpl-42872] o.e.d.s.c.m.a.AmqpClientActor akka://ditto-cluster/system/sharding/connection/1/tenant_aloxy_consumer-aloxy/pa/$a/c1 - Inbound message: JmsInboundMessageDispatch { sequence = 38885, messageId = TelemetrySenderImpl-42873, consumerId = ID:a4925b59-1bb4-4cd8-9151-96ad422c36df:1:1:1 }
Although the log levels for all Ditto services are set to DEBUG, I'm not getting any useful logging.
Does anyone have an idea how I can get the logging needed to investigate this problem or, even better, an idea of what the problem might be and how to fix it?
When I delete the connection and recreate it, everything works as expected again. Maybe Ditto can do this under the hood automatically?
UPDATE
When retrieving the connection via the API, I get the following response (including the failoverEnabled property, which is set to true). This also indicates that the connection uses AMQP 1.0. The broker used is EnMasse.
{
"?": {
"?": {
"type": "connectivity.responses:retrieveConnection",
"status": 200,
"connection": {
"id": "<connectionId>",
"name": null,
"connectionType": "amqp-10",
"connectionStatus": "open",
"uri": "amqp://<consumer>:<password>#<amqp-host>:5672",
"sources": [],
"targets": [
{
"address": "gw/{{ thing:namespace }}/{{ thing:id }}",
"topics": [
"_/_/things/twin/events?filter=exists(features/alp)"
],
"authorizationContext": [
"<auth-context>"
]
}
],
"clientCount": 1,
"failoverEnabled": true,
"validateCertificates": true,
"processorPoolSize": 5,
"tags": []
}
}
}
}
Eclipse Ditto does an automatic failover if configured to do so (see https://www.eclipse.org/ditto/basic-connections.html - the "failoverEnabled" property in the model).
It could however be that this was improved since the release 0.8.0 you are using.
The Ditto team is currently working towards a 0.9.0-M1 release which will contain improved reconnection behavior.
Does the connection to Eclipse Hono automatically reconnect?
You described that the "forwarding connection" fails from time to time. Which technology (broker, etc.) is the endpoint for that gw/{{ thing:namespace }}/{{ thing:id }} address?