Why can't Azure Search import JSON blobs?

When importing data using the configuration found below, Azure Cognitive Search returns the following error:
Error detecting index schema from data source: ""
Is this configured incorrectly? The files are stored in the container "example1" and in the blob folder "json". When creating the same index with the same data in the past there were no errors, so I am not sure why it is different now.
Import data:
Data Source: Azure Blob Storage
Name: test-example
Data to extract: Content and metadata
Parsing mode: JSON
Connection string:
DefaultEndpointsProtocol=https;AccountName=EXAMPLESTORAGEACCOUNT;AccountKey=EXAMPLEACCOUNTKEY;
Container name: example1
Blob folder: json
.json file structure:
{
  "string1": "value1",
  "string2": "value2",
  "string3": "value3",
  "string4": "value4",
  "string5": "value5",
  "string6": "value6",
  "string7": "value7",
  "string8": "value8",
  "list1": [
    {
      "nested1": "value1",
      "nested2": "value2",
      "nested3": "value3",
      "nested4": "value4"
    }
  ],
  "FileLocation": null
}
Here is an image of the screen with the error when clicking the "Next: Add cognitive skills (Optional)" button:

To clarify, there are two problems:
1) There is a bug in the portal where the actual error message is not showing up for errors, hence we are observing the unhelpful empty string "" as an error message. A fix is on the way and should be rolled out early next week.
2) There is an error when the portal attempts to detect the index schema from your data source. It's hard to say what the problem is when the error message is just "". I've tried your sample data and importing works fine.
I'll update the post once the fix for displaying the error message is out. In the meantime (again we're flying blind here without the specific error string) here are a few things to check:
1) Make sure your firewall rules allow the portal to read from your blob storage
2) Make sure there are no extra characters inside your JSON files. Check that the whitespace characters really are plain whitespace and not some other invisible characters (you should be able to open the file in VS Code and check).
Update: The portal fix for the missing error messages has been deployed. You should be able to see a more specific error message should an error occur during import.

Seems to me that this is a problem related to the list1 data type. Make sure you're selecting "Collection(Edm.String)" for it during the index creation.
For more info, please check step 5 of the following link: https://learn.microsoft.com/en-us/azure/search/search-howto-index-json-blobs
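For reference, an index definition along those lines might look roughly like this. This is only a sketch: the index name and the choice of string1 as the key are made up, the remaining string fields follow the same pattern, and if the objects inside list1 need to stay structured, a complex collection (Collection(Edm.ComplexType) with sub-fields) is the other option:

{
  "name": "test-example-index",
  "fields": [
    { "name": "string1", "type": "Edm.String", "key": true, "searchable": true },
    { "name": "string2", "type": "Edm.String", "searchable": true },
    { "name": "list1", "type": "Collection(Edm.String)", "searchable": true },
    { "name": "FileLocation", "type": "Edm.String", "searchable": false, "filterable": true }
  ]
}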

I have been in contact with Microsoft, and this is a bug in the Azure Portal: the connection string wizard does not append the endpoint suffix correctly. They recommended manually pasting the connection string, but this still does not work for me. So this is a suggested answer by Microsoft, but I don't believe it is completely correct, because the portal outputs the same error message:
Error detecting index schema from data source: ""
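If you want to rule the endpoint suffix out while pasting manually, make sure it is present at the end of the string (placeholder account values shown, matching the ones in the question):

DefaultEndpointsProtocol=https;AccountName=EXAMPLESTORAGEACCOUNT;AccountKey=EXAMPLEACCOUNTKEY;EndpointSuffix=core.windows.net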

Related

ADF: can't build simple Json file transformation (one field flattening)

I need help transforming a simple JSON file inside an Azure Data Flow. I need to flatten just one field, date_sk, in the example here:
{
  "date_sk": {"string": "2021-09-03"},
  "is_influencer": 0,
  "is_premium": -1,
  "doc_id": "234"
}
Desired transformation:
"date_sk": {"string":"2021-09-03"}
to become
"dateToGroupBy" : "2021-09-03"
I create the source stream. Note the strange projection Azure picks: there is no "string" field anymore, but for some reason this is how the automatic Azure transformation works:
Data preview of the same source stream node:
And here's how it suggests transforming it in a separate "Derived Column" modifier. I played with the right-hand side, but this format (date_sk.{}) is the only one I was able to pick that does not display any error:
But then the output dateToGroupBy field turns out to be empty:
Any ideas on what could have gone wrong and how I can build the expected transformation? Thank you.
Alright, it happened to be a Microsoft bug in ADF.
ADF stumbles on "string" used as a JSON field name and can't handle it, even though schema and data validation pass through OK and show no errors.
When I replace "date_sk": {"string": "2021-09-03"} with "date_sk": {"s1": "2021-09-03"}, or anything else other than "string", everything starts working just fine,
and dateToGroupBy is filled with date values taken from date_sk.s1.
When I put "string" back, the output values show NULL.
It is supposed to either show an error at the validation stage or handle this field name properly.
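To make the workaround concrete, here is a minimal sketch (the key name s1 is arbitrary; anything other than "string" works):

Renamed input:
{
  "date_sk": { "s1": "2021-09-03" },
  "is_influencer": 0,
  "is_premium": -1,
  "doc_id": "234"
}

Derived Column settings:
column:     dateToGroupBy
expression: date_sk.s1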

Is there a way to resolve this error: "CloudKit integration requires does not support ordered relationships."

I'm trying to use Apple's CoreDataCloudkitDemo app. I've only changed the app settings per their README document. On running the demo, I'm getting the error: "CloudKit integration requires does not support ordered relationships."
(The weird grammar in the title is included in the app)
The console log shows:
Fatal error: ###persistentContainer: Failed to load persistent stores:Error Domain=NSCocoaErrorDomain Code=134060 "A Core Data error occurred." UserInfo={NSLocalizedFailureReason=CloudKit integration requires does not support ordered relationships. The following relationships are marked ordered:
Post: attachments
There is the same error for the "Tags" entity.
I'm using Xcode 11.0 beta 4 (11M374r).
I've only changed the bundle identifier, and set my Team Id.
I removed the original entitlements file - no errors in resulting build.
I've not changed any code from the original.
Does anyone have a workaround, or preferably, a fix? Or did I do something wrong?
Thanks
Firstly, select CoreDataCloudKitDemo.xcdatamodeld -> Post -> Relationships, select the attachments relationship, and in the inspector panel deselect Ordered; then do the same thing for the tags relationship.
Secondly, there will be some errors in the code now: because we unchecked the Ordered option, the attachments and tags properties in the generated NSManagedObject subclasses change from NSOrderedSet? to NSSet?. So we can change those error lines of code like below:
Origin:
guard let tag = post?.tags?.object(at: indexPath.row) as? Tag else { return cell }
Changed:
guard let tag = post?.tags?.allObjects[indexPath.row] as? Tag else { return cell }
Finally, you can run the code now. ;-)
Furthermore, in WWDC19 Session 202 the demo shows both the attachments and tags relationships set as Unordered, so I think there's something wrong in the given demo project.
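One caveat about the changed line above: allObjects on an NSSet has no guaranteed order, so rows may shuffle between reloads. If you need stable ordering after dropping Ordered, one option is to sort before indexing. A sketch, assuming Tag has an optional name attribute (adjust the key to whatever your model actually has):

// Sort the unordered set by a stable key before indexing into it.
let sortedTags = (post?.tags?.allObjects as? [Tag])?
    .sorted { ($0.name ?? "") < ($1.name ?? "") } ?? []
guard indexPath.row < sortedTags.count else { return cell }
let tag = sortedTags[indexPath.row]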

UserErrorSqlBulkCopyInvalidColumnLength - Azure SQL Database

I am configuring a Salesforce to Azure SQL Database data copy using Azure Data Factory. There appears to be an issue with the column length, but I am unable to identify which column is actually causing an issue.
How can I gain more insight into exactly what is causing my problem, or which column is really invalid?
{
  "dataRead": 18560714,
  "dataWritten": 0,
  "rowsRead": 15514,
  "rowsCopied": 0,
  "copyDuration": 34,
  "throughput": 533.109,
  "errors": [
    {
      "Code": 9123,
      "Message": "ErrorCode=UserErrorSqlBulkCopyInvalidColumnLength,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=SQL Bulk Copy failed due to received an invalid column length from the bcp client.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Data.SqlClient.SqlException,Message=The service has encountered an error processing your request. Please try again. Error code 4815.\r\nA severe error occurred on the current command. The results, if any, should be discarded.,Source=.Net SqlClient Data Provider,SqlErrorNumber=40197,Class=20,ErrorCode=-2146232060,State=1,Errors=[{Class=20,Number=40197,State=1,Message=The service has encountered an error processing your request. Please try again. Error code 4815.,},{Class=20,Number=0,State=0,Message=A severe error occurred on the current command. The results, if any, should be discarded.,},],'",
      "EventType": 0,
      "Category": 5,
      "Data": {},
      "MsgId": null,
      "ExceptionType": null,
      "Source": null,
      "StackTrace": null,
      "InnerEventInfos": []
    }
  ],
  "effectiveIntegrationRuntime": "DefaultIntegrationRuntime (East US 2)",
  "usedCloudDataMovementUnits": 4,
  "usedParallelCopies": 1,
  "executionDetails": [
    {
      "source": { "type": "Salesforce" },
      "sink": { "type": "AzureSqlDatabase" },
      "status": "Failed",
      "start": "2018-03-01T18:07:37.5732769Z",
      "duration": 34,
      "usedCloudDataMovementUnits": 4,
      "usedParallelCopies": 1,
      "detailedDurations": {
        "queuingDuration": 5,
        "timeToFirstByte": 24,
        "transferDuration": 4
      }
    }
  ]
}
"Message":"ErrorCode=UserErrorSqlBulkCopyInvalidColumnLength,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=SQL
Bulk Copy failed due to received an invalid column length from the bcp
client.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Data.SqlClient.SqlException,Message=The
service has encountered an error processing your request. Please try
again. Error code 4815.\r\nA severe error occurred on the current
command. The results, if any, should be
discarded.,Source=.Net SqlClient Data
Provider,SqlErrorNumber=40197,Class=20,ErrorCode=-2146232060,State=1,Errors=[{Class=20,Number=40197,State=1,Message=The service has encountered an error processing your request. Please try
again. Error code 4815.,},{Class=20,Number=0,State=0,Message=A severe
error occurred on the current command. The results, if any,
should be discarded.
I have seen that error when trying to copy a string from the source to a destination column that is shorter than the source value.
My suggestion is to make sure you select data types on the destination big enough to receive the data from the source. Also make sure the sink has the columns shown in the preview.
To identify which column may be involved, exclude big columns one by one from the source.
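Another way to narrow it down, sketched here with made-up table and column names: land the Salesforce data in a staging table whose string columns are all nvarchar(max), then compare the observed maximum lengths with the destination's defined lengths.

-- Longest value actually received per column (repeat MAX(LEN(...)) for each string column).
SELECT
    MAX(LEN(AccountName)) AS AccountName_max_len,
    MAX(LEN(Description)) AS Description_max_len
FROM dbo.Salesforce_Staging;

-- Defined lengths on the destination table, for comparison.
-- Note: max_length is in bytes, so nvarchar columns show twice the character count.
SELECT c.name AS column_name, c.max_length
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID('dbo.TargetTable')
  AND c.system_type_id IN (167, 175, 231, 239); -- varchar, char, nvarchar, nchar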

Azure Stream Analytics: "Stream Analytics job has validation errors: The given key was not present in the dictionary."

I burned a couple of hours on a problem today and thought I would share.
I tried to start up a previously-working Azure Stream Analytics job and was greeted by a quick failure:
Failed to start Streaming Job 'shayward10ProcessLogs'.
I looked at the JSON log and found nothing helpful whatsoever. The only description of the problem was:
Stream Analytics job has validation errors: The given key was not present in the dictionary.
Given the error and some changes to our database, I tried the following to no effect:
Deleting and Recreating all Inputs
Deleting and Recreating all Outputs
Running tests against the data (coming from Event Hub) and the output looked good
My query looked as follows:
SELECT
    dateTimeUtc,
    context.tenantId AS tenantId,
    context.userId AS userId,
    context.deviceId AS deviceId,
    changeType,
    dataType,
    changeStatus,
    failureReason,
    ipAddress,
    UDF.JsonToString(details) AS details
INTO
    [MyOutput]
FROM
    [MyInput]
WHERE
    logType = 'MyLogType';
Nothing made sense so I started deconstructing my query. I took it down to a single field and it succeeded. I went field by field, trying to figure out which field (if any) was the cause.
See my answer below.
The answer was simple (yet frustrating). When I got to the final field, that's where the failure was:
UDF.JsonToString(details) AS details
This was the only field that used a user-defined function. After futzing around, I noticed that the Function editor showed the title of the function as:
udf.JsonToString
It was a casing issue. I had UDF in UPPERCASE and Azure Stream Analytics expected it in lowercase. I changed my final field to:
udf.JsonToString(details) AS details
It worked.
The strange thing is, it was previously working. Microsoft may have made a change to Azure Stream Analytics to make it case-sensitive in a place where it seemingly wasn't before.
It makes sense, though. JavaScript is case-sensitive. Every JavaScript object is basically a dictionary of members. Consider the error:
Stream Analytics job has validation errors: The given key was not present in the dictionary.
The "udf" object had a dictionary member with my function in it. The UDF object would be undefined. Undefined doesn't have my function as a member.
I hope my 2-hour head-banging session helps someone else.

Exception of type 'Microsoft.WindowsAzure.StorageClient.StorageClientException' was thrown

Exception of type 'Microsoft.WindowsAzure.StorageClient.StorageClientException' was thrown.
Sometimes even if we have the fabric running and the role manager is up, we get an exception of this sort.
The code breaks at the line:
emailAddressClient.CreateTableIfNotExist("EmailAddress");
public EmailAddressDataContext(CloudStorageAccount account) :
    base(account.TableEndpoint.AbsoluteUri, account.Credentials)
{
    this.storageAccount = account;
    CloudTableClient emailAddressClient =
        new CloudTableClient(storageAccount.TableEndpoint.AbsoluteUri,
                             storageAccount.Credentials);
    emailAddressClient.CreateTableIfNotExist("EmailAddress");
}
I give Windows Azure tables camel-cased names all the time without issues.
I wonder if by chance you already used this table name and recently deleted it? For a time after deletion (when the table is still being deleted asynchronously), you won't be able to recreate it. I believe 409 Conflict is the error code to expect in that case.
I agree with Steve Marx: casing does not seem to affect this issue. In fact, Microsoft's Azure diagnostics tables are created with unusual casing, e.g. WADPerformanceCounters. I get the problem even in the dev environment, so in my opinion it is something else entirely.
Error fixed in my case: the problem was an error with the connection string (or the lack of one) as defined in the web role or worker role project properties.
Fix:
Right-click on the web role under the "Roles" folder in your cloud application and select "Properties" from the context menu.
Select the "Settings" tab.
Verify or add a setting for the connection string that you will use to initialize table storage.
Mine was a simple error - no setting for my connection string.
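For reference, once the setting exists, initializing the account from role configuration with the old StorageClient-era SDK looks roughly like this (the setting name "DataConnectionString" is only an example; use whatever name you added in the Settings tab):

using Microsoft.WindowsAzure;                 // CloudStorageAccount
using Microsoft.WindowsAzure.ServiceRuntime;  // RoleEnvironment

// Read the connection string added under the role's Settings tab
// and build the storage account (and the data context) from it.
string connectionString =
    RoleEnvironment.GetConfigurationSettingValue("DataConnectionString");
CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
var emailAddressContext = new EmailAddressDataContext(account);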
An easy fix is to change "EmailAddress" to "Emailaddress". For some reason it would not allow camel casing, so make sure you have just one capital letter in the table name, and only at the beginning. Since table names are case-insensitive, you can also name it 'emailaddress'.
