Error ingesting JSON array data into Azure (via Azure Synapse or Azure Data Factory)

I'm trying to ingest the JSON array format returned by the US Census API into Azure; either ASA or ADF is fine. I tried using the HTTP and REST connectors, and neither was successful.
The error after using the HTTP connector is:
"Error occurred when deserializing source JSON file 'c753cdb5-b33b-4f22-9ca2-778c97a69953'. Check if the data is in valid JSON object format. Error reading JObject from JsonReader. Current JsonReader item is not an object: StartArray. Path '[0]', line 1, position 2. Activity ID: 8608038f-3dd1-474f-86f1-d94bf5a45eba".
I attached screenshots of the error message, the sample API data, and the successful connection test to this post.
Do I need to set some parameters or advanced settings to specify something about the array format of the census data? Please advise.
The sample data link is inserted for your reference.
https://api.census.gov/data/2020/acs/acs5/subject?get=group(S1903)&for=state:51
Greatly appreciate your help in advance!
T.R.
[Screenshot: error in Azure Synapse ingestion]
[Screenshot: connection test is good]
[Screenshot: US Census API sample test data]

As per the official documentation, the REST connector only supports responses in JSON. The data link you provided returns a JSON array, not a JSON object, which is why ADF cannot accept the data returned by it.
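One workaround is to reshape the payload into an array of JSON objects before ADF or Synapse ingests it, for example in a small pre-processing step (an Azure Function, notebook, or script). A minimal Python sketch, assuming the endpoint from the question; the variable names are illustrative:

```python
import requests

# US Census ACS endpoint from the question. It returns a JSON array of arrays:
# the first row holds the column names, the remaining rows hold the values.
url = (
    "https://api.census.gov/data/2020/acs/acs5/subject"
    "?get=group(S1903)&for=state:51"
)

rows = requests.get(url).json()
header, *records = rows

# Reshape into a list of JSON objects, which the REST/JSON connectors can
# deserialize without the "Current JsonReader item is not an object" error.
objects = [dict(zip(header, record)) for record in records]

print(objects[0])
```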

Related

Azure Data Factory - Retrieve the next pagination link (decoded) from response headers in a Copy Data activity

I have created a Copy Data activity in Azure Data Factory; this pipeline pulls data from an API (via a REST source) and writes the response body (JSON) to a file in Azure Blob Storage.
The API I am fetching the response from is paginated, and the link to the next page is sent in the response headers at response->headers->link.
The URL to the next page has the following general format:
<https%3A%2F%2FsomeAPI.com%2Fv2%2FgetRequest%3FperPage%3D80%26sortOrder%3DDESCENDING%26nextPageToken%3DVAdjkjklfjjgkl>; rel="next"
I want to fetch the next page token present in the above URL and use it in the pagination rule.
I have tried using some pagination rules:
> AbsoluteURL = Headers.link
But this did not work: the entire encoded link shown above gets appended directly, and so the pipeline throws an error.
> Query Parameters
I have also tried using query parameters but could not get any result.
I have gone through related Stack Overflow questions and read the documentation.
Please help me with how I can access this next-page token, or with what the pagination rule should be to support this scenario.
Pasting the Postman output and the ADF data pipeline for reference.
[Screenshot: Postman response headers output]
[Screenshot: pagination rules I need help with]
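For clarity, the link header shown above is percent-encoded, so the next-page token can only be read after decoding it. The sketch below (plain Python, based solely on the sample header in the question, not an ADF pagination rule) shows what that decoding would look like:

```python
from urllib.parse import unquote, urlparse, parse_qs

# Raw "link" response header exactly as shown in the question (percent-encoded).
link_header = (
    "<https%3A%2F%2FsomeAPI.com%2Fv2%2FgetRequest%3FperPage%3D80"
    "%26sortOrder%3DDESCENDING%26nextPageToken%3DVAdjkjklfjjgkl>; rel=\"next\""
)

# Strip the <...>; rel="next" wrapper, then percent-decode the URL.
encoded_url = link_header.split(";")[0].strip("<> ")
next_url = unquote(encoded_url)   # https://someAPI.com/v2/getRequest?perPage=80&...

# Extract the nextPageToken query parameter from the decoded URL.
token = parse_qs(urlparse(next_url).query)["nextPageToken"][0]
print(token)                      # VAdjkjklfjjgkl
```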

How can I import all records from an Airtable table using an Azure Synapse Analytics pipeline rather than just retrieving the first 100?

When using the REST integration in an Azure Synapse pipeline and supplying the proper authorization (api_key), I'm only getting 100 records loaded into my Azure Synapse data sink. How do I ensure all records are imported?
There is a pagination offset that appears in the JSON response from Airtable. On the Source tab of the Copy Data step in Synapse, under Pagination rules, select QueryParameter, enter "offset" (no quotes) in the field next to QueryParameter, and enter "$['offset']" (no quotes) as the Value. That's it - no relative URL or parameter configuration is needed. The pagination rule tells Synapse to look for the data element "offset" in the response and to continue fetching more data until a response no longer contains that element in the JSON. See the screenshot below; the second screenshot shows the authorization configuration.
The authorization configuration for the Airtable API is shown below - this causes Synapse to include the HTTP header "Authorization: Bearer <api_key>" in requests to the Airtable API. Just replace <api_key> with your Airtable API key, which can be found and/or created under your account settings in Airtable.
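To make the behaviour of that pagination rule concrete, Synapse effectively does what the loop below does: pass the offset returned by the previous response as a query parameter and stop once a response no longer contains one. A hedged Python sketch; the base ID, table name, and API key are placeholders:

```python
import requests

# Placeholder values - substitute your own Airtable base ID, table name, and API key.
base_id, table, api_key = "appXXXXXXXXXXXXXX", "MyTable", "<api_key>"
url = f"https://api.airtable.com/v0/{base_id}/{table}"
headers = {"Authorization": f"Bearer {api_key}"}

records, params = [], {}
while True:
    page = requests.get(url, headers=headers, params=params).json()
    records.extend(page.get("records", []))
    # Same idea as the Synapse rule (QueryParameter "offset" = $['offset']):
    # keep fetching until the response no longer contains an "offset" element.
    if "offset" not in page:
        break
    params["offset"] = page["offset"]

print(f"Fetched {len(records)} records")
```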

Send Query or SP output as HTML mail from Azure data factory

I'm trying to implement an Azure Data Factory pipeline. From there I'm trying to execute a stored procedure and need to send its output in HTML format from Azure Data Factory through a Logic App. I'm facing an issue while sending the output to the Web activity in ADF.
Any hints?
Error details:
Error code: 2108
Failure type: User configuration issue
Details: {"error":{"code":"InvalidRequestContent","message":"The request content is not valid and could not be deserialized: 'Unexpected character encountered while parsing value: <. Path '', line 0, position 0.'."}}
In the Body option of the Settings tab of the Web activity, instead of adding the JSON as "Dynamic Content", add the JSON directly in the box.
Refer to the screenshot below:
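As a sanity check, the body must be JSON that the Logic App's HTTP trigger can deserialize; the "Unexpected character ... <" error usually means the request body started with HTML/XML rather than JSON. A minimal Python sketch posting a well-formed body to a placeholder Logic App trigger URL (the URL and field names are only an example, not the actual trigger schema):

```python
import json
import requests

# Placeholder Logic App HTTP trigger URL.
logic_app_url = "https://prod-00.eastus.logic.azure.com/workflows/<id>/triggers/manual/paths/invoke"

# Example body: the stored-procedure output rendered as an HTML table string.
body = {
    "to": "someone@example.com",
    "subject": "SP output",
    "html": "<table><tr><td>row 1</td></tr></table>",
}

# The payload must be serialized JSON with a JSON content type; otherwise the
# Logic App returns InvalidRequestContent when it tries to deserialize it.
resp = requests.post(
    logic_app_url,
    data=json.dumps(body),
    headers={"Content-Type": "application/json"},
)
print(resp.status_code)
```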

Problems copying data from a REST API in Azure Data Factory

I am trying to copy data from the Office 365 Management API using a Copy activity in ADF and a REST API dataset.
I am stuck at the step where I can see that data is fetched by the Copy activity, but with an error related to the Source, and of course I could not save the data to any sink.
The error is:
{
"errorCode": "2200",
"message": "Failure happened on 'Source' side. ErrorCode=UserErrorMoreThanOneObjectsReturned,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=More than one objects returned\r\n{\r\n "CreationTime": "2020-07-15T11:59:57",\r\n .... and some data.
Of course, I expected some JSON data in the response, but the error 'More than one objects returned' confused me.
//Alexander

Why is my IoT Hub message not decoded correctly in the storage JSON blob?

I am sending a string with an Azure Sphere dev kit using the provided function:
AzureIoT_SendMessage("Hello from sample App")
The message is sent to an IoT Hub and then routed to a storage blob with JSON encoding. If I look at the blob storage I get the following:
{"EnqueuedTimeUtc":"2019-05-22T12:33:42.2320000Z","Properties":{},"SystemProperties":{"connectionDeviceId":"fbea*****************6d**********************9c0","connectionAuthMethod":"{\"scope\":\"device\",\"type\":\"x509Certificate\",\"issuer\":\"external\",\"acceptingIpFilterRule\":null}","connectionDeviceGenerationId":"63************22","enqueuedTime":"2019-05-22T12:33:42.2320000Z"},"Body":"SGVsbG8gZnJvbSBzYW1wbGUgQXBw"}
The "Body" field does not show the string that was sent ("Hello from sample App") at all; instead it shows "SGVsbG8gZnJvbSBzYW1wbGUgQXBw". Why is this happening, and how can I fix it?
I found that if I format the storage as Avro (instead of JSON), the string is rendered correctly; however, the message becomes (literally) a blob and cannot be used in a streaming service such as Power BI, for example. The message can still be found, along with some other messy stuff, in the blob (see the picture below with the default string message).
See Microsoft's IoT Hub message routing documentation - specifically the Azure Storage section. It says "When using JSON encoding, you must set the contentType to application/json and contentEncoding to UTF-8 in the message system properties. Both of these values are case-insensitive. If the content encoding is not set, then IoT Hub will write the messages in base 64 encoded format."
This blog post further expands on the topic explaining that content type and encoding need to be set as specific headers.
On setting the headers:
If you are using the Azure IoT Device SDKs, it is pretty straightforward to set the message headers to the required properties. If you are using a third-party protocol library, you can use this table to see how the headers manifest in each of the protocols that IoT Hub supports.
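For example, with the Azure IoT Device SDK for Python the two system properties can be set on the message before sending. A sketch (the connection string is a placeholder, and the Azure Sphere C API used in the question exposes this differently):

```python
import json
from azure.iot.device import IoTHubDeviceClient, Message

# Placeholder device connection string.
conn_str = "HostName=<hub>.azure-devices.net;DeviceId=<device>;SharedAccessKey=<key>"
client = IoTHubDeviceClient.create_from_connection_string(conn_str)

msg = Message(json.dumps({"text": "Hello from sample App"}))
# Without these two system properties IoT Hub writes the routed message body
# to storage as base64; with them it is written as readable UTF-8 JSON.
msg.content_type = "application/json"
msg.content_encoding = "utf-8"

client.send_message(msg)
```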
FWIW, using Power Query (PQ) within Power BI, I was able to decode the {body:xxxencoded_base64...} portion of the JSON file sent to the blob by Azure IoT Hub.
My steps in Power BI:
- Connect to blob account, container collecting the JSON files.
- Within PQ, click the double-arrow expand on the initial binary column
- For me, Column10 was the {body:xxx} column. Using PQ's Replace Values function, I removed the prefix "{body:" and the final "}", leaving just the encoded string.
- Create a new column in PQ, using this M-code: =Binary.FromText([Column10],BinaryEncoding.Base64)
- It's now a new Binary column; click the double-arrow and expand the binary. It'll reveal the decoded JSON table, complete with all your IoT telemetry. (The same decode is sketched in Python below.)
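The same decode can be checked outside Power Query in a couple of lines of Python, which confirms that the Body value is just the original message, base64-encoded:

```python
import base64

# "Body" value from the stored JSON record shown in the question.
encoded = "SGVsbG8gZnJvbSBzYW1wbGUgQXBw"
print(base64.b64decode(encoded).decode("utf-8"))   # Hello from sample App
```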
HTHs
See also the pending feature request for IoT Hub: feature request
