Azure Data Factory REST API return invalid JSON file with pagination


I'm building a pipeline that copies the response from an API into a file in my storage account. The API is paginated, but that part works like a charm and I get all my data from all the pages.
Each response looks something like this:
{"data": {
"id": "Something",
"value": "Some other thing"
}}
The problem is that the copy activity just appends each response to the file, which makes the file invalid JSON. That is a big problem further down the line. The final output looks like this:
{"data": {
"id": "22222",
"value": "Some other thing"
}}
{"data": {
"id": "33333",
"value": "Some other thing"
}}
I have tried everything I could think of and everything I could find on Google, but nothing changes how the data is appended to the file, and I'm stuck with an invalid JSON file :(
As a backup plan, I'll just make a loop and create a JSON file for each page, but that seems a bit janky and really slow.
Does anyone have an idea or a solution for my problem?

When you copy data from a REST API to blob storage, the copy activity writes the data as a set of objects by default.
Example:
sample data
{ "time": "2015-04-29T07:12:20.9100000Z", "callingimsi": "466920403025604"}
sink data
{"time":"2015-04-29T07:12:20.9100000Z","callingimsi":"466920403025604"}
{"time":"2015-04-29T07:13:21.0220000Z","callingimsi":"466922202613463"}
{"time":"2015-04-29T07:13:21.4370000Z","callingimsi":"466923101048691"}
Each line is a valid object, but the file as a whole is not a valid JSON document.
To work around this, set the File pattern option in the sink settings of the copy activity to Array of objects. The sink then writes all the objects as a single JSON array.
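For files that were already written as a set of objects, the difference between the two patterns can be illustrated outside Data Factory. The following is only a minimal Python sketch, not part of the pipeline: the function name `split_concatenated_json` and the sample text are assumptions made for illustration. It reads back-to-back JSON objects with the standard library's `json.JSONDecoder.raw_decode` and re-serializes them as one array.

```python
import json


def split_concatenated_json(text):
    """Parse back-to-back JSON values (a "set of objects") into a list."""
    decoder = json.JSONDecoder()
    items = []
    pos = 0
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it first.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        items.append(value)
    return items


# Sample shaped like the appended output in the question (hypothetical data).
sample = """{"data": {
"id": "22222",
"value": "Some other thing"
}}
{"data": {
"id": "33333",
"value": "Some other thing"
}}"""

objects = split_concatenated_json(sample)
# One valid JSON document: an array of all objects.
array_text = json.dumps(objects)
```

Setting the file pattern in the sink avoids this extra step; the sketch is only meant to show what "array of objects" changes in the file.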

Related

How can I obtain an attached file from a record using the C# REST web services

I'm trying to retrieve an attached file (or files) from a record in Acumatica. I'm using the following example:
https://help-2021r1.acumatica.com/Help?ScreenId=ShowWiki&pageid=b1bc82ee-ae6b-442a-a369-863d98f14630
I've attached a file to the demo inventory stock item "AACOMPUT01".
Most of the code runs as expected, but when it gets to the code line:
JArray jFiles = jItem.Value<JArray>("files");
it returns null for the jFiles JArray - as if there are no files attached.
Is there something wrong with this example - or something I need to add to get it to work?
I'm using 2021 R1 (21.107.0023), and the endpoint is default 20.200.001...
Thanks...

Execute a GET request on StockItem endpoint with the expand files option:
http://localhost/Acumatica/entity/Default/20.200.001/StockItem/AACOMPUT01?$expand=files
This returns the files array:
"files": [
{
"id": "bdb9534c-6aa9-41fa-a65d-3119e32b0fe5",
"filename": "Stock Items (AACOMPUT01)\\AACOMPUT01.jpg",
"href": "/Acumatica/entity/Default/20.200.001/files/bdb9534c-6aa9-41fa-a65d-3119e32b0fe5"
}
]
Use the href URL parameter value to issue the GET request which returns the file content:
http://localhost/Acumatica/entity/Default/20.200.001/files/bdb9534c-6aa9-41fa-a65d-3119e32b0fe5
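The two requests above can be sketched as URL-building steps. This is a hedged Python illustration rather than the C# client code from the question: the helper names `stock_item_url` and `file_urls` are hypothetical, the sample record mirrors the response fragment shown above, and no HTTP call or authentication is performed.

```python
from urllib.parse import urljoin


def stock_item_url(site_url, inventory_id, version="20.200.001"):
    """Build the StockItem GET URL with the files collection expanded."""
    return f"{site_url}/entity/Default/{version}/StockItem/{inventory_id}?$expand=files"


def file_urls(item, site_root):
    """Turn each file's relative href into an absolute download URL."""
    return [urljoin(site_root, f["href"]) for f in item.get("files", [])]


# Record shaped like the response fragment above (only the fields used here).
item = {
    "files": [
        {
            "id": "bdb9534c-6aa9-41fa-a65d-3119e32b0fe5",
            "filename": "Stock Items (AACOMPUT01)\\AACOMPUT01.jpg",
            "href": "/Acumatica/entity/Default/20.200.001/files/bdb9534c-6aa9-41fa-a65d-3119e32b0fe5",
        }
    ]
}

request_url = stock_item_url("http://localhost/Acumatica", "AACOMPUT01")
downloads = file_urls(item, "http://localhost")
```

Without `$expand=files` in the first URL, a record would have no `files` entry and `file_urls` would return an empty list, which matches the null result described in the question.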

Get value from json in LogicApp

Rephrasing question entirely, as first attempt was unclear.
In my logic app I am reading a .json from blob which contains:
{
"alpha": {
"url": "https://linktoalpha.com",
"meta": "This logic app does job aaaa"
},
"beta": {
"url": "https://linktobeta.com",
"meta": "This logic app does job beta"
},
"theta": {
"url": "https://linktotheta.com",
"meta": "This logic app does job theta"
}
}
I'm triggering the logic app with an HTTP POST whose body contains:
{ "logicappname": "beta" }
But the value for 'logicappname' could be alpha, beta or theta. I now need to set a variable which contains the url value for 'beta'. How can this be achieved without jsonpath support?
I am already json parsing the file contents from the blob and this IS giving me the tokens... but I cannot see how to select the value I need. Would appreciate any assistance, thank you.

For your requirement, I think you can just use the "Parse JSON" action. Please refer to the steps below:
1. I upload a file testJson.json to my blob storage, then get it and parse it in my logic app.
2. We can see there are three url values in the screenshot below. As you want the url value for beta, which is the second one, we can choose the second one.
If you want to get the url value using the logicappname parameter from the "When a HTTP request is received" trigger, you can use an expression when you create the result variable.
In my screenshot, the expression is:
body('Parse_JSON')?[triggerBody()?['logicappname']]?['url']
The description of your question is a little unclear to me. I'm not sure what you mean by "I am already json parsing the file contents from the blob and this IS giving me the tokens": why are "tokens" involved? Also, the original question seemed to ask for a jsonpath solution, but the latest description says without jsonpath. If I have misunderstood your question, please let me know. Thanks.
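The expression in this answer amounts to a lookup with a dynamic key, where each `?` step yields null instead of failing when a level is missing. A rough Python equivalent, given only to illustrate the logic (the function name `lookup_url` is hypothetical, and the data is copied from the question):

```python
def lookup_url(config, trigger_body):
    """Return config[<logicappname>]["url"], or None if any level is missing."""
    name = (trigger_body or {}).get("logicappname")
    entry = (config or {}).get(name)
    return entry.get("url") if isinstance(entry, dict) else None


# Parsed blob content from the question.
config = {
    "alpha": {"url": "https://linktoalpha.com", "meta": "This logic app does job aaaa"},
    "beta": {"url": "https://linktobeta.com", "meta": "This logic app does job beta"},
    "theta": {"url": "https://linktotheta.com", "meta": "This logic app does job theta"},
}

result = lookup_url(config, {"logicappname": "beta"})
```

The key point is that the name from the trigger body is used as the property name on the parsed JSON, so no jsonpath support is needed.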

Not sure if I understand your question, but I believe you can use the Parse JSON action after the HTTP trigger.
With this you will get a control over the incoming JSON message and you can choose the 'URL' value as a dynamic content in the subsequent actions.
Let me know if my understanding about your question is wrong.

OnlyOffice conversion api keeps returning -1

I am trying to call my OnlyOffice conversion API with the following data:
{
"async": false,
"filetype": "docx",
"key": "Khirz6zgfTPdfd7",
"outputtype": "pdf",
"title": "Example Document Title.docx",
"url": "https://calibre-ebook.com/downloads/demos/demo.docx"
}
I am not certain about key property value, I used Khirz6zgfTPdfd7 which is also used in their example on https://api.onlyoffice.com/editors/conversionapi ; the document is also not the one stored on docserver.
The response I retrieve is:
<?xml version="1.0" encoding="utf-8"?><FileResult><Error>-1</Error></FileResult>
which means Unknown error.
I suppose the problem might be either in key or document URL. Can I use document that is not stored on docserver and how to generate the key properly?
Or do you think I miss something else?

I am trying to call my OnlyOffice conversion API with the following data
Seems to be ok, please send log files from /onlyoffice/documentserver/converter/ and /onlyoffice/documentserver/docservice/

U-SQL: How to skip files from analysis based on content

I have a lot of files each containing a set of json objects like this:
{ "Id": "1", "Timestamp":"2017-07-20T10:43:21.8841599+02:00", "Session": { "Origin": "WebClient" }}
{ "Id": "2", "Timestamp":"2017-07-20T10:43:21.8841599+02:00", "Session": { "Origin": "WebClient" }}
{ "Id": "3", "Timestamp":"2017-07-20T10:43:21.8841599+02:00", "Session": { "Origin": "WebClient" }}
etc.
Each file contains information about a specific type of session. In this case they are sessions from a Web App, but they could also be sessions of a Desktop App. In that case the value for Origin is "DesktopClient" instead of "WebClient".
For analysis purposes say I am only interested in DesktopClient sessions.
All files representing a session are stored in Azure Blob Storage like this:
container/2017/07/20/00399076-2b88-4dbc-ba56-c7afeeb9ef77.json
container/2017/07/20/00399076-2b88-4dbc-ba56-c7afeeb9ef78.json
container/2017/07/20/00399076-2b88-4dbc-ba56-c7afeeb9ef79.json
Is it possible to skip a file when its first line already makes it clear that it is not a DesktopClient session file, as in my example? I think it would save a lot of query resources if files that I know do not contain the right session type could be skipped, since they can be quite big.
At the moment my query reads the data like this:
@RawExtract = EXTRACT [RawString] string
FROM @"wasb://plancare-events-blobs@centrallogging/2017/07/20/{*}.json"
USING Extractors.Text(delimiter:'\b', quoting : false);
@ParsedJSONLines = SELECT Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple([RawString]) AS JSONLine
FROM @RawExtract;
...
Or should I create my own version of Extractors.Text, and if so, how should I do that?

To answer some questions that popped up in the comments to the question first:
At this point we do not provide access to the Blob Store meta data. That means that you need to express any meta data either as part of the data in the file or as part of the file name (or path).
Depending on the cost of extraction and sizes of files, you can either extract all the rows and then filter out the rows where the beginning of the row is not fitting your criteria. That will extract all files and all rows from all files, but does not need a custom extractor.
Alternatively, write a custom extractor that checks for only the files that are appropriate (that may be useful if the first solution does not give you the performance and you can determine the conditions efficiently inside the extractors). Several example extractors can be found at http://usql.io in the example directory (including an example JSON extractor).
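The check that such a custom extractor would perform can be sketched independently of U-SQL. The Python below is only an illustration of the first-line test, under the assumption (taken from the question) that every line in a file has the same Session.Origin; the function name `is_desktop_session` is hypothetical and this is not a U-SQL extractor.

```python
import io
import json


def is_desktop_session(stream):
    """Decide from the first line alone whether a file holds DesktopClient sessions."""
    first = stream.readline()
    if not first.strip():
        return False  # Empty file: nothing to analyze.
    record = json.loads(first)
    return record.get("Session", {}).get("Origin") == "DesktopClient"


# In-memory stand-ins for two blob files (hypothetical data in the question's shape).
web_file = io.StringIO(
    '{ "Id": "1", "Timestamp":"2017-07-20T10:43:21.8841599+02:00", "Session": { "Origin": "WebClient" }}\n'
    '{ "Id": "2", "Timestamp":"2017-07-20T10:43:21.8841599+02:00", "Session": { "Origin": "WebClient" }}\n'
)
desktop_file = io.StringIO(
    '{ "Id": "1", "Timestamp":"2017-07-20T10:43:21.8841599+02:00", "Session": { "Origin": "DesktopClient" }}\n'
)

keep_web = is_desktop_session(web_file)
keep_desktop = is_desktop_session(desktop_file)
```

Only the first line is read and parsed, so the cost of rejecting a file does not depend on its size.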

Azure Logic Apps - Get Blob Content - Setting Content type

The Azure Logic Apps action "Get Blob Content" doesn't allow us to set the return content-type.
By default, it returns the blob as binary (octet-stream), which is useless in most cases. In general it would be useful to have text (e.g. json, xml, csv, etc.).
I know the action is in beta. Is that on the short term roadmap?

The workaround I found is to use the Logic App expression base64ToString.
For instance, create an action of type "Compose" (Data Operations group) with the following code:
"ComposeToString": {
"inputs": "@base64ToString(body('Get_blob_content').$content)",
"runAfter": {
"Get_blob_content": [
"Succeeded"
]
},
"type": "Compose"
}
The output will be the text representation of the blob.
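What this conversion does can be reproduced outside Logic Apps. The following Python sketch assumes the blob body has the `$content-type` / `$content` shape with base64-encoded content; the function name `blob_body_to_text` and the sample payload are made up for illustration.

```python
import base64
import json


def blob_body_to_text(body):
    """Decode the base64 "$content" field to text, dropping a UTF-8 BOM if present."""
    return base64.b64decode(body["$content"]).decode("utf-8-sig")


# Hypothetical body shaped like a binary (octet-stream) blob result.
sample_body = {
    "$content-type": "application/octet-stream",
    "$content": base64.b64encode(b'{"alpha": 1}').decode("ascii"),
}

text = blob_body_to_text(sample_body)
parsed = json.loads(text)  # The decoded text can now be parsed as JSON.
```

The `utf-8-sig` codec is a choice made for this sketch so that a file saved with a byte order mark still parses cleanly.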

So I had a blob sitting in Azure storage with JSON in it.
Fetching the blob got me an octet-stream back that was pretty useless, as I was unable to parse it:
BadRequest. The property 'content' must be of type JSON in the
'ParseJson' action inputs, but was of type 'application/octet-stream'.
So I set up an "Initialize variable" action of type String, pointing to GetBlobContent->File Content. The base64 conversion occurs under the hood, and I am now able to access my JSON via the variable.
No code required.
Enjoy! Healy in Tampa...

After fiddling much with Logic Apps, I finally understood what was going on.
The JSON output from the HTTP request is the JSON representation of an XML payload:
{
"$content-type": "application/xml",
"$content": "77u/PD94bWwgdm..."
}
So we can decode it, but it is useless really. That is an XML object for Logic App. We can apply xml functions to it, such as xpath.

You would need to know the content-type.
Use @{body('Get_blob_content')['$content']} to get the content part alone.

It is enough to add an "Initialize Variable" action and take the output of Get Blob Content as type "String". This will automatically parse the content.
