Logic Apps - Converting HTML symbols in plain text - azure

I'm fairly new to Logic Apps and not familiar with all the functions.
I created a simple Logic App that checks an RSS feed every so often, loops over every item it finds, takes only certain data (title, summary and URL link), pastes it into an HTML table and then sends an email with the outcome. Sounds fairly simple, right?
The problem I'm facing is that the RSS feed contains HTML entities such as &amp; or &#39;, which then appear undecoded in the email I receive. Is it possible to convert these in Logic Apps?
Additionally, I've noticed that some HTML entities appear to be double encoded.
For example, in <description>&amp;quot;Quando Romelu si mette in testa una cosa, di solito la ottiene. Ora, si sarebbe messo in testa l&amp;rsquo;Inter.</description> you can see straight away that the first &amp; belongs to the quot; that follows it. So the &amp; first has to be converted to an actual &, which then forms &quot;, which in turn has to be converted to ", if that makes sense. I don't own the feed or control it in any way. I wanted to get familiar with Logic Apps, so I thought I'd start with some football news processing.
Here's a sample of one item (out of 20) in the RSS feed:
<item>
<guid>https://www.fcinternews.it/?action=read&idnotizia=310797</guid>
<pubDate>Wed, 19 Jun 2019 09:51:40 +0200</pubDate>
<title>CdS - Il BVB vuole Pinamonti: valutazione schizzata oltre i 20 milioni </title>
<link>https://www.fcinternews.it/rassegna/cds-il-bvb-vuole-pinamonti-valutazione-schizzata-oltre-i-20-milioni-310797</link>
<description>Anche il Corriere dello Sport sottolinea la grande fila che si &egrave; messa in attesa di buone nuove dall&#39;Inter per Andrea Pinamonti, protogonista del Mondiale U-20.</description>
<category>Rassegna</category>
<enclosure url="https://net-storage.tccstatic.com/storage/fcinternews.it/img_notizie/thumb1/ec/ec620af4eeb01ebebbb662d7947a6700-85495-21a8fcf5fc9c392cfa4303d2753d5db6.jpeg" type="image/jpeg" length="9983"/>
</item>

One solution is to use an Azure Function to clean it up.
However, since you are doing something this simple, I guess you can use nested replace() calls instead. By the way, escaping a single quotation mark inside an expression is painful; use a variable to work around it.
replace(replace(replace(replace(item()['summary'],'&nbsp;',' '),'&amp;','& '),'&quot;','"'),'&#39;','')
replace(replace(replace(replace(item()['title'],'&nbsp;',' '),'&amp;','& '),'&quot;','"'),'&#39;',variables('EscapeSingleQuotation'))
Is this what you are looking for?
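For comparison, outside of Logic Apps expressions (for instance inside the Azure Function mentioned above), the same clean-up can be left to a general-purpose entity decoder. The sketch below uses Python's standard html.unescape and repeats the decoding until the text stops changing, which is one way to deal with double-encoded input such as &amp;quot;. The function name decode_entities and the max_passes limit are assumptions made for illustration; they are not part of any Logic Apps or Azure API.

```python
import html


def decode_entities(text: str, max_passes: int = 3) -> str:
    """Decode HTML entities, repeating to undo double encoding.

    max_passes is an arbitrary safety limit (an assumption, not a standard).
    """
    for _ in range(max_passes):
        decoded = html.unescape(text)
        if decoded == text:  # nothing left to decode
            break
        text = decoded
    return text


if __name__ == "__main__":
    # Double-encoded input: &amp;quot; -> &quot; -> "
    print(decode_entities("&amp;quot;Quando Romelu ... l&amp;rsquo;Inter."))
    # Singly encoded input, as in the sample item
    print(decode_entities("si &egrave; messa ... dall&#39;Inter"))
```

Note that repeated decoding would also unescape text that was meant to stay encoded (for example a literal &amp;lt; in an article about HTML), so it is only appropriate when the feed is known to be over-encoded.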

Microsoft has added a new connector called Content Conversion.
It converts HTML content to plain text.
It is available in Logic Apps, Power Automate and Power Apps, but it is still in preview.

Related

How to remove leading zero/zeros in azure logic app for variables?

Removing the first number if it's 0 or 000
e.g. 1: 000123400 → convert the data to search: 123400
e.g. 2: 0001234 → convert the data to search: 1234
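The intended transformation can be pinned down with a short sketch. It is written in Python rather than as a Logic Apps expression, purely to make the expected input and output explicit; the helper name strip_leading_zeros and the handling of an all-zero value are assumptions.

```python
def strip_leading_zeros(value: str) -> str:
    """Remove leading zeros from a digit string, keeping inner and trailing zeros."""
    stripped = value.lstrip("0")
    # Assumption: an all-zero input collapses to a single "0" rather than "".
    return stripped or "0"


if __name__ == "__main__":
    for raw in ("000123400", "0001234"):
        print(raw, "->", strip_leading_zeros(raw))
```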
What silent said in the comments is right, his design process is like this:

Web Player throws an error when one of the ILs is added in Spotfire

I have a total of 12 ILs in a report, which are added as rows to each other. But because of one IL, "DD_D30-C6", the Web Player URL is not working, even though the report works fine in the Spotfire editor.
No logs or error messages are recorded. I compared this IL to the other ILs and, syntax-wise, everything looks correct. The error message is as attached.
Figured out the answer:
Out of the 12 ILs I added as rows to my report, this one particular IL (DD_D36-C6) had a datatype mismatch. All the other ILs had "Datetime" as their datatype, while this one had the "string" datatype associated with it. Hence it failed when run in the Web Player.

Azure Cognitive Services speech to text: recognition of numeric entities as text

I was wondering if it's possible for the C++ SDK of Cognitive Services Speech to Text to return numeric entities as text instead of numbers.
Current response: 'I want to order 2 Cokes'
Expected response: 'I want to order two Cokes'
Of course I can implement such a conversion myself, but I was wondering if it's something the service already provides, particularly for Spanish.
Take a look at the sample repository at https://github.com/Azure-Samples/cognitive-services-speech-sdk,
especially the file speech_recognition_samples.cpp, function SpeechRecognitionWithLanguageAndUsingDetailedOutputFormat.
Enabling ‘detailed output’ will give you the result you want:
config->SetOutputFormat(OutputFormat::Detailed);
Then you need to look at the detailed output:
result->Properties.GetProperty(PropertyId::SpeechServiceResponse_JsonResult)
And that would create detailed output like this:
{"Duration":35500000,"NBest":[{"Confidence":0.7535948753356934,"Display":"I want to order 2 Cokes.","ITN":"I want to order 2 cokes","Lexical":"i want to order two cokes","MaskedITN":"I want to order 2 cokes"}],"Offset":17000000,"RecognitionStatus":"Success"}
The "Lexical" field is probably what you want.
Wolfgang
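To make the last step concrete, here is a minimal sketch of pulling the "Lexical" text out of the detailed JSON shown above. It is written in Python only to illustrate the structure of the payload; in the C++ SDK the same string would be parsed with whatever JSON library the project already uses. The helper name best_lexical is an assumption.

```python
import json

# Detailed-output JSON copied from the example above
detailed_json = (
    '{"Duration":35500000,"NBest":[{"Confidence":0.7535948753356934,'
    '"Display":"I want to order 2 Cokes.","ITN":"I want to order 2 cokes",'
    '"Lexical":"i want to order two cokes","MaskedITN":"I want to order 2 cokes"}],'
    '"Offset":17000000,"RecognitionStatus":"Success"}'
)


def best_lexical(payload: str) -> str:
    """Return the Lexical text of the highest-confidence NBest entry."""
    result = json.loads(payload)
    best = max(result["NBest"], key=lambda entry: entry["Confidence"])
    return best["Lexical"]


if __name__ == "__main__":
    print(best_lexical(detailed_json))  # i want to order two cokes
```

Note that the lexical form is lowercase and unpunctuated, so any capitalization has to be restored separately if it matters.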

Finding Related Topics using Google Knowledge Graph API

I'm currently working on a behavioral targeting application and I need a considerably large keyword database, tool or provider that lets my app find keywords similar to a given keyword. I recently found out that Freebase used to provide a similar service before Google acquired it and integrated it into the Knowledge Graph. I was wondering if it's possible to get a list of related topics/keywords for a given entity.
import json
import urllib.parse
import urllib.request
api_key = 'API_KEY_HERE'
query = 'Yoga'
service_url = 'https://kgsearch.googleapis.com/v1/entities:search'
params = {
    'query': query,
    'limit': 10,
    'indent': True,
    'key': api_key,
}
# Build the request URL and parse the JSON response (Python 3 urllib)
url = service_url + '?' + urllib.parse.urlencode(params)
response = json.loads(urllib.request.urlopen(url).read())
# Print each entity's name followed by its result score
for element in response['itemListElement']:
    print(element['result']['name'] + ' (' + str(element['resultScore']) + ')')
The script above returns the results below, but I'd like to receive topics related to yoga, such as health, fitness, gym and so on, rather than entities that have the word "Yoga" in their name.
Yoga Sutras of Patanjali (71.245544)
Yōga, Tokyo (28.808222)
Sri Aurobindo (28.727333)
Yoga Vasistha (28.637642)
Yoga Hosers (28.253984)
Yoga Lin (27.524054)
Patanjali (27.061115)
Yoga Journal (26.635073)
Kripalu Center (26.074436)
Yōga Station (25.10318)
I'd really appreciate any suggestions, and I'm also open to using any other API if there is any that I could make use of. Cheers.
I see your point :) So here's the script I use for that, built on Serpstat's API. Here's how it works:
First, the script collects keywords from Serpstat's database
Then, it collects search suggestions from Serpstat's database
Finally, it collects search suggestions from Google's suggestions
Note that for the script to work correctly, it's preferable to fill in all input boxes, although not all of them are required.
Keyword — required keyword
Search Engine — a search engine for which the analysis will be carried out. For example, for the US Google, you need to set the g_us. The entire list of available search engines can be found here.
Limit — the maximum number of phrases from the organic results that will take part in the analysis. You cannot set more than 1000 here.
Default keys — list of two-word keywords. You should give each of them some "weight" to receive some kind of result if something goes wrong.
Format: type, keyword, "weight" (separated by semicolons, as in the examples below). Each keyword should be written on a new line.
Types:
w — one word
p — two words
Examples:
"w; bottle; 50" — initial weight of word bottle is 50.
"p; plastic bottle; 30" — initial weight of phrase plastic bottle is 30.
"w; plastic bottle; 20" — incorrect. You cannot use a two-word phrase for the "w" type.
Bad words — comma-separated list of words you want the script to exclude from the results.
Token — here you need to enter your token for API access. It can be found on your profile page.
You can download the source code for script here
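The "Default keys" format described above can be checked with a small parser. This is only a sketch based on the three examples given in the list; it is not the downloadable script. The names DefaultKey and parse_default_key, the semicolon splitting and the integer weight are all assumptions inferred from those examples.

```python
from typing import NamedTuple


class DefaultKey(NamedTuple):
    kind: str      # "w" (one word) or "p" (two words)
    keyword: str
    weight: int    # assumed to be an integer, as in the examples


def parse_default_key(line: str) -> DefaultKey:
    """Parse one 'type; keyword; weight' line and check the word count."""
    parts = [part.strip() for part in line.split(";")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    kind, keyword, weight = parts
    expected_words = {"w": 1, "p": 2}
    if kind not in expected_words:
        raise ValueError(f"unknown type {kind!r}")
    if len(keyword.split()) != expected_words[kind]:
        raise ValueError(
            f"type {kind!r} needs {expected_words[kind]} word(s): {keyword!r}"
        )
    return DefaultKey(kind, keyword, int(weight))


if __name__ == "__main__":
    print(parse_default_key("w; bottle; 50"))
    print(parse_default_key("p; plastic bottle; 30"))
```

With this check, the third example from the list ("w; plastic bottle; 20") raises a ValueError because a two-word phrase is given for the "w" type.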

Why does Google add a space after my character?

On my page I have set a meta tag for the description, which Google then uses. That's all normal. But for some reason Google adds a space after one character. Here is my meta description:
Igraj brezplačne online igre! Izbiraj med več kot 6.000 igrami! Vsak dan dodajamo nove igre! Igraj zdaj!
Yes, it's not in English, but that's normal :D
And here is how Google shows it:
Igraj brezplač ne online igre! Izbiraj med več kot 6.000 igrami! Vsak dan dodajamo nove igre! Igraj zdaj!
The problem is that Google adds a space after the first 'č' character. Check Google's description of it:
https://www.google.si/search?q=bringler&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
I have no idea why it does this. The funny part is that after the second 'č' everything is fine, so no space is added.
Any ideas?
Your content (the actual one in the meta-description, not the one in your question) contains a hidden control character: U+008D REVERSE LINE FEED
You can see it if you analyze the characters in the string, e.g. with Rishida’s String analyser: analyze "brezplačne"
If you copy the string directly from your meta-description and search for it in Google, it converts it to brezplaÄ Â ne.
So, replace the string "brezplačne" (note that Stack Overflow removes this hidden character, so these strings are actually the same here) in your content with "brezplačne" and you should be fine (once Google visits your page again, which may take some time).
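If you would rather check the string yourself than rely on an online analyser, a few lines of Python can list hidden control characters. This is a minimal sketch: the function name find_control_chars is an assumption, and the test string recreates the problem by placing U+008D after the first "č", the position suggested by the question.

```python
import unicodedata


def find_control_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for every control character (category Cc)."""
    return [
        (index, f"U+{ord(char):04X}")
        for index, char in enumerate(text)
        if unicodedata.category(char) == "Cc"
    ]


if __name__ == "__main__":
    # "brezplačne" with U+008D inserted after the "č"
    suspect = "brezpla\u010d\u008dne"
    print(find_control_chars(suspect))  # [(8, 'U+008D')]
```

Category Cc also covers ordinary tabs and line breaks, so for multi-line content you may want to ignore those before reporting.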

Resources