Part-of-sentence alignment

I work on parallel texts in two languages.
Some tools exist to align sentences (e.g. Hunalign).
But whole sentences are often too long for my use case, so I would like to align parts of sentences instead.
Here is an example:
Input:
French | English
Bonjour ! L'année dernière, je suis allé à Paris qui est la capitale de la France. | Hello! Last year, I went to Paris which is the capital of France.
Existing result (sentence-level alignment):
French | English
Bonjour ! | Hello!
L'année dernière, je suis allé à Paris qui est la capitale de la France. | Last year, I went to Paris which is the capital of France.
Expected result (part-of-sentence alignment):
French | English
Bonjour ! | Hello!
L'année dernière, | Last year,
je suis allé à Paris | I went to Paris
qui est la capitale de la France. | which is the capital of France.
Google Translate offers a hover function, which seems to do part of this job internally.
Any idea how to get this result from the Google Translate API?
Any idea how to do it with another tool? I have tried cutting sentences into chunks and scoring their similarity with Facebook LASER and Google Universal Sentence Encoder, without much success so far.
Ideally, I'm looking for a language-agnostic solution, much like LASER, or one that supports the ~40 most common languages.
Thanks in advance for your kind advice.
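
The chunk-and-compare approach described above can be sketched as a small monotone alignment. This is only an illustration under stated assumptions: `split_chunks`, `align_chunks` and the `similarity` callback are hypothetical names, and the callback stands in for whatever score is available (for example, cosine similarity of LASER or Universal Sentence Encoder embeddings, which is not shown here). Chunks are cut at punctuation only, so this sketch does not find the boundary before "qui" / "which" in the example; finer candidate boundaries would have to be added.

```python
import re
from functools import lru_cache
from typing import Callable, List, Tuple


def split_chunks(text: str) -> List[str]:
    """Cut text after , ; : . ! ? when followed by whitespace."""
    parts = re.split(r"(?<=[,;:.!?])\s+", text.strip())
    return [p for p in parts if p]


def align_chunks(
    src: List[str],
    tgt: List[str],
    similarity: Callable[[str, str], float],
) -> List[Tuple[str, str]]:
    """Monotone alignment that maximises the summed similarity.

    Each step pairs 1 source chunk with 1 target chunk, or merges two
    neighbouring chunks on one side (1-2 or 2-1). Returns [] when the
    two sequences cannot be fully covered with these steps.
    """

    @lru_cache(maxsize=None)
    def best(i: int, j: int):
        if i == len(src) and j == len(tgt):
            return 0.0, ()
        options = []
        for di, dj in ((1, 1), (1, 2), (2, 1)):
            if i + di <= len(src) and j + dj <= len(tgt):
                a = " ".join(src[i:i + di])
                b = " ".join(tgt[j:j + dj])
                rest_score, rest_pairs = best(i + di, j + dj)
                options.append((similarity(a, b) + rest_score, ((a, b),) + rest_pairs))
        if not options:
            return float("-inf"), ()
        return max(options, key=lambda option: option[0])

    score, pairs = best(0, 0)
    return [] if score == float("-inf") else list(pairs)
```

The similarity function is passed in on purpose, so the same alignment step can be tried with different encoders without changing the dynamic program.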

Related

How do I stop Mendeley from writing two names before et al. in in-text citations? It ONLY does this for references with 3 authors

My in-text citations must use Harvard referencing, with et al. if there are more than 2 authors, listing only 1 name followed by et al.: (Smith et al., 2009)
However, Mendeley writes the names of two authors before et al., but only when there are 3 authors in total. How do I stop this?
These are my inline citation et al. settings:
et-al-min 3,
et-al-use-first 1,
et-al-subsequent-min (left blank),
et-al-subsequent-use-first (left blank)
In text these are the citations I get:
1 author (Smith, 2009),
2 authors (Smith and Jones, 2008),
3 authors (Smith, Jones, et al., 2007) <<<< how do I stop this??,
4 authors or more (Smith et al., 2004)

If you want just the first author followed by et al. whenever there are two or more authors, set et-al-min 2, et-al-use-first 1, et-al-subsequent-min (blank), et-al-subsequent-use-first (blank).
Per the CSL specification, the et-al-min / et-al-use-first attributes enable et-al abbreviation: if the number of names in a name variable matches or exceeds the number set on et-al-min, the rendered name list is truncated after reaching the number of names set on et-al-use-first.
So the citation shows one name (et-al-use-first 1) followed by et al. whenever there are at least two authors (et-al-min 2).
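
The truncation rule quoted above can be sketched in a few lines. This is a simplified illustration, not a CSL processor: `render_names` is a hypothetical helper, the "and" joining of untruncated lists is an assumption about the citation style, and disambiguation and subsequent-citation settings are ignored.

```python
from typing import List


def render_names(names: List[str], et_al_min: int, et_al_use_first: int) -> str:
    """Apply the et-al-min / et-al-use-first rule to a list of surnames."""
    if len(names) >= et_al_min:
        # Threshold reached: keep the first N names, then add "et al."
        return ", ".join(names[:et_al_use_first]) + " et al."
    if len(names) == 1:
        return names[0]
    # Below the threshold: list every name (assumed "A, B and C" style).
    return ", ".join(names[:-1]) + " and " + names[-1]


print(render_names(["Smith", "Jones"], 3, 1))           # Smith and Jones
print(render_names(["Smith", "Jones", "Brown"], 3, 1))  # Smith et al.
print(render_names(["Smith", "Jones"], 2, 1))           # Smith et al.
```

Under this rule alone, et-al-min 3 with et-al-use-first 1 already gives a single name for three authors, so a second name appearing in that case would have to come from some other mechanism in the style.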

I had (seemingly) the same problem, but it turned out not to be caused directly by the CSL et-al settings:
it happened because my bibliography contained two papers with the same author and year.
If I delete one of the two citations, the issue no longer appears.

How to automatically pass the detected language to Detect Sentiment in Azure Text Analytics in a Logic App?

I've created a logic app that detects the language of a text and then its sentiment with Cognitive Services. I want to set the language parameter to the language that was actually detected. I've tried the following and many other things, but nothing has worked.
Can anyone suggest a solution?
(The text is German, but when I run it, it says: "Supplied language not supported. Pass in one of: ar,da,de,el,en,es,fi,fr,it,ja,nl,no,pl,pt-PT,ru,sv,tr,zh-Hans")
The text, for copying:
Hallo E-Bike Team, ich habe ein Problem mit meinem E-Bike. Es ist das
Model ProRide2E. Seit heute Morgen steht auf dem Display folgender
Hinweis: „Akkuleistung beeinträchtigt. Fehlercode: XB1200AB“ Das darf
doch wohl nicht wahr sein. Ich bin echt sauer. Wieso ist das doofe
Bike immer kaputt?
Ein genervter Kunde
Name

I used dynamic content instead of the expression #('Detect_Language')?['iso6391Name']. The dynamic content I chose was Language Code, which created a For each loop. In the response, instead of outputting the score as usual, I added an expression: body('Detect_Sentiment')[0]?['score']. That worked!
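
Outside the Logic Apps designer, the same hand-off (take the detected language code and pass it as the language of the sentiment request) might look like the sketch below. The payload shapes are an assumption based on the Text Analytics v2.x REST format (`documents`, `detectedLanguages`, `iso6391Name`); `build_sentiment_request` is a hypothetical helper, and no network call is made.

```python
from typing import Dict


def build_sentiment_request(
    detect_response: dict,
    texts: Dict[str, str],
    default_language: str = "en",
) -> dict:
    """Build a sentiment request body from a language-detection response.

    detect_response: assumed shape
        {"documents": [{"id": "1", "detectedLanguages": [{"iso6391Name": "de", "score": 1.0}]}]}
    texts: mapping of document id to text.
    """
    languages = {}
    for doc in detect_response.get("documents", []):
        detected = doc.get("detectedLanguages") or []
        if detected:
            # Keep the candidate with the highest detection score.
            best = max(detected, key=lambda d: d.get("score", 0.0))
            languages[doc["id"]] = best["iso6391Name"]
    return {
        "documents": [
            {"id": doc_id, "language": languages.get(doc_id, default_language), "text": text}
            for doc_id, text in texts.items()
        ]
    }
```

The fallback language is only a convenience for documents with no detection result; whether a fallback is appropriate depends on the application.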

Azure Text Analytics accuracy in German differs from the demo

I've created a website that communicates with my logic app, which in turn uses Text Analytics from Azure. But my application behaves differently from the demo, which you can find here: https://azure.microsoft.com/de-de/services/cognitive-services/text-analytics/. For example, when you post:
Hallo E-Bike Team,
ich habe ein Problem mit meinem E-Bike. Es ist das Model ProRide2E.
Seit heute Morgen steht auf dem Display folgender Hinweis:
„Akkuleistung beeinträchtigt. Fehlercode: XB1200AB“
Das darf doch wohl nicht wahr sein. Ich bin echt sauer. Wieso ist das
doofe Bike immer kaputt?
Ein genervter Kunde
Name
which is very negative, the demo responds with a sentiment score of 31%, but my app responds with 50%, which is clearly wrong: the text is negative, so the score should be below 50%. I use the same Cognitive Services as the demo, but my results do not match it.
Is there any way to improve my accuracy?
PS: I'm using the free subscription. Does the accuracy change if I change that?

I tried to reproduce the issue on my side and found that when you don't set the Language parameter, you get a sentiment score of 0.5.
When you set the Language parameter to de, you get a sentiment score of 0.30392158031463623.
[Screenshot: setting the Language parameter]

Logic Apps - Converting HTML symbols to plain text

I'm fairly new to Logic Apps and not familiar with all the functions.
I created a simple Logic App that checks an RSS feed every so often, loops over every item it finds, takes only certain data (title, summary and URL link), pastes them into an HTML table and then sends an email with the outcome. Sounds fairly simple, right?
The problem I'm facing is that the RSS feed contains certain HTML entities such as &amp; or &#39;, which then appear in the email I receive. Is it possible to convert these in Logic Apps?
Additionally, I've noticed that some HTML characters are "double encoded".
For example, looking at <description>&amp;quot;Quando Romelu si mette in testa una cosa, di solito la ottiene. Ora, si sarebbe messo in testa l&amp;rsquo;Inter.</description> you'd realise straight away that the leading &amp; is there for the quot; that follows it. So it seems to expect that first the &amp; is converted to an actual &, which then forms &quot;, which is in turn converted to ", if that makes sense. I don't own the feed or control it in any way. I wanted to get familiar with Logic Apps, so I thought I'd start with some football news processing.
Here's a sample of one item (out of 20) in the RSS feed:
<item>
<guid>https://www.fcinternews.it/?action=read&idnotizia=310797</guid>
<pubDate>Wed, 19 Jun 2019 09:51:40 +0200</pubDate>
<title>CdS - Il BVB vuole Pinamonti: valutazione schizzata oltre i 20 milioni </title>
<link>https://www.fcinternews.it/rassegna/cds-il-bvb-vuole-pinamonti-valutazione-schizzata-oltre-i-20-milioni-310797</link>
<description>Anche il Corriere dello Sport sottolinea la grande fila che si &egrave; messa in attesa di buone nuove dall&#39;Inter per Andrea Pinamonti, protogonista del Mondiale U-20.</description>
<category>Rassegna</category>
<enclosure url="https://net-storage.tccstatic.com/storage/fcinternews.it/img_notizie/thumb1/ec/ec620af4eeb01ebebbb662d7947a6700-85495-21a8fcf5fc9c392cfa4303d2753d5db6.jpeg" type="image/jpeg" length="9983"/>
</item>

One solution is to use an Azure Function to clean it up.
However, for something like this I guess you can use replace actions. By the way, escaping single quotation marks is painful; use a variable to work around it.
replace(replace(replace(replace(item()['summary'],'&nbsp;',' '),'&amp;','&'),'&quot;','"'),'&#39;','')
replace(replace(replace(replace(item()['title'],'&nbsp;',' '),'&amp;','&'),'&quot;','"'),'&#39;',variables('EscapeSingleQuotation'))
Is this what you are looking for?
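
If the clean-up is moved into code, for instance into the Azure Function mentioned above, the double encoding can be handled by decoding repeatedly until the text stops changing. This is a minimal sketch using Python's standard `html.unescape`; `decode_entities` and its `max_passes` limit are hypothetical names chosen for illustration.

```python
import html


def decode_entities(text: str, max_passes: int = 3) -> str:
    """Decode HTML entities, repeating to undo double encoding.

    "&amp;quot;" becomes "&quot;" on the first pass and '"' on the second.
    """
    for _ in range(max_passes):
        decoded = html.unescape(text)
        if decoded == text:
            break  # nothing left to decode
        text = decoded
    return text


print(decode_entities("si &egrave; messa in attesa di buone nuove dall&#39;Inter"))
```

Repeated decoding assumes the feed never contains a literal entity that should stay encoded; if it does, a single pass is the safer choice.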

Microsoft has added a new connector called Content Conversion.
It converts HTML content to a plain string.
It is available in Logic Apps, Power Automate and Power Apps, but it is still in preview.

Reverse specific words

I am a beta tester for a hockey game, and the CSV file has the home and away teams reversed. The day, month, and year are correct, though.
This:
20;1;1995;Toronto Maple Leafs;Los Angeles Kings
20;1;1995;Buffalo Sabres;New York Rangers
20;1;1995;St. Louis Blues;San Jose Sharks
20;1;1995;Pittsburgh Penguins;Tampa Bay Lightning
20;1;1995;Dallas Stars;Vancouver Canucks
20;1;1995;Calgary Flames;Winnipeg Jets
To this:
20;1;1995;Los Angeles Kings;Toronto Maple Leafs
20;1;1995;New York Rangers;Buffalo Sabres
20;1;1995;San Jose Sharks;St. Louis Blues
20;1;1995;Tampa Bay Lightning;Pittsburgh Penguins
20;1;1995;Vancouver Canucks;Dallas Stars
20;1;1995;Winnipeg Jets;Calgary Flames
Of course, this is just a small sample...
Any help would be greatly appreciated!
Thank you!

Try Replace All (Ctrl+H) with regular expressions enabled.
Use (\d+;\d+;\d+;)([\s\S]+);([\s\S]+) as the search pattern and $1$3;$2 as the replacement.
I can't test it right now, as I don't have Notepad++ installed on this computer, but I tested it in Sublime and it worked.
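
The same swap can be scripted for the whole file. The sketch below uses Python's `re` module; note that `[\s\S]+` can also match line breaks, so this version uses `[^;\n]+` and line anchors to keep each match on one row. `swap_teams` is a hypothetical helper, and it assumes "\n" line endings and exactly five semicolon-separated fields per row.

```python
import re

# date fields (day;month;year;) then the two team names, one row per line
ROW = re.compile(r"^(\d+;\d+;\d+;)([^;\n]+);([^;\n]+)$", re.MULTILINE)


def swap_teams(text: str) -> str:
    """Swap the last two fields of every day;month;year;team;team row."""
    return ROW.sub(r"\1\3;\2", text)


sample = (
    "20;1;1995;Toronto Maple Leafs;Los Angeles Kings\n"
    "20;1;1995;Buffalo Sabres;New York Rangers"
)
print(swap_teams(sample))
```

Rows that do not fit the pattern are left unchanged, which makes it easy to spot malformed lines afterwards.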

I would just do it manually: copy the data into regular Notepad, use find and replace, then paste it all back again.
For example:
find: toronto
replace: otnorot
(using Replace All each time), and so on for the other names.
Unless you need code to do it automatically, in which case you are going to need a lot more work to put it together.
I had a similar question before!
