How to change the stress in words with Azure Speech text-to-speech

Please tell me how I can change the stress in certain words with the Azure text-to-speech voice engine. I use Russian voices, and I am not working through SSML.
When I send text for processing, in some words it puts the stress on the wrong syllable or letter.
I know that some voice engines use special characters such as + or ' in front of the stressed vowel. I have not found such an option here.

To specify the stress for individual words you can use the SpeakSsmlAsync method and either pass a lexicon URL or specify the pronunciation directly in the SSML with the phoneme element. In both cases you can use IPA.
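For example, in Russian the stress distinguishes за́мок (castle) from замо́к (lock). A minimal sketch with the Speech SDK for Python, assuming a Russian neural voice such as ru-RU-SvetlanaNeural (the voice name and IPA transcription here are illustrative, not taken from the original answer):

import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="<your-key>", region="<your-region>")
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)

# The <phoneme> element overrides the default pronunciation, so the stress
# can be forced onto the second syllable: замок is read here as замо́к ("lock").
ssml = """
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="ru-RU">
  <voice name="ru-RU-SvetlanaNeural">
    <phoneme alphabet="ipa" ph="zɐˈmok">замок</phoneme>
  </voice>
</speak>
"""
result = synthesizer.speak_ssml_async(ssml).get()

A custom lexicon works the same way: host an XML pronunciation lexicon and reference it with a <lexicon uri="..."/> element inside the <voice> element.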

Related

Remove part of a string in each row of a large column of data in KNIME

I am stumped.
I have a column with a few thousand rows of unique addresses of universities, pharma companies, etc. in a KNIME workflow.
Example:
55 Shattuck Street Boston Massachusetts 02115 US [NAT: US RES: US] for all designated states
What I need is to clean the data so that each row looks nice and computable like this:
55 Shattuck Street Boston Massachusetts 02115 US.
My problem is I can't seem to get the system to remove everything after US. Does anyone know a suitable approach in KNIME?
You should be able to use either String Replacer or String Manipulation for this. The first one lets you use either a simple wildcard or a full regular expression pattern while the second one uses a Java-like syntax - the choice comes down to how many different variations on the input data you need to handle and which syntax you prefer.
If you just need to remove any text between square brackets, including the space before the opening bracket, you can use String Replacer configured to match that bracketed part and replace it with an empty string.
Besides the nodes already mentioned by nekomatic, which will work perfectly for the given scenario, there's also a user-friendly regular expression tool in the Palladian nodes extension called Regex Extractor, which allows you to build your regexes with a live preview, as you might know from popular online regex testers.
For your scenario, you could e.g. set up a regex like this:
^(?<address>.*)(?:\s\[.*)
In prose, this means: capture all characters up to a space followed by an opening square bracket and output them into a column named address.
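Outside of KNIME, you can check the behaviour of that pattern with a quick script. A minimal sketch in Python; note that Python writes named groups as (?P<...>) rather than (?<...>):

import re

line = "55 Shattuck Street Boston Massachusetts 02115 US [NAT: US RES: US] for all designated states"
# Same pattern as above; only the named-group syntax differs in Python.
match = re.match(r"^(?P<address>.*)(?:\s\[.*)", line)
if match:
    print(match.group("address"))  # 55 Shattuck Street Boston Massachusetts 02115 US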
The Palladian extension is available here as a free plugin for KNIME Desktop and provides a variety of different tools for web, text, and geo data mining and classification.

Is there any option available in Azure Search for spell check?

I am using Azure Search in my bot application.
If we give input with a spelling mistake in a short word, like trvel => travel, we get the response properly.
But if I enter "travelexpense", I do not get any result.
Currently I am passing the input through a fuzzy search.
I suggested using the Bing Spell Check API, but it was not approved because of concerns that our input may be stored externally.
Is there any option available in Azure Search to correct words like "travelexpense"?
Is there any other option available for this scenario?
The closest option I would say is a phonetic analyzer:
https://learn.microsoft.com/en-us/azure/search/index-add-custom-analyzers
There are a couple of other things you can try:
Enable Auto Complete and Suggestions (https://learn.microsoft.com/en-us/azure/search/search-autocomplete-tutorial)
Create synonyms (https://learn.microsoft.com/en-us/azure/search/search-synonyms); a minimal sketch follows below
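For the synonym route, a compound like "travelexpense" can be mapped onto the spaced phrase. A minimal sketch assuming the azure-search-documents Python SDK; the endpoint, key, and map name are placeholders, and the map still has to be attached to the searchable field's synonym map setting before it takes effect:

from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import SynonymMap

client = SearchIndexClient("<your-search-endpoint>", AzureKeyCredential("<admin-key>"))

# Solr-style rule: "travelexpense" is treated as equivalent to "travel expense"
# once the map is attached to a searchable field in the index definition.
synonym_map = SynonymMap(name="expense-synonyms", synonyms=["travelexpense, travel expense"])
client.create_synonym_map(synonym_map)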

Update pronunciation of words in my Google Action

I have intents with responses done in Dialogflow with fulfillment enabled, and I have integrated with Google Assistant. There is a specific word "FICO" (as in FICO score) where the pronunciation is wrong when the Assistant responds. Is there a way to change the pronunciation of that specific word?
Instead of sending back plain text to be used for text-to-speech generation, you can use the SSML <sub> tag to provide an aliased pronunciation for the word in question. You might try something like this to see how it sounds:
<speak>
Your <sub alias="fyeco">FICO</sub> score is
</speak>
or fiddle with it till it sounds the way you want. The part inside the tag will be displayed, while the alias part will be spoken.
The code for this might be something like
const msg = `<speak>Your <sub alias="fyeco">FICO</sub> score is ${score}.</speak>`
conv.add( msg );

Named entity recognition - tagging tools

Does anyone have a recommendation for a tagging tool for NER types in raw text?
The input for the tool should be a library of text files (simple .txt format), there should be a convenient UI for selecting words and setting the tag/annotation that fits the selection, and the output should be a structural representation of the tags (e.g. start index, end index, and tag in JSON format).
Founder of LightTag here.
We provide a super convenient interface to do span annotations such as named entity recognition, classification, and relationships.
You can work as one labeler or bring in a team, and LightTag will distribute the work between everyone automatically (no more selecting files and remembering what you labeled already).
You can upload your own suggestions and let labelers use those, or use LightTag's built-in model.
Of course you can annotate at the character level and highlight subwords or multi-word phrases.
You can try https://github.com/lasigeBioTM/MER (bash)
see the demo at http://labs.fc.ul.pt/mer/
Online tools:
I guess Dataturks' POS tool should work fine for your use case; you can just upload your data and specify the labels. The UI seems convenient enough.
Here is the link:
https://dataturks.com
It's an online tool, so you can work with multiple people to get the tagging done.
The exact output format you are looking for is not supported, but it can easily be converted to what you need: the output looks like word___LABEL word2___LABEL, so a simple two-line script can convert it to start and end indices (see the sketch below).
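For instance, a conversion along these lines would do (a small Python sketch; it assumes the labeled words are the original text joined by single spaces):

import json

def to_spans(tagged_line):
    spans = []
    cursor = 0
    for token in tagged_line.split(" "):
        word, _, label = token.partition("___")
        spans.append({"start": cursor, "end": cursor + len(word), "tag": label})
        cursor += len(word) + 1  # skip the separating space
    return spans

print(json.dumps(to_spans("Boston___CITY is___O big___O")))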
Offline:
Another tool you can check out is Prodigy; it's downloadable software that does similar things, though you might have to pay for it upfront.
https://prodi.gy

Google Home -> Dialogflow entity matching very bad for non-dictionary entities?

With Dialogflow (API.AI) I find that vessel names are not matched well when the input comes from Google Home.
It seems as if the speech-to-text engine completely ignores them and just transcribes based on dictionary words, so Dialogflow can't match the resulting text at the end.
Is it really like that or is there some way to improve?
Thanks and
Best regards
I'd recommend looking at Dialogflow's training feature to identify where the Google Assistant's speech recognition may not have worked the way you expect. There you'll see how Google's speech recognition detected words you may not have accounted for. Where you'd like to match these unrecognized words to an entity value, simply add them as synonyms.
