IBM Speech to Text Alphanumeric String recognition? - speech-to-text

In trying to get Speech to Text (IBM Voice Gateway IVR app) to recognize alpha-numeric character strings, I am wondering if I could create a custom grammar or entity that would restrict STT to recognizing just individual letters and numbers, excluding words altogether. For example, here's a typical string: 20Y0H8C. Watson comes back with words and numbers, like "two" instead of "2". Digit strings work fine. I realize that letter recognition is problematic with typical ASR, but I'm hoping Watson is up to the task. I noticed there are no system entities for alphanumeric characters. Any suggestions are much appreciated.

In this case, set smart_formatting to true.
The smart_formatting parameter converts dates, times, series of digits and numbers, phone numbers, currency values, and Internet addresses into more conventional representations in the final transcript of a recognition request. The conversion makes the transcript more readable and enables better post-processing of the transcription results. You set the parameter to true to enable smart formatting, as in the following example; by default, the parameter is false and smart formatting is not performed.
Check:
curl -X POST -u {username}:{password} \
--header "Content-Type: audio/flac" \
--data-binary @{path}audio-file.flac \
"https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?smart_formatting=true"
Result:
Voice: The quantity is one million one hundred and one
Result: The quantity is 1000101
Check IBM Official documentation.
Note: The smart formatting feature is currently beta functionality that is available for US English only.
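If the transcript still comes back as spelled-out number words and single letters, a small post-processing step can collapse it into the expected string. The following is only a sketch: `collapse_alphanumeric` and `NUMBER_WORDS` are hypothetical names, not part of the Watson API, and it assumes the recognizer returns each character as a separate space-delimited token, which real audio may not give you.

```python
# Hypothetical post-processing helper; not part of the Speech to Text service.
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def collapse_alphanumeric(transcript: str) -> str:
    """Join number words, digits, and single letters into one string."""
    chars = []
    for token in transcript.lower().split():
        if token in NUMBER_WORDS:
            chars.append(NUMBER_WORDS[token])
        elif len(token) == 1 and token.isalnum():
            # A single spoken letter (or digit) is kept, upper-cased.
            chars.append(token.upper())
        elif token.isdigit():
            # Smart formatting may already have produced a run of digits.
            chars.append(token)
        else:
            # Anything else is a real word; refuse rather than guess.
            raise ValueError(f"unexpected token: {token!r}")
    return "".join(chars)


print(collapse_alphanumeric("two zero Y zero H eight C"))
```

Rejecting unknown tokens instead of skipping them lets the IVR re-prompt the caller when the recognizer returns an ordinary word.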

Related

can Google Places API do a fuzzy search

Can I set Google Places API to do a fuzzy search? It seems Google map search (which uses JavaScript) does that automatically, but it appears the REST API does not. I am frustrated by having to type in the exact hotel name; any spelling error brings up no results.
Try Text Search requests:
The Google Places API Text Search Service is a web service that returns information about a set of places based on a string — for example "pizza in New York" or "shoe stores near Ottawa" or "123 Main Street". The service responds with a list of places matching the text string and any location bias that has been set.
The service is especially useful for making ambiguous address queries in an automated system, and non-address components of the string may match businesses as well as addresses. Examples of ambiguous address queries are incomplete addresses, poorly formatted addresses, or a request that includes non-address components such as business names.
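A Text Search request is an HTTP GET with the free-form string in a `query` parameter. The sketch below only builds the request URL; the endpoint and parameter names reflect my understanding of the Places web service, `text_search_url` is a hypothetical helper, and `YOUR_API_KEY` is a placeholder.

```python
from urllib.parse import urlencode

# Endpoint of the Places Text Search web service, as I understand it.
TEXT_SEARCH_URL = "https://maps.googleapis.com/maps/api/place/textsearch/json"


def text_search_url(query: str, api_key: str) -> str:
    """Build a Text Search URL for a free-form query string."""
    return TEXT_SEARCH_URL + "?" + urlencode({"query": query, "key": api_key})


# The query is sent as typed; how well misspellings match is up to the service.
print(text_search_url("pizza in New York", "YOUR_API_KEY"))
```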

How to get entities of type alpha-numeric using DialogFlow

I am creating a Google Action using Dialogflow. I need a user input whose entity is alphanumeric (like a part number or order number).
I have tried the system entity type @sys.flight-number, but it didn't help.
This is going to be difficult, as Dialogflow is not designed to match alpha-numeric characters. Trying to input a series of letters and numbers can be easily misinterpreted, especially if just being spoken. If you can, I'd recommend not doing that.
Perhaps the user inputs these characters using an app or website, then the Action will authenticate using Google Sign-In or OAuth and pull the values from your account.
Alternatively, your Dialogflow intent could have a custom alphanumeric entity marked as a list, and let the user input values using the phonetic alphabet.
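On the fulfillment side, the list of phonetic words would then need to be turned back into characters. A minimal sketch, assuming the list-type entity arrives as a list of strings; `decode_phonetic` and `NATO` are hypothetical names, not Dialogflow APIs.

```python
# NATO phonetic alphabet, with a few common spelling variants.
NATO = {
    "alfa": "A", "alpha": "A", "bravo": "B", "charlie": "C", "delta": "D",
    "echo": "E", "foxtrot": "F", "golf": "G", "hotel": "H", "india": "I",
    "juliett": "J", "juliet": "J", "kilo": "K", "lima": "L", "mike": "M",
    "november": "N", "oscar": "O", "papa": "P", "quebec": "Q", "romeo": "R",
    "sierra": "S", "tango": "T", "uniform": "U", "victor": "V",
    "whiskey": "W", "xray": "X", "x-ray": "X", "yankee": "Y", "zulu": "Z",
}


def decode_phonetic(values):
    """Convert a list of phonetic words and digits into one identifier."""
    chars = []
    for value in values:
        key = value.strip().lower()
        if key.isdigit():
            chars.append(key)
        elif key in NATO:
            chars.append(NATO[key])
        else:
            raise ValueError(f"not a phonetic word or digit: {value!r}")
    return "".join(chars)


print(decode_phonetic(["Juliett", "Alfa", "1", "2", "3", "4"]))
```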

What is Lucene query to search for a wild character string

In Kibana I am trying to pull my application log messages that have masked fields.
Example log message:
***statusMessage=, displayMessage=, securityInfoOutput=securityPin=pin=****, pinHint=*************
I want to search and pull the messages that have masked data - more than two consecutive *'s in the message.
Trying with search term message:"pin=\*\*\*\*"
but it didn't work
You seem to be thinking of search in the same way you'd type CTRL+F and search in a file. Search engines don't work that way. Search works based on exact matches of tokens. Tokens typically correspond to words extracted from text.
You can control how text is transformed into tokens using a process known as analysis. Analysis runs text through tokenization and various filters that decide how text is broken up into tokens and other pieces of metadata associated with each token.
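To see why the asterisks cannot be matched, it helps to look at what a typical analyzer leaves behind. The function below is a deliberately simplified stand-in, not Lucene's actual analyzer: it lowercases the text and keeps only runs of letters and digits. The tokens your index really contains depend on the analyzer configured for the field.

```python
import re


def simple_analyze(text):
    """Simplified analysis: lowercase, keep alphanumeric runs, drop the rest."""
    return re.findall(r"[a-z0-9]+", text.lower())


message = "securityPin=pin=****, pinHint=*************"
tokens = simple_analyze(message)
print(tokens)

# No token contains an asterisk, so there is nothing for "****" to match.
print(any("*" in token for token in tokens))
```

Under this kind of analysis the masked values never reach the index, which is why the field would need a different analyzer, or a non-analyzed copy, before such a search could work.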
This blog post I wrote might help put some of this into context.

Google prediction API - Training data syntax for multi classification

I am trying to harness the power of the Google Prediction API to classify my data. Each item in my DB can have multiple categories assigned to it.
For example: "My Nexus phone is rebooting constantly" could be assigned both #Android and #troubleshooting tags.
I would like to upload my training data to Google, but I'm not sure how to apply both tags to the same content. The examples I've found use a syntax that provides one category for each content item, like so:
"Android" ,"My Nexus phone is rebooting constantly"
What is the right syntax for multi-classification training data?
Unless I'm misunderstanding something from your question, I think the answer to it is in the docs here.
Namely, the section about text strings explains that when you submit a text string, the system splits it into multiple strings, using whitespace as the delimiter. The docs give the example of "Godzilla vs Mothra" becoming "Godzilla", "vs", and "Mothra". So in your case, you could just use "Android troubleshooting". The system will separate it into "Android" and "troubleshooting".
From the docs:
Each line can only have one label assigned, but you can apply multiple labels to one example by repeating an example and applying different labels to each one. For example:
"excited", "OMG! Just had a fabulous day!"
"annoying", "OMG! Just had a fabulous day!"
If you send a tweet to this model, you might get a classification something like this: "excited":0.6, "annoying":0.2.
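The repeat-the-example approach quoted from the docs is easy to generate from data where each item carries several tags. A minimal sketch, assuming the items are available as (text, labels) pairs; `training_rows` and `to_csv` are hypothetical helper names.

```python
import csv
import io


def training_rows(examples):
    """Yield one (label, text) row per label, repeating the text each time."""
    for text, labels in examples:
        for label in labels:
            yield (label, text)


def to_csv(examples):
    """Render the rows as quoted CSV, label first and text second."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(training_rows(examples))
    return buf.getvalue()


examples = [
    ("My Nexus phone is rebooting constantly", ["Android", "troubleshooting"]),
]
print(to_csv(examples), end="")
```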

Convert an alpha-numeric string into a gibberish barcode?

A container is identified with the label JA1234. This container should always go to destination A.
Another container is identified with the label 1234. The vast majority of containers are labeled this way and these always go to destination B.
(Note: The pool of containers constantly fluctuates so we can't maintain a master list.)
The users can either scan or key in the container identifier. Many of the containers aren't barcoded so they need to type in the number. When it gets typed in, the prefix 'JA' gets ignored and suddenly the program's error checks fail (allowing wrong destinations).
To prevent manual entry and to force barcoding I would like to require the program to scan a barcode. The only way to get the users to scan the barcode consistently is to provide a barcode in a gibberish (i.e. hexadecimal) format.
Is there any built-in .NET framework feature that would convert the readable string into something unreadable that would require scanning? It would need to be reversible.
It sounds like you want the users to always input the whole string, and your users are ignoring part of the string. To solve this you want the users to just use the barcode scanner.
But you really have three choices:
1. Only print out the barcode. They can't type what they can't see. However, this is bad because if a barcode is damaged you won't be able to fall back to user entry.
2. Encode it using something like System.Convert.ToBase64String. This is bad because then you'll have to print values like SkExMjM0 and MTIzNA== for JA1234 and 1234, which are easy to mistype when a user needs to type them.
3. Use a check digit and append it to the string. You can then reject codes incorrectly entered or incorrectly read by the barcode scanner. The downside is there's nothing built in that can directly convert "JA2134" and you have to create your own check digit function.
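For the third option, one possible check digit function is the Luhn mod N algorithm over the 36 characters 0-9 and A-Z. This is my choice for illustration, not something the answer prescribes, and it is sketched in Python rather than .NET; the function names are hypothetical.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def _luhn_sum(text, factor):
    """Luhn mod N sum, walking right to left and alternating the factor."""
    n = len(ALPHABET)
    total = 0
    for ch in reversed(text):
        addend = factor * ALPHABET.index(ch)
        factor = 1 if factor == 2 else 2
        # Add the base-n "digits" of the addend together.
        total += addend // n + addend % n
    return total


def check_char(label):
    """Return the check character for a label made of 0-9 and A-Z."""
    n = len(ALPHABET)
    remainder = _luhn_sum(label.upper(), 2) % n
    return ALPHABET[(n - remainder) % n]


def add_check_char(label):
    """Append the check character to the label."""
    label = label.upper()
    return label + check_char(label)


def is_valid(code):
    """True if the last character is the correct check character."""
    code = code.upper()
    if len(code) < 2 or any(ch not in ALPHABET for ch in code):
        return False
    return _luhn_sum(code, 1) % len(ALPHABET) == 0


full = add_check_char("JA1234")
print(full, is_valid(full))
# Dropping the 'JA' prefix while keeping the printed check character fails.
print(is_valid(full[2:]))
```

Because the check character depends on the prefix, a user who types only the digits together with the printed check character gets rejected, which is the failure case described in the question.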
