I am trying to save the pronunciation of a French word into a .wav or .mp3 file.
I was wondering if there was anywhere on the Google Translate API (since it has a pronunciation functionality) that allows me to achieve this objective. Other libraries would work too.
Since this question was asked, it's gotten much harder to "scrape" MP3s from Google Translate, but Google has (finally) set up a TTS API. Interestingly, it is billed by input characters, with the first 1 or 4 million input characters per month being free (depending on whether you use WaveNet or old-school voices).
Nowadays to do this using gcloud on the command line (versus building this into an app) you would do roughly as follows (I'm paraphrasing the TTS quick start). You need base64, curl, gcloud, and jq for this walkthrough.
Create a project on the GCP console, or run something like gcloud projects create example-throwaway-tts
Enable billing for the project. Do this even if you don't intend to exceed the freebie quota.
Use the GCP console to enable the TTS API for the project you just set up.
Use the console again, this time to make a new service account.
Use any old name
Don't give it a role. You'll get a warning. This is okay.
Select key type JSON if it isn't already selected
Click Create
Hold onto the JSON file that your browser downloads
Set an environment variable to point at that file, e.g. export GOOGLE_APPLICATION_CREDENTIALS="$HOME/Downloads/service-account-file.json" (use $HOME rather than ~, because the shell does not expand a tilde inside double quotes)
Get the appropriate access token:
Tell gcloud to use that new project: gcloud config set project example-throwaway-tts
Set a variable to the token that gcloud prints: TTS_ACCESS_TOKEN=$(gcloud auth application-default print-access-token)
Put together a JSON request. I'll give an example below. For this example we'll call it request.json
Lastly, run the following:
curl \
-H "Authorization: Bearer $TTS_ACCESS_TOKEN" \
-H "Content-Type: application/json; charset=utf-8" \
--data @request.json \
"https://texttospeech.googleapis.com/v1/text:synthesize" \
| jq -r '.audioContent' \
| base64 --decode > very_simple_example.mp3
What this does is to
authenticate using the default access token for the project you set up
set the content type to JSON, so the API parses the request body as JSON
use request.json as the data to send, via curl's --data flag (the @ prefix makes curl read the file; --data-raw would send the literal text "@request.json" instead)
extract the value of audioContent from the response (jq -r strips the surrounding JSON quotes)
base64 decode that content
save the whole mess as an MP3
Contents of request.json follow. You can see where to insert your desired text, adjust the voice or change output formats via audioConfig:
{
"input":{
"text":"very simple example"
},
"voice":{
"languageCode":"en-gb",
"name":"en-GB-Standard-A",
"ssmlGender":"FEMALE"
},
"audioConfig":{
"audioEncoding":"MP3"
}
}
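If you would rather build the request and decode the response in a script than with jq and base64, the same two steps can be sketched in Python. This is only a rough sketch: the function names are made up for illustration, and the fr-FR language code (for the French word in the question) is my assumption. It leaves out the voice name so the service can pick a default voice for that language. It does not send anything over the network.

```python
import base64
import json


def build_tts_request(text, language_code="fr-FR", audio_encoding="MP3"):
    """Return the JSON body for a text:synthesize call as a string.

    language_code defaults to French here (an assumption for this example).
    """
    body = {
        "input": {"text": text},
        "voice": {"languageCode": language_code},
        "audioConfig": {"audioEncoding": audio_encoding},
    }
    # ensure_ascii=False keeps accented characters readable in request.json
    return json.dumps(body, ensure_ascii=False)


def extract_audio(response_text):
    """Decode the base64 audioContent field of a response into raw bytes."""
    return base64.b64decode(json.loads(response_text)["audioContent"])


if __name__ == "__main__":
    print(build_tts_request("Bonjour"))
```

Write the output of build_tts_request to request.json and the curl call above can use it unchanged. The bytes returned by extract_audio are what you write to the .mp3 file, opened in binary mode.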
Original Answer
As Hugolpz alludes, if you know the word or phrase you want (via a previous Translate API call), you can get MP3s from a URL like http://translate.google.com/translate_tts?ie=UTF-8&q=Bonjour&tl=fr
Note that &tl=fr ensures that you get French instead of the default English.
You will need to rate-limit yourself, but if you're looking for a small number of words or phrases you should be fine.
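For what it's worth, here is a small sketch of building that URL in Python so that accents and spaces get percent-encoded properly. The helper name is hypothetical and it only builds the string. Whether the endpoint still answers is not something I can promise.

```python
from urllib.parse import urlencode


def translate_tts_url(phrase, language="fr"):
    """Build the translate_tts URL for a phrase (hypothetical helper)."""
    # urlencode percent-encodes non-ASCII characters and turns spaces into '+'
    query = urlencode({"ie": "UTF-8", "q": phrase, "tl": language})
    return "http://translate.google.com/translate_tts?" + query


if __name__ == "__main__":
    print(translate_tts_url("Bonjour"))
```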
Similar functionality is provided by the Speech Synthesis API (under development). Third-party libraries are already there, such as ResponsiveVoice.JS.
I am doing a project using WhatsApp cloud API. I need to create a template with a media header. I have created a template with a media header without a sample image and it gets rejected. So I want to create a template with a sample image in Node JS.
Template with a media header
Add sample image for a template
curl -X POST "https://graph.facebook.com/v14.0/{whatsapp-business-account-ID}/message_templates
?name={template-name}
&language=en_US
&category=TRANSACTIONAL,
&components=[{
type:BODY,
text:{message-text}
},
{
type:HEADER,
format:IMAGE,
example:{header_handle:[{uploaded-image-file-url}]}
}],
&access_token={system-user-access-token}"
I want to add a sample image using Node JS (Not manually like the second picture).
header_handle requires an encrypted file handle, which Facebook provides when you upload the file.
This can be done by calling two APIs.
First,
we have to create a session for the file to be uploaded.
To create the session, refer to this
After creating the session, we get a session ID to which the original file is uploaded. The response will look something like this:
{"id":"upload:MTphdHRhY2htZW50Ojlk2mJiZxUwLWV6MDUtNDIwMy05yTA3LWQ4ZDPmZGFkNTM0NT8=?sig=ARZqkGCA_uQMxC8nHKI"}
Second, we have to upload the file to
https://graph.facebook.com/v14.0/{above_id}
This will give a response something similar to
{"h":"2:c2FtcGxlLm1wNA==:image/jpeg:GKAj0gAUCZmJ1voFADip2iIAAAAAbugbAAAA:e:1472075513:ARZ_3ybzrQqEaluMUdI"}
Finally, add the following to the header component in the request that creates the template:
{header_handle:["2:c2FtcGxlLm1wNA==:image/jpeg:GKAj0gAUCZmJ1voFADip2iIAAAAAbugbAAAA:e:1472075513:ARZ_3ybzrQqEaluMUdI"]}
It worked for me.
See this for a better understanding of how to do it.
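To make the last step concrete, here is a rough Python sketch of pulling the handle out of the upload response and placing it in the components list from the question. The function names are mine, and the sketch makes no HTTP calls. It assumes the upload response has the {"h": ...} shape shown above.

```python
import json


def handle_from_upload_response(response_text):
    """Return the file handle from an upload response shaped like {"h": "..."}."""
    return json.loads(response_text)["h"]


def template_components(body_text, header_handle):
    """Build the components list with an IMAGE header that carries a sample."""
    return [
        {"type": "BODY", "text": body_text},
        {
            "type": "HEADER",
            "format": "IMAGE",
            # the handle goes inside a list, as in the example above
            "example": {"header_handle": [header_handle]},
        },
    ]


if __name__ == "__main__":
    handle = handle_from_upload_response('{"h": "2:example-handle"}')
    print(json.dumps(template_components("Hello", handle), indent=2))
```

The same structure carries over to Node JS as a plain object passed to JSON.stringify.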
The answer provided by Aravindh is correct; you can follow this document from Meta to upload the image you want.
Just make sure you use a file type supported by the WhatsApp API (For WhatsApp Business Platform Cloud API, For WhatsApp Business Platform On-Premises API) and by the upload endpoint (file-type: the file's MIME type. Valid values are: image/jpeg, image/jpg, image/png, and video/mp4).
Double-check that you are following exactly the supported types; for example, for a PNG you need to set "file_type" to "image/png", not just "png", when creating the upload session.
I have tested it and it works for me.
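A tiny guard like the following would catch the "png" versus "image/png" mistake before the upload session is created. It is only a sketch: the set of valid values is copied from the list above, the function names are hypothetical, and the file_length parameter name is my assumption about the upload-session call, so check it against Meta's document.

```python
# Valid MIME types as listed above for the upload endpoint
VALID_FILE_TYPES = {"image/jpeg", "image/jpg", "image/png", "video/mp4"}


def check_file_type(file_type):
    """Return file_type unchanged if it is a full, supported MIME type."""
    if file_type not in VALID_FILE_TYPES:
        raise ValueError(
            f"unsupported file_type {file_type!r}; use a full MIME type such as image/png"
        )
    return file_type


def upload_session_params(file_length, file_type):
    """Query parameters for creating the upload session.

    file_length is an assumed parameter name; file_type comes from the text above.
    """
    return {"file_length": file_length, "file_type": check_file_type(file_type)}


if __name__ == "__main__":
    print(upload_session_params(12345, "image/png"))
```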
Hope this helps
I have a google cloud bucket which is having the files with the extension .jtl, I need to get these file names and paths irrespective of the nesting of folders they are in using NodeJS.
How can we do that!
I think this link might help you:
https://cloud.google.com/storage/docs/json_api/v1/objects/list
You can try this API and set whichever of its parameters you need.
For example, in your case you can set the delimiter to jtl, then copy the generated curl, HTTP, or Node JS snippet, whichever you prefer, and run it against your Google Cloud project.
The command will look something like this:
curl \
'https://storage.googleapis.com/storage/v1/b/xyz12345/o?delimiter=jtl&includeTrailingDelimiter=true&key=[YOUR_API_KEY]' \
--header 'Authorization: Bearer [YOUR_ACCESS_TOKEN]' \
--header 'Accept: application/json' \
--compressed
Provide your API key and Access token to be able to run this in your google cloud platform.
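As an alternative to the delimiter trick, one option is to list the bucket without any delimiter, so objects at every nesting depth come back, and filter the names on the client. Below is a rough Python sketch of just the filtering step. It assumes the parsed response has an "items" array whose entries carry a "name", and the helper name is made up. The same few lines translate directly to Node JS.

```python
def names_with_suffix(listing, suffix=".jtl"):
    """Return the full object names in one page of a listing that end with suffix."""
    # "items" is absent when a page has no objects, hence the default
    return [
        item["name"]
        for item in listing.get("items", [])
        if item["name"].endswith(suffix)
    ]


if __name__ == "__main__":
    page = {
        "items": [
            {"name": "run1/results.jtl"},
            {"name": "run1/logs/output.txt"},
            {"name": "deep/nested/folder/test.jtl"},
        ]
    }
    print(names_with_suffix(page))
```

Object names already include the full "folder" path, so no extra work is needed for nesting. For large buckets the response is paginated, so repeat the call with the returned nextPageToken until it is absent.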
I'm having some trouble while trying to send a PDF file to Microsoft's Form Recognizer service.
Instead of sending the PDF's URL, I need to send the PDF file itself. In my experience, files can usually be sent as base64, but it seems that the Microsoft service is not compatible with the base64 format. Whenever I try sending the file, the server responds with:
{
"error": {
"code": "1000",
"message": "Invalid input file."
}
}
I need to know how I should convert my PDF to the required application/pdf "Binary PDF data". I can't find any documentation referring to this conversion.
The Form Recognizer API webpage is: https://brazilsouth.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v2-1-preview-3/operations/AnalyzeWithCustomForm
And here you can find the complete documentation webpage: https://learn.microsoft.com/en-us/azure/cognitive-services/form-recognizer/quickstarts/client-library?tabs=preview%2Cv2-1&pivots=programming-language-rest-api
Thanks!
You are correct that base64-encoded requests are not supported.
If you are using curl and you want to send a local file, run this:
curl -i https://{endpoint}/formrecognizer/v2.1-preview.3/custom/models/{modelId}/analyze -H 'Content-Type: application/pdf' \
-H 'Ocp-Apim-Subscription-Key: {subscription key}' --data-binary @/path/to/your/file.pdf
The key parts are the Content-Type header, which must match a supported value, and the --data-binary flag, which takes @ followed by the path to a local PDF file and sends that file's bytes unmodified. Be sure to include the -i flag so that you can see the Operation-Location header in the response, which holds the URL where you can retrieve the analyze results.
You may also want to take a look at the Form Recognizer SDKs for C#, Java, JavaScript, and Python.
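If you are not using curl, the point to carry over is that the request body is the file's raw bytes, with no base64 step. Here is a minimal Python sketch that only assembles the pieces of the request. The function name is hypothetical and nothing is sent. Pass the result to whichever HTTP client you use.

```python
def analyze_request_parts(endpoint, model_id, subscription_key, pdf_bytes):
    """Return (url, headers, body) for the analyze call; body is raw PDF bytes."""
    if not isinstance(pdf_bytes, (bytes, bytearray)):
        # a str here usually means the file was base64-encoded or read in text mode
        raise TypeError("pdf_bytes must be the raw bytes of the file")
    url = (
        f"https://{endpoint}/formrecognizer/v2.1-preview.3"
        f"/custom/models/{model_id}/analyze"
    )
    headers = {
        "Content-Type": "application/pdf",
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return url, headers, bytes(pdf_bytes)


if __name__ == "__main__":
    url, headers, body = analyze_request_parts(
        "example-endpoint", "example-model", "example-key", b"%PDF-1.4 demo"
    )
    print(url, headers, len(body))
```

In real use, read the file with open(path, "rb").read() so that the bytes reach the service unchanged.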
I'm struggling to connect to the IBM Watson API for Natural Language Understanding.
I've added it to the Resource list in my IAM account. I've got to the page with an example POST request to connect to the API, and I can't seem to authenticate. I've blanked out the API key from this request but in the pages the key is supplied so I'm struggling to see why it's not working
curl -X POST -u "#######" \
-H "Content-Type: application/json" \
-d '{ "text": "I still have a dream. It is a dream deeply rooted in the American dream. I have a dream that one day this nation will rise up and live out the true meaning of its creed: \"We hold these truths to be self-evident, that all men are created equal.\"", "features": { "sentiment": {}, "keywords": {} }}' \
"https://gateway-lon.watsonplatform.net/natural-language-understanding/api/v1/analyze?version=2018-03-19"
I've tried pasting this into Postman but I just get a 401 Unauthorized response, which makes me think it's something in the IAM account pages, but they've changed the interface and not updated the documentation, and I'm going round in circles because the instructions don't match the menus.
Any pointers would be appreciated. I intend to query through Python, so I'm hoping that once I get past the authentication issue it's as simple as copying the Python code out of Postman.
Your -u credentials should be:
-u "apikey:#######"
As per the API documentation -
https://cloud.ibm.com/apidocs/natural-language-understanding#authentication
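Since you plan to query from Python: curl's -u flag is just HTTP Basic authentication, so the equivalent header can be sketched as below. The function name is made up. The literal username "apikey" is the part that matters, and the key itself goes in the password position.

```python
import base64


def watson_basic_auth(api_key):
    """Build the Basic auth header equivalent to curl's -u "apikey:<key>"."""
    credentials = f"apikey:{api_key}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": "Basic " + token}


if __name__ == "__main__":
    print(watson_basic_auth("example-key"))
```

Most HTTP clients can also build this header for you if you give them "apikey" as the username and the key as the password.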
Somehow the API credentials were not being recognised. I must have done something wrong in the initial IAM setup, which meant that when I deleted the credentials, re-created them and then copied the new key ... everything immediately started working. Complete mystery as to why but hopefully this helps someone. Here are the instructions I followed
https://console.bluemix.net/docs/services/natural-language-understanding/getting-started.html#getting-started-tutorial
I used the SDK as suggested by Simon O'Doherty
It might also be related to me having gone into the "Manage" >> "Account" and deleted any Access Groups and Service IDs I'd attempted to create by following the "Getting Started with IAM" instructions from here, which I suspect might have been what confused me
IAM getting started (not required)
Using the sdk, how do I search a file and get information about it? The only option I see is to go through all files and folders which is inconvenient and takes a lot of code.
Is there an easier way? If not with the SDK, a REST option perhaps?
Thanks
I found this from http://developers.box.com/docs/#search
curl "https://api.box.com/2.0/search?query=football&limit=1&offset=0" \
-H "Authorization: Bearer ACCESS_TOKEN"
I can search for any object and it returns a json response with the results.
I still don't know how to do it with the SDK but REST did the same job.
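For anyone calling the REST endpoint from code rather than curl, here is a rough sketch of assembling the same request in Python. The helper name is hypothetical and it only builds the URL and header. Using urlencode takes care of escaping search terms that contain spaces or ampersands.

```python
from urllib.parse import urlencode


def box_search_request(query, access_token, limit=1, offset=0):
    """Return (url, headers) for the search call shown above."""
    params = urlencode({"query": query, "limit": limit, "offset": offset})
    url = "https://api.box.com/2.0/search?" + params
    headers = {"Authorization": "Bearer " + access_token}
    return url, headers


if __name__ == "__main__":
    print(box_search_request("football", "ACCESS_TOKEN"))
```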
The Android SDK provides such an option.
You can do this:
boxClient.getSearchManager().search(searchQuery, null);
This method returns a BoxCollection; every item in the collection should match your search query.