I'm trying to play a live stream (an .m3u8 file) that uses embedded text tracks (I mean the text track is muxed with the video segments in the same .ts files). It works well in a video element, since I can access the text tracks with videoElement.textTracks. However, when I cast it to Chromecast, the text tracks do not show up.
Here is the code snippet:
Sender:
const englishSubtitle = new chrome.cast.media.Track('1', // track ID
window.chrome.cast.media.TrackType.TEXT);
englishSubtitle.subtype = chrome.cast.media.TextTrackType.CAPTIONS;
englishSubtitle.name = 'English';
englishSubtitle.language = 'en-US';
englishSubtitle.customData = null;
englishSubtitle.trackContentId = '1';
englishSubtitle.trackContentType = 'text/vtt';
mediaInfo.tracks = [englishSubtitle];
const loadRequest = new chrome.cast.media.LoadRequest(mediaInfo);
Receiver:
const textTracksManager = this.playerManager.getTextTracksManager();
const tracks = textTracksManager.getTracks();
textTracksManager.setActiveByIds([tracks[0].trackId]);
BTW: this configuration works well when casting VOD content, but for VOD we use out-of-band text tracks (I mean a separate .vtt file URL).
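For reference, here is a rough sketch of what that working out-of-band (VOD) configuration looks like on the sender, with a hypothetical subtitle URL; the in-band sender code above differs only in that trackContentId is a plain track ID rather than a URL:
// Sketch: out-of-band text track, where trackContentId points at a separate .vtt file (hypothetical URL)
const vodSubtitle = new chrome.cast.media.Track(1, chrome.cast.media.TrackType.TEXT);
vodSubtitle.subtype = chrome.cast.media.TextTrackType.SUBTITLES;
vodSubtitle.name = 'English';
vodSubtitle.language = 'en-US';
vodSubtitle.trackContentId = 'https://example.com/subtitles/en.vtt'; // hypothetical URL
vodSubtitle.trackContentType = 'text/vtt';
mediaInfo.tracks = [vodSubtitle];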
How to append blob to input of type file?
<!-- Input of type file -->
<input type="file" name="uploadedFile" id="uploadedFile" accept="image/*"><br>
// I am getting image from webcam and converting it to a blob
function takepicture() {
canvas.width = width;
canvas.height = height;
canvas.getContext('2d').drawImage(video, 0, 1, width, height);
var data = canvas.toDataURL('image/png');
var dataURL = canvas.toDataURL();
var blob = dataURItoBlob(dataURL);
photo.setAttribute('src', data);
}
function dataURItoBlob(dataURI) {
var binary = atob(dataURI.split(',')[1]);
var array = [];
for (var i = 0; i < binary.length; i++) {
array.push(binary.charCodeAt(i));
}
return new Blob([new Uint8Array(array)], {type: 'image/jpeg'});
}
// How can I append this blob to "uploadedFile"? I want to add it on form submit
It is possible to set the value of an <input type="file">.
To do this you create a File object from the blob and a new DataTransfer object:
let file = new File([data], "img.jpg",{type:"image/jpeg", lastModified:new Date().getTime()});
let container = new DataTransfer();
Then you add the file to the container, thus populating its files property, which can then be assigned to the files property of the file input:
container.items.add(file);
fileInputElement.files = container.files;
Here is a fiddle with output, showing that the file is correctly placed into the input.
The file is also passed correctly to the server on form submit. This works at least on Chrome 88.
If you need to pass multiple files to the input, you can simply call container.items.add multiple times. So you can add files to the input by keeping track of them separately and overwriting its files property, as long as the input contains only generated files (i.e. none selected by the user). This can be useful for image preprocessing, generating complex files from several simple ones (e.g. a PDF from several images), etc.
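For example, here is a minimal sketch of filling an input with two generated files at once (blob1 and blob2 are hypothetical blobs; fileInputElement is the input from above):
// Sketch: populate a file input with several generated files
const multiContainer = new DataTransfer();
multiContainer.items.add(new File([blob1], "page1.jpg", { type: "image/jpeg" }));
multiContainer.items.add(new File([blob2], "page2.jpg", { type: "image/jpeg" }));
fileInputElement.files = multiContainer.files; // replaces whatever the input held before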
API references:
File object
DataTransfer object
I had a similar problem with a fairly complex form in an Angular app, so instead of submitting the form I just sent the blob individually using XMLHttpRequest(). This particular "blob" was created in a Web Audio API context, creating an audio track in the user's browser.
var xhr = new XMLHttpRequest();
xhr.open('POST', 'someURLForTheUpload', true); //my url had the ID of the item that the blob corresponded to
xhr.responseType = 'blob';
xhr.setRequestHeader("x-csrf-token",csrf); //if you are doing CSRF stuff
xhr.onload = function(e) { /*irrelevant code*/ };
xhr.send(blob);
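For comparison, roughly the same upload can be written with fetch; this is only a sketch and reuses the same placeholder URL and CSRF token from the snippet above:
// Sketch: send the blob as the raw request body using fetch
fetch('someURLForTheUpload', {
  method: 'POST',
  headers: { 'x-csrf-token': csrf }, // only if you are doing CSRF stuff
  body: blob
})
  .then(function (response) { /* handle the server response */ })
  .catch(function (err) { console.error('Upload failed', err); });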
You can't change the file input directly, but you can use a hidden input to pass the data instead, e.g.:
var hidden_elem = document.getElementById("hidden");
var reader = new FileReader();
reader.onload = function () { hidden_elem.value = reader.result; }; // store the blob as a base64 data URL
reader.readAsDataURL(blob);
It took me too much time to find out how to do "url, blob to input -> preview",
so I made an example you can check here:
https://stackoverflow.com/a/70485949/6443916
https://vulieumang.github.io/vuhocjs/file2input-input2file/
I'm using Azure Cognitive Services for Text to Speech in a web app.
I return the bytes to the browser and it works great; however, on the server (or my local machine) the speechSynthesizer.SpeakTextAsync(inp) line also plays the audio through the speakers.
Is there a way to turn this off, since this runs on a web server (and even if I ignore it, there's the delay while it plays the audio before sending back the data)?
Here's my code ...
var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
speechConfig.SpeechSynthesisVoiceName = "fa-IR-FaridNeural";
speechConfig.OutputFormat = OutputFormat.Detailed;
using (var speechSynthesizer = new SpeechSynthesizer(speechConfig))
{
// todo - how to disable it saying it here?
var speechSynthesisResult = await speechSynthesizer.SpeakTextAsync(inp);
return Convert.ToBase64String(speechSynthesisResult.AudioData);
}
What you can do is add an AudioConfig to the SpeechSynthesizer.
In this AudioConfig object you can specify the path of a .wav file on the server.
Whenever you run SpeakTextAsync, the audio is redirected to that .wav file instead of the speaker.
You can then read this audio file and apply your logic later.
Just add the following code before creating the SpeechSynthesizer object:
var audioconfig = AudioConfig.FromWavFileOutput(filepath);
Here filepath is the location of the .wav file, as a string.
Complete code:
string filepath = "<file path>";
var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
var audioconfig = AudioConfig.FromWavFileOutput(filepath);
speechConfig.SpeechSynthesisVoiceName = "fa-IR-FaridNeural";
speechConfig.OutputFormat = OutputFormat.Detailed;
using (var speechSynthesizer = new SpeechSynthesizer(speechConfig, audioconfig))
{
// the AudioConfig above redirects the audio to the .wav file instead of the speakers
var speechSynthesisResult = await speechSynthesizer.SpeakTextAsync(inp);
return Convert.ToBase64String(speechSynthesisResult.AudioData);
}
Azure TTS standard voice audio files are generated normally. However, for a neural voice, the audio file is generated abnormally, with a size of 1 byte. The code is below.
C# code
public static async Task SynthesizeAudioAsync()
{
var config = SpeechConfig.FromSubscription("xxxxxxxxxKey", "xxxxxxxRegion");
using var synthesizer = new SpeechSynthesizer(config, null);
var ssml = File.ReadAllText("C:/ssml.xml");
var result = await synthesizer.SpeakSsmlAsync(ssml);
using var stream = AudioDataStream.FromResult(result);
await stream.SaveToWaveFileAsync("C:/file.wav");
}
ssml.xml - The file below, which uses a standard voice, works fine.
<speak version="1.0" xmlns="https://www.w3.org/2001/10/synthesis" xml:lang="en-US">
<voice name="en-GB-George-Apollo">
When you're on the motorway, it's a good idea to use a sat-nav.
</voice>
</speak>
ssml.xml - However, the following file, which uses a neural voice, does not work, and an empty audio file is created.
<speak version="1.0" xmlns="https://www.w3.org/2001/10/synthesis" xml:lang="en-US">
<voice name="en-US-AriaNeural">
When you're on the motorway, it's a good idea to use a sat-nav.
</voice>
</speak>
Looking at the behavior you have described, the Speech service has returned no audio bytes due to some issue.
I have checked the SSML file at my end and it works completely fine, i.e. there are no issues with the SSML itself.
As a next step, I would recommend adding error-handling code to get a better picture of the error so you can take action accordingly:
var config = SpeechConfig.FromSubscription("xxxxxxxxxKey", "xxxxxxxRegion");
using var synthesizer = new SpeechSynthesizer(config, null);
var ssml = File.ReadAllText("C:/ssml.xml");
var result = await synthesizer.SpeakSsmlAsync(ssml);
if (result.Reason == ResultReason.SynthesizingAudioCompleted)
{
    Console.WriteLine("No error");
    using var stream = AudioDataStream.FromResult(result);
    await stream.SaveToWaveFileAsync("C:/file.wav");
}
else if (result.Reason == ResultReason.Canceled)
{
    var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
    Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");
    if (cancellation.Reason == CancellationReason.Error)
    {
        Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
        Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
    }
}
The above modification will print a friendly error message in the console app.
Note: if you are not using a console app, you will have to modify the code.
Sample output:
This is just a sample output; the error you see might be different.
I'm trying to send an audio file to the Dialogflow API for intent detection. I already have an agent that works quite well, but only with text. I'm trying to add the audio feature, but with no luck.
I'm using the example (Java) provided on this page:
https://cloud.google.com/dialogflow-enterprise/docs/detect-intent-audio#detect-intent-text-java
This is my code:
public DetectIntentResponse detectIntentAudio(String projectId, byte [] bytes, String sessionId,
String languageCode)
throws Exception {
// Set the session name using the sessionId (UUID) and projectID (my-project-id)
SessionName session = SessionName.of(projectId, sessionId);
System.out.println("Session Path: " + session.toString());
// Note: hard coding audioEncoding and sampleRateHertz for simplicity.
// Audio encoding of the audio content sent in the query request.
AudioEncoding audioEncoding = AudioEncoding.AUDIO_ENCODING_LINEAR_16;
int sampleRateHertz = 16000;
// Instructs the speech recognizer how to process the audio content.
InputAudioConfig inputAudioConfig = InputAudioConfig.newBuilder()
.setAudioEncoding(audioEncoding) // audioEncoding = AudioEncoding.AUDIO_ENCODING_LINEAR_16
.setLanguageCode(languageCode) // languageCode = "en-US"
.setSampleRateHertz(sampleRateHertz) // sampleRateHertz = 16000
.build();
// Build the query with the InputAudioConfig
QueryInput queryInput = QueryInput.newBuilder().setAudioConfig(inputAudioConfig).build();
// Read the bytes from the audio file
byte[] inputAudio = Files.readAllBytes(Paths.get("/home/rmg/Audio/book_a_room.wav"));
byte[] encodedAudio = Base64.encodeBase64(inputAudio);
// Build the DetectIntentRequest
DetectIntentRequest request = DetectIntentRequest.newBuilder()
.setSession("projects/"+projectId+"/agent/sessions/" + sessionId)
.setQueryInput(queryInput)
.setInputAudio(ByteString.copyFrom(encodedAudio))
.build();
// Performs the detect intent request
DetectIntentResponse response = sessionsClient.detectIntent(request);
// Display the query result
QueryResult queryResult = response.getQueryResult();
System.out.println("====================");
System.out.format("Query Text: '%s'\n", queryResult.getQueryText());
System.out.format("Detected Intent: %s (confidence: %f)\n",
queryResult.getIntent().getDisplayName(), queryResult.getIntentDetectionConfidence());
System.out.format("Fulfillment Text: '%s'\n", queryResult.getFulfillmentText());
return response;
}
I have tried several formats, WAV (16-bit PCM at several sample rates) and FLAC, and also converting the bytes to base64 in two different ways as described here (by code or via the console):
https://dialogflow.com/docs/reference/text-to-speech
I have even tested with the .wav provided in this example, creating a new intent in my agent called "book a room" with that training phrase. It works with both text and audio from the Dialogflow console, but from my code it only works with text, not audio... and I'm sending the same wav they provide! (code above)
I always receive the same response (QueryResult):
I need a clue or something; I'm totally stuck here. No logs, no errors in the response... but it does not work.
Thanks
I wrote to Dialogflow support and they replied with a working piece of code. It is basically the same as the code posted above; the only difference is the base64 encoding, which is not necessary at all.
So I removed:
byte[] encodedAudio = Base64.encodeBase64(inputAudio);
(and used inputAudio directly, i.e. .setInputAudio(ByteString.copyFrom(inputAudio)))
Now it is working as expected.
I need to update the orientation tag (EXIF data) for an uploaded image. I am using "PIEXIF" for this. I am not using Express but Swagger. The code I've written is:
//Get the uploaded buffer
var _originalBuffer = req.swagger.params.uploadedFile.value.buffer;
let Duplex = require('stream').Duplex;
//Create stream from buffer. This stream is required later to send to cloud.
let _uploadedFileStream = new Duplex();
_uploadedFileStream.push(_originalBuffer);
_uploadedFileStream.push(null);
//Create base 64 string so that "PIEXIF" can read exif data from it.
const jpegData = "data:image/jpeg;base64, " + createStringFromBuffer(_originalBuffer, 'base64');
//Read exif data.
var _exifData = piexif.load(jpegData);
//Create a copy of exif data. Will be used to create a new image with updated orientation tag.
var _exifDataCopied = {};
for (var key in _exifData) {
_exifDataCopied[key] = _exifData[key];
}
//Update orientation tag.
if (_exifDataCopied["0th"][piexif.ImageIFD.Orientation])
_exifDataCopied["0th"][piexif.ImageIFD.Orientation] = 1;
//Example taken from https://www.npmjs.com/package/piexifjs
//From here onwards, there seems to be an issue.
var exifbytes = piexif.dump(_exifDataCopied);
var newData = piexif.insert(exifbytes, createStringFromBuffer(_originalBuffer, 'binary'));
var newJpeg = new Buffer(newData);
//Create a new stream and save it as image back.
let _updatedFileStream = new Duplex();
_updatedFileStream.push(newJpeg);
_updatedFileStream.push(null);
var fs = require('fs');
var writeStream = fs.createWriteStream("./uploads/" + "Whatever.jpg")
The issue is that the code throws no error. The image is also getting saved in the directory, but it is corrupted and I cannot preview it. Since the code does not break anywhere, I am confused about what the issue could be. The function that converts a buffer to a string with a given encoding (since I need it a lot) is:
var createStringFromBuffer = function(_buffer, _encoding) {
return Buffer.from(_buffer).toString(_encoding);
}
Can someone point out where I am going wrong? I am using the example given here.
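For what it's worth, fs.createWriteStream creates (and truncates) the target file as soon as the stream opens, so a stream that is never written to still leaves an empty, unreadable file on disk. A minimal sketch of flushing the buffer, reusing the variable names from the snippet above (this is an assumption about the missing step, not a confirmed fix):
// Sketch: write the updated JPEG buffer to the stream and close it
writeStream.write(newJpeg);
writeStream.end();
writeStream.on('finish', function () {
  console.log('Updated image written to ./uploads/Whatever.jpg');
});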