Live transcription using AWS Transcribe - node.js

I'm working on a project that requires live audio to be transcribed in real time. I tried AWS Transcribe over WebSockets using the starter code available on GitHub.
Currently, for testing, I have an audio file from YouTube which I'm streaming to an icecast2 server hosted on a cloud VM.
The ffmpeg command for streaming to the icecast2 server is
ffmpeg -re -i yt.wav -ar 44100 -ac 1 -c:a libvorbis -aq 5 -content_type 'audio/ogg' -vn -f ogg icecast://source:hackme@serverIP:8000/mystream.ogg
I've modified the code from GitHub so that, instead of reading audio data from a microphone, it reads the audio from the icecast2 server. The problem is that it sometimes doesn't return a transcript at all, or returns the wrong transcript.
I'd really appreciate it if anyone could help.
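One thing that may be worth checking, offered as a guess rather than a diagnosis: Amazon Transcribe streaming expects specific encodings such as 16-bit signed little-endian PCM, and the sample rate sent must match the rate declared in the request, so an Ogg Vorbis stream read straight from icecast2 would probably need decoding first. A minimal Python sketch of that decoding step follows. The 16 kHz rate, the 100 ms chunk duration, and the function names are assumptions for illustration, not part of the starter code.

```python
import subprocess
from typing import Iterator, List

SAMPLE_RATE = 16000    # assumed; must match the rate declared to the transcription service
BYTES_PER_SAMPLE = 2   # 16-bit signed PCM
CHUNK_MS = 100         # assumed chunk duration in milliseconds


def decode_command(stream_url: str, sample_rate: int = SAMPLE_RATE) -> List[str]:
    """Build an ffmpeg argv that decodes a stream to raw mono PCM on stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-i", stream_url,          # e.g. the icecast2 mount point URL
        "-vn",                     # ignore any video stream
        "-ac", "1",                # mono
        "-ar", str(sample_rate),   # resample to the declared rate
        "-f", "s16le",             # raw 16-bit little-endian PCM, no container
        "pipe:1",                  # write to stdout instead of a file
    ]


def chunk_size(sample_rate: int = SAMPLE_RATE, chunk_ms: int = CHUNK_MS) -> int:
    """Number of bytes in one chunk of mono 16-bit audio."""
    return sample_rate * BYTES_PER_SAMPLE * chunk_ms // 1000


def pcm_chunks(stream_url: str) -> Iterator[bytes]:
    """Yield fixed-size PCM chunks decoded from the stream until it ends."""
    size = chunk_size()
    proc = subprocess.Popen(decode_command(stream_url), stdout=subprocess.PIPE)
    try:
        while True:
            chunk = proc.stdout.read(size)
            if not chunk:
                break
            yield chunk
    finally:
        proc.kill()
        proc.wait()
```

Each chunk from pcm_chunks would then be wrapped in whatever audio event framing the WebSocket client already uses for microphone data.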

Related

How to combine audio and video in Pytube?

I am trying to write code to download YouTube videos using Pytube on Python 3.6. But for most videos the progressive download format (audio and video in the same file) is available only up to 360p. So I want to download the audio and video files separately and combine them. I am able to download the audio and video files. How can I combine the two files?
As far as I can tell there is no method to merge audio and video in Pytube, but you can use ffmpeg for muxing.
First of all, you have to install ffmpeg:
ffmpeg installation guide for Windows
For Ubuntu, just run sudo apt install ffmpeg
Then add the ffmpeg-python dependency, a Python wrapper for ffmpeg:
pip install ffmpeg-python
Now we are ready to go with this code snippet
import ffmpeg
video_stream = ffmpeg.input('Of Monsters and Men - Wild Roses.mp4')
audio_stream = ffmpeg.input('Of Monsters and Men - Wild Roses_audio.mp4')
ffmpeg.output(audio_stream, video_stream, 'out.mp4').run()
For more, see the ffmpeg-python API reference.
If you keep getting a video without audio, that's because of the adaptive streaming from pytube. A workaround is to download both video and audio, then merge them with ffmpeg.
For instance, something like this to get both audio and video (audio part adapted from here)
from pytube import YouTube
import os
youtube = YouTube('https://youtu.be/ksu-zTG9HHg')
video = youtube.streams.filter(res="1080p").first().download()
os.rename(video,"video_1080.mp4")
audio = youtube.streams.filter(only_audio=True)
audio[0].download()
Then comes the ffmpeg part (adapted from both here and here). You can set ffmpeg up on Windows following this procedure and then run something like this, substituting your own file names:
ffmpeg -i video.mp4 -i audio.mp4 -c:v copy -c:a aac output.mp4
Merging audio and video using ffmpeg
Once you have downloaded both video and audio files (‘videoplayback.mp4’ and ‘videoplayback.m4a’ respectively), here’s how you can merge them into a single file:
In case of MP4 format (all, except 1440p 60fps & 2160p 60fps):
ffmpeg -i videoplayback.mp4 -i videoplayback.m4a -c:v copy -c:a copy output.mp4
In case of WebM format (1440p 60fps and 2160p 60fps):
ffmpeg -i videoplayback.webm -i videoplayback.m4a -c:v copy -c:a copy output.mkv
Wait until ffmpeg finishes merging audio and video into a single file named "output.mp4" (or "output.mkv" in the WebM case).
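The two cases above can be sketched as one small Python helper that picks the output container from the video file's extension. This is only an illustration: the function name and the default output name are hypothetical, and it simply mirrors the two commands shown (MP4 video goes to .mp4, anything else such as WebM goes to .mkv).

```python
import shlex
from pathlib import Path
from typing import List


def merge_command(video: str, audio: str, stem: str = "output") -> List[str]:
    """Build an ffmpeg argv that muxes video and audio without re-encoding."""
    # MP4 video keeps the MP4 container; WebM video is muxed into Matroska.
    ext = ".mp4" if Path(video).suffix.lower() == ".mp4" else ".mkv"
    return [
        "ffmpeg",
        "-i", video,
        "-i", audio,
        "-c:v", "copy",   # copy the video stream as-is
        "-c:a", "copy",   # copy the audio stream as-is
        stem + ext,
    ]


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run() to execute it.
    print(shlex.join(merge_command("videoplayback.webm", "videoplayback.m4a")))
```

Because both streams are copied rather than re-encoded, the merge should take only as long as reading and writing the files.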
How do I convert the downloaded audio file to mp3?
You need to execute the following command in the Command Prompt window:
ffmpeg -i INPUT_FILE -ab BITRATE -vn OUTPUT_FILE
Example:
ffmpeg -i videoplayback.m4a -ab 128000 -vn music.mp3
Example 2 (without bit rate):
ffmpeg -i videoplayback.m4a -vn music.mp3

Recording audio with ffmpeg without output file

I'm trying to record audio from one of my audio interfaces by using ffmpeg.
I want to do this in order to send this audio afterward via websocket with NodeJS.
I'm able to record the audio with ffmpeg and save it to an audio file, but I do not want to save it. I just want the audio stream, so that I can run the ffmpeg command from NodeJS and send that stream over the websocket.
This is my current ffmpeg command:
ffmpeg -f alsa -i hw:0,0 -af "pan=2c|c0=c0" output.wav
Is there any way of using it without the output file?
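One common approach, sketched here in Python under a few assumptions, is to replace the output file name with an explicit format plus pipe:1 so ffmpeg writes to stdout. The sketch uses raw s16le PCM rather than WAV because a WAV header written to a pipe cannot be updated with the final length; that choice, the chunk size, and the send callback (standing in for a websocket send function) are all assumptions for illustration.

```python
import subprocess
from typing import Callable, List


def capture_command(device: str = "hw:0,0",
                    audio_filter: str = "pan=2c|c0=c0") -> List[str]:
    """Build an ffmpeg argv that records from ALSA and writes PCM to stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-f", "alsa", "-i", device,   # same input as the original command
        "-af", audio_filter,          # same filter as the original command
        "-f", "s16le",                # explicit format, since there is no file extension
        "pipe:1",                     # stdout instead of output.wav
    ]


def forward_audio(send: Callable[[bytes], None], chunk_bytes: int = 4096) -> None:
    """Read captured audio from ffmpeg's stdout and hand each chunk to send()."""
    proc = subprocess.Popen(capture_command(), stdout=subprocess.PIPE)
    try:
        while True:
            chunk = proc.stdout.read(chunk_bytes)
            if not chunk:
                break
            send(chunk)
    finally:
        proc.kill()
        proc.wait()
```

The same idea should carry over to NodeJS: spawn the command as a child process and read from its stdout stream.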

No data written to stdin or stderr from ffmpeg

I have a dummy client that is supposed to simulate a video recorder. On this client I want to simulate a video stream; I have gotten far enough that I can create a video from bitmap images that I create in code.
The dummy client is a nodejs application running on a Raspberry Pi 3 with the latest version of Raspbian Lite.
In order to use the video I have created, I need to get ffmpeg to dump the video to pipe:1. The problem is that I need -f rawvideo as an input parameter, otherwise ffmpeg can't understand my video, but when I have that parameter set ffmpeg refuses to write anything to stdout.
ffmpeg is running with these parameters
ffmpeg -r 15 -f rawvideo -s 3840x2160 -pixel_format rgba -i pipe:0 -r 15 -vcodec h264 pipe:1
Can anybody help with a solution to my problem?
--Edit
Maybe I should explain a bit more.
The system I am creating is to be set up so that, instead of my stream server asking the video recorder for a video stream, the recorder tells the server that there is a stream.
I have solved my problem on my own. (-:
I now have 2 solutions.
The first is to change my -f rawvideo to -f data; that works for me anyway.
The second is to encode my bitmaps as JPEG in code and pipe the JPEG images to stdin. This also requires me to change the ffmpeg parameters to -r 4 -f mjpeg -i pipe:0 -r 4 -vcodec copy -f mjpeg pipe:1, and it is by far the slowest thing I have ever done. I also can't use a 4K input.
Thanks @Mulvya for trying to help.
@eFox Thanks for editing my stupid spelling and grammar mistakes
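For anyone hitting the same thing, two points may be worth checking; this is a sketch under assumptions, not a confirmed fix for the setup above. First, ffmpeg normally guesses the output format from the file extension, and pipe:1 has none, so an explicit output -f is usually needed. Second, a 3840x2160 RGBA frame is large, so the rawvideo input must be fed exactly one full frame per write cycle. The Python below shows the arithmetic and one possible command. It assumes libx264 is available in the ffmpeg build, and the function names are hypothetical.

```python
from typing import List


def frame_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one raw frame; RGBA uses 4 bytes per pixel."""
    return width * height * bytes_per_pixel


def encode_command(width: int, height: int, fps: int) -> List[str]:
    """Build an ffmpeg argv: raw RGBA frames on stdin, raw H.264 on stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-f", "rawvideo",                    # input has no header, so describe it fully
        "-pixel_format", "rgba",
        "-video_size", f"{width}x{height}",
        "-framerate", str(fps),
        "-i", "pipe:0",
        "-c:v", "libx264",                   # assumed encoder; depends on the build
        "-f", "h264",                        # explicit output format for the pipe
        "pipe:1",
    ]


if __name__ == "__main__":
    size = frame_bytes(3840, 2160)
    print(f"{size} bytes per frame, {size * 15} bytes per second at 15 fps")
```

At 15 fps that is roughly 498 MB per second through stdin, which may be more than a Raspberry Pi 3 can sustain regardless of the ffmpeg options.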

FFmpeg IIS Smooth Streaming credentials

I use ffmpeg to stream live video to IIS publishing point by this command
ffmpeg -f dshow -i video="video device name":audio="audio device name" -pix_fmt yuv420p -f ismv http://localhost/test.isml
And it works properly. But when my publishing point requires credentials, ffmpeg fails about one minute after starting with
av_interleaved_write_frame(): Unknown error
As I see I need to specify IIS publishing point credentials. But how? Any ideas?

HTTP Live Streaming : The Linux nightmare

I'm working on a music VOD app on iPhone, and thanks to Apple guidelines, I have to run a HTTP Live Streaming in order to be accepted on the AppStore. But, since Apple doesn't care about 98% of servers on earth, they don't provide their so magical HTTP Live Streaming Tools for Linux-based systems. And from this point, the nightmare starts.
My goal is simple: take an MP3, segment it, and generate a simple .m3u8 index file.
I googled "HTTP Live Streaming Linux" and "Oh great ! lots of people have already done that"!
First, I visited the (so famous) post by Carson McDonald.
Result: the svn segmentate.c was old, buggy and a nightmare to compile (nobody in this world can say exactly which version of ffmpeg they are using!).
Then I came across Carson's git repo, but too bad, there is a lot of annoying ruby stuff and live_segmenter.c can't take mp3 files.
Then I searched more deeply. I found this stackoverflow topic, and it's exactly what I want to do. So I followed the advice from juuni to use this script (httpsegmenter). Result: impossible to compile anything; after 2 days of work I finally managed to compile it (ffmpeg 0.8.1 w/ httpsegmenter rev17). And no, this is not a good script: it does take mp3 files, but the ts files generated and the index file can't be read by a player.
Then the author of the post, krisbulman, came up with a solution, and even provided his own patched version of m3u8-segmenter (git repo). I tested it: it doesn't compile and does nothing. So I took the original version from johnf https://github.com/johnf/m3u8-segmenter. I managed to compile it and, miracle, it works (not really).
I used this command line (ffmpeg 0.8.1):
ffmpeg -er 4 -i music.mp3 -f mpegts -acodec libmp3lame -ar 44100 -ab 128k -vn - | m3u8-segmenter -i - -d 10 -p outputdir/prefix -m outputdir/output.m3u8 -u http://test.com/
This script encodes my mp3 file (it takes 4 seconds, too long) and passes it to m3u8-segmenter to segment it into 10-second .ts files.
I tested this stream with Apple's mediastreamvalidator on my Mac, and it said that it was OK. So I played it in QuickTime, but there is a gap of about 0.2 seconds between each .ts file!!
So here is my situation, it's a nightmare: I can't get a simple mp3 stream over the HLS protocol. Is there a simple WORKING solution to segment an mp3? Why can't I directly segment the mp3 file into multiple mp3 files like Apple's mediafilesegmenter does?
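For reference, the index file itself is plain text and easy to generate once the segments exist. The Python sketch below builds a minimal on-demand playlist from a list of segment names and durations. It is an illustration only: the function name and the segment file names are hypothetical, and it does not do the segmenting, which is the hard part described above.

```python
import math
from typing import Iterable, Tuple


def build_playlist(segments: Iterable[Tuple[str, float]], base_url: str = "") -> str:
    """Return the text of a minimal on-demand .m3u8 playlist."""
    segs = list(segments)
    # The target duration must be at least as long as the longest segment.
    target = max((math.ceil(duration) for _, duration in segs), default=0)
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",                  # version 3 allows fractional durations
        f"#EXT-X-TARGETDURATION:{target}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for name, duration in segs:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(base_url + name)
    lines.append("#EXT-X-ENDLIST")           # marks the playlist as complete
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical segment names, mirroring the 10-second split used above.
    print(build_playlist([("prefix-1.ts", 10.0), ("prefix-2.ts", 7.5)],
                         base_url="http://test.com/"))
```

The durations passed in should be the real segment lengths reported by the segmenter, since players use them for seeking.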
Use libfaac instead of libmp3lame, which eliminates the 0.2 second break.
Elastic Transcoder Service - if you don't need AES encryption just throw your MP3 in an S3 bucket and be done with it:
http://aws.amazon.com/elastictranscoder/
You can then even add Cloudfront CDN support. (P.S. I fully appreciate your pain, this whole space is a nightmare).
For live streaming only, you should try Nginx with RTMP module for this one. https://github.com/arut/nginx-rtmp-module
Live HLS works pretty well, but with a looooong buffer.
However, it does not support on-demand HLS streaming.
Here is a piece of the module's config as an example:
# HLS requires libavformat & should be configured as a separate
# NGINX module in addition to nginx-rtmp-module:
# ./configure ... --add-module=/path/to/nginx-rtmp-module/hls ...
# For HLS to work please create a directory in tmpfs (/tmp/app here)
# for the fragments. The directory contents is served via HTTP (see
# http{} section in config)
#
# Incoming stream must be in H264/AAC/MP3. For iPhones use baseline H264
# profile (see ffmpeg example).
# This example creates RTMP stream from movie ready for HLS:
#
# ffmpeg -loglevel verbose -re -i movie.avi -vcodec libx264
# -vprofile baseline -acodec libmp3lame -ar 44100 -ac 1
# -f flv rtmp://localhost:1935/hls/movie
#
# If you need to transcode live stream use 'exec' feature.
#
application hls {
    live on;
    hls on;
    hls_path /tmp/app;
    hls_fragment 5s;
}
What problems were you having with httpsegmenter? It's a single C source file that only links against some libraries provided by ffmpeg (or libav). I maintain a Gentoo ebuild for it, as I use it to time-shift talk radio. If you're running Gentoo, building is as simple as this:
sudo bash -l
layman -S
layman -a salfter
echo media-video/httpsegmenter ~\* >>/etc/portage/package.accept_keywords
emerge httpsegmenter
exit
On Ubuntu, I had to make sure libavutil-dev and libavformat-dev were both installed, so the build looks something like this:
sudo apt-get install libavutil-dev libavformat-dev
git clone https://gitlab.com/salfter/httpsegmenter.git
cd httpsegmenter
make -f Makefile.txt
sudo make -f Makefile.txt install
Once it's built (and once I have an audio source URL), usage is fairly simple: curl to stream the audio, ffmpeg to transcode it from whatever it is at the source (often MP3) to AAC, and segmenter to chunk it up:
curl -m 3600 http://invalid.tld/stream | \
ffmpeg -i - -acodec libvo_aacenc -ac 1 -ab 32k -f mpegts - 2>/dev/null | \
segmenter -i - -d 20 -o ExampleStream -x ExampleStream.m3u8 2>/dev/null
This grabs one hour of streaming audio (needs to be MP3 or AAC, not Flash), transcodes it to 32 kbps mono AAC, and chunks it up for HTTP live streaming. Have it dump into a directory served up by your webserver and you're good to go.
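The same three-stage pipeline can be driven from a script. The Python sketch below chains the processes with subprocess, using the flags from the commands above. The function names are hypothetical, and the codec is a parameter because newer ffmpeg builds may not include libvo_aacenc and may need the native aac encoder instead.

```python
import subprocess
from typing import List


def pipeline_commands(url: str, name: str, seconds: int = 3600,
                      segment_seconds: int = 20,
                      codec: str = "libvo_aacenc") -> List[List[str]]:
    """Build the argv lists for the curl | ffmpeg | segmenter pipeline."""
    return [
        ["curl", "-m", str(seconds), url],
        ["ffmpeg", "-i", "-", "-acodec", codec, "-ac", "1", "-ab", "32k",
         "-f", "mpegts", "-"],
        ["segmenter", "-i", "-", "-d", str(segment_seconds),
         "-o", name, "-x", name + ".m3u8"],
    ]


def run_pipeline(commands: List[List[str]]) -> List[int]:
    """Run the commands as a pipe chain and return their exit codes."""
    procs = []
    upstream = None
    for index, argv in enumerate(commands):
        is_last = index == len(commands) - 1
        proc = subprocess.Popen(
            argv,
            stdin=upstream,
            stdout=None if is_last else subprocess.PIPE,
            stderr=subprocess.DEVNULL,   # silence progress output from every stage
        )
        if upstream is not None:
            upstream.close()             # let the earlier stage see a closed pipe
        upstream = proc.stdout
        procs.append(proc)
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    print(run_pipeline(pipeline_commands("http://invalid.tld/stream", "ExampleStream")))
```

Run it from the directory your web server serves, so the segments and the playlist land where clients can fetch them.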
Once the show's done, converting to a single .m4a that can be served up as a podcast is also simple:
cat `ls -rt ExampleStream-*.ts` | \
ffmpeg -i - -acodec copy -absf aac_adtstoasc ExampleStream.m4a 2>/dev/null
I know this is an old question, but I am using this in VLC:
## To start playing the playlist out to the encoder
cvlc -vvv playlist.m3u --sout rtp:127.0.0.1 --ttl 2
## To start the encoder
cvlc rtp:// --sout='#transcode{acodec=mp3,ab=96}:duplicate{dst=std{access=livehttp{seglen=10,splitanywhere=true,delsegs=true,numsegs=15,index=/var/www/vlctest/mystream.m3u8,index-url=http://IPANDPORT/vlctest/mystream-########.ts},mux=ts,dst=/var/www/vlctest/mystream-########.ts},select=audio}'
I had problems if I didn't stream the playlist file to another copy of VLC. The first step is optional if you already have a live streaming source (you can use any source for the "encoder" portion).
You could try to use our media services on Windows Azure platform: http://mingfeiy.com/how-to-generate-http-live-streaming-hls-content-using-windows-azure-media-services/
You could encode and stream your video in HLS format by using our portal with no configuration and coding required.
Your English is fine.
Your frustration is apparent.
Q: What's the real issue here? It sounds like you just need a working HLS server, correct? Because of Apple requirements, correct?
Can you use any of the ready-made implementations listed here:
http://en.wikipedia.org/wiki/HTTP_Live_Streaming

Resources