Read audio channel data from a video file in Node.js

I want to read audio frequency data from a video (MP4) file and get it as an array (not as an MP3 file). That's it, nothing fancy.
Currently I'm doing it with the Web Audio API in JavaScript, but I need to do it in Node.js,
and I want it to be as fast as possible.
I don't care about the video frame data or anything else.
I'm trying with ffmpeg, but it seems very hard.
Is there another way, maybe with fs?
Thank you in advance.

Related

Is there any visualization tool for .flac audio file or .ts audio file?

I am pretty new to processing audio files.
I want to build a web app that takes an audio file and turns it into a visualization for the user, like this: https://github.com/CrowdCurio/audio-annotator
Right now I'm researching how to visualize audio data. The original data stored in S3 comes in two forms, .ts and .flac. That's why I want to ask if there's any visualization tool that can use .ts or .flac audio files directly.
The only solution I can think of is to first convert them to .wav or .mp3 so that most visualization tools can process them, but as far as I know .wav files waste a lot of storage.
So if you know any approach or tool to do this, please let me know!
Audio visualization requires audio data, and your compressed audio isn't usable until it is decoded. Therefore, you must decode the files to PCM before visualizing them.
This doesn't require storing the files as WAV, but you'll at least have to decode them on the fly.
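The decode-then-visualize step above can be sketched in two parts: decode to PCM (for example by piping through ffmpeg, as elsewhere on this page), then reduce the samples to per-bucket min/max peaks, which is the data a waveform view actually draws. The function name `waveformPeaks` is a hypothetical illustration, not a library API.

```javascript
// Reduce decoded PCM samples (Float32Array in [-1, 1]) to `buckets` pairs of
// [min, max] peaks; each pair becomes one vertical bar of the waveform image.
function waveformPeaks(samples, buckets) {
  const size = Math.ceil(samples.length / buckets);
  const peaks = [];
  for (let b = 0; b < buckets; b++) {
    let min = 0, max = 0;
    const end = Math.min((b + 1) * size, samples.length);
    for (let i = b * size; i < end; i++) {
      if (samples[i] < min) min = samples[i];
      if (samples[i] > max) max = samples[i];
    }
    peaks.push([min, max]);
  }
  return peaks;
}
```

Because the peaks array is small, you can compute it once per file at upload time and store just the peaks, so you never need to keep a bulky .wav around.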

Mixing OPUS Audio on a server for a meeting

I am currently trying to challenge myself and write a little meeting tool in order to learn stuff. I am currently stuck at the audio part. My audio stack is working: each user sends an OPUS stream (20ms packets) to the server. I am now thinking about how I will handle the audio. The outcome shall be: all users receive all audio, but not their own (so nobody hears themselves). I want the meeting to support as many concurrent users as possible.
I have the following ideas, but none feels quite right:
send all audio streams to all users, which would mean more traffic; mixing would be done on the client side
mix the audio on a per-user basis, which means for n users I would need to produce n encodings per frame
mix the whole audio together for all users, send it to all users, and have each user also receive a second OPUS package containing their "own" packet as sent to the server (or number the packets and keep them on the client side so they don't need retransmission). I don't know whether, after decoding, I can subtract the "own" audio from the stream without getting unclean audio.
How is this normally done? Are there options I am missing? I have profiled all the steps involved; the most expensive part is encoding (one 20ms encoding takes about 600ns), while decoding and mixing take nearly no time at all (5-10ns per step).
Currently I would prefer option 3, but I cannot find information on whether the audio will be clean or whether it will come out washy or crackling.
The whole thing is written in C++, but I did not include the tag since I don't need code examples, just information on this topic. I have googled a lot and read a lot of the OPUS documentation, but found nothing related to this.
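To make option 2 (per-user mixing) concrete, here is a small sketch, in JavaScript for illustration even though the project is C++. It assumes each user's packet has already been decoded to 16-bit PCM; `mixExcludingSelf` is a hypothetical name. For each user it sums everyone else's samples and clamps to the 16-bit range, which is the "n encodings per frame" cost the question describes.

```javascript
// frames: one decoded Int16Array per user, all the same length (one 20ms frame).
// Returns one mixed Int16Array per user, excluding that user's own audio.
function mixExcludingSelf(frames) {
  const len = frames[0].length;
  return frames.map((_, self) => {
    const out = new Int16Array(len);
    for (let i = 0; i < len; i++) {
      let sum = 0;
      for (let u = 0; u < frames.length; u++) {
        if (u !== self) sum += frames[u][i];
      }
      // Clamp: summing several streams can overflow 16-bit range.
      out[i] = Math.max(-32768, Math.min(32767, sum));
    }
    return out;
  });
}
```

Note the clamping line is also why option 3 is risky: once the full mix has been clipped (or lossily re-encoded), subtracting your own decoded samples no longer recovers a clean "everyone but me" signal.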

Extract individual frames as a buffer from a video

I'm trying to process a video using TensorFlow in Node.js (i.e. on the server; I don't have a web page). I need to process each frame in the video individually. I see some people using ffmpeg to generate individual image files from the video, but that seems wasteful as it creates files on the filesystem. I would prefer to grab each frame as a base64 string in memory. I've got this working using OpenCV4Node but am wondering if there are any lighter-weight solutions. Is anyone already doing this? Any help would be appreciated :-)

ffmpeg - Can I draw an audio channel as an image?

I'm wondering if it's possible to draw an audio channel of a video or audio file as an image using ffmpeg, or if there's another tool that would do it on Win2k8 x64. I'm doing this as part of an encoding process after a user uploads a video or audio file.
I'm using ColdFusion 10 to handle the upload and calling cfexecute to run ffmpeg.
I need the image to look something like this (without the horizontal lines):
You can do this programmatically quite easily.
Study the basics of FFmpeg. I suggest compiling this sample. It explains how to open a video/audio file, identify the streams, and loop over the packets.
Once you have a data packet (in this case you are interested only in the audio packets), you decode it (line 87 of this document) and obtain the raw audio data. That's the waveform itself (the analogue "bitmap" of the audio).
You could also study this sample. This second example shows how to write a video/audio file. You don't want to write any video, but it makes it easy to understand how raw audio data packets work if you look at the functions get_audio_frame() and write_audio_frame().
You also need some knowledge of creating a bitmap; any platform has an easy way to do that.
So, the answer for you: YES, IT IS POSSIBLE TO DO THIS WITH FFMPEG! But you have to code a little bit in order to get what you want...
UPDATE:
Sorry, there are ALSO built-in filters for this: showspectrum, showwaves, avectorscope.
Here are some examples of how to use them: FFmpeg Filters - 12.22 showwaves.

How does one Capture MP3s in J2ME?

I was able to capture audio in the WAV format through Manager.createPlayer("capture://audio"). However, is there a way to capture audio in the MP3 format in J2ME?
It will likely depend on the platform in question; you would have to check the different device implementations you want to support.
Rory, what do you mean?
I was really asking about the String argument for the createPlayer(String s) method. J2ME automatically records to a WAV file, but I was wondering if I could request that it record to MP3. Of course, if that MP3 argument did not work, a MediaException would be thrown. Please forgive me if I seem to be missing the point of your response.
