How do I run Whisper on an entire directory?

I'd like to transcribe speech to text using Whisper. I have been able to successfully run it on a single file using the command:
whisper audio.wav
I'd like to run it on a large number of files in a single directory called "Audio" on my desktop. I tried to write this in Python as follows:
import whisper
import os
model = whisper.load_model("base")
for filename in os.listdir('Audio'):
    model.transcribe(filename)
It appears to start, but then gives me some errors about "No such file or directory." Is there some way I can correct this to run Whisper on all the .wav files in my Audio directory?
Error:
/opt/homebrew/lib/python3.10/site-packages/whisper/transcribe.py:78: UserWarning: FP16 is not supported on CPU; using FP32 instead
warnings.warn("FP16 is not supported on CPU; using FP32 instead")
/opt/homebrew/lib/python3.10/site-packages/whisper/transcribe.py:78: UserWarning: FP16 is not supported on CPU; using FP32 instead
warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.10/site-packages/whisper/audio.py", line 42, in load_audio
ffmpeg.input(file, threads=0)
File "/opt/homebrew/lib/python3.10/site-packages/ffmpeg/_run.py", line 325, in run
raise Error('ffmpeg', out, err)
ffmpeg._run.Error: ffmpeg error (see stderr output for detail)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/user/Desktop/transcribe.py", line 7, in <module>
model.transcribe(filename)
File "/opt/homebrew/lib/python3.10/site-packages/whisper/transcribe.py", line 84, in transcribe
mel = log_mel_spectrogram(audio)
File "/opt/homebrew/lib/python3.10/site-packages/whisper/audio.py", line 111, in log_mel_spectrogram
audio = load_audio(audio)
File "/opt/homebrew/lib/python3.10/site-packages/whisper/audio.py", line 47, in load_audio
raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e
RuntimeError: Failed to load audio: ffmpeg version 5.1.2 Copyright (c) 2000-2022 the FFmpeg developers
built with Apple clang version 14.0.0 (clang-1400.0.29.202)
configuration: --prefix=/opt/homebrew/Cellar/ffmpeg/5.1.2_1 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox --enable-neon
libavutil 57. 28.100 / 57. 28.100
libavcodec 59. 37.100 / 59. 37.100
libavformat 59. 27.100 / 59. 27.100
libavdevice 59. 7.100 / 59. 7.100
libavfilter 8. 44.100 / 8. 44.100
libswscale 6. 7.100 / 6. 7.100
libswresample 4. 7.100 / 4. 7.100
libpostproc 56. 6.100 / 56. 6.100
221211_1834.wav: No such file or directory

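The error occurs because os.listdir returns bare file names, so each name is looked up relative to the current working directory instead of inside the Audio folder. Here's an option that passes full paths to Whisper. It does the following: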
1 - Finds all .wav files in the "root folder" & sub-folders. You need to change this to your "Audio" folder location.
2 - Shows progress bar as it's transcribing the files (done using tqdm).
3 - Saves a .txt file containing the transcription next to the .wav files.
CODE:
import os
import whisper
from tqdm import tqdm

# Define the folder where the wav files are located
root_folder = "/Users/downloads"

# Set up Whisper client
print("Loading whisper model...")
model = whisper.load_model("base")
print("Whisper model complete.")

# Get the number of wav files in the root folder and its sub-folders
print("Getting number of files to transcribe...")
num_files = sum(1 for dirpath, dirnames, filenames in os.walk(root_folder) for filename in filenames if filename.endswith(".wav"))
print("Number of files: ", num_files)

# Transcribe the wav files and display a progress bar
with tqdm(total=num_files, desc="Transcribing Files") as pbar:
    for dirpath, dirnames, filenames in os.walk(root_folder):
        for filename in filenames:
            if filename.endswith(".wav"):
                filepath = os.path.join(dirpath, filename)
                result = model.transcribe(filepath, fp16=False, verbose=True)
                transcription = result['text']
                # Write transcription to text file
                filename_no_ext = os.path.splitext(filename)[0]
                with open(os.path.join(dirpath, filename_no_ext + '.txt'), 'w') as f:
                    f.write(transcription)
                pbar.update(1)
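If you only need the files directly inside Audio (no sub-folders), a smaller variation may be enough. The sketch below is an illustration under assumptions rather than part of the answer above: find_wav_files and transcribe_folder are hypothetical helper names, and the model argument is assumed to be whatever whisper.load_model returned. The key point is that each file name is joined to its folder before it is passed to transcribe, which is what the loop in the question was missing.

```python
from pathlib import Path


def find_wav_files(folder):
    """Return the full paths of the .wav files directly inside `folder`, sorted."""
    return sorted(p for p in Path(folder).iterdir()
                  if p.is_file() and p.suffix.lower() == ".wav")


def transcribe_folder(folder, model):
    """Transcribe every .wav file in `folder` and save a .txt file next to each one.

    `model` is assumed to expose transcribe(path) returning a dict with a
    "text" key, as the object returned by whisper.load_model does.
    """
    written = []
    for wav_path in find_wav_files(folder):
        # Pass the full path, not just the bare file name.
        result = model.transcribe(str(wav_path))
        txt_path = wav_path.with_suffix(".txt")
        txt_path.write_text(result["text"], encoding="utf-8")
        written.append(txt_path)
    return written


# Example usage (assumes the openai-whisper package is installed):
#   import whisper
#   model = whisper.load_model("base")
#   transcribe_folder(Path.home() / "Desktop" / "Audio", model)
```

Taking the model as a parameter keeps the folder handling separate from Whisper itself, so the path logic can be checked with a stand-in object before running a long transcription job.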

Related

ffmpeg_extract_subclip fails to trim some videos

I am using ffmpeg_extract_subclip(video_name, start-9, start, targetname=target_name) to trim a 9s video.
The start (int) variable comes from a json file:
start = b_info_list[elem+1]["start_time"]
delta_1 = dt.strptime(start, "%H:%M:%S.%f") - dt.strptime("", "")
start = int(delta_1.total_seconds())
The exact same code works for most of the videos and their JSON counterparts, but fails for a few of them. So I'm assuming there's nothing wrong with the variables start (which is always >= 10), video_name, and target_name.
The error log is below:
Traceback (most recent call last):
File "data/video_trim.py", line 324, in <module>
main()
File "data/video_trim.py", line 316, in main
norm_symp_act_videos(f)
File "data/video_trim.py", line 235, in norm_symp_act_videos
ffmpeg_extract_subclip(video_name, start-9, start, targetname=target_name)
File "/home/office/anaconda3/envs/env_2/lib/python3.7/site-packages/moviepy/video/io/ffmpeg_tools.py", line 41, in ffmpeg_extract_subclip
subprocess_call(cmd)
File "/home/office/anaconda3/envs/env_2/lib/python3.7/site-packages/moviepy/tools.py", line 54, in subprocess_call
raise IOError(err.decode('utf8'))
OSError: ffmpeg version 4.2.2-static https://johnvansickle.com/ffmpeg/ Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 8 (Debian 8.3.0-6)
configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi --enable-libzimg
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/office/data/video_05.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
creation_time : 2021-10-13T04:33:45.000000Z
encoder : Lavf58.45.100
Duration: 00:02:00.07, start: 0.000000, bitrate: 3149 kb/s
Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x1080, 3148 kb/s, SAR 1:1 DAR 16:9, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
Metadata:
creation_time : 2021-10-13T04:33:45.000000Z
handler_name : VideoHandler
timecode : 01:00:00:00
Stream #0:1(eng): Data: none (tmcd / 0x64636D74)
Metadata:
creation_time : 2021-10-13T04:33:45.000000Z
handler_name : TimeCodeHandler
timecode : 01:00:00:00
[mp4 # 0x62a7600] You requested a copy of the original timecode track so timecode metadata are now ignored
[mp4 # 0x62a7600] Could not find tag for codec none in stream #1, codec not currently supported in container
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #0:1 -> #0:1 (copy)
Last message repeated 1 times
The function looks like this:
def ffmpeg_extract_subclip(filename, t1, t2, targetname=None):
    """ Makes a new video file playing video file ``filename`` between
    the times ``t1`` and ``t2``. """
    name, ext = os.path.splitext(filename)
    if not targetname:
        T1, T2 = [int(1000*t) for t in [t1, t2]]
        targetname = "%sSUB%d_%d.%s" % (name, T1, T2, ext)
    cmd = [get_setting("FFMPEG_BINARY"),"-y",
           "-ss", "%0.2f"%t1,
           "-i", filename,
           "-t", "%0.2f"%(t2-t1),
           "-map", "0", "-vcodec", "copy", "-acodec", "copy", targetname]
    subprocess_call(cmd)
I looked for similar issues, but the solutions are given as command-line invocations, which I don't understand. Can anyone help solve this issue? Thanks.
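The log itself hints at the cause: Stream #0:1 is a timecode data track (tmcd), and because the command uses -map 0, ffmpeg tries to copy that track into the new MP4 and stops with "Could not find tag for codec none in stream #1". One possible workaround, sketched here under assumptions, is to build the command yourself and map only the video and audio streams. The helper names build_subclip_cmd and extract_subclip are hypothetical, and ffmpeg is assumed to be on the PATH; the trailing ? in 0:a? tells ffmpeg not to fail when a file has no audio.

```python
import subprocess


def build_subclip_cmd(filename, t1, t2, targetname, ffmpeg_binary="ffmpeg"):
    """Build an ffmpeg command that stream-copies the span t1..t2 (in seconds).

    Only video and audio streams are mapped, so data tracks such as a
    timecode (tmcd) stream are left out of the output.
    """
    return [
        ffmpeg_binary, "-y",
        "-ss", "%0.2f" % t1,
        "-i", filename,
        "-t", "%0.2f" % (t2 - t1),
        "-map", "0:v",    # every video stream of input 0
        "-map", "0:a?",   # every audio stream, if there is one
        "-c", "copy",
        targetname,
    ]


def extract_subclip(filename, t1, t2, targetname):
    """Run the command and raise CalledProcessError if ffmpeg fails."""
    subprocess.run(build_subclip_cmd(filename, t1, t2, targetname), check=True)
```

In the script from the question, this would be called as extract_subclip(video_name, start-9, start, target_name) in place of ffmpeg_extract_subclip.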

Unable to find a suitable output format for 'ffmpeg' for joining mp4 audio with mp4 audio

I want to add the audio from one MP4 file to the video from another MP4 file with FFmpeg. I am using this command: ffmpeg -i video.mp4 -i audio.mp4 -c:v copy -c:a aac output.mp4. However, FFmpeg fails with Unable to find a suitable output format for 'ffmpeg'. I've added the log below for more information. How can I fix this? Is it because both the video and audio files are in MP4 format?
Here's the error log:
2021-11-15 23:52:20.698658+0200 app[666:70371] [javascript] libavutil 56. 55.100 / 56. 55.100
2021-11-15 23:52:20.698892+0200 app[666:70371] [javascript] libavcodec 58. 96.100 / 58. 96.100
2021-11-15 23:52:20.699091+0200 app[666:70371] [javascript] libavformat 58. 48.100 / 58. 48.100
2021-11-15 23:52:20.699299+0200 app[666:70371] [javascript] libavdevice 58. 11.101 / 58. 11.101
2021-11-15 23:52:20.699501+0200 app[666:70371] [javascript] libavfilter 7. 87.100 / 7. 87.100
2021-11-15 23:52:20.705405+0200 app[666:70371] [javascript] libswscale 5. 8.100 / 5. 8.100
2021-11-15 23:52:20.705768+0200 app[666:70371] [javascript] libswresample 3. 8.100 / 3. 8.100
2021-11-15 23:52:20.716141+0200 app[666:70371] [javascript] Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/var/mobile/Containers/Data/Application/Documents/video.mp4':
2021-11-15 23:52:20.716431+0200 app[666:70371] [javascript] Metadata:
2021-11-15 23:52:20.716646+0200 app[666:70371] [javascript] major_brand :
2021-11-15 23:52:20.716849+0200 app[666:70371] [javascript] isom
2021-11-15 23:52:20.717051+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.717248+0200 app[666:70371] [javascript] minor_version :
2021-11-15 23:52:20.717443+0200 app[666:70371] [javascript] 512
2021-11-15 23:52:20.717637+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.717827+0200 app[666:70371] [javascript] compatible_brands:
2021-11-15 23:52:20.718035+0200 app[666:70371] [javascript] isomiso2avc1mp41
2021-11-15 23:52:20.718232+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.718460+0200 app[666:70371] [javascript] encoder :
2021-11-15 23:52:20.718659+0200 app[666:70371] [javascript] Lavf58.48.100
2021-11-15 23:52:20.718866+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.719064+0200 app[666:70371] [javascript] Duration:
2021-11-15 23:52:20.719276+0200 app[666:70371] [javascript] 00:00:28.67
2021-11-15 23:52:20.719480+0200 app[666:70371] [javascript] , start:
2021-11-15 23:52:20.719682+0200 app[666:70371] [javascript] 0.000000
2021-11-15 23:52:20.719872+0200 app[666:70371] [javascript] , bitrate:
2021-11-15 23:52:20.720425+0200 app[666:70371] [javascript] 6170 kb/s
2021-11-15 23:52:20.720860+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.721073+0200 app[666:70371] [javascript] Stream #0:0
2021-11-15 23:52:20.721385+0200 app[666:70371] [javascript] (und)
2021-11-15 23:52:20.721623+0200 app[666:70371] [javascript] : Video: h264 (avc1 / 0x31637661), yuv420p(tv), 720x1280, 6166 kb/s
2021-11-15 23:52:20.721831+0200 app[666:70371] [javascript] ,
2021-11-15 23:52:20.722030+0200 app[666:70371] [javascript] 30 fps,
2021-11-15 23:52:20.722229+0200 app[666:70371] [javascript] 30 tbr,
2021-11-15 23:52:20.722426+0200 app[666:70371] [javascript] 19200 tbn,
2021-11-15 23:52:20.722622+0200 app[666:70371] [javascript] 38400 tbc
2021-11-15 23:52:20.722818+0200 app[666:70371] [javascript] (default)
2021-11-15 23:52:20.723013+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.723223+0200 app[666:70371] [javascript] Metadata:
2021-11-15 23:52:20.723424+0200 app[666:70371] [javascript] handler_name :
2021-11-15 23:52:20.723624+0200 app[666:70371] [javascript] Core Media Video
2021-11-15 23:52:20.723820+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.724032+0200 app[666:70371] [javascript] Input #1, mov,mp4,m4a,3gp,3g2,mj2, from '/var/mobile/Containers/Data/Application/Documents/audio.mp4':
2021-11-15 23:52:20.724232+0200 app[666:70371] [javascript] Metadata:
2021-11-15 23:52:20.724430+0200 app[666:70371] [javascript] major_brand :
2021-11-15 23:52:20.724627+0200 app[666:70371] [javascript] isom
2021-11-15 23:52:20.724822+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.725025+0200 app[666:70371] [javascript] minor_version :
2021-11-15 23:52:20.725227+0200 app[666:70371] [javascript] 512
2021-11-15 23:52:20.725426+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.725624+0200 app[666:70371] [javascript] compatible_brands:
2021-11-15 23:52:20.725820+0200 app[666:70371] [javascript] isomiso2mp41
2021-11-15 23:52:20.726026+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.726230+0200 app[666:70371] [javascript] encoder :
2021-11-15 23:52:20.726427+0200 app[666:70371] [javascript] Lavf58.48.100
2021-11-15 23:52:20.726624+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.726827+0200 app[666:70371] [javascript] Duration:
2021-11-15 23:52:20.727025+0200 app[666:70371] [javascript] 00:00:28.65
2021-11-15 23:52:20.727230+0200 app[666:70371] [javascript] , start:
2021-11-15 23:52:20.727428+0200 app[666:70371] [javascript] 0.000000
2021-11-15 23:52:20.727627+0200 app[666:70371] [javascript] , bitrate:
2021-11-15 23:52:20.727829+0200 app[666:70371] [javascript] 93 kb/s
2021-11-15 23:52:20.728025+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.728220+0200 app[666:70371] [javascript] Stream #1:0
2021-11-15 23:52:20.728414+0200 app[666:70371] [javascript] (eng)
2021-11-15 23:52:20.728733+0200 app[666:70371] [javascript] : Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 91 kb/s
2021-11-15 23:52:20.728941+0200 app[666:70371] [javascript] (default)
2021-11-15 23:52:20.729134+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.729339+0200 app[666:70371] [javascript] Metadata:
2021-11-15 23:52:20.729593+0200 app[666:70371] [javascript] handler_name :
2021-11-15 23:52:20.729799+0200 app[666:70371] [javascript] SoundHandler
2021-11-15 23:52:20.729992+0200 app[666:70371] [javascript]
2021-11-15 23:52:20.730261+0200 app[666:70371] [javascript] FFmpeg process exited with rc=1.
2021-11-15 23:52:20.731495+0200 app[666:70371] [javascript] Unable to find a suitable output format for 'ffmpeg'
2021-11-15 23:52:20.731715+0200 app[666:70371] [javascript] ffmpeg: Invalid argument
Silly me: after hours of research I found that I had to run the command without the leading ffmpeg. The error says as much: Unable to find a suitable output format for 'ffmpeg' means the word ffmpeg itself was being treated as an output file name. So all I had to do was change the command to -i video.mp4 -i audio.mp4 -c:v copy -c:a aac output.mp4

Unrecognized option 'c copy'

I have been working on a script, both as part of my learning process and to create handy tools. I am trying to loop over a list of video files and extract a certain part of each video on the list. By looking at example scripts and the ffmpeg documentation I finally came up with this:
import os
import sys
import subprocess as sp
from moviepy.tools import subprocess_call
def ffmpeg_extract_pandomim_subclip():
    with open('videolist.txt') as f:
        lines = f.readlines()
    lines = [x.strip() for x in lines]
    for video in lines:
        name, ext = os.path.splitext(video)
        targetname = "%s-pandomim%s" % (name, ext)
        t1 = "00:10:00"
        t2 = "00:15:00"
        cmd = ["ffmpeg",
               "-i", "%s%s" % (name, ext),
               "-ss", t1,
               "-to", t2, "-c copy", targetname]
        subprocess_call(cmd)

ffmpeg_extract_pandomim_subclip()
I know this is not the ideal way to do it: I created a videolist.txt and listed all the video file names in it, line by line (T1-1.mp4, T1-2.mp4, ...). It shares a folder with the Python script "new 1.py" and the actual videos T1-1.mp4, T1-2.mp4, ...
The error I am getting really confuses me because when I use -c copy from cmd it works just fine.
The full error is:
C:\Users\çomak\AppData\Local\Programs\Python\Python35-32\python.exe "C:/ffmpeg/bin/new 1.py"
[MoviePy] Running:
>>> ffmpeg -i T1-1.mp4 -ss 00:10:00 -to 00:15:00 -c copy T1-1-pandomim.mp4
[MoviePy] This command returned an error !Traceback (most recent call last):
File "C:/ffmpeg/bin/new 1.py", line 28, in <module>
ffmpeg_extract_pandomim_subclip()
File "C:/ffmpeg/bin/new 1.py", line 25, in ffmpeg_extract_pandomim_subclip
subprocess_call(cmd)
File "C:\Users\çomak\AppData\Local\Programs\Python\Python35-32\lib\site-packages\moviepy\tools.py", line 48, in subprocess_call
raise IOError(err.decode('utf8'))
OSError: ffmpeg version N-83975-g6c4665d Copyright (c) 2000-2017 the FFmpeg developers
built with gcc 6.3.0 (GCC)
configuration: --enable-gpl --enable-version3 --enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-nvenc --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-lzma --enable-zlib
libavutil 55. 48.100 / 55. 48.100
libavcodec 57. 83.100 / 57. 83.100
libavformat 57. 66.104 / 57. 66.104
libavdevice 57. 3.100 / 57. 3.100
libavfilter 6. 76.100 / 6. 76.100
libswscale 4. 3.101 / 4. 3.101
libswresample 2. 4.100 / 2. 4.100
libpostproc 54. 2.100 / 54. 2.100
Unrecognized option 'c copy'.
Error splitting the argument list: Option not found
Process finished with exit code 1
I am using Pycharm and if I remove the -c copy part it works, but the process is slow... With -c copy, it is much faster.
I appreciate your time and effort to help me out!
This is due to the way subprocess_call hands the list to the process: every list element becomes exactly one argument, so "-c copy" reaches ffmpeg as a single option named 'c copy' rather than as -c followed by copy. You cannot condense an option and its value into one string; they must be separate list elements.
To fix this, split it into the two arguments it really is:
[ ..., "-c", "copy", ... ]
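To see the difference, the sketch below builds the argument list both ways. It is only an illustration: build_cut_cmd is a hypothetical helper, and shlex.split is used here simply as a convenient way to turn a shell-style option string into separate list items.

```python
import shlex


def build_cut_cmd(video, targetname, t1="00:10:00", t2="00:15:00", extra="-c copy"):
    """Return an ffmpeg argument list in which every option and value is its own item."""
    cmd = ["ffmpeg", "-i", video, "-ss", t1, "-to", t2]
    # shlex.split("-c copy") gives ["-c", "copy"]: two separate arguments.
    cmd += shlex.split(extra)
    cmd.append(targetname)
    return cmd


# The list from the question: "-c copy" is a single argument here.
broken = ["ffmpeg", "-i", "T1-1.mp4", "-c copy", "out.mp4"]
fixed = build_cut_cmd("T1-1.mp4", "T1-1-pandomim.mp4")

print(broken)
print(fixed)
```

File names are appended to the list directly rather than run through shlex.split, so names containing spaces stay intact as one argument.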

How to create a video from image buffers using fluent-ffmpeg?

I've been trying to create a slideshow from a series of images using nodejs + fluent-ffmpeg, but it does not work well or consistently. ffmpeg occasionally emits "Error: ffmpeg exited with code 1: pipe:0: Invalid data found when processing input", and when a video (mp4) is eventually created, it seems to be missing images/frames.
The process is as follows: images are loaded into memory, resized to the same dimensions using lwip, and written sequentially into a passthrough stream, which is fed to ffmpeg as input.
Relevant code snippets:
var lwip = require('lwip');
var ffmpeg = require('fluent-ffmpeg');
var stream = require('stream');
var imagesStream = new stream.PassThrough();
...
image.batch()
    .contain(options.video.width, options.video.height, 'lanczos')
    .toBuffer(options.frames.format, {quality: 100}, (err, buffer) => {
        if (err) {
            throw ('error convering image to buffer. ' + err);
        }
        imagesStream.write(buffer, 'utf8');
        resolve();
    });
...
ffmpeg(imagesStream)
    .inputOptions('-framerate 1/' + options.frames.secsPerImage)
    .input(path.join(AUDIO_ROOT, options.audio.track))
    .save(path.join(path.join(OUTPUT_FOLDER, `${options.video.output.prefix}${timestamp}.${options.video.output.format}`)))
    .size(`${options.video.width}x${options.video.height}`)
    .on('start', () => {
        console.log('creating the clip now...')
    })
    .on('progress', (progress) => {
        var progPercent = Math.round(100 * progress.frames / (numImages * options.frames.secsPerImage * 25));
        progPercent = Math.min(progPercent, 100);
        console.log(`processing: ${progPercent}% done`)
    })
    .on('stderr', (line) => {
        console.error('ffmpeg error: ' + line);
    })
    .on('error', (error) => {
        reject('ffmpeg transcoding error: ' + error);
    })
    .on('end', () => {
        console.log('done!');
        resolve(true);
    })
    .run();
And here is the output:
"C:\Program Files (x86)\JetBrains\WebStorm 2016.1.1\bin\runnerw.exe" "C:\Program Files\nodejs\node.exe" vm2.js
image count: 18
image count: 6
creating the clip now...
ffmpeg error: ffmpeg version N-80335-gcb46b78 Copyright (c) 2000-2016 the FFmpeg developers
ffmpeg error: built with gcc 5.4.0 (GCC)
ffmpeg error: configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-nvenc --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmfx --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-lzma --enable-decklink --enable-zlib
ffmpeg error: libavutil 55. 24.100 / 55. 24.100
ffmpeg error: libavcodec 57. 46.100 / 57. 46.100
ffmpeg error: libavformat 57. 38.100 / 57. 38.100
ffmpeg error: libavdevice 57. 0.101 / 57. 0.101
ffmpeg error: libavfilter 6. 46.101 / 6. 46.101
ffmpeg error: libswscale 4. 1.100 / 4. 1.100
ffmpeg error: libswresample 2. 1.100 / 2. 1.100
ffmpeg error: libpostproc 54. 0.100 / 54. 0.100
creating the clip now...
ffmpeg error: ffmpeg version N-80335-gcb46b78 Copyright (c) 2000-2016 the FFmpeg developers
ffmpeg error: built with gcc 5.4.0 (GCC)
ffmpeg error: configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-nvenc --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmfx --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-lzma --enable-decklink --enable-zlib
ffmpeg error: libavutil 55. 24.100 / 55. 24.100
ffmpeg error: libavcodec 57. 46.100 / 57. 46.100
ffmpeg error: libavformat 57. 38.100 / 57. 38.100
ffmpeg error: libavdevice 57. 0.101 / 57. 0.101
ffmpeg error: libavfilter 6. 46.101 / 6. 46.101
ffmpeg error: libswscale 4. 1.100 / 4. 1.100
ffmpeg error: libswresample 2. 1.100 / 2. 1.100
ffmpeg error: libpostproc 54. 0.100 / 54. 0.100
ffmpeg error: pipe:0: Invalid data found when processing input
ffmpeg error:
an error has occurred: ffmpeg transcoding error: Error: ffmpeg exited with code 1: pipe:0: Invalid data found when processing input
ffmpeg error: [jpeg_pipe # 0000000000308fe0] Format jpeg_pipe detected only with low score of 6, misdetection possible!
ffmpeg error: Input #0, jpeg_pipe, from 'pipe:0':
ffmpeg error: Duration: N/A, bitrate: N/A
ffmpeg error: Stream #0:0: Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown), 1920x1080 [SAR 1:1 DAR 16:9], 0.33 tbr, 0.33 tbn, 0.33 tbc
ffmpeg error: [mp3 # 0000000002fa0720] Estimating duration from bitrate, this may be inaccurate
ffmpeg error: Input #1, mp3, from 'audio\avicii.mp3':
ffmpeg error: Metadata:
ffmpeg error: album : True
ffmpeg error: genre : House
ffmpeg error: copyright : ℗ 2013 Avicii Music AB, / PRMD under exclusive license to Universal Music AB
ffmpeg error: encoded_by : Oz
ffmpeg error: title : Wake Me Up
ffmpeg error: artist : Avicii
ffmpeg error: album_artist : Avicii
ffmpeg error: disc : 1/1
ffmpeg error: track : 1/12
ffmpeg error: TYER : 2013-09-13T07:00:00Z
ffmpeg error: Duration: 00:04:09.73, start: 0.000000, bitrate: 321 kb/s
ffmpeg error: Stream #1:0: Audio: mp3, 44100 Hz, stereo, s16p, 320 kb/s
ffmpeg error: Stream #1:1: Video: mjpeg, yuvj444p(pc, bt470bg/unknown/unknown), 600x600 [SAR 305:305 DAR 1:1], 90k tbr, 90k tbn, 90k tbc
ffmpeg error: Metadata:
ffmpeg error: comment : Cover (front)
ffmpeg error: No pixel format specified, yuvj420p for H.264 encoding chosen.
ffmpeg error: Use -pix_fmt yuv420p for compatibility with outdated media players.
ffmpeg error: [libx264 # 000000000030e860] using SAR=1/1
ffmpeg error: [libx264 # 000000000030e860] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
ffmpeg error: [libx264 # 000000000030e860] profile High, level 4.0
ffmpeg error: [libx264 # 000000000030e860] 264 - core 148 r2694 3b70645 - H.264/MPEG-4 AVC codec - Copyleft 2003-2016 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
ffmpeg error: [mp4 # 0000000002ec6980] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
ffmpeg error: Last message repeated 1 times
ffmpeg error: Output #0, mp4, to 'output\clip_2016-06-22_06-17-25.mp4':
ffmpeg error: Metadata:
ffmpeg error: encoder : Lavf57.38.100
ffmpeg error: Stream #0:0: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuvj420p(pc), 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 25 fps, 12800 tbn, 25 tbc
ffmpeg error: Metadata:
ffmpeg error: encoder : Lavc57.46.100 libx264
ffmpeg error: Side data:
ffmpeg error: cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
ffmpeg error: Stream #0:1: Audio: aac (LC) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 128 kb/s
ffmpeg error: Metadata:
ffmpeg error: encoder : Lavc57.46.100 aac
ffmpeg error: Stream mapping:
ffmpeg error: Stream #0:0 -> #0:0 (mjpeg (native) -> h264 (libx264))
ffmpeg error: Stream #1:0 -> #0:1 (mp3 (native) -> aac (native))
ffmpeg error: frame= 75 fps=0.0 q=28.0 size= 0kB time=00:00:00.64 bitrate= 0.6kbits/s dup=74 drop=0 speed=1.09x
processing: 17% done
ffmpeg error: frame= 150 fps= 97 q=28.0 size= 371kB time=00:00:03.64 bitrate= 835.6kbits/s dup=148 drop=0 speed=2.35x
processing: 33% done
processing: 33% done
ffmpeg error: frame= 150 fps= 73 q=28.0 size= 879kB time=00:00:07.36 bitrate= 977.8kbits/s dup=148 drop=0 speed= 3.6x
processing: 33% done
ffmpeg error: frame= 150 fps= 59 q=28.0 size= 952kB time=00:00:18.36 bitrate= 424.7kbits/s dup=148 drop=0 speed=7.21x
processing: 33% done
ffmpeg error: frame= 150 fps= 49 q=28.0 size= 1190kB time=00:00:32.99 bitrate= 295.3kbits/s dup=148 drop=0 speed=10.8x
ffmpeg error: frame= 150 fps= 42 q=28.0 size= 1409kB time=00:00:46.64 bitrate= 247.4kbits/s dup=148 drop=0 speed=13.1x
processing: 33% done
processing: 33% done
ffmpeg error: frame= 150 fps= 37 q=28.0 size= 1628kB time=00:01:00.30 bitrate= 221.1kbits/s dup=148 drop=0 speed=14.9x
processing: 33% done
ffmpeg error: frame= 150 fps= 33 q=28.0 size= 1878kB time=00:01:15.83 bitrate= 202.8kbits/s dup=148 drop=0 speed=16.7x
processing: 33% done
ffmpeg error: frame= 150 fps= 30 q=28.0 size= 2130kB time=00:01:31.64 bitrate= 190.4kbits/s dup=148 drop=0 speed=18.2x
processing: 33% done
ffmpeg error: frame= 150 fps= 27 q=28.0 size= 2375kB time=00:01:47.18 bitrate= 181.5kbits/s dup=148 drop=0 speed=19.3x
processing: 33% done
ffmpeg error: frame= 150 fps= 25 q=28.0 size= 2626kB time=00:02:03.15 bitrate= 174.7kbits/s dup=148 drop=0 speed=20.4x
processing: 33% done
ffmpeg error: frame= 150 fps= 23 q=28.0 size= 2832kB time=00:02:16.20 bitrate= 170.3kbits/s dup=148 drop=0 speed=20.8x
ffmpeg error: frame= 150 fps= 21 q=28.0 size= 3063kB time=00:02:30.34 bitrate= 166.9kbits/s dup=148 drop=0 speed=21.3x
processing: 33% done
processing: 33% done
ffmpeg error: frame= 150 fps= 20 q=28.0 size= 3298kB time=00:02:44.93 bitrate= 163.8kbits/s dup=148 drop=0 speed=21.8x
processing: 33% done
ffmpeg error: frame= 150 fps= 19 q=28.0 size= 3522kB time=00:02:58.93 bitrate= 161.3kbits/s dup=148 drop=0 speed=22.2x
processing: 33% done
ffmpeg error: frame= 150 fps= 18 q=28.0 size= 3792kB time=00:03:15.83 bitrate= 158.6kbits/s dup=148 drop=0 speed=22.9x
processing: 33% done
ffmpeg error: frame= 150 fps= 17 q=28.0 size= 4035kB time=00:03:31.11 bitrate= 156.6kbits/s dup=148 drop=0 speed=23.3x
processing: 33% done
ffmpeg error: frame= 150 fps= 16 q=28.0 size= 4294kB time=00:03:47.62 bitrate= 154.5kbits/s dup=148 drop=0 speed=23.8x
processing: 33% done
ffmpeg error: frame= 150 fps= 15 q=28.0 size= 4566kB time=00:04:04.87 bitrate= 152.7kbits/s dup=148 drop=0 speed=24.4x
processing: 33% done
ffmpeg error: frame= 150 fps= 14 q=-1.0 Lsize= 4851kB time=00:04:09.73 bitrate= 159.1kbits/s dup=148 drop=0 speed=23.6x
ffmpeg error: video:826kB audio:3978kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.967442%
ffmpeg error: [libx264 # 000000000030e860] frame I:2 Avg QP:14.53 size:414604
ffmpeg error: [libx264 # 000000000030e860] frame P:38 Avg QP:16.59 size: 222
ffmpeg error: [libx264 # 000000000030e860] frame B:110 Avg QP:12.67 size: 69
ffmpeg error: [libx264 # 000000000030e860] consecutive B-frames: 1.3% 2.7% 0.0% 96.0%
ffmpeg error: [libx264 # 000000000030e860] mb I I16..4: 25.5% 49.4% 25.1%
ffmpeg error: [libx264 # 000000000030e860] mb P I16..4: 0.0% 0.0% 0.0% P16..4: 0.7% 0.0% 0.0% 0.0% 0.0% skip:99.2%
ffmpeg error: [libx264 # 000000000030e860] mb B I16..4: 0.0% 0.0% 0.0% B16..8: 0.0% 0.0% 0.0% direct: 0.0% skip:100.0% L0: 1.2% L1:98.8% BI: 0.0%
ffmpeg error: [libx264 # 000000000030e860] 8x8 transform intra:49.4% inter:92.1%
ffmpeg error: [libx264 # 000000000030e860] coded y,uvDC,uvAC intra: 74.2% 73.7% 69.0% inter: 0.0% 0.2% 0.0%
ffmpeg error: [libx264 # 000000000030e860] i16 v,h,dc,p: 97% 0% 2% 1%
ffmpeg error: [libx264 # 000000000030e860] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 13% 19% 15% 7% 9% 7% 10% 7% 13%
ffmpeg error: [libx264 # 000000000030e860] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 12% 21% 8% 8% 11% 9% 12% 7% 13%
ffmpeg error: [libx264 # 000000000030e860] i8c dc,h,v,p: 55% 19% 16% 10%
ffmpeg error: [libx264 # 000000000030e860] Weighted P-Frames: Y:0.0% UV:0.0%
ffmpeg error: [libx264 # 000000000030e860] ref P L0: 95.4% 0.6% 3.1% 0.9%
ffmpeg error: [libx264 # 000000000030e860] ref B L1: 98.8% 1.2%
ffmpeg error: [libx264 # 000000000030e860] kb/s:1127.01
ffmpeg error: [aac # 0000000002ecc880] Qavg: 541.237
ffmpeg error:
done!
Apparently this is a bug in fluent-ffmpeg, so I reverted to using ffmpeg directly as a child process. Here is the final code for creating an mp4 from a stream of raw in-memory jpeg images:
var spawn = require('child_process').spawn;
var path = require('path');
var moment = require('moment');

// options, AUDIO_ROOT and OUTPUT_FOLDER are defined elsewhere in the app.
var timestamp = moment().format(options.video.output.timestampFormat);
var framerate = '1/' + options.frames.secsPerImage;
var videosize = `${options.video.width}x${options.video.height}`;
var audiotrack = path.join(AUDIO_ROOT, options.audio.track);
var outputFilename = path.join(OUTPUT_FOLDER,
    `${options.video.output.prefix}${timestamp}.${options.video.output.format}`);
var childProcess = spawn('bin/ffmpeg.exe', ['-y', '-f', 'image2pipe',
    '-s', videosize,
    '-framerate', framerate,
    '-pix_fmt', 'yuv420p',
    '-i', '-',
    '-i', audiotrack,
    '-vcodec', 'mpeg4',
    '-shortest',
    outputFilename
]);
// ffmpeg writes its progress log to stderr, so log both streams.
childProcess.stdout.on('data', data => console.log(data.toString()));
childProcess.stderr.on('data', data => console.log(data.toString()));
childProcess.on('close', code => {
    console.log(`done! (${code})`);
    resolve(); // resolve comes from an enclosing Promise (not shown)
});
imagesStream.pipe(childProcess.stdin);
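The snippet above calls resolve(), so it presumably runs inside a Promise. A minimal sketch of such a wrapper follows. The names buildImagePipeArgs and encodeImages and the flat options object are hypothetical (they are not from the original code), and the sketch assumes an ffmpeg binary is reachable on the PATH unless a path is passed in:

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: builds the same argument list as the answer above
// from a flat options object (an assumption, not the original shape).
function buildImagePipeArgs({ width, height, secsPerImage, audioTrack, output }) {
  return [
    '-y',                              // overwrite the output file if it exists
    '-f', 'image2pipe',                // read a sequence of images from a pipe
    '-s', `${width}x${height}`,
    '-framerate', `1/${secsPerImage}`, // one image every secsPerImage seconds
    '-pix_fmt', 'yuv420p',
    '-i', '-',                         // first input: stdin
    '-i', audioTrack,                  // second input: the audio file
    '-vcodec', 'mpeg4',
    '-shortest',                       // stop when the shortest stream ends
    output,
  ];
}

// Hypothetical wrapper: resolves when ffmpeg exits with code 0, rejects otherwise.
function encodeImages(imagesStream, options, ffmpegPath = 'ffmpeg') {
  return new Promise((resolve, reject) => {
    const child = spawn(ffmpegPath, buildImagePipeArgs(options));
    child.stderr.on('data', data => console.log(data.toString()));
    child.on('error', reject);         // e.g. the binary was not found
    child.stdin.on('error', reject);   // e.g. ffmpeg exited before reading everything
    child.on('close', code => {
      if (code === 0) resolve();
      else reject(new Error(`ffmpeg exited with code ${code}`));
    });
    imagesStream.pipe(child.stdin);
  });
}
```

Rejecting on a non-zero exit code lets the caller tell a failed encode from a finished one, which the original close handler does not do.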
And as for the stream:
var stream = require('stream');
var imagesStream = new stream.PassThrough();
...
// repeat for every image (buffer is a Buffer, so no encoding argument is needed)...
imagesStream.write(buffer);
...
// then finally end the stream
imagesStream.end();
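With many large images, it may be worth respecting backpressure instead of writing every buffer at once. The following is a sketch only: writeImages is a hypothetical helper name, and it assumes all image buffers are available as an iterable.

```javascript
const { PassThrough } = require('stream');
const { once } = require('events');

// Hypothetical helper: writes each image buffer to the stream and waits for
// 'drain' whenever write() reports that the internal buffer is full.
async function writeImages(imagesStream, buffers) {
  for (const buffer of buffers) {
    if (!imagesStream.write(buffer)) {
      await once(imagesStream, 'drain');
    }
  }
  // Signal that no more images will follow.
  imagesStream.end();
}
```

The stream must have a consumer (for example, being piped into the ffmpeg process's stdin) before this is called. Otherwise 'drain' never fires once the internal buffer fills up.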

Node.js ffmetadata incorrect codec parameters

I'm making a song downloader app in Node.js. Everything works so far: the app downloads the song and its artwork (an image), so I have an mp3 file and a jpg file. The only problem is attaching the jpg file to the mp3 file.
I'm using the ffmetadata Node.js module, and I downloaded and installed its dependency, the ffmpeg CLI.
Now when I try to write the metadata to the mp3 file and attach the artwork, it fails with this error:
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
My code:
ffmetadata.write('test.mp3', {}, {attachments: ['test.jpg']}, function(err) {
    if (err) console.error(err);
});
The error:
[Error: ffmpeg version N-73872-g6b96c70 Copyright (c) 2000-2015 the FFmpeg developers
built with gcc 4.9.2 (GCC)
configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libdcadec --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-lzma --enable-decklink --enable-zlib
libavutil 54. 28.100 / 54. 28.100
libavcodec 56. 50.101 / 56. 50.101
libavformat 56. 40.101 / 56. 40.101
libavdevice 56. 4.100 / 56. 4.100
libavfilter 5. 25.100 / 5. 25.100
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 2.101 / 1. 2.101
libpostproc 53. 3.100 / 53. 3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'songs/Irresistible - Fall Out Boy (Lyrics).mp3':
Metadata:
major_brand : dash
minor_version : 0
compatible_brands: iso6mp41
creation_time : 2015-04-03 10:45:25
Duration: 00:03:26.94, start: 0.000000, bitrate: 128 kb/s
Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
Metadata:
creation_time : 2015-04-03 10:45:25
handler_name : SoundHandler
Input #1, image2, from 'songs/albumart/Irresistible - Fall Out Boy (Lyrics).jpg':
Duration: 00:00:00.04, start: 0.000000, bitrate: 24578 kb/s
Stream #1:0: Video: mjpeg, yuvj444p(pc, bt470bg/unknown/unknown), 640x640 [SAR 300:300 DAR 1:1], 25 tbr, 25 tbn, 25 tbc
[mp3 @ 0000000004865d80] Invalid audio stream. Exactly one MP3 audio stream is required.
Output #0, mp3, to 'songs\Irresistible - Fall Out Boy (Lyrics).ffmetadata.mp3':
Metadata:
major_brand : dash
minor_version : 0
compatible_brands: iso6mp41
dryRun : true
encoder : Lavf56.40.101
Stream #0:0(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, 125 kb/s (default)
Metadata:
creation_time : 2015-04-03 10:45:25
handler_name : SoundHandler
Stream #0:1: Video: mjpeg, yuvj444p, 640x640 [SAR 300:300 DAR 1:1], q=2-31, 25 tbr, 25 tbn, 25 tbc
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #1:0 -> #0:1 (copy)
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
]
Read the output:
Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
[mp3 @ 0000000004865d80] Invalid audio stream. Exactly one MP3 audio stream is required
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #1:0 -> #0:1 (copy)
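Your input file is not an MP3. It contains AAC audio (mp4a) in an MP4 container and simply has the wrong extension. The AAC stream is then copied as-is into an MP3 output, which the mp3 muxer rejects because it requires exactly one MP3 audio stream. Check what your downloader actually saves as the source file.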
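One way to catch a mislabeled file before handing it to ffmetadata is to look at its first bytes instead of trusting the extension. The sketch below is only a heuristic, and sniffAudioFormat and sniffFile are hypothetical names. It assumes MP4/M4A files carry the box type "ftyp" at byte offset 4 (the usual layout), and that MP3 files start with either an ID3v2 tag or an MPEG Layer III frame header.

```javascript
const fs = require('fs');

// Hypothetical helper: guesses the format from the first bytes of a file.
// Returns 'mp4', 'mp3' or 'unknown'.
function sniffAudioFormat(header) {
  if (header.length < 8) return 'unknown';
  // MP4/M4A: the first box is normally "ftyp", with the type at offset 4.
  if (header.toString('latin1', 4, 8) === 'ftyp') return 'mp4';
  // MP3 with an ID3v2 tag in front of the audio data.
  if (header.toString('latin1', 0, 3) === 'ID3') return 'mp3';
  // Bare MPEG audio frame: 11 sync bits set, layer bits 01 (Layer III).
  if (header[0] === 0xff && (header[1] & 0xe0) === 0xe0 && (header[1] & 0x06) === 0x02) {
    return 'mp3';
  }
  return 'unknown';
}

// Reads only the first 12 bytes of the file and classifies them.
function sniffFile(filename) {
  const fd = fs.openSync(filename, 'r');
  try {
    const header = Buffer.alloc(12);
    const bytesRead = fs.readSync(fd, header, 0, header.length, 0);
    return sniffAudioFormat(header.subarray(0, bytesRead));
  } finally {
    fs.closeSync(fd);
  }
}
```

If sniffFile('test.mp3') returns 'mp4', the file would need to be converted to real MP3 (or saved with a matching extension and handled as MP4) before artwork can be attached this way.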

Resources