How to get audio length for all audios using sox? - linux

I can do soxi -d * to get audio length information in hours, minutes, and seconds.
However, that only gives me each file's individual length.
If I want to see the total audio length for the entire folder, how can I accomplish that?
Like when you do "wc -w", it shows the sum of everything at the end. Is there a flag or something I can use with soxi?

soxi -Td *
From the man page:
-T Used with multiple files; changes the behaviour of -s, -d and -D to display the total across all given files. Note that when used with -s with files with different sampling rates, this is of questionable value.

Assume you have .wav files in the directory.
For a single file, use soxi -D ... to get the duration in seconds; the values can then be summed with awk or bc.
Use any of these commands to get the total:
# in seconds
soxi -D *.wav | awk '{s+=$1} END{print s}'
# in seconds
soxi -D *.wav | paste -sd+ - | bc
# in minutes
soxi -D *.wav | awk '{s+=$1} END{print s/60}'
# in hours
soxi -D *.wav | awk '{s+=$1} END{print s/60/60}'
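For a human-readable total, the summed seconds can be reformatted into HH:MM:SS with awk. In this sketch the soxi -D output is simulated with printf, since the actual durations depend on your files:

```shell
# Sum per-file durations (stand-ins for `soxi -D *.wav` output)
# and pretty-print the total as HH:MM:SS.
total=$(printf '%s\n' 3723.5 120.25 | awk '{s+=$1} END{print s}')
awk -v t="$total" 'BEGIN{
  h = int(t/3600); m = int((t - h*3600)/60); s = t - h*3600 - m*60
  printf "%02d:%02d:%05.2f\n", h, m, s
}'
# prints 01:04:03.75
```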

Does it have to be soxi? If you're willing to use ffprobe, part of ffmpeg, here's how it scanned a folder of mine containing different file types: .mp3, AAC in .m4a, and FLAC.
This makes the lazy assumption that your filenames don't contain the equals sign ("="); if they do, adjust the csv=s= option accordingly.
gfind . -type f -not -name ".*" -print0 |
parallel -0 --bar -N 1 -j 8 \
  'ffprobe -hide_banner -v 0 \
   -select_streams a:0 \
   -show_entries format=format_long_name,size,filename,duration \
   -of csv=s="=":p=0:nk=0 \
   -i {}'
./genieaudio_93508443_.lossless.mp3
MP2/3 (MPEG audio layer 2/3)
232.150204
9287144
./genieaudio_16277926_.aac.flac.m4a
QuickTime / MOV
232.181000
63572859
./genieaudio_16277926_.lossless.mp3
MP2/3 (MPEG audio layer 2/3)
232.280816
9292368
MP2/3 (MPEG audio layer 2/3)
250.096327
10004990
./genieaudio_79412303_.lossless.mp3
MP2/3 (MPEG audio layer 2/3)
250.383673
10016483
./genieaudio_16108705_.192k.mp3.flac
raw FLAC
251.122000
55480793
./backupgenieaudio_16108705_test1.192k.mp3
MP2/3 (MPEG audio layer 2/3)
251.928000
6046272
./genieaudio_16108705_test1.192k.mp3
MP2/3 (MPEG audio layer 2/3)
251.928000
6046893
./genieaudio_16108705_test2.192k.mp3
MP2/3 (MPEG audio layer 2/3)
251.928000
6046848
./genieaudio_16254360_192_b.mp3
MP2/3 (MPEG audio layer 2/3)
255.111837
6123354
./genieaudio_16268888_.192k.mp3.flac
raw FLAC
259.442979
55115022
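To total those durations, you can ask ffprobe for just the duration field (e.g. -show_entries format=duration -of csv=p=0, one number per line) and sum with awk. The ffprobe output is simulated with printf here, using the first three durations from the listing above:

```shell
# Sum durations (seconds) as emitted one-per-line by e.g.
#   ffprobe -v 0 -show_entries format=duration -of csv=p=0 -i FILE
printf '%s\n' 232.150204 232.181000 232.280816 \
  | awk '{s+=$1} END{printf "total: %.1f seconds (%.1f minutes)\n", s, s/60}'
# prints: total: 696.6 seconds (11.6 minutes)
```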

Related

Mute Volume with Minimal Re-encoding

Is it possible to mute a section of a video file (say 5 seconds) without having to re-encode the whole audio stream with ffmpeg? I know it's technically (though probably not easily) possible by reusing the majority of the existing audio stream and only re-encoding the changed section, and possibly a short section before and after, but I'm not sure if ffmpeg supports this. If it doesn't, does anyone know of another library that does?
You can do the partial segmented encode, as you suggest, but if the source codec is DCT-based such as AAC/MP3, there will be glitches at the start and end of the re-encoded segment once you stitch it all back together.
You would use the segment muxer and concat demuxer to do this.
ffmpeg -i input -vn -c copy -f segment -segment_time 5 aud_%d.m4a
Re-encode the offending segment, say aud_2.m4a to noaud_2.m4a.
Now create a text file, list.txt:
file aud_0.m4a
file aud_1.m4a
file noaud_2.m4a
file aud_3.m4a
and run
ffmpeg -an -i input -f concat -safe 0 -i list.txt -c copy new.mp4
Here is my plan visualized:
# original video
| video |
| audio |
# cut video into 3 parts. Mute the middle part.
| video | | video | | video |
| audio | | - | | audio |
# concatenate the 3 parts
| video | video | video |
| audio | - | audio |
# mux uncut original video with audio from concatenated video
| video |
| audio | - | audio |
Let's do this.
Store filename:
i="fridayafternext_http.mp4"
To mute the line "What the hell are you doing in my house!?", the silence should start at second 34 with a duration of 2 seconds.
Store all that for your convenience:
mute_starttime=34
mute_duration=2
bash supports simple arithmetic, so we can automatically calculate the start time where the audio resumes, which is 36 of course:
rest_starttime=$(( mute_starttime + mute_duration ))
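A minimal check of that arithmetic, with the variable names defined above:

```shell
# Silence starts at 34 s and lasts 2 s, so the audio resumes at 36 s.
mute_starttime=34
mute_duration=2
rest_starttime=$(( mute_starttime + mute_duration ))
echo "$rest_starttime"
# prints 36
```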
Create all 3 parts. Notice that for the 2nd part we use -an to mute the audio:
ffmpeg -i "$i" -c copy -t $mute_starttime start.mp4 && \
ffmpeg -i "$i" -ss $mute_starttime -c copy -an -t ${mute_duration} muted.mp4 && \
ffmpeg -i "$i" -ss $rest_starttime -c copy rest.mp4
Create concat_videos.txt with the following text:
file 'start.mp4'
file 'muted.mp4'
file 'rest.mp4'
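Rather than typing the list by hand, the same concat_videos.txt can be generated with printf, using the part names created above:

```shell
# Write one "file '<name>'" directive per part.
printf "file '%s'\n" start.mp4 muted.mp4 rest.mp4 > concat_videos.txt
cat concat_videos.txt
```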
Concat videos with the Concat demuxer:
ffmpeg -f concat -safe 0 -i concat_videos.txt -c copy muted_audio.mp4
Mux original video with new audio
ffmpeg -i "$i" -i "muted_audio.mp4" -map 0:v -map 1:a -c copy "${i}_partly_muted.mp4"
Note:
I've learned from Gyan's answer that you can do the last 2 steps in one go, which is really cool.
ffmpeg -an -i "$i" -f concat -safe 0 -i concat_videos.txt -c copy "${i}_partly_muted.mp4"

"sox --combine merge": how to limit the mixed output length to the shorter of the two inputs?

I am using SoX command line tool on Linux inside a Makefile to interleave two raw (float 32 bit) input audio files into one file:
make_combine:
sox \
--bits 32 --channels 1 --rate 48000 signal_1.f32 \
--bits 32 --channels 1 --rate 48000 signal_2.f32 \
--type raw --channels 2 --combine merge signal_mixed.f32
I ran into problems when signal_1 and signal_2 are of different lengths. How would I limit the mixed output to the shorter of the two inputs?
Use soxi -s to find the shortest file, e.g.:
samps=$(soxi -s signal_1.f32 signal_2.f32 | sort -n | head -n1)
Then use the trim effect to shorten the files, e.g. (untested):
sox --combine merge \
"| sox signal_1.f32 -p trim 0 ${samps}s" \
"| sox signal_2.f32 -p trim 0 ${samps}s" \
signal_mixed.f32
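If you'd rather think in seconds than samples, divide the sample count by the sample rate (48000 in this question). A quick sketch, with a hardcoded count standing in for the soxi -s result:

```shell
# 96000 samples at 48 kHz is exactly 2 seconds.
samps=96000
rate=48000
awk -v n="$samps" -v r="$rate" 'BEGIN{printf "%.3f\n", n/r}'
# prints 2.000
```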
Note: If you want me to test it, provide some sample data.

Extract AUDIO, manipulate & merge again

I'm using Spleeter to remove music from audio tracks.
My goal is to build a script that automates the process of extracting the audio from the video, running Spleeter on the extracted audio, and then merging the manipulated audio back into the video, replacing the original.
The main issue is that I don't have enough RAM to process the whole extracted audio. I need to split it into multiple pieces and run Spleeter on each piece.
Then concatenate the manipulated pieces together and merge the result back into the video.
Here's what I tried:
#!/bin/bash
cd ~/Desktop/Video-convert
# create audio from video
ffmpeg -i *.mp4 output.mp3
# Split the audio into pieces
ffmpeg -i output.mp3 -f segment -segment_time 120 -c copy output_%03d.mp3
# Execute Spleeter upon each sample
FILES=~/Desktop/Video-convert/*.mp3
for f in $FILES
do
spleeter separate -i $f -o output_vocal
done
# delete unneeded audios
rm *.mp3
cd output_vocal
# ===========================================================
# the problem starts here
# ===========================================================
# concatenate manipulated audios together
find . -name 'vocals.wav' -exec echo {} >> mylist.txt \;
ffmpeg -f concat -safe 0 -i mylist.txt -c copy vocal.mp3
mv vocal.mp3 ../
cd ../
# merge the audio back to video
ffmpeg -i *.mp4 -i vocal.mp3 \
-c:v copy -c:a aac -strict experimental \
-map 0:v:0 -map 1:a:0 vocal-vid.mp4
Everything works well until it's time to concatenate the audio pieces. Spleeter outputs the result as vocal.wav & accompaniment.wav within a sub-folder named after the audio file that was processed.
The File Tree looks like this:
output_vocal
- output_000
----- vocal.wav
----- accompaniment.wav
- output_001
----- vocal.wav
----- accompaniment.wav
- output_002
----- vocal.wav
----- accompaniment.wav
As you can see, the problem comes from the naming. My objective is to concatenate all the vocal.wav files into one mp3 audio,
and then merge the final vocal.mp3 audio with the *.mp4 video.
The only issue is working around the way Spleeter names its output files.
The problem you are experiencing is that ffmpeg's concat demuxer requires an input file that contains directives, rather than a naive file list.
Your find invocation creates a file like:
output_vocal/output_000/vocal.wav
output_vocal/output_001/vocal.wav
output_vocal/output_002/vocal.wav
whereas ffmpeg's concat demuxer really requires a file like:
file output_vocal/output_000/vocal.wav
file output_vocal/output_001/vocal.wav
file output_vocal/output_002/vocal.wav
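A naive list like the one produced by find can be turned into directive form by prefixing each line with "file "; the list is simulated with printf in this sketch:

```shell
# Prefix each path with "file " to satisfy the concat demuxer.
printf '%s\n' output_vocal/output_000/vocal.wav output_vocal/output_001/vocal.wav \
  | sed 's/^/file /'
```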
Also note that find does not necessarily return the files in alphabetical order, whereas you will most likely want to concatenate them in that order.
Finally, when concatenating the WAV files, you cannot use the copy codec to generate an MP3 file (since the WAV/RIFF codec is not MP3). But you don't need an intermediate MP3 file anyhow.
Here's an updated script that:
- uses a temporary directory for all intermediate files
- iterates over all mp4 files provided on the cmdline (rather than hardcoding the input directory)
- creates an "XXX_voc.mp4" file for each input file "XXX.mp4" (overwriting any existing files)
#!/bin/bash
for infile in "$@"
do
outfile=${infile%.mp4}_voc.mp4
# create a temp-directory to put our stuff to
TMPDIR=$(mktemp -d)
# create audio from video
ffmpeg -i "${infile}" "${TMPDIR}/output.mp3"
# Split the audio into pieces
ffmpeg -i "${TMPDIR}/output.mp3" -f segment -segment_time 120 -c copy "${TMPDIR}/output_%03d.mp3"
# Execute Spleeter upon each sample
find "${TMPDIR}" -maxdepth 1 -type f -name "output_*.mp3" \
-exec spleeter separate -i {} -o "${TMPDIR}/output_vocal" ";"
# find all 'vocal.wav' files generated by spleeter, sort them,
# prefix them with 'file ', and put them into output.txt
find "${TMPDIR}/output_vocal" -type f -name "vocal.wav" -print0 \
| sort -z \
| xargs -0 -I{} echo "file '{}'" \
> "${TMPDIR}/output.txt"
# concatenate the pieces into a single WAV file
ffmpeg -f concat -safe 0 -i "${TMPDIR}/output.txt" -c copy "${TMPDIR}/vocal.wav"
# merge the audio back to video
ffmpeg -y -i "${infile}" -i "${TMPDIR}/vocal.wav" \
-c:v copy -c:a aac -strict experimental \
-map 0:v:0 -map 1:a:0 "${outfile}"
rm -rf "${TMPDIR}"
done

Linux: How to combine multiple FLAC audio files into 1 file, with differing sample rates, but not changing pitch

I've looked everywhere to try to combine a bunch of FLAC files with differing sample rates into 1 file. What I've tried so far is:
ffmpeg concat with a wildcard:
ffmpeg -f concat -i <( for f in *.flac; do echo "file '$(pwd)/$f'"; done ) -safe 0 output.flac
For every filename (even if I change pwd to './' for relative paths), I get:
ffmpeg unsafe filename
regardless of the file's actual name.
I've tried sox:
sox *.flac output.flac
Which leads to:
sox FAIL sox: Input files must have the same sample-rate
I've even tried combining the two:
#!/usr/bin/env bash
set -eu
for i in *.flac *.ogg *.mp3
do
ffmpeg -i "$i" "$i.wav"
done
sox *.wav combined.wav
Same error as above.
Anyone have any tips? I'm sure that in some Windows program you can drag in 5 differing sound files and combine them with ease. Is there not a simple way to do this on linux cmdline?
-safe 0 is a private option of the concat demuxer, so it has to appear before the input, i.e. -f concat -safe 0 -i ...
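So the corrected invocation is ffmpeg -f concat -safe 0 -i list.txt output.flac. The list can also be built up front in a file instead of a process substitution; this sketch creates dummy .flac files purely to illustrate the list format:

```shell
# Create stand-in input files, then build the concat list the same
# way as in the question, writing it to list.txt.
touch a.flac b.flac
for f in *.flac; do printf "file '%s'\n" "$PWD/$f"; done > list.txt
cat list.txt
# The corrected ffmpeg call (note -safe 0 before -i) would then be:
#   ffmpeg -f concat -safe 0 -i list.txt output.flac
```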

pipe sox play command to stdout

So I'm currently trying to stream my microphone input from my Raspberry Pi (Raspbian) to some sort of network stream, in order to receive it later on my phone.
To do this I use arecord -D plughw:1,0 -f dat -r 44100 to pipe the sound stream from my USB microphone to stdout, which works fine as far as I can see, but I need it to be a bit louder so I can understand people standing far away from it.
So I piped it to the sox play command like this:
arecord -D plughw:1,0 -f dat -r 44100 | play -t raw -b 16 -e signed -c 2 -v 7 -r 44100 - test.wav
(test.wav is just some random wav file; it doesn't work without it, and there is whitespace between the - after 44100 and test.wav because I think - is a separate parameter:
SPECIAL FILENAMES (infile, outfile):
- Pipe/redirect input/output (stdin/stdout); may need -t
-d, --default-device Use the default audio device (where available))
I figured out that with the -v parameter I can increase the volume.
This plays the recorded stream through the speakers I connected to the Raspberry Pi 3.
Final goal: pipe the volume-increased sound stream to stdout (or some FIFO pipe file) so I can read it from stdin inside another script and send it to my phone.
However, I'm very confused by the man page of the play command: http://sox.sourceforge.net/sox.html
I need to select the output device to be a pipe or stdout or something.
If you know a better way to just increase the volume of the (I think) "Recording WAVE 'stdin': Signed 16 bit Little Endian, Rate 44100 Hz, Stereo" sound stream, let me know.
As far as I'm aware you can't pipe the output from play, you'll have to use the regular sox command for that.
For example:
# example sound file
sox -n -r 48k -b 16 test16.wav synth 2 sine 200 gain -9 fade t 0 0 0.1
# redundant piping
sox test16.wav -t wav - | sox -t wav - -t wav - gain 8 | play -
In the case of the command in your question it should be sufficient to change play to sox and add -t wav to let sox know in what format you want to pipe the sound.
arecord -D plughw:1,0 -f dat -r 44100 | \
sox -t raw -b 16 -e signed -c 2 -v 7 -r 44100 - -t wav -