LAME -- decoding and encoding audio file - audio

I used lame to decode an mp3 file to a raw pcm file, and to encode raw pcm back to mp3.
My question: when I take one test.0.pcm file and encode and decode it over and over (generating 0.mp3, 1.mp3, 2.mp3, ... and test.1.pcm, test.2.pcm, ...), the size of all the .pcm files and of all the .mp3 files stays the same, but the contents differ. I listened to these audio files and found that 99.mp3 is much quieter than 1.mp3.
The script I use is the following:
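#!/bin/bash
# Encode and decode the same audio $1 times, keeping every generation.
num=$1
last=0
now=1
for ((i = 0; i < num; i++)); do
    # raw 16 kHz mono PCM -> 64 kbps MP3
    lame -r -b 64 -s 16000 -m m "test.$last.pcm" "$last.mp3"
    # MP3 -> PCM for the next generation (-t asks LAME not to write a WAV header)
    lame --decode --mp3input -t -m m -s 16000 "$last.mp3" "test.$now.pcm"
    last=$now
    now=$((now + 1))
done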
The original test.0.pcm has only 1 channel and a sampling frequency of 16 kHz.
Some of the logs follow; they are all the same except for ReplayGain:
input: 97.mp3 (16 kHz, 1 channel, MPEG-2 Layer III)
output: test.98.pcm (16 bit, Microsoft WAVE)
skipping initial 1105 samples (encoder+decoder delay)
skipping final 47 samples (encoder padding-decoder delay)
Frame# 49/49 64 kbps
Assuming raw pcm input file
LAME 3.100 64bits (http://lame.sf.net)
polyphase lowpass filter disabled
Encoding test.98.pcm to 98.mp3
Encoding as 16 kHz single-ch MPEG-2 Layer III (4x) 64 kbps qval=3
Frame | CPU time/estim | REAL time/estim | play/CPU | ETA
49/49 (100%)| 0:00/ 0:00| 0:00/ 0:00| 88.200x| 0:00
----------------------------------------------------------------------------------------------------------
kbps mono % long %
64.0 100.0 100.0
Writing LAME Tag...done
ReplayGain: +46.1dB
I noticed that ReplayGain increases constantly, but I know nothing about how mp3 encoding works, so I am not sure whether this is the cause.

MP3 is a lossy codec. You're going to lose quality each time you encode another generation.
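To check the level drop by measurement rather than by ear, one option is to print the peak sample value of each decoded generation. This is only a minimal sketch: it assumes the .pcm files are headerless 16-bit signed mono samples whose byte order matches the host (little-endian on x86), and pcm_peak is a hypothetical helper name, not part of LAME.

```shell
#!/bin/bash
# Print the largest absolute sample value in a raw 16-bit signed PCM file.
# Assumption: sample byte order matches the host's byte order.
pcm_peak() {
    LC_ALL=C od -An -v -t d2 "$1" | awk '
        { for (i = 1; i <= NF; i++) { v = ($i < 0) ? -$i : $i; if (v > max) max = v } }
        END { print max + 0 }'
}

# Report the peak of every generation found in the current directory.
for f in test.*.pcm; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when no file matches
    echo "$f $(pcm_peak "$f")"
done
```

If the peak shrinks steadily from test.0.pcm onwards, the loss of volume is in the samples themselves and not just in the ReplayGain tag.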

Related

FFmpeg - Crossfading inputs with a duration < 1s creates an empty output

I am trying to crossfade a silent input with a music track to delay the moment when the music starts to play.
I built the command using fluent-ffmpeg so I could choose the duration of the silent input through my program. The duration of the crossfade is calculated according to the duration of the 2 inputs, and equals 0 if one of them is too short.
Below is an example of the resulting command:
ffmpeg -f lavfi -i anullsrc=r=44100 -i music.mp3 -y -filter_complex '[0]atrim=duration=0.28[atrim_0];[atrim_0][1]acrossfade=d=0:c1=tri:c2=tri[final]' -map '[final]' output.mp3
However, this command creates an empty output file when the silent input is shorter than 1 second, regardless of which music input follows. Using the same command with a trim duration > 1 second creates a valid output with the silence and the music.
I have tried to look through the FFmpeg debug report but couldn't really see what was wrong.
Below is an excerpt of the debug log report:
Input file #0 (anullsrc=r=44100):
Input stream #0:0 (audio): 14 packets read (28672 bytes); 14 frames decoded (14336 samples);
Total: 14 packets (28672 bytes) demuxed
Input file #1 (music.mp3):
Input stream #1:0 (audio): 504 packets read (210651 bytes); 504 frames decoded (578372 samples);
Total: 504 packets (210651 bytes) demuxed
Output file #0 (output.mp3):
Output stream #0:0 (audio): 0 frames encoded (0 samples); 0 packets muxed (0 bytes);
Total: 0 packets (0 bytes) muxed
Any idea what could cause this?
PS: I am using FFmpeg 4.4; the same command with FFmpeg 4.2 led to a segmentation fault. I don't know whether that helps.
acrossfade accepts the crossfade duration through two mutually exclusive options: nb_samples (default: 44100) and duration (default: 0). When the latter isn't set, the former is used. Because duration defaults to 0, passing d=0 is the same as leaving it unset, so in your command acrossfade uses a crossfade of 44100 samples, i.e. 1 second at 44.1 kHz. The filter needs both inputs to be at least as long as the crossfade duration.
However, in your case, it seems you just want to do two things: fade in the audio and maybe delay it. Just use afade for that.
ffmpeg -i music.mp3 -y -af afade=d=1:curve=tri,adelay=0.28s:all=1 output.mp3
This will fade-in the music over one second and delay the start by 0.28s.
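The length requirement on the acrossfade inputs can be checked with a little arithmetic before the command is built. The sketch below is only an illustration: long_enough_for_crossfade is a hypothetical helper, and the numbers are the ones from the question (0.28 s of silence at 44100 Hz against the default nb_samples of 44100).

```shell
#!/bin/bash
# Succeed if an input of the given duration holds at least NB_SAMPLES samples.
# Usage: long_enough_for_crossfade DURATION_SECONDS SAMPLE_RATE NB_SAMPLES
long_enough_for_crossfade() {
    awk -v d="$1" -v r="$2" -v n="$3" 'BEGIN { exit !(d * r >= n) }'
}

# Values from the question: the trimmed silence versus the default crossfade length.
if long_enough_for_crossfade 0.28 44100 44100; then
    echo "silence is long enough"
else
    echo "silence is too short for the default crossfade"
fi
```

With these values the trimmed silence holds about 12348 samples, well under 44100, which matches the empty output seen for any trim shorter than 1 second.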

Prevent SoX from doing the clipping?

I have a heavily "overflowed" WAV file (samples written in 32-bit float format): instead of [-1.0,+1.0], the sample range extends as far as [-5.0,+5.0].
Using SoX to get raw PCM audio samples from WAV file:
sox --bits 32 --channels 1 --encoding floating-point --rate 48000 input.wav output.raw
I get warnings:
sox WARN sox: `input.wav' input clipped 1163400 samples
sox WARN sox: `output.raw' output clipped 605664 samples; decrease volume?
When I look at the output, I see that the samples were clipped and the range is now [-1.0,+1.0]. That is not what I want.
I would like the output to be exactly the same as the input, just in a different format (RAW instead of WAV). I need to use a command-line tool for the task. Is there a way to prevent SoX from clipping?
I got the answer from the SoX mailing list.
The required behaviour isn't possible in general, but in this specific case there is a workaround. If the input sample encoding is overridden as 32-bit signed integer instead of float, the values are copied untouched; since the output is headerless, it doesn't matter what SoX thinks the samples are.
So, this works:
sox --bits 32 --channels 1 --encoding signed-integer --rate 48000 input.wav output.raw
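One way to confirm that the out-of-range samples survived is to read output.raw back as 32-bit floats and print the largest absolute value. This is a rough sketch, not something from the mailing list: float_peak is a hypothetical helper, and it assumes the file's byte order matches the host's (little-endian on x86).

```shell
#!/bin/bash
# Print the largest absolute sample value in a raw 32-bit float file.
# Assumption: the file's byte order matches the host's byte order.
float_peak() {
    LC_ALL=C od -An -v -t f4 "$1" | awk '
        { for (i = 1; i <= NF; i++) { v = ($i < 0) ? -$i : $i; if (v > max) max = v } }
        END { print max + 0 }'
}

# Check the converted file if it is present.
if [ -e output.raw ]; then
    float_peak output.raw
fi
```

A result above 1.0 means the samples were copied without clipping; a result of exactly 1 suggests they were clipped along the way.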

Piping LAME and SOX together in a shell script. Is it possible?

I'm using the following in a shell script :
## For ease of understanding I'll set $1 here, even though it actually arrives remotely via an ssh2_exec command.
set -- "my.mp3"
lame --decode "/root/incoming/shows/$1" - | /root/incoming/stereo_tool_cmd_64 - - -s /usr/incoming/settings/setting.sts | lame -b 128 - "/root/incoming/processing/$1"
So what is happening? LAME decodes the mp3 file to wav, then it is piped to STEREO TOOL (audio processing script), then back to LAME where it's re-encoded as an mp3 file and the result is written to a different directory.
This all works great, but I want to use SOX in this pipe to remove all silence from the start and end of the file while it's in its decoded wav state, before it reaches STEREO TOOL.
I've tried this but it doesn't work (SOX breaks the pipe) :
lame --decode /root/incoming/shows/$1 - | sox silence 1 0.1 0.1% reverse silence 1 0.1 0.1% reverse | /root/incoming/stereo_tool_cmd_64 - - -s /usr/incoming/settings/setting.sts | lame -b 128 - /root/incoming/processing/$1;
I know the standard way to use SOX would be :
sox input.wav output.wav silence 1 0.1 0.1% reverse silence 1 0.1 0.1% reverse;
But I can't declare the wav file here as it's being created on the fly by LAME.
Is what I'm trying to do impossible or is there a solution that will allow me to do this?
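A possible approach, offered as a sketch and not as a tested fix: SoX accepts - as a filename for standard input or standard output, but since there is then no extension to guess the format from, each - needs an explicit -t wav. The function names trim_silence and process_show are made up for illustration; the paths and effect arguments are the ones from the question. One caveat is that SoX cannot seek back on a pipe to fix the WAV header length, so whether the next program in the pipe copes with that is an assumption here.

```shell
#!/bin/bash
# Trim leading and trailing silence from a WAV stream on stdin, writing WAV to stdout.
# "-t wav -" tells SoX to treat stdin and stdout as WAV data.
trim_silence() {
    sox -t wav - -t wav - silence 1 0.1 0.1% reverse silence 1 0.1 0.1% reverse
}

# Decode, trim, process and re-encode one show; $1 is the mp3 file name.
process_show() {
    lame --decode "/root/incoming/shows/$1" - \
        | trim_silence \
        | /root/incoming/stereo_tool_cmd_64 - - -s /usr/incoming/settings/setting.sts \
        | lame -b 128 - "/root/incoming/processing/$1"
}
```

Calling process_show "my.mp3" would then run the whole chain. The reverse effect has to read the entire stream before it can output anything, so SoX buffers the audio at that step.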

Sox conversion from .au to .wav

I have a file, sound.au, which file describes as Sun/NeXT audio data: 8-bit ISDN mu-law, mono, 8000 Hz. I'd like to convert this to a WAV that file would describe as RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 8000 Hz. However, I cannot find the right set of arguments to make this conversion and hear what it sounds like.
Has anyone performed this conversion or similar before? sox -t auto -w -s -r 8000 -c sound.au sound.wav gets me close, but it's G711 mu-law, not 16 bit PCM.
Thanks.
I don't have an .au file to try, but I suspect sox sound.au -e signed-integer sound.wav would work. You are only trying to change the encoding from u-law to PCM, right? sox should pick up all the necessary input info from the .au header. If it doesn't, maybe you need sox -t auto sound.au -e signed-integer sound.wav.

Big file conversion .ogv to .avi ogv2avi on Ubuntu

I'm trying to convert a 200MB .ogv file to .avi with a script I found online:
#!/bin/bash
# ogv to avi
# Call this with multiple arguments
# for example : ls *.{ogv,OGV} | xargs ogv2avi
N=$#
echo "Converting $N files !"
for f in "$@"
do
    echo "converting $f"
    # Strip the extension to build the output name
    filename=${f%.*}
    mencoder "$f" -ovc xvid -oac mp3lame -xvidencopts pass=1 -o "$filename.avi"
done
After this, all I have to do is run $ ogv2avi name_of_file.ogv
and it creates the converted .avi file.
It works great for small files, but it seems to crash on big files: I only get roughly the first 3 minutes of the 30-minute recording.
Too many audio packets in the buffer: (4096 in 850860 bytes).
Maybe you are playing a non-interleaved stream/file or the codec failed?
For AVI files, try to force non-interleaved mode with the -ni option.
Flushing video frames.
Writing index...
Writing header...
ODML: vprp aspect is 16384:10142.
Setting audio delay to 0.078s.
Video stream: 784.308 kbit/s (98038 B/s) size: 21254748 bytes 216.800 secs 3000 frames
Audio stream: 87.341 kbit/s (10917 B/s) size: 2372536 bytes 217.313 secs
I had the exact same problem, and the only way I got around it (a sloppy solution, but it works) was to play the .ogv video on the Ubuntu desktop and record the area where the video is displayed with a desktop recorder that doesn't produce .ogv files (I recommend Kazam, which produces .webm files). Then use Audacity to edit the audio of the output video if necessary, and mix the edited audio with the output video using MkvMerge.
