I am looking for a way to "hear" a bat.
I have a 192 kHz sound recording of a bat and want to hear it, so "transform" it into a 0-12 kHz recording?
I saw what I thought might be similar:
change pitch of multiple audio files with Sox
And tried using something like:
1200 * log(12/192) / log(2) == -4800
sox 331817.flac 331817_warp.wav pitch -4800
(The pitch effect takes its argument in cents; 12/192 is a factor of 1/16, i.e. four octaves, or -4800 cents.)
You can see the whole spectrogram (192 kHz) here:
sox 331817.flac -n rate 192.0k spectrogram -l -m -X 160 -z 95 -Z 0 -r -Y 257 -o spectro.png
You can see my warped spectrogram here:
sox 331817_warp.wav -n rate 12.0k spectrogram -l -m -X 160 -z 95 -Z 0 -r -Y 257 -o spectro_warp.png
Any help would be appreciated.
Here's a video which encouraged me that it's possible:
https://www.youtube.com/watch?v=qJOloliWvB8
Not really a programming question, but intriguing nevertheless, so here's my two cents...
Try speed -4800c; it lowers both pitch and tempo. This is the least intrusive way of lowering pitch, as it does not need to resample the sound. It will make the entire sound fragment a factor of 16 longer (4800 cents is four octaves, and 2^4 = 16), so take your time listening to it. Trim it down if possible; I suspect this is also what they did in the video.
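For example, with the file from the question (a sketch; the output name is arbitrary):
sox 331817.flac 331817_slow.wav speed -4800c
Every frequency is divided by 16, so a 48 kHz call comes out at 3 kHz, comfortably within hearing range.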
Keep in mind that even a sample rate of 192 kHz may not be enough to accurately capture the full spectrum of a bat's voice. The Nyquist frequency is half the sample rate, so any content above 96 kHz will alias rather than be captured faithfully. No post-processing is going to fix that.
I do the cut via:
ffmpeg -i long_clip.mp4 -ss 00:00:10.0 -c copy -t 00:00:04.0 short_clip.mp4
I need to know the precise time where ffmpeg makes the cut (the time of the closest keyframe before 00:00:10.0).
Currently, I'm using the following ffprobe command to list all the keyframes and select the closest before 00:00:10.0
ffprobe -show_frames -skip_frame nokey long_clip.mp4
It works extremely slowly (I run it on a Jetson Nano, and it takes a few minutes to list the keyframes of a 30-second video, although the cutting itself is done in 0.2 seconds).
I hope there is a much faster way to learn the time of that keyframe, not least because ffmpeg itself seeks to it and cuts the video in less than half a second.
So, in other words, the question is: how do I get the time of the keyframe where ffmpeg makes the cut without listing all the keyframes?
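For reference, the listing can at least be trimmed down to just the keyframe timestamps (this shrinks the output, but the slow frame scan remains):
ffprobe -v error -select_streams v:0 -skip_frame nokey -show_entries frame=pts_time -of csv=p=0 long_clip.mp4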
I think this is not possible. The most information you can get out of a program is obtained at its debug verbosity level. For ffmpeg I used
ffmpeg -v debug -i "Princess Chelsea - Frack.mp4" -ss 00:03:00.600 -c copy -to 00:03:03.800 3.mkv 2> out.txt
One has to redirect the output, because at debug verbosity there is too much of it to fit in the terminal.
Unfortunately, it gives only some cryptic/internal messages, like
Automatically inserted bitstream filter 'vp9_superframe'; args=''
[matroska @ 0x55987904cac0] Starting new cluster with timestamp 5 at offset 885 bytes
[matroska @ 0x55987904cac0] Writing block of size 375 with pts 5, dts 5, duration 23 at relative offset 9 in cluster at offset 885. TrackNumber 2, keyframe 1
With less verbosity it gives less information, so I think this is not possible. However, what is your actual question? Maybe you need something other than just knowing the time of the cuts?
For those who are looking for how to actually cut at the proper time (as I was): do not stream-copy with -c copy; decode and re-encode the video anew.
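A sketch of such a frame-accurate cut, reusing the timings from the question (the codec choices are my assumption, not a requirement):
ffmpeg -ss 00:00:10.0 -i long_clip.mp4 -t 00:00:04.0 -c:v libx264 -c:a aac short_clip.mp4
Because the video is decoded, ffmpeg can start the output exactly at 00:00:10.0 instead of at the preceding keyframe.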
I'm using ffmpeg to decode and encode an audio signal. It works perfectly, and I have added filters. For example, I'm using a command such as:
ffmpeg -re -i /home/dr_click/live.wav -af "anequalizer=c0 f=200 w=100 g=-5 t=0|c1 f=200 w=100 g=-5 t=0, anequalizer=c0 f=1000 w=100 g=3 t=0|c1 f=1000 w=100 g=3 t=0" -acodec pcm_s16be -ar 44100 -ac 2 -f rtp rtp://127.0.0.1:1234
I'm streaming my file and adding two filters with 200 Hz and 1000 Hz as the central frequencies and 100 Hz width, and it works.
With such a filter, I know the gain will be -5 dB at 200 Hz. But what is the gain at 250 Hz? Still -5 dB? -4.5 dB? -3 dB? The same question goes for 350 Hz or any other frequency.
What I'm looking for, and haven't found, is a way to get the frequency response of such a filter over a bandwidth from 20 Hz to 20 kHz. In other words, what I'd like to know for any frequency is: gain = f(frequency) for a given ffmpeg filter.
Thank you for your help,
Dr_Click
I'm working on a quite similar issue. Mine is to replace the system-wide 15-band graphical LADSPA equalizer (mbeq_1197, controlled by JACK Rack) with an ffmpeg filter. As it is, as far as I know, impossible to adjust ffmpeg filter parameters at runtime, I have to rely on my already generated JACK EQ settings and transfer them to the ffmpeg EQ. Alas, I could not find any two directly comparable EQs: ffmpeg only offers an 18-band "superequalizer", while my previous EQ has 15 bands, so I decided to do some interpolation and compare the frequency responses of the old and the new EQ.
Now to answer your question: I'm not an audio engineer, and I'm sure there are more professional ways, but this is my current workflow:
Generate some white noise. On Linux you can use e.g. SoX or Audacity. In Audacity, do Generate -> Built-in -> Noise... => White noise (1 min should be enough).
Save the file as WAV.
Apply your filter to this WAV: ffmpeg -i whitenoise.wav -af "<your filter>" whitenoise_filtered.wav
Load the filtered file into Audacity and do Analyze -> Plot Spectrum...
The output will be a little scattered because the white noise is not perfect, but this should be negligible. A command-line sketch of the same workflow follows below.
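Here is the workflow as commands (a sketch; the file names and the one-minute duration are arbitrary, and SoX's synth stands in for Audacity's noise generator):
sox -n -r 44100 -c 2 whitenoise.wav synth 60 whitenoise
ffmpeg -i whitenoise.wav -af "<your filter>" whitenoise_filtered.wav
sox whitenoise_filtered.wav -n spectrogram -o response.png
The spectrogram is only a rough substitute for Audacity's Plot Spectrum, but it makes the shape of the response visible at a glance.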
Good luck!
Flittermice
I use the following code to trim, pipe and concatenate my audio files.
sox "|sox audio.wav -p trim 0.000 =15.000" "|sox audio.wav -p trim 15.000" concatenated.wav
One would expect concatenated.wav to sound identical to audio.wav.
However, when both files are played simultaneously, there is a distinct audio shift in concatenated.wav.
Normally this error would be acceptable, as it is in the millisecond range. However, as the number of pipes increases (say, to more than 100), the audio shift grows substantially.
What is the correct method to trim, pipe and concatenate audio files using SoX to prevent this error?
Edit 1: Samples were used instead of milliseconds. The same problem persists.
The following code was used:
sox "|sox audio.wav -p trim 0s =661500s" "|sox audio.wav -p trim 661500s" concatenated.wav
The WAV file's sample rate is 44100 Hz and the sample size is 16 bit.
SoX 14.4.2 was used.
The problem is that sox may lose a few samples at the cut point of the trim command.
I had a similar problem and solved it by cutting not by milliseconds, but by samples, which of course depend on the sample rate.
If your cut points fall on exact sample boundaries, you will no longer lose samples, and the combined parts will have exactly the same length as the original.
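To check that nothing is lost, compare the sample counts of the original and the concatenated file (soxi ships with SoX; the file names follow the question):
soxi -s audio.wav concatenated.wav
The two counts should match.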
I've licensed some audio clips, but some of them come with what I have learned is a "DC Offset" that should normally have been removed during production.
Audacity's "Normalize" filter is able to fix a static DC offset, but after applying it to my audio clips, I noticed that their DC offset varies (within 0.5 seconds it can go from 0.05 to 0.03 on a normalized amplitude scale). For example, in one clip's waveform: to the left, silence sits at 0.02; to the right, it is at 0.00. This is after normalization by Audacity.
With me not being an audio engineer and not having any professional tools, is there a way to fix this?
A DC offset is a frequency component at 0 Hz. The "wandering DC offset" will be made of very low frequency components, so you should be able to remove this by using a high-pass filter with a cutoff of around 15 Hz. That way, you'll remove any sub-sonic DC related stuff without altering the audible frequency range.
Use a filter with a steep rolloff. Seeing as you're doing this offline, you can use a simple IIR type and filter the signal in both forward and reverse directions to remove any phase distortion that would otherwise be imposed by the filtering.
If you use MATLAB, the operation would look something like this (wavread has been removed from recent releases; audioread is its replacement):
% read the clip, design an 8th-order Butterworth high-pass at 15 Hz,
% and run it forward and backward for zero phase distortion
[x, fs] = audioread('myfile.wav');
[b, a] = butter(8, 15/(fs/2), 'high');
y = filtfilt(b, a, x);
audiowrite('myfile_filtered.wav', y, fs);
From the command line, you can have a try with sox.
sox fileIn.wav fileOut.wav highpass 10
This applies a high-pass filter at a frequency of 10 Hz.
It should remove the DC offset (though maybe not at the very beginning of the file).
See the sox manual for a little more information (but not much more).
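If the default slope is not steep enough, the effect can simply be chained; each highpass is a second-order section by default, so this sketch doubles the rolloff:
sox fileIn.wav fileOut.wav highpass 10 highpass 10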
As @learnvst explains in his answer, what looks like "wandering DC offset" is actually just content at very low frequencies. You can remove this LF content with a high pass filter. Since frequencies below 20 Hz are generally inaudible, you should be able to take out the "wandering DC" without actually changing how the file sounds.
The latest version of Audacity (2.0.5) includes a high pass filter. Select Effect > High pass filter ... and adjust the cutoff frequency and rolloff parameters. A cutoff of around 15 Hz and a rolloff of 6 dB/oct should do the trick.
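The measurement and the fix can also be scripted entirely with ffmpeg: the astats filter reports the overall DC offset, and the dcshift filter cancels it. The loop below rewrites every WAV in the current directory (it assumes a writable /tmp and overwrites the originals):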
for f in *.wav; do
  mv "$f" /tmp/dc1.wav
  # measure the overall DC offset with ffmpeg's astats filter
  dc=$(ffprobe -f lavfi "amovie=/tmp/dc1.wav,astats=metadata=1" 2>&1 | sed '/Overall/,$!d' | grep DC | awk '{ print $6 }')
  # negate it so dcshift moves the signal back toward zero
  dc=$(echo "$dc * -1" | bc)
  echo "dcshift" "$dc"
  # rewrite the file with the offset removed; limitergain=0.02
  # engages dcshift's limiter to avoid clipping
  ffmpeg -hide_banner -loglevel error -y -i "/tmp/dc1.wav" -af "dcshift=$dc:limitergain=0.02" "$f"
done
I posted this as comments under this related thread. However, they seem to have gone unnoticed =(
I've used
ffmpeg -i myfile.avi -f image2 image-%05d.bmp
to split myfile.avi into frames stored as .bmp files. It seemed to work, except not quite. I recorded my video at a rate of 1000 fps, and the video turned out to be 2 min 29 sec long. If my math is correct, that should amount to a total of 149,000 frames for the entire video. However, when I ran
ffmpeg -i myfile.avi -f image2 image-%05d.bmp
I only obtained 4472 files. How can I get the original 149k frames?
I also tried to convert the frame rate of my original AVI to 1000fps by doing
ffmpeg -i myfile.avi -r 1000 otherfile.avi
but this didn't seem to fix my concern.
ffmpeg -i myfile.avi -r 1000 -f image2 image-%07d.png
I am not sure outputting 150k bmp files will be a good idea. Perhaps png is good enough?
Part one of your math is good: 2 minutes and 29 seconds is 149 seconds, and at 1000 fps that makes 149,000 frames. However, your output filename has only 5 positions for the number, while 149000 has 6 positions, so try "image-%06d.bmp".
Then there is disk space: do your images fit on the disk? BMP images are stored uncompressed. You might try JPEG pictures instead; they compress about 10 times better.
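Combining both points, something like this should work (a sketch):
ffmpeg -i myfile.avi -f image2 image-%06d.jpg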
Another idea: if ffmpeg does not find a (reasonable) frame rate, it falls back to 25 or 30 frames per second. You might need to specify it explicitly. Do so for both source and target; see the man page (man ffmpeg on Unix):
To force the frame rate of the input file (valid for raw formats
only) to 1 fps and the frame rate of the output file to 24 fps:
ffmpeg -r 1 -i input.m2v -r 24 output.avi
For what it's worth: I use ffmpeg -y -i "video.mpg" -sameq "video.%04d.jpg" to split my video into pictures. The -sameq forces the JPEG to a reasonable quality; the -y avoids overwrite questions. For you:
ffmpeg -y -r 1000 -i "myfile.avi" -sameq "image.%06d.jpg"
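Note that -sameq (really "same quantizer", not "same quality") has been removed from later ffmpeg releases; on a current build, -q:v 2 is a close substitute for high-quality JPEG output:
ffmpeg -y -r 1000 -i "myfile.avi" -q:v 2 "image.%06d.jpg"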
I think there is a misconception here: the output of a high-speed video system is unlikely to have an output frame rate of 1000 fps; it will be something rather normal such as 30 (or 50/60) fps. Apart from overloading most video players, it would be counterproductive to show the sequence at the same speed at which it was recorded.
Basically: 1 sec @ 1000 fps of input becomes something like 33 sec @ 30 fps of output.
Was the duration of the recorded scene really 2:29 min (resulting in a video of ~82 min at normal rate), or did it last about 4.5 seconds (4472 frames), which is 2:29 min in normal playback (4472 / 30 ≈ 149 s)?
I tried this in an Ubuntu 18.04 terminal:
ffmpeg -i input_video.avi output_frame_path_images%05d.png
where -i specifies the input file.