Post code that plays a Christmas tune [closed]

Is there a way to play a Christmas tune on a PC or Mac without a pre-recorded sound file (no .mp3, .wav, or other sound file)?
I remember that my TI-99/4A and Apple II could play sounds resembling music. I'm not sure whether modern computers still have these abilities (aside from beep).

"Jingle Bells" in java (bloated as usual), using JFugue, with tubular bells and xylophones (polyphonic!):
import org.jfugue.*;
public class JingleBells
{
public static void main(String[] args)
{
Player player = new Player();
player.play("T170 "+
" V0 I[XYLOPHONE] C4q C4q C3h C4q C4q C3h C3q B3q A3q G3q C4h "+
" V1 I[TUBULAR_BELLS] E5q E5q E5h E5q E5q E5h E5q G5q C5q D5q Eqh "+
" V2 I[XYLOPHONE] G3h G2q G3q G3h G3h");
}
}

Speaking of "as bad as beep": if you have beep installed on your Linux box, you can run the following shell script (in the same vein as Jeremy Ruten's answer):
#!/bin/sh
# Opening phrase of "Jingle Bells": E E E / E E E / E G C D E
# play_note takes a frequency in Hz and a duration in ms, with a 50 ms gap after.
play_note() {
    beep -f "$1" -l "$2"
    sleep 0.05
}
play_note 659 400
play_note 659 400
play_note 659 800
play_note 659 400
play_note 659 400
play_note 659 800
play_note 659 400
play_note 783 400
play_note 523 400
play_note 587 400
beep -f 659 -l 800

Yes, you can play MIDI.
MIDI doesn't encode sounds per se; it encodes the information used to play music: pitch, tone, intensity, etc.
There is a C# MIDI toolkit on CodeProject: http://www.codeproject.com/KB/audio-video/MIDIToolkit.aspx
The quality of the sound depends entirely on the MIDI device used to play it, so it will vary from computer to computer.
You can find a nice list of Christmas MIDI files at http://www.lockergnome.com/midi/
Windows Media Player can play MIDI files, as can QuickTime (I believe).
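If you just want to hear one of those MIDI files from a Linux shell, a software synthesizer such as TiMidity++ can render it without any special hardware (assuming the timidity package is installed; jingle_bells.mid stands for whichever file you downloaded):
timidity jingle_bells.mid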

PLAY "e4 e4 e2 e4 e4 e2 e4 g4 c4 d4 e2"

What about generating PCM data on the fly? PCM (Pulse Code Modulated) sound is just a stream of samples of the voltage across an analog sound system.
Think about a speaker. As sound plays, it vibrates. What if you took a ruler and measured the position of the speaker cone at a rate faster than the frequency of the sound? You would get a picture of the waveform. That's exactly what PCM data looks like, with each measurement stored as an 8- or 16-bit integer. The sampling frequency, say 44.1 kHz, is the number of samples per second. CDs use a 44.1 kHz sampling frequency and 16-bit samples.
DirectSound (on Windows) and OpenAL (cross-platform) are two libraries you can use to play buffers full of PCM data. I've used DirectSound in the past, not to play data but to read data in from the microphone to get the volume level.
If you want to create PCM samples for a certain note, you just calculate the frequency (here's a table) and put a sine wave in your buffer. You can mix different frequencies together just by adding them (make sure the sum stays below the maximum amplitude, to avoid clipping).
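To make that concrete, here is a minimal sketch of on-the-fly PCM on Linux, assuming perl and ALSA's aplay (from alsa-utils) are available; it generates one second of two mixed sine waves (E5 and C5), scaled so their sum stays below full scale:
perl -e '
  $rate = 44100;                                  # CD-style sample rate
  for $i (0 .. $rate - 1) {                       # one second of audio
      $t = $i / $rate;
      # mix E5 (659 Hz) and C5 (523 Hz); 0.4 + 0.4 < 1.0, so no clipping
      $s = 0.4 * sin(6.2831853 * 659 * $t) + 0.4 * sin(6.2831853 * 523 * $t);
      print pack("s<", int(32767 * $s));          # signed 16-bit little-endian
  }
' | aplay -t raw -f S16_LE -c 1 -r 44100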

MIDI is an option, although on a PC it usually sounds almost as bad as beep.

Related

How to warp/shift a pitch so I can hear a bat

I am looking for a way to "hear" a bat.
I have a 192 kHz sound recording of a bat and want to hear it. Can I "transform" it into a 0-12 kHz recording?
I saw what I thought might be similar:
change pitch of multiple audio files with Sox
And tried using something like:
1200 * log(12/192) / log(2) == -4800
sox 331817.flac 331817_warp.wav pitch -4800
You can see the whole spectrogram (192 kHz) here:
sox 331817.flac -n rate 192.0k spectrogram -l -m -X 160 -z 95 -Z 0 -r -Y 257 -o spectro.png
You can see my warped spectrogram here:
sox 331817_warp.wav -n rate 12.0k spectrogram -l -m -X 160 -z 95 -Z 0 -r -Y 257 -o spectro_warp.png
Any help would be appreciated.
Here's a video which convinced me it's possible:
https://www.youtube.com/watch?v=qJOloliWvB8
Not really a programming question, but intriguing nevertheless, so here's my two cents...
Try speed -4800c; it lowers both pitch and tempo. This is the least intrusive way of lowering pitch, as it does not need to resample the sound. It will make the entire sound fragment a factor of 16 longer, so take your time listening to it. Trim it down if possible; I suspect this is also what they did in the video.
Keep in mind that even a sample rate of 192 kHz may not be enough to accurately capture the full spectrum of a bat's voice. The Nyquist frequency is half the sample rate, so any audio above 96 kHz will be distorted. No post-processing is going to fix that.
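Applied to the file names from the question, that would be something like this (the "c" suffix on speed means cents, so -4800c is four octaves down, the same 16x ratio as 192/12):
sox 331817.flac 331817_slow.wav speed -4800c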

Frequency response of ffmpeg filters

I'm using ffmpeg to decode and encode a signal. It works perfectly, and I have added filters. For example, I'm using a command like this:
ffmpeg -re -i /home/dr_click/live.wav -af "anequalizer=c0 f=200 w=100 g=-5 t=0|c1 f=200 w=100 g=-5 t=0, anequalizer=c0 f=1000 w=100 g=3 t=0|c1 f=1000 w=100 g=3 t=0" -acodec pcm_s16be -ar 44100 -ac 2 -f rtp rtp://127.0.0.1:1234
I'm streaming my file, adding two filters with 200 Hz and 1000 Hz as central frequencies and 100 Hz width, and it works.
With such a filter, I know my gain will be -5 dB at 200 Hz. But what is the gain at 250 Hz? Still -5 dB? -4.5 dB? -3 dB? And the same question at 350 Hz or any other frequency.
What I'm looking for and haven't found is a way to get the frequency response of such a filter over a bandwidth from 20 Hz to 20 kHz. In other words, what I'd like to know for any frequency is: gain = f(frequency) for a given ffmpeg filter.
Thank you for your help,
Dr_Click
I'm working on a quite similar issue. Mine is to replace the system-wide 15-band graphical LADSPA equalizer (mbeq_1197, controlled by JACK Rack) with an ffmpeg filter. As it is AFAIK impossible to adjust ffmpeg filter parameters at runtime, I have to rely on my already generated JACK EQ settings and transfer them to the ffmpeg EQ. Alas, I could not find two "comparable" EQs: ffmpeg only offers an 18-band "superequalizer", while my previous EQ has 15 bands, so I decided to do some interpolation and compare the frequency responses of the old and new EQs.
Now to answer your question: I'm not an audio engineer, and I'm sure there are more professional ways, but this is my current workflow:
Generate some white noise. On Linux you can use e.g. sox or Audacity. In Audacity, do Generate -> Built-in -> Noise... => White noise (1 min should be enough).
Save the file as WAV.
Apply your filter to this WAV: ffmpeg -i whitenoise.wav -af "<your filter>" whitenoise_filtered.wav
Load the filtered file into Audacity and do Analyze -> Plot Spectrum...
The output will be a little scattered because the white noise is not perfect, but this should be negligible.
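The same workflow can also be scripted from the shell, as a rough sketch (assumes sox and ffmpeg; sox's synth and spectrogram effects stand in for the Audacity steps, and the filter is the first one from the question):
# 1 min of stereo white noise (the anequalizer filter addresses both channels)
sox -n -r 44100 -c 2 whitenoise.wav synth 60 whitenoise
# apply the filter under test
ffmpeg -i whitenoise.wav -af "anequalizer=c0 f=200 w=100 g=-5 t=0|c1 f=200 w=100 g=-5 t=0" whitenoise_filtered.wav
# plot the spectrum of the result
sox whitenoise_filtered.wav -n spectrogram -o response.png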
Good luck!
Flittermice

sox effect: retriggerable silence

To detect speech I'm playing with this sox command:
rec voice.wav silence 1 5 30% 1 0:00:02 30%
It should start recording whenever the input volume rises above the 30% threshold and stop after the audio has stayed below the same threshold for 2 seconds.
It works. But it would be much better if it were "retriggerable". I mean: after the audio falls below the threshold, if the audio rises again it should continue the recording (i.e. the user is still speaking).
It should stop only when it detects silence for a whole 2 seconds.
Or do you recommend any other "VOX" tool?
I've spent a lot of time experimenting with SOX to do VOX and have gotten it to work reasonably well. I've been using Audacity to view the resultant wave form, and have settled on the following SOX command...
rec snd.wav silence 1 .5 2.85% 1 1.0 3.0% vad gain -n : newfile : restart
This will:
wait until it hears activity above the threshold for a half second, then start recording (silence 1 .5 2.85%)
stop recording when audible activity falls to zero for one second (... 1 1.0 3.0%)
trim off any initial silence up to voice detection (vad)
normalize the gain (gain -n)
store the result into a new file (snd001.wav, snd002.wav)
restart the process
Getting the "silence" numbers correct involved a lot of trial and error, and will depend on ambient noise as well as the sensitivity of your microphone. I'm using the microphone in the Logitech QuickCam IM on a Raspberry Pi through USB.
On a side note, this whole thing complains with the following...
rec FAIL formats: can't open input `default': snd_pcm_open error: No such file or directory
... until I created this variable in the environment:
export AUDIODEV=hw:1,0
Again - this involved a lot of experimentation with the values for "silence", and it WILL need some tweaking for your environment.

Using sox for voice detection and streaming

Currently, I use sox like this:
sox -d -e u-law --endian little -b 8 -c 1 -r 8000 -t ul - silence 1 0.3 1% 1 0.3 1%
For reference, this records audio from the default microphone and outputs little-endian, u-law formatted audio at 8 bits and an 8 kHz rate. The silence effect trims audio until the sound stays above a threshold for 0.3 seconds, then records until there is 0.3 seconds of silence. All of this streams to stdout, which I use to stream to a remote server.
I am using all of this to record a bit of voice and finish when I am done speaking. To trigger sox, I use specialized hardware to trigger the start of the recording. I can switch to almost any audio format or codec as long as it supports on-the-fly formatting/encoding. My target platform is Raspbian on the Raspberry Pi 2 B.
My ideal solution would be to use vad to stop the recording when the user is finished speaking. My hope is that this would work even with background chatter. However, the sox documentation on the vad effect states this:
The use of the norm effect is recommended, but remember that neither
reverse nor norm is suitable for use with streamed audio.
I haven't been able to piece parameters together to get vad and streaming working. Is it possible to use the vad effect to stop the recording of audio while still maintaining the stdin->sox->stdout piping? Are there better alternatives?
Is it possible to use the vad effect to stop the recording of audio while still maintaining the stdin->sox->stdout piping?
No. The vad effect can only trim silence from the front of the audio, so you could only use it to detect the start of the recording, not the end or pauses.
The reverse and norm effects need all the input data before they produce any output, which is why they cannot be used with streaming.
The key is to select a good threshold for the silence effect so that it treats "background chatter" as silence.
You could also use noisered (with a profile based on previous recordings) before silence to reduce noise triggering the recording, but this will also affect the output and probably will not treat "background chatter" as noise.
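As a rough sketch of that noisered route (file names are placeholders; record background.wav during a quiet moment so it contains only the ambient noise you want to suppress):
# build a noise profile from the background-only recording
sox background.wav -n noiseprof background.prof
# apply noise reduction ahead of the silence gate in the streaming pipeline
sox -d -e u-law --endian little -b 8 -c 1 -r 8000 -t ul - \
    noisered background.prof 0.2 \
    silence 1 0.3 1% 1 0.3 1%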

ffmpeg split avi into frames with known frame rate

I posted this as comments under this related thread. However, they seem to have gone unnoticed =(
I've used
ffmpeg -i myfile.avi -f image2 image-%05d.bmp
to split myfile.avi into frames stored as .bmp files. It seemed to work, but not quite. When recording my video, I recorded at a rate of 1000 fps, and the video turned out to be 2 min 29 sec long. If my math is correct, that should amount to a total of 149,000 frames for the entire video. However, when I ran
ffmpeg -i myfile.avi -f image2 image-%05d.bmp
I only obtained 4472 files. How can I get the original 149k frames?
I also tried to convert the frame rate of my original AVI to 1000 fps by doing
ffmpeg -i myfile.avi -r 1000 otherfile.avi
but this didn't seem to fix my concern.
ffmpeg -i myfile.avi -r 1000 -f image2 image-%07d.png
I am not sure outputting 150k bmp files will be a good idea. Perhaps png is good enough?
Part one of your math is good: 2 minutes and 29 seconds is 149 seconds, and at 1000 fps that makes 149,000 frames. However, your output filename only has 5 digits for the number, while 149000 needs 6, so try "image-%06d.bmp".
Then there is the disk size: do your images fit on the disk? BMP files are uncompressed, so every image is stored at full size. You might try JPEG pictures instead; they compress about 10 times better.
Another idea: if ffmpeg does not find a (reasonable) frame rate, it drops to 25 or 30 frames per second. You might need to specify it, for both source and target; see the man page (man ffmpeg on Unix):
To force the frame rate of the input file (valid for raw formats
only) to 1 fps and the frame rate of the output file to 24 fps:
ffmpeg -r 1 -i input.m2v -r 24 output.avi
For what it's worth: I use ffmpeg -y -i "video.mpg" -sameq "video.%04d.jpg" to split my videos into pictures. The -sameq forces the JPEG to a reasonable quality, and the -y avoids overwrite prompts. For you:
ffmpeg -y -r 1000 -i "myfile.avi" -sameq "image.%06d.jpg"
I think there is a misconception here: the output of a high-speed (HS) video system is unlikely to have an output frame rate of 1000 fps; it is something rather normal such as 30 (or 50/60) fps. Apart from overloading most video players, showing the sequence at the same speed as it was recorded would be counterproductive.
Basically: 1 sec @ 1000 fps input is something like 33 sec @ 30 fps output.
Was the duration of the recorded scene really 2:29 min (resulting in a video of ~82 min at normal rate), or did it take about 4.5 sec (4472 frames), which is 2:29 min in normal playback?
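Before doing any math, it may also help to ask ffprobe what is actually stored in the container (a sketch; nb_frames is not filled in for every format):
ffprobe -v error -select_streams v:0 \
        -show_entries stream=r_frame_rate,nb_frames,duration \
        -of default=noprint_wrappers=1 myfile.avi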
I tried this on an Ubuntu 18.04 terminal:
ffmpeg -i input_video.avi output_frame_path_images%05d.png
where -i specifies the input file.
