I've got a short, crossfaded ambient sound clip running on a loop using Tone.js. Trouble is, there's an audible gap between the end of one playback and the beginning of the next.
I know it's possible to achieve a seamlessly crossfaded loop in Howler.js using audio sprites, but I'm not sure how to do it in Tone.js (and I'd rather stick with this library if possible).
Does anybody out there know how to resolve this?
To avoid discontinuities in looped audio, you need to crossfade the end of the loop with the beginning of it. If your program is only working with a fixed set of loops, you can pre-render the crossfade into the clip using an audio editor like Audacity or a DAW like Pro Tools or Reaper.
If you have a general-purpose app that needs to work with user-supplied audio, then you'll need to write code to mix the end of the loop fading out with the beginning of the loop fading in.
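A minimal sketch of that mixing step, written in Python only to show the arithmetic; it is not Tone.js code. The function name `make_seamless_loop` and the `fade_len` parameter are made up for illustration, and the clip is assumed to be a plain list of mono float samples. The last `fade_len` samples are faded out and mixed over the first `fade_len` samples fading in, so the resulting clip is shorter by `fade_len` and its end runs straight into its start when repeated.

```python
def make_seamless_loop(samples, fade_len):
    """Fold the tail of a clip into its head with a linear crossfade.

    Returns a new list that is fade_len samples shorter than the input.
    """
    if not 0 < fade_len <= len(samples) // 2:
        raise ValueError("fade_len must be positive and at most half the clip")

    body_len = len(samples) - fade_len
    out = list(samples[:body_len])   # everything except the tail
    tail = samples[body_len:]        # the part that gets faded out

    for i in range(fade_len):
        t = i / fade_len             # 0.0 at the loop point, rising towards 1.0
        # Head fades in while the tail fades out; the gains always sum to 1.
        out[i] = samples[i] * t + tail[i] * (1.0 - t)
    return out
```

A linear fade keeps the level constant when the two ends are similar material. For unrelated material an equal-power curve (sine and cosine gains) is the usual alternative.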
Seamlessly looping MP3 tracks is difficult, but not impossible. MP3 files contain additional padding that decoders decode as audio data. I would recommend one of these two options to work around the padding:
Prepare seamless MP3 loops with special software. You can read more about that here: https://www.compuphase.com/mp3/mp3loops.htm
Switch to a different format: AAC (best support in all browsers, though there may still be some padding issues, which can be overcome), Opus, or Ogg. Read more here: https://en.wikipedia.org/wiki/Gapless_playback#Prerequisites
Related
My Zoom H4n somehow decided it didn't want to properly save two recordings this weekend, leaving me with four zero-byte files (which I have tried every which way to open/convert, but nothing worked).
I then used CardRescue to scan the SD card for any audio it could find, and - lo and behold - I got .wav files! However, instead of two files for each session (one was an XLR output from the desk, the other the on-Zoom mics), or even a nice stereo with one left, the other right, I have a mess.
When importing as raw data into Audacity (the rescued .wavs themselves do not open), the right channel has the on-Zoom mic audio, with intermittent silence. The left has the on-Zoom audio, followed by the same part of the XLR input audio. This follows the same pattern as the silences.
I have spent hours chopping it up in GarageBand, but as it is audio for a video, it needs to match what 'really' happened perfectly (I appreciate that for a podcast/audio-only piece I could relatively simply take away the on-Zoom mic audio from the left channel). I began attempting to sync the mic audio to the on-camera audio (which, despite playing around with settings, is as unusable as it always is), but because it's a pattern, I can't help but wonder if there's a cleaner fix: either analysing the audio somehow, as there are clean lines if I look at the spectral data, or adding a couple of numbers to the WAV's binary that would click the two into place?
I've tried importing to Audacity with different settings, different offsets - this has ended up in either slow audio, fast audio, or heavily distorted audio (but always the same patterns with the files).
I use a Mac (and don't know any PC users close by!) so any software suggestions will need to run on Mac. However, I'm willing to try just about anything that's not dragging tiny clips.
I need to make a video of an audio equalizer.
So I need a script that analyses the audio for every video frame and extracts the frequency spectrum, so I can draw it somehow and make an equalizer.
The first part of the problem is easily solvable on the frontend, as there is a myriad of open source equalizer visualisations in canvas.
It works nicely in the browser, but I'm having trouble making an mp4 of it.
I've tried using headless browsers (Puppeteer and PhantomJS) to capture frames from the canvas, but I could not get the framerate above 10fps, resulting in unacceptable video quality and sync issues when combining the jpg frames and the mp3 via ffmpeg. The plan was to speed it up, so you don't have to wait for the full audio length to finish to get an mp4, but I can't even get it above 10fps at regular playback speed.
I feel the tech I thought would work is not there yet, and I might need a different approach.
The only condition is that it has to run as a script on a Linux server. So any programming language or any equalizer design will work.
Any ideas or resources are more than welcome. Thanks
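One browser-free way to approach the "analyse the audio for every video frame" part is to compute the spectrum offline from the decoded samples, so the script is not tied to real-time playback. The sketch below is only an illustration in plain Python: the function names, the Hann window and the equal-width bands are assumptions, not a known-good design. It assumes the audio has already been decoded to a list of mono float samples. Drawing the bars to image files and muxing them with ffmpeg is not shown.

```python
import cmath
import math


def frame_window(samples, sample_rate, fps, frame_index, window_size):
    """Samples starting at the given video frame, zero-padded to window_size."""
    start = int(frame_index * sample_rate / fps)
    chunk = list(samples[start:start + window_size])
    return chunk + [0.0] * (window_size - len(chunk))


def magnitude_spectrum(window):
    """Hann-windowed naive DFT; returns magnitudes for bins 0..N/2."""
    n = len(window)
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    x = [s * w for s, w in zip(window, hann)]
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        mags.append(abs(acc) / n)
    return mags


def band_levels(mags, n_bands):
    """Average neighbouring bins into n_bands equal-width bars."""
    per_band = max(1, len(mags) // n_bands)
    return [
        sum(mags[b * per_band:(b + 1) * per_band]) / per_band
        for b in range(n_bands)
    ]
```

Bin `k` corresponds to the frequency `k * sample_rate / window_size`. The naive DFT here is O(N²) per frame and is only meant to show the idea; a real script would use an FFT library, and would probably group bins logarithmically rather than in equal widths.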
I have a situation where I have a video capture of HD content via HDMI with audio from a sound board that goes through an impedance drop into a microphone input of a camcorder. That same signal is split at line level to a 'line in' jack on the same computer that is capturing the HDMI. Alternatively I can capture the audio via USB from the sound board, which is probably the best plan, but carries with it the same issue.
The point is that the line-in or USB capture will be much higher quality than the one on HDMI, because the line out -> impedance change -> mic in path generates inferior quality: simply brushing the mic jack on the camera while trying to change the zoom (close proximity) can cause noise on the recording.
So I can do this today:
Take the good sound and the camera captured sound and load each into
Audacity and pretty quickly use the Time Shift Tool to perfectly fit
the good audio to the questionable audio from the HDMI capture and
cut the good audio to the exact size of the video. Then I can use
ffmpeg or other video editing software to replace the questionable
audio with the better audio.
But while somewhat quick and easy, it always carries with it a bit of human error and time. I'd like to automate this if possible as this process is repeated at least weekly throughout the year.
Does anyone have a suggestion if any of these ideas have merit or could suggest another approach?
1a. I suspect, but have yet to confirm, that the system timestamp of the start time may be recorded both in audio captured with something like Audacity or the USB capture tool from the sound board, and in the HDMI MPEG-2 video. I tried ffprobe on a couple of Audacity-captured .wav files but didn't see anything in the results about such a time code, but perhaps other audio formats or other probing tools may include this info. Can anyone advise if this is common with any particular capture tools or file formats?
If so, I think I could get the best results by extracting this information and then using simple adelay and atrim filters in ffmpeg to sync reliably, directly from the two sources, in one ffmpeg call. This is all theoretical for me right now (I've never tried either of these filters); I'm just trying to avoid blind alleys by asking for advice up front.
1b. If such timestamps are not embedded, I could possibly use the file system timestamp for the same idea expressed in 1a, but I suspect the file open of the two capture tools may have different inherent delays. Possibly these delays will turn out to be nearly constant, and the approach can work with a built-in constant anticipation delay, but that sounds messy and less reliable than idea 1. Still, I'd take it if it turns out to be reasonably reliable.
2. Are there any ffmpeg or general digital audio experts out there who know of particular filters that can be applied to the actual data to look for similarities? For example: normalizing the peak amplitudes, or normalizing the amplification of the two to some RMS value, then stepping through a short 10-second snippet of audio, repeatedly moving one time stream 0.01 s left against the other, subtracting the two, and looking for a minimum. It sounds like it could take a while, but if it could do this in less than a minute and be reliable, I suspect it could work. I have only rudimentary knowledge of audio streams and perhaps what I suggest is just not plausible, but since each stream starts from the same source I think there should be a chance. I am just way out of my depth as to how to go down this road, so if someone out there knows such magic or can throw me some names of filters and example calls, I can explore whether I can make it work.
3. Any hardware-level suggestions for taking a line-level output down to a mic-level input without the problems I am seeing with a simple in-line impedance drop module, so that I can simply rely on the audio from the HDMI?
Thanks in advance for any pointers or suggestions!
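The "normalize, slide one stream against the other, subtract, and look for a minimum" idea from the question can be sketched in a few lines. This is an illustrative brute-force version in plain Python, not an ffmpeg filter: the names `rms_normalize` and `find_offset` are made up, both inputs are assumed to be mono float samples at the same sample rate, and the search works in whole samples rather than 0.01 s steps.

```python
import math


def rms_normalize(x):
    """Scale a signal to unit RMS so the two captures are comparable."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return [v / rms for v in x] if rms > 0 else list(x)


def find_offset(reference, other, max_lag):
    """Return the lag (in samples) where other[i + lag] best matches reference[i]."""
    ref = rms_normalize(reference)
    oth = rms_normalize(other)
    min_overlap = len(ref) // 2      # ignore lags with too little overlap
    best_lag, best_err = 0, float("inf")

    for lag in range(-max_lag, max_lag + 1):
        err, count = 0.0, 0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(oth):
                d = r - oth[j]
                err += d * d
                count += 1
        if count < min_overlap:
            continue
        err /= count                 # mean squared difference at this lag
        if err < best_err:
            best_lag, best_err = lag, err
    return best_lag
```

Dividing the returned lag by the sample rate gives the offset in seconds, which could then feed the adelay/atrim approach described above. Pure Python is far too slow for 10 seconds of 48 kHz audio over a wide lag range; an FFT-based cross-correlation from a numerical library would be the practical route, and downsampling both signals first would help as well.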
I have a bunch of video clips from a webcam (duration is 5, 10, 60 seconds), and I'm looking for a way to detect "does this video clip have movement", to decide whether the file should be saved or discarded in a future processing phase.
I've looked into motion and OpenCV, but motion seems to only want to work on the raw video stream, and OpenCV seems to be way too advanced for my use.
My ideal solution would be a Linux command-line tool that I can feed video files into and get a simple "does/doesn't contain movement" answer back, so I can discard the irrelevant files. False positives (in a reasonable quantity) are perfectly acceptable for my use.
Does such a tool exist? Or any simple examples of doing this with other tools?
You can check out dvr-scan, which is a simple cross-platform command-line tool based on OpenCV.
To just list motion events in CSV format (scan only):
dvr-scan -i some_video.mp4 -so
To extract the motion events into a single video:
dvr-scan -i some_video.mp4 -o some_video_motion_only.avi
For more examples and various other parameters see:
https://dvr-scan.readthedocs.io/en/latest/guide/examples/
I had the same problem and wrote the solution: https://github.com/jooray/motion-detection
Should be fairly easy to use from command-line.
If you would like to post-process already-captured video then motion can be useful.
VLC allows you to stream or convert your media for use locally, on your private network, or on the Internet. So an already-captured video can be streamed over HTTP, RTSP, etc., and motion can handle it as a network camera.
Furthermore:
How to Stream using VLC Media Player
If OpenCV is too advanced for you, maybe you should consider something easier: SimpleCV (a wrapper for OpenCV), "This is computer vision made easy". There is even an example of motion detection using SimpleCV: https://github.com/sightmachine/simplecv-examples/blob/master/code/motion-detection.py Unfortunately I can't test it (because my OpenCV version isn't compatible with SimpleCV), but generally it looks fine and isn't complicated: it just subtracts the previous frame from the current one and calculates the mean of the result. If this value is bigger than some threshold (which you will most likely have to adjust), then we can assume that there was some motion between those 2 frames. Note that setting the threshold to 0 is a really bad idea, because there is always some difference between 2 consecutive frames (lighting changes, noise, etc.).
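The frame-differencing idea described above is small enough to sketch without any vision library. The following is an illustration only, not the SimpleCV example itself: the function names are made up, and each frame is assumed to be already decoded into a grayscale image stored as a list of rows of pixel values (0 to 255).

```python
def mean_abs_diff(prev_frame, curr_frame):
    """Mean absolute per-pixel difference between two grayscale frames."""
    total, count = 0, 0
    for row_prev, row_curr in zip(prev_frame, curr_frame):
        for p, c in zip(row_prev, row_curr):
            total += abs(c - p)
            count += 1
    return total / count if count else 0.0


def has_motion(frames, threshold):
    """True if any two consecutive frames differ by more than threshold."""
    prev = None
    for frame in frames:
        if prev is not None and mean_abs_diff(prev, frame) > threshold:
            return True
        prev = frame
    return False
```

The threshold has to be tuned per camera: it needs to sit above the difference caused by sensor noise and lighting flicker, but below the difference caused by the smallest movement you care about.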
I searched many questions, but no one seems to give the simplest, most uniform approach, so please do not close this as a duplicate.
My requirement is simple: I have a quiz app.
I want to include:
background music that plays continually - probably more than one
audio.
I need occasional sounds played at specific events - they
are very short in duration. Maybe 4-5 in number.
What sound format do I use? [aac etc]
How do I produce it? (optionally, get it from internet, if free)
What is the best approach to incorporate it? [audio playback, OpenAL, etc.]
Forgive me if this is quite stupid, but I am going very generic here and can't seem to find it.
Thanks for the help!
For sound format, use AAC or uncompressed 16-bit little endian in a CAF container (avoid mp3 since it's difficult to make it loop cleanly). You can convert using the command line tool 'afconvert':
Compressed:
afconvert -f caff -d aac sourcefile.wav destfile.caf
Uncompressed 16-bit:
afconvert -f caff -d LEI16 sourcefile.wav destfile.caf
For production, either record it yourself (using an audio program such as Audacity), get a professional to do it, or buy royalty free sounds/music.
To incorporate it, use AVAudioPlayer for music and OpenAL for sounds. OpenAL is difficult to use and doesn't decode compressed audio on its own, so you may want to use an audio library such as https://github.com/kstenerud/ObjectAL-for-iPhone