Find each word in audio files - linux

I have seen a way of doing this in Audacity using the Sound Finder option. But since Audacity is GUI-only, it cannot be driven from terminal commands. Is there a program that does the same job from the command line, like SoX for example?

sox input.wav slice.wav silence 1 1.0 2% 1 3.0 2% : newfile : restart
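To make the idea behind that command concrete, here is a rough Python sketch of silence-based splitting: treat samples below an amplitude threshold as silence and cut wherever the silence lasts long enough. It is only an illustration of the technique, not SoX's or Audacity's actual algorithm; the function name `find_segments` and the default `threshold` and `min_silence` values are assumptions made for the example.

```python
def find_segments(samples, rate, threshold=0.02, min_silence=0.3):
    """Return (start, end) sample-index pairs of non-silent runs.

    samples: sequence of floats in [-1.0, 1.0]
    rate: sample rate in Hz
    threshold: amplitude below which a sample counts as silent (assumed value)
    min_silence: seconds of continuous silence that closes a segment (assumed value)
    """
    gap = int(min_silence * rate)
    segments = []
    start = None      # index where the current segment began
    last_loud = None  # index of the most recent loud sample
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and i - last_loud >= gap:
            # Silence has lasted long enough: close the segment.
            segments.append((start, last_loud + 1))
            start = None
    if start is not None:
        segments.append((start, last_loud + 1))
    return segments
```

Each returned pair could then be written out as its own file, which is roughly what the `newfile` part of the SoX command does. Real speech usually needs a smoothed level (for example RMS over a short window) rather than per-sample amplitude, so treat this as a starting point.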

Currently, Audacity does have experimental support for scripting.
The scripting module is an experimental GUI plug-in that allows Audacity to be driven from an external Perl or Python script. Commands are sent to Audacity over a named pipe. Any scripting language that supports named pipes can be used. External scripting is one of several ways to extend the functionality of Audacity.
Scripting support is currently considered experimental and is mainly intended for use by developers for the time being.
Feel free to try it out, but don't be too surprised if there are problems, or if the details of commands change between versions of Audacity.
Currently Windows or Linux are recommended. Mac requires more work to get anything useful at all.
There is a fuller list of limitations at the foot of the manual page linked below.
Specifically, try out the ‘Menu Command’ option:
https://manual.audacityteam.org/man/scripting.html#MenuCommand

Related

How do I play a wav file from a Free Pascal application running on Linux?

I have a multi-platform application written in Free Pascal. This application plays a short sound on some event. On Windows, I can do this with MMSystem and sndPlaySound('sound.wav'). However, I don't know how to do this on Linux without external libraries.
I have a solution that plays it with SDL and OpenAL, but I don't want a dependency on these libraries just to play one short sound. Is there a Linux command-line player that is installed on most distros by default? The file format doesn't matter; I will convert it.
mplayer is both command-line and graphical. You can start it on a tty or a pty.
You could try aplay, but that has a dependency on ALSA. Maybe sox?
The program mplayer ("the movie player") gives you the option to use a graphical user interface or the console. So I would imagine it has a solution to your problem.
Are you looking to BEEP, BLEEP, BOOP and BOP (and low-frequency fart)? Use sox. If you're looking to play a file, use sox or SDL.
You need a loop over an array of notes to get a sort-of piano effect, like a song. It's ugly, messy, and can't be tweaked much, like the old PC speaker, but it's passable.
beep is probably what you want, though. Install the package, put a PC speaker on your motherboard (no hookup? use sox), and enable the pcspkr module. (On Ubuntu it's blacklisted by default.) If beep produces nothing, try sox.
At least you'll have something. Yes, you can check for loaded modules and installed packages. I believe I've done both.
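Since none of these players is guaranteed to be installed, one option is to probe for them at run time and use the first one found. The sketch below shows the idea in Python (the same logic ports to Free Pascal); the candidate list, its order, and the function name `pick_player` are assumptions for illustration, with `play` being the player that ships with SoX.

```python
import shutil

# Candidate players with a "quiet" flag each; the order is an assumption.
CANDIDATES = [
    ["aplay", "-q"],               # ALSA utilities
    ["play", "-q"],                # SoX
    ["mplayer", "-really-quiet"],  # MPlayer
]

def pick_player(which=shutil.which):
    """Return the command prefix of the first player found on PATH, or None."""
    for command in CANDIDATES:
        if which(command[0]) is not None:
            return command
    return None
```

The application would append the file name to the returned list and run it as a child process, falling back to silence (or a terminal bell) when `pick_player()` returns `None`.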

Crossplatform Offline Audio Processing Library

I am looking for an audio DSP library for cleaning up some speech (voice) recordings. I have not decided which language to use yet.
Here are the features I am looking for:
Work in Linux and Windows
Importing MP3
Working with multiple channels mixing
Noise Filter
Bandpass filter
Compressor
I would love to have these as well, but I can write my own if they are not available:
De-esser
Multi-band compressor
Expander
Envelopes
(If you can suggest an application that does these via scripting or with one mouse click, I will accept your answer too.)
What about something like SoX? http://sox.sourceforge.net/
Take a look at JUCE from Raw Material Software.
It is free for non-commercial use, and very reasonably priced for commercial use. It also has a lot of built-in audio capabilities (mixing, file I/O, etc.) and has a nice cross-platform GUI toolkit as well.
Audacity does most of those things.

Capturing audio input from microphone, with Haskell?

Is there a mature library that could enable audio input and output and work within Haskell? (A nice wrapper is fine, of course.)
I'm looking for something that can easily capture microphone input and, perhaps, play various audio files as well.
"easily capture microphone input and, perhaps, play various audio files as well"
It will strongly depend on your OS platform: there are standard C libraries for this functionality on each OS, and you'll be looking for Haskell bindings to them (e.g. PulseAudio, etc). Look in the Sound category on Hackage:
http://hackage.haskell.org/packages/archive/pkg-list.html#cat:sound
E.g. HSndFile for audio file writing, http://hackage.haskell.org/package/HSoundFile
The pulse-simple package exposes bindings to capture sound from the microphone; see the second example at the top of the page:
https://hackage.haskell.org/package/pulse-simple-0.1.13/docs/Sound-Pulse-Simple.html
The PulseAudio libraries required by cabal are obtainable via Cygwin (search for "pulse" in the Cygwin installer).
There is also a binding to SoX, which looks promising.
https://hackage.haskell.org/package/sox
I'm sure there are other API wrappers to be found in the Hackage Sound category.
For Linux there is a binding to JACK; it has "unix" as a dependency, so it WILL NOT build on Windows.
Just in case you're not familiar with Hackage: http://hackage.haskell.org/packages/archive/pkg-list.html
It looks like there is some audio-related stuff there. Not sure if there is anything that will meet your needs, but most "mature" Haskell libraries will be there.
You can do it with OpenAL and ALUT. I managed to install both on Windows 8, although it wasn't exactly effortless; ALUT requires the underlying C library to be compiled manually into a DLL.
Installing OpenAL - on the other hand - is as simple as downloading the SDK and typing cabal install OpenAL in the command prompt.
With ALUT, you can create OpenAL buffers from audio files (including WAV) and memory views.
I found an example of recording and audio playback here. It should be fairly straightforward to adapt the code to your needs.
Let me know if I left something out and I'll try to elaborate.

How do I De-Ess a sound file with SoX?

I am using SoX to create slow but pitch corrected audio files. The resulting files sound pretty good, but often have a very hard "S" sound that I would like to filter out. Many desktop programs include a "De-Essing" filter that works well, but I would like to have a filter that works on the server side.
What SoX filter and parameters should I use to De-Ess an audio file?
Edit: I should add that this needs to work on Linux.
There is a LADSPA DeEsser plugin that can be used from SoX. You need to have the TAP plugins installed and properly configured on your system. On Arch Linux this can be easily achieved with
pacman -S tap-plugins
You can specify the threshold and the frequency as the first and second arguments. I successfully used a variant of the following command:
# -30: threshold (dB)
# 6200: hiss frequency (Hz)
sox from.wav to.wav ladspa tap_deesser tap_deesser -30 6200
The filter has a fistful of other options that I did not analyze. More details can be found here.
While far from perfect, you may be able to get sufficient results by a suitable low-pass filter. That should not affect other parts of a speech signal too much.
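To illustrate what such a low-pass filter does to sibilance, here is a minimal one-pole (RC-style) low-pass in Python. It is a sketch of the general technique, not what SoX runs internally; the function name `low_pass` and the cutoff values used when calling it are assumptions. SoX itself provides a `lowpass` effect that would be the practical choice on a server.

```python
import math

def low_pass(samples, rate, cutoff_hz):
    """One-pole low-pass filter: attenuates content above cutoff_hz.

    samples: sequence of floats, rate: sample rate in Hz.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # time constant for the cutoff
    dt = 1.0 / rate                          # time between samples
    alpha = dt / (rc + dt)                   # smoothing factor in (0, 1)
    out = []
    prev = 0.0
    for x in samples:
        # Move the output a fraction alpha of the way toward the input.
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out
```

A single pole rolls off gently (about 6 dB per octave), so it dulls harsh "S" sounds rather than removing them, and it also softens other high-frequency content. That trade-off is why a dedicated de-esser, which only acts when the hiss band is loud, usually sounds better.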
You could use a de-esser VST such as SPITFISH and a command-line VST host such as MissWatson. SoX has very limited plugin support, so if you need something more specific, you're better off going the VST route.

How to programmatically create videos?

Is there a freely available library to create an MPEG (or any other simple video format) out of an image sequence?
It must run on Linux too, and ideally have Python bindings.
I know there's mencoder (part of the mplayer project), and ffmpeg, which both can do this.
ffmpeg is a great (open source) program for building all kinds of video, and converting one type of video (a sequence of images in this case) into other types of video.
Usually it is utilized from the command line, but that is really just a wrapper around its internal libraries. It is expressly available to be used from within another program.
There are also Python bindings that wrap the C API, though this particular project doesn't seem to be getting the best support (there are probably other projects out there doing the same thing).
There's also this link where someone has used ffmpeg to do something similar to what you're looking for.
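As a rough illustration of the command-line route from Python, the sketch below builds an ffmpeg invocation for a numbered image sequence and runs it with `subprocess`. The file pattern `frame%04d.png`, the output name, the frame rate, and the choice of H.264 in an MP4 container are assumptions for the example; `libx264` is only available if your ffmpeg build includes it.

```python
import subprocess

def build_ffmpeg_command(pattern, output, fps=25):
    """Return the argument list for encoding an image sequence to video.

    pattern: printf-style input pattern, e.g. "frame%04d.png" (hypothetical)
    output: output file name; the container is inferred from the extension
    fps: input frame rate (assumed default)
    """
    return [
        "ffmpeg",
        "-framerate", str(fps),   # rate at which images are read
        "-i", pattern,            # numbered image sequence
        "-c:v", "libx264",        # assumes ffmpeg was built with libx264
        "-pix_fmt", "yuv420p",    # widely supported pixel format
        output,
    ]

def encode(pattern, output, fps=25):
    """Run ffmpeg; raises CalledProcessError if encoding fails."""
    subprocess.run(build_ffmpeg_command(pattern, output, fps), check=True)
```

Calling `encode("frame%04d.png", "out.mp4")` would then produce the video, provided ffmpeg is on the PATH. Shelling out like this avoids depending on any particular Python binding, at the cost of writing the frames to disk first.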
GStreamer is a popular choice. It's a full multimedia framework much like DirectShow or QuickTime, has the advantage of having legally licensed codecs available, and has excellent Python bindings.
In C++, OpenCV (the open source computer vision library from Intel) lets you create an AVI file and just push frames into it,
but that's like shooting a fly with a cannon.
Not a library, but mplayer has the ability to encode JPEG sequences to any kind of format. It runs on Linux, Windows, BSD and other platforms, and you can write a Python script if you want to use it with Python.
ffmpeg has an API and also Python bindings; it seems to be the way to go!
Thanks
ffmpeg minimal runnable C example
I have provided a full runnable example at: How to resize a picture using ffmpeg's sws_scale()?

Resources