I am looking for an audio DSP library for cleaning up some speech (voice) recordings. I have not decided which language to use yet.
Here are the features I am looking for:
Works on Linux and Windows
Importing MP3
Mixing multiple channels
Noise Filter
Bandpass filter
Compressor
I would love to have these as well, but I can write my own if they are not available:
De-esser
Multi-band compressor
Expander
Envelopes
(If you can suggest an application that does these via scripting or a single mouse click, I will accept your answer too.)
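For reference, here is roughly what "writing my own" compressor could look like if no library provides one. It is only a minimal sketch of a feed-forward compressor with a peak envelope follower in plain Python; the function name `compress` and all parameter defaults are assumptions chosen for illustration and are not taken from any of the libraries mentioned in this thread.

```python
import math


def compress(samples, sample_rate, threshold_db=-20.0, ratio=4.0,
             attack_ms=5.0, release_ms=50.0):
    """Feed-forward compressor sketch (hypothetical helper, not a library API).

    `samples` is a sequence of floats in the range -1.0..1.0.
    """
    # One-pole smoothing coefficients for the envelope follower.
    attack = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    release = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    env = 0.0
    out = []
    for x in samples:
        level = abs(x)
        # Track rising levels quickly (attack) and falling levels slowly (release).
        coeff = attack if level > env else release
        env = coeff * env + (1.0 - coeff) * level
        level_db = 20.0 * math.log10(max(env, 1e-9))
        over_db = level_db - threshold_db
        # Above the threshold, reduce the overshoot according to the ratio.
        gain_db = -over_db * (1.0 - 1.0 / ratio) if over_db > 0.0 else 0.0
        out.append(x * 10.0 ** (gain_db / 20.0))
    return out
```

With the assumed defaults, a signal that sits 20 dB above the threshold ends up with about 15 dB of gain reduction, while anything below the threshold passes through unchanged. A real implementation would probably add make-up gain and a soft knee.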
What about something like SoX? http://sox.sourceforge.net/
Take a look at Juce from Raw Material Software.
It is free for non-commercial use, and very reasonably priced for commercial use. It also has a lot of built-in audio capabilities (mixing, file I/O, etc.) and a nice cross-platform GUI toolkit as well.
Audacity does most of those things.
Related
I'm looking for an audio processing language or library which will allow me to experiment with different synthesis techniques. I've looked at Processing which I think is great at what it does, but haven't found any inspiring (and simple) audio libraries.
As a baseline, I simply want to create my own sample buffers and play them back (ideally in realtime). As a plus, the ability to handle MIDI events would be great. I'm an experienced C++ programmer, so I could do it natively, but I had hoped there was a more DSL-like (domain-specific language) approach.
I have access to Windows, Mac and Linux, so I'm not too bothered about the platform yet. Other languages I can deal with are C#, Java and Python.
Thanks
James
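To make the baseline concrete, here is a minimal sketch of "create my own sample buffers" using only the Python standard library. It writes the buffer to a WAV file rather than playing it in realtime, because realtime playback needs a platform audio API that the standard library does not provide. The helper names `sine_buffer` and `write_wav` are made up for this example.

```python
import math
import struct
import wave


def sine_buffer(freq_hz, duration_s, sample_rate=44100, amplitude=0.5):
    """Return a list of float samples (-1.0..1.0) containing a sine tone."""
    n = int(round(duration_s * sample_rate))
    step = 2.0 * math.pi * freq_hz / sample_rate
    return [amplitude * math.sin(step * i) for i in range(n)]


def write_wav(target, samples, sample_rate=44100):
    """Write mono 16-bit PCM to `target` (a file path or a binary file object)."""
    frames = b"".join(
        # Clip to -1.0..1.0, then scale to the signed 16-bit range.
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav.setframerate(sample_rate)
        wav.writeframes(frames)


if __name__ == "__main__":
    # One second of A4 (440 Hz), written next to the script.
    write_wav("tone.wav", sine_buffer(440.0, 1.0))
```

Any synthesis technique that produces a list of floats can be dropped in place of `sine_buffer`, which makes this a cheap way to experiment before committing to one of the frameworks suggested in the answers.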
Depending on how much you want to stay out of the low-level housekeeping details, you may want to look at Csound, or, if you don't want to write code at all, the patching-based system PureData, which is great to work with. As Lou points out, ChucK is interesting (but was too buggy to use the last time I checked it out).
If you really do want to write code, look at the Synthesis Toolkit, a set of C++ classes for audio processing and synthesis.
For an app framework, I recommend JUCE, which has incredibly nice cross-platform handling of audio/midi IO and GUI elements.
Max MSP is an audio production tool that is highly expressive.
I guess you could say it's a high-level tool, and not a low-level programming language. My impression of it is that it's geared towards the technical musician or the artistic engineer, but anyway it kicks ass and you could go low-level with it if you want.
I've always been a big fan of SuperCollider. It's designed for Mac OS X but also works on Linux.
The language is mostly based on Smalltalk, and it's pretty easy to pick up if you understand the basics of functional programming. The quality of the sound output by the SC Server is very good, and there is plenty of documentation both built into the app environment and available online.
One interesting point about SuperCollider is that it can be used on Android devices, and that it can communicate with Python through other modules.
Here goes an example
I know you didn't say Ruby, but check out Archaeopteryx
https://github.com/gilesbowkett/archaeopteryx/wiki
or ChucK
http://chuck.cs.princeton.edu/
Have a look at NAudio, an open source .NET audio SDK for working with audio files and devices on Windows:
http://naudio.codeplex.com/
NAudio Features:
Play back audio using a variety of APIs
Decompress audio from different Wave Formats
Record audio using WaveIn, WASAPI or ASIO
Read and Write standard .WAV files
Mix and manipulate audio streams using a 32-bit floating-point mixing engine
Extensive support for reading and writing MIDI files
Full MIDI event model
Basic support for Windows Mixer APIs
A collection of useful Windows Forms Controls
Some basic audio effects, including a compressor
I am undertaking a personal project that involves developing a system to automatically generate audio thumbnail clips (about 30 seconds long) from a full-length track.
In order to do this, I want to look at the energy and pitch of the audio to try to correctly identify its major structural features.
Is there any open source software available that can do energy/pitch extraction? If not, I will start looking into alternative methods using MATLAB.
Thanks!
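To show what energy and pitch extraction mean per frame, here is a small sketch in plain Python: RMS energy, plus a naive autocorrelation pitch estimate. The function names and the `fmin`/`fmax` search range are assumptions for illustration. The tools suggested in the answers use far more robust methods, and plain autocorrelation like this is prone to octave errors on real music.

```python
import math


def frame_rms(frame):
    """Root-mean-square energy of one frame of float samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def estimate_pitch(frame, sample_rate, fmin=50.0, fmax=1000.0):
    """Naive autocorrelation pitch estimate in Hz, or None if nothing is found."""
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_corr = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        corr = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return sample_rate / best_lag if best_lag else None


if __name__ == "__main__":
    rate = 8000
    # Synthetic 200 Hz test tone, one 1024-sample frame.
    tone = [math.sin(2.0 * math.pi * 200.0 * i / rate) for i in range(1024)]
    print(frame_rms(tone), estimate_pitch(tone, rate))
```

Running both functions over consecutive frames of a track gives an energy curve and a pitch curve, which is the raw material for locating structural features such as a chorus.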
YAAFE (Yet Another Audio Feature Extractor) http://yaafe.sourceforge.net/ does audio feature extraction in MATLAB, Python and C.
You might want to look into the Echo Nest API. It has a lot of audio analysis capabilities, and I know there's a script bundled in the Remix package that can automagically turn songs into shorter or longer versions (I believe the script is called earworm).
Audacity may do it.
Try JAudio, which can extract features from audio.
MARSYAS includes bextract for analysis; it can compute MFCCs and various other timbral and spectral features. http://marsyas.info/
Is there a mature library that could enable audio input and output and work within Haskell? (A nice wrapper is fine, of course.)
I'm looking for something that can easily capture microphone input and, perhaps, play various audio files as well.
"easily capture microphone input and, perhaps, play various audio files as well."
It will strongly depend on your OS platform: there are standard C libraries for this functionality on each OS, and you'll be looking for Haskell bindings to them (e.g. PulseAudio, etc). Look in the Sound category on Hackage:
http://hackage.haskell.org/packages/archive/pkg-list.html#cat:sound
E.g. HSndFile for audio file writing, http://hackage.haskell.org/package/HSoundFile
The pulse-simple module exposes bindings to capture sound from the microphone; see the second example at the top of the page:
https://hackage.haskell.org/package/pulse-simple-0.1.13/docs/Sound-Pulse-Simple.html
The PulseAudio libraries required by cabal are obtainable via Cygwin (search for "pulse" in the Cygwin installer).
There is also a binding to SoX, which looks promising:
https://hackage.haskell.org/package/sox
I'm sure there are other API wrappers to be found in the Hackage Sound category.
For Linux there is a binding to JACK, but it has "unix" as a dependency, so it will not build on Windows.
Just in case you're not familiar with Hackage: http://hackage.haskell.org/packages/archive/pkg-list.html
It looks like there is some audio-related stuff there. I'm not sure if there is anything that will meet your needs, but most "mature" Haskell libraries will be there.
You can do it with OpenAL and ALUT. I managed to install both on Windows 8, although it wasn't exactly effortless; ALUT requires the underlying C library to be compiled manually into a DLL.
Installing OpenAL, on the other hand, is as simple as downloading the SDK and typing cabal install OpenAL in the command prompt.
With ALUT, you can create OpenAL buffers from audio files (including WAV) and memory views.
I found an example of recording and audio playback here. It should be fairly straightforward to adapt the code to your needs.
Let me know if I left something out and I'll try to elaborate.
I'm looking for something like Paint.NET or GIMP, but for audio files, that runs on Windows.
Audacity is fantastic.
As already mentioned, Audacity is fantastic. If you're looking to batch convert sound files at the command line, check out mencoder and (for MP3s only) LAME.
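As a rough illustration of batch conversion with LAME, the sketch below builds one `lame` command per WAV file in a folder and runs it. It assumes the `lame` executable is installed and on the PATH; the helper names `lame_commands` and `convert_all` and the default bitrate are made up for this example.

```python
import subprocess
from pathlib import Path


def lame_commands(folder, bitrate_kbps=128):
    """Build one `lame` command line per .wav file in `folder` (hypothetical helper)."""
    commands = []
    for wav in sorted(Path(folder).glob("*.wav")):
        mp3 = wav.with_suffix(".mp3")
        # `-b` sets the MP3 bitrate in kbit/s.
        commands.append(["lame", "-b", str(bitrate_kbps), str(wav), str(mp3)])
    return commands


def convert_all(folder, bitrate_kbps=128):
    """Run every command; raises CalledProcessError if lame reports a failure."""
    for cmd in lame_commands(folder, bitrate_kbps):
        subprocess.run(cmd, check=True)
```

Calling `convert_all("recordings")` would then encode every WAV file in a `recordings` folder to an MP3 next to it.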
Audacity is painfully limited. If you are looking to do something a little more complicated, you should check out Reaper. It has a 60-day free trial, and if you are still editing and recording after that long, the price to buy is extremely cheap compared to other fully functional editing software. Pro Tools is crazy expensive.
Personally I use REAPER for "complex" tasks (tracking, mastering, batch processing) and Audacity for basic cutting/normalizing/exporting to MP3.
If you need a free alternative to Audacity for basic mono or stereo file processing, you may try Wavosaur or other software that has real-time previews.
NCH WavePad is also a great application for audio. It's easy to understand.
Reaper has always worked very well for me. It is free to download and try for 60 days and requires a cheap license after that (there are two different price options). Highly recommended.
Here is the website link:
http://www.reaper.fm/
For a project we're currently working on, we need a library of spoken words in many different languages.
Two options seem possible: text-to-speech or "real" recordings by native speakers. As the quality is important to us, we're thinking about going the latter path.
In order to create a prototype for our application, we're looking for libraries that contain as many words in different languages as possible. To get a feeling for the quality of our approach, this library should not be made up of synthesized speech.
Do you know of any available/accessible libraries?
A co-worker just found this community-based library, which is nice but rather small:
Forvo.com
I've just found this on the Audacity wiki: VoxForge. From their site:
VoxForge was set up to collect transcribed speech for use with Free and Open Source Speech Recognition Engines (on Linux, Windows and Mac).
We will make available all submitted audio files under the GPL license, and then 'compile' them into acoustic models for use with Open Source speech recognition engines such as Sphinx, ISIP, Julius and HTK (note: HTK has distribution restrictions).
There is also Old Time Radio, though I'm not sure if this is the sort of spoken word you're after.
My guess is that you won't find a library anywhere that consists of just individual words. Whatever you find, you're going to have to open the audio up in an editor (like Pro Tools or Cool Edit) and chop it up into individual words.
You would probably be better off creating a list of all the words you need for each language, and then finding native speakers to read them while you record. You can have them read slowly, so that you'll have an easy time chopping up each individual word.
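If the speakers leave clear pauses between words, the chopping can be partly automated. The following is only a sketch of silence-based splitting in plain Python; the function name, the amplitude threshold and the minimum pause length are assumptions that would need tuning for real recordings with background noise.

```python
def split_on_silence(samples, sample_rate, threshold=0.02, min_silence_s=0.2):
    """Return (start, end) sample index pairs for stretches separated by pauses.

    `samples` are floats in -1.0..1.0; `end` is exclusive.
    """
    min_gap = int(min_silence_s * sample_rate)
    segments = []
    start = None      # index where the current word began, if any
    quiet_run = 0     # consecutive quiet samples seen inside a word
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_gap:
                # The pause is long enough: close the word where the quiet began.
                segments.append((start, i - quiet_run + 1))
                start = None
                quiet_run = 0
    if start is not None:
        # The recording ended while a word was still open.
        segments.append((start, len(samples) - quiet_run))
    return segments
```

Each returned pair can then be used to slice the sample list and save one file per word. Having the speakers read slowly, as suggested above, is what makes a fixed pause length workable.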
One I used to use a lot: http://shtooka.net/index.php
Easy access to the recordings.