I was wondering if there is a tool similar to jCrop, except that instead of an image it would let the user crop an audio file. Google didn't give me any useful results, sadly :(
The reason I'm asking is that I'm making a tool to convert audio files to popular ringtone formats, and only letting the user specify the offsets as numbers is somewhat inconvenient. The tool doesn't have to be in JavaScript; anything that fits into a website is OK.
Here's a browser-based audio editor written in Flash that you could probably adapt (it supports cropping):
http://www.hisschemoller.com/2010/audio-editor-1-0/
One thing I found a bit confusing is that you have to hold down the play button on the editor to play the full sound.
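Whichever widget ends up producing the start and end offsets, the crop itself can be done on the server. The following is only a minimal sketch, assuming the upload has already been decoded to an uncompressed WAV file; the function name `crop_wav` and its parameters are made up for illustration.

```python
import wave


def crop_wav(src_path, dst_path, start_s, end_s):
    """Copy the audio between start_s and end_s (in seconds) into a new WAV file.

    Returns the number of frames written. Hypothetical helper, WAV input only.
    """
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        total = src.getnframes()
        # Clamp the requested offsets to the frames that actually exist.
        start = max(0, min(int(start_s * rate), total))
        end = max(start, min(int(end_s * rate), total))
        src.setpos(start)
        frames = src.readframes(end - start)
        params = src.getparams()
    with wave.open(dst_path, "wb") as dst:
        # Same channels, sample width and rate; the frame count in the header
        # is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)
    return end - start
```

A selection widget would then only need to send two numbers, for example `crop_wav("in.wav", "ringtone.wav", 12.0, 42.0)`.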
I am working on a project that involves using a lot of found audio clips (some new, some very old, archival and poor quality, etc.).
I am trying to figure out a way to make all the audio clips a similar quality (if this is possible) and play at a similar volume.
I have access to both Audacity and Ableton... any suggestions would be great.
What you are asking for is commonly called normalization. There are several tools that can do it, including command-line tools and Audacity.
You'll find the tool in Audacity under Effect > Normalize...
You can select multiple audio tracks.
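To make the idea concrete, here is a rough sketch of peak normalization, one common form of it: every clip is scaled so that its loudest sample reaches the same level. It assumes the samples are already decoded to floats between -1.0 and 1.0; the function name and the 0.9 target are illustrative choices, not what any particular tool does internally.

```python
def peak_normalize(samples, target_peak=0.9):
    """Scale float samples (-1.0..1.0) so the loudest one sits at target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence (or empty input): nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


quiet_clip = [0.0, 0.1, -0.2, 0.05]
loud_clip = [0.0, 0.5, -0.8, 0.3]
# After normalization both clips peak at the same level.
normalized = [peak_normalize(clip) for clip in (quiet_clip, loud_clip)]
```

Matching peaks does not guarantee that clips sound equally loud, which is part of why old, noisy recordings may still need some manual attention.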
You could also consider using a limiter and/or a compressor on your track. Have a look in the Live effect reference for more info on these: https://www.ableton.com/en/manual/live-audio-effect-reference/
The results will not be as good as applying normalization by hand, but it will be a lot quicker.
I'm using YouTube's "auto-generated" captions feature to generate transcripts of mp3 files. I do this by first converting the mp3 to a blank mp4, uploading it to YouTube, waiting for the auto-generated captions to appear, then extracting the SRT file.
The issue I'm having, though, is that a few of the mp3 files I've uploaded have been flagged as having copyrighted content, and as such no auto-generated captions have been made for them.
I have no desire to publish the mp3s on YouTube; they're uploaded as unlisted videos and all I require are the SRT files. Is there a way to manipulate the audio to bypass YouTube's Content ID system? I've tried altering the pitch in Audacity, but no matter how subtle or extreme the pitch change is, they're still flagged as having copyrighted content. Is there anything else I can do to the audio, other than adjusting the pitch, that might work?
I'm hoping this post doesn't breach any rules on here, and I can't stress enough that I'm not looking to publish these mp3s, I just want the auto-generated SRTs.
No one can know how to cheat on Content ID
Obviously, as Content ID is a private algorithm developed by Google, no one can know for sure how they detect copyrighted audio in a video.
But, we can assume that one of the first things they did was to make their algorithm pitch-independent. Otherwise, everyone would change the pitch of their videos and cheat on Content ID easily.
How to use YouTube to get your subtitles anyway
If I am not mistaken, Content ID blocks you because of musical content, rather than vocal content. Thus, to address your original problem, one solution would be to detect musical content (based on spectral analysis) and cut it from the original audio. If the problem is with pure vocal content as well, you could try to filter it heavily and that might work.
Other solutions
Since YouTube is made by Google, why not directly use the Speech API that Google offers, which most likely performs the audio transcription on YouTube? And if the results are not satisfying, you could try other services (IBM, Microsoft, Amazon and others have theirs).
I want to start a hobby project that focuses on displaying the audio files in a folder in a certain fashion, and that can play such a file and show basic playback controls. However, I'm struggling to find a suitable programming language for this.
The displaying part shouldn't be too hard and can probably be done in most programming languages. The audio part is what concerns me most: it's not the main focus of the project and only needs to do limited things (so it shouldn't be too hard), but I don't know anything about sound support in the programming languages I currently know (Java, C and C++).
Specifically, I would like to be able to do these things:
Play a sound file
Stop/pause a playing song
Adjust volume
Show a bar that displays the current position in the song
Most files will be .mp3 files, but being able to process other formats is certainly a plus. Since this is just a small project, it's OK if it runs only on Windows. Scalability would be nice but is not required.
It would be nice to have a small overview of the audio support/audio libraries of programming languages (I'm always up for something new) that can accomplish these simple things in a not too complicated way, as well as personal experiences.
That way I hope to get a better understanding of which programming language fits my project best. (I would very much like to not have to change language midway through the project.)
--
Edit:
This is only for a later stage of the project, if the first part is successful: I will want to change the file names of the audio files that are displayed (to make them follow a specific format).
I haven't written many audio processing programs, but I know a lot of libraries exist for C and C++. Perhaps for Java too, but I don't know Java. I have used audio with SDL in a game, but it doesn't have that many features and I don't recommend it.
There's this question asking for a library in C, and there are a couple of similar questions that SO brings up on the side. You may want to take a look at those.
You would also need to look for a library that loads different file types. SDL, at least, only opens .wav files, which I believe most playback libraries support. For MP3, you will most likely need an additional library. I know Audacity uses LAME for MP3, so I'm guessing that should be good (LAME is primarily an encoder, though, so check that whichever library you pick can also decode).
Some of the functionality you want you can also implement yourself. For example, knowing the length of the music and the amount you have already read, you know how far into the audio you are. Adjusting the volume is also just a multiplication (in the simplest case) that you can do on the audio data if the library doesn't provide it.
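Both ideas fit in a few lines. The sketch below uses Python only to keep it short (the same arithmetic carries over to C, C++ or Java), and it assumes the audio is already decoded to signed 16-bit samples; the function names are made up for illustration.

```python
def playback_fraction(frames_read, total_frames):
    """Return how far into the audio we are, as a value from 0.0 to 1.0."""
    if total_frames <= 0:
        return 0.0
    return min(1.0, max(0.0, frames_read / total_frames))


def apply_volume(samples, volume):
    """Scale signed 16-bit samples by `volume`, clipping to the valid range."""
    return [max(-32768, min(32767, int(s * volume))) for s in samples]


# A progress bar 200 pixels wide, a quarter of the way through the song:
bar_pixels = int(200 * playback_fraction(250_000, 1_000_000))

# Halve the volume of a small buffer before handing it to the playback library:
quieter = apply_volume([1000, -2000, 30000], 0.5)
```

The clipping step matters once the volume factor goes above 1.0; without it, loud samples would overflow the 16-bit range and wrap around.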
A very good choice seems to be PortAudio which is used by Audacity, and also recommended in the accepted answer of the question I mentioned above.
I've done audio apps in both Java and C++. Java development goes way faster because it's a more powerful language and has garbage collection, but Java Sound is a pretty awful solution for audio. Of course, there are wrappers for FFmpeg and other stuff, so you can get a lot of things working. Here's an example of a Java audio app: http://www.indabamusic.com/help/mantis
OTOH, C++ gives you lots of control, low latency and a wealth of libraries. (Another answer mentioned PortAudio, which is, indeed, great.) But you will definitely find it also has a much longer development cycle.
You can certainly do everything you want to do with either language.
I'm trying to develop an online application where the user writes some text and the software sings it back to the user.
I can currently generate the audio file with the words spoken by the computer using espeak, but I have no idea how to make it sound like a song, how to add rhythm to it.
I'm able to change the pitch and tempo using rubberband, but that's as far as I've gotten.
Does anyone have a clue how to make this happen?
If you want to use rubberband to change duration and pitch, then I think the hard part is going to be mapping from phonemes/syllables in the text to the corresponding audio ranges in the speech synthesis output, for which I have no simple suggestion. (Ideally you'd get inside the speech synthesiser so that it would provide you with the mapping from phonemes to audio location.)
A simpler alternative might be to try Speech Synthesis Markup Language (SSML). Its "prosody" element has "pitch" and "duration" attributes that can specify pitch in Hz and duration in seconds as absolute values. You can also specify volume, for controlling dynamics.
Given this, you could try to convert the text into an SSML document, and mark up words/syllables/phonemes with pitch, duration and volume attributes.
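As a rough sketch of that conversion, the snippet below turns a list of (text, pitch, duration) tuples into an SSML string using the `prosody` element's `pitch` and `duration` attributes. The function name and the note values are made up, and whether a given synthesizer actually honours these attributes is an assumption you would have to verify.

```python
import xml.etree.ElementTree as ET


def notes_to_ssml(notes, lang="en-US"):
    """Build an SSML string from (text, pitch_hz, duration_s) tuples."""
    speak = ET.Element("speak", {
        "version": "1.0",
        "xmlns": "http://www.w3.org/2001/10/synthesis",
        "xml:lang": lang,
    })
    for text, pitch_hz, duration_s in notes:
        # One <prosody> per word or syllable, with absolute pitch and duration.
        prosody = ET.SubElement(speak, "prosody", {
            "pitch": f"{pitch_hz:g}Hz",
            "duration": f"{duration_s:g}s",
        })
        prosody.text = text
    return ET.tostring(speak, encoding="unicode")


# Hypothetical melody: each syllable gets a note frequency and a length.
melody = [("twin", 262, 0.5), ("kle", 262, 0.5), ("twin", 392, 0.5), ("kle", 392, 0.5)]
ssml = notes_to_ssml(melody)
```

The remaining work is deciding the note and length for each syllable, which is where the rhythm of the song comes from.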
I've ended up using Festival's singing mode. It sounds reasonably good, except for the fact that it only works with English voices.
There is a 2D game based on Direct3D. This game has a lot of graphics and animations. What is the best way to extract animation image sequences from the running game (e.g. using a memory dump)? Are there any special tools for such purposes?
Depending on what you call 'the best':
FRAPS - http://www.fraps.com/
Allows you to capture screen shots which you can edit the frames out of.
Alternatively you may be able to use graphical debugging tools like PIX (http://msdn.microsoft.com/en-us/library/bb173085(VS.85).aspx) to capture the graphical commands and pull the textures out directly (games often disable PIX support on release though).
Or, try and pull the images directly out of the files (they have to be loaded somewhere and file formats are usually pretty easy to reverse engineer).
NB: I'm assuming that by 2D game you don't actually mean 3D assets with 2D gameplay.
I don't know if it works in full-screen mode, but with a desktop screen recorder tool like CamStudio you can record the animation in uncompressed AVI format.
With an extra tool for video processing you can do whatever you want with the captured frames.
There is a tool that can extract the resource files from many popular games and binary formats: Game File Explorer.
It saves you the trouble of screen grabbing.