I am trying to challenge myself by writing a small meeting tool as a learning project, and I am stuck at the audio part. My audio stack is working: each user sends an Opus stream (20 ms packets) to the server. I am now thinking about how to handle the audio on the server. The desired outcome: every user receives everyone's audio except their own (so they do not hear themselves). I want the meeting to support as many concurrent users as possible.
I have the following ideas, but none feels quite right:
1. Send all audio streams to all users. This means more traffic, and the mixing would be done on the client side.
2. Mix the audio on a per-user basis. With n users, the server would need to do n encodings per frame.
3. Mix all audio together for all users and send that one mix to everyone. Each user would also receive a second Opus packet containing their "own" audio as it was sent to the server (or that packet would be numbered and stored on the client side, so it does not need retransmission). I don't know whether, after decoding, I can remove the "own" audio from the mix without getting unclean audio.
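For illustration only, the second idea (a per-user mix) can be sketched on already-decoded PCM: sum all decoded frames once, then derive each user's mix by subtracting that user's own decoded frame from the total before encoding. The function name `mix_minus`, the user names, and the 16-bit integer sample format are assumptions, and Opus decoding and encoding are left out entirely. Python is used here only to keep the sketch short.

```python
from typing import Dict, List

INT16_MIN, INT16_MAX = -32768, 32767

def mix_minus(frames: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Return, for each user, the sum of everyone else's samples."""
    if not frames:
        return {}
    length = len(next(iter(frames.values())))
    # Sum all streams once, so the work grows linearly with the number of users.
    total = [0] * length
    for samples in frames.values():
        for i, s in enumerate(samples):
            total[i] += s
    mixes = {}
    for user, samples in frames.items():
        # Remove the user's own contribution, then clip to the 16-bit range.
        mixes[user] = [max(INT16_MIN, min(INT16_MAX, t - s))
                       for t, s in zip(total, samples)]
    return mixes

# Hypothetical decoded frames (3 samples each instead of a full 20 ms frame).
frames = {"alice": [100, -200, 300], "bob": [10, 20, 30], "carol": [1, 2, 3]}
mixes = mix_minus(frames)
```

In this sketch the subtraction happens on the server's own decoded samples, before anything is re-encoded, so it is plain integer arithmetic. Each user's mix would still need its own encoding pass afterwards.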
How is this normally done? Are there options I am missing? I have profiled all the steps involved: the most expensive part is encoding (one 20 ms encoding takes about 600ns), while decoding and mixing take almost no time at all (5-10ns for each step).
Currently I would prefer option 3, but I cannot find any information on whether the audio will be clean or whether it will sound washy or crackle.
The whole thing is written in C++, but I did not include the tag since I don't need code examples, just information on this topic. I tried googling a lot and read a lot of the Opus documentation, but did not find anything related to this.
I'm using YouTube's "auto-generated" captions feature to generate transcripts of mp3 files. I do this by first converting the mp3 to a blank mp4, uploading to YouTube, waiting for the auto generated captions to appear, then extracting the SRT file.
The issue I'm having though is that a few of the mp3 files I've uploaded have been flagged as having copyrighted content, and as such no auto-generated captions have been made for them.
I have no desire to publish the mp3s on YouTube, they're uploaded as unlisted videos and all I require are the SRT files. Is there a way to manipulate the audio to bypass YouTube's content ID system? I've tried altering the pitch in Audacity, but it doesn't matter how subtle or extreme the pitch change is, they're still flagged as having copyrighted content. Is there anything else I can do to the audio other than adjusting the pitch that might work?
I'm hoping this post doesn't breach any rules on here, and I can't stress enough that I'm not looking to publish these mp3s, I just want the auto-generated SRTs.
No one can know how to cheat on Content ID
Obviously, since Content ID is a proprietary algorithm developed by Google, no one outside the company can know for sure how it detects copyrighted audio in a video.
But, we can assume that one of the first things they did was to make their algorithm pitch-independent. Otherwise, everyone would change the pitch of their videos and cheat on Content ID easily.
How to use Youtube to get your subtitles anyway
If I am not mistaken, Content ID blocks you because of musical content, rather than vocal content. Thus, to address your original problem, one solution would be to detect musical content (based on spectral analysis) and cut it from the original audio. If the problem is with pure vocal content as well, you could try to filter it heavily and that might work.
Other solutions
Since YouTube is owned by Google, why not directly use the Speech API that Google offers, which most likely performs the audio transcription on YouTube? If the results are not satisfying, you could try other services (IBM, Microsoft, Amazon and others have their own).
I have built a source client using PortAudio and LAME which streams the microphone input to an Icecast server, to be listened to online via the HTML5 audio tag. I have (supposedly) managed to get the stream quality to MP3 at 320kbps and 44.1kHz, and I am looking for a way to confirm this using tests and/or benchmarks.
I have an indication that these stats are somewhat correct from looking at stream inspectors in software such as iTunes and VLC, but I am looking to get a more in-depth data set.
What I basically want is to be able to test how much of the original file is being lost over the stream and if or how much the quality changes depending on environmental conditions of the broadcaster or streamer.
Does anyone know of any tools or frameworks that can produce hard numbers or representations of this data?
If VLC tells you the stream is 320kbit CBR, then it is.
It sounds like what you're looking for is a comparison of the actual audio content. This is highly subjective. MP3 is built to use features of how our hearing works to save bandwidth. For example, quiet sounds are masked by loud sounds. High frequencies are harder to hear and are simply rolled off.
You can compare the spectral analysis between the original PCM-sampled waveform and the MP3 decoded waveform, but this doesn't tell you how humans interpret that sound. For that, you would have to survey humans.
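As a rough sketch of such a spectral comparison, the snippet below computes the magnitude spectrum of two equal-length sample blocks with a naive DFT and reports the per-bin level difference in dB. The function names are made up for illustration, the input is a synthetic two-tone signal rather than real audio, and the code assumes the original and decoded signals are already time-aligned and at the same sample rate (MP3 encoding and decoding typically introduce a delay that would have to be compensated first).

```python
import cmath
import math
from typing import List

def magnitude_spectrum(samples: List[float]) -> List[float]:
    """Naive O(n^2) DFT magnitudes for bins 0 .. n/2 (fine for short blocks)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags

def spectral_difference_db(original: List[float], decoded: List[float],
                           floor: float = 1e-9) -> List[float]:
    """Per-bin level change in dB (negative: the decoded signal lost energy)."""
    a = magnitude_spectrum(original)
    b = magnitude_spectrum(decoded)
    # The floor avoids log(0) and treats near-silent bins as unchanged.
    return [20 * math.log10(max(y, floor) / max(x, floor)) for x, y in zip(a, b)]

# Synthetic stand-in for real audio: two tones, the higher one halved in "decoded".
n = 64
original = [math.sin(2 * math.pi * 4 * i / n) + math.sin(2 * math.pi * 20 * i / n)
            for i in range(n)]
decoded = [math.sin(2 * math.pi * 4 * i / n) + 0.5 * math.sin(2 * math.pi * 20 * i / n)
           for i in range(n)]
diff = spectral_difference_db(original, decoded)
```

Here `diff[4]` stays at roughly 0 dB while `diff[20]` drops by about 6 dB, which is the kind of high-frequency loss the comparison is meant to expose. As noted above, such numbers still say nothing about how the result sounds to a listener.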
I'm developing a music site that will stream audio files stored on a server to users. The files will be played through a Flash player placed in a web page.
I have heard that I need a streaming media server to stream audio files (around 2 MB to 3 MB in size). Do I need one?
I found some streaming media server software such as http://www.icecast.org, but according to its documentation it is meant for radio stations and live streaming. I just need to stream audio files quickly, at a small size (low bandwidth) and with good quality.
I heard I need to encode the audio files first and then send them to listeners, and that on their end the audio files need to be decoded again. Is that true? How can I do that? If I need a special web server, where should I host my files? Are there any good hosting providers?
If I host audio files on a normal web server, they will be delivered to users/listeners over HTTP or TCP, but I found that HTTP and TCP are not good for multimedia purposes like streaming audio and video files, and that they are meant for delivering HTML and the like. I found I should use RTSP or UDP for streaming audio files. What should I use?
I know that MP3 files have much better quality than other formats, but they also make the audio files huge. Which format should I use for audio files?
Most of the best-quality audio files are more than 7 MB, so I'm planning to convert them myself with some software to get smaller files with a reasonable level of quality. If I convert my audio files, what is a good bitrate to use?
Is there any well-regarded software for converting audio files while keeping the quality at a good level?
Note: I know that I will not have complex requirements at the beginning, but I want to know the best approaches, like the ones used by soundcloud.com.
Here's a reply from someone who actually runs a Shoutcast radio station and is an audio technician and web designer. Below is knowledge gathered from over 5000 hours of up-to-date research!
6)
Audio software?
You need software that can:
Convert to other bitrates and formats.
Normalize the audio volume to the same level for all MP3s (-1 dB).
Cut off silence at the beginning and/or end.
Equalize the audio so it sounds good.
Add effects, mix, etc.
The best, most-used, very solid and FREE option is "Audacity".
5)
Good bitrate?
If the bitrate is too high, your listeners on slower connections will suffer from "buffer underruns",
i.e. hiccups / short breaks in the audio, because their connection can't keep up with the (too high) bitrate.
If it's too low, the quality is no good.
The best choice is 128 kb/s: it sounds good and won't cause underruns for most listeners.
The best format is MP3, since it is the format that most players and Shoutcast providers can handle.
With the settings above, your average file size for a 4-minute track will be around 4 MB.
Since MP3 at 128 kb/s is the most popular, you will get the best price/quality deal
from a Shoutcast server provider.
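The file-size figure follows directly from the bitrate. A small sketch of the arithmetic (the helper name is made up for illustration, and tag or container overhead is ignored):

```python
def mp3_size_bytes(bitrate_kbps: float, seconds: float) -> float:
    """Size of a constant-bitrate stream: bits per second / 8 * duration."""
    return bitrate_kbps * 1000 / 8 * seconds

size = mp3_size_bytes(128, 4 * 60)   # 128 kb/s for a 4-minute track
print(f"{size / 1e6:.2f} MB")        # prints "3.84 MB"
```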
5b)
Audio tagging?
You forgot that one.
You need to make sure your audio files are "tagged", i.e. what is displayed in the
players as "Artist - Title" information is not taken from the file name but from the (ID3v1/ID3v2) tag.
The best, most-used, very solid and FREE software is "Mp3tag".
It can also work in bulk (1000 MP3s at once).
http://www.mp3tag.de/en/
4)
Codec?
You upload your files to a server in the format described above, "MP3 at 128 kb/s";
since it's the most-used format, all players can play it.
Make sure you upload in the same format (above) as the output of the server;
this keeps the processor load on your server low, which is important (it won't need to convert).
A Shoutcast server (or other stream server) will take your separate MP3s and turn them
into one single real-time stream, and it will serve that stream to multiple listeners (hundreds).
It will also provide you with statistics (number of listeners, where they are from, now playing, played before).
A listener can play it in two ways:
a- From a player embedded on your website.
b- Or by clicking a link on your website, which will open your stream in any (standalone) player
your visitor has installed (Winamp, Windows Media Player, RealPlayer, QuickTime, iTunes, etc.).
A standalone player will give the best quality because it has more/better audio controls (equalizer, etc.).
Best practice is to offer BOTH an embedded player and a simple clickable link.
Check out at least 20 radio station websites (both professional and amateur)
to see how they do it.
The best free embedded player right now is "jPlayer",
because it's dual-mode (HTML5 / Flash), so ALL BROWSERS and ALL MOBILES will play it,
and it's very well supported with a forum, tutorials, etc.
http://www.jplayer.org
2)
Hosting providers?
Google for "Shoutcast streaming" or "Shoutcast server".
Compare 20 of them for the best price/quality, and research them again using Google.
They will have special web-based Shoutcast software such as "Centova".
You control it from any browser; you can stream live to it, or create playlists that play unattended from the server while you sleep ("AutoDJ").
You can create multiple playlists that play at certain times, on certain days, at random, etc.
You could build your whole station on AutoDJ playlists only;
that way you will not have to worry about your own upload connection being interrupted,
and you can shut off your own PC.
For AutoDJ you want a Shoutcast service with at least 5 GB of storage (MP3s);
that will give you around 3 to 4 days of music without repeats. By using the playlists in a clever way,
and taking into account that listeners will on average listen for between 30 minutes and 2 hours at certain times, you can make sure that they will not hear the same tracks all the time.
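The "3 to 4 days" estimate can be checked with the same kind of bitrate arithmetic. This sketch assumes 5 GB means 5 x 10^9 bytes and that every file is constant-bitrate MP3 at 128 kb/s; the helper name is hypothetical.

```python
def playtime_hours(storage_bytes: float, bitrate_kbps: float) -> float:
    """Hours of audio that fit into the given storage at a constant bitrate."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return storage_bytes / bytes_per_second / 3600

hours = playtime_hours(5e9, 128)
days = hours / 24
print(f"{hours:.0f} hours, about {days:.1f} days")
```

This comes out at a little under 87 hours, which is consistent with the figure quoted above.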
If you insist on doing a "live" (real-time) broadcast (streaming) from your OWN computer (directly or via a stream server provider), then the most-used software is "SAM Broadcaster".
That is it: start with a good Shoutcast server provider, then build your website and create a clickable link to the stream; after that, add the embedded player.
To begin, let me clarify my understanding of your needs. Please add a comment and clarify in your question if these are wrong:
You intend to build a site that will play audio
Audio will not be one continuous stream, but will be made up of individual files
Your audio will generally be music
Now, on to your questions:
(1) I have heard that I need a streaming media server to stream audio files (around 2 MB to 3 MB in size). Do I need one?
(3A) If I host audio files on a normal web server, they will be delivered to users/listeners over HTTP or TCP, but I found that HTTP and TCP are not good for multimedia purposes like streaming audio and video files, and that they are meant for delivering HTML and the like.
Nonsense. Streaming media servers, such as SHOUTcast/Icecast, are actually just HTTP servers that send content as it comes in from an encoder. The client doesn't know the difference between it and HTTP. Metadata is interleaved into the content stream at the client's request (made with a special request header), but it is still compatible with HTTP.
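To illustrate the interleaving, here is a minimal sketch of how a client could separate audio from metadata. It assumes the commonly used SHOUTcast-style convention: the client sends an `Icy-MetaData: 1` request header, the server answers with an `icy-metaint` header giving the number of audio bytes between metadata blocks, and each block starts with one length byte counted in units of 16 bytes. The function name and the sample bytes are illustrative only.

```python
from typing import List, Tuple

def split_icy_stream(body: bytes, metaint: int) -> Tuple[bytes, List[str]]:
    """Separate audio bytes from interleaved ICY metadata blocks."""
    audio = bytearray()
    titles: List[str] = []
    pos = 0
    while pos < len(body):
        # First come `metaint` bytes of audio ...
        audio += body[pos:pos + metaint]
        pos += metaint
        if pos >= len(body):
            break
        # ... then one length byte; the block is length * 16 bytes, NUL padded.
        meta_len = body[pos] * 16
        pos += 1
        if meta_len:
            block = body[pos:pos + meta_len].rstrip(b"\x00")
            titles.append(block.decode("utf-8", errors="replace"))
        pos += meta_len
    return bytes(audio), titles

# Fake response body: 8 audio bytes, a 32-byte metadata block, 8 audio bytes,
# an empty metadata block (length byte 0), then 4 more audio bytes.
meta = b"StreamTitle='Artist - Title';".ljust(32, b"\x00")
body = b"A" * 8 + bytes([2]) + meta + b"B" * 8 + bytes([0]) + b"C" * 4
audio, titles = split_icy_stream(body, metaint=8)
```

A client that never sends the request header simply receives the audio bytes without any metadata blocks, which is why an ordinary HTTP client can play the same stream.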
HTTP is a protocol that is good for transferring any type of content. Ever download something from a website? That would have been with HTTP.
If it's good enough for YouTube, Sound Cloud, Pandora, and just about everyone else, it's probably good enough for you as well, 'eh?
(3B) I found I should use RTSP or UDP for streaming audio files. What should I use?
TCP is an underlying network protocol that ensures reliable transmission. Packets are received in the proper order, and are acknowledged so that any lost packets can be re-transmitted. There is some overhead with this. The reason UDP is sometimes used is that it provides lower latency at the cost of being unreliable. This is fine for telephony communications, but is pointless for media that is not time sensitive, such as a bunch of audio files coming from a server. In fact, if you get a few too many corrupt packets, your audio player will often simply stop decoding the file, and would need to be restarted.
RTSP is way overkill for your needs. It supports a bunch of stuff for media control, varying the bitrate on the fly, etc. This is not appropriate for your situation. Perhaps if you were streaming live video, or lengthy content, this would be more appropriate.
(2) I heard I need to encode the audio files first and then send them to listeners, and that on their end the audio files need to be decoded again. Is that true? How can I do that? If I need a special web server, where should I host my files? Are there any good hosting providers?
You need to pick a codec for encoding audio that the client supports. I assume you will be using HTML5 with a Flash fallback. Unfortunately, there is no codec available that is universally supported. See the chart here: http://html5doctor.com/html5-audio-the-state-of-play/#support
(4) I know that MP3 files have much better quality than other formats, but they also make the audio files huge. Which format should I use for audio files?
Check your assumptions at the door, you are very wrong here. Keep in mind that the raw PCM data is often 8 times larger than MP3 (depending on chosen bitrate of course). In any case, you will want to encode to AAC, MP3, and Vorbis for widest client compatibility. aacPlus is an extension of AAC and is generally considered the standard for decent quality audio at relatively low bitrates. A 128kbit stream in AAC will sound better than a 128kbit stream in MP3.
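For a sense of scale, the raw PCM bitrate can be computed from the sample format and compared with typical MP3 bitrates. This sketch assumes CD-quality audio (44.1 kHz, 16-bit, stereo); the helper name is made up for illustration.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Uncompressed PCM bitrate in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000

cd_pcm = pcm_bitrate_kbps(44100, 16, 2)   # 1411.2 kbit/s
for mp3_kbps in (128, 192, 320):
    ratio = cd_pcm / mp3_kbps
    print(f"{mp3_kbps} kbit/s MP3 is about {ratio:.1f}x smaller than PCM")
```

The ratio ranges from roughly 4x at 320kbit to roughly 11x at 128kbit, so the exact factor depends on the chosen bitrate, as noted above.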
(5) Most of the best-quality audio files are more than 7 MB, so I'm planning to convert them myself with some software to get smaller files with a reasonable level of quality. If I convert my audio files, what is a good bitrate to use?
This question is very subjective. Personally, as a musician and audiophile, I prefer to hear stuff in its original quality. I use FLAC for compressing my music library, as the quality is lossless. For your needs, this will take up way too much bandwidth. Most folks don't know the difference between a 128kbit MP3 and the original. Many "premium" internet radio stations offer 128kbit aacPlus and 256kbit MP3. Pandora offers 96kbit MP3 for regular users, and 192kbit MP3 for premium users. Experiment, and pick a set of bitrates that work well for you and users.
Always keep the original around. It doesn't have to be on your servers, but you need it. If you re-compress a file that was already lossy compressed, then you are losing additional quality. If you make 3 compressed versions of one source, make sure you're doing so from the original source.
(6) Is there any well-regarded software for converting audio files while keeping the quality at a good level?
If it is legal for you to use, take a look at FFmpeg. It can handle just about any codec you can think of. As a word of caution though, do look into it to make sure you are paying all of the license fees necessary. Some of the codecs contained within are patented. I'm not a lawyer, and have yet to be able to figure out the legalities of using them on a commercial site. All I know is that it is heavily debated.
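As an illustration, the sketch below builds FFmpeg command lines that encode one original file into MP3, AAC, and Vorbis, always starting from the same source. It only constructs the argument lists and does not run anything. The file name, the 128k bitrate, and the helper name are placeholders, and whether the `libmp3lame` and `libvorbis` encoders are available depends on how your FFmpeg binary was built.

```python
from pathlib import Path
from typing import Dict, List

# Encoder name and output extension per target format.
TARGETS = {
    "mp3": ("libmp3lame", ".mp3"),
    "aac": ("aac", ".m4a"),
    "vorbis": ("libvorbis", ".ogg"),
}

def ffmpeg_commands(source: str, bitrate: str = "128k") -> Dict[str, List[str]]:
    """Build one ffmpeg argument list per target, all reading the same original."""
    src = Path(source)
    commands = {}
    for name, (encoder, ext) in TARGETS.items():
        out = src.with_suffix(ext)
        # -vn drops any video/cover-art stream, -c:a picks the audio encoder,
        # -b:a sets the audio bitrate.
        commands[name] = ["ffmpeg", "-i", str(src), "-vn",
                          "-c:a", encoder, "-b:a", bitrate, str(out)]
    return commands

cmds = ffmpeg_commands("track.flac")
# To actually run one of them: subprocess.run(cmds["mp3"], check=True)
```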
I've been using http://www.yagosta.com for years for a music company client. Free service and SSssooooo easy. Requires NO tech knowledge. I haven't updated this site in several years but you can see what it looks like at the following link. They probably have plenty of new designs which you can customize too. Perfectly adequate for most requirements.
http://www.bluedotmusic.net/selector01.html
I want a player (easy enough to put up) that plays back a directory of MP3s in such a way that if you join at 3:33:33 pm, you hear what everyone else hears at that moment, not track one: a kind of pseudo broadcast/stream. How do I achieve that? What looks nice, can probably be minimized, and is easy?
I am trying to use mirvling, but with no luck. Any ideas?
It's unlikely you're going to find something to drop in place. Plus, this isn't typically handled on the client side of things. You neglected to specify what languages and what not that you are using, so I'll provide a general answer.
There are two methods to accomplish this.
Method 1: Encode the stream on the server
Basically with this, you create an audio stream on the server that is made up of the audio files being played back. The clients play an audio stream like any traditional "live" internet radio station, without knowledge of how the stream was created. You can use SHOUTcast/Icecast for the servers, and a number of different source stream encoders, such as Ices.
Method 2: Make the media available and let the clients figure it out
For this, you'll be starting from scratch. Have a JSON feed or similar served up that contains a playlist of the audio files that should be played and when. On the client side, you can use JWPlayer or similar, and seek to the desired position of the current track when it starts, and then play tracks in order from there.
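The core of Method 2 is working out which track should be playing, and how far into it, at a given wall-clock time. Here is a minimal sketch of that calculation, assuming the feed provides each track's duration in seconds and a shared start timestamp, and that the playlist loops forever. The function name and the numbers are hypothetical, and in practice the same arithmetic would run in the client's script before it seeks the player.

```python
from typing import List, Tuple

def current_position(durations: List[float], started_at: float,
                     now: float) -> Tuple[int, float]:
    """Return (track index, offset in seconds) for a looping playlist."""
    total = sum(durations)
    # Wrap around so the playlist repeats once it reaches the end.
    elapsed = (now - started_at) % total
    for index, length in enumerate(durations):
        if elapsed < length:
            return index, elapsed
        elapsed -= length
    # Only reachable through floating-point rounding at the very end.
    return len(durations) - 1, durations[-1]

durations = [180.0, 240.0, 200.0]   # track lengths from the playlist feed
index, offset = current_position(durations, started_at=1000.0, now=1200.0)
# 200 s after the start: track 0 (180 s) is over, so track 1 at 20 s.
```

Every client that evaluates this with the same start timestamp and a reasonably accurate clock lands on the same track and offset, which is what produces the "pseudo broadcast" effect.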
I'm working on a project where we have many small audio files of around 500-600k. Then there are audio files of around 15M.
The 15M files are full narrated articles. The smaller ones are individual sentences within the article.
There are going to be many users and many articles in the future.
I want to be able to load the audio files relatively fast -- either through pre-loading or streaming or something of that nature. Basically if a user clicks on a button -- I want the audio to start more or less immediately.
What are my options here? Red5? Icecast?
EDIT:
I'd like to avoid Flash if at all possible, but I'm not opposed to it. I definitely can't use HTML5 audio, as much as I'd like to.
I've already tried issuing GET requests for the files on document load; there are usually 15-20 per page (19 small files, one big one). That doesn't seem to work as well as I thought it might.
In terms of latency, I'm looking for push-button instant play. Right now I can count to 2 or 3 for the small files and 6-7 for the big one. Would Flash be able to do this?
Streaming solutions such as Icecast are not appropriate here. All you need is simple HTTP.
You don't mention what you are using to play these files on the client side. If you are doing this in Flash, it is relatively simple to preload, or to play while the download is still running.
For audio compression, you should be using MP3. For speech, you can easily get away with a lower bitrate. 48kbit 44.1kHz Mono is generally acceptable. This will load fine, even on decent mobile connections.
In any case, HTTP is the way to go. That way you can request the separate files easily. Icecast is for a single stream that runs for a while, such as internet radio.
OK, so I did some investigating and figured out what the competition was using.
It was this:
http://www.schillmania.com/projects/soundmanager2/
Basically, it tries to use HTML5 audio tags with the ever-so-helpful 'preload=true' flag set, and if it can't do that, it falls back on Flash to preload the MP3.