How to anonymize (mask) audio (human voice) using javascript - audio

I'm hoping to record the audio of some stories from remote study participants via web browsers. I would like to give them an option of anonymizing their voices before they submit their audio clips. Is there a way to do that in JavaScript (or with any other library, for example in Python, that I can invoke in the background on the server) before serving the clip back to the participant to verify before they submit?
This YouTube video comes really close to what I would like to accomplish. Thanks in advance for your suggestions and advice!

You can use the PitchShifter object of SoundTouchJS to change the 'key' of the input, and even the playback rate if necessary. It might be helpful to run further convolvers against the AudioNode as well, to further anonymize it.
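
For the server-side route the question mentions, a minimal Python sketch of the "change the playback rate" idea is shown here. It is plain resampling with linear interpolation, so pitch and speed change together; it is not the algorithm SoundTouchJS uses. The function names semitone_ratio and resample are made up for this illustration, and the input is assumed to be mono PCM samples already decoded into a list.

```python
def semitone_ratio(semitones):
    """Frequency ratio for a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def resample(samples, factor):
    """Read through `samples` `factor` times faster, using linear interpolation.

    Played back at the original sample rate, the result has its pitch
    multiplied by `factor` and its duration divided by `factor`.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    last = len(samples) - 1
    n_out = int(last / factor) + 1
    out = []
    for i in range(n_out):
        pos = i * factor            # fractional read position in the input
        lo = min(int(pos), last)    # sample at or before the read position
        hi = min(lo + 1, last)      # next sample, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out
```

For example, resample(samples, semitone_ratio(-4)) lowers the voice by four semitones and makes the clip roughly 26% longer.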


Detect different speakers in an audio recording

I want to make an application that counts the speaking time of each speaker in an audio recording. I don't care about doing full voice recognition and transcribing every word in the recording, I just want the speaking time of each voice.
Is there a piece of software that provides such feature?
If possible, I would like to avoid using a third-party service (such as Google Cloud) to achieve this, and I would like the solution to be light enough to run on a modern smartphone.
Thank you for your help.
I had the same idea. Check this out
Haven't tried it myself yet. Will add an edit after.

Mixing OPUS Audio on a server for a meeting

I am currently trying to challenge myself by writing a little meeting tool in order to learn. I am stuck on the audio part. My audio stack is working: each user sends an Opus stream (20 ms packets) to the server. I am now thinking about how I will handle the audio. The outcome should be: all users receive all audio except their own (so nobody hears themselves). I want the meeting to support as many concurrent users as possible.
I have the following ideas, but none feels quite right:
1. Send all audio streams to all users. This means more traffic, and mixing is done on the client side.
2. Mix the audio on a per-user basis, which means that with n users I would need to perform n encodings for each frame.
3. Mix all audio together for all users and send it to everyone. Each user also receives a second Opus packet containing their "own" audio packet that was sent to the server (or the packet is numbered and stored on the client side so it does not need retransmission). I don't know whether, after decoding, I can remove the "own" audio from the stream without getting unclean audio.
How is this normally done? Are there options I am missing? I have profiled all steps involved: the most expensive part is encoding (one 20 ms encoding takes about 600 ns), while decoding and mixing take almost no time at all (5-10 ns for each step).
Currently I would prefer option 3, but I cannot find any information on whether the audio will be clean or whether it will sound washed out or crackly.
The whole thing is written in C++, but I did not include the tag since I don't need code examples, just information on this topic. I tried googling a lot and read a lot of the Opus documentation, but did not find anything related to this.
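
For reference, the second option above (a separate mix per user) can be computed on decoded PCM by summing every stream once and then subtracting each user's own samples, so the mixing cost stays linear in the number of users. Because the subtraction happens before encoding, it is exact, which is not guaranteed when subtracting after a lossy Opus decode as in the third option. The sketch below is in Python rather than C++ purely for illustration; the names mix_minus and clamp16 and the dict-of-lists layout are assumptions, and each output frame would still need its own encode.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def clamp16(value):
    """Clip a mixed sample to the 16-bit PCM range to avoid wrap-around."""
    return max(INT16_MIN, min(INT16_MAX, value))


def mix_minus(frames):
    """Build one mix per user that contains everyone except that user.

    `frames` maps a user id to that user's decoded samples for one 20 ms
    frame (lists of ints, all the same length).
    """
    if not frames:
        return {}
    n = len(next(iter(frames.values())))
    total = [0] * n
    for samples in frames.values():
        for i, sample in enumerate(samples):
            total[i] += sample      # full mix, computed once
    return {
        user: [clamp16(total[i] - sample) for i, sample in enumerate(samples)]
        for user, samples in frames.items()
    }
```

The hard clip in clamp16 is the simplest way to keep the sum in range; a real mixer might prefer attenuation or a limiter.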

Make microphone record system audio output

So, yesterday I was on a call with my friend and I wanted to play him a sound from a game, but I had no idea how, because he couldn't hear my sound. Is there a way to route all the system audio into the microphone input, while keeping the normal output so I can still hear it? Thanks for the help!
Also, please make your answer noob-friendly, because I am pretty bad with this. Thanks!
Other stuff:
I want to direct the output into the microphone, not just the calling program (it's Skype by the way).
Please don't answer that I should play the output through speakers and record it that way, because I don't even have speakers, just headphones.

mp3 website player with synchronized playback (not streaming)

I want a player (easy enough to put up) that plays back a directory of MP3s in such a way that if you join at 3:33:33 pm, you hear what others hear, not track one, like a pseudo broadcast/stream. How do I achieve that? What looks nice, is probably minimizable, and is easy?
I am trying to use mirvling but no such luck. Any ideas?
It's unlikely you're going to find something to drop in place. Plus, this isn't typically handled on the client side of things. You neglected to specify which languages and tools you are using, so I'll provide a general answer.
There are two methods to accomplish this.
Method 1: Encode the stream on the server
Basically with this, you create an audio stream on the server that is made up of the audio files being played back. The clients play an audio stream like any traditional "live" internet radio station, without knowledge of how the stream was created. You can use SHOUTcast/Icecast for the servers, and a number of different source stream encoders, such as Ices.
Method 2: Make the media available and let the clients figure it out
For this, you'll be starting from scratch. Have a JSON feed or similar served up that contains a playlist of the audio files that should be played and when. On the client side, you can use JWPlayer or similar, and seek to the desired position of the current track when it starts, and then play tracks in order from there.
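
The "seek to the desired position" step in Method 2 comes down to a small calculation. The sketch below assumes a looping playlist, a shared start time (epoch) that every client knows, and track durations supplied by the feed; the function name playback_position and these parameters are illustrative and not part of any player's API.

```python
def playback_position(now, epoch, durations):
    """Return (track_index, offset_seconds) for the shared timeline.

    `now` and `epoch` are timestamps in seconds; `durations` lists the
    length of each track in seconds, in playlist order. The playlist is
    assumed to loop forever starting at `epoch`.
    """
    total = sum(durations)
    if total <= 0:
        raise ValueError("playlist must have a positive total duration")
    elapsed = (now - epoch) % total     # position within the current loop
    for index, duration in enumerate(durations):
        if elapsed < duration:
            return index, elapsed
        elapsed -= duration
    # Only reachable through floating-point rounding at the very end.
    return len(durations) - 1, durations[-1]
```

A client would fetch the playlist, call this with the server's current time, start the returned track at the returned offset, and then play the following tracks in order.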

How to check that ads are being played, before the real video plays?

I am working on a site that airs ads before the real video plays.
The business requirement is that the ads should play before the video plays.
I am using Watir for testing. Can you help me in this regard?
You may want to investigate Sikuli. I've seen other threads where people were using it in combination with Watir to work with things like Flash. However, since it works based on visual recognition, I expect it would not work at all with video while it is playing (a changing image that might only be 'right' for a fraction of a second), unless there is some aspect of the screen that is relatively static and could be used to tell that video playback is in progress. See this blog posting for more info.
