How to find time difference/lag between two audio signals?

I need to find the time difference/lag between two audio signals.
Background info: it is a sample multimedia application with a server and a client. An .mp3 file is sent as streaming data to the client. Two TI boards are used, one as the server and the other as the client. The server board also listens, and the client also listens; I need to find the difference between these two audio signals, the remote one and the local one.
OS used : Linux
I found the Audacity tool for capturing signals. Does it have the capability to capture the two signals and find the difference between them?
(or)
Does cross-correlation have to be performed to find the difference between them?
Any ideas are welcome!
Thank you.
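Cross-correlation is the usual approach here. As a minimal sketch in Python with NumPy/SciPy, assuming both captures are saved as WAV files at the same sample rate (the file names below are hypothetical): Audacity can record and display both signals on separate tracks, but as far as I know it has no built-in lag measurement, so a short script is the more direct route.

    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import correlate

    # Hypothetical capture files -- substitute your own recordings.
    fs_a, a = wavfile.read("local.wav")
    fs_b, b = wavfile.read("remote.wav")
    assert fs_a == fs_b, "both captures must use the same sample rate"

    # Mix down to mono floats if the captures are stereo.
    a = a.astype(np.float64).mean(axis=1) if a.ndim > 1 else a.astype(np.float64)
    b = b.astype(np.float64).mean(axis=1) if b.ndim > 1 else b.astype(np.float64)

    # The peak of the full cross-correlation encodes the lag: a positive
    # value means the first capture trails the second.
    xcorr = correlate(a, b, mode="full")
    lag_samples = int(np.argmax(xcorr)) - (len(b) - 1)
    print("lag:", lag_samples, "samples =", lag_samples / fs_a, "s")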

Related

How to stream webcam and record at the same time on linux?

I am trying to run a webcam as an IP cam using a single-board computer such as a Raspberry Pi. I am trying to stream to a web browser and hoping to record the stream on the SBC as well, not on the client side.
I found out that "mjpg-streamer" can be used to either stream or record. Is there any way to do both at the same time? Is there a program that can do that on Raspbian or Armbian? Can I also limit the file size?
I found out that the program motion is very suitable for this use.
Hope this answer helps anyone looking for the same thing.
To elaborate: the program I used is motion
https://motion-project.github.io/motion_config.html
All you need to do is configure the configuration file and start motion. It does what I needed: I can record and stream at the same time, and you can tweak it to your needs.
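A minimal sketch of the relevant options (a sketch only; option names vary between motion versions, e.g. older releases use ffmpeg_output_movies and max_movie_time, so check the configuration reference linked above):

    # /etc/motion/motion.conf -- minimal excerpt
    stream_port 8081            # MJPEG stream you can open in a browser
    movie_output on             # write recordings to disk at the same time
    movie_max_time 3600         # split files every hour, indirectly capping file size
    target_dir /var/lib/motion  # where recordings are stored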

Getting data from audio mixer

I am trying to build an open-source in-ear monitoring system. I have created the UI and was wondering how I would get the channels that are on an audio mixing console so that I can edit the channels and stream them to each musician. Is there a certain protocol that all the mixers use? You can find the project at https://gitlab.com/openstagemix. We would love to have contributors.
I can't really test whether this is the correct answer, as I am stuck in my house during the coronavirus time. But many mixers support something called OSC (Open Sound Control), a protocol for communication between mixers, synthesizers, etc. and computers. You can find more information here: http://opensoundcontrol.org/introduction-osc.
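For illustration, a minimal sketch of sending an OSC message from Python, assuming the third-party python-osc package; the IP address, port, and OSC address path below are hypothetical and mixer-specific, so check your mixer's OSC documentation:

    from pythonosc.udp_client import SimpleUDPClient  # pip install python-osc

    # Hypothetical mixer address and OSC path -- consult your mixer's docs.
    client = SimpleUDPClient("192.168.1.50", 10023)
    client.send_message("/ch/01/mix/fader", 0.5)  # set channel 1 fader to 50%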
Update:
It's neither! I am going to use the AES67 standard to receive audio from my mixer and process it that way, since my mixer is Ethernet-capable.

Convert voip audio to text for debugging

While working on VoIP apps, I usually end up picking up one phone, talking into it, then picking up the other phone and checking whether I hear myself. This gets even trickier when I'm doing apps with three-way calling.
Using a softphone doesn't help.
Ideally, I want to be able to run multiple instances of some command-line SIP UA with which I can dial a number. Once the UA has dialed and the other party has picked up, both agents exchange audio. But instead of my having to listen to the audio, the apps would display some text that identifies the other end, possibly a frequency pattern that can be converted to text. This text is then displayed in the app.
Can something like this be done? I'm creating apps against FreeSWITCH. Ideas on how to debug VoIP apps are also welcome in the comments.
Yes, absolutely. The easiest would be to have a separate FreeSWITCH server that is used for placing the test calls and sending/receiving your test signals.
tone_stream will generate the tones at frequencies that you need: https://freeswitch.org/confluence/display/FREESWITCH/Tone_stream
tone_detect can detect the frequencies and execute actions, or even better, generate events that you can catch over an ESL socket: https://freeswitch.org/confluence/display/FREESWITCH/mod_dptools%3A+tone_detect
The best way to generate such calls is to use a dialer script that communicates with FreeSWITCH via the Event Socket. Here you can see some (working) examples that I made with Perl:
https://github.com/voxserv/rring/blob/master/lib/Rring/Caller/FreeSWITCH.pm -- this is part of a test suite that I built for testing a provider's SIP infrastructure. As you can see, it connects to FreeSWITCH, starts an event listener, and then originates the call while also expecting an inbound call. It then sends and analyzes DTMF.
https://github.com/voxserv/freeswitch-helper-scripts/tree/master/esl -- these are special-purpose dialers, you can also use them as examples.
https://github.com/voxserv/freeswitch-perf-dialer -- this one generates a series of calls, like SIPp does.
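As a minimal sketch of the same Event Socket approach in Python (assuming the ESL bindings shipped with FreeSWITCH, default event-socket credentials, and a hypothetical test extension 5000 whose dialplan runs tone_detect; the exact event name fired by tone_detect may vary by version):

    import ESL  # SWIG bindings built and shipped with FreeSWITCH

    # Default event-socket credentials -- adjust to your installation.
    con = ESL.ESLconnection("127.0.0.1", "8021", "ClueCon")
    if not con.connected():
        raise RuntimeError("cannot reach the FreeSWITCH event socket")

    con.events("plain", "ALL")  # subscribe so we see tone_detect's events

    # Originate a call to a hypothetical test extension and play a
    # 1004 Hz tone for 2 s; the far end is expected to run tone_detect.
    con.api("originate",
            "loopback/5000 &playback(tone_stream://%(2000,0,1004))")

    while True:
        e = con.recvEvent()
        if e and e.getHeader("Event-Name") == "DETECTED_TONE":
            print("tone detected:", e.getHeader("Detected-Tone"))
            break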
Another technique is to play a sample audio file, record the audio received at the other end (call recording), and then compare the two. This works for setups where the systems are located in different places and you are testing end-to-end quality.
There are a lot of audio-comparison tools (such as PESQ) that should help you not just detect the presence of audio but also get statistics about the degradation of various parameters in the audio stream.
This can be extended to run test analyses of FreeSWITCH patches as and when they are released, and for other hooks or quality standards you want to enforce.
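As a sketch of the comparison step in Python, assuming the third-party pesq package and two mono 16 kHz WAV captures with hypothetical file names:

    from scipy.io import wavfile
    from pesq import pesq  # pip install pesq (ITU-T P.862 implementation)

    # Reference file played into the call, and the far-end recording of it.
    fs, ref = wavfile.read("reference.wav")
    _, deg = wavfile.read("recorded.wav")

    # 'wb' = wideband mode for 16 kHz audio; use 'nb' for 8 kHz captures.
    print("PESQ MOS-LQO:", pesq(fs, ref, deg, "wb"))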

What libraries/APIs allow me access real time audio waveforms of a phone call?

I am looking to build an app that needs to process incoming audio on a phone call in real time.
WebRTC allows for this, but I think it works only for their browser-based P2P audio communication functionality, not for phone calls/VoIP.
Twilio and Plivo allow you to record the audio for batch/later processing.
Is there a library that will give me access to the audio streams in real time? If not, what would I need to build such a service from scratch?
Thanks
If you are open to using a media server (so that the call is no longer P2P but is mediated by the media server in a B2B model), then perhaps the Kurento Media Server may solve your problem. Kurento Media Server makes it possible to create processing capabilities that are applied in real time to the media streams. There are many examples in the documentation of computer-vision and augmented-reality algorithms applied in real time to video streams. I've never seen an audio-only processing module, but it should be simple to implement as an additional module, which is not too complex if you have some knowledge of C/C++ and media-processing concepts.
Disclaimer: I'm part of the Kurento development team.

API for manipulating audio output in windows 8

I want to manipulate audio output data, for all the different running applications, before it is sent to the speakers.
Turn the volume up or down, filter the audio, things like that.
How can I gain access to the audio output in real time?
Is there a way to not depend on the audio driver interface?
Thanks! :)
Windows Store apps allow you to use WASAPI. In WASAPI there is a concept of "audio sessions", of which there is one for every stream of audio being sent to the sound card. You can enumerate the audio sessions, which gives you access to IAudioSessionControl. However, this doesn't let you manipulate the audio, which as far as I know WASAPI simply doesn't allow. The best you can hope for is to get hold of ISimpleAudioVolume for each session, but the last time I tried that, I found that you couldn't get hold of the session GUIDs needed to adjust the volume for other processes. You may be able to get hold of the audio endpoints and adjust the master volume for the sound card.
In short, WASAPI is the most powerful audio API for Windows Store apps, but unfortunately I don't think it will let you do much of what you are asking here.
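For what it's worth, on desktop Windows (outside the Store-app sandbox) the per-session model is easy to experiment with via the third-party pycaw wrapper; a minimal sketch of enumerating sessions and adjusting a per-session volume:

    from pycaw.pycaw import AudioUtilities  # pip install pycaw (desktop only)

    # One audio session per application stream on the default endpoint.
    for session in AudioUtilities.GetAllSessions():
        name = session.Process.name() if session.Process else "(system sounds)"
        volume = session.SimpleAudioVolume  # wraps ISimpleAudioVolume
        print(name, "volume:", volume.GetMasterVolume())
        volume.SetMasterVolume(0.5, None)   # example: set each session to 50%

Note that this only adjusts per-session volume, not arbitrary filtering, which matches the limitation described above.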
