Convert OPUS stream to PCM or RTP - voip

Currently, I am developing a VOIP system. I can send the SIP stream to the mumble server, after converting the RTP stream to PCM and the voice quality is fine in mumble client. But, the Mumble stream (OPUS) to SIP is not working, always getting noise. I am converting the mumble channel's stream (OPUS) to RTP and sending it to the SIP client. Is there any library or source code to find out the RTP problem, that is not receiving by SIP?
N.B: After converting The receiving stream into a .wav file, the voice quality is ok, but not working in the SIP client.
OPUS Decoded Code:
public byte[] SendDecodeMumbleDataToMulticast(byte[] encodedData, int uid, int cid)
{
var _decoder = new OpusDecoder(8000, 1) { EnableForwardErrorCorrection = true };
if (encodedData == null)
{
//_decoder.Decode(null, 0, 0, new byte[_sampleRate / _frameSize], 0);
return null;
}
int samples = OpusDecoder.GetSamples(encodedData, 0, encodedData.Length, 8000);
if (samples < 1)
{
return null;
}
byte[] dst = new byte[samples * sizeof(ushort)];
int length = _decoder.Decode(encodedData, 0, encodedData.Length, dst, 0);
if (dst.Length != length)
{
Array.Resize(ref dst, length);
}
RTPPacket rtp = ToRTPPacket(dst, 16, 1);
OnRTPReceived?.Invoke(rtp);
return dst;
}

Related

Video call using PJSUA

I'm using pjsua to create a video call from a monitor to a phone. I'm able to establish an audio call without problem, but if I try to establish a video call (vid_cnt=1), I'm getting an error.
My purpose is to get and save the audio and video of the phone.
This is my configuration:
void hard_account_config(pjsua_acc_config& acc_cfg, pjsua_transport_id transport_tcp) {
pjsua_acc_config_default(&acc_cfg);
acc_cfg.ka_interval = 15;
// VIDEO
acc_cfg.vid_in_auto_show = PJ_TRUE;
acc_cfg.vid_out_auto_transmit = PJ_TRUE;
acc_cfg.vid_cap_dev = VideoCaptureDeviceId();
acc_cfg.vid_wnd_flags = PJMEDIA_VID_DEV_WND_BORDER | PJMEDIA_VID_DEV_WND_RESIZABLE;
acc_cfg.reg_timeout = 300;
acc_cfg.use_srtp = PJMEDIA_SRTP_DISABLED;
pjsua_srtp_opt_default(&acc_cfg.srtp_opt);
acc_cfg.ice_cfg_use = PJSUA_ICE_CONFIG_USE_CUSTOM;
acc_cfg.ice_cfg.enable_ice = PJ_FALSE;
acc_cfg.allow_via_rewrite = PJ_FALSE;
acc_cfg.allow_sdp_nat_rewrite = acc_cfg.allow_via_rewrite;
acc_cfg.allow_contact_rewrite = acc_cfg.allow_via_rewrite ? 2 : PJ_FALSE;
acc_cfg.publish_enabled = PJ_TRUE;
acc_cfg.transport_id = transport_tcp;
acc_cfg.cred_count = 1;
acc_cfg.cred_info[0].username = pj_string(USER);
acc_cfg.cred_info[0].realm = pj_string("*");
acc_cfg.cred_info[0].scheme = pj_string("Digest");
acc_cfg.cred_info[0].data_type = PJSIP_CRED_DATA_PLAIN_PASSWD;
acc_cfg.cred_info[0].data = pj_string(PASS);
}
Once registration is completed, I run the following code:
prn("=== Test Call ===");
pj_str_t uri = pj_string("sip:" + call_target + "#" + SERVER);
pjsua_call_id call_id;
pjsua_call_setting call_setting;
pjsua_call_setting_default(&call_setting);
call_setting.flag = 0;
call_setting.vid_cnt = PJMEDIA_HAS_VIDEO ? 1 : 0;
pjsua_msg_data msg_data;
pjsua_msg_data_init(&msg_data);
pj_status_t status = pjsua_call_make_call(acc_id, &uri, &call_setting, NULL, &msg_data, &call_id);
if (status != PJ_SUCCESS) {
prn("Error trying: pjsua_call_make_call");
return;
}
I know that PJMEDIA_HAS_VIDEO is equal to 1 on the conf_site.h and pjsua_call_make_call return PJ_SUCCESS.
I've seen that if I have headphones connected, there is no problem. But if I disconnect them, the following error is shown:
#pjsua_aud.c ..Error retrieving default audio device parameters: Unable to find default audio device (PJMEDIA_EAUD_NODEFDEV) [status=420006]
If I connect the headphones, I enable the video and run my code, the following error is shown:
#pjsua_media.c ......pjsua_vid_channel_update() failed for call_id 0 media 1: Unable to find default video device (PJMEDIA_EVID_NODEFDEV)
So, using PJSUA it is necessary to have audio and video devices on the monitor and phone? Should I create virtual ports if I don't have the devices?
You can use the following code to get a list of audio/video devices in PJSUA, which will most likely provide you with a loopback device (among others).
pjmedia_aud_dev_info audio_device[64];
unsigned int audio_device_cnt = 64;
status = pjsua_enum_aud_devs(audio_device, &audio_device_cnt);
printf("There are %d audio devices\n", audio_device_cnt);
for (int i = 0; i < audio_device_cnt; i++) {
printf("%d: %s\n", i, audio_device[i].name);
}
pjmedia_vid_dev_info video_device[64];
unsigned int video_device_cnt = 64;
status = pjsua_vid_enum_devs(video_device, &video_device_cnt);
printf("There are %d video devices\n", video_device_cnt);
for (int i = 0; i < video_device_cnt; i++) {
printf("%d: %s\n", i, video_device[i].name);
}
I have not personally tried capturing a loopback audio device but for video, PJSUA provides an internal colorbar generator (Colorbar generator in this list), which you can use.
Once you find the indices of loopback or dummy audio/video devices you want to use, you can set them by using
pjsua_set_snd_dev(<YOUR DUMMY CAPTURE DEVICE>, <YOUR DUMMY PLAYBACK DEVICE>);
acc_cfg.vid_cap_dev = <YOUR VIDEO CAPTURE DEVICE>;

get audio stream from server http request and play the stream in browser

I have C# api that get Audio stream and play it on my pc. (using localhost)
var response = (HttpWebResponse)request.GetResponse();
using (Stream receiveStream = response.GetResponseStream())
{
PlayWav(receiveStream, false);
}
public static void PlayWav(Stream stream, bool play_looping)
{
if (Player != null)
{
Player.Stop();
Player.Dispose();
Player = null;
}
if (stream == null) return;
Player = new SoundPlayer(stream);
if (play_looping)
Player.PlayLooping();
else
Player.Play();
}
This works fine, However I need to use the function PlayWav in the client :
To get the audio stream from the server and play it in the client browser. (I don't want to save it , just to play it)
I tried using ajax/fetch ,put it on
I tried ReadableStream , I tried SoundPlayer..
I am confused and not sure how to handle audio streams.
Nothing works.
I write javascript on client side.
What am I doing wrong ?

Distorted microphone audio when the loudspeaker is enabled (Xamarin.iOS)

I am maintaining a Push-to-talk VoIP app.
When a PTT call is running the app create an audio session
m_AudioSession = AVAudioSession.SharedInstance();
NSError error;
if (!m_AudioSession.SetCategory(AVAudioSession.CategoryPlayAndRecord, AVAudioSessionCategoryOptions.DefaultToSpeaker | AVAudioSessionCategoryOptions.AllowBluetooth, out error))
{
IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error setting the category");
}
if (!m_AudioSession.SetMode(AVAudioSession.ModeVoiceChat, out error))
{
IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error setting the mode");
}
if (!m_AudioSession.OverrideOutputAudioPort(AVAudioSessionPortOverride.Speaker, out error))
{
IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error redirecting the audio to the loudspeaker");
}
if (!m_AudioSession.SetPreferredIOBufferDuration(0.06, out error)) // 60 milli seconds
{
IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error setting the preferred buffer duration");
}
if (!m_AudioSession.SetPreferredSampleRate(8000, out error)) // kHz
{
IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error setting the preferred sample rate");
}
if (!m_AudioSession.SetActive(true, out error))
{
IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error activating the audio session");
}
The received audio is played using the OutputAudioQueue and the microphone audio is captured (as mentioned in the Apple Doc: https://developer.apple.com/documentation/avfaudio/avaudiosession/mode/1616455-voicechat) using a Voice-Processing I/O Unit.
The initialization code for Voice-Processing I/O Unit is:
AudioStreamBasicDescription audioFormat = new AudioStreamBasicDescription()
{
SampleRate = SAMPLERATE_8000,
Format = AudioFormatType.LinearPCM,
FormatFlags = AudioFormatFlags.LinearPCMIsSignedInteger | AudioFormatFlags.LinearPCMIsPacked,
FramesPerPacket = 1,
ChannelsPerFrame = CHANNELS,
BitsPerChannel = BITS_X_SAMPLE,
BytesPerPacket = BYTES_X_SAMPLE,
BytesPerFrame = BYTES_X_FRAME,
Reserved = 0
};
AudioComponent audioComp = AudioComponent.FindComponent(AudioTypeOutput.VoiceProcessingIO);
AudioUnit.AudioUnit voiceProcessing = new AudioUnit.AudioUnit(audioComp);
AudioUnitStatus unitStatus = AudioUnitStatus.NoError;
unitStatus = voiceProcessing.SetEnableIO(true, AudioUnitScopeType.Input, ELEM_Mic);
if (unitStatus != AudioUnitStatus.NoError)
{
DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetEnableIO(true, AudioUnitScopeType.Input, ELEM_Mic) returned: {0}", unitStatus);
}
unitStatus = voiceProcessing.SetEnableIO(true, AudioUnitScopeType.Output, ELEM_Speaker);
if (unitStatus != AudioUnitStatus.NoError)
{
DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetEnableIO(false, AudioUnitScopeType.Output, ELEM_Speaker) returned: {0}", unitStatus);
}
unitStatus = voiceProcessing.SetFormat(audioFormat, AudioUnitScopeType.Output, ELEM_Mic);
if (unitStatus != AudioUnitStatus.NoError)
{
DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetFormat (MIC-OUTPUT) returned: {0}", unitStatus);
}
unitStatus = voiceProcessing.SetFormat(audioFormat, AudioUnitScopeType.Input, ELEM_Speaker);
if (unitStatus != AudioUnitStatus.NoError)
{
DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetFormat (ELEM 0-INPUT) returned: {0}", unitStatus);
}
unitStatus = voiceProcessing.SetRenderCallback(AudioUnit_RenderCallback, AudioUnitScopeType.Input, ELEM_Speaker);
if (unitStatus != AudioUnitStatus.NoError)
{
DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetRenderCallback returned: {0}", unitStatus);
}
...
voiceProcessing.Initialize();
voiceProcessing.Start();
And the RenderCallback function is:
private AudioUnitStatus AudioUnit_RenderCallback(AudioUnitRenderActionFlags actionFlags, AudioTimeStamp timeStamp, uint busNumber, uint numberFrames, AudioBuffers data)
{
AudioUnit.AudioUnit voiceProcessing = m_VoiceProcessing;
if (voiceProcessing != null)
{
// getting microphone input signal
var status = voiceProcessing.Render(ref actionFlags, timeStamp, ELEM_Mic, numberFrames, data);
if (status != AudioUnitStatus.OK)
{
return status;
}
if (data.Count > 0)
{
unsafe
{
short* samples = (short*)data[0].Data.ToPointer();
for (uint idxSrcFrame = 0; idxSrcFrame < numberFrames; idxSrcFrame++)
{
... send the collected microphone audio (samples[idxSrcFrame])
}
}
}
}
return AudioUnitStatus.NoError;
}
I am facing the problem that if the loudspeaker is enabled: m_AudioSession.OverrideOutputAudioPort(AVAudioSessionPortOverride.Speaker, out error)
then the microphone audio is corrupted (some times is impossible to understand the speech).
If the loudspeaker is NOT enabled (the AVAudioSessionPortOverride.Speaker is not set) then the audio is very nice.
I have already verified that the NumberChannels in the AudioBuffer returned by the Render function is 1 (mono audio).
Any hit helping solved the problem is very appreciated. Thanks
Update:
The AudioUnit_RenderCallback method is called every 32 ms. When the loudspeaker is disabled the received number of frames is 256 which is exact (sample rate is 8000). When the loudspeaker is enabled the received number of frames is 85.
In both cases the GetAudioFormat returns the expected values: BitsPerChannel=16, BytesPerFrame=2, FramesPerPacket=1, ChannelsPerFrame=1, SampleRate=8000
Update:
I end up using the Sample Rate from the Hardware and performing the down-sampling self. It is must understanding that the Audio Unit should be able to perform the down sampling https://developer.apple.com/library/archive/documentation/MusicAudio/Conceptual/AudioUnitHostingGuide_iOS/AudioUnitHostingFundamentals/AudioUnitHostingFundamentals.html#//apple_ref/doc/uid/TP40009492-CH3-SW11)) but it was not possible for me to make it working when the loudspeaker was enabled.
I hope you are testing this on an actual device and not a simulator.
In the code, have you tried using this:
sampleRate = AudioSession.CurrentHardwareSampleRate;
Rather than forcing the sample rate, it's best to check the sample rate from the Hardware. It could be that during loudspeaker usage, it changes the sample rate and thus creating an issue.
I would suggest recording based on the above changes and see if the audio improves and then experiment with other flags.
Standard recording pattern:
https://learn.microsoft.com/en-us/dotnet/api/audiotoolbox.audiostreambasicdescription?view=xamarin-ios-sdk-12#remarks

audio playback error

I'm trying to record an audio file and play it, but I'm getting en error: Locator does not reference a valid media file. What am I missing?
here is my code of saving audio file:
//Stop recording, capture data from the OutputStream,
//close the OutputStream and player.
_rcontrol.commit();
_data = _output.toByteArray();
_output.close();
_player.close();
isRecording = 0;
try {
FileConnection fc = (FileConnection)Connector.open("file:///store/home/user/Audio.mp3");
// If no exception is thrown, then the URI is valid, but the file may or may not exist.
if (!fc.exists()) {
fc.create(); // create the file if it doesn't exist
}
OutputStream outStream = fc.openOutputStream();
outStream.write(_data);
outStream.close();
fc.close();
System.out.println("audio size: "+_data.length);
I can see in log, that my audio file has some length (about 5000 after 3-4 seconds of recording)
here is my code where i'm trying to play it:
FileConnection fc = (FileConnection)Connector.open("file:///store/home/user/Audio.mp3");
// If no exception is thrown, then the URI is valid, but the file may or may not exist.
if (fc.exists()) {
_player = Manager.createPlayer("file:///store/home/user/Audio.mp3");
_player.realize();
_player.prefetch();
if (_player.getMediaTime() == _player.TIME_UNKNOWN) {
System.out.println("zero audio doration");
}
else {
System.out.println("audio doration: "+_player.getMediaTime());
}
_player.start();
}
fc.close();

Simplest way to capture raw audio from audio input for real time processing on a mac

What is the simplest way to capture audio from the built in audio input and be able to read the raw sampled values (as in a .wav) in real time as they come in when requested, like reading from a socket.
Hopefully code that uses one of Apple's frameworks (Audio Queues). Documentation is not very clear, and what I need is very basic.
Try the AudioQueue Framework for this. You mainly have to perform 3 steps:
setup an audio format how to sample the incoming analog audio
start a new recording AudioQueue with AudioQueueNewInput()
Register a callback routine which handles the incoming audio data packages
In step 3 you have a chance to analyze the incoming audio data with AudioQueueGetProperty()
It's roughly like this:
static void HandleAudioCallback (void *aqData,
AudioQueueRef inAQ,
AudioQueueBufferRef inBuffer,
const AudioTimeStamp *inStartTime,
UInt32 inNumPackets,
const AudioStreamPacketDescription *inPacketDesc) {
// Here you examine your audio data
}
static void StartRecording() {
// now let's start the recording
AudioQueueNewInput (&aqData.mDataFormat, // The sampling format how to record
HandleAudioCallback, // Your callback routine
&aqData, // e.g. AudioStreamBasicDescription
NULL,
kCFRunLoopCommonModes,
0,
&aqData.mQueue); // Your fresh created AudioQueue
AudioQueueStart(aqData.mQueue,
NULL);
}
I suggest the Apple AudioQueue Services Programming Guide for detailled information about how to start and stop the AudioQueue and how to setup correctly all ther required objects.
You may also have a closer look into Apple's demo prog SpeakHere. But this is IMHO a bit confusing to start with.
It depends how ' real-time ' you need it
if you need it very crisp, go down right at the bottom level and use audio units. that means setting up an INPUT callback. remember, when this fires you need to allocate your own buffers and then request the audio from the microphone.
ie don't get fooled by the presence of a buffer pointer in the parameters... it is only there because Apple are using the same function declaration for the input and render callbacks.
here is a paste out of one of my projects:
OSStatus dataArrivedFromMic(
void * inRefCon,
AudioUnitRenderActionFlags * ioActionFlags,
const AudioTimeStamp * inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList * dummy_notused )
{
OSStatus status;
RemoteIOAudioUnit* unitClass = (RemoteIOAudioUnit *)inRefCon;
AudioComponentInstance myUnit = unitClass.myAudioUnit;
AudioBufferList ioData;
{
int kNumChannels = 1; // one channel...
enum {
kMono = 1,
kStereo = 2
};
ioData.mNumberBuffers = kNumChannels;
for (int i = 0; i < kNumChannels; i++)
{
int bytesNeeded = inNumberFrames * sizeof( Float32 );
ioData.mBuffers[i].mNumberChannels = kMono;
ioData.mBuffers[i].mDataByteSize = bytesNeeded;
ioData.mBuffers[i].mData = malloc( bytesNeeded );
}
}
// actually GET the data that arrived
status = AudioUnitRender( (void *)myUnit,
ioActionFlags,
inTimeStamp,
inBusNumber,
inNumberFrames,
& ioData );
// take MONO from mic
const int channel = 0;
Float32 * outBuffer = (Float32 *) ioData.mBuffers[channel].mData;
// get a handle to our game object
static KPRing* kpRing = nil;
if ( ! kpRing )
{
//AppDelegate * appDelegate = [UIApplication sharedApplication].delegate;
kpRing = [Game singleton].kpRing;
assert( kpRing );
}
// ... and send it the data we just got from the mic
[ kpRing floatsArrivedFromMic: outBuffer
count: inNumberFrames ];
return status;
}

Resources