Consider that I don't know anything of asterisk, so one of my questions is who we are the main actors we know to be aware of in order to start this project.
Basically we want to create a bot (well, asterisk) that is able to call the users phones, have a short conversation with them where each line pronounced by the system (they'll be audio file) depends on the previous answer of the user (speech recognition, in fact we need to intercept the audio stream and pass it to a 3rd party speech recognition engine) and some logic that can be handled by an external module. Saying that the requirements are up to 200 concurrent conversations, and that the conversations will take place in the USA only, what services should we buy? One VOIP provider, one hosting solution for asterisk. How difficult is it to write the asterisk configuration for such a project?Thank you
Can you help me to separate the actors in such a project: professionals, software, facilities?
1) Dedicated server. But for diallout 200 calls need very hi end server. I think you will got 100 on usual server if got nice 2)
2) Dialling software/core
3) Call managment software - you need write it.
4) voice recognition. If it on your server, i not think it will work with more then 20-30 channels.
5) voip account for dialout(most provider NOT allow do automated dialout for marketing purpose).
most problematic is voice recognition - unlikly you will got quality of recognition if more then yes/no answer. reason: telephony use 8khz sounds, not enought quality for recognition.
Also unlikly you will got 200 channels on one server, so will need clustering=clustering expert or hi cost voip expert.
In general, if you got recognition, all other is doable, cost of development will be 1~100k depend of features.
I actually would suggest that you use Tropo for this (https://www.tropo.com/). The rates are reasonable, you can develop in your favorite language, they handle the massive infrastructure you'll require and its got a top-drawer TTS/STT engine built in.
Related
I want to know the real difference between Agora and Webrtc? What did I know Agora provides you SDK for different platforms for video, audio calls, and chats and it charges you accordingly, it provides 10,000 minutes free monthly and charges you if you exceed, Webrtc is a Web Real-Time communication that provides you different API to implememt in your app or web to have video, audio or chats in free & unlimited? Am I Right? If yes then why people would use agora and pay money when they have free WebRTC with unlimited audio videos calls and chats for a long time? pls guide your help will be appreciated
I do not know much about WebRTC pls help me out thanks in advance
This is similar to saying that you can use HTML5 to build websites but still people pay AWS to host machines, databases, storage, application logic, etc.
WebRTC is a web technology that is part of HTML5 and is implemented by all modern browser. To make it work, you will need to create websites, install servers, pay for media traffic, optimize your code - you can do it on your own (and pay the cloud hosting vendors for their service) or you can use a third party that offers that as a managed service for you - like Agora and others that do it, where you end up paying to them for their efforts.
To decide which approach is for you, I can suggest two things:
Build a simple demo app with WebRTC. One where you understand what the code does. If you are happy with it and truly understand what goes on - make the decision if you want to continue in that route or use a 3rd party
Just go use Agora or other 3rd parties and pay them. WebRTC isn't rocket science but it isn't simple either
Agora and similar APIs provide a layer on top of WebRTC. They aim to make it easier for developers to leverage WebRTC capabilities without having to build everything themselves from scratch.
Devs who are fine getting into the nitty gritty of WebRTC can indeed do everything themselves, like you say. I recommend trying it! But often, a team or single dev might just want a straightforward way to integrate video/audio into their product without necessarily knowing what an SFU is, how a TURN server works, how to ensure certain data privacy standards, etc. So these kinds of APIs help make WebRTC easier to leverage by doing most of the heavy lifting and letting people focus on all the other important parts of their app. Pricing-wise, there are a few different very good API options that might be more or less affordable depending on the use case.
These days I am finding myself in the position of having to implement for one of my college courses a system that should act as a giant wrapper over many communications apps like Gmail , Facebook Messenger maybe even WhatsApp .To put it simply you should have a giant web interface where you can authorize Gmail , Messenger and use them at once when required. I am thinking of going with an REST API to manage the user's services authorized by OAuth2.Also I am thinking of using Node.JS and Express.js in the backend and React.js in the frontend. I found some sweet libraries in npm that should take care of interacting with the involved APIs(https://www.npmjs.com/package/node-gmail-api this one for instance), but I am also doubtful about this approach , for example I have no idea how to keep the use notified about its incoming mails or messages for example . I am in dire need of some expertise since I forgot to mention but I am quite the newbie in this field. To sum it up for once my question is how would you implement such an infrastructure ? Is it my approach viable or I am bound to hit some really hard to overcome obstacles?
As a college exercise, it would be a really fun experiment, so it definitely worth the time you want to put into it. However, once you want to add more features, the complexity will go up pretty fast.
Here are a couple of ideas you can think of:
It's pretty clear that your system can't do more things than the capabilities exposed by the APIs of communication apps (e.g. you can't have notifications in gmail if the API doesn't have this capability).
For that reason, you should carefully study the APIs and what functionalities they expose. They have public docs that you can check out: (Gmail API, Facebook Messanger API)
Some of the apps you want to communicate with may not have an official API (e.g. WhatsApp) - those kinds of details you definitely want to know from the start.
Based on the analysis of those APIs, you should lay out a list of requirements for your system, which can be extracted from all the APIs, for example: message notifications, file transfers, user profiles, etc.
In this way, you know exactly what capabilities your system should have, and you don't end up implementing a feature that is available only in 1 API out of 4.
Also, it would be a bit challenging to design your system from a user perspective, because the apps have different usage patterns - chat apps, where messages are coming in real-time, vs email, which is not real-time communication. That's just a detail anyway, the gist of your project is to play with those APIs.
Also, it may worth checking out the Gateway Aggergation Pattern, which is related to this project - you may want the user to send a message to multiple apps, by using a single request to your service.
I am very new when it comes to VOIP and integrations with VOIP systems.
Here is what I am trying to do:
A caller calls in and operator answers the call.
1.1. Start streaming audio of caller to an analysis service in cloud.
Once the audio analysis is performed (generally in a few seconds), operator will press the "Hold" button to perform an action suggested by the analysis.
2.1. Depending on the result of analysis, play a particular audio file back to the caller to let them know that the operator is doing "x," "y," or "z" while on hold.
Given my non-experience working with VOIP systems, I am looking for any suggestions / pointers to topics, areas, articles, technologies that can point me to the right direction.
I could give some general point of view. I would be assuming the SIP-based VOIP which is actually pretty omnipresent (IMS, LTE, 3GPP, etc.).
The VOIP has two parts that you might have spotted while searching:
SIP (the control plane)
RTP (the data or payload plane = audio)
In general, there are two approaches the one comes from a peer-to-peer world where every change in media flow is communicated to the other party with REFER doing actually call transfer for any purpose. But that is usually not a prefered way of doing things. Here comes the second approach which is kind of hiding whatever changes on the B-party (called party) side. Such thing is used also in IMS (which is behind the modern GSM networks). The trick is that the A-party (caller) actually reaches the B-party proxy. In terms of SIP, it is B2BUA aka back to back user agent. Which as the name suggests it covers all the magic that happens in the called party network.
The magic is then actually hidden behind that B2BUA which actually behaves as an entity in the middle and thus can manipulate both SIP and RTP.
Therefore this entity can actually fork the audio using an MGW (media gateway) towards the "real" B-Party (a human/operator) as well as directing the audio to the ML/AI/Expert System analysis. This process also incorporates an appropriate control plane events like starting the analytic process attach, actual audio forking (RTP) and also triggering the SIP INVITE for final B-party. Whenever the analysis is concluded then out of band messaging to some "rich" client at the SIP Agent (computer/tablet with SoftPhone) or some CRM system attached to the call centre system. Such a message should inform the B-Party about the result of the analysis.
All the magic is hidden either inside the B2BUA or eventually inside SIP application server which is a generic name for various services like call distribution to call centre agents, voice mail, IVR, etc.
The voice analysis is today used at banks for caller verification, mood analysis and many "smart" audio processing.
In that domain, there are some opensource and proprietary SIP systems. They tend to be somehow complex. And moreover, the logic is pretty different compared to request-response systems (like HTTP). The call is a stateful system with "session" (call ~ Call-ID) and everything is bound to that.
Hope that this can help you.
Have you considered using an API based VOIP provider like Plivo?
The realtime streaming part of your use case might be difficult but I bet you could find a decent work around. I used to work there as a solutions engineer so I'm pretty familiar with the APIs. Feel free to message me if you have any questions.
We've made an application where two people can video chat with each other using TokBox, but are running into a lot of technical issues surrounding WebRTC and TokBox itself. I know that Twilio recently launched a Javascript version for their video service, but both TokBox and Twilio seem to be aiming for larger scale publish/subscribe operations. It also isn't as far along as TokBox.
Are there other services out there that can do web video 1 on 1's? Perhaps some that don't use WebRTC and therefore don't have the problems we are facing?
I can't help but to think back to ChatRoulette and similar apps.
If what you need is an application that needs to run within the context of a browser, then WebRTC is your only choice when it comes to the technology to use. There's just nothing else there now that Flash is officially dead.
If you need it to run purely inside a packaged PC/mobile application, then you can use something other than WebRTC, but I don't really see the need for that.
When using real time video technologies, one aspect to look at closely is the quality of the network you are using. The questions I usually ask myself are things like does Skype/Hangouts/FaceTime run any better? If the answer is "yes they do", then the problem is in the implementation you have done/used. If the answer is "no, they are just as bad" then you probably can't do a lot better either.
For alternatives, you can check out the vendors listed in this WebRTC Develoepr Tools Landscape: https://bloggeek.me/webrtc-developer-tools-landscape/
I don't know what you mean with "a lot of technical issues surrounding WebRTC and Tokbox itself", but I do know Tokbox handles millions of 1:1 streaming minutes every day, without issues, and it can even handle sessions with 1 publisher and 3000 subscribers at the same time, so, maybe the technical issues are not there, but in another place...
I have a very general question on how to implement VoIP for our current mobile & Web App. (we have an Android+iOS App and a Web Application based on AngularJS/NodeJS).
What we want to achieve
In the first step we want to achieve inter Application Voice and Video Calls. Later on we might expand into outbound calls into the normal telephone network. But this post is mainly for getting info on how to implement only our first step.
general thoughts
We had some experiences with Asterisk before which turned out to be far from easy. So for this project we wanted to get some feedback before actually implementing anything.
thoughts on technology
At first I thought it might be a good idea to use WebRTC, but since it's only supported on Chrome, FF and Opera for the moment and pretty much is unsupported for native mobile Apps we think that WebRTC is probably out of the picture for now. (or do you think otherwise?)
After searching the web a bit more we found this: http://www.webrtc.org/native-code
Has anyone experience with this libs? It seems to us, that this could be the best solution for a modern voip solution (and also would allow us to skip the asterisk server)
The second idea would be to setup an Asterisk Server for ourselves. Every time a user logs into the App we would connect him as a SIP Client to the asterisk. If one user calls the other one we think we should be able to make the call for example with the node package Asterisk Manager API (https://github.com/pipobscure/NodeJS-AsteriskManager).
The third idea would be to use a SIP Provider, but at the moment I'm not sure if that's really the best idea.
Since we're no VoIP experts, are there any other possibilities for VoIP integration into our apps?
Any thoughts on that subject would be very appreciated! Thank you!
The main factor is the network configuration that you app will be working with. Given you're using mobile clients and web apps it's almost certain that you're using the internet and also likely that you'll have 3G and 4G mobile networks in the mix (3G/4G cause a lot more problems for VoIP than WiFi).
Given the above assumption holds the biggest challenge your app will have is establishing media (audio and/or video) connections between mobile clients which are behind different NATs and in a lot of cases multiple NATs. There is almost no chance you'd be able to get by without a server here. The server will be needed to act as a relay point for the media streams for the mobile clients. You will use the RTP protocol for the media and working out how to get it reliably from client A to client B is your biggest obstacle. The signalling side - whether it be SIP, web sockets or something else - will be secondary (note both SIP and WebRTC use RTP to carry the media).
If I were in your shoes the steps I'd take would be:
Install and try out some softphones (blink, bria, zoiper et al) on your own mobile devices, find a SIP provider that supports video calls and get some experience with calls. It may not be the experience you anticipated...
Once you are comfortable with the softphone experience you would then need to make two decisions:
Whether to deploy your own server or use an existing provider,
Whether to write your own client, find an existing one or something in between.
I can answer the deploy your own server question. You don't want to do that unless the VoIP part of your app is going to be something you charge for and make a good margin off. Running a VoIP server and all the security and network considerations that go along with it is a full time job. It may start out being easy but once a few customers start connecting and the fraudsters come along it will take on a life of its own. In the decade I've been messing around with SIP I'd estimate 75% of providers have gone out of business and it was their full time job.
Besides all that I'd be surprised if there wasn't a SIP provider that suited your needs. These days there are highly sophisticated services available that led you control every aspect of your call flow with your own code (anveo, tropo, twilio) right down to free services (sip2sip, sipbroker) that may be all you need to get started.
For the client software there are various SIP SDK's you'll be able to leverage (pjsip).