I am trying to create an on-demand audio streaming platform (similar to Spotify) from scratch. It will have 1,000 users (I am optimizing for time to build, not scalability, right now).
I want to use web-based technologies (I am experienced with React/Redux/Node). Could I get some advice on the architecture, i.e. what technologies I should use for the project?
Here are the things I need help with:
What storage service I should use for my music files (my song catalog is about 50,000 tracks)
How to stream music from the storage service to each user
What server protocol I should use (RTMP/WebRTC/RTSP)
(Optional) How to store data in a cache to reduce buffering
I know this is a huge ask so thank you guys for your help in advance
What storage service I should use for my music files (my song catalog is about 50,000 tracks)
S3 (or equivalent).
Audio files fit this use case precisely, and you're already using AWS. If you find the cost too high, there are S3-compatible services that are more affordable, all the way down to DIY on MinIO.
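If you do go with S3, your Node server never has to touch the audio bytes at all; you can hand the client a short-lived presigned URL and let the player fetch directly from S3 (or the CDN in front of it). A minimal sketch using the AWS SDK for JavaScript v3 — the bucket name, key layout, and one-hour expiry are illustrative assumptions:

```javascript
// Sketch: mint a short-lived presigned GET URL for a track stored in S3.
// Bucket name, key layout, and the 1-hour expiry are placeholder assumptions.
const { S3Client, GetObjectCommand } = require("@aws-sdk/client-s3");
const { getSignedUrl } = require("@aws-sdk/s3-request-presigner");

const s3 = new S3Client({ region: "us-east-1" });

async function getTrackUrl(trackKey) {
  const command = new GetObjectCommand({
    Bucket: "my-music-bucket", // hypothetical bucket
    Key: trackKey,             // e.g. "catalog/12345.mp3"
  });
  // The browser's <audio> element can use the returned URL directly.
  return getSignedUrl(s3, command, { expiresIn: 3600 });
}
```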
How to stream music from the storage service to each user
Use a CDN (or multiple CDNs) to optimize delivery and keep the latency low. CDNs are also better at spoon-feeding slow clients.
What server protocol I should use (RTMP/WebRTC/RTSP)
Normal HTTP! That's all you need, and that's all that's been necessary for decades for this use case.
RTMP is a dead protocol, only supported by Flash on the client side. Its usage today is limited to sending source streams from video encoders, and even that is well on its way out the door.
WebRTC is intended for low-latency connections, like voice calls and video chat. Low latency is not something that matters in a unidirectional stream. You actually want a robust streaming mechanism... not one that drops audio, like a cell phone call does, to keep up with realtime.
RTSP is not something you can use in a browser, and is overly complex for what you need.
Just a simple HTTP service is sufficient. Your servers should support ranged requests so that the browser can lose a connection and still pick up right where it left off, without the listener even knowing. (All CDNs support this, as does any properly configured web server.)
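For illustration, here is a minimal sketch of a Node endpoint that honors Range headers. In practice a CDN or properly configured web server does this for you; the directory layout and MIME type here are placeholder assumptions:

```javascript
// Sketch: serve audio files with support for ranged (partial) requests.
// Directory layout and MIME type are illustrative assumptions.
const http = require("http");
const fs = require("fs");
const path = require("path");

http.createServer((req, res) => {
  const filePath = path.join(__dirname, "tracks", path.basename(req.url));
  fs.stat(filePath, (err, stats) => {
    if (err) { res.writeHead(404); return res.end(); }

    const range = req.headers.range; // e.g. "bytes=65536-" after a reconnect
    if (!range) {
      res.writeHead(200, {
        "Content-Type": "audio/mpeg",
        "Content-Length": stats.size,
        "Accept-Ranges": "bytes",
      });
      return fs.createReadStream(filePath).pipe(res);
    }

    const [startStr, endStr] = range.replace("bytes=", "").split("-");
    const start = parseInt(startStr, 10);
    const end = endStr ? parseInt(endStr, 10) : stats.size - 1;

    res.writeHead(206, { // 206 Partial Content
      "Content-Type": "audio/mpeg",
      "Content-Length": end - start + 1,
      "Content-Range": `bytes ${start}-${end}/${stats.size}`,
      "Accept-Ranges": "bytes",
    });
    fs.createReadStream(filePath, { start, end }).pipe(res);
  });
}).listen(8080);
```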
(Optional) How to store data in a cache to reduce buffering
CDNs will generally improve performance of the initial connect and load. I also recommend pre-loading the next track to be played in the list so that you can start it immediately. In most browsers, you can actually start the next track at the tail end of the previous track for a smooth transition.
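A minimal sketch of that preloading trick in the browser; the track URLs and the ~200 ms crossover window are placeholder choices:

```javascript
// Sketch: buffer the next track while the current one plays, then start it
// just before the current one ends. URLs and timing are illustrative.
const current = new Audio("/tracks/current.mp3");
const next = new Audio();
next.preload = "auto";
next.src = "/tracks/next.mp3"; // the browser starts buffering immediately

current.addEventListener("timeupdate", () => {
  // Begin the next track ~200 ms before the current one finishes.
  if (current.duration - current.currentTime < 0.2 && next.paused) {
    next.play();
  }
});

// Note: browsers require a user gesture before the first play() call.
current.play();
```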
Related
What I am trying to do is create a simple virtual classroom project like Adobe Connect, but obviously simpler, using Flutter and NodeJS, and I need the following features:
Real-time video or only voice streaming
Live chat box
Screen sharing ability
File sharing ability (like PDF, PowerPoint, or other text/doc files)
Whiteboard
From what I have found so far, it seems WebRTC works for video/voice streaming and for screen sharing as well.
Also, most live chat projects use Socket.IO.
My main question is: can I use only WebRTC for both real-time video/voice streaming and live chat? Is that a good idea, or is it better to combine Socket.IO and WebRTC?
Furthermore, I want to know whether I can use each of those libraries for file sharing purposes.
WebRTC gives you lower latency and a lot of functionality for conferencing out of the box. So for video/audio calls and screen sharing this is definitely a better choice.
Also, there's an option to use P2P communication, which reduces latency even more and saves you resources on the server side. Though if you intend to support many participants it looks less beneficial: with n users in total, each user has to maintain n-1 connections, i.e. n(n-1)/2 connections across the whole mesh.
For live chat, whiteboard and file sharing there would be no big difference in terms of performance.
Things to consider:
WebRTC is a more complex technology than WebSockets to set up and support
There might be open source solutions for these features; I would make a decision based on what you can reuse in your project
You can use WebRTC for some of the features and websockets for others
can I use only WebRTC for both real-time video/voice streaming and also live chat as well
Yes you can: there's an RTCDataChannel interface for exchanging arbitrary data. It can be used for live chat / whiteboard / file transfer.
As a good example, there's an open source project, peercalls, that implements chat and file transfer via WebRTC through the same connection that is used for conferencing.
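For illustration, a minimal sketch of a chat channel on top of an RTCPeerConnection; the signaling (offer/answer/ICE exchange) is assumed to happen elsewhere:

```javascript
// Sketch: send chat messages over an RTCDataChannel.
// Assumes the offer/answer/ICE exchange is handled by your signaling layer.
const pc = new RTCPeerConnection();

// Caller side: create the channel before generating the offer.
const chat = pc.createDataChannel("chat");
chat.onopen = () => chat.send(JSON.stringify({ user: "alice", text: "hi!" }));
chat.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  console.log(`${msg.user}: ${msg.text}`);
};

// Callee side: the same channel arrives via the ondatachannel event.
pc.ondatachannel = (event) => {
  event.channel.onmessage = (e) => console.log("received:", e.data);
};
```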
WebSockets can be used for file transfer as well; check out this library.
Using WebRTC requires a signaling server, and signaling is often implemented over WebSockets; see the MDN article Signaling and video calling.
And with WebSockets you can implement live chat too, so it is often not an either/or situation but both.
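For illustration, a minimal sketch of such a signaling relay using Socket.IO; the event names and room scheme are placeholder assumptions, and the server just forwards opaque WebRTC payloads between peers:

```javascript
// Sketch: Socket.IO signaling relay. The server never inspects the WebRTC
// payloads; it only forwards offers/answers/ICE candidates within a room.
const { Server } = require("socket.io");
const io = new Server(3000);

io.on("connection", (socket) => {
  socket.on("join", (room) => socket.join(room));

  // Relay any signaling payload to the other peers in the same room.
  socket.on("signal", ({ room, data }) => {
    socket.to(room).emit("signal", data);
  });
});
```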
We have an iOS native app client making calls to a Bluemix speech2text service using WebSockets in direct interaction mode, which works great for us (very fast, very low latency). But we do need to retain a copy of the audio stream. Most audio clips are short (< 60 seconds). Is there an easy way to do that?
We can certainly have the client buffer the audio clip and upload it somewhere when convenient. This may increase the memory footprint, particularly for longer clips, and impact app performance if not done carefully.
Alternatively, we could switch to using the HTTP interface and relay via a proxy, which could then keep a copy for us. The concern here (other than re-writing an app that works perfectly fine for us) is that this may increase latency due to extra hops in the main call thread.
Any insights would be appreciated.
-rg
After some additional research we settled on using Amazon S3 TransferUtility Mobile SDK for iOS. It encapsulates data chunking and multi-threading within a single object, and even completes transfer in the background after iOS suspends the app.
http://docs.aws.amazon.com/mobile/sdkforios/developerguide/s3transferutility.html
The main advantages we see:
no impact on existing code: simply add a call to initiate a transfer
no need to implement and maintain a proxy server, which reduces complexity
Bluemix provides cloud object storage similar to S3, but we were unable to find an iOS SDK for it that supports anything other than a synchronous, single-threaded solution right out of the box (we were initially psyched to see 'Swift' support, but that proved to be just a coincidental use of the term, referring to the object storage technology rather than the language).
My two cents....
I would switch to the HTTP interface. If you make things tougher for your users, they won't use your app and will figure out a better way to do things. You shouldn't have to rewrite the whole app, just the communications, and then have some sort of server-side application that will "cache" those audio streams.
Another approach would be to leave your application as is and just add a step that sends the audio file to some repository, after sending it to speech to text, in a different thread. In this case you could save off not only the audio file but the text translation as well.
I have been trying to implement a web application that will be able to handle the following scenario:
Streaming video/audio from a client to other clients (actually to a particular set of them, not broadcasting) and to the server at the same time. The data source would be the client's webcam.
This streamed data has to be displayed in the real time on the other clients' browser and be saved on the server side for the 'archiving' purposes.
It has to be implemented in node.js + socket.io environment.
To put it in some more specific context... The scenario is that there is a guy that makes a kind of a room for the users that he chooses. After the chosen users join the room, the creator starts streaming video/audio from his/her built in devices (webcam). All of the guests receive the data in real time, moreover the data is being sent to the server where it is stored so it can be recovered after the stream and room get closed.
I was thinking about mixing Socket.IO with WebRTC. In theory the combination of these two seem just perfect for the job.
The Socket.IO is great for gathering specific set of users by assigning some sockets to a room and for signaling process demanded by the WebRTC.
At the same time WebRTC is awesome for P2P connection between users gathered in the same room, it is also really easy to get access to the webcam and other built in devices that I might want to use.
So yeah, everything is looking pretty decent in theory but I would really need to see some code in action so I could actually try to implement it on my own. Moreover, I see some issues:
How do I save the stream that is sent over the P2P connection? Obviously the server does not have access to that. I was thinking that I might treat the server as another 'guest', so it would be just another endpoint of the P2P connection with the creator of the room. Somehow it feels edgy, though.
Wouldn't it be better to treat the server as the middleman between the creator and the clients? Granted, there might be some, probably insignificant, delay compared to P2P, but presumably it would be the same for all the clients. (I tried that, but I can't get the streaming from the webcam to the server done; that, however, is the topic for a different question, as I am having problems with processing the MediaStream.)
I was looking for some nice solutions but without any success. I have seen that there is this nice P2P solution made for socket.io: http://socket.io/blog/socket-io-p2p/. The thing is, I don't think it will handle the data stream well. The examples mention only a simple chat app, and I need something a little heavier than that.
I would be really thankful for some specific examples, docs, whatever may lead me a little closer to the implementation of it as I really don't know how to approach it.
Thanks in advance :)
Your task can be solved by using one of the open source WebRTC media servers.
For example, Kurento.
You can implement the following streaming schemes:
One to one
One to many
Many to many
(Diagram: WebRTC server schema)
Clients will connect to each other through the WebRTC server.
So, on the server side you can record the stream, or send it for transcoding.
A WebSocket is used for communicating with the server.
You can find examples matching your task in the Kurento documentation.
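To give a feel for the server side, here is a hedged sketch using the kurento-client npm package, in the callback style used by the Kurento tutorials; the WebSocket URI and recording path are placeholders, not values from your setup:

```javascript
// Sketch: receive a publisher's WebRTC stream on a Kurento media server and
// record it to disk. URI and recording path are illustrative placeholders.
const kurento = require("kurento-client");

function startRecordedSession(sdpOffer, onAnswer) {
  kurento("ws://localhost:8888/kurento", (err, client) => {
    if (err) throw err;
    client.create("MediaPipeline", (err, pipeline) => {
      if (err) throw err;
      pipeline.create("WebRtcEndpoint", (err, webRtc) => {
        if (err) throw err;
        pipeline.create("RecorderEndpoint",
            { uri: "file:///tmp/room-archive.webm" }, (err, recorder) => {
          if (err) throw err;
          webRtc.connect(recorder, (err) => { // feed incoming media to the recorder
            if (err) throw err;
            webRtc.processOffer(sdpOffer, (err, sdpAnswer) => {
              if (err) throw err;
              recorder.record();   // start writing the archive file
              onAnswer(sdpAnswer); // return the answer via your signaling channel
            });
          });
        });
      });
    });
  });
}
```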
Video streaming to multiple users is a really hard problem that unfortunately requires extensive infrastructure to achieve. You will not be able to stream video data through a websocket. WebRTC is also not a viable solution for what you are describing because, as you mentioned, WebRTC is P2P: the streaming user would need to make a direct connection to all the 'viewers'. This will obviously not scale beyond a few 'viewers'. WebRTC is more for direct video calls, as in Skype for example.
Here is an article describing the architecture used by a somewhat popular live streaming service. As you can see achieving live video at any sort of scale will require considerable resources.
I am planning to write a NodeJS server for streaming videos. One of my critical requirements is
to prevent video download (as much as possible), something similar to safaribooksonline.com.
I am planning to use Amazon S3 for storage and NodeJS for streaming the videos to the client.
I want to know if NodeJS is the right tool for streaming videos (max size 100 MB) for an application expecting a lot of users. If not, what are the alternatives?
Let me know if any additional details are required.
In very simple terms, you can't prevent video download. If a rogue client wants to do it, they generally can: the video has to make it to the client for the client to be able to play it back.
What is most commonly done is to encrypt the video so the downloaded version is unplayable without the right decryption key. A DRM system will allow the client to play the video without being able to copy it, depending on how determined the user is (a high quality camera pointed at a high quality screen is hard to protect against; in those scenarios, other tracing technologies come into play).
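Short of full DRM, one common lighter-weight deterrent is handing the player short-lived signed URLs, so a link copied out of the player stops working quickly. A hedged sketch in Node; the secret and URL scheme are placeholder assumptions, and this does not stop a determined user:

```javascript
// Sketch: short-lived HMAC-signed video URLs as a casual-download deterrent.
// The secret and the /videos/:id path scheme are illustrative placeholders.
const crypto = require("crypto");

const SECRET = "replace-with-a-real-secret";

function signVideoUrl(videoId, ttlSeconds = 300) {
  const expires = Math.floor(Date.now() / 1000) + ttlSeconds;
  const sig = crypto
    .createHmac("sha256", SECRET)
    .update(`${videoId}:${expires}`)
    .digest("hex");
  return `/videos/${videoId}?expires=${expires}&sig=${sig}`;
}

function verifyVideoUrl(videoId, expires, sig) {
  if (Math.floor(Date.now() / 1000) > Number(expires)) return false; // expired
  const expected = crypto
    .createHmac("sha256", SECRET)
    .update(`${videoId}:${expires}`)
    .digest("hex");
  // Constant-time comparison; lengths must match or timingSafeEqual throws.
  return sig.length === expected.length &&
    crypto.timingSafeEqual(Buffer.from(sig), Buffer.from(expected));
}
```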
As others have mentioned in the comments, streaming servers are not simple: they have to handle a wide range of encoders, packaging formats, streaming formats, etc. to reach as many clients as possible, and they have quite complicated mechanisms to ensure speed and reduce file storage requirements.
It might be an idea to look at some open source streaming servers to get a feel for the area, for example:
VideoLan (http://www.videolan.org/vlc/streaming.html)
GStreamer (https://gstreamer.freedesktop.org)
You can still use NodeJS for the main web server component of your solution and just hand off the video streaming to the specialised streaming engines, if this meets your needs.
I'm going to publish a video on a web page for streaming. I expect to have more than 100,000 visits per day within a month. I want to upload my video to a server (or service) that offers the same bandwidth to all the clients, even if there are hundreds of thousands of clients connected simultaneously.
I will connect the player with the external video.
Note: I cannot use YouTube or Vimeo because the video is 360° technology, so I need to use my custom player.
Please, could you suggest any service that offers this feature?
Thanks!!
I would say this is mostly a question of the streaming technology you'd like to use, not of the storage alone.
E.g. if you wish to stream via some binary protocol like RTMP, you'll have to use software like Wowza for transcoding and delivery. In that case, load balancing for proper usage of bandwidth will also have to be handled by that Wowza-based infrastructure.
So you should decide which protocols and other technologies you plan on using; this will narrow your search parameters.