Maximizing Encoding Speed with Windows Azure Media Services Encoding

I have an android application (client), asp.net web api web server (server), and Windows Azure Media Services (WAMS) account.
What I Want: To upload a 3-30 second video from the client to the server and have it encoded with WAMS, then available for streaming via HLSv3, as quickly as possible. Ideally a video preview image would be generated as well. By "as fast as possible" I mean something like a sub-one-minute turnaround. That's likely not realistic, I realize, but the faster the better.
Where I'm At: We upload the video to the server as a stream, which then stores it in Azure blob storage. The server returns to the client indicating upload success. The server has an action that kicks off the encoding, which then gets called. I run a custom encoding task based on the H264 Adaptive Bitrate MP4 Set 720p preset, modified to take a 640x480 video and crop it to 480x480 as part of the encode. Then I run a thumbnail job that generates one thumbnail at 480x480. Depending on the reserved encoder quality this can take ~5 mins to ~2 mins. The encoding job time is only 30-60 seconds of that; the rest is a mix of queue time, publishing time, and communication delay.
What can I do to improve the turnaround time from client upload to streamable video? Where are the bottlenecks in the encoding process? Is there a reasonable maximum speed that can be achieved? Are there config settings that can be tweaked to improve performance?

Reduce the number of jobs
The first thing that springs to mind is that, given you're only interested in a single thumbnail, you should be able to consolidate your encode and thumbnail jobs by adding something like this to the MediaFile element of your encode preset:
<MediaFile ThumbnailTime="00:00:00"
           ThumbnailMode="BestFrame"
           ThumbnailJpegCompression="75"
           ThumbnailCodec="Jpeg"
           ThumbnailSize="480, 480"
           ThumbnailEmbed="False">
The thumbnail will end up in the output asset along with your video streams.
Reduce the number of presets in the task
Another thing to consider is that the preset that you linked to has multiple presets defined within it in order to produce audio streams at different bitrates. My current understanding is that each of these presets is processed sequentially by the encode unit.
The first preset defines the video streams, and also specifies that each video stream should have the audio muxed in at 96kbps. This means that your video files will be larger than they probably need to be, and some time will be taken up in the muxing process.
The second and third presets just define the audio streams to output - these wouldn't contain any video. The first of these outputs the audio at 96kbps, the second at 56kbps.
Assuming you're happy with a fixed audio quality of 96kbps, I would suggest removing the audio from the video streams and removing the last of the audio streams (56kbps) - that would avoid the same audio being encoded twice, and avoid audio being muxed in with the video. (Given what I can tell about your usage, you probably don't need that anyway.)
The side benefit of this would be that your encoder output file size will go down marginally, and hence the cost of encodes will too.
Workflow optimisation
The only other point I would make is regarding the workflow by which you get your video files into Azure in the first place. You say that you're uploading them into blob storage - I assume that you're subsequently copying them into an AMS asset so they can be configured as inputs for the job. If that's right, you may save a bit of time by uploading directly into an asset.
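For illustration, uploading into an asset is ultimately just a blob write against the asset's storage container. Here is a minimal TypeScript sketch of that storage-level step, assuming your server has already created the asset and handed back a write SAS URL for a blob inside its container (the assetBlobSasUrl parameter is a hypothetical placeholder, not an exact AMS SDK call):

// Minimal sketch, not the exact AMS API: it assumes the output asset already
// exists and that you can obtain a write (upload) SAS URL for a blob inside
// that asset's storage container. With that URL, the bytes go straight into
// the asset instead of being staged in a separate container and copied later.
import { BlockBlobClient } from "@azure/storage-blob";

async function uploadIntoAsset(assetBlobSasUrl: string, localPath: string): Promise<void> {
  // assetBlobSasUrl is hypothetical - it would be issued by whatever component
  // creates the AMS asset (asset container SAS plus the desired blob name).
  const blob = new BlockBlobClient(assetBlobSasUrl);

  // uploadFile streams the file in blocks and parallelises the transfer.
  await blob.uploadFile(localPath, {
    blobHTTPHeaders: { blobContentType: "video/mp4" },
    concurrency: 4, // tune to the available upstream bandwidth
  });
}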
Hope that helps, and good luck!

Related

Azure Communication Services (Calling SDK) - How many video streams are supported?

I am very confused about the Calling SDK specs. They are clear about the fact that only one video stream can be rendered at a time, see here...
BUT when I try out the following sample I get video streams for all members of the group call. When I try the other example (both from MS), it behaves as written in the specs... So I am totally confused as to why this other example can render more than one video stream in parallel. Can anybody tell me how to understand this? Is it possible or not?
EDIT: I found out that both examples work with multiple video streams. So it is cool that the service provides more than the specs say, but I do not understand why the specs describe a limitation that does not seem to exist...
Only one video stream is officially supported in the ACS Web (JS) Calling SDK. Multiple video streams can be rendered for incoming calls, but A/V quality is not guaranteed at this stage for more than one video. Support for 4 (2x2) and 9 (3x3) streams is on the roadmap, and we'll publish support as network bandwidth requirements paired with quality assurance testing and verification are identified and completed.
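For reference, rendering every available remote stream with the Web (JS) Calling SDK looks roughly like the sketch below: subscribe to each remote participant's video streams and create one VideoStreamRenderer per stream. This is a hedged sketch based on my reading of @azure/communication-calling (event and class names such as videoStreamsUpdated and VideoStreamRenderer are as documented, but please verify against the current SDK); it says nothing about the A/V quality guarantees described above.

// Hedged sketch: render every available remote video stream in a group call.
import { Call, RemoteParticipant, RemoteVideoStream, VideoStreamRenderer } from "@azure/communication-calling";

function renderAllRemoteVideo(call: Call, container: HTMLElement): void {
  const attach = async (stream: RemoteVideoStream) => {
    const renderer = new VideoStreamRenderer(stream);
    const view = await renderer.createView();
    container.appendChild(view.target); // one element per remote stream
  };

  const watchParticipant = (p: RemoteParticipant) => {
    // Streams that are already available when we start watching.
    p.videoStreams.forEach((s) => { if (s.isAvailable) void attach(s); });
    // Streams added later, or that become available later.
    p.on("videoStreamsUpdated", (e) => {
      e.added.forEach((s) => {
        s.on("isAvailableChanged", () => { if (s.isAvailable) void attach(s); });
        if (s.isAvailable) void attach(s);
      });
    });
  };

  call.remoteParticipants.forEach(watchParticipant);
  call.on("remoteParticipantsUpdated", (e) => e.added.forEach(watchParticipant));
}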

Fragmented mp4 for DASH and HLS On Demand vs Live Profiles

I'm experimenting with Bento4 and Shaka Packager to output files for both DASH and HLS using fragmented mp4.
I'm having some trouble understanding the differences and pros and cons between the MPEG-DASH Live and On-Demand profiles. If I was streaming live broadcast content I would use the Live profile but for static on demand videos it seems I can use the On-Demand or Live profile. Each profile outputs files in a completely different file format and folder structure with On-Demand outputting a flat folder structure containing .mp4 files and Live outputting a nested folder structure containing m4s files.
Is it advisable to use one profile rather than the other for static video content that will not be broadcast live (e.g. browser support, efficacy, etc.), and if so, why?
The "live" profile is somewhat of a misnomer, because it isn't really related to live streaming. The main difference is that with the on-demand profile, the server hosts large flat files with many segments per file (where a segment is a short portion of a media asset, like audio or video, typically 2 to 10 seconds each), including an index of where the segments are in the file. It is then up to the streaming client to access the segments one by one by making HTTP "range" requests for portions of the media assets. For the "live" profile, segments are not accessed as ranges in a flat resource, but as a separate resource for each segment (a separate URL for each segment). This doesn't necessarily mean that the HTTP server needs to have the segments in separate files, but it needs to be able to map each segment URL to its corresponding media, either by performing a lookup in an index into a flat file itself, by having each segment in a separate file, or by any other means. So it is up to the server to do the heavy lifting (as opposed to the "on-demand" profile, where it is the player/client that does that).
With packagers like Bento4, if no special assumptions are made about the HTTP server that will serve the media, the default mode for the "live" profile is to store each segment in a separate file, so that the stream can be served by any off-the-shelf HTTP server.
So, for simplicity, if your player supports the on-demand profile, that's an easier one to choose, since you'll have fewer files.
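To make the client-side difference concrete, here is a rough TypeScript sketch. The URLs and byte ranges are invented for illustration; a real player reads them from the MPD (SegmentBase/sidx for on-demand, SegmentTemplate or SegmentList for the live profile):

// Illustration only: how a DASH client fetches media under the two profiles.
// The URLs and byte ranges below are invented placeholders.

// On-demand profile: one flat .mp4 per representation, accessed with Range requests.
async function fetchOnDemandSegment(): Promise<ArrayBuffer> {
  const res = await fetch("https://example.com/video_720p.mp4", {
    headers: { Range: "bytes=1048576-2097151" }, // second ~1 MiB chunk of the file
  });
  return res.arrayBuffer(); // server answers 206 Partial Content
}

// "Live" profile: each segment is its own resource with its own URL.
async function fetchLiveProfileSegment(segmentNumber: number): Promise<ArrayBuffer> {
  const res = await fetch(`https://example.com/video_720p/segment-${segmentNumber}.m4s`);
  return res.arrayBuffer(); // server serves a small file (or maps the URL itself)
}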

How do online radios live stream music, and are there resources available to build one with Node.js?

I'm a little curious about how live streaming web applications work. Recently I've wanted to build something like an online radio that can live stream to all clients - music, speech, etc. I'm quite familiar with Java Spring MVC and Node.js. If there are some resources using the above technologies, it would be really helpful for me to see how it works. Thanks in advance.
There are two good articles about it:
Streaming Audio on the Web with NodeJS
Using NodeJS to Stream a Radio Broadcast
You may also find this module helpful:
https://www.npmjs.com/package/websockets-streaming-audio
The best way to do this is to use Node.js as your source application, and leave the actual serving of streams to existing servers. No reason to re-invent streaming on the web if you can get all the flexibility you need by writing the source end.
The flow will look like this:
Your Radio Source App --> Icecast (or similar) --> Listeners
Inside your app itself:
Raw audio sources --> Codecs (MP3, AAC w/ADTS, etc.) --> Icecast Source Client
Basically, you'll need to create a raw PCM audio stream using whatever method you want for your use case. From there, you'll send that stream off to a handful of codecs, configured with different bitrates. What bitrate and quality you use is up to you, based on the bandwidth available to your users and the quality tradeoff you prefer. These days, I usually have 64k streams for bad mobile connections, and 256k streams for good connections. As long as you have at least a 128k stream in there, you'll be putting out acceptable quality.
The Icecast source client can be a simple HTTP PUT these days. The old method is very similar... instead of PUT, the verb was SOURCE. (There are some other minor differences as well, but that's the gist.)
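To make that concrete, here is a hedged TypeScript/Node sketch of the source-client end: raw PCM is piped through an encoder (ffmpeg, used purely as an example) and the encoded output is sent to an Icecast mount with an HTTP PUT. The host, mount point, credentials and bitrate are placeholders, and older Icecast versions only speak SOURCE rather than PUT:

// Hedged sketch: push one encoded stream to an Icecast mount over HTTP PUT.
import { spawn } from "node:child_process";
import * as http from "node:http";
import { Readable } from "node:stream";

function streamToIcecast(pcmSource: Readable): void {
  // Encode 44.1 kHz stereo 16-bit PCM from stdin to 128 kbps MP3 on stdout.
  const encoder = spawn("ffmpeg", [
    "-f", "s16le", "-ar", "44100", "-ac", "2", "-i", "pipe:0",
    "-c:a", "libmp3lame", "-b:a", "128k", "-f", "mp3", "pipe:1",
  ]);

  const req = http.request({
    method: "PUT",
    host: "icecast.example.com", // placeholder host
    port: 8000,
    path: "/radio.mp3",          // placeholder mount point
    headers: {
      "Content-Type": "audio/mpeg",
      "Authorization": "Basic " + Buffer.from("source:hackme").toString("base64"),
      "Ice-Name": "My Station",  // optional stream metadata
    },
  });

  pcmSource.pipe(encoder.stdin!); // raw audio in
  encoder.stdout!.pipe(req);      // encoded audio out to Icecast
}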

Is the Azure Media Services streaming endpoint useful for MP4 video streaming speed?

We are using Azure Media Services to play the MP4 videos.
In Azure Media Services there is an option to change the streaming endpoints and their units (1 unit = 200 Mbps).
In my MP4 case, if I increase my streaming endpoint units, will there be any improvement in performance/speed? Or is streaming only applicable to MPEG-DASH/HLS (.ism) videos?
Currently we haven't provisioned any streaming endpoint units, but the video plays instantly in Azure Media Player without any delay on desktop.
But when it comes to a mobile device (Android Samsung S4, 5.0.1), the same Azure Media Player in the Chrome browser takes 10 seconds or more to start. To overcome this I finally used ExoPlayer to play the video; it also takes 6 to 7 seconds the very first time, but if we play the same video a second time it takes at most 3 seconds.
I don't want that delay either; it should come down to 1 or 2 seconds max, whether on the first play or any other.
Are streaming endpoints really useful in this case, or what are the alternative ways to achieve fast streaming on a mobile device?
Suggestions for the best instant-play video player for Xamarin Android are also welcome.
If your current video is multi-bitrate MP4 and you have no streaming reserved units, I guess you are getting a SAS URL for your video. That's progressive download: essentially the video is getting downloaded like a file directly from storage, and our streaming service is just passing it through. However, if you purchase one reserved unit, you are actually adaptively streaming the video in a streaming format such as Smooth Streaming, HLS or MPEG-DASH. The player will pull the right bitrates according to your current bandwidth and device CPU, which minimizes buffering. Here are some of my blogs explaining the concepts:
http://mingfeiy.com/progressive-download-video-streaming
http://mingfeiy.com/traditional-streaming-video-streaming
http://mingfeiy.com/adaptive-streaming-video-streaming
Therefore, increasing the reserved units beyond 1 doesn't help loading time if you don't hit the bandwidth limit. However, going from 0 to 1 reserved units does fundamentally improve performance.
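To illustrate the difference in practice: once a streaming unit and an origin (streaming) locator are in place, the same .ism manifest can be requested in the different adaptive formats just by changing the format parameter on the manifest URL. A hedged TypeScript sketch, with the host, locator ID and manifest name as placeholders:

// Hedged sketch: building adaptive-streaming URLs from an AMS origin locator.
// Host, locator ID, and manifest name are placeholders for illustration.
const origin = "https://myaccount.streaming.mediaservices.windows.net";
const locatorId = "00000000-0000-0000-0000-000000000000";
const manifest = `${origin}/${locatorId}/video.ism/manifest`;

const smoothUrl = manifest;                           // Smooth Streaming (default)
const hlsUrl    = `${manifest}(format=m3u8-aapl)`;    // HLS, for iOS/Android/ExoPlayer
const dashUrl   = `${manifest}(format=mpd-time-csf)`; // MPEG-DASH

// A SAS URL pointing straight at the .mp4 blob, by contrast, is plain
// progressive download: the whole file is fetched from storage with no
// bitrate switching.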

Using nodeJS for streaming videos

I am planning to write a Node.js server for streaming videos. One of my critical requirements is
to prevent video download (as much as possible), something similar to safaribooksonline.com.
I am planning to use Amazon S3 for storage and Node.js for streaming the videos to the client.
I want to know if Node.js is the right tool for streaming videos (max size 100 MB) for an application expecting a lot of users. If not, what are the alternatives?
Let me know if any additional details are required.
In very simple terms, you can't prevent video download. If a rogue client wants to do it, they generally can - the video has to make it to the client for the client to be able to play it back.
What is most commonly done is to encrypt the video so the downloaded version is unplayable without the right decryption key. A DRM system will allow the client to play the video without being able to copy it (depending on how determined the user is - a high-quality camera pointed at a high-quality screen is hard to protect against! In these scenarios, other tracing technologies come into play).
As others have mentioned in the comments, streaming servers are not simple - they have to handle a wide range of encoders, packaging formats, streaming formats, etc. to allow as much reach as possible, and they will have quite complicated mechanisms to ensure speed and reduce file storage requirements.
It might be an idea to look at some open source streaming servers to get a feel for the area, for example:
VideoLan (http://www.videolan.org/vlc/streaming.html)
GStreamer (https://gstreamer.freedesktop.org)
You can still use Node.js for the main web server component of your solution and just hand off the video streaming to the specialised streaming engines, if this meets your needs.
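As a rough illustration of that split, the Node.js piece can stay a thin, authenticated front door that hands the client off to whatever is actually serving the (ideally encrypted/DRM-protected) stream. A minimal hedged sketch using Express, where the route, the auth check and the streaming-origin URL are invented placeholders:

// Hedged sketch: Node.js handles auth and hands playback off to a streaming origin.
import express from "express";

const app = express();
const STREAMING_ORIGIN = "https://streams.example.com"; // packager/CDN/streaming server, etc.

app.get("/watch/:videoId", (req, res) => {
  if (!req.headers.authorization) {            // stand-in for your real auth/session check
    return res.status(401).send("Not logged in");
  }
  // Redirect (or return in JSON) the manifest URL on the streaming server;
  // Node never shovels the video bytes itself.
  res.redirect(`${STREAMING_ORIGIN}/${req.params.videoId}/manifest.mpd`);
});

app.listen(3000);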
