I have a strange problem playing some streams. They fail to play, and AVPlayerItem reports the error "Cannot Decode". Inspecting them in other software (e.g. VLC media player), I noticed that the problem is with the audio: every stream with an audio bitrate of 128 kb/s or higher is unplayable. Is it possible to convert the bitrate to a lower one on the fly, or is there another approach? I use AVPlayerLayer, AVPlayer and AVPlayerItem to play the streams.
I have a recording as a collection of files in mpegts format, like
audio: a-1.ts, a-2.ts, a-3.ts, a-4.ts
video: v-1.ts, v-2.ts, v-3.ts
I need to make a single video clip in mp4 or mkv format.
However, there are two problems:
The audio and video segments each have a different duration, and the number of audio segments differs from the number of video segments. The total durations of audio and video match. Hence I cannot concatenate audio and video segments pairwise using ffmpeg and merge them afterwards; the sync error grows progressively.
A few segments are corrupt or missing. So if I concatenate the audio and video streams separately using ffmpeg, I get streams of different lengths. When I merge these streams using ffmpeg, a/v synchronization is correct only up to the point where the first missing packet is encountered.
It's OK if video freezes for a while or there is silence for a while as long as most of the video is in sync with audio.
I've checked with TSDuck, and PCR seems to be present in all audio and video segments, yet I could not find a way to merge the streams using the MPEG-TS PCR as the sync reference. Please advise how I can achieve this.
I want to do live audio translation via a microphone: take a live video/audio stream from Facebook, plug the mic into a laptop, and mix the existing audio stream with the one coming from the mic (the translation). That part works; I use the "amix" audio filter to mix the two audio streams into one. Now I want to refine it. Is it possible, when voice is detected on the mic, to automatically fade the original audio stream down by about 20% so the translation (mic audio) is heard more clearly, and then, once the mic has been silent for, say, 3-5 seconds, to fade the original audio back up to normal volume? Is this asking too much of ffmpeg, or should I try SoX or something similar?
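One way the ducking described above is often approached in ffmpeg is sidechain compression, where the mic signal drives a compressor on the original audio. The sketch below only builds the command line in Python; it is not from the original post. The function names, the threshold/ratio/attack/release values, and the input and output names are illustrative assumptions, and the `release` setting is a recovery time rather than a true "wait 3-5 seconds" hold, so it only approximates the behaviour asked for.

```python
def build_ducking_filter(threshold=0.03, ratio=5, attack_ms=50, release_ms=3000):
    """Return a -filter_complex string that ducks input 0 under input 1.

    Input 0 is assumed to be the original stream, input 1 the microphone.
    All numeric defaults are illustrative starting points, not tuned values.
    """
    return (
        # Split the mic: one copy drives the compressor, one is mixed in.
        "[1:a]asplit=2[sc][mic];"
        # Compress the original audio whenever the mic exceeds the threshold.
        f"[0:a][sc]sidechaincompress=threshold={threshold}:ratio={ratio}"
        f":attack={attack_ms}:release={release_ms}[ducked];"
        # Mix the ducked original with the translation.
        "[ducked][mic]amix=inputs=2:duration=first[out]"
    )


def build_ducking_command(stream_input, mic_input, output):
    """Build the ffmpeg argument list; inputs and output are placeholders."""
    return [
        "ffmpeg", "-i", stream_input, "-i", mic_input,
        "-filter_complex", build_ducking_filter(),
        "-map", "0:v?", "-map", "[out]",
        "-c:v", "copy",
        output,
    ]


if __name__ == "__main__":
    print(" ".join(build_ducking_command("original.flv", "mic.wav", "mixed.flv")))
```

A live microphone would need a platform-specific capture input (for example an ALSA or DirectShow device) in place of `mic.wav`, and the amount of ducking is governed by `threshold` and `ratio` rather than by a fixed 20% figure, so both would have to be tuned by ear.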
I use a video player called MPV to transcode a dynamic playlist of media files.
I pipe MPV's encoded output into FFMPEG and format it for rtmp delivery.
However, the playlist may contain media with misaligned audio and video, i.e. the audio track may be shorter or longer than the video track.
No matter what MPV will only output what it's given. So if my media file has audio that is 1 second long and video that is 2 seconds long, it will output a media stream with exactly the same misalignment, rather than generating null audio or skipping to the next item in the playlist when it first encounters an active stream ending (eof).
For example, assuming my playlist was full of problematic media where the audio and video of each file was misaligned:
If I output this media stream to a popular streaming service's server, it could lead to stuttering and/or loss of a/v sync.
Similarly, if I output this media stream to a file and played it back in MPV or another video player, the result appears to be more like this:
I have tried to fix this in MPV in all sorts of ways, trying every relevant command line option available. I even wrote a user script that detects 'eof' audio and skips to the next item in the playlist, but it is not fast enough and still leads to small gaps of audio.
So my only hope is correcting it in ffmpeg. In the event of null audio/video, I need a fallback or a generative filter that can fill these empty gaps with silence (audio) or a colour/image (video).
I'm open to any ideas, and if my understanding of a/v encoding is a little off, please educate me.
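For the audio half of the gap-filling asked about above, one candidate is ffmpeg's `aresample` filter with `async=1`, which pads or trims audio so that it follows its timestamps. The sketch below builds such a command in Python. It assumes that mpv's output arrives on stdin with timestamps that actually expose the gaps, and that the target is an FLV/RTMP endpoint; the function name, bitrate and URL are hypothetical. It does not address missing video.

```python
def build_gap_fill_command(output_url, audio_bitrate="128k"):
    """Build an ffmpeg argument list that reads from stdin and pads audio gaps.

    Assumes the piped stream carries usable timestamps; whether the gaps
    show up as timestamp discontinuities depends on how mpv encodes them.
    """
    return [
        "ffmpeg",
        "-i", "pipe:0",                           # encoded stream piped from mpv
        "-c:v", "copy",                           # leave the video untouched
        "-af", "aresample=async=1:first_pts=0",   # fill timestamp gaps with silence
        "-c:a", "aac", "-b:a", audio_bitrate,     # filtering requires re-encoding audio
        "-f", "flv", output_url,
    ]


if __name__ == "__main__":
    print(" ".join(build_gap_fill_command("rtmp://example.invalid/live/key")))
```

Because the audio passes through a filter, it has to be re-encoded, while the video can still be stream-copied.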
Due to the richness and complexity of my app's audio content, I am using AVAudioEngine to manage all audio across the app. I am converting every audio source to be represented as a node in my AVAudioEngine graph.
For example, instead of using AVAudioPlayer objects to play mp3 files in my app, I create AVAudioPlayerNode objects using buffers of those audio files.
However, I do have a video player in my app that plays video files with audio using AVPlayer (I know of nothing else in iOS that can play video files). Unfortunately, there seems to be no way to obtain its audio output stream as a node in my AVAudioEngine graph.
Any pointers?
If you have a video file, you can extract the audio data from it.
Then you can set the volume of the AVPlayer to 0 (if you didn't remove the audio data from the video)
and play the extracted audio through an AVAudioPlayerNode.
If you receive the video data over the network, you will have to write a parser for the packets and separate the audio from the video yourself.
Be aware that keeping audio and video in sync this way is very hard.
I want to add a 5.1 .flac audio track to a .ts file that already has three audio tracks. I tried tsMuxeR and ffmpeg without success. In tsMuxeR the .flac track is not recognized. In ffmpeg everything seems to work until the very end, but when I check the result, the .flac audio track is not included in "output.ts". The .flac track is about 3 GB and its length is around two and a half hours.
Thank you so much.
I don't think you'll find any existing software that maps FLAC into an MPEG-2 Transport Stream.
This gives you an idea what sort of issues you run into: https://xiph.org/flac/ogg_mapping.html
Let's say you came up with a reasonable way of mapping FLAC into an MPEG-2 Transport Stream: there still won't be anything that can read it.
Unless there is a specified way of mapping FLAC into an MPEG-2 Transport Stream, you are on your own.
But PCM is supported in an MPEG-2 Transport Stream (for example, on Blu-ray).
I'd use ffmpeg to transcode your audio from FLAC to PCM and then mux it into your transport stream.
Your audio transcode (FLAC to PCM) is lossless.
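The transcode-and-mux step suggested above could look roughly like the following, sketched as a Python helper that builds the ffmpeg command line. The file names and the function name are hypothetical. The use of the `pcm_bluray` encoder together with `-mpegts_m2ts_mode 1` is an assumption about how a given ffmpeg build carries PCM in a transport stream, so it should be checked against the local build (for example with `ffmpeg -encoders`) before relying on it.

```python
def build_flac_to_ts_command(ts_in, flac_in, ts_out, existing_audio_tracks=3):
    """Build an ffmpeg argument list that adds a FLAC track to a TS as PCM.

    existing_audio_tracks is the number of audio streams already in ts_in,
    so the new track becomes output audio stream number existing_audio_tracks.
    """
    new_index = existing_audio_tracks
    return [
        "ffmpeg",
        "-i", ts_in,
        "-i", flac_in,
        "-map", "0",                           # keep every stream from the original TS
        "-map", "1:a:0",                       # add the FLAC audio as one more stream
        "-c", "copy",                          # copy the existing streams unchanged
        f"-c:a:{new_index}", "pcm_bluray",     # assumption: Blu-ray style LPCM encoder
        "-mpegts_m2ts_mode", "1",              # assumption: m2ts mode is needed for LPCM
        "-f", "mpegts", ts_out,
    ]


if __name__ == "__main__":
    print(" ".join(build_flac_to_ts_command("input.ts", "audio.flac", "output.ts")))
```

Since PCM is uncompressed, the added track will be noticeably larger than the 3 GB FLAC source.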