With the ultimate aim of trimming an Ogg file that contains a single Opus stream, I'm trying to retrieve and filter the Ogg pages from the file. The pages that sit within the crop window of startTimeMs and endTimeMs will be appended to the two Ogg header pages, resulting in a trimmed Opus file without transcoding.
I have reached the stage where I have access to the Ogg pages, but I'm confused about how to determine whether a page lies within the crop window or not.
OggOpusStream oggOpusStream = OggOpusStream.from("audio/technology.opus");
// Get ID Header
IdHeader idHeader = oggOpusStream.getIdHeader();
// Get Comment Header
CommentHeader commentHeader = oggOpusStream.getCommentHeader();
while (true) {
    AudioDataPacket audioDataPacket = oggOpusStream.readAudioPacket();
    if (audioDataPacket == null) break;
    for (OpusPacket opusPacket : audioDataPacket.getOpusPackets()) {
        if (packetLiesWithinTrimRange(opusPacket)) { writeToOutput(opusPacket); }
    }
}
// Create an output stream
OutputStream outputStream = ...;
// Create a new Ogg page
OggPage oggPage = OggPage.empty();
// Set header fields by calling setX() method
oggPage.setSerialNum(100);
// Add a data packet to this page
oggPage.addDataPacket(audioDataPacket.dump());
// Call dump() method to dump the OggPage object to byte array binary
byte[] binary = oggPage.dump();
// Write the binary to stream
outputStream.write(binary);
It should work if I can complete this method:
private boolean packetLiesWithinTrimRange(OpusPacket packet) {
    if (????????) { return true; }
    return false;
}
or maybe:
private boolean pageLiesWithinTrimRange(OggPage page) {
    if (????????) { return true; }
    return false;
}
Any ogg/opus help is appreciated.
https://github.com/leonfancy/oggus/issues/2
OggPage.java (which stores private long granulePosition):
https://github.com/leonfancy/oggus/blob/master/src/main/java/org/chenliang/oggus/ogg/OggPage.java
Ogg Encapsulation for the Opus Audio Codec
https://datatracker.ietf.org/doc/html/rfc7845
An audio page's end time can be calculated using the stream's pre-skip and the granule position of the first audio page. The page's start time is simply the previous page's end time. See the pseudocode below:
sampleRate = 48_000
streamPreskip = ...
streamGranulePosFirst = ...

isPageWithinTimeRange(page, prevPage, msStart, msEnd) {
    pageMsStart = getPageMsEnd(prevPage)
    pageMsEnd = getPageMsEnd(page)
    return pageMsStart >= msStart && pageMsEnd <= msEnd
}

getPageMsEnd(page) {
    // granule positions count 48 kHz samples, so multiply by 1000 to get milliseconds
    return (page.granulePos - streamGranulePosFirst - streamPreskip) * 1000 / sampleRate
}
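If it helps, here is a minimal Java sketch of the same check. It assumes oggus' OggPage exposes a getGranulePosition() getter for the granulePosition field linked above, and that streamPreskip and streamGranulePosFirst have been read from the ID header and the first audio page:
private static final long SAMPLE_RATE = 48_000L;

private long streamPreskip;          // from the ID header
private long streamGranulePosFirst;  // granule position of the first audio page

private boolean pageLiesWithinTrimRange(OggPage page, OggPage prevPage, long msStart, long msEnd) {
    long pageMsStart = getPageMsEnd(prevPage);
    long pageMsEnd = getPageMsEnd(page);
    return pageMsStart >= msStart && pageMsEnd <= msEnd;
}

private long getPageMsEnd(OggPage page) {
    // granule position counts 48 kHz PCM samples; convert the offset to milliseconds
    return (page.getGranulePosition() - streamGranulePosFirst - streamPreskip) * 1000 / SAMPLE_RATE;
}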
Related
I have stored a bunch of images in Azure Blob Storage. Now I want to retrieve them & resize them.
I have successfully managed to read a lot of information from the account, such as the filename, the date last modified, and the size, but how do I get the actual image? Examples I have seen show how to download it to a file, but that is of no use to me; I want to download it as an image so I can process it.
This is what I have so far:
BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient(containerName);
Console.WriteLine("Listing blobs...");
// build table to hold the info
DataTable table = new DataTable();
table.Columns.Add("ID", typeof(int));
table.Columns.Add("blobItemName", typeof(string));
table.Columns.Add("blobItemLastModified", typeof(DateTime));
table.Columns.Add("blobItemSizeKB", typeof(double));
table.Columns.Add("blobImage", typeof(Image));
// row counter for table
int intRowNo = 0;
// divider to convert Bytes to KB
double dblBytesToKB = 1024.00;
// List all blobs in the container
await foreach (BlobItem blobItem in containerClient.GetBlobsAsync())
{
    // increment row number
    intRowNo++;
    //Console.WriteLine("\t" + blobItem.Name);
    // length in bytes
    long? longContentLength = blobItem.Properties.ContentLength;
    double dblKb = 0;
    if (longContentLength.HasValue == true)
    {
        long longContentLengthValue = longContentLength.Value;
        // convert to double DataType
        double dblContentLength = Convert.ToDouble(longContentLengthValue);
        // Convert to KB
        dblKb = dblContentLength / dblBytesToKB;
    }
    // get the image
    // **** Image thisImage = what goes here ?? actual data from blobItem ****
    // Last modified date
    string date = blobItem.Properties.LastModified.ToString();
    try
    {
        DateTime dateTime = DateTime.Parse(date);
        //Console.WriteLine("The specified date is valid: " + dateTime);
        table.Rows.Add(intRowNo, blobItem.Name, dateTime, dblKb);
    }
    catch (FormatException)
    {
        Console.WriteLine("Unable to parse the specified date");
    }
}
You need to open a read stream for your image, and construct your .NET Image from this stream:
await foreach (BlobItem item in containerClient.GetBlobsAsync())
{
    var blobClient = containerClient.GetBlobClient(item.Name);
    using Stream stream = await blobClient.OpenReadAsync();
    Image myImage = Image.FromStream(stream);
    // ...
}
The BlobClient class also exposes some other helpful methods, such as downloading to a stream.
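For example, a minimal sketch of the download-to-stream route (same blobClient as above; DownloadToAsync fills the stream, which must be rewound before you read it back):
using var ms = new MemoryStream();
await blobClient.DownloadToAsync(ms);
ms.Position = 0; // rewind before constructing the image
Image myImage = Image.FromStream(ms);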
private IEnumerator GetData()
{
    WWWForm form = new WWWForm(); // create the form to send to PHP
    form.AddField("data", num);
    UnityWebRequest request = new UnityWebRequest();
    using (request = UnityWebRequest.Post("http://localhost/LoadData.php", form))
    {
        yield return request.SendWebRequest();
        if (request.isNetworkError)
        {
            Debug.Log(request.error);
        }
        else
        {
            Debug.Log(request.downloadHandler.text.Length);
            Debug.Log(request.downloadHandler.text[request.downloadHandler.text.Length - 1]);
            Debug.Log(request.downloadHandler.text[request.downloadHandler.text.Length - 2]);
            Debug.Log(request.downloadHandler.text[request.downloadHandler.text.Length - 3]);
            string str = request.downloadHandler.text;
            str = str.Substring(0, str.Length - 2);
            Debug.Log(str[str.Length - 1]);
            results = request.downloadHandler.data;
            byte[] by = Convert.FromBase64String(request.downloadHandler.text);
            Debug.Log(by.Length);
            Mesh mesh = MeshSerializer.ReadMesh(by);
            transform.GetComponent<MeshFilter>().mesh = mesh;
        }
    }
}
I am sending a byte[] via UnityWebRequest, but when sending and receiving, a space is added at the end. At first I compensated for this space, but the amount seems to differ with each character length.
The Base64 string seems to end with == but I don't know whether this is correct.
If you use UnityWebRequest, can you tell how many nulls are added?
Or is there a standard for how many nulls there are?
---added
After sending via UnityWebRequest (POST, then download), the length of the string increased by 2, so at first I simply deleted the last 2 characters.
As I keep using this method, I don't know what to do, because each object ends up with a different number of trailing nulls.
Is it correct to just find and strip the trailing blank characters?
ZWSP: zero-width space
https://en.wikipedia.org/wiki/Zero-width_space
A ZWSP can be created when data is not received in the usual way. I don't know for certain that the 'space' added at the end of your data is a ZWSP, though.
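If that is the case, one option (a sketch, not from the original answer; it assumes the payload is plain Base64) is to strip everything outside the Base64 alphabet before decoding, instead of removing a fixed number of trailing characters:
using System.Text.RegularExpressions;

// Remove ZWSPs, whitespace, NULs, or anything else outside the Base64 alphabet.
string raw = request.downloadHandler.text;
string cleaned = Regex.Replace(raw, @"[^A-Za-z0-9+/=]", "");
byte[] by = Convert.FromBase64String(cleaned);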
I'm at a loss with this — I have several personal projects in mind that essentially require that I "tap" into the audio stream: read the audio data, do some processing and modify the audio data before it is finally sent to the audio device.
One example of these personal projects is a software-based active crossover. If I have an audio device with 6 channels (i.e., 3 left + 3 right), then I can read the data, apply a LP filter (×2 – left + right), a BP filter, and a HP filter and output the streams through each of the six channels.
Notice that I know how to write a player application that does this — instead, I would want to do this so that any audio from any source (audio players, video players, youtube or any other source of audio being played by the web browser, etc.) is subject to this processing.
I've seen some of the examples (e.g., pcm_min.c from the alsa-project web site, play and record examples in the Linux Journal article by Jeff Tranter from Sep 2004) but I don't seem to have enough information to do something like what I describe above.
Any help or pointers will be appreciated.
You can implement your project as a LADSPA plugin, test it with audacity or any other program supporting LADSPA plugins, and when you like it, insert it into alsa/pulseaudio/jack playback chain.
"LADSPA" is a single header file defining a simple interface to write audio processing plugins. Each plugin has its input/output/control ports and run() function. The run() function is executed for each block of samples to do actual audio processing — apply "control" arguments to "input" buffers and write result to "output" buffers.
Example LADSPA stereo amplifier plugin (single control argument: "Amplification factor", two input ports, two output ports):
// gcc -fPIC -shared -o /full/path/to/plugindir/amp_example.so amp_example.c
#include <stdlib.h>
#include "ladspa.h"
enum PORTS {
    PORT_CAMP,
    PORT_INPUT1,
    PORT_INPUT2,
    PORT_OUTPUT1,
    PORT_OUTPUT2
};

typedef struct {
    LADSPA_Data *c_amp;
    LADSPA_Data *i_audio1;
    LADSPA_Data *i_audio2;
    LADSPA_Data *o_audio1;
    LADSPA_Data *o_audio2;
} MyAmpData;
static LADSPA_Handle myamp_instantiate(const LADSPA_Descriptor *Descriptor, unsigned long SampleRate)
{
    MyAmpData *data = (MyAmpData*)malloc(sizeof(MyAmpData));
    data->c_amp = NULL;
    data->i_audio1 = NULL;
    data->i_audio2 = NULL;
    data->o_audio1 = NULL;
    data->o_audio2 = NULL;
    return data;
}

static void myamp_connect_port(LADSPA_Handle Instance, unsigned long Port, LADSPA_Data *DataLocation)
{
    MyAmpData *data = (MyAmpData*)Instance;
    switch (Port)
    {
        case PORT_CAMP:    data->c_amp    = DataLocation; break;
        case PORT_INPUT1:  data->i_audio1 = DataLocation; break;
        case PORT_INPUT2:  data->i_audio2 = DataLocation; break;
        case PORT_OUTPUT1: data->o_audio1 = DataLocation; break;
        case PORT_OUTPUT2: data->o_audio2 = DataLocation; break;
    }
}

static void myamp_run(LADSPA_Handle Instance, unsigned long SampleCount)
{
    MyAmpData *data = (MyAmpData*)Instance;
    double amp = *data->c_amp;
    size_t i;
    for (i = 0; i < SampleCount; i++)
    {
        data->o_audio1[i] = data->i_audio1[i]*amp;
        data->o_audio2[i] = data->i_audio2[i]*amp;
    }
}

static void myamp_cleanup(LADSPA_Handle Instance)
{
    MyAmpData *data = (MyAmpData*)Instance;
    free(data);
}
static LADSPA_Descriptor myampDescriptor = {
    .UniqueID = 123, // for public release see http://ladspa.org/ladspa_sdk/unique_ids.html
    .Label = "amp_example",
    .Name = "My Amplify Plugin",
    .Maker = "alsauser",
    .Copyright = "WTFPL",
    .PortCount = 5,
    .PortDescriptors = (LADSPA_PortDescriptor[]){
        LADSPA_PORT_INPUT  | LADSPA_PORT_CONTROL,
        LADSPA_PORT_INPUT  | LADSPA_PORT_AUDIO,
        LADSPA_PORT_INPUT  | LADSPA_PORT_AUDIO,
        LADSPA_PORT_OUTPUT | LADSPA_PORT_AUDIO,
        LADSPA_PORT_OUTPUT | LADSPA_PORT_AUDIO
    },
    .PortNames = (const char * const[]){
        "Amplification factor",
        "Input left",
        "Input right",
        "Output left",
        "Output right"
    },
    .PortRangeHints = (LADSPA_PortRangeHint[]){
        { /* PORT_CAMP */
            LADSPA_HINT_BOUNDED_BELOW | LADSPA_HINT_BOUNDED_ABOVE | LADSPA_HINT_DEFAULT_1,
            0,  /* LowerBound */
            10  /* UpperBound */
        },
        {0, 0, 0}, /* PORT_INPUT1 */
        {0, 0, 0}, /* PORT_INPUT2 */
        {0, 0, 0}, /* PORT_OUTPUT1 */
        {0, 0, 0}  /* PORT_OUTPUT2 */
    },
    .instantiate = myamp_instantiate,
    //.activate = myamp_activate,
    .connect_port = myamp_connect_port,
    .run = myamp_run,
    //.deactivate = myamp_deactivate,
    .cleanup = myamp_cleanup
};

// NULL-terminated list of plugins in this library
const LADSPA_Descriptor *ladspa_descriptor(unsigned long Index)
{
    if (Index == 0)
        return &myampDescriptor;
    else
        return NULL;
}
(if you prefer a "short" 40-line version, see https://pastebin.com/unCnjYfD)
Add as many input/output channels as you need and implement your code in the myamp_run() function. Build the plugin and set the LADSPA_PATH environment variable to the directory where you built it, so that other apps can find it:
export LADSPA_PATH=/usr/lib/ladspa:/full/path/to/plugindir
Test it in Audacity or any other program supporting LADSPA plugins. To test it from a terminal you can use the applyplugin tool from the "ladspa-sdk" package:
applyplugin input.wav output.wav /full/path/to/plugindir/amp_example.so amp_example 2
And if you like the result, insert it into your default playback chain. For plain ALSA you can use a config like the following (it won't work for pulse/jack):
# ~/.asoundrc
pcm.myamp {
    type plug
    slave.pcm {
        type ladspa
        path "/usr/lib/ladspa"  # required but ignored as `filename` is set
        slave.pcm "sysdefault"
        playback_plugins [{
            filename "/full/path/to/plugindir/amp_example.so"
            label "amp_example"
            input.controls [ 2.0 ]  # Amplification=2
        }]
    }
}
# to test it: aplay -Dmyamp input.wav
# to point "default" pcm to it uncomment next line:
#pcm.!default "myamp"
See also:
ladspa.h - answers to most technical questions are there in comments
LADSPA SDK overview
listplugins and analyseplugin tools from "ladspa-sdk" package
alsa plugins : "type ladspa" syntax, and alsa configuration file syntax
ladspa plugins usage examples
If you want to get your hands dirty with some code, you could check out some of these articles by Paul Davis (a Linux audio guru). You'll have to combine the playback and capture examples to get live audio. Give it a shot, and if you have problems you can post a code-specific question on SO.
Once you get the live audio working, you can implement an LP filter and go from there.
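To make "combine the playback and capture examples" concrete, here is a minimal alsa-lib sketch, assuming S16_LE stereo at 48 kHz on the "default" devices, with error handling abbreviated and a placeholder where the real filter would go:
#include <alsa/asoundlib.h>

int main(void)
{
    snd_pcm_t *cap, *play;
    short buf[2 * 480]; /* 10 ms of stereo S16 frames at 48 kHz */

    snd_pcm_open(&cap, "default", SND_PCM_STREAM_CAPTURE, 0);
    snd_pcm_open(&play, "default", SND_PCM_STREAM_PLAYBACK, 0);
    snd_pcm_set_params(cap, SND_PCM_FORMAT_S16_LE, SND_PCM_ACCESS_RW_INTERLEAVED,
                       2, 48000, 1, 100000);
    snd_pcm_set_params(play, SND_PCM_FORMAT_S16_LE, SND_PCM_ACCESS_RW_INTERLEAVED,
                       2, 48000, 1, 100000);

    for (;;) {
        snd_pcm_sframes_t n = snd_pcm_readi(cap, buf, 480);
        if (n < 0) { snd_pcm_recover(cap, n, 0); continue; }
        for (snd_pcm_sframes_t i = 0; i < 2 * n; i++)
            buf[i] /= 2;                  /* placeholder: apply the real filter here */
        if (snd_pcm_writei(play, buf, n) < 0)
            snd_pcm_prepare(play);        /* recover from underrun */
    }
}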
There are plenty of LADSPA and LV2 audio plugins that implement LP, HP and BP filters but I'm not sure if any are available for your particular channel configuration. It sounds like you want to roll your own anyway.
Currently we're implementing libspotify on a Windows 7 64-bit system. Everything seems to work fine except the playback. We get data from the callback, but the saved audio, even inspected in Audacity, is filled with abnormalities. To research further we took the Win32 sample (spshell) and modified it to save the music data to a file. Same problem: it is definitely the music, but with these ticks in it. I'm sure there's something simple I'm missing here, but I'm at a loss as to what the problem could be. Any help would be great since, as it stands, our project is at a standstill until we can resolve this.
The audio saved can be viewed here
http://uploader.crestron.com/download.php?file=8001d80992480280dba365752aeaca81
Below are the code changes I made to save the file (for testing only):
static FILE *pFile;
int numBytesToWrite = 0;
CRITICAL_SECTION m_cs;

int SP_CALLCONV music_delivery(sp_session *s, const sp_audioformat *fmt, const void *frames, int num_frames)
{
    if (num_frames == 0)
        return 0;
    EnterCriticalSection(&m_cs);
    numBytesToWrite = num_frames * fmt->channels * sizeof(short);
    if (numBytesToWrite > 0)
        fwrite(frames, sizeof(short), numBytesToWrite, pFile);
    LeaveCriticalSection(&m_cs);
    return num_frames;
}

static void playtrack_test(void)
{
    sp_error err;
    InitializeCriticalSection(&m_cs);
    pFile = fopen("C:\\zzzspotify.pcm", "wb");
    test_start(&playtrack);
    if ((err = sp_session_player_load(g_session, stream_track)) != SP_ERROR_OK) {
        test_report(&playtrack, "Unable to load track: %s", sp_error_message(err));
        return;
    }
    info_report("Streaming '%s' by '%s' this will take a while", sp_track_name(stream_track),
        sp_artist_name(sp_track_artist(stream_track, 0)));
    sp_session_player_play(g_session, 1);
}

void SP_CALLCONV play_token_lost(sp_session *s)
{
    fclose(pFile);
    DeleteCriticalSection(&m_cs);
    stream_track_end = 2;
    notify_main_thread(g_session);
    info_report("Playtoken lost");
}

static int check_streaming_done(void)
{
    if (stream_track_end == 2)
        test_report(&playtrack, "Playtoken lost");
    else if (stream_track_end == 1)
        test_ok(&playtrack);
    else
        return 0;
    fclose(pFile);
    stream_track_end = 0;
    return 1;
}
It looks like this is the problem:
fwrite(frames, sizeof(short), numBytesToWrite, pFile);
The fwrite documentation states that the second argument is the "size in bytes of each element to be written", and the third is the "number of elements, each one with a size of size bytes".
The way you're calling fwrite tells it to write numBytesToWrite * sizeof(short) bytes, which runs right off the end of the given buffer. I'm actually surprised it doesn't crash!
I'd suggest changing your fwrite call to something like:
fwrite(frames, sizeof(char), numBytesToWrite, pFile);
or:
int numSamplesToWrite = num_frames * fmt->channels;
fwrite(frames, sizeof(short), numSamplesToWrite, pFile);
Edit:
After looking at your audio in detail, I'm more convinced that this is the case. The song seems to be playing at half speed (i.e., 2x as much data is being written) and the artefacts seem to look like buffer overrun into random memory.
I'm a newbie with ffmpeg. I have a problem when some media has multiple audio streams.
Suppose an MKV file has three audio streams (MP3, WMA and WMAPro).
How do I change the stream index when demuxing using:
AVPacket inputPacket;
ret = av_read_frame(avInputFmtCtx, &inputPacket);
I'm looking for something like change_stream_index(int streamindex), so that when I call it (say, change_stream_index(2)), the next call to av_read_frame will demux a WMAPro frame instead of MP3.
Thanks guys!
Well, first you check the number of streams within the input. Then you store their indices in some variables (in my case I only have 2 streams, but you can easily expand that):
ptrFormatContext = avformat_alloc_context();
if (avformat_open_input(&ptrFormatContext, filename, NULL, NULL) != 0)
{
    qDebug("Error opening the input");
    exit(-1);
}
if (av_find_stream_info(ptrFormatContext) < 0)
{
    qDebug("Could not find any stream info");
    exit(-2);
}
dump_format(ptrFormatContext, 0, filename, (int) NULL);
for (i = 0; i < ptrFormatContext->nb_streams; i++)
{
    switch (ptrFormatContext->streams[i]->codec->codec_type)
    {
        case AVMEDIA_TYPE_VIDEO:
        {
            if (videoStream < 0) videoStream = i;
            break;
        }
        case AVMEDIA_TYPE_AUDIO:
        {
            if (audioStream < 0) audioStream = i;
        }
    }
}
if (audioStream == -1)
{
    qDebug("Could not find any audio stream");
    exit(-3);
}
if (videoStream == -1)
{
    qDebug("Could not find any video stream");
    exit(-4);
}
Since you don't know in which order the streams come in, you'll also have to check the name of the codec, ptrFormatContext->streams[i]->codec->codec_name, and save the index for the corresponding target format.
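For instance, a small sketch along those lines (hypothetical: the codec_name value to match, here "wmapro", is an assumption and depends on the codec):
// Pick the audio stream whose codec name matches the desired target format.
for (i = 0; i < ptrFormatContext->nb_streams; i++)
{
    AVCodecContext *cc = ptrFormatContext->streams[i]->codec;
    if (cc->codec_type == AVMEDIA_TYPE_AUDIO && strcmp(cc->codec_name, "wmapro") == 0)
    {
        audioStream = i;
        break;
    }
}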
Then you can just access the stream through the given index:
while (av_read_frame(ptrFormatContext, &ptrPacket) >= 0)
{
    if (ptrPacket.stream_index == videoStream)
    {
        // decode the video stream to raw format
        if (avcodec_decode_video2(ptrCodecCtxt, ptrFrame, &frameFinished, &ptrPacket) < 0)
        {
            qDebug("Error decoding the Videostream");
            exit(-13);
        }
        if (frameFinished)
        {
            printf("%s\n", (char*) ptrPacket.data);
            // encode the video stream to target format
            // av_free_packet(&ptrPacket);
        }
    }
    else if (ptrPacket.stream_index == audioStream)
    {
        // decode the audio stream to raw format
        // if (avcodec_decode_audio3(aCodecCtx, , , &ptrPacket) < 0)
        // {
        //     qDebug("Error decoding the Audiostream");
        //     exit(-14);
        // }
        // encode the audio stream to target format
    }
}
I just copied some extracts from a program of mine, but this will hopefully help you understand how to select streams from the input.
I did not post complete code, only excerpts, so you will have to do some initialization etc. on your own, but if you have any questions I'll gladly help!
I was facing the same problem today. In my case I'm dealing with an mp4 file with 80 tracks, and obviously, if you just need to demux a single track, you don't want to skip up to 79 packets for every packet of the selected stream.
The solution for me was setting the discard attribute of all the streams I'm not interested in to AVDISCARD_ALL. For example, in order to select only the stream with index 71 you can do this:
int32_t stream_index = 71;
for (int32_t i = 0; i < pFormatContext->nb_streams; i++)
{
    if (stream_index != i) pFormatContext->streams[i]->discard = AVDISCARD_ALL;
}
After this you can call av_seek_frame or av_read_frame and only track 71 is processed.
Just for reference, here is the list of all available discard types:
AVDISCARD_NONE =-16, ///< discard nothing
AVDISCARD_DEFAULT = 0, ///< discard useless packets like 0 size packets in avi
AVDISCARD_NONREF = 8, ///< discard all non reference
AVDISCARD_BIDIR = 16, ///< discard all bidirectional frames
AVDISCARD_NONINTRA= 24, ///< discard all non intra frames
AVDISCARD_NONKEY = 32, ///< discard all frames except keyframes
AVDISCARD_ALL = 48, ///< discard all
The answer by Dimitri Podborski is good! But there's a small issue with that approach. If you inspect the code of the av_read_frame function, you'll find that there can be two cases:
format_context->flags & AVFMT_FLAG_GENPTS == true - then OK, the approach works
format_context->flags & AVFMT_FLAG_GENPTS == false - then the discard field of a stream is ignored, and av_read_frame reads all the packets.
So, obviously, you should go with the AVDISCARD_ALL approach, but if the GENPTS flag is absent, fall back to classic stream-index inspection.
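A minimal sketch of that combined approach (a hypothetical helper, not from either answer): set the discard flags when GENPTS is present, and filter packets by stream index in the read loop either way:
#include <libavformat/avformat.h>

// Hypothetical helper: use discard when AVFMT_FLAG_GENPTS is set, and check
// pkt.stream_index on every packet as the fallback.
static void read_selected_stream(AVFormatContext *fmt_ctx, int wanted_index)
{
    AVPacket pkt;
    if (fmt_ctx->flags & AVFMT_FLAG_GENPTS) {
        for (unsigned int i = 0; i < fmt_ctx->nb_streams; i++)
            if ((int)i != wanted_index)
                fmt_ctx->streams[i]->discard = AVDISCARD_ALL;
    }
    while (av_read_frame(fmt_ctx, &pkt) >= 0) {
        if (pkt.stream_index == wanted_index) {
            // process the packet here ...
        }
        av_packet_unref(&pkt); // av_free_packet() on old FFmpeg versions
    }
}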