How can I extract the image from a JPEG-compressed TIFF file?
I've read the bytes indicated by the StripOffsets and StripByteCounts fields, but I couldn't load an image from them.
Old-style TIFF-JPEG (compression type 6) basically stuffed a normal JFIF file inside of a TIFF wrapper. The newer style TIFF-JPEG (compression type 7) allows the JPEG table data (Huffman, quantization) to be stored in a separate tag (0x015B JPEGTables). This allows you to put strips of JPEG data with SOI/EOI markers in the file without having to repeat the Huffman and quantization tables. This is probably what you're seeing with your file: the individual strips begin with the sequence FFD8, but are missing the Huffman and quantization tables. This is the way that Photoshop products usually write the files.
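One way to read the paragraph above: if the JPEGTables tag holds a small stream of the form SOI, tables, EOI, and each strip holds SOI, scan data, EOI, then a standalone JPEG can be assembled by dropping the EOI of the tables and the SOI of the strip. A minimal sketch in Python, assuming both byte strings have already been read from the file (the function name is made up for illustration):

```python
SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def merge_tables_and_strip(jpeg_tables: bytes, strip: bytes) -> bytes:
    """Build one JPEG stream from JPEGTables (tag 0x015B) and one strip.

    Assumption: both inputs are complete mini-streams (SOI ... EOI).
    """
    if not (jpeg_tables.startswith(SOI) and jpeg_tables.endswith(EOI)):
        raise ValueError("JPEGTables must start with SOI and end with EOI")
    if not strip.startswith(SOI):
        raise ValueError("strip must start with SOI")
    # Keep SOI + tables, drop the tables' EOI, then append the strip without its SOI.
    return jpeg_tables[:-2] + strip[2:]
```

This avoids recompression, but treat it as a starting point only: each strip of a multi-strip image becomes its own JPEG, and the colour interpretation still comes from the TIFF tags rather than from the JPEG stream.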
Using JAI:
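import java.awt.image.RenderedImage;
import com.sun.media.jai.codec.ByteArraySeekableStream;
import com.sun.media.jai.codec.ImageCodec;
import com.sun.media.jai.codec.ImageDecoder;
import com.sun.media.jai.codec.SeekableStream;
import com.sun.media.jai.codec.TIFFDirectory;

int TAG_COMPRESSION = 259;
int TAG_JPEG_INTERCHANGE_FORMAT = 513;
int COMP_JPEG_OLD = 6;
int COMP_JPEG_TTN2 = 7;

// imageData is the byte[] holding the whole TIFF file
SeekableStream stream = new ByteArraySeekableStream(imageData);
TIFFDirectory tdir = new TIFFDirectory(stream, 0);
int compression = tdir.getField(TAG_COMPRESSION).getAsInt(0);

// Decoder name
String decoder2use = "tiff";
if (compression == COMP_JPEG_OLD) {
    // Special handling for old/unsupported JPEG-in-TIFF format,
    // see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4929147
    // Skip to the embedded JPEG stream and decode it as a plain JPEG.
    stream.seek(tdir.getField(TAG_JPEG_INTERCHANGE_FORMAT).getAsLong(0));
    decoder2use = "jpeg";
}

// Decode image
ImageDecoder dec = ImageCodec.createImageDecoder(decoder2use, stream, null);
RenderedImage img = dec.decodeAsRenderedImage();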
Great solution, it helped me a lot.
Just to add: if the TIFF has multiple pages, you have to repeat all of the above for each page, passing a different directory index to the TIFFDirectory constructor. For example, for the second page:
TIFFDirectory tdir = new TIFFDirectory(stream, 1);
The problem with the libtiff approach mentioned here is that it extracts the image and then saves it recompressed, which means another quality loss in the case of JPEG. That said, I can accomplish the same thing without a third-party library, just by calling the GDI+ methods of the .NET Framework.
The original poster of this thread is trying to get the JPEG binary data without having to recompress it, and that is exactly what I am trying to do as well.
This is a possible solution if you can live with the quality loss and do not want to use anything but .NET library classes:
public static int SplitMultiPage(string sourceFileName, string targetPath)
{
    using (Image multipageTIFF = Image.FromFile(sourceFileName))
    {
        int pageCount = multipageTIFF.GetFrameCount(FrameDimension.Page);
        if (pageCount > 1)
        {
            string sFileName = Path.GetFileNameWithoutExtension(sourceFileName);
            for (int i = 0; i < pageCount; i++)
            {
                multipageTIFF.SelectActiveFrame(FrameDimension.Page, i);
                // A single frame could also have a different format, e.g. JPG, PNG, BMP, etc.
                // Interestingly, for TIFF (the only multi-frame case) RawFormat always returns
                // the TIFF codec instead of the codec of the frame.
                ImageCodecInfo codec = Helpers.GetEncoder(multipageTIFF.RawFormat);
                EncoderParameters encoderParams = new EncoderParameters(2);
                encoderParams.Param[0] = new EncoderParameter(System.Drawing.Imaging.Encoder.SaveFlag, (long)EncoderValue.LastFrame);
                // For 1-bit TIFFs, CCITT4 compression gives the best results.
                switch (GetCompressionType(multipageTIFF))
                {
                    case 1: // No compression -> BMP?
                        encoderParams.Param[1] = new EncoderParameter(System.Drawing.Imaging.Encoder.Compression, (long)EncoderValue.CompressionNone);
                        break;
                    case 2: // CCITT modified Huffman RLE
                        encoderParams.Param[1] = new EncoderParameter(System.Drawing.Imaging.Encoder.Compression, (long)EncoderValue.CompressionRle);
                        break;
                    case 3: // CCITT Group 3 fax encoding
                        encoderParams.Param[1] = new EncoderParameter(System.Drawing.Imaging.Encoder.Compression, (long)EncoderValue.CompressionCCITT3);
                        break;
                    case 4: // CCITT Group 4 fax encoding
                        encoderParams.Param[1] = new EncoderParameter(System.Drawing.Imaging.Encoder.Compression, (long)EncoderValue.CompressionCCITT4);
                        break;
                    case 5: // LZW
                        encoderParams.Param[1] = new EncoderParameter(System.Drawing.Imaging.Encoder.Compression, (long)EncoderValue.CompressionLZW);
                        break;
                    case 6: // JPEG ('old-style' JPEG, later overridden in Technote2)
                    case 7: // Technote2 overrides old-style JPEG compression, and defines 7 = JPEG ('new-style' JPEG)
                        codec = Helpers.GetEncoder(ImageFormat.Jpeg);
                        // Quality must be passed as a long; a plain int literal selects the wrong overload.
                        encoderParams.Param[1] = new EncoderParameter(System.Drawing.Imaging.Encoder.Quality, 90L);
                        break;
                    default: // e.g. PackBits (32773) or Deflate (8): no matching EncoderValue, fall back to LZW
                        encoderParams.Param[1] = new EncoderParameter(System.Drawing.Imaging.Encoder.Compression, (long)EncoderValue.CompressionLZW);
                        break;
                }
                // To give the file the correct extension, take it from the description of the
                // codec that is actually used for saving (it may have been switched to JPEG above).
                string sExtension = codec.FilenameExtension.Split(new char[] { ';' })[0];
                sExtension = sExtension.Substring(sExtension.IndexOf('.') + 1);
                string newFileName = Path.Combine(targetPath, string.Format("{0}_{1}.{2}", sFileName, i + 1, sExtension));
                multipageTIFF.Save(newFileName, codec, encoderParams);
            }
        }
        return pageCount;
    }
}
The helper method used:
public static ImageCodecInfo GetEncoder(ImageFormat format)
{
    // Look the codec up among the encoders, since it is used for saving.
    ImageCodecInfo[] codecs = ImageCodecInfo.GetImageEncoders();
    foreach (ImageCodecInfo codec in codecs)
    {
        if (codec.FormatID == format.Guid)
        {
            return codec;
        }
    }
    return null;
}
Reading the compression flag:
public static int GetCompressionType(Image image)
{
    /* TIFF Tag Compression
       IFD           Image
       Code          259 (hex 0x0103)
       Name          Compression
       LibTiff name  TIFFTAG_COMPRESSION
       Type          SHORT
       Count         1
       Default       1 (No compression)
       Description
       Compression scheme used on the image data.
       The specification defines these values to be baseline:
           1 = No compression
           2 = CCITT modified Huffman RLE
           32773 = PackBits compression, aka Macintosh RLE
       Additionally, the specification defines these values as part of the TIFF extensions:
           3 = CCITT Group 3 fax encoding
           4 = CCITT Group 4 fax encoding
           5 = LZW
           6 = JPEG ('old-style' JPEG, later overridden in Technote2)
       Technote2 overrides old-style JPEG compression, and defines:
           7 = JPEG ('new-style' JPEG)
       Adobe later added the deflate compression scheme:
           8 = Deflate ('Adobe-style')
       The TIFF-F specification (RFC 2301) defines:
           9 = Defined by TIFF-F and TIFF-FX standard (RFC 2301) as ITU-T Rec. T.82 coding, using ITU-T Rec. T.85 (which boils down to JBIG on black and white).
           10 = Defined by TIFF-F and TIFF-FX standard (RFC 2301) as ITU-T Rec. T.82 coding, using ITU-T Rec. T.43 (which boils down to JBIG on color).
    */
    int compressionTagIndex = Array.IndexOf(image.PropertyIdList, 0x103);
    if (compressionTagIndex < 0)
    {
        // Tag absent: use the default from the specification (1 = no compression).
        return 1;
    }
    PropertyItem compressionTag = image.PropertyItems[compressionTagIndex];
    // SHORT is an unsigned 16-bit value; ToInt16 would turn 32773 (PackBits) into a negative number.
    return BitConverter.ToUInt16(compressionTag.Value, 0);
}
If you are trying to extract the actual image from a TIFF, JPEG-compressed or otherwise, you are best off using a library such as libtiff. TIFF is a very complicated spec, and while you might be able to do this yourself and handle one or two classes of images, chances are you wouldn't be able to handle the other cases that arise frequently, especially "old-style" JPEG, a sub-format that was foisted upon TIFF and doesn't fit well into the overall design.
My company, Atalasoft, makes a .NET product that includes a very good codec for TIFF. If you only need to worry about single page images, our free product will work just fine for you.
In the .NET realm, you could also look at Bit Miracle's managed version of libtiff. It is a pretty decent port of the library.
Let's say that I am reading from a data stream, and that stream is sending the content of an H.264 video feed. I read from that stream and end up with some amount of data consisting of an indeterminate number of frames (NAL units?). Given that I know the frame rate and size of the originating video, how would I go about converting this snippet into an MP4 that I could view? The video does not contain audio.
I want to do this using Node.js. My attempts so far have produced nothing resembling a valid H.264 file to convert into MP4. My idea was to strip any data preceding the first start code found in the data, write the rest to a file, and use ffmpeg (currently just testing on the command line) to convert the file to MP4.
What's the correct way to go about doing this?
i.e. something like this (it's in TypeScript, but same thing):
// We assume here that when this while loop exits, at least one full frame of data will have been read and written to disk
let stream: WriteStream = fs.createWriteStream("./test.h264")
while (someDataStream.available()) { // just an example, not real code
    let data: Buffer = someDataStream.readSomeData() // just an example, not a real method call
    let file = null;
    try {
        file = fs.statSync("./test.h264");
    } catch (error) {
        console.error(error)
    }
    if (!stream.writable) {
        console.error("stream not writable")
    } else if (file == null || file.size <= 0) {
        let index = data.indexOf(0x7C)
        console.log("index: " + index)
        if (index > 0) {
            console.log("index2: " + data.slice(index).indexOf(0x7c))
            stream.write(data.slice(index))
        }
    } else {
        stream.write(data)
    }
}
To handle a data stream, you'll need to emit fragmented MP4 (fMP4). Like all MP4, fMP4 streams begin with a preamble, here containing ftyp and moov boxes (segments may additionally start with an styp box). Then each frame is encoded with a moof / mdat box pair.
In order to generate a useful preamble from your H.264 bitstream, you need to locate an SPS / PPS pair of NALUs in the H.264 data, to set up the avc1 box within the moov box. Those two NALUs are often immediately followed by an I-frame (a key frame). The first frame in a stream must be an I-frame, and subsequent ones can be P- or B-frames.
It's a fairly complex task involving lots of bit-banging and buffer-shuffling (those are technical terms ;-).
I've been working on a piece of js code to extract H.264 from webm and put it into fmp4. It's not yet complete. It's backed up by another piece of code to decode the parts of the H264 stream that are needed to pack it properly into fMP4.
I wish I could write, "here are the ten lines of code you need" but those formats (fMP4 and H264) aren't simple enough to make that possible.
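I don't know why none of these questions actually has an easy answer. Here you go, a Node.js solution; the i argument is there in case you need to offset the search.
// Annex B start code (4-byte form); the name "soi" is kept from the original snippet
const soi = Buffer.from([0x00, 0x00, 0x00, 0x01]);
function findStartFrame(buffer, i = -1) {
    while ((i = buffer.indexOf(soi, i + 1)) !== -1) {
        // The NAL unit type is the low 5 bits of the byte after the start code; 7 = SPS
        if ((buffer[i + 4] & 0x1F) === 7) return i
    }
    return -1
}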
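The same search can be widened a little: Annex B streams may use a 3-byte start code (00 00 01) as well as the 4-byte form, and it is often useful to see which NAL unit types are present (7 = SPS, 8 = PPS, 5 = IDR slice). A rough sketch in Python, with hypothetical function names, assuming the whole chunk is already in memory:

```python
import re

# Annex B start codes are 00 00 01, optionally preceded by one more 00 byte.
START_CODE = re.compile(b"\x00\x00\x01")

NAL_SPS = 7  # sequence parameter set


def iter_nal_units(data: bytes):
    """Yield (offset_of_00_00_01, nal_type) for each NAL unit in the buffer."""
    for match in START_CODE.finditer(data):
        header_pos = match.end()
        if header_pos < len(data):
            # The NAL unit type is the low 5 bits of the first byte after the start code.
            yield match.start(), data[header_pos] & 0x1F


def find_first_sps(data: bytes) -> int:
    """Return the offset of the start code preceding the first SPS, or -1."""
    for offset, nal_type in iter_nal_units(data):
        if nal_type == NAL_SPS:
            # Include the extra leading zero byte of a 4-byte start code, if present.
            if offset > 0 and data[offset - 1] == 0:
                offset -= 1
            return offset
    return -1
```

Slicing the buffer at the returned offset gives data that starts with an SPS, which is usually what a tool such as ffmpeg needs to begin decoding a raw .h264 file.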
I need to convert audio data from AV_CODEC_ID_PCM_S16LE to AV_CODEC_ID_PCM_ALAW and I am using this code as an example. The example code does essentially this (error checking omitted for brevity):
const AVCodec* codec = avcodec_find_encoder(AV_CODEC_ID_MP2);
AVCodecContext* c = avcodec_alloc_context3(codec);
c->bit_rate = 64000;
c->sample_fmt = AV_SAMPLE_FMT_S16;
c->sample_rate = select_sample_rate(codec);
c->channel_layout = select_channel_layout(codec);
c->channels = av_get_channel_layout_nb_channels(c->channel_layout);
avcodec_open2(c, codec, NULL);
AVFrame* frame = av_frame_alloc();
frame->nb_samples = c->frame_size;
frame->format = c->sample_fmt;
frame->channel_layout = c->channel_layout;
The example code subsequently uses c->frame_size in a for loop.
My code is similar to the above with the following differences:
const AVCodec* codec = avcodec_find_encoder(AV_CODEC_ID_PCM_ALAW);
c->sample_rate = 8000;
c->channel_layout = AV_CH_LAYOUT_MONO;
c->channels = 1;
After calling avcodec_open2, c->frame_size is zero. The example code never sets the frame size so I assume that it expects either avcodec_alloc_context3 or avcodec_open2 to set it. Is this a correct assumption? Is the setting of the frame size based on the codec being used? If I have to set the frame size explicitly, is there a recommended size?
EDIT:
Based on @the-kamilz's answer, it appears that the example code is not robust. The example assumes that c->frame_size will be set, but that appears to depend on the codec. In my case, codec->capabilities did in fact have AV_CODEC_CAP_VARIABLE_FRAME_SIZE set. So I modified my code to check c->frame_size and use it only if it is not zero. If it is zero, I just pick an arbitrary one second's worth of data for frame->nb_samples.
In the FFmpeg documentation it is mentioned as:
int AVCodecContext::frame_size
Number of samples per channel in an audio frame.
encoding: set by libavcodec in avcodec_open2(). Each submitted frame except the last must contain exactly frame_size samples per channel.
May be 0 when the codec has AV_CODEC_CAP_VARIABLE_FRAME_SIZE set, then
the frame size is not restricted.
decoding: may be set by some decoders to indicate constant frame size
Hope that helps.
You don't control the frame size explicitly; it is set by the encoder, depending on the codec provided at initialization (opening) time.
Once avcodec_open2() is successful, you can retrieve the frame's buffer size with av_samples_get_buffer_size().
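The decision described in this thread comes down to a little arithmetic. The sketch below is plain Python rather than FFmpeg code, and the helper names are made up for illustration: it picks the sample count the way the edit above describes, and computes the byte size of a packed (interleaved) frame without alignment padding, which is my understanding of what av_samples_get_buffer_size() returns for a packed format when align is 1.

```python
def choose_nb_samples(frame_size: int, sample_rate: int,
                      fallback_seconds: float = 1.0) -> int:
    """Use the encoder's frame_size when it is fixed, otherwise pick our own."""
    if frame_size > 0:
        return frame_size
    # Variable frame size (frame_size == 0): any reasonable chunk length will do.
    return int(sample_rate * fallback_seconds)


def packed_buffer_size(nb_samples: int, channels: int,
                       bytes_per_sample: int) -> int:
    """Bytes in one packed (interleaved) audio frame, no alignment padding."""
    return nb_samples * channels * bytes_per_sample
```

For the case in the question (8000 Hz, mono, 16-bit input, frame_size of 0), this gives 8000 samples and 16000 bytes per one-second frame.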
I need to concatenate two WAV audio files with 30 seconds of white sound between them.
I want to use the NAudio library, or any other approach that works.
How can I do it?
(The difference from other questions is that I don't just need to make one audio file from two different audio files; I also need to add silence between them.)
Assuming your WAV files have the same sample rate and channel count, you can concatenate using FollowedBy and use SignalGenerator combined with Take to get the white noise.
var f1 = new AudioFileReader("ex1.wav");
var f2 = new SignalGenerator(f1.WaveFormat.SampleRate, f1.WaveFormat.Channels) { Type = SignalGeneratorType.White, Gain = 0.2f }.Take(TimeSpan.FromSeconds(5));
var f3 = new AudioFileReader("ex3.wav");
using (var wo = new WaveOutEvent())
{
    wo.Init(f1.FollowedBy(f2).FollowedBy(f3));
    wo.Play();
    while (wo.PlaybackState == PlaybackState.Playing) Thread.Sleep(500);
}
I have downloaded the audio-echo app from the Android NDK portal for OpenSL. Due to the lack of documentation, I'm not able to work out how to change the sampling rate and buffer size of the audio input and output.
Does anybody have any idea how to:
Change the buffer size and sampling rate in OpenSL
Read the buffers so they can be passed to C code for processing
Feed the processed data to the output module of OpenSL so it reaches the speakers
Another alternative, I feel, is to read at the preferred sampling rate and buffer size, downsample and upsample in the code itself, and use a circular buffer to get the desired data. But how do we read and feed the data in OpenSL?
In the OpenSL ES API, there are calls to create either a Player or a Recorder:
SLresult (*CreateAudioPlayer) (
    SLEngineItf self,
    SLObjectItf * pPlayer,
    SLDataSource *pAudioSrc,
    SLDataSink *pAudioSnk,
    SLuint32 numInterfaces,
    const SLInterfaceID * pInterfaceIds,
    const SLboolean * pInterfaceRequired
);
SLresult (*CreateAudioRecorder) (
    SLEngineItf self,
    SLObjectItf * pRecorder,
    SLDataSource *pAudioSrc,
    SLDataSink *pAudioSnk,
    SLuint32 numInterfaces,
    const SLInterfaceID * pInterfaceIds,
    const SLboolean * pInterfaceRequired
);
Note that both of these take a SLDataSource *pAudioSrc parameter.
To use a custom playback rate or recording rate, you have to set up this data source properly.
I use an 11.025 kHz playback rate with this code:
// Configure data format.
SLDataFormat_PCM pcm;
pcm.formatType = SL_DATAFORMAT_PCM;
pcm.numChannels = 1;
pcm.samplesPerSec = SL_SAMPLINGRATE_11_025;
pcm.bitsPerSample = SL_PCMSAMPLEFORMAT_FIXED_16;
pcm.containerSize = 16;
pcm.channelMask = SL_SPEAKER_FRONT_CENTER;
pcm.endianness = SL_BYTEORDER_LITTLEENDIAN;
// Configure Audio Source.
SLDataSource source;
source.pFormat = &pcm;
source.pLocator = &bufferQueue;
To feed data to the speakers, a buffer queue is used that is filled by a callback. To set this callback, use SLAndroidSimpleBufferQueueItf, documented in section 8.12 SLBufferQueueItf of the OpenSL ES specification.
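The circular buffer idea raised in the question fits this callback model: the recorder callback writes whatever block size the device delivers, and the processing code reads blocks of the size it prefers. A language-neutral sketch in Python follows; the class and its methods are hypothetical and not part of OpenSL ES, and a real audio callback would need a thread-safe (ideally lock-free) version in C.

```python
class RingBuffer:
    """Fixed-capacity byte ring buffer (single-threaded sketch)."""

    def __init__(self, capacity: int):
        self._buf = bytearray(capacity)
        self._capacity = capacity
        self._read = 0   # index of the oldest unread byte
        self._size = 0   # number of unread bytes

    def write(self, data: bytes) -> int:
        """Store as much of data as fits; return the number of bytes accepted."""
        n = min(len(data), self._capacity - self._size)
        write_pos = (self._read + self._size) % self._capacity
        first = min(n, self._capacity - write_pos)
        self._buf[write_pos:write_pos + first] = data[:first]
        self._buf[0:n - first] = data[first:n]  # wrapped part, may be empty
        self._size += n
        return n

    def read(self, n: int) -> bytes:
        """Return exactly n bytes, or b'' if fewer than n are buffered."""
        if n > self._size:
            return b""
        first = min(n, self._capacity - self._read)
        out = bytes(self._buf[self._read:self._read + first]) + bytes(self._buf[0:n - first])
        self._read = (self._read + n) % self._capacity
        self._size -= n
        return out
```

With this shape, the input callback calls write() with each recorded block, and the processing loop calls read() with its own block size until it gets an empty result.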
I tried to decode the audio using ffmpeg with the following code:
NSMutableData *finalData = [NSMutableData data];
......
while (av_read_frame(pFormatCtx, &packet) >= 0) {
    if (packet.stream_index == videoStream)
    {
        int consumed = avcodec_decode_audio4(pCodecCtx, pFrame, &got_frame_ptr, &packet);
        if (got_frame_ptr)
        {
            [finalData appendBytes:(pFrame->data)[0] length:(pFrame->linesize)[0]];
        }
    }
    av_free_packet(&packet);
}
......
[finalData writeToFile:path atomically:YES];
But the saved file can't be played, even after I changed the file extension to wav. When I looked at it in HexEdit (a hex editor), I found that there are many zero bytes. For example, the content of the file before offset 0x970 is all zero. Is there an error in my code? Any help will be appreciated.
Actually, the decode result is good. The zero bytes in the file are normal, because the decoded output is raw PCM data. I imported the data into Adobe Audition and it could be played. FYI.
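Raw PCM has no header, which is why renaming the file to .wav is not enough for ordinary players. One way to make such a dump playable is to wrap it in a WAV container. A minimal sketch using Python's standard wave module, assuming the decoder produced packed signed 16-bit samples and that the sample rate and channel count are known from the codec context (the function name is made up for illustration):

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int, channels: int,
               sample_width: int = 2) -> bytes:
    """Wrap raw interleaved PCM samples in a RIFF/WAVE container."""
    out = io.BytesIO()
    with wave.open(out, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample; 2 means 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)            # header sizes are filled in on close
    return out.getvalue()
```

If the decoder outputs a planar or floating-point sample format instead, the data has to be converted to packed 16-bit first, otherwise the result will play as noise.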