Extract thumbnail from jpeg file - jpeg

I'd like to extract the thumbnail image from JPEG files without using any external library. In principle this should not be too difficult: I only need to know where the thumbnail starts and ends in the file, and then cut it out. I have studied a lot of documentation (e.g. http://www.media.mit.edu/pia/Research/deepview/exif.html) and tried to analyze JPEGs, but not everything is clear. I tried to trace the bytes step by step, but got lost in the details. Is there any good documentation, or readable source code, for finding the start and end position of the thumbnail within a JPEG file?
Thank you!

Exiftool is very capable of doing this quickly and easily:
exiftool -b -ThumbnailImage my_image.jpg > my_thumbnail.jpg

For most JPEG images created by phones or digital cameras, the thumbnail image (if present) is stored in the APP1 marker segment (FFE1). Inside this segment is a TIFF structure containing the EXIF information for the main image and, optionally, the thumbnail stored as a JPEG-compressed image. The TIFF structure usually contains two "pages" (IFDs): the first holds the EXIF info and the second holds the thumbnail, stored in the "old" TIFF JPEG format (Compression tag value 6), in which a JPEG file is stored as-is inside a TIFF wrapper. If you want the simplest possible code to extract the thumbnail as a JFIF, you will need to do the following steps:
Familiarize yourself with JFIF and TIFF markers/tags. JFIF markers consist of two bytes: 0xFF followed by the marker type (0xE1 for APP1). These two bytes are followed by a two-byte length stored in big-endian order; the length counts its own two bytes but not the marker. For TIFF files, consult the Adobe TIFF 6.0 reference.
Search your JPEG file for the APP1 (FFE1) EXIF marker. There may be multiple APP1 markers and there may be multiple markers before the APP1.
The APP1 marker you're looking for contains the ASCII bytes "Exif" followed by two zero bytes, immediately after the length field.
Look for "II" or "MM" (6 bytes after the length field) to indicate the endianness used in the TIFF data. II = Intel = little endian, MM = Motorola = big endian.
Skip through the first page's tags to find the second IFD, where the thumbnail is stored: each IFD starts with a two-byte entry count, followed by that many 12-byte entries and then the four-byte offset of the next IFD. In the second "page", look for the two TIFF tags which point to the JPEG data. Tag 0x201 has the offset of the JPEG data (relative to the II/MM) and tag 0x202 has the length in bytes.
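The steps above can be sketched in Python roughly as follows. This is only a minimal sketch under a few assumptions: the function names are made up for illustration, the whole file is already in memory, tags 0x201 and 0x202 are stored as 4-byte LONG values, and truncated or malformed files are simply reported as "no thumbnail".

```python
import struct

def find_exif_thumbnail(jpeg: bytes):
    """Return the embedded EXIF thumbnail bytes, or None if none is found."""
    if jpeg[:2] != b"\xff\xd8":              # every JPEG starts with SOI (FFD8)
        return None
    pos = 2
    try:
        while pos + 4 <= len(jpeg):
            if jpeg[pos] != 0xFF:
                return None                  # lost marker synchronisation
            marker = jpeg[pos + 1]
            if marker in (0xDA, 0xD9):       # SOS or EOI: no more headers
                return None
            (seg_len,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
            payload = jpeg[pos + 4:pos + 2 + seg_len]
            if marker == 0xE1 and payload[:6] == b"Exif\x00\x00":
                return _thumbnail_from_tiff(payload[6:])
            pos += 2 + seg_len               # length includes its own 2 bytes
    except struct.error:
        return None                          # truncated data
    return None

def _thumbnail_from_tiff(tiff: bytes):
    # "II" = little endian, "MM" = big endian
    order = {b"II": "<", b"MM": ">"}.get(tiff[:2])
    if order is None:
        return None
    magic, ifd0 = struct.unpack(order + "HI", tiff[2:8])
    if magic != 42:
        return None
    # Skip IFD0: entry count, then 12 bytes per entry, then next-IFD offset.
    (count,) = struct.unpack(order + "H", tiff[ifd0:ifd0 + 2])
    link = ifd0 + 2 + 12 * count
    (ifd1,) = struct.unpack(order + "I", tiff[link:link + 4])
    if ifd1 == 0:
        return None                          # no second IFD, so no thumbnail
    (count,) = struct.unpack(order + "H", tiff[ifd1:ifd1 + 2])
    offset = length = None
    for i in range(count):
        start = ifd1 + 2 + 12 * i
        tag, _type, _n, value = struct.unpack(order + "HHII", tiff[start:start + 12])
        if tag == 0x0201:                    # offset of the JPEG data
            offset = value
        elif tag == 0x0202:                  # length of the JPEG data
            length = value
    if offset is None or length is None:
        return None
    return tiff[offset:offset + length]      # offsets are relative to II/MM
```

Calling `find_exif_thumbnail(open("my_image.jpg", "rb").read())` (the file name is a placeholder) should return the same bytes that the exiftool command above writes out, provided the file follows the usual layout.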

There is a much simpler solution for this problem, but I don't know how reliable it is: Start reading the JPEG file from the third byte and search for FFD8 (start of JPEG image marker), then for FFD9 (end of JPEG image marker). Extract it and voila, that's your thumbnail.
A simple JavaScript implementation:
function getThumbnail(file, callback) {
    if (file.type == "image/jpeg") {
        var reader = new FileReader();
        reader.onload = function (e) {
            var array = new Uint8Array(e.target.result),
                start, end;
            // Start at index 2 to skip the main image's own FFD8 marker.
            for (var i = 2; i < array.length; i++) {
                if (array[i] == 0xFF) {
                    if (!start) {
                        if (array[i + 1] == 0xD8) {
                            start = i;
                        }
                    } else {
                        if (array[i + 1] == 0xD9) {
                            // subarray() excludes its end index, so point
                            // past the two-byte FFD9 marker to include it.
                            end = i + 2;
                            break;
                        }
                    }
                }
            }
            if (start && end) {
                callback(new Blob([array.subarray(start, end)], {type: "image/jpeg"}));
            } else {
                // TODO scale with canvas
            }
        };
        // Only the first 50000 bytes are read; the thumbnail is assumed to lie within them.
        reader.readAsArrayBuffer(file.slice(0, 50000));
    } else if (file.type.indexOf("image/") === 0) {
        // TODO scale with canvas
    }
}

The Wikipedia page on JFIF at http://en.wikipedia.org/wiki/JPEG_File_Interchange_Format gives a good description of the JPEG header (the JFIF header can contain the thumbnail as an uncompressed raster image). That should give you an idea of the layout and thus the code needed to extract the info.
Hexdump of an image header (hexdump prints 16-bit words, so on a little-endian machine each pair of bytes appears swapped):
sdk#AndroidDev:~$ head -c 48 stfu.jpg |hexdump
0000000 d8ff e0ff 1000 464a 4649 0100 0101 4800
0000010 4800 0000 e1ff 1600 7845 6669 0000 4d4d
0000020 2a00 0000 0800 0000 0000 0000 feff 1700
In file order (that is, after undoing the pair swapping in the dump), the layout is: image magic (bytes 0-1), APP0 segment marker (bytes 2-3), header length (bytes 4-5), header type signature ("JFIF\0" or "JFXX\0", bytes 6-10), version (bytes 11-12), density units (byte 13), X density (bytes 14-15), Y density (bytes 16-17), thumbnail width (byte 18), thumbnail height (byte 19), and finally the rest up to the header length is thumbnail data.
From the above example, you can see that the header length is 16 bytes (bytes 4-5, value 0x0010) and the version is 1.01 (bytes 11-12). Further, as the thumbnail width and thumbnail height (bytes 18 and 19) are both 0x00, the JFIF header doesn't contain a thumbnail.
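As a rough illustration of that layout, the following Python sketch decodes the first 20 bytes of the dump above. The function name and the dictionary keys are made up for this example; the only assumption about the data is that the file starts with a plain JFIF APP0 segment.

```python
import struct

def parse_jfif_header(data: bytes) -> dict:
    """Decode the SOI marker and the JFIF APP0 header at the start of a file."""
    # Big-endian: SOI, APP0 marker, length, 5-byte signature, version (2 bytes),
    # density units, X density, Y density, thumbnail width, thumbnail height.
    (soi, app0, length, ident, major, minor,
     units, x_density, y_density, tw, th) = struct.unpack(">HHH5sBBBHHBB", data[:20])
    if soi != 0xFFD8 or app0 != 0xFFE0 or ident != b"JFIF\x00":
        raise ValueError("not a JFIF file")
    return {
        "length": length,
        "version": (major, minor),
        "units": units,
        "density": (x_density, y_density),
        "thumbnail_size": (tw, th),
        # Uncompressed RGB thumbnail, 3 bytes per pixel, right after the header.
        "thumbnail": data[20:20 + 3 * tw * th],
    }

# The first 20 bytes of the hexdump above, in file order.
header = bytes.fromhex("ffd8ffe000104a46494600010101004800480000")
info = parse_jfif_header(header)
print(info["length"], info["version"], info["thumbnail_size"])
```

For the example file this reports a length of 16, version (1, 1), and a thumbnail size of (0, 0), which matches the reading given above.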

Related

Is a cursor greater than 512x512 pixels in size possible?

Goal:
I'm trying to create a cursor file which can cover the whole screen with a flashlight effect on a full hd (1920x1080) screen. For that, the cursor image resolution would need to be at 4K (3840x2160) along with having an alpha channel (32bpp). Axialis Cursor Workshop is the only cursor creation program I've tried which goes above the usual 256² pixel limit, but still caps at 512² pixels...
File format analysis:
Looking at the file format specifications, the usual upper bound of 256² pixels might be caused by the CUR/ICO format working with 8 bits for width and height fields each. ANI format looks more promising since it has 32 bits reserved for those. On the flip side, it seems to have no hotspot fields, and itself uses CUR/ICO format for the animation frames, unless the IconFlag bit is set to FALSE. Looking at a cursor file produced by Axialis CW, I see the flag set to TRUE weirdly enough.
Hex edit approach:
I've tried inserting raster data from a (converted) bmp of the same size (512²) by means of hex editing. Then I tried to insert raster data from a 1024² bmp, updating the image dimensions and the file size in the headers. That only kind of works, I guess.
I'd appreciate any help or pointers in the right direction.
Related things, in no particular order:
install cursor scheme.inf (Creates a certain cursor scheme from cur/ani files)
Set Cursor.ps1 (Applies a certain cursor scheme & size)
File format specification index (For the technical details)
PNG to BMP Converter (Properly converts png to 32bpp bmp files)
Axialis CursorWorkshop (Can create ani files up to 512² pixels at 32bpp)
Got it working with Hex Editor Neo and a binary template I put together for the ico/cur file format:
// ico.h
#pragma once
#pragma byte_order(LittleEndian)
#include "stddefs.h"
#include "bitmap.h"
struct ICONDIRENTRY;
struct ICONFILE;
public struct ICONDIR {
[description("")]
uint16 Reserved;
$assert(Reserved==0);
[description("Specifies image type: 1 for icon (.ICO) image, 2 for cursor (.CUR) image. Other values are invalid.")]
uint16 Type;
[description("Specifies number of images in the file.")]
uint16 Count;
[description("")]
ICONDIRENTRY Entries[Count];
};
struct ICONDIRENTRY {
var entryIndex = array_index;
[description("Cursor Width")]
uint8 Width;
[description("Cursor Height (added height of XORbitmap and ANDbitmap). A negative value would indicate pixel order being top to bottom")]
int8 Height;
[description("Specifies number of colors in the color palette. Should be 0 if the image does not use a color palette.")]
uint8 ColorCount;
[description("")]
uint8 Reserved;
$assert(Reserved==0);
[description("In ICO format: Specifies color planes. Should be 0 or 1. In CUR format: Specifies the horizontal coordinates of the hotspot in number of pixels from the left.")]
uint16 XHotspot;
[description("In ICO format: Specifies bits per pixel. In CUR format: Specifies the vertical coordinates of the hotspot in number of pixels from the top.")]
uint16 YHotspot;
[description("Size of (InfoHeader + ANDBitmap + XORBitmap)")]
uint32 SizeInBytes;
[description("FilePos, where InfoHeader starts")]
uint32 FileOffset as ICONFILE*;
};
struct ICONFILE {
BITMAPINFO Info;
// no idea why this isn't working
/*var bmiv1header = BITMAPINFOHEADER(Info.bmiHeader);
var size = bmiv1header.biSizeImage;
if(size == 0) {
size = Entries[entryIndex].SizeInBytes - bmiv1header.biSize;
}
uint8 RawData[size];*/
uint8 __firstPixel;
};
The cursor file I created successfully looks something like this with the template applied:
The trick was to set the value of the image height field in the BITMAPINFOHEADER structure to twice the height in pixels. The reason for this is that two separate pixel arrays are expected, which are applied using bitwise XOR and AND. I was surprised that it already worked in the preview without even adding an AND pixel array, so it seems the AND array can be omitted, although I am not sure why.
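To make the header arithmetic concrete, here is a hedged Python sketch that assembles a single-image .cur file with the doubled height field. The function name `build_cur` and its parameters are hypothetical. It assumes the caller supplies 32bpp BGRA rows in bottom-up order, it always appends an all-zero AND mask, and it writes 0 into the 8-bit width/height fields for sizes of 256 and above (0 conventionally means 256; whether a given Windows version accepts larger sizes this way is not verified here).

```python
import struct

def build_cur(width, height, bgra_rows_bottom_up, hotspot=(0, 0)):
    """Assemble a one-image .cur file from raw 32bpp pixel data."""
    xor_bitmap = bytes(bgra_rows_bottom_up)
    if len(xor_bitmap) != width * height * 4:
        raise ValueError("expected width * height * 4 bytes of BGRA data")
    # 1bpp AND mask, each row padded to a multiple of 4 bytes; all zero here.
    and_row_bytes = ((width + 31) // 32) * 4
    and_bitmap = bytes(and_row_bytes * height)

    # BITMAPINFOHEADER (40 bytes). biHeight is twice the real height because
    # it covers both the XOR and the AND bitmap.
    info_header = struct.pack(
        "<IiiHHIIiiII",
        40,            # biSize
        width,         # biWidth
        height * 2,    # biHeight (doubled)
        1,             # biPlanes
        32,            # biBitCount
        0,             # biCompression (BI_RGB)
        0,             # biSizeImage (may be 0 for BI_RGB)
        0, 0, 0, 0)    # resolution and palette fields
    image = info_header + xor_bitmap + and_bitmap

    icondir = struct.pack("<HHH", 0, 2, 1)       # reserved, type 2 = cursor, count
    entry = struct.pack(
        "<BBBBHHII",
        width if width < 256 else 0,             # 8-bit field; 0 for large sizes
        height if height < 256 else 0,
        0, 0,                                    # color count, reserved
        hotspot[0], hotspot[1],
        len(image),                              # SizeInBytes
        6 + 16)                                  # FileOffset: right after the directory
    return icondir + entry + image
```

The real dimensions then live only in the BITMAPINFOHEADER, which matches the observation in the question that the 8-bit directory fields cannot describe sizes above 256.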

Dump subtitle from AVSubtitle in the file

In FFmpeg, AVPicture is used to store image data using data pointers and line sizes. This means that bitmap subtitles are stored in the form of pictures inside FFmpeg. I have a DVB subtitle and I want to dump the subtitle picture stored in the AVPicture into a buffer. I know that ordinary images can be dumped using a for loop, fopen and sprintf, but I do not know how to dump a subtitle. I have to dump the subtitles in the .ppm file format.
Can anyone help me dump the subtitle picture from an AVSubtitle into a buffer?
This process looks complex but is actually very simple.
AVSubtitle is a generic format that supports text and bitmap modes. The dvbsub format is, as far as I know, bitmap only, and the bitmap format can differ (for example 16-color or 256-color mode), which is referred to as CLUT_DEPTH.
I believe that (in current ffmpeg) the bitmaps are stored in the AVSubtitleRect structure, which is a member of AVSubtitle.
I assume you have one or more valid AVSubtitle packets. If I understand correctly, you can do the following and it should work:
1) Check pkt->rects[0]->type. The pkt here is a valid AVSubtitle packet. The type must be SUBTITLE_BITMAP.
2) If so, the bitmap width and height can be read from pkt->rects[0]->w and pkt->rects[0]->h.
3) The bitmap data itself will be in pkt->rects[0]->data[0].
4) CLUT_DEPTH can be read from pkt->rects[0]->nb_colors.
5) The CLUT itself (the color table) will be in pkt->rects[0]->data[1].
With this data, you can construct a valid .bmp file that can be viewed on a Windows or Linux desktop, but I leave this part to you.
PPM Info
First check this info about PPM format:
https://www.cs.swarthmore.edu/~soni/cs35/f13/Labs/extras/01/ppm_info.html
As I understand it, the PPM format uses RGB values (24 bits, 3 bytes per pixel). It looks to me like all you have to do is construct a header from the data obtained from the AVSubtitle packet above, and write a conversion function from dvbsub's indexed color buffer to RGB. I'm pretty sure there is ready-to-use code for this out there, but I'll explain anyway.
The picture frame data that dvbsub uses is linear and every pixel is 1 byte (even in 16-color mode). This byte value is actually an index that corresponds to the RGB (?) values stored in the Color Look-Up Table (CLUT). In 16-color mode there are 16 entries of 4 bytes each: the first 3 are the R, G, B values and the 4th is alpha (the transparency value; if PPM doesn't support this, ignore it).
I'm not sure whether the decoded subtitle still has encoded YUV values. As far as I remember, it should be in plain RGBA format.
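The conversion described above can be sketched in Python as follows. This is only an illustration of the indexing logic, not FFmpeg code: the function name and parameters are made up, and it assumes the CLUT is laid out as R, G, B, A bytes per entry as described in this answer (the actual byte order of the palette in memory may differ, so check it against your FFmpeg build).

```python
def indexed_to_ppm(width, height, linesize, indices, clut_rgba):
    """Convert an 8-bit indexed bitmap plus a 4-bytes-per-entry CLUT to binary PPM (P6)."""
    # PPM header: magic number, dimensions, maximum color value.
    out = bytearray(b"P6\n%d %d\n255\n" % (width, height))
    for y in range(height):
        # Rows may be padded: each row starts at y * linesize, not y * width.
        row = indices[y * linesize:y * linesize + width]
        for idx in row:
            # Copy R, G, B of the palette entry; the alpha byte is dropped
            # because PPM has no transparency.
            out += clut_rgba[4 * idx:4 * idx + 3]
    return bytes(out)

# Tiny usage example: a 2x1 bitmap with a two-entry palette (red, blue).
clut = bytes([255, 0, 0, 255,   0, 0, 255, 255])
ppm = indexed_to_ppm(2, 1, 4, bytes([1, 0, 9, 9]), clut)
```

In a C program the same loop would read `rect->data[0]`, `rect->linesize[0]` and `rect->data[1]` in place of the `indices`, `linesize` and `clut_rgba` arguments.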
The encode_dvb_subtitles function in ffmpeg shows how this encoding is done, if you need it:
https://github.com/FFmpeg/FFmpeg/blob/a0ac49e38ee1d1011c394d7be67d0f08b2281526/libavcodec/dvbsub.c
Hope that helps.
As this is where I ended up when searching for answers to how to create a thumbnail of an AVSubtitle, here is what I ended up using in my test application. The code is optimized for readability only. I got some help from this question which had some sample code.
Using avcodec_decode_subtitle2() I get a AVSubtitle structure. This contains a number of rectangles. First I iterate over the rectangles to find the max of x + w and y + h to determine the width and height of the target frame.
The color table in data[1] is RGBA, so I allocate an AVFrame called frame in AV_PIX_FMT_RGBA format and shuffle the pixels over to it:
struct [[gnu::packed]] rgbaPixel {
uint8_t r;
uint8_t g;
uint8_t b;
uint8_t a;
};
// Copy the pixel buffers
for (unsigned int i = 0; i < sub.num_rects; ++ i) {
AVSubtitleRect* rect = sub.rects[i];
for (int y = 0; y < rect->h; ++ y) {
int dest_y = y + rect->y;
// data[0] holds index data
uint8_t *in_linedata = rect->data[0] + y * rect->linesize[0];
// In AVFrame, data[0] holds the pixel buffer directly
uint8_t *out_linedata = frame->data[0] + dest_y * frame->linesize[0];
rgbaPixel *out_pixels = reinterpret_cast<rgbaPixel*>(out_linedata);
for (int x = 0; x < rect->w; ++ x) {
// data[1] contains the color map
// compare libavcodec/dvbsubenc.c
uint8_t colidx = in_linedata[x];
uint32_t color = reinterpret_cast<uint32_t*>(rect->data[1])[colidx];
// Now store the pixel in the target buffer
out_pixels[x + rect->x] = rgbaPixel{
.r = static_cast<uint8_t>((color >> 16) & 0xff),
.g = static_cast<uint8_t>((color >> 8) & 0xff),
.b = static_cast<uint8_t>((color >> 0) & 0xff),
.a = static_cast<uint8_t>((color >> 24) & 0xff),
};
}
}
}
I did manage to push that AVFrame through an image encoder to output it as an image file, and it looked OK. I did get green areas where the alpha channel is, but that might be an artifact of the settings in the JPEG encoder I used.

How to fix .gif with corrupted alpha channel (stuck pixels) collected with Graphicsmagick?

I want to convert an .avi with alpha channel into a .gif.
Firstly, I use
ffmpeg -i source.avi -vf scale=720:-1:flags=lanczos,fps=10 frames/ffout%03d.png
to convert the .avi to a sequence of .png files with an alpha channel.
Then, I use
gm convert -loop 0 frames/ffout*.png output.gif
to collect a .gif.
But it seems that pixels of output.gif get stuck whenever something opaque has been rendered on top of the transparent areas.
Here's an example:
As you can see the hearts and explosions do not get derendered.
P.S.
FFMPEG output (collection on .png's) is fine.
I do not use GraphicsMagick, but your GIF has image disposal mode 0 (no disposal specified). You should use disposal mode 2 (clear to background) or 3 (restore previous image); both work for your GIF. The disposal mode is stored in the Packed value of the graphics control extension of each frame.
So you can either configure the encoder to use disposal = 2 or 3, or write a script that does a direct stream copy of your GIF and changes the Packed value of the graphics control extension chunk frame by frame. Similar to this:
GIF Image getting distorted on interlacing
If you need help with the script then take a look at:
How to find where does Image Block start in GIF images?
Decode data bytes of GIF87a raster data stream
When I tried this (C++ script) on your GIF using disposal 2 I got this result:
The disposal is changed in C++ like this:
struct __gfxext
{
BYTE Introducer; /* Extension Introducer (always 21h) */
BYTE Label; /* Graphic Control Label (always F9h) */
BYTE BlockSize; /* Size of remaining fields (always 04h) */
BYTE Packed; /* Method of graphics disposal to use */
WORD DelayTime; /* Hundredths of seconds to wait */
BYTE ColorIndex; /* Transparent Color Index */
BYTE Terminator; /* Block Terminator (always 0) */
__gfxext(){}; __gfxext(__gfxext& a){ *this=a; }; ~__gfxext(){}; __gfxext* operator = (const __gfxext *a) { *this=*a; return this; }; /*__gfxext* operator = (const __gfxext &a) { ...copy... return this; };*/
};
__gfxext p;
p.Packed&=255-(7<<2); // clear old disposal and leave the rest as is
p.Packed|= 2<<2; // set new disposal=2 (the first 2 is disposal , the <<2 just shifts it to the correct position in Packed)
It is a good idea to leave the other bits of Packed as they are, because no one knows what might be encoded there in the future.
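The same bit manipulation, sketched in Python for anyone scripting the stream copy in another language. The function names are made up for this example; `patch_gce` assumes it is handed exactly one 8-byte graphics control extension block as laid out in the struct above, and locating those blocks in the file is left to the caller.

```python
def set_disposal(packed: int, disposal: int) -> int:
    """Return the Packed byte with its 3-bit disposal field replaced."""
    if not 0 <= disposal <= 7:
        raise ValueError("disposal must fit in 3 bits")
    cleared = packed & ~(0b111 << 2) & 0xFF   # clear bits 2..4, keep the rest
    return cleared | (disposal << 2)

def patch_gce(block: bytes, disposal: int = 2) -> bytes:
    """Patch one 8-byte graphics control extension block."""
    # Introducer 0x21, label 0xF9, block size 4, as in the struct above.
    if len(block) != 8 or block[:3] != b"\x21\xf9\x04":
        raise ValueError("not a graphics control extension block")
    patched = bytearray(block)
    patched[3] = set_disposal(block[3], disposal)
    return bytes(patched)

# Packed = 0x01 (transparency flag set, disposal 0) becomes 0x09 (disposal 2).
print(hex(set_disposal(0x01, 2)))
```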

Image Encryption

I am doing image steganography. If I type a message longer than 3 characters to embed, I get an exception saying that quantization table 0x01 is not defined; if the message is shorter than 3 characters, I get the output image I need. I think this is due to the JPEG format (I think that while injecting bits into the image byte array I have destroyed the properties and attributes of the image). Please help; I am sure it is something related to metadata, but I don't know much about it.
Here is the code I am using:
Creating_image()
{
File f=new File(file.getParent()+"/encrypt.jpg");
if(file==null)
{
JOptionPane.showMessageDialog(rootPane, "file became null in encrypt");
}
try{
FileInputStream imageInFile = new FileInputStream(file);
byte imageData[] = new byte[(int) file.length()];
imageInFile.read(imageData);
// Converting Image byte array into Base64 String
String imageDataString = Base64.encode(imageData);
// Converting a Base64 String into Image byte array
pixels = Base64.decode(imageDataString);
// Write a image byte array into file system
imageInFile.close();
}
catch(Exception as)
{
JOptionPane.showMessageDialog(rootPane,"Please first select an Image");
}
String msg=jTextArea1.getText();
byte[] bmsg=msg.getBytes();
String as=Base64.encode(bmsg);
bmsg=Base64.decode(as);
int len=msg.length();
byte[] blen=inttobyte(len);
String sd=Base64.encode(blen);
blen=Base64.decode(sd);
pixels=encode(pixels,blen,32);
pixels=encode(pixels,bmsg,64);
try{
// Converting Image byte array into Base64 String
String imageDataString = Base64.encode(pixels);
// Converting a Base64 String into Image byte array
pixels = Base64.decode(imageDataString);
InputStream baisData = new ByteArrayInputStream(pixels,0,pixels.length);
image= ImageIO.read(baisData);
if(image == null)
{
System.out.println("imag is empty");
}
ImageIO.write(image, "jpg", f);
}
catch(Exception s)
{
System.out.println(s.getMessage());
}
}
And this is what the encode function looks like:
byte[] encode(byte [] old,byte[] add,int offset)
{
try{ if(add.length+offset>old.length)
{
JOptionPane.showMessageDialog(rootPane, "File too short");
}
}
catch(Exception d)
{
JOptionPane.showMessageDialog(rootPane, d.getLocalizedMessage());
}
byte no;
for(int i=0;i<add.length;i++)
{
no=add[i];
for(int bit=7;bit>=0;bit--,++offset)
{
int b=(no>>bit)&1;
old[offset]=(byte)((old[offset]&0xfe)|b);
}
}
return old;
}
You are correct in that you have disturbed the file structure. The JPEG format contains highly compressed data to the point none of its bytes represent any pixel values directly. In fact, JPEG doesn't even store the pixel values, but the DCT coefficients of pixel blocks.
Your method of reading the raw bytes of the file would work only for a format like BMP, where the pixels are directly stored in the file. However, you'd still have to skip the first few bytes (header), which contain information like the width and height of the image, number of colour planes and bits per pixel.
If you want to embed your message by modifying the least significant bits of pixels, you have to load the actual pixels into a byte array. Then you can modify the pixels with your encode() method. To save the data to a file, convert the byte array to a BufferedImage object and use ImageIO.write(). However, you must use a format that does not involve lossy compression, because that can distort the pixel values, thereby destroying your message. Losslessly compressed (or uncompressed) file formats include BMP and PNG, while JPEG is lossy.
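As a language-neutral illustration of that idea, here is a small Python sketch that embeds and recovers a message in the least significant bits of a buffer of decoded pixel values. The function names are invented for this example, and the buffer is assumed to hold raw pixel bytes that have already been decoded from the image, not the bytes of the file itself.

```python
def embed_lsb(pixels, message: bytes, offset: int = 0) -> bytearray:
    """Return a copy of pixels with message hidden in the LSBs, 1 bit per byte."""
    if offset + 8 * len(message) > len(pixels):
        raise ValueError("image too small for this message")
    out = bytearray(pixels)
    for byte in message:
        for bit in range(7, -1, -1):          # most significant bit first
            out[offset] = (out[offset] & 0xFE) | ((byte >> bit) & 1)
            offset += 1
    return out

def extract_lsb(pixels, n_bytes: int, offset: int = 0) -> bytes:
    """Read n_bytes back out of the LSBs, starting at offset."""
    result = bytearray()
    for _ in range(n_bytes):
        value = 0
        for _ in range(8):
            value = (value << 1) | (pixels[offset] & 1)
            offset += 1
        result.append(value)
    return bytes(result)

# Round trip on a dummy pixel buffer.
pixels = bytearray(range(200, 256)) * 2
stego = embed_lsb(pixels, b"hello!", offset=16)
print(extract_lsb(stego, 6, offset=16))
```

Each pixel byte changes by at most 1, which is why the result only survives in a lossless container such as PNG or BMP.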
If you still want to do JPEG steganography, the process is a bit more involved, but this answer pretty much covers what you need to do. Briefly, you want to borrow the source code of a JPEG encoder, because writing one is very complex and requires an intricate understanding of the whole format. The encoder converts the pixels to a set of quantized coefficients (the lossy step) and then stores them compactly in a file. Your steganography algorithm should be injected between these two steps, where you can modify those numbers before they are saved to the file.

How can I detect whether a WAV file has a 44 or 46-byte header?

I've discovered it is dangerous to assume that all PCM wav audio files have 44 bytes of header data before the samples begin. Though this is common, many applications (ffmpeg for example), will generate wavs with a 46-byte header and ignoring this fact while processing will result in a corrupt and unreadable file. But how can you detect how long the header actually is?
Obviously there is a way to do this, but I searched and found little discussion about it. A LOT of audio projects out there assume 44 (or conversely, 46) depending on the author's own context.
You should be checking all of the header data to see what the actual sizes are. Broadcast Wave Format files will contain an even larger extension subchunk. WAV and AIFF files from Pro Tools have even more extension chunks that are undocumented as well as data after the audio. If you want to be sure where the sample data begins and ends you need to actually look for the data chunk ('data' for WAV files and 'SSND' for AIFF).
As a review, all WAV subchunks conform to the following format:
Subchunk Descriptor (4 bytes)
Subchunk Size (4 byte integer, little endian)
Subchunk Data (size is Subchunk Size)
This is very easy to process. All you need to do is read the descriptor, if it's not the one you are looking for, read the data size and skip ahead to the next. A simple Java routine to do that would look like this:
//
// Quick note for people who don't know Java well:
// 'in.read(...)' returns -1 when the stream reaches
// the end of the file, so 'if (in.read(...) < 0)'
// is checking for the end of file.
//
public static void printWaveDescriptors(File file)
throws IOException {
try (FileInputStream in = new FileInputStream(file)) {
byte[] bytes = new byte[4];
// Read first 4 bytes.
// (Should be RIFF descriptor.)
if (in.read(bytes) < 0) {
return;
}
printDescriptor(bytes);
// First subchunk will always be at byte 12.
// (There is no other dependable constant.)
in.skip(8);
for (;;) {
// Read each chunk descriptor.
if (in.read(bytes) < 0) {
break;
}
printDescriptor(bytes);
// Read chunk length.
if (in.read(bytes) < 0) {
break;
}
// Skip the length of this chunk.
// Next bytes should be another descriptor or EOF.
int length = (
Byte.toUnsignedInt(bytes[0])
| Byte.toUnsignedInt(bytes[1]) << 8
| Byte.toUnsignedInt(bytes[2]) << 16
| Byte.toUnsignedInt(bytes[3]) << 24
);
// Chunks are padded to an even size; the pad byte is not counted in the length.
in.skip(Integer.toUnsignedLong(length) + (length & 1));
}
System.out.println("End of file.");
}
}
private static void printDescriptor(byte[] bytes)
throws IOException {
String desc = new String(bytes, "US-ASCII");
System.out.println("Found '" + desc + "' descriptor.");
}
For example here is a random WAV file I had:
Found 'RIFF' descriptor.
Found 'bext' descriptor.
Found 'fmt ' descriptor.
Found 'minf' descriptor.
Found 'elm1' descriptor.
Found 'data' descriptor.
Found 'regn' descriptor.
Found 'ovwf' descriptor.
Found 'umid' descriptor.
End of file.
Notably, here both 'fmt ' and 'data' legitimately appear in between other chunks because Microsoft's RIFF specification says that subchunks can appear in any order. Even some major audio systems that I know of get this wrong and don't account for that.
So if you want to find a certain chunk, loop through the file checking each descriptor until you find the one you're looking for.
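A compact Python sketch of that loop, returning where a chunk's payload starts. The function name is made up, the whole file is assumed to be in memory, and chunks are assumed to be padded to an even length as the RIFF format requires.

```python
import struct

def find_chunk(data: bytes, wanted: bytes):
    """Return (payload_offset, payload_size) of the first matching chunk, or None."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12                                   # first subchunk follows the 12-byte RIFF header
    while pos + 8 <= len(data):
        # 4-byte descriptor followed by a 4-byte little-endian size
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == wanted:
            return pos + 8, size
        pos += 8 + size + (size & 1)           # skip payload plus pad byte if size is odd
    return None

# Minimal in-memory example: 16-byte 'fmt ' chunk followed by 4 bytes of samples.
wav = (b"RIFF" + struct.pack("<I", 36 + 4) + b"WAVE"
       + b"fmt " + struct.pack("<I", 16) + bytes(16)
       + b"data" + struct.pack("<I", 4) + bytes(4))
print(find_chunk(wav, b"data"))
```

The returned offset is the effective "header length": 44 for the file built here, 46 if the 'fmt ' chunk has 18 bytes, and something larger whenever other chunks precede 'data'.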
The trick is to look at the "Subchunk1Size", which is a 4-byte little-endian integer beginning at byte 16 of the header (this assumes the 'fmt ' chunk directly follows the 12-byte RIFF header). In a normal 44-byte wav, this integer will be 16 (bytes 0x10, 0, 0, 0). If it's a 46-byte header, this integer will be 18 (bytes 0x12, 0, 0, 0), or maybe even higher if there is extra extensible metadata (rare?).
The extra data itself (if present) begins at byte 36.
So a simple C# program to detect the header length would look like this:
static void Main(string[] args)
{
byte[] bytes = new byte[4];
FileStream fileStream = new FileStream(args[0], FileMode.Open, FileAccess.Read);
fileStream.Seek(16, 0);
fileStream.Read(bytes, 0, 4);
fileStream.Close();
int Subchunk1Size = BitConverter.ToInt32(bytes, 0);
if (Subchunk1Size < 16)
Console.WriteLine("This is not a valid wav file");
else
switch (Subchunk1Size)
{
case 16:
Console.WriteLine("44-byte header");
break;
case 18:
Console.WriteLine("46-byte header");
break;
default:
Console.WriteLine("Header contains extra data and is larger than 46 bytes");
break;
}
}
In addition to Radiodef's excellent reply, I'd like to add 3 things that aren't obvious.
The only rule for WAV files is the FMT chunk comes before the DATA chunk. Apart from that, you will find chunks you don't know about at the beginning, before the DATA chunk and after it. You must read the header for each chunk to skip forward to find the next chunk.
The FMT chunk is commonly found in 16 byte and 18 byte variations, but the spec actually allows more than 18 bytes as well.
If the FMT chunk's header size field says greater than 16, bytes 17 and 18 specify how many extra bytes there are, so if they are both zero, you end up with an 18 byte FMT chunk identical to the 16 byte one.
It is safe to read in just the first 16 bytes of the FMT chunk and parse those, ignoring any more.
Why does this matter? - not much any more, but Windows XP's Media Player was able to play 16 bit WAV files, but 24 bit WAV files only if the FMT chunk was the Extended (18+ byte) version. There used to be a lot of complaints that "Windows doesn't play my 24 bit WAV files", but if it had an 18 byte FMT chunk, it would... Microsoft fixed that sometime during the early days of Windows 7, so 24 bit with 16 byte FMT files work fine now.
(Newly added) Chunk sizes with odd sizes occur quite often. Mostly seen when a 24 bit mono file is made. It is unclear from the spec, but the chunk size specifies the actual data length (the odd value) and a pad byte (zero) is added after the chunk and before the start of the next chunk. So chunks always start on even boundaries, but the chunk size itself is stored as the actual odd value.
