jpeg image compression : byte order issue

If a JPEG image is little-endian, what difference does that make to its format? I have two JPEG images; according to the JPEGsnoop application, one is little-endian and the other is big-endian. In a hex view, however, the two files look structurally the same. When I compress them, only the little-endian image gives me a problem.
Please look at the APP1 marker (0xFFE1) of both images.
Image 1 (the one with the compression error)
JPEGsnoop 1.7.3 by Calvin Hass
http://www.impulseadventure.com/photo/
-------------------------------------
Filename: [D:\New folder\PICT0001.JPG]
Filesize: [2700154] Bytes
Start Offset: 0x00000000
*** Marker: SOI (xFFD8) ***
OFFSET: 0x00000000
*** Marker: APP1 (xFFE1) ***
OFFSET: 0x00000002
Length = 15358
Identifier = [Exif]
Identifier TIFF = 0x[49492A00 08020000]
***Endian = Intel (little)***
TAG Mark x002A = 0x002A
/***********************************************************/
Image 2 (the OK image)
JPEGsnoop 1.7.3 by Calvin Hass
http://www.impulseadventure.com/photo/
-------------------------------------
Filename: [D:\Trailing_Project\OBSERVATION\12-09-2017\original jpeg\MFDC0058.JPG]
Filesize: [1398695] Bytes
Start Offset: 0x00000000
*** Marker: SOI (xFFD8) ***
OFFSET: 0x00000000
*** Marker: APP1 (xFFE1) ***
OFFSET: 0x00000002
Length = 3556
Identifier = [Exif]
Identifier TIFF = 0x[4D4D002A 00000008]
***Endian = Motorola (big)***
TAG Mark x002A = 0x002A

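All integers in the JPEG stream itself (marker segment lengths, image dimensions and so on) are big-endian (network order). The "Endian" value that JPEGsnoop prints is not a property of the JPEG encoding: it is the byte order of the TIFF structure embedded in the Exif APP1 payload, which declares its own order with "II" (Intel, little-endian) or "MM" (Motorola, big-endian). Both of your files are therefore ordinary JPEGs; only the Exif metadata differs, so any code that parses the Exif block must honour that flag.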
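To see where the value reported by JPEGsnoop comes from, here is a minimal Python sketch (the function name exif_byte_order is made up for this illustration). It walks the marker segments using big-endian lengths, as the JPEG format requires, and then reads the two-byte order flag at the start of the TIFF block inside the Exif APP1 segment. It assumes a well-formed header and does no further validation.

```python
import struct
from typing import Optional

def exif_byte_order(jpeg: bytes) -> Optional[str]:
    """Return 'little', 'big', or None when no Exif APP1 segment is found."""
    if jpeg[:2] != b"\xff\xd8":                      # SOI marker
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(jpeg):
        # Marker and segment length are always big-endian in JPEG.
        marker, length = struct.unpack(">HH", jpeg[pos:pos + 4])
        if marker >> 8 != 0xFF or marker == 0xFFDA:  # lost sync, or start of scan
            break
        payload = jpeg[pos + 4:pos + 2 + length]     # length counts its own 2 bytes
        if marker == 0xFFE1 and payload[:6] == b"Exif\x00\x00":
            # The embedded TIFF header declares its own byte order.
            return {b"II": "little", b"MM": "big"}.get(payload[6:8])
        pos += 2 + length
    return None
```

Fed the first bytes of the two files in the question, this should report "little" for PICT0001.JPG and "big" for MFDC0058.JPG, while the APP1 length field is read the same way in both cases.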

Related

Golang : fatal error: runtime: out of memory

I am trying to use this package from GitHub for string matching. My dictionary is 4 MB. When creating the trie, I get fatal error: runtime: out of memory. I am using Ubuntu 14.04 with 8 GB of RAM and Go version 1.4.2.
The error seems to come from line 99 (at the time of writing) here: m.trie = make([]node, max)
The program stops at this line.
This is the error:
fatal error: runtime: out of memory
runtime stack:
runtime.SysMap(0xc209cd0000, 0x3b1bc0000, 0x570a00, 0x5783f8)
/usr/local/go/src/runtime/mem_linux.c:149 +0x98
runtime.MHeap_SysAlloc(0x57dae0, 0x3b1bc0000, 0x4296f2)
/usr/local/go/src/runtime/malloc.c:284 +0x124
runtime.MHeap_Alloc(0x57dae0, 0x1d8dda, 0x10100000000, 0x8)
/usr/local/go/src/runtime/mheap.c:240 +0x66
goroutine 1 [running]:
runtime.switchtoM()
/usr/local/go/src/runtime/asm_amd64.s:198 fp=0xc208518a60 sp=0xc208518a58
runtime.mallocgc(0x3b1bb25f0, 0x4d7fc0, 0x0, 0xc20803c0d0)
/usr/local/go/src/runtime/malloc.go:199 +0x9f3 fp=0xc208518b10 sp=0xc208518a60
runtime.newarray(0x4d7fc0, 0x3a164e, 0x1)
/usr/local/go/src/runtime/malloc.go:365 +0xc1 fp=0xc208518b48 sp=0xc208518b10
runtime.makeslice(0x4a52a0, 0x3a164e, 0x3a164e, 0x0, 0x0, 0x0)
/usr/local/go/src/runtime/slice.go:32 +0x15c fp=0xc208518b90 sp=0xc208518b48
github.com/mf/ahocorasick.(*Matcher).buildTrie(0xc2083c7e60, 0xc209860000, 0x26afb, 0x2f555)
/home/go/ahocorasick/ahocorasick.go:104 +0x28b fp=0xc208518d90 sp=0xc208518b90
github.com/mf/ahocorasick.NewStringMatcher(0xc208bd0000, 0x26afb, 0x2d600, 0x8)
/home/go/ahocorasick/ahocorasick.go:222 +0x34b fp=0xc208518ec0 sp=0xc208518d90
main.main()
/home/go/seme/substrings.go:66 +0x257 fp=0xc208518f98 sp=0xc208518ec0
runtime.main()
/usr/local/go/src/runtime/proc.go:63 +0xf3 fp=0xc208518fe0 sp=0xc208518f98
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:2232 +0x1 fp=0xc208518fe8 sp=0xc208518fe0
exit status 2
This is the content of the main function (taken from the same repo: test file)
var dictionary = InitDictionary()
var bytes = []byte("Partial invoice (€100,000, so roughly 40%) for the consignment C27655 we shipped on 15th August to London from the Make Believe Town depot. INV2345 is for the balance.. Customer contact (Sigourney) says they will pay this on the usual credit terms (30 days).")
var precomputed = ahocorasick.NewStringMatcher(dictionary)// line 66 here
fmt.Println(precomputed.Match(bytes))

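Your structure is very inefficient in terms of memory. Let's look at the internals, but first a quick reminder of the space required by some Go types. The figures below assume a 32-bit platform; on amd64, which is what your stack trace shows, int, uintptr and pointers take 8 bytes and a slice header takes 24 bytes, so everything below roughly doubles:
bool: 1 byte
int: 4 bytes
uintptr: 4 bytes
[N]type: N*sizeof(type)
[]type: 12 + len(slice)*sizeof(type)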
Now, let's have a look at your structure:
type node struct {
root bool // 1 byte
b []byte // 12 + len(slice)*1
output bool // 1 byte
index int // 4 bytes
counter int // 4 bytes
child [256]*node // 256*4 = 1024 bytes
fails [256]*node // 256*4 = 1024 bytes
suffix *node // 4 bytes
fail *node // 4 bytes
}
OK, you can probably guess what happens here: each node weighs more than 2 KB (more than 4 KB on amd64), which is huge! Finally, let's look at the code you use to initialize your trie:
func (m *Matcher) buildTrie(dictionary [][]byte) {
max := 1
for _, blice := range dictionary {
max += len(blice)
}
m.trie = make([]node, max)
// ...
}
You said your dictionary is 4 MB. If it is 4 MB in total, then at the end of the for loop max is about 4 million. If it instead holds 4 million different words, then max = 4 million * avg(word_length).
Let's take the first scenario, the nicer one. You are initializing a slice of 4 million nodes, each of which uses 2 KB. Yup, that makes a nice 8 GB necessary (about 16 GB with the 8-byte pointers of amd64).
You should review how you build your trie. According to the Wikipedia page on the Aho-Corasick algorithm, each node contains one character, so at most 256 edges leave the root, not 4 million.
Some material to make it right: https://web.archive.org/web/20160315124629/http://www.cs.uku.fi/~kilpelai/BSA05/lectures/slides04.pdf
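As a rough illustration of the alternative, here is a minimal Aho-Corasick sketch in Python. It is not the code of the package in question, and every name in it (Node, build, match) is made up for this example. Each node stores only the outgoing edges that actually exist, in a dict, instead of two fixed 256-entry pointer arrays, and nodes are created on demand rather than preallocated.

```python
from collections import deque

class Node:
    __slots__ = ("children", "fail", "outputs")

    def __init__(self):
        self.children = {}   # byte value -> Node, only for edges that exist
        self.fail = None     # failure link (None for the root)
        self.outputs = []    # indices of dictionary words that end here

def build(words):
    """Build the trie and its failure links for a list of bytes objects."""
    root = Node()
    for idx, word in enumerate(words):
        node = root
        for b in word:
            if b not in node.children:
                node.children[b] = Node()
            node = node.children[b]
        node.outputs.append(idx)
    queue = deque()
    for child in root.children.values():
        child.fail = root
        queue.append(child)
    while queue:                                   # breadth-first over the trie
        node = queue.popleft()
        for b, child in node.children.items():
            f = node.fail
            while f is not None and b not in f.children:
                f = f.fail
            child.fail = f.children[b] if f is not None else root
            child.outputs = child.outputs + child.fail.outputs
            queue.append(child)
    return root

def match(root, text):
    """Return the sorted indices of all dictionary words found in text."""
    found = set()
    node = root
    for b in text:
        while node is not root and b not in node.children:
            node = node.fail
        node = node.children.get(b, root)
        found.update(node.outputs)
    return sorted(found)
```

For example, match(build([b"fizz", b"buzz", b"123"]), b"fizzbuzz 12") reports words 0 and 1. The trade-off is a hash lookup per byte instead of an array index, in exchange for memory that grows with the number of real edges.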

The node type has a memory size of 2084 bytes.
I wrote a little program to demonstrate the memory usage: https://play.golang.org/p/szm7AirsDB
As you can see, the dictionary of three strings (11(+1) bytes in size), dictionary := []string{"fizz", "buzz", "123"}, requires about 24 KB of memory (12 nodes of 2084 bytes each).
If your dictionary has a length of 4 MB you would need about 4,000,000 * 2084 bytes = 8.3 GB of memory.
So you should try to decrease the size of your dictionary.
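The per-node figure can be cross-checked with a little arithmetic. The Python sketch below (node_size is a name invented here) lays out the fields of the node struct from the question using the usual alignment rule, for a 4-byte and an 8-byte machine word. It assumes that a slice header is three words and that bool has alignment 1.

```python
def node_size(word: int) -> int:
    """Size in bytes of the question's `node` struct for a word size of 4 or 8."""
    fields = [                    # (size, alignment) in declaration order
        (1, 1),                   # root   bool
        (3 * word, word),         # b      []byte (pointer, len, cap)
        (1, 1),                   # output bool
        (word, word),             # index  int
        (word, word),             # counter int
        (256 * word, word),       # child  [256]*node
        (256 * word, word),       # fails  [256]*node
        (word, word),             # suffix *node
        (word, word),             # fail   *node
    ]
    offset = 0
    for size, align in fields:
        offset = (offset + align - 1) // align * align   # pad up to the field's alignment
        offset += size
    return (offset + word - 1) // word * word            # pad the struct to word alignment

print(node_size(4), node_size(8))
```

With 4-byte words this gives the 2084 bytes quoted above; with 8-byte words it gives 4168 bytes. The stack trace in the question agrees with the second figure: makeslice is asked for 0x3a164e nodes and mallocgc for 0x3b1bb25f0 bytes, which is exactly 0x3a164e * 4168, or roughly 15.9 GB.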

Setting the resource limit to unlimited worked for me.
If ulimit -a reports a limit of 0, run ulimit -c unlimited.
It may be safer to set a real size limit instead.

Noise in Merging two pcm files

I am merging two PCM streams, and the resulting PCM has additional noise (a "grrrrrrrrrr" sound). My code is:
int main(void)
{
FILE *rCAudio;
FILE *raudio;
FILE *wtest;
rCAudio=fopen("Audio1.pcm","rb"); //Reading first pcm file
if(rCAudio==NULL)
cout<<"Errr";
raudio=fopen("Audio2.pcm","rb"); //Reading second pcm file
if(raudio==NULL)
cout<<"Errr";
fopen_s(&wtest,"AudioMerge.pcm","ab"); // Writing final pcm file
short* first= new short[1792];;
short* second= new short[1792];;
short* merge = new short[1792];
short sample1,sample2;
while(1)
{
fread(first,2,1792,rCAudio);
fread(second,2,1792,raudio);
for(int j=0;j<1792;j++)
{
sample1 = first[j];
sample2 = second[j];
int mixedi=(int)sample1 + (int)sample2;
if (mixedi>32767) mixedi=32767;
if (mixedi<-32768) mixedi=-32768;
merge[j] =(short)mixedi;
}
fwrite(merge,2,1972,wtest);
}
}

I found the solution. The problem was a mismatch in chunk sizes:
I had written Audio1.pcm and Audio2.pcm 4096 bytes at a time as BYTE data, but I was reading and writing them in chunks of short values (1792 read and 1972 written in the code above).
I fixed it by reading 4096 bytes at a time as BYTE and writing the merged file 4096 bytes at a time as BYTE as well.
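For comparison, here is a minimal Python sketch of the same mixing loop. It is not the asker's fixed code, and the names mix_pcm16 and mix_files are made up for this illustration. It assumes headerless, signed 16-bit, native-endian mono PCM in both files. It uses one chunk size for reading and writing, writes only as many samples as it actually read, and stops at end of file.

```python
import array

def mix_pcm16(a: bytes, b: bytes) -> bytes:
    """Mix two buffers of signed 16-bit PCM sample by sample, clipping to int16."""
    n = min(len(a), len(b)) // 2                  # whole samples present in both buffers
    first = array.array("h", a[:2 * n])
    second = array.array("h", b[:2 * n])
    mixed = array.array(
        "h",
        (max(-32768, min(32767, x + y)) for x, y in zip(first, second)),
    )
    return mixed.tobytes()

def mix_files(path_a: str, path_b: str, path_out: str, chunk_bytes: int = 4096) -> None:
    """Mix two raw PCM files chunk by chunk until either one runs out."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb, open(path_out, "wb") as out:
        while True:
            a = fa.read(chunk_bytes)
            b = fb.read(chunk_bytes)
            if not a or not b:                    # end of either input
                break
            out.write(mix_pcm16(a, b))            # write exactly what was mixed
```

Opening the output with "wb" rather than appending also avoids stacking a new run on top of the output of a previous one.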

Determine physical file address of directory RVA in PE file

How can I determine the image address (byte offset in file) of a particular data directory in a PE file?
For example, given data directories as follows:
directory 1 RVA: 0x0 Size: 0
directory 2 RVA: 0xaf974 Size: 300
directory 3 RVA: 0xb8000 Size: 22328
directory 4 RVA: 0x0 Size: 0
directory 5 RVA: 0xc0800 Size: 6440
directory 6 RVA: 0xbe000 Size: 27776
directory 7 RVA: 0x91760 Size: 28
directory 8 RVA: 0x0 Size: 0
directory 9 RVA: 0x0 Size: 0
directory 10 RVA: 0x0 Size: 0
directory 11 RVA: 0xa46b8 Size: 64
directory 12 RVA: 0x0 Size: 0
directory 13 RVA: 0x91000 Size: 1736
directory 14 RVA: 0x0 Size: 0
directory 15 RVA: 0x0 Size: 0
directory 16 RVA: 0x0 Size: 0
The import directory (#2 above) is shown as being at an RVA of 0xAF974. However, the import directory is NOT located at byte 0xAF974 of the EXE file. How do I compute the byte offset of the import directory in the file as it is written on the disk?

This is fun! You have to loop through the sections to find the correct location based on its virtual address. Here is some code I wrote for this.
I can try to explain it, but it took me a long time to understand, I haven't looked at it in a few weeks, and I have already forgotten a lot of the technical details. I was also writing a C++ class to handle much of this.
In my code, buffer is a pointer returned by MapViewOfFile, but it can be any char pointer.
/* Example usage...I know not perfect but should help a bit. */
unsigned char * lpFile = (unsigned char *)(void *)MapViewOfFile(fileMap, FILE_MAP_ALL_ACCESS, 0,0, 0);
if(lpFile==NULL) {
printf("Failed to MapViewOfFile\r\n");
exit(0);
}
header_dos = (PIMAGE_DOS_HEADER)lpFile;
header_nt = (PIMAGE_NT_HEADERS32)&lpFile [header_dos->e_lfanew];
IMAGE_DATA_DIRECTORY import = header_nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];
PIMAGE_IMPORT_DESCRIPTOR im = (PIMAGE_IMPORT_DESCRIPTOR)&lpFile[RVA2Offset(lpFile, import.VirtualAddress)];
/* An RVA is relative to the image base; map it to a file offset through the section that contains it. */
int RVA2Offset(unsigned char * buffer, DWORD rva)
{
PIMAGE_NT_HEADERS header = (PIMAGE_NT_HEADERS) &buffer[ ((PIMAGE_DOS_HEADER)buffer)->e_lfanew ];
PIMAGE_SECTION_HEADER section = (PIMAGE_SECTION_HEADER) &buffer[ ((PIMAGE_DOS_HEADER)buffer)->e_lfanew + sizeof(IMAGE_NT_HEADERS) ];
for(int sectionIndex = 0; sectionIndex < header->FileHeader.NumberOfSections; sectionIndex++) {
/*
Check if the RVA is within the virtual addressing space of the section
Make sure the RVA is less than the VirtualAddress plus its raw data size
IMAGE_HEADER_SECTION.VirtualAddress = The address of the first byte of the section when loaded into memory, relative to the image base. For object files, this is the address of the first byte before relocation is applied.
Our ImageBase is 0, since we aren't loaded into actual memory
*/
section = (PIMAGE_SECTION_HEADER) &buffer[ ((PIMAGE_DOS_HEADER)buffer)->e_lfanew + sizeof(IMAGE_NT_HEADERS) + (sectionIndex*sizeof(IMAGE_SECTION_HEADER))];
if (rva >= section->VirtualAddress && (rva < section->VirtualAddress + section->SizeOfRawData)) {
/**
PointerToRawData gives us the section's location within the file.
RVA - VirtualAddress = Offset WITHIN the address space
**/
return section->PointerToRawData + (rva - section->VirtualAddress);
}
}
return 0;
}
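The arithmetic in RVA2Offset can be tried on its own, without a mapped file. The Python sketch below is only an illustration: the Section tuple, the rva_to_offset name and the section values in the usage note are hypothetical and are not taken from the executable in the question.

```python
from typing import NamedTuple, Optional, Sequence

class Section(NamedTuple):
    virtual_address: int   # VirtualAddress: RVA of the section's first byte
    raw_size: int          # SizeOfRawData: bytes of the section stored in the file
    raw_pointer: int       # PointerToRawData: file offset of the section's first byte

def rva_to_offset(sections: Sequence[Section], rva: int) -> Optional[int]:
    """Translate an RVA into a file offset, or None if no section's raw data covers it."""
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.raw_size:
            # Offset inside the section, rebased onto the section's position in the file.
            return s.raw_pointer + (rva - s.virtual_address)
    return None
```

With a made-up section Section(0x91000, 0x30000, 0x8FE00), the import directory RVA 0xAF974 from the question would map to file offset 0x8FE00 + (0xAF974 - 0x91000) = 0xAE774. Returning None rather than 0 for an unmapped RVA keeps "not found" distinct from a real offset of 0.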

Can the logical erase block size of an MTD device be increased?

The minimum erase block size for jffs2 (mtd-utils version 1.5.0, mkfs.jffs2 revision 1.60) seems to be 8KiB:
Erase size 0x1000 too small. Increasing to 8KiB minimum
However I am running Linux 3.10 with an at25df321a,
m25p80 spi32766.0: at25df321a (4096 Kbytes),
and the erase block size is only 4KiB:
mtd5
Name: spi32766.0
Type: nor
Eraseblock size: 4096 bytes, 4.0 KiB
Amount of eraseblocks: 1024 (4194304 bytes, 4.0 MiB)
Minimum input/output unit size: 1 byte
Sub-page size: 1 byte
Character device major/minor: 90:10
Bad blocks are allowed: false
Device is writable: true
Is there a way to make the mtd system treat multiple erase blocks as one? Maybe some ioctl or module parameter?
If I flash a jffs2 image with larger erase block size, I get lots of kernel error messages, missing files and sometimes panic.
Workaround
I found that flash_erase --jffs2 results in a working filesystem in spite of the 4KiB erase block size. So I hacked the mkfs.jffs2.c file, and the resulting image seems to work fine. I'll give it some testing.
diff -rupN orig/mkfs.jffs2.c new/mkfs.jffs2.c
--- orig/mkfs.jffs2.c 2014-10-20 15:43:31.751696500 +0200
+++ new/mkfs.jffs2.c 2014-10-20 15:43:12.623431400 +0200
@@ -1659,11 +1659,11 @@ int main(int argc, char **argv)
}
erase_block_size *= units;
- /* If it's less than 8KiB, they're not allowed */
- if (erase_block_size < 0x2000) {
- fprintf(stderr, "Erase size 0x%x too small. Increasing to 8KiB minimum\n",
+ /* If it's less than 4KiB, they're not allowed */
+ if (erase_block_size < 0x1000) {
+ fprintf(stderr, "Erase size 0x%x too small. Increasing to 4KiB minimum\n",
erase_block_size);
- erase_block_size = 0x2000;
+ erase_block_size = 0x1000;
}
break;
}

http://lists.infradead.org/pipermail/linux-mtd/2010-September/031876.html
JFFS2 should be able to fit at least one node into an eraseblock. The
maximum node size is 4KiB plus a few bytes. This is why the minimum
eraseblock size is 8KiB.
But in practice, even 8KiB is bad because you end up wasting a
lot of space at the end of eraseblocks.
You should join several eraseblocks into one virtual eraseblock of 64 or
128 KiB and use that - this will be more optimal.
Some drivers have already implemented this. I know about the
MTD_SPI_NOR_USE_4K_SECTORS
Linux configuration option. It has to be set to "n" to enable large erase sectors of size 0x00010000.
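The space argument from the quoted mail can be made concrete with a back-of-the-envelope calculation. The Python sketch below is a worst-case illustration only. NODE_OVERHEAD is a hypothetical stand-in for the "few bytes" of node header, and the calculation pretends every node is of maximum size; real filesystems contain many smaller nodes, so actual waste will differ.

```python
MAX_DATA = 4096        # data payload of the largest node, per the quoted mail
NODE_OVERHEAD = 68     # hypothetical header size standing in for "a few bytes"

def worst_case_fit(eraseblock_size: int):
    """Return (maximum-size nodes that fit, bytes left over) for one eraseblock."""
    node = MAX_DATA + NODE_OVERHEAD
    count = eraseblock_size // node
    wasted = eraseblock_size - count * node
    return count, wasted

for size in (4096, 8192, 65536, 131072):
    count, wasted = worst_case_fit(size)
    print(f"{size:>6} bytes: {count:>2} nodes, {wasted} bytes unused ({100 * wasted / size:.1f}%)")
```

Under these assumptions a 4KiB eraseblock cannot hold a single maximum-size node, an 8KiB eraseblock holds one and leaves almost half of the block unused, and a 64KiB eraseblock leaves only a few percent unused.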

Extract thumbnail from jpeg file

I'd like to extract the thumbnail image from JPEG files without any external library. This should not be too difficult: I only need to know where the thumbnail starts and ends in the file, and then cut it out. I have studied a lot of documentation (e.g. http://www.media.mit.edu/pia/Research/deepview/exif.html) and tried to analyze JPEG files, but not everything is clear. I tried to follow the bytes step by step, but I got lost in the details. Is there any good documentation, or readable source code, for finding the start and end position of the thumbnail within a JPEG file?
Thank you!

Exiftool is very capable of doing this quickly and easily:
exiftool -b -ThumbnailImage my_image.jpg > my_thumbnail.jpg

For most JPEG images created by phones or digital cameras, the thumbnail image (if present) is stored in the APP1 marker (FFE1). Inside this marker segment is a TIFF file containing the EXIF information for the main image and the optional thumbnail image stored as a JPEG compressed image. The TIFF file usually contains two "pages" where the first page is the EXIF info and the second page is the thumbnail stored in the "old" TIFF type 6 format. Type 6 format is when a JPEG file is just stored as-is inside of a TIFF wrapper. If you want the simplest possible code to extract the thumbnail as a JFIF, you will need to do the following steps:
1. Familiarize yourself with JFIF and TIFF markers/tags. JFIF markers consist of two bytes: 0xFF followed by the marker type (0xE1 for APP1). These two bytes are followed by the two-byte length stored in big-endian order. For TIFF files, consult the Adobe TIFF 6.0 reference.
2. Search your JPEG file for the APP1 (FFE1) Exif marker. There may be multiple APP1 markers, and there may be other markers before the APP1.
3. The APP1 marker you're looking for contains the ASCII identifier "Exif" followed by two zero bytes, immediately after the length field.
4. Look for "II" or "MM" (6 bytes after the length field) to determine the endianness used in the TIFF file. II = Intel = little endian, MM = Motorola = big endian.
5. Skip through the first page's tags to find the second IFD, where the image is stored. In the second "page", look for the two TIFF tags which point to the JPEG data. Tag 0x201 has the offset of the JPEG data (relative to the II/MM) and tag 0x202 has the length in bytes.
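The steps above can be sketched in Python roughly as follows. This is a minimal illustration rather than a hardened parser. The function names are made up, it assumes the thumbnail lives in the second IFD of the first Exif APP1 segment, and it does no bounds checking, so a truncated or malformed file will raise an exception.

```python
import struct
from typing import Optional

def find_exif_tiff(jpeg: bytes) -> Optional[bytes]:
    """Return the TIFF block inside the first Exif APP1 segment, or None."""
    if jpeg[:2] != b"\xff\xd8":                               # SOI
        return None
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker == 0xDA:                                    # start of scan: headers are over
            break
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])  # big-endian, counts itself
        payload = jpeg[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload[:6] == b"Exif\x00\x00":
            return payload[6:]                                # starts at "II" or "MM"
        pos += 2 + length
    return None

def extract_thumbnail(jpeg: bytes) -> Optional[bytes]:
    """Return the embedded JPEG thumbnail bytes, or None if there is none."""
    tiff = find_exif_tiff(jpeg)
    if tiff is None or tiff[:2] not in (b"II", b"MM"):
        return None
    e = "<" if tiff[:2] == b"II" else ">"                     # byte order for all TIFF fields

    def read_ifd(offset):
        (count,) = struct.unpack_from(e + "H", tiff, offset)
        tags = {}
        for i in range(count):
            entry = offset + 2 + 12 * i
            tag, typ, _n, value = struct.unpack_from(e + "HHII", tiff, entry)
            if typ == 3:                                      # SHORT values sit in the first 2 bytes
                (value,) = struct.unpack_from(e + "H", tiff, entry + 8)
            tags[tag] = value
        (next_ifd,) = struct.unpack_from(e + "I", tiff, offset + 2 + 12 * count)
        return tags, next_ifd

    (ifd0,) = struct.unpack_from(e + "I", tiff, 4)            # offset of the first IFD
    _, ifd1 = read_ifd(ifd0)
    if ifd1 == 0:                                             # no second IFD, no thumbnail
        return None
    tags, _ = read_ifd(ifd1)
    start, size = tags.get(0x0201), tags.get(0x0202)          # offset and length of the JPEG data
    if start is None or size is None:
        return None
    return tiff[start:start + size]                           # offsets are relative to II/MM
```

Writing the returned bytes to a file should give a standalone JPEG, since the thumbnail is stored as a complete stream with its own SOI and EOI markers.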

There is a much simpler solution for this problem, but I don't know how reliable it is: Start reading the JPEG file from the third byte and search for FFD8 (start of JPEG image marker), then for FFD9 (end of JPEG image marker). Extract it and voila, that's your thumbnail.
A simple JavaScript implementation:
function getThumbnail(file, callback) {
if (file.type == "image/jpeg") {
var reader = new FileReader();
reader.onload = function (e) {
var array = new Uint8Array(e.target.result),
start, end;
for (var i = 2; i < array.length; i++) {
if (array[i] == 0xFF) {
if (!start) {
if (array[i + 1] == 0xD8) {
start = i;
}
} else {
if (array[i + 1] == 0xD9) {
end = i + 2; // include the two bytes of the FFD9 marker
break;
}
}
}
}
if (start && end) {
callback(new Blob([array.subarray(start, end)], {type:"image/jpeg"}));
} else {
// TODO scale with canvas
}
}
reader.readAsArrayBuffer(file.slice(0, 50000));
} else if (file.type.indexOf("image/") === 0) {
// TODO scale with canvas
}
}

The Wikipedia page on JFIF at http://en.wikipedia.org/wiki/JPEG_File_Interchange_Format gives a good description of the JPEG header (the header contains the thumbnail as an uncompressed raster image). That should give you an idea of the layout and thus the code needed to extract the info.
Hexdump of an image header (little endian display):
sdk@AndroidDev:~$ head -c 48 stfu.jpg |hexdump
0000000 d8ff e0ff 1000 464a 4649 0100 0101 4800
0000010 4800 0000 e1ff 1600 7845 6669 0000 4d4d
0000020 2a00 0000 0800 0000 0000 0000 feff 1700
hexdump prints 16-bit words here, so each pair of bytes appears swapped. Read as a byte stream, the layout by zero-based file offset is: SOI marker FFD8 (bytes 0-1), APP0 marker FFE0 (bytes 2-3), segment length (bytes 4-5), identifier "JFIF\0" or "JFXX\0" (bytes 6-10), version (bytes 11-12), density units (byte 13), X density (bytes 14-15), Y density (bytes 16-17), thumbnail width (byte 18), thumbnail height (byte 19), and finally the thumbnail data, which fills the rest of the segment.
In the example above, the segment length is 16 bytes (0x0010 at bytes 4-5) and the version is 01.01 (bytes 11-12). Since thumbnail width and thumbnail height are both 0x00, this JFIF header doesn't contain a thumbnail.
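The same header can be decoded programmatically. The Python sketch below (parse_jfif_header is a name invented for this illustration) unpacks the first 20 bytes as big-endian fields. It assumes the file starts with SOI immediately followed by a JFIF APP0 segment, as in the dump above.

```python
import struct

def parse_jfif_header(data: bytes) -> dict:
    """Decode SOI plus the fixed part of a JFIF APP0 segment (first 20 bytes)."""
    (soi, app0, length, ident, major, minor,
     units, x_density, y_density, thumb_w, thumb_h) = struct.unpack(
        ">HHH5sBBBHHBB", data[:20])                # all multi-byte fields are big-endian
    if soi != 0xFFD8 or app0 != 0xFFE0 or ident != b"JFIF\x00":
        raise ValueError("not a JFIF header")
    return {
        "length": length,                          # segment length, counts itself
        "version": (major, minor),
        "units": units,
        "density": (x_density, y_density),
        "thumbnail": (thumb_w, thumb_h),
        "thumbnail_bytes": 3 * thumb_w * thumb_h,  # uncompressed RGB, 3 bytes per pixel
    }

# The first 20 bytes of the dump above, in file order.
header = bytes.fromhex("ffd8ffe000104a46494600010101004800480000")
print(parse_jfif_header(header))
```

For these bytes the thumbnail size comes out as 0 by 0, which matches the conclusion that this JFIF header carries no thumbnail.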
