My midlet is showing some images fine, but not others.
They are all 8-bit PNGs, but the ones that aren't displaying are the ones I have created myself in PhotoShop.
So I am thinking maybe my PhotoShop (CS6) settings are wrong...
PNG-8, Selective, Diffusion, Colors: 256, Dither: 100%, Matte: None, Web
Snap: 0%, Convert to sRGB: ticked, Width: 48, Height: 48, Percent: 100%,
Quality: Bicubic.
I've experimented with a few of these settings, but to no avail.
Any ideas?
There is a similar problem here but this is opposite to mine in that PhotoShop mends things in that case, rather than breaks things...
My code is...
image = Image.createImage("/img/loading1.png");
...and here is my stack trace:
java.io.EOFException
at javax.imageio.stream.ImageInputStreamImpl.readFully(
ImageInputStreamImpl.java:353)
at java.io.DataInputStream.readUTF(DataInputStream.java:609)
at javax.imageio.stream.ImageInputStreamImpl.readUTF(ImageInputStreamImpl.java:332)
at com.sun.kvem.png.PNGImageReader.parse_iTXt_chunk(PNGImageReader.java:447)
at com.sun.kvem.png.PNGImageReader.readMetadata(PNGImageReader.java:650)
at com.sun.kvem.png.PNGImageReader.readImage(PNGImageReader.java:1312)
at com.sun.kvem.png.PNGImageReader.read(PNGImageReader.java:1582)
at com.sun.kvem.midp.GraphicsBridge.loadImage(GraphicsBridge.java:2602)
at com.sun.kvem.midp.GraphicsBridge.createImageFromData(GraphicsBridge.java:2511)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at com.sun.kvem.sublime.MethodExecution.process(MethodExecution.java:42)
at com.sun.kvem.sublime.SublimeExecutor.processRequest(SublimeExecutor.java:63)
at javax.microedition.lcdui.Image.createImage(Image.java:315)
The image in question does exist - both in the project and in the jar that is built.
Here is the image in question:
According to the crash log, the PNG decoder in J2ME fails inside the non-critical chunk iTXt:1
> com.sun.kvem.png.PNGImageReader.readMetadata
> com.sun.kvem.png.PNGImageReader.parse_iTXt_chunk
> javax.imageio.stream.ImageInputStreamImpl.readUTF
> java.io.DataInputStream.readUTF
According to libpng documentation, the text part of an iTXt chunk must be valid UTF8:
... The remaining chunk data is the main UTF-8 text, either zlib-compressed or not, according to the compression flag. Since its length can be determined from the chunk length, it is not null-terminated. As with the other two text chunks, newlines should be represented by single line-feed characters (decimal 10), and all other control characters (1-9, 11-31, and 127-159) are discouraged.
and so normally this would indicate that the stream read is not valid UTF8 text - it contains 'raw' bytes higher than the plain ASCII range 0..127 that do not conform to UTF8 rules.
I found that not to be the case in the sample image. There is only one set of consecutive bytes that form a UTF8 code sequence, and it is a valid one:
<?xpacket begin="EFBBBF" id=" ..
(the bolded section represents 3 data bytes in hexadecimal notation). I first suspected this was the error:
If the BOM character appears in the middle of a data stream, Unicode says it should be interpreted as a "zero-width non-breaking space" (inhibits line-breaking between word-glyphs). In Unicode 3.2, this usage is deprecated in favour of the "Word Joiner" character, U+2060.[1] This allows U+FEFF to be only used as a BOM.
(http://en.wikipedia.org/wiki/Byte_order_mark)
.. and so a fully conforming UTF8 reader should inspect its bytes and throw an UTFDataFormatException when it encounters a BOM anywhere else than as the very first value. Surprisingly, this does not seem to be the problem! First of all, there is no indication any of the readUTF sources do anything else than only verify if the UTF8 code is valid on its own, irrespective of its value. There are lots of 'invalid' Unicode code points (values that do not represent a valid Unicode character or instruction), but it appears to me they are all silently ignored. But I noticed the common readUTF functions only implement a small subset of UTF8/Unicode (see, e.g., Modified UTF-8 in Oracle's documentation).
So the problem lies elsewhere. Another clue to this is that the error thrown is not UTFDataFormatException but rather EOFException, indicating the read buffer ran out of the number of bytes it was promised to contain.
(warning: pure conjecture follows)
Looking at a source of DataInputStream, I find this snippet of code:
588 public final static String readUTF(DataInput in) throws IOException {
589 int utflen = in.readUnsignedShort();
followed by a loop to read utflen bytes (not "Unicode characters"). This is wrong for an iTXt chunk, as it does not have a 'first word' to indicate its length. The number of bytes in the plain text can be derived from the chunk length (which is, per PNG convention, the total data length excluding the length long word, the iTXt signature itself, and the final CRC32 code) minus the length of the zero-terminated keyword name, language, and "translated keyword" strings, and the two bytes which indicate compression of the full plain text.
As a work-around, remove the iTXt chunks from your PNG images. The data itself -- XMP Metadata -- is most likely not interesting at all for your purposes (but feel free to read what benefits Adobe thinks it has). And if your workflow does not use it, it's just a useless hunk of uncompressed text, taking up 814 bytes of the total of 981 bytes in your sample image -- a whopping 83%!
You can use an external utility to remove extraneous data chunks; the command line for the popular pngcrush, for example, is
pngcrush -rem alla -rem text InputFile.png OutputFile.png
(from en.wikipedia.org/wiki/Pngcrush).
Or directly from Photoshop: if you save a PNG 'the usual way' with the "Save As" menu option, the metadata goes in and there is no checkbox to get rid of it. If you use "Save for Web & Devices" instead, you get a large dialog with a lot of handy options, such as a drop down list labelled "Metadata".
Choosing "All" I got an even larger file; my version of Photoshop creates a massive 3K chunk of XMP Metadata, including a 2K totally empty 'filler' block...
Selecting "Copyright" or "None" finally got rid of all the crud (presumably because I did not fill in any copyright information), and then you get a nice 169 bytes long PNG, in which the only metadata is that software used is called "Adobe ImageReady".
1 Which is kind of ironic. Per PNG specifications,
.. A decoder encountering an unknown chunk in which the ancillary bit is 1 can safely ignore the chunk and proceed to display the image.
(source)
This "ancillary bit" is the 5th bit of the first byte of the chunk ID: 0 (uppercase) = critical, 1 (lowercase) = ancillary, i.e., if the first character of the chunk ID is a capital, a PNG reader must read and interpret its data correctly, and if it's not, it can be skipped silently.
So technically, the writers of J2ME could safely have ignored this entire chunk. But they messed it up, attempt to read it, and now the code crashes on all programs that merely try to read the image data in PNGs which happen to contain iTXt chunks.
Related
I have a file in which i have very important java project source code that got lost.
It is an elf-file. When i open it with and editor most of it is unreadable but the complete java project seems to be embedded as a uncompressed zip folder inside the file with folderstructure and everything (dont ask me why. I only try to get the information back i am not responsible).
The relevant information pieces in the elf-file look like the following:
PK
Üi‰L§½kQ Q 9 file/path/i/cant/show/contenttext
content
content
Because i dont know where the zip folder starts and where it ends and because everything is uncompressed my idea was to write a small script to scrape the from the elf-file and create the complete javaproject from that.
For that i want the file name length from the header so its easy to know where filename ends end filecontent starts.
ThisPK Üi‰L§½kQ Q 9 seems to be the file header of the zipfile. I converted it to hex and it looks like this: 504B03040A2020082020DC69894CA71E BD6B512020205120202039202020
I tried to format that with the info from wikipedia:
504B0304 //sig (this showed me i did something right)
0A20 // version
2008 // generalpurpose flag
2020 // compression method
DC69 // File last modification time
894C // File last modification date
A71EBD6B //CRC-32 of uncompressed data
51202020 //Compressed size (or 0xffffffff for ZIP64)
51202020 //Uncompressed size (or 0xffffffff for ZIP64)
3920 //File name length (n)
2020 //Extra field length (m)
And Endian switch:
04034B50 //sig
200A // version
0820 // generalpurpose flag
2020 // compression method
69DC // File last modification time
4C89 // File last modification date
6BBD1EA7 //CRC-32 of uncompressed data
20202051 //Compressed size (or 0xffffffff for ZIP64)
20202051 //Uncompressed size (or 0xffffffff for ZIP64)
2039 //File name length (n)
2020 //Extra field length (m)
But something seems wrong. The length of the file header is right (30 bytes plus filename) and the numbers seem to have information at the right point but 2020 should be 0000 for compression. To me it seems the conversion to hex is only half right.
What do i have to change to get the right numbers?
I found my error.
The problem of the weird 2020 instead of 0000 was my mistake. I opened the file in notepadd++ copied interesting sections into a new file and converted them there into hex. I think the copying changed the data. When i open the file directly in a hexeditor the zipefile header is all right.
Using nodejs and iconv-lite to create a http response file in xml with charset windows-1252, the file -i command cannot identify it as windows-1252.
Server side:
r.header('Content-Disposition', 'attachment; filename=teste.xml');
r.header('Content-Type', 'text/xml; charset=iso8859-1');
r.write(ICONVLITE.encode(`<?xml version="1.0" encoding="windows-1252"?><x>€Àáção</x>`, "win1252")); //euro symbol and portuguese accentuated vogals
r.end();
The browser donwloads the file and then i check it in Ubuntu 20.04 LTS:
file -i teste.xml
/tmp/teste.xml: text/xml; charset=unknown-8bit
When i use gedit to open it, the accentuated vogal appear fine but the euro symbol it does not (all characters from 128 to 159 get messed up).
I checked in a windows 10 vm and in there all goes well. Both in Windows and Linux web browsers, it also shows all fine.
So, is it a problem in file command? How to check the right charsert of a file in Linux?
Thank you
EDIT
The result file can be get here
2nd EDIT
I found one error! The code line:
r.header('Content-Type', 'text/xml; charset=iso8859-1');
must be:
r.header('Content-Type', 'text/xml; charset=Windows-1252');
It's important to understand what a character encoding is and isn't.
A text file is actually just a stream of bits; or, since we've mostly agreed that there are 8 bits in a byte, a stream of bytes. A character encoding is a lookup table (and sometimes a more complicated algorithm) for deciding what characters to show to a human for that stream of bytes.
For instance, the character "€" encoded in Windows-1252 is the string of bits 10000000. That same string of bits will mean other things in other encodings - most encodings assign some meaning to all 256 possible bytes.
If a piece of software knows that the file is supposed to be read as Windows-1252, it can look up a mapping for that encoding and show you a "€". This is how browsers are displaying the right thing: you've told them in the Content-Type header to use the Windows-1252 lookup table.
Once you save the file to disk, that "Windows-1252" label form the Content-Type header isn't stored anywhere. So any program looking at that file can see that it contains the string of bits 10000000 but it doesn't know what mapping table to look that up in. Nothing you do in the HTTP headers is going to change that - none of those are going to affect how it's saved on disk.
In this particular case the "file" command could look at the "encoding" marker inside the XML document, and find the "windows-1252" there. My guess is that it simply doesn't have that functionality. So instead it uses its general logic for guessing an encoding: it's probably something ASCII-compatible, because it starts with the bytes that spell <?xml in ASCII; but it's not ASCII itself, because it has bytes outside the range 00000000 to 01111111; anything beyond that is hard to guess, so output "unknown-8bit".
Goodmorning,
with my Nikon DSLR, I always shoot raw. I'd like to import my nef's in an import-folder and copy them from there with a name, prefixed by the date taken. In Lazarus I've created a class named TMyNef and I've managed to retrieve the date taken, ISO, Shutterspeed, F-Stops, exposure program, maker, model and focal length. I'm struggling with the offset to the embedded JPG. I need this because the Lazarus TImage-component doesn't support the NEF-format, but it can display JPG's perfectly. Every NEF has an embedded JPG (you see on the back of the camera), hidden in the PreviewTAG of the NEF. Basically, a NEF is almost the same as a TIFF. A Tiff uses IFD-s to store meta-information. So, I studied lots of sites about tiff, exif, Exiftool (Phil Harvey), DCRaw (Dave Coffin) I've used Exiftool.exe to analyse my nef's. I've even called "Exiftool.exe -b" within my program, but it's to slow.
Shortly, it comes down to:
Open the file
Read the header
Search for tag 0x8769 (the EXIF-tag)
If not found -> exit
Goto the offset where the EXIF-IFD starts
Search for tag 0x927C (The makernotes)
If not found -> exit
Goto to the offset where the MAKERNOTES-IFD starts
Read the Makernotes header
Search for tag 0x011, the Nikon Preview IFD
If not found -> Exit
Goto to the Nikon Preview IFD
Search for tags 0x201 (Start of the embedded JPG) and
tag 0x202 (Size of the embedded JPG)
If not found -> exit
Goto the start of the embedded jpg.
Read JPGSIZE-bytes from the file and copy them to a TMemorystream,
called Fms
Perform a Image1.Picture.LoadFromStream(Fms)
Summarized:
Open nef, search tag x08769, search 0x927C, search 0x0011, search 0x0201.
"Goto" means in the case: move the filepointer to the given offset.
Unfortunately, I've haven't been able the retrieve the offset of the embedded JPG. My program gives entirely different values than Phil Harvey's "Exiftool".
Jumping to "Phil's offset" and read until 0xFFD9 or read the amount of bytes given in tag 0x202, display's a perfect picture, going to the offset my program found and read until 0xFFD9, Lazarus says: Unsupported picture-format.
In the several IFD's I've found the meta-data I needed, so I took that along with me.
Does anyone know a way to retrieve the correct offset to the embedded JPG in a NEF? Or is there another way to display a NEF via the TImage-component in Lazarus?
Thanks in advance and best regards,
Martin
(Intel I5, 8Gb, Windows 10, Nikon Codec loaded, Lazarus 1.8)
Offsets within Exif structures usually are relative to the beginning of the TIFF header in front of the first Exif tag. This is true for jpeg and tiff files, not sure about NEF.
I am using the XML-INTO op-code to parse a web service request. Every now and then I get errors in the logs
(RNX0351 - "The XML parser detected error code 302").
The help for a 302 is
302 The parser does not support the requested CCSID value or
the first character of the XML document was not '<'
To the best of my knowledge, the first character is "<" and the request is generated from a previous web service call so I would be very suprised if the CCSID has changed.
The error is repeatable, for the specific query so it is almost certainly data related, I am just unsure how I would go about identifying the offending item.
Any thoughts on how to determine the issue, or better yet, how to overcome it?
cheers
CCSID is an AS400/iSeries/Power System attribute, and it applies to the whole IFS.It's like a declaration of what inside the file is, or in other words what its internal encoding "should be".
It's supposed that data content encoding in the file and the file one (the envelope) match, and the box uses this attribute to show and handle corresponding characters.
It sounds like you receive data under one encoding, but CCSID file doesn't match.
Try changing CCSID on your file (only the envelope). E.G.: 37 (american), 500 (latin-1), 819 (utf-8), 850 (dos), 1252 (win) and display file after.You can check first using ls -Sla yourfile in QSH or QP2TERM, or EDTF as well. CHGATTR allows you to change CCSID, as well as setccsid in QSH (again).
This way helped me to find related issues. Remember that although data may be visible in the four hundred, they may not be visible through a share folder in Win. It means that CCSID file, an content encoding don't match.
Hope it helps.
Hi I've seen this error with XML data uploaded to AS400/iSeries/IBM i with FTP and the CCSID 819 (ISO 8859-1 ASCII) and it has some binary garbage in first few positions of file. Changed encoding to CCSID 1208 (UTF-8 with IBM PUA) using FTP "quote type c 1208" and the problem cleared and XML-INTO was successful.
So, suggestion about XML parser error 302 received when using XML-INTO is to look at the file (wrklnk ...) and if first character is not "<" but instead some binary garbage then try CCSID 1208 for utf-8.
Statements in this answer about what 819 is and what ccsid represents utf-8 do not agree with previous answer but are correct, according to IBM documentation:
https://www-01.ibm.com/software/globalization/ccsid/ccsid819.html
https://www-01.ibm.com/software/globalization/ccsid/ccsid1208.html
I'm working on this problem a couple hours,
for me the solution was use option ccsid=UCS2 when you use data structure or variable to store xml.
something like that :
XML-INTO customer %XML( xmlSource : 'ccsid=UCS2');
I have the program running on ccsid = 870, every conversion to ccsid on the xmlSource field don't work,
The strange thing that when I use the file with ccsid = 850, every thing work fine
I mention that becouse this is the first page when you looking about this problem.
Maybe this help someone.
I'm receiving UDP packets from a server ( exactly: Open Sound Control packets). i'm storing those packets in a ByteArray.
I want to convert this ByteArray into String so i can exploit the data received. I tried a lot of conversion but each time i'm having non readable charachters.
Here is the code :
| server peerAddr |
server := SocketAccessor newUDPserverAtPort: 3333.
peerAddr := IPSocketAddress new.
buffer := ByteArray new: 1024.
[ server readWait.
server receiveFrom: peerAddr buffer: buffer.
Transcript show: (buffer asString) ; cr ; flush. ] repeat.
I also tried the following conversion but in vain:
buffer asByteString.
buffer asStringEncoding:#UTF8.
buffer asStringEncoding:#UTF16.
buffer asString.
buffer asBase64String.
buffer asFourByteString
buffer withEncoding: #ASCII
Here is the string output :
Any help?
Additional info: The received data is open sound control data so it has a specific formatting, that's why it's diplayed like that, i need to parse ints, floats, strings, whitin a specific bytearray indexs. Does anyone recomand a package that offer those possibilities ?
Thx in advance.
if you want to read the data from the byte array, use the UninterpretedBytes class for that.
You can do:
ubytes := UninterpretedBytes from: aByteArray.
ubytes doubleAt: 5.
stuff like that.
you can also use the uninterpreted bytes to read a string from the bytes.
The correct way to convert bytes to string is definitely applying the right character encoding. The following
(65 to: 75) asByteArray asStringEncoding: #UTF8
should yield
'ABCDEFGHIJK'
Using #asStringEncoding: is the right way to do this. However looking at your screen capture it seems that the bytes you're receiving are not a straight string. There's probably some binary packet format that you need to take apart first and then only decode into strings those parts that you know are actually utf8 encoded (or whatever the encoding is).
You probably can borrow a lot of the code for Squeak's OSC package: http://opensoundcontrol.org/implementation/squeak-osc