Parsing binary VTK

This paper provides a link to 10 3D models contained in binary VTK files, but I can't open them in ParaView (tried several versions from 2.x to 5.x on Linux and Windows, 32 and 64 bits) and other applications for VTK files.
I'd appreciate any help with opening them. Below is what I could find out.
ParaView reports an unknown keyword. From the VTK 4 specification PDF we see that "POINTS N float" is followed by N triples of floats. I suppose sizeof(float) is 4, so we get a data block of N * 3 * 4 bytes. After this data block we expect a new keyword (CELLS or something else), but there is no keyword there, just several more KB of binary data.
Another weird thing is that the last block (CELL_TYPES) should consist of integers. A binary integer is supposed to be 2, 4 or 8 bytes (4 being common), but we see sequences of three zero bytes followed by pairs of ASCII characters with codes (13, 10), which obviously indicate a new line. Does that mean an integer value is coded with three bytes? I'm very confused.
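For what it's worth, here is a minimal Python sketch (not from the paper or from VTK itself) of how the POINTS block could be read under the layout described above: an ASCII "POINTS N float" line followed by N * 3 * 4 bytes of data. The file name is a placeholder, and it assumes the legacy-VTK convention that binary data is big-endian.
import struct

def read_points(path="model.vtk"):  # placeholder file name
    with open(path, "rb") as f:
        # Skip ASCII header lines until the POINTS declaration.
        while True:
            line = f.readline()
            if not line:
                raise ValueError("no POINTS block found")
            text = line.decode("ascii", errors="replace").strip()
            if text.upper().startswith("POINTS"):
                n = int(text.split()[1])
                break
        # N triples of 4-byte floats; legacy VTK binary data is big-endian.
        raw = f.read(n * 3 * 4)
        coords = struct.unpack(">%df" % (n * 3), raw)
        return [coords[i:i + 3] for i in range(0, n * 3, 3)]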

Related

RNN Implementation

I am going to implement an RNN using PyTorch. But before that, I am having some difficulty understanding the character-level one-hot encoding asked for in the question.
Please find the question below:
Choose the text you want your neural network to learn, but keep in mind that your
data set must be quite large in order to learn the structure! RNNs have been trained
on highly diverse texts (novels, song lyrics, Linux Kernel, etc.) with success, so you
can get creative. As one easy option, Gutenberg Books is a source of free books where
you may download full novels in a .txt format.
We will use a character-level representation for this model. To do this, you may use
extended ASCII with 256 characters. As you read your chosen training set, you will
read in the characters one at a time into a one-hot-encoding, that is, each character
will map to a vector of ones and zeros, where the one indicates which of the characters
is present:
char → [0, 0, · · · , 1, · · · , 0, 0]
Your RNN will read in these length-256 binary vectors as input.
So, for example, I have read a novel in Python. The total number of unique characters is 97, and the total number of characters is somewhere around 300,000.
So, will my input be a 97 x 256 one-hot encoded matrix,
or will it be a 300,000 x 256 one-hot encoded matrix?
One-hot encoding assumes each of your vectors differs in exactly one place. So if you have 97 unique characters, then I think you should use a one-hot vector of size 97 + 1 = 98. The extra slot maps all unknown characters. But you can also use a 256-length vector. So your input will be:
B x N x V (B = batch size, N = number of characters, V = one-hot vector size).
But if you are using libraries, they usually ask for the index of each character in the vocabulary and handle the index-to-one-hot conversion themselves. Hope that helps.
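As a rough illustration of the B x N x V layout with the extended-ASCII choice (V = 256), here is a small PyTorch sketch; the text variable is a stand-in for whatever corpus you load, and mapping characters with ord(c) % 256 is just one simple way to get extended-ASCII indices.
import torch
import torch.nn.functional as F

text = "some training text"                            # placeholder corpus
indices = torch.tensor([ord(c) % 256 for c in text])   # char -> index in [0, 255]
one_hot = F.one_hot(indices, num_classes=256).float()  # shape: (N, 256)

# Add a batch dimension to get the B x N x V layout mentioned above.
batch = one_hot.unsqueeze(0)                           # shape: (1, N, 256)
print(batch.shape)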

How are WAV Channel Data values scaled?

For a project I am decoding WAV files and using the values in the data channel. I am using the node package "node-wav". From what I understand, the values should be in the thousands, but I am seeing values scaled between -1 and 1. If I want the actual values, do I need to multiply the scaled value by some number?
Part of the reason I am asking is that I still do not fully understand how WAV files store the necessary data.
I don't know node.js exactly, but audio data is usually stored as float values, so it makes sense to see it scaled between -1 and 1.
What I pulled from the website:
Data format
Data is always returned as Float32Arrays. While reading and writing 64-bit float WAV files is supported, data is truncated to 32-bit floats.
And endianness if you need it for some reason:
Endianness
This module assumes a little endian CPU, which is true for pretty much every processor these days (in particular Intel and ARM).
If you needed to scale from float to fixed-point integers, you'd multiply the value by the maximum value of the target integer type. For example, if you're trying to convert to 16-bit integers: y = (2^15 - 1) * x, where x is the data value and y is the scaled value.
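A small sketch of that conversion (in Python/NumPy rather than node, and with made-up sample values):
import numpy as np

float_samples = np.array([0.0, 0.5, -0.5, 1.0, -1.0], dtype=np.float32)
# y = (2^15 - 1) * x, clipped to the int16 range
int16_samples = np.clip(float_samples * (2**15 - 1), -2**15, 2**15 - 1).astype(np.int16)
print(int16_samples)   # [0 16383 -16383 32767 -32767]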

Why can a textual representation of pi be compressed?

A random string should be incompressible.
pi = "31415..."
pi.size # => 10000
XZ.compress(pi).size # => 4540
A random hex string also gets significantly compressed. A random byte string, however, does not get compressed.
The string of pi only contains the bytes 48 through 57. With a prefix code on the integers, this string can be heavily compressed. Essentially, I'm wasting space by representing my 10 different characters as whole bytes (or 16, in the case of the hex string). Is this what's going on?
Can someone explain to me what the underlying method is, or point me to some sources?
It's a matter of information density. Compression is about removing redundant information.
In the string "314159", each character occupies 8 bits, and can therefore have any of 2^8 = 256 distinct values, but only 10 of those values are actually used. Even a painfully naive compression scheme could represent the same information using 4 bits per digit; this is known as Binary Coded Decimal. More sophisticated compression schemes can do better than that (a decimal digit is effectively log2(10), or about 3.32, bits), but at the expense of storing some extra information that allows for decompression.
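As a rough illustration of the 4-bits-per-digit idea (this is not how xz actually works, just the naive scheme described above), here is a sketch that packs two decimal digits into each byte:
def pack_digits(digits: str) -> bytes:
    nibbles = [int(d) for d in digits]
    if len(nibbles) % 2:                 # pad to an even number of digits
        nibbles.append(0)
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[::2], nibbles[1::2]))

packed = pack_digits("314159265358979")
print(len("314159265358979"), "->", len(packed))   # 15 -> 8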
In a random hexadecimal string, each 8-bit character has 4 meaningful bits, so compression by nearly 50% should be possible. The longer the string, the closer you can get to 50%. If you know in advance that the string contains only hexadecimal digits, you can compress it by exactly 50%, but of course that loses the ability to compress anything else.
In a random byte string, there is no opportunity for compression; you need the entire 8 bits per character to represent each value. If it's truly random, attempting to compress it will probably expand it slightly, since some additional information is needed to indicate that the output is compressed data.
Explaining the details of how compression works is beyond both the scope of this answer and my expertise.
In addition to Keith Thompson's excellent answer, there's another point that's relevant to LZMA (which is the compression algorithm that the XZ format uses). The number pi does not consist of a single repeating string of digits, but neither is it completely random. It does contain substrings of digits which are repeated within the larger sequence. LZMA can detect these and store only a single copy of the repeated substring, reducing the size of the compressed data.
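For anyone who wants to reproduce the effect, here is a small sketch using Python's lzma module (the same LZMA algorithm behind xz). It uses random decimal digits, which behave much like the digits of pi for this purpose, and compares them against genuinely random bytes:
import lzma, os, random

digits = "".join(random.choice("0123456789") for _ in range(10000)).encode()
random_bytes = os.urandom(10000)

print(len(lzma.compress(digits)))        # noticeably smaller than 10000
print(len(lzma.compress(random_bytes)))  # roughly 10000, or slightly larger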

PLY file specifications with texture coordinates

I need to read PLY files (Stanford Triangle Format) with embedded texture for some purpose. I have seen several specifications of PLY files, but could not find a single source specifying the syntax for texture mapping. There seem to be many libraries that read PLY files, but most of them do not seem to support textures (they just crash; I tried 2-3 of them).
Following is in the header for a ply file with texture:
ply
format binary_little_endian 1.0
comment TextureFile Parameterization.png
element vertex 50383
property float x
property float y
property float z
property float nx
property float ny
property float nz
element face 99994
property list uint8 int32 vertex_index
property list uint8 float texcoord
end_header
What I don't understand is the line property list uint8 float texcoord. Also, the list corresponding to a face is:
3 1247 1257 1279 6 0.09163 0.565323 0.109197 0.565733 0.10888 0.602539 6 9 0.992157 0.992157 0.992157 0.992157 0.992157 0.992157 0.992157 0.992157 0.992157
What is this list and what is its format? While I understand that PLY gives you the opportunity to define your own properties for elements, handling textures seems to be pretty much standard, and quite a few applications (like the popular MeshLab) seem to open textured PLY files using the above syntax.
I want to know the standard syntax followed for reading textured PLY files and, if possible, the source where this information can be found.
In PLY files, faces often contain lists of values, and these lists can vary in size. If it's a triangular face, expect three values; a quad, four; and so on up to any arbitrary n-gon. A list is declared in a line like this:
property list uint8 int32 vertex_index
This is a list called 'vertex_index'. It will always consist of an 8-bit unsigned integer (that's the uint8) that is the size N, followed by N 32-bit integers (that's the int32).
In the example line this shows up right away:
3 1247 1257 1279
This says that three values follow, and then it gives you the three.
Now the second list is where the texture coordinates should be:
property list uint8 float texcoord
It's just like the first list in that the size comes first (as an unsigned byte) but this time it will be followed by a series of 32-bit floats instead of integers (makes sense for texture coordinates). The straightforward interpretation is that there will be a texture coordinate for each of the vertices listed in vertex_index. If we assume these are just 2d texture coordinates (a pretty safe assumption) we should expect to see the number 6 followed by 6 floating point values ... and we do:
6 0.09163 0.565323 0.109197 0.565733 0.10888 0.602539
These are the texture coordinates that correspond with the three vertices already listed.
Now, for a face, that should be it. I don't know what the rest of the line is. According to your header, the rest of the file should be binary, so I don't know how you got it as a line of ASCII text, but the extra data on that line shouldn't be there (again according to the header, which fully defines a face).
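To make the layout concrete, here is a small Python sketch (not taken from any particular PLY library) that decodes one binary face record containing exactly the two lists declared in the header above; it assumes little-endian data, matching binary_little_endian:
import struct

def read_face(buf, offset=0):
    n_verts = buf[offset]; offset += 1                          # uint8 count
    verts = struct.unpack_from("<%di" % n_verts, buf, offset)   # int32 vertex indices
    offset += 4 * n_verts
    n_uv = buf[offset]; offset += 1                             # uint8 count
    uvs = struct.unpack_from("<%df" % n_uv, buf, offset)        # float32 texcoords
    offset += 4 * n_uv
    return verts, uvs, offset
For the example face above this consumes 1 + 3*4 + 1 + 6*4 = 38 bytes, and anything after that is the unexplained trailing data.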
Let me add to @OllieBrown's response, as further info for anyone coming across this, that the format above uses per-face texture coordinates, also called wedge UVs. What this means is that if you are sharing vertices, a shared vertex (basically a vertex index used by multiple adjacent triangles) might have different UVs depending on the triangle it takes part in. That usually happens when a vertex lies on a UV seam or where the UVs meet the texture borders. Typically that means duplicating vertices, since GPUs require per-vertex attributes. So a shared vertex ends up as X vertices overlapping in space (where X is the number of triangles sharing it), each with different UVs based on the triangle it takes part in. One advantage of keeping the data like that on disk is that, since this is a text format, it reduces the amount of text you need and therefore the disk size. OBJ has that as well, although it keeps a flat UV array and indexes into that array instead, regardless of whether the UVs are per-vertex or per-face.
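A rough sketch of that duplication step (the positions, faces and UVs below are made-up sample data, not from the file in the question):
positions = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
faces = [(0, 1, 2), (2, 1, 3)]                     # two triangles sharing an edge
face_uvs = [[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],  # wedge UVs, one set per face
            [(0.5, 0.5), (0.9, 0.1), (0.9, 0.9)]]

flat_positions, flat_uvs = [], []
for face, uvs in zip(faces, face_uvs):
    for v_idx, uv in zip(face, uvs):
        flat_positions.append(positions[v_idx])    # shared vertices get duplicated
        flat_uvs.append(uv)

print(len(flat_positions))   # 6 per-corner vertices from 4 original vertices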
I also can't figure out what the 6 9 <9*0.992157> part is (although the 9 part seems like 3 vector3s which have the same value for all 3 axes), but Paul Bourke's code here has this description of the setup_other_props function:
/******************************************************************************
Make ready for "other" properties of an element-- those properties that
the user has not explicitly asked for, but that are to be stashed away
in a special structure to be carried along with the element's other
information.
Entry:
plyfile - file identifier
elem - element for which we want to save away other properties
******************************************************************************/
void setup_other_props(PlyFile *plyfile, PlyElement *elem)
From what I understand, it's possible to keep per-element data that is not described by the header. These data are supposed to be kept and stored, but not interpreted for use in every application. Bourke's description of the format talks about backwards compatibility with older software, so this might be a case of a custom format that only some applications understand, but the extra info shouldn't prevent an older application that doesn't need it from understanding and/or rendering the content.

Compression using ASCII: trying to figure out how many bits are needed to store the following efficiently

I am trying to learn the basics of compression using only ASCII.
Say I am sending an email made of strings of lower-case letters. If the file has n characters, each stored as an 8-bit extended ASCII code, then we need 8n bits.
But according to the guiding principle of compression, we discard the unimportant information.
Using that, we don't need all the ASCII codes to encode strings of lowercase letters: they use only 26 characters. We can make our own code with only 5-bit codewords (2^5 = 32 > 26), code the file using this scheme, and then decode the email once received.
The size has decreased by 8n - 5n = 3n, i.e. a 37.5% reduction.
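A small sketch of that 5-bit scheme (the mapping 'a' -> 0 ... 'z' -> 25 and the packing order are just one possible choice):
def pack_5bit(text: str) -> bytes:
    bits = 0
    n_bits = 0
    out = bytearray()
    for ch in text:
        bits = (bits << 5) | (ord(ch) - ord("a"))   # 5-bit codeword per letter
        n_bits += 5
        while n_bits >= 8:
            n_bits -= 8
            out.append((bits >> n_bits) & 0xFF)
    if n_bits:
        out.append((bits << (8 - n_bits)) & 0xFF)   # pad the final byte
    return bytes(out)

print(len("compression"), "->", len(pack_5bit("compression")))   # 11 bytes -> 7 bytes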
But what if the email were formed from lower-case letters (26), upper-case letters, and m extra characters, and they had to be stored efficiently?
If you have n symbols of equal probability, then it is possible to code each symbol using log2(n) bits. This is true even if log2(n) is fractional, using arithmetic or range coding. If you limit it to Huffman coding (a whole number of bits per codeword), you can get close to log2(n), still with a fractional number of bits per symbol on average.
For example, you can encode ten symbols (e.g. decimal digits) in very close to 3.322 bits per symbol with arithmetic coding. With Huffman coding, you can code six of the symbols with three bits and four of the symbols with four bits, for an average of 3.4 bits per symbol.
The use of shift-up and shift-down operations can be beneficial since in English text you expect to have strings of lower case characters with occasional upper case characters. Now you are getting into both higher order models and unequal frequency distributions.
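A short sketch of that calculation for the alphabet in the question (26 lower-case + 26 upper-case + m extra characters, with m = 10 chosen only as an example and equal symbol probabilities assumed):
import math

m = 10
n_symbols = 26 + 26 + m
ideal = math.log2(n_symbols)             # arithmetic/range coding can approach this
fixed = math.ceil(math.log2(n_symbols))  # a fixed-width code needs this many bits
print(n_symbols, ideal, fixed)           # 62 5.954... 6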

Resources