Why does java.util.zip.CRC32.getValue() return a long, not an int?

See the title. The returned value is 32 bits, right? Why not return an int?

Because if it returned an int, half of the CRCs would be negative. The expectation is that a 32-bit CRC is unsigned, i.e. 0..4294967295, which cannot be represented in an int.

java.util.zip.CRC32 implements the Checksum interface, whose getValue() method is declared to return a long, so even a 32-bit checksum has to be returned as a long; for CRC32 the upper 32 bits of the result are always zero.
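The underlying issue is two's-complement width rather than anything Java-specific; here is a minimal C sketch of the same bit pattern viewed three ways (0xDEADBEEF is just an arbitrary example value):

#include <inttypes.h>
#include <stdio.h>

int main(void) {
    uint32_t crc = 0xDEADBEEFu;   /* an arbitrary 32-bit CRC-like value with the top bit set */

    printf("as unsigned 32-bit: %" PRIu32 "\n", crc);          /* 3735928559 */
    printf("as signed 32-bit:   %" PRId32 "\n", (int32_t)crc); /* -559038737, appears negative */
    printf("as 64-bit integer:  %" PRId64 "\n", (int64_t)crc); /* 3735928559, like Java's long */
    return 0;
}

Any CRC with its high bit set would look negative in the signed 32-bit view, which is exactly what returning an int would cause.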

Related

Strings, integers, data types

After several years of writing code for my own use, I'm trying to understand what it really means.
a = "Foo"
b = ""
c = 5
d = True
a - string variable. "Foo" (with quotes) - string literal, i.e. an entity of the string data type.
b - string variable. "" - empty string.
c - integer variable. 5 - integer literal, i.e. an entity of the integral data type.
d - Boolean variable. True - Boolean value, i.e. an entity of the Boolean data type.
Questions:
Is my understanding correct?
It seems that 5 is an integer literal, which is an entity of the integral data type. "Integer" and "integral": why do we use different words here?
What is the "string" and "integer"?
As I understand from Wikipedia, "string" and "integer" are not the same thing as string/integer literals or data types. In other words, there are 3 pairs of terms:
string literal, integer literal
string data type, integer data type
string, integer
Firstly, a literal value is any value which appears literally in code, e.g. "hello" is a string literal, 123 is an integer literal, etc. By contrast, for example:
int a = 5;
int b = 2;
int c = a + b;
a and b have literal values assigned to them, but c does not, it has a computed value assigned to it.
With any literal value, we describe the literal with its data type (as in the first sentence), e.g. "string literal" or "integer literal".
Now a data type refers to how the computer, or the software running on the computer, interprets the binary value of some data. For most kinds of data, the interpretation of the bytes is typically defined in a standard. UTF-8, for example, is one way to interpret the bytes of a string's internal (binary) value. Interestingly, the actual bytes of a string are treated as unsigned, 8-bit integers. In UTF-8, the values of those integers are combined in various ways to determine which glyph, or character, should appear on the screen when those values are encountered in the data. UTF-8 is a variable-length encoding which can use between 1 and 4 bytes per character (8 to 32 bits).
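For instance, here is a rough C illustration of looking at a string's bytes as unsigned 8-bit integers (the two-character string "Aé" is just an example; its UTF-8 encoding happens to take three bytes):

#include <stdio.h>

int main(void) {
    const char *s = "A\xC3\xA9";   /* "Aé" in UTF-8: 'A' is one byte, 'é' is two (0xC3 0xA9) */

    for (const char *p = s; *p != '\0'; p++) {
        unsigned char byte = (unsigned char)*p;   /* view each byte as an unsigned 8-bit integer */
        printf("0x%02X (%u)\n", byte, byte);
    }
    return 0;
}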
For numbers, particularly integers, implementations can vary, but most use four bytes. When the most significant byte is stored first, with the top bit of that byte acting as the sign bit for signed integers (or simply as the most significant bit for unsigned ones), the layout is called big-endian; storing the least significant byte first is called little-endian, which is what most desktop and mobile CPUs use in practice. Integers can in principle use any number of bytes, but the most commonly implemented sizes are 1, 2, 4 and sometimes 8 bytes, i.e. 8, 16, 32 or 64 bits respectively. Integer sizes other than these typically require a custom implementation.
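As a quick illustration, here is a small C sketch that prints the in-memory byte order of a 32-bit integer on whatever machine it runs on, next to its big-endian rendering (0x01020304 is just a convenient example value):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    uint32_t value = 0x01020304u;
    unsigned char bytes[4];

    memcpy(bytes, &value, sizeof value);   /* the machine's native byte order */
    printf("in memory : %02X %02X %02X %02X\n",
           (unsigned)bytes[0], (unsigned)bytes[1], (unsigned)bytes[2], (unsigned)bytes[3]);

    /* Big-endian order: most significant byte first, regardless of the machine. */
    printf("big-endian: %02X %02X %02X %02X\n",
           (unsigned)(value >> 24) & 0xFFu, (unsigned)(value >> 16) & 0xFFu,
           (unsigned)(value >> 8) & 0xFFu,  (unsigned)value & 0xFFu);
    return 0;
}

On a little-endian machine the first line prints 04 03 02 01; on a big-endian one the two lines match.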
For floating point numbers it gets a bit trickier. There is a common standard for floating point numbers, IEEE-754, which describes how floats are encoded. Likewise for floats there are different sizes and variations, but 16, 32 and 64 bits are the usual ones, with 24-bit floats appearing in some mobile-device graphics implementations. There are also extended-precision floats which use 40 or 80 bits.
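To see one such encoding concretely, here is a short C sketch that dumps the 32-bit IEEE-754 pattern of a float (1.111f is chosen only because the same bit pattern comes up in the hex-string question further down):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    float f = 1.111f;   /* rounds to the nearest single-precision value */
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);               /* reinterpret the float's 4 bytes as an integer */
    printf("%f -> 0x%08X\n", f, (unsigned)bits);  /* 0x3F8E353F */

    /* IEEE-754 single precision: 1 sign bit, 8 exponent bits, 23 fraction bits. */
    printf("sign=%u exponent=%u fraction=0x%06X\n",
           (unsigned)(bits >> 31), (unsigned)((bits >> 23) & 0xFFu),
           (unsigned)(bits & 0x7FFFFFu));
    return 0;
}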

Convert large decimal number to hexadecimal notation

When creating a String object in Swift you can use a String Format Specifier to convert an integer to hexadecimal notation.
print(String(format:"%x", 1234))
// output: 4d2
// expected output: 4d2
But when numbers become bigger, the output is not as expected.
print(String(format:"%x", 12345678901234))
// output: 73ce2ff2
// expected output: b3a73ce2ff2
It seems that the output of String(format:"%x", n) is truncated at 8 characters. I don't think in hexadecimal natively, which makes debugging hard. I have seen answers for other programming languages explaining that you need to break up the large integer into parts, but that seems wrong to me.
What am I doing wrong here?
What is the right way to convert decimal numbers to hexadecimal numbers in Swift?
You need to use %lx or %llx
print(String(format:"%lx", 12345678901234))
b3a73ce2ff2
Table 2 on the site you linked specifies them
l -
Length modifier specifying that a following d, o, u, x, or X conversion specifier applies to a long or unsigned long argument.
x is for unsigned 32-bit integers, which only go up to 4,294,967,295.
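Swift's String(format:) follows the C printf conventions, so the same length-modifier rule can be shown in plain C; the sketch below (using the example value from the question) prints the full 64-bit value and, separately, just its low 32 bits, which is what %x was showing:

#include <inttypes.h>
#include <stdio.h>

int main(void) {
    uint64_t n = 12345678901234ULL;

    printf("full value : %" PRIx64 "\n", n);                    /* b3a73ce2ff2 */
    printf("low 32 bits: %x\n", (unsigned)(n & 0xFFFFFFFFu));   /* 73ce2ff2    */
    return 0;
}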

Hexadecimal string with float number to float

I am working in Python 3.6.
I receive the string '3F8E353F' from serial communication. This is the float number 1.111. How can I convert this string to a float?
Thank you
Ah yes. Since this is 32 bits, parse it into an int first, then reinterpret those bytes as a float:
import struct
x = '3F8E353F'
struct.unpack('f', struct.pack('i', int(x, 16)))
On my system this gives:
>>> x='3F8E353F'
>>> struct.unpack('f',struct.pack('i',int(x,16)))
(1.1109999418258667,)
>>>
Very close to the expected value. However, this can give 'backwards' results based on the 'endianness' of bytes in your system. Some systems store their bytes least significant byte first, others most significant byte first. See this reference page for the descriptors to format based on byte order.
I used struct.unpack('f',struct.pack('i',int(x,16))) to convert a hex value to a float, but for negative values I get the error below:
struct.error: argument out of range
To resolve this I used the code below, which converts the hex value 0xc395aa3d to the float -299.33. It works for both positive and negative values.
x = 0xc395aa3d
struct.unpack('f', struct.pack('I', x))   # 'I' (unsigned) accepts values with the high bit set
Another way is to use bytes.fromhex()
import struct
hexstring = '3F8E353F'
struct.unpack('!f', bytes.fromhex(hexstring))[0]
#answer:1.1109999418258667
Note: The form '!' is available for those poor souls who claim they can’t remember whether network byte order is big-endian or little-endian (from struct docs).

(char)NULL, '\0' and 0 all mean the same thing in C's memset?

We are migrating a 32-bit application from RHEL 5.3 to 6.4.
We are getting a warning, "cast from pointer to integer of different size", on the new system's memset.
Do (char)NULL, '\0' and 0 all mean the same thing in C's memset?
The following code gives the warning in the new environment.
#define NULLC (char)NULL
#define MAX_LEN 11
…
memset(process_name, NULLC, MAX_LEN + 1);
strncpy(process_name, "oreo", MAX_LEN);
They do not all mean the same thing, though they're likely to yield the same result.
(char)NULL converts the value of NULL, which is an implementation-defined null pointer constant, to char. The type of NULL may be int, or void*, or some other integer type. If it's of an integer type, the conversion is well defined and yields 0. If it's void*, you're converting a null pointer value to char, which has an implementation-defined result (which is likely, but not guaranteed, to be 0).
The macro NULL is intended to refer to a null pointer value, not a null character, which is a very different thing.
Your macro NULLC is not particularly useful. If you want to refer to a null character, just use the literal constant '\0'. (And NULLC is IMHO too easily confused with NULL.)
The other two constants, '\0' and 0, have exactly the same type (int) and value (zero).
(It's admittedly counterintuitive that '\0' has type int rather than char. It's that way for historical reasons, and it rarely matters. In C++, character constants are of type char, but you asked about C.)
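A small C check of those last two points, assuming a typical implementation where int is 4 bytes:

#include <stdio.h>
#include <string.h>

int main(void) {
    /* In C a character constant has type int, so these two sizes match. */
    printf("sizeof '\\0' = %zu, sizeof(int) = %zu\n", sizeof '\0', sizeof(int));

    /* '\0' and 0 are the same int value, so these two calls fill the buffers identically. */
    char a[8], b[8];
    memset(a, '\0', sizeof a);
    memset(b, 0, sizeof b);
    printf("buffers equal: %d\n", memcmp(a, b, sizeof a) == 0);
    return 0;
}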
They all have the same value 0 but they don't mean the same thing.
(char)NULL - you are casting the value of the NULL pointer to a character with value 0
'\0' - the end-of-string (NUL) character, with value 0
0 - an int (here 32 bits) with value 0.
You are getting a warning because somewhere in your code you're likely using something like:
short somevar = NULL;
or something similar.
0 and '\0' are both the integer 0 (the type of a character literal is int, not char), so they are exactly equivalent. The second argument to memset is an int, from which only the low-order 8 bits will be used.
NULL is a different beast. It is a null pointer constant, which is guaranteed by the standard to compare unequal to any pointer to a real object. The standard does NOT say that this is achieved by giving it the value 0, though it must compare equal to zero. Also, it may be of a different width than int, so passing it as the second argument to memset() might not even compile.
In defining NULLC, you are casting NULL from a native pointer (64 bits, probably defined as (void*)0) down to char (8 bits). If you want to define NULLC, you should just do
#define NULLC 0
and do away with NULL and the (char). The formal argument to memset is int, not char.
0 = zero of the int data type
'\0' = the null character, value 0 (in C this constant actually has type int)
NULL = (void*)0 // the null pointer
See how they are interlinked. GCC often gives warnings for typecasts that are done implicitly by the compiler.
You are using
#define NULLC (char)NULL
.....
memset(process_name, NULLC, MAX_LEN + 1);
equivalent to:
memset(process_name, (char)NULL, MAX_LEN + 1);
equivalent to:
memset(process_name, '\0', MAX_LEN + 1);
Although the end result is the same as passing '\0', the warning comes from the intermediate (char)NULL: casting a pointer constant down to a char is a pointer-to-integer cast of a different size. You can simply ignore the warning, or avoid the cast entirely by writing:
memset(process_name, 0, MAX_LEN + 1);

Convert an integer to a string

I am trying to learn assembler and want to write a function to convert a number to a string. The signature of the function I want to write would look like this in a C-like fashion:
int numToStr(long int num, unsigned int bufLen, char* buf)
The function should return the number of bytes that were used if conversion was successful, and 0 otherwise.
My current approach is a simple algorithm. In all cases, if the buffer is full, return 0.
Check if the number is negative. If it is, write a - char into buf[0] and increment the current place in the buffer
Repeatedly divide by 10 and store the remainders in the buffer, until the division yields 0.
Reverse the number in the buffer.
Is this the best way to do this conversion?
This is pretty much how every single implementation of itoa that I've seen works.
One thing that you don't mention but do want to take care of is bounds checking (i.e. making sure you don't write past bufLen).
With regards to the sign: once you've written the -, you need to negate the value. Also, the - needs to be excluded from the final reversal; an alternative is to remember the sign at the start but only write it at the end (just before the reversal).
One final corner case is to make sure that zero gets written out correctly, i.e. as 0 and not as an empty string.
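Putting those corrections together, here is one possible C sketch of the algorithm with the signature from the question (bounds checking, the sign written up front and excluded from the reversal, and zero handled); it is meant as a reference for the logic rather than a tuned implementation:

int numToStr(long int num, unsigned int bufLen, char *buf) {
    unsigned int i = 0, start = 0;

    /* Work with an unsigned magnitude so the most negative long doesn't overflow on negation. */
    unsigned long mag = (num < 0) ? 0UL - (unsigned long)num : (unsigned long)num;

    if (num < 0) {
        if (i >= bufLen) return 0;
        buf[i++] = '-';
        start = 1;                       /* exclude the sign from the reversal */
    }

    /* Emit digits least significant first; the do-while also writes "0" for zero. */
    do {
        if (i >= bufLen) return 0;
        buf[i++] = (char)('0' + (mag % 10));
        mag /= 10;
    } while (mag != 0);

    /* Reverse the digit portion in place. */
    for (unsigned int lo = start, hi = i - 1; lo < hi; lo++, hi--) {
        char tmp = buf[lo];
        buf[lo] = buf[hi];
        buf[hi] = tmp;
    }

    return (int)i;   /* bytes used; add a terminating '\0' yourself if you need a C string */
}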

Resources