What about the memory addresses of objects in C++?

I am reading some C++ text and got the following code:
In that code, in the main() function, the author uses sizeof() to get the memory address of each object and its member functions. The results:
From those results, the author made a diagram as follows:
There's no other explanation from the author.
What I do not understand is where the numbers 992, 928, 880, 776 come from. And what is the boundary, and why does it occupy 8 bytes?
Thanks a lot.

The hex addresses 12FF40, 12FF00, 12FED0, 12FE68 are equivalent to the decimal numbers 1244992, 1244928, 1244880, and 1244776. The last three digits of these decimal numbers are where the addresses in the diagram come from.
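The conversion is easy to check yourself. Here is a minimal Python sketch of it; the variable names are mine, and only the four hex addresses come from the question.

```python
# Hex addresses quoted in the answer above.
hex_addresses = ["12FF40", "12FF00", "12FED0", "12FE68"]

# Convert each address to decimal, then keep only the last three decimal digits.
decimal_addresses = [int(h, 16) for h in hex_addresses]
diagram_labels = [d % 1000 for d in decimal_addresses]

for h, d, label in zip(hex_addresses, decimal_addresses, diagram_labels):
    print(f"0x{h} = {d} -> diagram label {label}")
```

The labels come out as 992, 928, 880 and 776, which matches the numbers asked about.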
Not sure, but the boundary probably represents padding that causes the object layout to fit onto word boundaries.

Related

How does the instruction pointer work with a stack?

I recently read about a programming language called Befunge (esoteric programming language). It deals with operations with a stack. The Wikipedia article contains this text:
The instruction pointer begins at a set location (the upper-left corner of the playfield) and is initially travelling in a set direction (right). As it encounters instructions, they are executed. Befunge-93 has no jumps farther than two cells, so control flow is done by altering the direction of the program counter, sending it to different literal code paths. (Befunge-98 has far jumps, but because of inertia in the community, these are rarely used.) The following, for example, is an infinite loop:
>v
^<
I don't understand what "instruction pointer" means. Can somebody explain it to me in simpler terms?
This is what the linked Wikipedia page says right at the beginning. Emphasis is mine.
A Befunge program is laid out on a two-dimensional playfield of fixed size. The playfield is a rectangular grid of ASCII characters, each generally representing an instruction. The playfield is initially loaded with the program.
Execution proceeds by the means of a program counter (-93) or instruction pointer (-98). This points to a grid cell on the playfield.
Both terms, program counter and instruction pointer, mean the same thing: an abstract "variable" that points to the next instruction to execute. You can literally use your index finger as the pointer and read the next character of the program (grid cell), which encodes the instruction.
In this case the instruction pointer needs to hold at least two indexes, one for the row and one for the column of the current cell. One would also add the current direction to be able to advance.
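To make that concrete, here is a rough Python sketch of such a pointer: a row, a column and a direction. This is not a real Befunge interpreter. The function name trace is made up, only the four arrow instructions are handled, and wrapping at the edges is an assumption of the sketch.

```python
# Hypothetical mini-interpreter: only the four direction instructions are handled.
DIRECTIONS = {">": (0, 1), "<": (0, -1), "^": (-1, 0), "v": (1, 0)}

def trace(playfield, steps):
    """Return the (row, col) cells visited by the instruction pointer."""
    row, col = 0, 0          # the pointer starts in the upper-left corner
    d_row, d_col = 0, 1      # and initially travels to the right
    visited = []
    for _ in range(steps):
        visited.append((row, col))
        instruction = playfield[row][col]
        if instruction in DIRECTIONS:
            d_row, d_col = DIRECTIONS[instruction]
        # Advance one cell, wrapping around the edges of the playfield.
        row = (row + d_row) % len(playfield)
        col = (col + d_col) % len(playfield[0])
    return visited

loop = [">v",
        "^<"]
print(trace(loop, 8))
```

With the infinite-loop example from the quote, the pointer keeps cycling through the same four cells.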

How to use leading_zeros/trailing_zeros in a platform-independent way?

I want to find the first non-zero bit in the binary representation of a u32. leading_zeros/trailing_zeros look like what I want:
let x: u32 = 0b01000;
println!("{}", x.trailing_zeros());
This prints 3 as expected and described in the docs. What will happen on big-endian machines, will it be 3 or some other number?
The documentation says
Returns the number of trailing zeros in the binary representation
is it related to machine binary representation (so the result of trailing_zeros depends on architecture) or base-2 numeral system (so result will be always 3)?
The type u32 represents binary numbers with 32 bits as an abstract concept. You can imagine them as abstract, mathematical numbers in the range from 0 to 2^32 - 1. The binary representation of these numbers is written in the usual convention of starting with the most significant bit (MSB) and ending with the least significant bit (LSB), and the trailing_zeros() method returns the number of trailing zeros in that representation.
Endianness only comes into play when serializing such an integer to bytes, e.g. for writing it to a bytes buffer, a file or the network. You are not doing any of this in your code, so it doesn't matter here.
As mentioned above, writing a number starting with the MSB is also just a convention, but this convention is pretty much universal today for numbers written in positional notation. For programming, this convention is only relevant when formatting a number for display, parsing a number from a string, and maybe for naming methods like trailing_zeros(). When storing a u32 in a register, the bits don't have any defined order.
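The same distinction can be shown outside Rust. Below is a small Python sketch, not the Rust implementation: trailing_zeros_u32 is a hypothetical helper that works on the numeric value only, while to_bytes is where byte order shows up.

```python
def trailing_zeros_u32(x):
    """Count trailing zero bits of a 32-bit value; returns 32 for zero."""
    x &= 0xFFFFFFFF
    if x == 0:
        return 32
    # x & -x isolates the lowest set bit; its bit_length is that bit's index + 1.
    return (x & -x).bit_length() - 1

x = 0b01000

# Serializing the same value gives different byte sequences per byte order...
little = x.to_bytes(4, "little")
big = x.to_bytes(4, "big")

# ...but the count is computed from the value, so it does not change.
print(trailing_zeros_u32(x))
print(little.hex(), big.hex())
```

Both byte sequences decode back to the same number, so the count of trailing zeros is 3 either way.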

Fortran 77: output floats with variable widths

I need to output lots of (>20 million) float values to a text file from a Fortran 77 program. I'd like to keep the output file as small as possible. Therefore I would like to output the floats in a compact way, without resorting to binary.
I know the precision I need (usually two digits right of the decimal point), so in C I would use printf("%.2f %.2f", val1, val2); Is something like this possible in Fortran 77? All I found was that I have to set the field width explicitly (like in format (f8.2,x,f8.2)). This wastes lots of space, when I don't know the range of the output numbers beforehand.
If it is not possible in Fortran 77, do newer Fortran standards offer a way to do this?
The Fortran 2008 standard allows an edit descriptor such as f0.2, which makes the output use the smallest field width that holds the whole part of the number, a decimal point and two fractional digits. I think this has been part of the language standard since Fortran 95.
If you have a number X >= 1, then INT(LOG10(X))+1 is the size of the integer part of your number (number of digits of the integer part). So, you just have to make some custom FORMAT labels for each of the values you want to print.
It is not very elegant, but I think it will help you achieve what you want.
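For comparison, here is a Python sketch of the two ideas from the answers above (not Fortran): fixed-width versus minimal-width output, and the digit count from the LOG10 formula. The sample values and the helper name integer_digits are made up for illustration, and the helper assumes x >= 1.

```python
import math

def integer_digits(x):
    """Digits in the integer part of x, assuming x >= 1 (mirrors INT(LOG10(X))+1)."""
    return int(math.log10(x)) + 1

values = [3.14159, 42.0, 9894.36]

# Fixed width of 8 per value, comparable to the f8.2 edit descriptor.
fixed = " ".join(f"{v:8.2f}" for v in values)
# Minimal width, comparable to C's %.2f or the f0.2 edit descriptor.
compact = " ".join(f"{v:.2f}" for v in values)

print(repr(fixed), len(fixed))
print(repr(compact), len(compact))
print([integer_digits(v) for v in values])
```

For these three values the fixed-width line is 26 characters and the compact one is 18, which is the kind of saving the question is after.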
I know this might come across as pedantic and unhelpful, but hear me out. It sounds like you are doing bad science. If your instrument is spitting out numbers from 1000.00 to 0.01, then your instrument is probably only accurate to one part in a hundred. So the number 9894.36 ought to be rounded to 9900 (no decimal point). All the other digits are not significant. Why is that relevant and helpful? Because you are wasting storage space if you are storing 9894.36. So, the answer is to use the g edit descriptor, which outputs in scientific notation. Then all of your numbers will take up the same space.

Word, Doubleword, Quadword

This is my second question in a row, and it is also about assembly (x86, 32-bit).
"Programming from the Ground Up" says that 4 bytes are 32 bits and that this is a word.
But Intel's "Basic Architecture" guide says that a word is 16 bits (2 bytes) and that 4 bytes are a doubleword.
Memory uses 4-byte words: to get to the next word I have to skip 4 bytes, and within each word I can use 4 offsets (0-3) to read a byte. That seems to conflict with Intel's naming, yet this memory definition also comes from Intel, so what is wrong here?
And how do I operate on words, doublewords and quadwords in assembly? How do I define a number as a quadword?
To answer your first question, the processor word size is a function of the architecture. Thus, a 32-bit processor has a 32-bit word. For data types in software, including assembly, the size usually needs to be identified unambiguously, so for historical reasons the word type is 16 bits. So both sources are probably correct if you read them in context: the first one is referring to the processor word, while the Intel guide is referring to the word type.
We've got different "word"s: program words, memory words, OS-specific words, architecture-specific words (program space word, flash word, eeprom word), even address words.
It's just a matter of convention what size the word "word" refers to.
I usually find the size of the word by looking at the number of hex digits the context is using to show them. Intel's most common type, 4 digits (0x0000), is two bytes.
And for further information, even byte is a convention. In many systems in the past bytes have been 7 or 9 bits. Most architectures nowadays have 8-bit bytes. The correct name for an always-8-bit structure is an octet.
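If it helps to see the Intel naming as concrete sizes, here is a short Python sketch using the struct module. It is not assembly. The mapping of names to format codes is my own, chosen to match the 16-bit word convention of the Intel guide.

```python
import struct

# Intel naming for x86 data sizes, mapped to struct format codes.
# "<" forces standard sizes regardless of the host platform.
INTEL_TYPES = {"byte": "<B", "word": "<H", "doubleword": "<I", "quadword": "<Q"}

sizes = {name: struct.calcsize(fmt) for name, fmt in INTEL_TYPES.items()}
print(sizes)

# A quadword value packed as 8 little-endian bytes, the byte order x86 uses.
packed = struct.pack("<Q", 0x1122334455667788)
print(packed.hex())
```

The sizes come out as 1, 2, 4 and 8 bytes, so a quadword is simply four 16-bit words.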

What defines data that can be stored in strings

A few days ago, I asked why it's not possible to store binary data, such as a jpg file, in a string variable.
Most of the answers I got said that string is used for textual information such as what I'm writing now.
What is considered textual data though? Bytes of a certain nature represent a jpg file and those bytes could be represented by character byte values...I think. So when we say strings are for textual information, is there some sort of range or list of characters that aren't stored?
Sorry if the question sounds silly. Just trying to 'get it'
I see several major problems with storing binary data in strings:
Most systems assume a certain encoding within string variables - e.g. if it's a UTF-8, UTF-16 or ASCII string. New line characters may also be translated depending on your system.
You should watch out for restrictions on the size of strings.
If you use C style strings, every null character in your data will terminate the string and any string operations performed will only work on the bytes up to the first null.
Perhaps the most important: it's confusing - other developers don't expect to find random binary data in string variables. And a lot of code which works on strings might also get really confused when encountering binary data :)
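The null-character problem can be seen without writing any C. Here is a minimal Python sketch using ctypes; the payload bytes are made up for illustration.

```python
import ctypes

# Hypothetical payload: arbitrary binary data with a zero byte in the middle.
payload = b"\xff\xd8\x00\x10JFIF"

buffer = ctypes.create_string_buffer(payload)

as_c_string = buffer.value   # read as a NUL-terminated C string
as_raw_bytes = buffer.raw    # read as a plain block of memory

print(as_c_string)
print(len(payload), len(as_raw_bytes))
```

Read as a C string, only the two bytes before the first zero byte survive. The raw view still holds the whole payload, plus the terminator that ctypes appends.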
I would prefer to store binary data as binary. Only convert it to text when there's no other choice, because the textual representation wastes some bytes (not many, but it still counts). That's how attachments are put in email.
Base64 is a good textual representation of binary files.
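Here is a minimal Python sketch of that round trip, using the standard base64 module. The payload is a made-up stand-in for a real file.

```python
import base64

# Hypothetical binary payload: every possible byte value once.
binary = bytes(range(256))

# Encode to plain ASCII text, then decode back to the original bytes.
text = base64.b64encode(binary).decode("ascii")
restored = base64.b64decode(text)

print(len(binary), len(text))
print(restored == binary)
```

The text form needs 4 characters for every 3 bytes of input, so 256 bytes become 344 characters. That is the wasted space mentioned above.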
I think you are referring to the binary-to-text encoding issue (translating a jpg into a string would require that sort of pre-processing).
Indeed, in that article, some characters are mentioned as not always supported, and others can be confusing:
Some systems have a more limited character set they can handle; not only are they not 8-bit clean, some can't even handle every printable ASCII character.
Others have limits on the number of characters that may appear between line breaks.
Still others add headers or trailers to the text.
And a few poorly-regarded but still-used protocols use in-band signaling, causing confusion if specific patterns appear in the message. The best-known is the string "From " (including trailing space) at the beginning of a line used to separate mail messages in the mbox file format.
Whoever told you you can't put 'binary' data into a string was wrong. A string simply represents an array of bytes that you most likely plan on using for textual data... but there is nothing stopping you from putting any data in there you want.
I do have to be careful though, because I don't know what language you are using... and in some languages \0 ends the string.
In C#, you can put any data into a string... example:
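byte[] myJpegByteArray = GetBytesFromSomeImage();
// Caution: Encoding.ASCII replaces every byte above 0x7F with '?', so this conversion is lossy for JPEG data.
string myString = Encoding.ASCII.GetString(myJpegByteArray);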
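One caveat with going through an ASCII decoder: ASCII only covers byte values 0 to 127, so the rest cannot survive the trip. The sketch below is a Python analogue, not the C# code above. The sample bytes are made up, and Latin-1 is shown only as a contrast because it maps all 256 byte values.

```python
# Hypothetical sample: two ASCII bytes followed by two bytes above 0x7F.
data = bytes([0x4A, 0x46, 0xFF, 0xD8])

# ASCII only defines values 0-127; anything above must be replaced or rejected.
lossy = data.decode("ascii", errors="replace")
round_trip_ascii = lossy.encode("ascii", errors="replace")

# Latin-1 maps each byte value 0-255 to exactly one code point, so it round-trips.
lossless = data.decode("latin-1")
round_trip_latin1 = lossless.encode("latin-1")

print(round_trip_ascii)
print(round_trip_latin1 == data)
```

After the ASCII round trip the two high bytes have turned into question marks, so the original data cannot be recovered.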
Before internationalization, it didn't make much difference. ASCII characters are all bytes, so strings, character arrays and byte arrays ended up having the same implementation.
These days, though, strings are a lot more complicated, in order to deal with thousands of foreign language characters and the linguistic rules that go with them.
Sure, if you look deep enough, everything is just bits and bytes, but there's a world of difference in how the computer interprets them. The rules for "text" make things look right when it's displayed to a human, but the computer is free to monkey with the internal representation. For example,
In Unicode, there are many encoding systems. Changing between them makes every byte different.
Some languages have multiple characters that are linguistically equivalent. These could switch back and forth when you least expect it.
There are different ways to end a line of text. Unintended translations between CRLF and LF will break a binary file.
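Each of these three effects is easy to reproduce. Here is a small Python sketch; the sample text is made up, and the binary sample is the 8-byte PNG file signature, which happens to contain both kinds of line ending.

```python
import unicodedata

text = "\u00e9\n"   # "e" with an acute accent, followed by a newline

# Same text, two encodings, different bytes.
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

# Linguistically equivalent forms of the same character.
composed = unicodedata.normalize("NFC", "\u00e9")     # one code point
decomposed = unicodedata.normalize("NFD", "\u00e9")   # "e" + combining accent

# A newline translation applied to binary data changes its length and content.
binary = b"\x89PNG\r\n\x1a\n"
translated = binary.replace(b"\r\n", b"\n")

print(utf8, utf16)
print(len(composed), len(decomposed))
print(len(binary), len(translated))
```

None of these changes matter to a human reading the text, but each of them alters the underlying bytes.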
Deep down everything is just bytes.
Things like strings and pictures are defined by rules about how to order bytes.
C-style strings, for example, end with a byte of value 0.
JPEGs don't.
Depends on the language. For example, in Python 2 the string type (str) is really a byte array, so it can indeed be used for binary data; Python 3 separates text (str) from binary data (bytes).
In C the null byte is used for string termination, so a string cannot be used for arbitrary binary data, since binary data could contain null bytes.
In C# a string is an array of chars, and since a char is basically an alias for a 16-bit integer, you can probably get away with storing arbitrary binary data in a string. You might get errors when you try to display the string (because some values might not actually correspond to a legal Unicode character), and some operations like case conversions will probably fail in strange ways.
In short, it might be possible in some languages to store arbitrary binary data in strings, but they are not designed for this use, and you may run into all kinds of unforeseen trouble. Most languages have a byte-array type for storing arbitrary binary data.
I agree with Jacobus' answer:
In the end all data structures are made up of bytes. (Well, if you go even deeper: of bits). With some abstraction, you could say that a string or a byte array are conventions for programmers, on how to access them.
In this regard, the string is an abstraction for data interpreted as text. Text was invented for communication among humans; computers and programs do not communicate very well using text. SQL is textual, but it is an interface for humans to tell a database what to do.
So in general, textual data, and therefore strings, are primarily for human-to-human or human-to-machine interaction (say, for the content of a message box). Using them for something else (e.g. reading or writing binary image data) is possible, but carries a lot of risk because you are using the data type for something it was not designed to handle. This makes it much more error-prone. You may be able to store binary data in strings, but just because you are able to shoot yourself in the foot does not mean you should.
Summary: You can do it, but you'd better not.
Your original question (c# - What is string really good for?) made very little sense. So the answers didn't make sense, either.
Your original question said "For some reason though, when I write this string out to a file, it doesn't open." Which doesn't really mean much.
Your original question was incomplete, and the answers were misleading and confusing. You CAN store anything in a String. Period. The "strings are for text" answers were there because you didn't provide enough information in your question to determine what's going wrong with your particular bit of C# code.
You didn't provide a code snippet or an error message. That's why it's hard to 'get it' -- you're not providing enough details for us to know what you don't get.

Resources