Type-Length-Value vs. defined/structured Length-Value - protocols

There's no doubt that a length-value representation of data is useful, but what advantages are there to type-length-value over it?
Of course, using LV requires the representation to be predefined or structured, but that's rarely a problem. In fact, I can't think of a realistic case where the representation wouldn't be well enough defined, such that TLV would be required.
In my case, this is about data interchange/protocols. In every situation, the representation must be known to both parties before the data can be processed, which eliminates the need for the type to be explicitly embedded in the data. Any thoughts on when the type would be useful or necessary?
Edit
I should mention that a generic parser/processor would certainly benefit from the type information, but that's not my case.

The only decent reason I could come up with is for a generic processor of the data, mainly for debugging or direct user presentation. Having the type embedded within the data would allow the processor to handle the data correctly without having to predefine all possible structures.

The point below is mentioned in the Wikipedia article on TLV:
New message elements which are received at an older node can be safely skipped and the rest of the message can be parsed. This is similar to the way that unknown XML tags can be safely skipped.
Example:
Imagine a message to make a telephone call. In a first version of a system this might use two message elements, a "command" and a "phoneNumberToCall":
command_c/4/makeCall_c/phoneNumberToCall_c/8/"722-4246"
Here command_c, makeCall_c and phoneNumberToCall_c are integer constants and 4 and 8 are the lengths of the "value" fields, respectively.
Later (in version 2) a new field containing the calling number could be added:
command_c/4/makeCall_c/callingNumber_c/8/"715-9719"/phoneNumberToCall_c/8/"722-4246"
A version 1 system which received a message from a version 2 system would first read the command_c element and then read an element of type callingNumber_c.
The version 1 system does not understand callingNumber_c, so it reads the length field (the first 8) and skips forward 8 bytes to reach phoneNumberToCall_c, which it does understand, and message parsing carries on.
Without the type field, the version 1 parser would not know to skip callingNumber_c; instead it would call the wrong number and perhaps throw an error on the rest of the message. So the type field allows forward compatibility in a way that omitting it does not.
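The walkthrough above can be sketched as a toy parser. This is a minimal Python illustration, not any real wire format: the one-byte tag constants and one-byte lengths are hypothetical stand-ins for command_c, callingNumber_c, and phoneNumberToCall_c.

```python
# Hypothetical one-byte tag constants standing in for the *_c values above.
COMMAND_C, CALLING_NUMBER_C, PHONE_NUMBER_C = 1, 2, 3
MAKE_CALL_C = b"call"

def encode(elements):
    """Encode (tag, value) pairs as 1-byte tag, 1-byte length, value."""
    out = b""
    for tag, value in elements:
        out += bytes([tag, len(value)]) + value
    return out

def decode_v1(data):
    """A 'version 1' parser: it knows only COMMAND_C and PHONE_NUMBER_C,
    and uses the length field to skip any element with an unknown tag."""
    known = {}
    i = 0
    while i < len(data):
        tag, length = data[i], data[i + 1]
        value = data[i + 2 : i + 2 + length]
        if tag in (COMMAND_C, PHONE_NUMBER_C):
            known[tag] = value
        # Unknown tags (e.g. CALLING_NUMBER_C) are skipped safely here.
        i += 2 + length
    return known

# A "version 2" message that includes the newer callingNumber_c element:
msg = encode([
    (COMMAND_C, MAKE_CALL_C),
    (CALLING_NUMBER_C, b"715-9719"),
    (PHONE_NUMBER_C, b"722-4246"),
])
parsed = decode_v1(msg)
```

Dropping the tag byte from encode would save a byte per element, but decode_v1 would then have no way to recognize, let alone skip, the version 2 callingNumber_c element.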


Why can't a structure have more than one property of type "text"?

This doesn't seem right. Why can't a structure have more than one property per type?
The IDE error message is valid.
Due to the design of the Bixby platform (modeling and action planning require unique concept types), a structure can have at most one concept of each type. (The concept can be max (Many) for an array.)
One general rule is to name each of your concepts and not use any core base type directly. It might seem unnecessary at first, but it will soon start making sense and make things easier for complex capsules.
To fix the above error, create a Text type BixbyUserId and replace the property with:
property (bixbyuserid) {
  type (BixbyUserId)
  min (Optional) max (One)
}

Is there a way to use the namelist I/O feature to read in a derived type with allocatable components?

The only thing I've been able to find about it is https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/269585 which ended on a fairly unhelpful note.
Edit:
I have user-defined derived types that need to be filled with information from an input file, so I'm trying to find a convenient way of doing that. Namelist seems like a good route because it is so succinct (basically two lines): one to declare the namelist and one namelist read. Namelist also seems like a good choice because the text file forces you to show very clearly where each value goes, which I find highly preferable to a bare list of values whose exact order only the compiler knows. The latter makes it much more work for me or anyone else to tell which value corresponds to which variable, and much more work to keep clean when, inevitably, a new value is needed.
I'm trying to do something of the basic form:
! where myType_T is a type that has at least one allocatable array in it
type(myType_T) :: thing
namelist /nmlThing/ thing

open(1, file="input.txt")
read(1, nml=nmlThing)
I may be misunderstanding user-defined I/O procedures, but they don't seem to be a very generic solution. It seems like I would need to write a new one any time I need to do this action, and they don't seem to natively support the
&nmlThing
thing%name = "thing1"
thing%siblings(1) = "thing2"
thing%siblings(2) = "thing3"
thing%siblings(3) = "thing4"
!siblings is an allocatable array
/
syntax that I find desirable.
There are a few solutions I've found to this problem, but none seem to be very succinct or elegant. Currently, I have a dummy user-defined type whose arrays are oversized instead of allocatable, and then I write a function to copy the information from the dummy, namelist-friendly type to the type containing the allocatable fields. It works just fine, but it is ugly, and I'm up to about four places where I need to do this same kind of operation in the code.
Hence trying to find a good solution.
If you want to use allocatable components, then you need to have an accessible generic interface for a user defined derived type input/output procedure (typically by the type having a generic binding for such a procedure). You link to a thread with an example with such a procedure.
Once invoked, that user defined derived type input/output procedure is then responsible for reading and writing the data. That can include invoking namelist input/output on the components of the derived type.
Fortran 2003 also offers derived types with length parameters. These may offer a solution without the need for a user defined derived type input/output procedure. However, use of derived types with length parameters, in combination with namelist, will put you firmly in the "highly experimental" category with respect to the current compiler implementation.

Given ShortString is deprecated, what is the best method for limiting string length

It's not uncommon to have a record like this:
TAddress = record
  Address: string[50];
  City   : string[20];
  State  : string[2];
  ZIP    : string[5];
end;
Where it was nice to have hard-coded string sizes to ensure that the size of the string wouldn't exceed the database field size allotted for the data.
Given, however, that the ShortString type has been deprecated, what are Delphi developers doing to "solve" this problem? Declaring the record fields as string gets the job done, but doesn't protect the data from exceeding the proper length.
What is the best solution here?
If I had to keep data from exceeding the right length, I'd let the database code handle it as much as possible. Put size limits on the fields, and display data to the user in data-bound controls. A TDBEdit bound to a string field will enforce the length limit correctly. Set it up so the record gets populated directly from the dataset, and it will always have the right length.
Then all you need to worry about is data coming into the record from some outside source that is not part of your UI. For that, use the same process. Have the import code insert the data into the dataset, and let its length constraints do the validation for you. If it raises an exception, reject the import. If not, then you've got a valid dataset row that you can use to populate a record from.
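As a language-agnostic sketch of that "let the database validate" approach, here is a minimal Python/SQLite example (the table and column names are hypothetical; a CHECK constraint stands in for the dataset's field size limits):

```python
import sqlite3

# Hypothetical schema: a CHECK constraint enforces the 20-character limit
# at the database layer, so an over-length insert fails loudly instead of
# being silently truncated somewhere in application code.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE address (city TEXT CHECK (length(city) <= 20))")

conn.execute("INSERT INTO address VALUES (?)", ("Springfield",))  # fits

try:
    conn.execute("INSERT INTO address VALUES (?)", ("A" * 30,))  # too long
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the import code can reject this row here

row_count = conn.execute("SELECT count(*) FROM address").fetchone()[0]
```

The same idea applies to a TDBEdit bound to a sized field: the constraint lives in one place, at the data layer, and every path into the record goes through it.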
The short string types in your question don't really protect the strings from exceeding the proper length. When you assign a longer value to these short strings, the value is silently truncated.
I'm not sure what database access method you are using but I rather imagine that it will do the same thing. Namely truncate any over-length strings to the maximum length. In which case there is nothing to do.
If your database access method throws an error when you give it an over long string then you would need to truncate before passing the value to the database.
If you have to truncate explicitly, then there are lots of places where you might choose to do so. My philosophy would be to truncate at the last possible moment. That's the point at which you are subject to the limit. Truncating anywhere else seems wrong. It means that a database limitation is spreading to parts of the code that are not obviously related to the database.
Of course, all this is based on the assumption that you want to carry on silently truncating. If you want to do provide user feedback in the event of truncation, then you will need to decide just where are the right points to action that feedback.
From my understanding, my answer should be "do not mix layers".
I suspect that the string length is specified at the database layer level (a column width), or at the business application layer (e.g. to validate a card number).
From the "pure Delphi code" point of view, you should not know that your string variable has a maximum length, unless you reach the persistence layer or even the business layer.
Using attributes could be an idea. But it may "pollute" the source code for the very same reason that it is mixing layers.
So what I recommend is to use a dedicated Data Modeling, in which you specify your data expectations. Then, at the Delphi var level, you just define a plain string. This is exactly how our mORMot framework implements data filtering and validation: at Model level, with some dedicated classes - convenient, extendable and clean.
If you're just porting from Delphi 7 to XE3, leave it be. Also, although ShortString may be deprecated, I'll eat my hat if they ever remove it completely, because there are a lot of bits of code that could never be rebuilt without it. ShortString + records is still the only practical way to specify a byte-oriented file-of-record data storage. Delphi will never remove ShortString nor change its behaviour; doing so would be devastating to existing Delphi code. So if you really must define records and limit their length, and you really don't want those records to support Unicode, then there is zero reason to stop using or writing ShortString code. That being said, I detest short strings and file-of-record, wish they would go away, and am glad they are marked deprecated.
That being said, I agree with mason and David entirely; I would say that length checking and validation are presentation/validation concerns, and Delphi's strong typing is not the right place or the right way to deal with them. If you need to put validation constraints on your classes, write helper classes that implement constraint storage (EmployeeName is a string field, and EmployeeName has the following length limit). In edit controls, for example, this is already a property. It seems to me that mapping DB fields to visual fields using the new binding system would be much preferable to trying to express constraints statically in the code.
User input validation and storage are different and length limits should be set in your GUI controls not in your data structures.
You could for example use Array of UnicodeChar, if you wanted to have a Unicode wide but length limited string. You could even write your own LimitedString class using the new class helper methods in Delphi. But such approaches are not a maintainable and stable design.
If your SQL database has a field declared with VARCHAR(100) type, and you want to limit your user's input to 100 characters, you should do so at the GUI layer and forget about imposing truncation (data corruption, in fact) silently behind the scenes.
I had this problem, severely, upgrading from Delphi 6 to 2009: for what one program was (and is) doing, it was imperative to be able to treat the old ASCII strings as individual ASCII characters.
The program outputs ASCII files (not even ANSI) and has concepts such as an over-punch on the last numeric digit to indicate a negative value, so the file format goes back a bit, one could say!
After the first build in 2009 (ten-year-old code; well, you do, don't you!), and after sorting out unit names etc., there were literally hundreds of reported errors: illegal assignments and data loss/conversion warnings...
No matter how good Delphi's back-room manipulation/magic with strings and chars, I did not trust it enough. In the end, to make sure everything was back as it was, I re-declared them all as array of byte and then changed the code accordingly.
You haven't specified the Delphi version; here's what works for me in Delphi 2010:
Version1:
TTestRecordProp = record
private
  FField20: string;
  ...
  FFieldN: string;
  procedure SetField20(const Value: string);
public
  property Field20: string read FField20 write SetField20;
  ...
  property FieldN: string ...
end;
...
procedure TTestRecordProp.SetField20(const Value: string);
begin
  if Length(Value) > 20 then
    // ...or maybe raise an exception?
    FField20 := Copy(Value, 1, 20)
  else
    FField20 := Value;
end;
Version2:
TTestRecordEnsureLengths = record
  Field20: string;
  procedure EnsureLengths;
end;
...
procedure TTestRecordEnsureLengths.EnsureLengths;
begin
  // for each string field, test its length and truncate or raise an exception
  if Length(Field20) > 20 then
    Field20 := Copy(Field20, 1, 20); // or raise an exception...
end;
// You have to call .EnsureLengths before pushing data to the db...
Personally, I'd recommend replacing records with objects, then you can do more tricks.

libspotify: What can I and can't I do with image IDs?

Various libspotify API functions deal with image IDs:
These all return an image ID as a const byte*:
sp_album_cover
sp_artist_portrait
sp_artistbrowse_portrait
sp_image_image_id
sp_image_create takes an image ID parameter as const byte[20], while sp_playlist_get_image takes an image ID parameter as byte[20] and fills it with an image ID value.
In this question, a Spotify employee says both that the content of the image ID is opaque and that the size of 20 is not necessarily an accurate length for the image ID: libspotify API: image ID format?
sp_image_create takes 20 bytes long image_id parameter. Does that mean the maximum length of image id is 20 bytes?
No. sp_subscribers is another example of where we've put a fake number in for the compiler. The contents of an image id pointer are opaque and likely to change between releases. Don't write code that makes assumptions about them, because it'll break.
However, in order to use sp_playlist_get_image, the caller needs to allocate the array to store the image ID. This seems to be inconsistent advice, or is at least surprising. Which of the following is true?
Interpretation A: Image IDs will always be exactly 20 bytes.
Interpretation B: Image IDs might be of any length up to 20 bytes.
Interpretation C: Image IDs might be of any length, but the image IDs returned by sp_playlist_get_image are guaranteed to be no more than 20 bytes.
Interpretation D: Image IDs might be of any length and sp_playlist_get_image cannot be used safely at all.
I think the answer to the linked question rules out A and probably B, so I think the answer is probably C, as frustrating as that is. An utter pessimist might go with D.
I'm interested because I'm trying to write a safer and higher-level .NET wrapper than the existing libspotify.net, and I'm unsure how to present image IDs in managed code. I think the only thing for it is to have two alternative implementations - one with a 20 byte buffer that represents an image ID returned from sp_playlist_get_image, and one with an IntPtr that represents an image ID returned from anything else. If the library made sufficient guarantees about the size and nature of an image ID I could always use my own buffer and copy into it when necessary, but I fear it's looking unlikely that libspotify makes guarantees anywhere near strong enough to allow this.
For the current release of libSpotify, Interpretation C is correct for that specific call. Since it takes byte[20], the function guarantees that if you allocate 20 bytes, you'll always have enough for the playlist's image ID. If that guarantee changes in the future the function signature will be changed, assuming we haven't made it work like everything else by then.
Your hybrid solution actually sounds the best for now, considering the state of that API. Using IntPtr where you can will be a lot more future-proof when that nasty sp_playlist_get_image goes away.
I hope your project goes well — we've been wanting a decent .NET wrapper for ages but have never had the time to do the whole thing ourselves. If it's open source, I'll gladly contribute.

Misuse of a variable's value?

I came across an instance where a solution to a particular problem was to use a variable whose value when zero or above meant the system would use that value in a calculation but when less than zero would indicate that the value should not be used at all.
My initial thought was that I didn't like the multipurpose use of the value of the variable: a.) as a range to be using in a formula; b.) as a form of control logic.
What is this kind of misuse of a variable called? Meta-'something' or is there a classic antipattern that this fits?
Sort of feels like when a database field is set to null to represent not using a value and if it's not null then use the value in that field.
Update:
An example would be: if a variable's value is > 0, I use the value; if it's <= 0, I don't use the value and instead perform some other logic.
Values such as these are often called "distinguished values". By far the most common distinguished value is null for reference types. A close second is the use of distinguished values to indicate unusual conditions (e.g. error return codes or search failures).
The problem with distinguished values is that all client code must be aware of the existence of such values and their associated semantics. In practical terms, this usually means that some kind of conditional logic must be wrapped around each call site that obtains such a value. It is far too easy to forget to add that logic, obtaining incorrect results. It also promotes copy-and-paste code as the boilerplate code required to deal with the distinguished values is often very similar throughout the application but difficult to encapsulate.
Common alternatives to the use of distinguished values are exceptions, or distinctly typed values that cannot be accidentally confused with one another (e.g. Maybe or Option types).
Having said all that, distinguished values may still play a valuable role in environments with extremely tight memory availability or other stringent performance constraints.
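To make the contrast concrete, here is a small sketch in Python (the discount function is hypothetical): the sentinel version buries the "no value" convention in a docstring, while the Option-style version surfaces it in the signature.

```python
from typing import Optional

def discount_sentinel(price: float, rate: float) -> float:
    """rate >= 0 means 'apply this rate'; a negative rate is a distinguished
    value meaning 'no discount'. Every caller must know this convention."""
    if rate < 0:
        return price
    return price * (1 - rate)

def discount_optional(price: float, rate: Optional[float]) -> float:
    """None makes the 'absent' case explicit in the type, so it cannot be
    accidentally confused with a real (if unusual) rate."""
    if rate is None:
        return price
    return price * (1 - rate)
```

With the Optional signature, a type checker can flag call sites that forget to handle the None case, whereas nothing distinguishes -1 from a legitimate float.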
I don't think what you're describing is a pure magic number, but it's close. It's similar to the situation before .NET 2.0, where you'd use Int32.MinValue to indicate a null value. .NET 2.0 introduced Nullable and largely alleviated this issue.
So you're describing the use of a variable whose value really means something other than its value; -1 means essentially the same as the use of Int32.MinValue described above.
I'd call it a magic number.
Hope this helps.
Using different ranges of a variable's possible values to invoke different functionality was very common when RAM and disk space for data and program code were scarce. Nowadays, you would use a function or an additional, accompanying value (a boolean, or an enumeration) to determine the action to take.
Current OSs suggest 1 GiB of RAM to operate correctly, when 256 KiB was high-end very few years ago. Cheap disk space has gone from hundreds of MiB to multiples of TiB in a matter of months. Not too long ago I wrote programs for 640 KiB of RAM and 10 MiB of disk, and you would probably hate them.
I think it would be reasonable to cope with code like that if it's a few years old (refactor it!), and to denounce it as bad practice if it's recent.
