Core Data - Optional attributes and performance

Per the Core Data Programming Guide:
You can specify that an attribute is optional—that is, it is not required to have a value. In general, however, you are discouraged from doing so—especially for numeric values (typically you can get better results using a mandatory attribute with a default value—in the model—of 0). The reason for this is that SQL has special comparison behavior for NULL that is unlike Objective-C's nil. NULL in a database is not the same as 0, and searches for 0 will not match columns with NULL.
I have always made numeric values non-optional, but have not for dates and strings. It is convenient in my code to base logic on dates and/or strings being nil.
Based on the above recommendations, I am considering making everything in my database non-optional. For dates I could set the model default to a value of 0, and for strings a model default of an empty string (""). Then, in my code, I could test dates with [date timeIntervalSince1970] != 0 and strings with string.length != 0.
The question is, for a relatively small database, does this really matter from a Core Data performance standpoint? And what is the tradeoff if the attribute in question will never be directly queried via a predicate?

I have not seen any performance problems on small to medium-sized data sets. I suspect this is something you would deal with during the performance-tuning stage of your application.
Personally, I use the same approach: non-numeric attributes are optional where it makes sense, because it does indeed make the code easier, which in turn gives me more time to optimize later.


Should equals() method of value objects strive to return true by using transformations of attributes?

Assume we have a value object Duration (with attributes numberOfUnits and unit). Would it be a good idea to treat these objects as equal (for example, overriding Object.equals()) if they have the same duration but different units? Should 1 min be equal to 60 sec?
There are many contradicting examples. With Java's BigDecimal, compareTo() == 0 does not imply equals() == true (new BigDecimal("0").equals(new BigDecimal("0.0")) returns false). But Duration.ofHours(24).equals(Duration.ofDays(1)) returns true.
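Both behaviors are easy to reproduce; a quick runnable Java check (the class name is just for the example):

import java.math.BigDecimal;
import java.time.Duration;

public class EqualsVsCompareTo {
    public static void main(String[] args) {
        // BigDecimal: numerically equal, but equals() also compares scale.
        System.out.println(new BigDecimal("0").compareTo(new BigDecimal("0.0")) == 0); // true
        System.out.println(new BigDecimal("0").equals(new BigDecimal("0.0")));         // false

        // java.time.Duration normalizes to seconds internally, so different "spellings" are equal.
        System.out.println(Duration.ofHours(24).equals(Duration.ofDays(1)));           // true
    }
}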
That's an unfortunately complicated question.
The simple answer is no: the important goal of value objects is to correctly model queries in your domain.
If it happens that equals in your domain has nice properties, then you should model them and everything is awesome. But if you are modeling something weird then getting it right trumps following the rules everywhere else.
Complications appear when your implementation language introduces contracts for equals that don't match the meaning in your domain. Likely, you will need to invent a different spelling for the domain meaning.
In Java, there are a number of examples where equals doesn't work as you would expect, because the hashCode contract prohibits it.
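For illustration, here is a minimal Java sketch of that "different spelling" idea (the class and the hasSameLength method are invented for this example, not taken from the question's code): equals/hashCode stay strict and field-based, while the domain question gets its own, differently named operation.

import java.util.Objects;

// Hypothetical value object: equals/hashCode compare the raw fields,
// while hasSameLength() answers the domain question "same amount of time?".
public final class Duration {
    public enum Unit { SECONDS, MINUTES, HOURS }

    private final long numberOfUnits;
    private final Unit unit;

    public Duration(long numberOfUnits, Unit unit) {
        this.numberOfUnits = numberOfUnits;
        this.unit = unit;
    }

    // Field-by-field equality: 1 MINUTES is NOT equal to 60 SECONDS.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Duration)) return false;
        Duration other = (Duration) o;
        return numberOfUnits == other.numberOfUnits && unit == other.unit;
    }

    @Override
    public int hashCode() {
        return Objects.hash(numberOfUnits, unit);
    }

    // Domain comparison under a different name: 1 MINUTES has the same length as 60 SECONDS.
    public boolean hasSameLength(Duration other) {
        return this.toSeconds() == other.toSeconds();
    }

    private long toSeconds() {
        switch (unit) {
            case SECONDS: return numberOfUnits;
            case MINUTES: return numberOfUnits * 60;
            default:      return numberOfUnits * 3600; // HOURS
        }
    }
}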
Well... I upvoted @jonrsharpe's comment because, without knowing the context, it is almost impossible to give you an answer.
An example of what @jonrsharpe means could be that if in your domain you are using the Duration VO to compare user input (where the user chooses numberOfUnits and unit in a UI), then it is obvious that a Duration in minutes is not equal to a Duration in seconds, even if 1 min = 60 sec, because you want to know whether the users entered the same thing, and in this case they did not.
Now, assuming that you will use Duration just for other things in which the format does not matter and it always means the same thing for your domain rules (e.g. expiring something):
Why do you need the Duration.unit attribute if it gives you nothing of value in your domain?
Why can't you just work with one unit internally?
If it is only because of different inputs/outputs in your system, you should transform it to your internal/external (UI, REST API, etc.) representation before applying rules, persisting the VO (if needed), and/or showing it in a UI. So, separate input/output concerns from your domain. Maybe Duration (with its unit attribute) is not a VO at all, but just part of your ViewModel.
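A minimal sketch of the "one unit internally" alternative (the TimeSpan name and factory methods are illustrative, not from the question): the unit only exists at the construction boundary, so equality needs no unit logic at all.

// Hypothetical value object storing one canonical unit (seconds) internally;
// the caller's unit choice only matters at construction time.
public final class TimeSpan {
    private final long seconds;

    private TimeSpan(long seconds) {
        this.seconds = seconds;
    }

    public static TimeSpan ofSeconds(long s) { return new TimeSpan(s); }
    public static TimeSpan ofMinutes(long m) { return new TimeSpan(m * 60); }
    public static TimeSpan ofHours(long h)   { return new TimeSpan(h * 3600); }

    @Override
    public boolean equals(Object o) {
        return o instanceof TimeSpan && ((TimeSpan) o).seconds == this.seconds;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(seconds);
    }
}

With this shape, TimeSpan.ofMinutes(1).equals(TimeSpan.ofSeconds(60)) is true by construction, which is essentially how java.time.Duration behaves.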

Check [Axis.Icon] for null values in Spotfire

I am learning to use the Spotfire tool now. I am creating a graphical table with icons. I would like to represent null values with an icon instead of showing ---. Is it possible to do this?
I also tried writing a custom expression:
If([Axis.Icon] is null, 0)
for which I get an error saying "All parts of the expression have to be aggregated".
Can anybody help me to fix this issue? Many Thanks!
Values/expressions on an aggregated axis must be aggregated in their entirety to maintain consistency; otherwise, the infographic could misrepresent the data. In cases like yours, you can usually either aggregate all of the parts (or the entire expression), or handle the logic in your table data itself.
Something like If(SUM([Axis.Icon]) is null, SUM(0))
A lot of people prefer to replace NULL with 0 in their data. For this, you need to create a calculated column or use a transformation to replace missing values in your data with 0, with a similar expression: If([columnName] is null, 0).
In both cases you may still see ---, which looks like null but is actually a missing value for a specific grouping you are using. What this means is that there aren't any rows which conform to that grouping, so you can't force a value.
For your specific case, we would need a sample data set.

Given that ShortString is deprecated, what is the best method for limiting string length?

It's not uncommon to have a record like this:
TAddress = record
  Address: string[50];
  City   : string[20];
  State  : string[2];
  ZIP    : string[5];
end;
Where it was nice to have hard-coded string sizes to ensure that the size of the string wouldn't exceed the database field size allotted for the data.
Given, however, that the ShortString type has been deprecated, what are Delphi developers doing to "solve" this problem? Declaring the record fields as string gets the job done, but doesn't protect the data from exceeding the proper length.
What is the best solution here?
If I had to keep data from exceeding the right length, I'd let the database code handle it as much as possible. Put size limits on the fields, and display data to the user in data-bound controls. A TDBEdit bound to a string field will enforce the length limit correctly. Set it up so the record gets populated directly from the dataset, and it will always have the right length.
Then all you need to worry about is data coming into the record from some outside source that is not part of your UI. For that, use the same process. Have the import code insert the data into the dataset, and let its length constraints do the validation for you. If it raises an exception, reject the import. If not, then you've got a valid dataset row that you can use to populate a record from.
The short string types in your question don't really protect the strings from exceeding the proper length. When you assign a longer value to these short strings, the value is silently truncated.
I'm not sure what database access method you are using, but I rather imagine it will do the same thing, namely truncate any over-length strings to the maximum length, in which case there is nothing to do.
If your database access method throws an error when you give it an over long string then you would need to truncate before passing the value to the database.
If you have to truncate explicitly, then there are lots of places where you might choose to do so. My philosophy would be to truncate at the last possible moment. That's the point at which you are subject to the limit. Truncating anywhere else seems wrong. It means that a database limitation is spreading to parts of the code that are not obviously related to the database.
Of course, all this is based on the assumption that you want to carry on silently truncating. If you want to provide user feedback in the event of truncation, then you will need to decide just where the right points are to action that feedback.
From my understanding, my answer should be "do not mix layers".
I suspect that the string length is specified at the database layer level (a column width), or at the business application layer (e.g. to validate a card number).
From the "pure Delphi code" point of view, you should not know that your string variable has a maximum length, unless you reach the persistence layer or even the business layer.
Using attributes could be an idea. But it may "pollute" the source code for the very same reason that it is mixing layers.
So what I recommend is to use a dedicated Data Modeling, in which you specify your data expectations. Then, at the Delphi var level, you just define a plain string. This is exactly how our mORMot framework implements data filtering and validation: at Model level, with some dedicated classes - convenient, extendable and clean.
If you're just porting from Delphi 7 to XE3, leave it be. Also, although ShortString may be deprecated, I'll eat my hat if they ever remove it completely, because there are a lot of bits of code that will never be able to be rebuilt without it. ShortString + records is still the only practical way to specify a byte-oriented file-of-record data storage. Delphi will NEVER remove ShortString nor change its behaviour; it would be devastating to existing Delphi code. So if you really must define records and limit their length, and you really don't want those records to support Unicode, then there is zero reason to stop using or stop writing ShortString code. That being said, I detest short strings and file-of-record, wish they would go away, and am glad they are marked deprecated.
That being said, I agree with mason and David entirely; I would say length checking and validation are presentation/validation concerns, and Delphi's strong typing is NOT the right place or the right way to deal with them. If you need to put validation constraints on your classes, write helper classes that implement constraint storage (EmployeeName is a string field, and EmployeeName has the following length limit). In edit controls, for example, this is already a property. It seems to me that mapping DB fields to visual fields using the new binding system would be much preferable to trying to express constraints statically in the code.
User input validation and storage are different and length limits should be set in your GUI controls not in your data structures.
You could, for example, use an array of UnicodeChar if you wanted a Unicode-capable but length-limited string. You could even write your own LimitedString class using the new class helpers in Delphi. But such approaches do not make for a maintainable and stable design.
If your SQL database has a field declared with VARCHAR(100) type, and you want to limit your user's input to 100 characters, you should do so at the GUI layer and forget about imposing truncation (data corruption, in fact) silently behind the scenes.
I had this problem, severely, when upgrading from Delphi 6 to 2009: for what one program was/is doing, it was imperative to be able to treat the old ASCII strings as individual ASCII characters.
The program outputs ASCII files (not even ANSI) and has concepts such as an over-punch on the last numeric digit to indicate a negative value. So the file format goes back a bit, one could say!
After the first build in 2009 (10-year-old code; well, you do, don't you!), and after sorting out unit names etc., there were literally hundreds of reported errors/illegal assignments and data loss/conversion warnings...
No matter how good Delphi's back-room manipulation/magic with strings and chars is, I did not trust it enough. In the end, to make sure everything was back as it was, I re-declared them all as array of byte and then changed the code accordingly.
You haven't specified the Delphi version; here's what works for me in Delphi 2010:
Version 1:

TTestRecordProp = record
private
  FField20: string;
  ...
  FFieldN: string;
  procedure SetField20(const Value: string);
public
  property Field20: string read FField20 write SetField20;
  ...
  property FieldN: string ...
end;

...

// The setter enforces the limit; note that it must truncate Value, not the old field contents.
procedure TTestRecordProp.SetField20(const Value: string);
begin
  if Length(Value) > 20 then
    /// maybe raise an exception instead?
    FField20 := Copy(Value, 1, 20)
  else
    FField20 := Value;
end;
Version 2:

TTestRecordEnsureLengths = record
  Field20: string;
  procedure EnsureLengths;
end;

...

procedure TTestRecordEnsureLengths.EnsureLengths;
begin
  // for each string field, test its length and truncate (or raise an exception)
  if Length(Field20) > 20 then
    Field20 := Copy(Field20, 1, 20); // or raise an exception...
end;

// You have to call .EnsureLengths before pushing data to the db...
Personally, I'd recommend replacing records with objects; then you can do more tricks.

Check if values of two string-type items are equal in a Zabbix trigger

I am monitoring an application using Zabbix and have defined a custom item which returns a string value. Since my item's values are actually checksums, they will only contain the characters [0-9a-f]. Two mirror copies of my application are running on two servers for the sake of redundancy. I would like to create a trigger which would take the item values from both machines and fire if they are not the same.
For a moment, let's forget about the moment when values change (it's not an atomic operation, so the system may see inconsistent state, which is not a real error, for a short time), since I could work around it by looking at several previous values.
The crux is: how to write a Zabbix trigger expression which could compare for equality the string values of two items (the same item on two mirror hosts, actually)?
Both according to the fine manual and as I have confirmed in practice, the standard operators = and # only work on numeric values, so I can't just write the natural {host1:myitem[param].last(0)} # {host2:myitem[param].last(0)}. Functions such as change() or diff() can only compare values of the same item at different points in time. Functions such as regexp() can only compare the item's value with a constant string/regular expression, not with another item's value. This is very limiting.
I could move the comparison logic into the script which my custom item executes, but it's a bit messy and not elegant, so if at all possible, I would prefer to have this logic inside my Zabbix trigger.
Perhaps despite the limitations listed above, someone can come up with a workaround?
Workaround:
{host1:myitem[param].change(0)} # {host2:myitem[param].change(0)}
When only one of the servers sees a modification since the previously received value, an event is triggered.
From the Zabbix Manual,
change (float, int, str, text, log)
Returns difference between last and previous values.
For strings:
0 - values are equal
1 - values differ
I believe, and am struggling with this EXACT situation myself, that the correct way to do this is via calculated items.
You want to create a new ITEM, not a trigger (yet!), that performs a calculated comparison on multiple item values (string difference, numbers within a range, etc.).
Once you have that item, have the calculation give you a value you can trigger off of. You can use ANY trigger functions in your calculation along with arithmetic operations.
Now to the issue (which I've submitted a feature request for because this is extremely limiting), most trigger expressions evaluate to a number or a 0/1 bool.
I think I have a solution for my problem, which is tracking a version number from a webpage (e.g. v2.0.1): I believe I can use string manipulation and regex in calculated items in order to convert my string values into multiple numeric values, as these would be a breeze to compare.
But again, this is convoluted and painful.
If you want my advice, have yourself or a dev look at the code for trigger expressions and see if you can submit a patch adding one trigger function for simple string comparison (difference, length, possible conversion to numeric values using binary and/or hex combinations, etc.).
I'm trying to work on a patch myself, but I don't have time, as I have so much monitoring to implement; and while Zabbix is powerful, it's got several huge flaws. I still believe it's the best monitoring system out there.
Simple answer: Create a UserParameter until someone writes a patch.
You could change your items to return numbers instead of strings. Because your items are checksums that use only the characters [0-9a-f], they are numbers written in hexadecimal, so you would need to convert the checksum to a decimal number.
Because the checksum is a big number, you would need to limit the hexadecimal number to 8 characters for the Numeric (unsigned) type before conversion. Or, if you wanted higher precision, you could use a float (but that would be more work):
Numeric (unsigned) - 64bit unsigned integer
Numeric (float) - floating point number
Negative values can be stored.
Allowed range (for MySQL): -999999999999.9999 to 999999999999.9999 (double(16,4)).
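To illustrate only the conversion idea (this is not Zabbix configuration syntax), here is a small Java sketch of reducing a hex checksum to a comparable unsigned number by keeping a fixed-length prefix; the method name and the 8-character limit are assumptions following the suggestion above.

// Illustrative only: reduce a long hex checksum to a number that fits a
// Zabbix "Numeric (unsigned)" item by taking a fixed-length hex prefix.
public final class ChecksumToNumber {
    static long toUnsigned(String hexChecksum, int maxHexChars) {
        String prefix = hexChecksum.length() > maxHexChars
                ? hexChecksum.substring(0, maxHexChars)
                : hexChecksum;
        return Long.parseUnsignedLong(prefix, 16);
    }

    public static void main(String[] args) {
        // Equal checksums on both hosts yield equal numbers, so a plain
        // numeric comparison in the trigger expression would then work.
        System.out.println(toUnsigned("d41d8cd98f00b204e9800998ecf8427e", 8));
    }
}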
I wish Zabbix had a .hashedUnsigned() function that would compute the hash of a string and return it as a number. Such a function should be easy to write.

Misuse of a variable's value?

I came across an instance where the solution to a particular problem was to use a variable whose value, when zero or above, meant the system would use that value in a calculation, but when less than zero indicated that the value should not be used at all.
My initial thought was that I didn't like the multipurpose use of the value of the variable: a.) as a range to be using in a formula; b.) as a form of control logic.
What is this kind of misuse of a variable called? Meta-'something' or is there a classic antipattern that this fits?
Sort of feels like when a database field is set to null to represent not using a value and if it's not null then use the value in that field.
Update:
An example would be: if a variable's value is > 0, I would use the value; if it's <= 0, I would not use the value and would instead perform some other logic.
Values such as these are often called "distinguished values". By far the most common distinguished value is null for reference types. A close second is the use of distinguished values to indicate unusual conditions (e.g. error return codes or search failures).
The problem with distinguished values is that all client code must be aware of the existence of such values and their associated semantics. In practical terms, this usually means that some kind of conditional logic must be wrapped around each call site that obtains such a value. It is far too easy to forget to add that logic, obtaining incorrect results. It also promotes copy-and-paste code as the boilerplate code required to deal with the distinguished values is often very similar throughout the application but difficult to encapsulate.
Common alternatives to the use of distinguished values are exceptions, or distinctly typed values that cannot be accidentally confused with one another (e.g. Maybe or Option types).
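As a hedged Java sketch of that contrast (all names here are invented for illustration), compare a distinguished-value API with one that makes the absence explicit in the type:

import java.util.Optional;

public final class Pricing {
    // Distinguished value: -1 means "no discount"; every caller must remember that.
    static double discountOrMinusOne(String customerId) {
        return "vip".equals(customerId) ? 0.15 : -1.0;
    }

    // Explicit absence: the return type forces callers to handle the "no value" case.
    static Optional<Double> discount(String customerId) {
        return "vip".equals(customerId) ? Optional.of(0.15) : Optional.empty();
    }

    public static void main(String[] args) {
        double d = discountOrMinusOne("guest");
        if (d >= 0) {                          // easy to forget this guard
            System.out.println("price = " + (100 * (1 - d)));
        }

        // With Optional, skipping the check is a visible choice, not an accident.
        double price = discount("guest").map(r -> 100 * (1 - r)).orElse(100.0);
        System.out.println("price = " + price);
    }
}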
Having said all that, distinguished values may still play a valuable role in environments with extremely tight memory availability or other stringent performance constraints.
I don't think what you're describing is a pure magic number, but it's kind of close. It's similar to the situation before .NET 2.0 where you'd use Int32.MinValue to indicate a null value. .NET 2.0 introduced Nullable<T> and kind of alleviated this issue.
So you're describing the use of a variable whose value really means something other than its value; -1 means essentially the same as the use of Int32.MinValue described above.
I'd call it a magic number.
Hope this helps.
Using different ranges of the possible values of a variable to invoke different functionality was very common when RAM and disk space for data and program code were scarce. Nowadays, you would use a function or an additional, accompanying value (boolean, or enumeration) to determine the action to take.
Current OSes suggest 1 GiB of RAM to operate correctly, when 256 KiB was a lot only a few years ago. Cheap disk space has gone from hundreds of MiB to multiples of TiB in a matter of months. Not too long ago I wrote programs targeting 640 KiB of RAM and 10 MiB of disk, and you would probably hate them.
I think it would be good to cope with code like that if it's just a few years old (refactor it!), and denounce it as bad practice if it's recent.
