Options for representing string input as an object - string

I am receiving as input a "map" represented by strings, where certain nodes of the map have significance (s). For example:
---s--
--s---
s---s-
s---s-
-----s
My question is: what reasonable options are there for representing this input as an object?
The only option that really comes to mind is:
(1) Each position is translated to a node with up, down, left and right pointers. The whole object contains a pointer to the top right node.
This seems like just a graph representation specific to this problem.
Thanks for the help.
Additionally, if there are common terms for this type of input, please let me know

Well, it depends a lot on what you need to delegate to those objects. OOP is basically about asking objects to perform things in order to solve a given problem, so it is hard to tell without knowing what you need to accomplish.
The solution you mention can be a valid one. Another is a matrix (in this case 5 rows by 6 columns) where each cell stores an object representing the node. As an example, I once used both approaches to model Conway's Game of Life. If you could give some more information on what you need to do with the object representation of your map, then a better design can be discussed.
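To make the matrix option concrete, here is a minimal Python sketch. The names Cell, parse_map and neighbours are made up for illustration, and it assumes that 's' marks a significant node and that anything else is an ordinary one. Neighbours are computed from indices on demand, so no explicit up/down/left/right pointers are stored.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One position of the map (hypothetical name)."""
    row: int
    col: int
    significant: bool


def parse_map(text):
    """Turn the textual map into a list of rows of Cell objects."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return [
        [Cell(r, c, ch == "s") for c, ch in enumerate(line)]
        for r, line in enumerate(lines)
    ]


def neighbours(grid, cell):
    """Return the up/down/left/right neighbours that lie inside the grid."""
    result = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = cell.row + dr, cell.col + dc
        if 0 <= r < len(grid) and 0 <= c < len(grid[r]):
            result.append(grid[r][c])
    return result


RAW_MAP = """
---s--
--s---
s---s-
s---s-
-----s
"""

grid = parse_map(RAW_MAP)
# Coordinates (row, col) of every significant node, in reading order.
significant = [(cell.row, cell.col) for row in grid for cell in row if cell.significant]
```

Whether this or the linked-node version is better still depends on what the object has to do: the matrix gives cheap random access, while explicit pointers make it easy to walk from one node to the next.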
HTH


How do you approach creating a complete new datatype on the "bit-level"?

I would like to create a new data type in Rust on the "bit-level".
For example, a quadruple-precision float. I could create a structure that has two double-precision floats and arbitrarily increase the precision by splitting the quad into two doubles, but I don't want to do that (that's what I mean by on the "bit-level").
I thought about using a u8-array or a bool-array, but in both cases I waste 7 bits of memory per stored bit (because a bool is also one byte large). I know there are several crates that implement something like bit-arrays or bit-vectors, but looking through their source code didn't help me understand their implementation.
How would I create such a bit-array without wasting memory, and is this the way I would want to choose when implementing something like a quad-precision type?
I don't know how to implement new data types that don't use the basic types or are structures that combine the basic types, and I haven't been able to find a solution on the internet yet; maybe I'm not searching with the right keywords.
The question you are asking has no direct answer: Just like any other programming language, Rust has a basic set of rules for type layouts. This is due to the fact that (most) real-world CPUs can't address individual bits, need certain alignments when referencing memory, have rules regarding how pointer arithmetic works etc. etc.
For instance, if you create a type of just two bits, you'll still need an 8-bit byte to represent that type, because there is simply no way to address two individual bits with most CPUs' instructions; there is also no way to take the address of such a type, because addressing works at the byte level at the finest. More useful information regarding this can be found here, section 2, The Anatomy of a Type. Be aware that the non-wasting bit-level type you are thinking about needs to fulfill all the rules mentioned there.
A perfectly reasonable approach is to represent your type as, e.g., a single wrapped u128 and implement all arithmetic on top of that type. Another, more generic approach would be to use a Vec<u8>. Either way, you'll do a relatively large amount of bit-masking, indirection and the like.
Having a look at rust_decimal or similar crates might also be a good idea.
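The bit-masking itself is language-independent, so here is a minimal sketch of a packed bit array. It is written in Python rather than Rust purely for illustration, and BitArray with its methods is a made-up name, not the API of any crate. The same index arithmetic (byte = index / 8, mask = 1 << (index % 8)) carries over to a Vec<u8>.

```python
class BitArray:
    """Fixed-size array of bits packed eight to a byte (illustrative only)."""

    def __init__(self, size):
        self.size = size
        # One byte holds 8 bits, so round up to a whole number of bytes.
        self._bytes = bytearray((size + 7) // 8)

    def _locate(self, index):
        if not 0 <= index < self.size:
            raise IndexError(index)
        # Index of the byte holding the bit, and a mask selecting the bit in it.
        return index // 8, 1 << (index % 8)

    def get(self, index):
        byte, mask = self._locate(index)
        return bool(self._bytes[byte] & mask)

    def set(self, index, value):
        byte, mask = self._locate(index)
        if value:
            self._bytes[byte] |= mask          # switch the bit on
        else:
            self._bytes[byte] &= ~mask & 0xFF  # switch the bit off


# 128 bits fit in exactly 16 bytes, the size of a quad-precision value.
bits = BitArray(128)
bits.set(0, True)
bits.set(127, True)
```

Only the storage is shown here; the arithmetic of a quad-precision type (sign, exponent, mantissa handling) would still have to be built on top of it.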

How would I construct an integer optimization model corresponding to a graph

Suppose we're given some sort of graph where the feasible region of our optimization problem is given. For example: here is an image
How would I go on about constructing these constraints in an integer optimization problem? Anyone got any tips? Thanks!
Mate, I agree with the others that you should be a little more specific than that paint-ish picture ;). In particular you are neither specifying any objective/objective direction nor are you giving any context, what about this graph should be integer-variable related, except for the existence of disjunctive feasible sets, which may be modeled by MIP-techniques. It seems like your problem is formalization of what you conceptualized. However, in case you are just being lazy and are just interested in modelling disjunctive regions, you should be looking into disjunctive programming techniques, such as "big-M" (Note: big-M reformulations can be problematic). You should be aiming at some convex-hull reformulation if you can attain one (fairly easily).
Back to your picture, it is quite clear that you have a problem in two real dimensions (let's say in R^2), where the constraints bounding the feasible set are linear (the lines making up the feasible polygons).
So you know that you have two dimensions and need two real continuous variables, say x[1] and x[2], to formulate each of your linear constraints (a[i,1]*x[1]+a[i,2]*x[2]<=rhs[i], with one index i for each line in your graph). Additionally your variables seem to be constrained to the first orthant, so x[1]>=0 and x[2]>=0 should hold. Now, to add disjunctions you want some constraints that only hold when a certain condition is true. Therefore, you can add two binary decision variables, say y[1],y[2], and an additional constraint y[1]+y[2]=1, to tell that only one set of constraints can be active at the same time. You should be able to implement this with the help of big-M by reformulating the constraints as follows:
If you bound things from above with your line:
a[i,1]*x[1]+a[i,2]*x[2]-rhs[i]<=M*(1-y[1]) if i corresponds to the one polygon,
a[i,1]*x[1]+a[i,2]*x[2]-rhs[i]<=M*(1-y[2]) if i corresponds to the other polygon,
and if your line bounds things from below:
-M*(1-y[1])<=a[i,1]*x[1]+a[i,2]*x[2]-rhs[i] if i corresponds to the one polygon,
-M*(1-y[2])<=a[i,1]*x[1]+a[i,2]*x[2]-rhs[i] if i corresponds to the other polygon.
It is important that M is sufficiently large, but not so large that it causes numerical issues.
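As a sanity check of the idea, here is a small Python sketch that evaluates big-M constraints for a given point, without any solver. The two boxes, the value of M and all function names are invented for illustration, since the original picture is not available. Lower bounds are written as "<=" rows by negating both sides, so every row has the same form.

```python
# Big-M constant (assumed): must exceed any constraint violation that can
# occur in the region of interest, but should not be needlessly huge.
M = 100.0

# Each polygon is a list of rows (a1, a2, rhs) meaning a1*x1 + a2*x2 <= rhs.
# These two boxes are made-up example data.
POLYGONS = {
    1: [(1, 0, 2), (0, 1, 2)],                # x1 <= 2, x2 <= 2
    2: [(-1, 0, -4), (1, 0, 6), (0, 1, 2)],   # 4 <= x1 <= 6, x2 <= 2
}


def big_m_feasible(x1, x2, y):
    """Check all big-M constraints for a point and binaries y = {1: .., 2: ..}."""
    if x1 < 0 or x2 < 0:
        return False          # first-orthant constraints
    if y[1] + y[2] != 1:
        return False          # exactly one polygon is "switched on"
    return all(
        a1 * x1 + a2 * x2 - rhs <= M * (1 - y[k])
        for k, rows in POLYGONS.items()
        for a1, a2, rhs in rows
    )


def in_feasible_set(x1, x2):
    """A point is feasible if some valid choice of the binaries admits it."""
    choices = ({1: 1, 2: 0}, {1: 0, 2: 1})
    return any(big_m_feasible(x1, x2, y) for y in choices)
```

With this data, in_feasible_set(1, 1) and in_feasible_set(5, 1) hold, while a point in the gap such as (3, 1) is rejected. In a real model the binaries would be decision variables handed to a MIP solver rather than enumerated by hand.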
That being said, I am by no means an expert on these disjunctive programming techniques, so feel free to chime in, add corrections or make things clearer.
Also, a more elaborate question typically yields more elaborate and satisfying answers ;) If you had gone to the effort of making up a true small example problem you likely would have gotten a full formulation of your problem or even an executable piece of code in no time.

Naming a method that returns 0 for negative values

I'm writing such a method for a library. I can't seem to find a good name for it, nor any reference to anyone having named such a function before.
What would be a good name for it?
I would call it a clipper as a noun and clip as a verb.
The operation you're describing would be a NegativeClip or SubzeroClip. Of course you could just have a generic function with two or three arguments. Depending on your needs I could see LowerBoundClip or FloorClip or BottomClip being paired with UpperBoundClip or CeilingClip or TopClip instead. In a couple of those the word clip almost sounds redundant.
Words like bound, bounds, bounded and boundary are used in mathematics, but as I think about possible function names I'm not sure they are as clear in meaning. And binding is already used in other programming topics, so there's some potential for confusion there too.
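The "generic function with two or three arguments" could look roughly like this in Python. The names clip and clip_negative are only suggestions in the spirit of the answer, not an established API.

```python
def clip(value, lower=None, upper=None):
    """Return value limited to the range [lower, upper]; None means unbounded."""
    if lower is not None and value < lower:
        return lower
    if upper is not None and value > upper:
        return upper
    return value


def clip_negative(value):
    """Return 0 for negative values and the value itself otherwise."""
    return clip(value, lower=0)
```

The specialised name then becomes a thin wrapper, which keeps the call sites readable without duplicating the comparison logic.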

ID3 Implementation Clarification

I am trying to implement the ID3 algorithm, and am looking at the pseudo-code:
(Source)
I am confused by the bit where it says:
If examples_vi is empty, create a leaf node with label = most common value in Target_Attribute in Examples.
Unless I am missing out on something, shouldn't this be the most common class?
That is, if we cannot split the data on an attribute value because no sample takes that value for the particular attribute, then we take the most common class among all samples and use that?
Also, isn't this just as good as picking a random class?
The training set tells us nothing about the relation between the attribute value and the class labels...
1) Unless I am missing out on something, shouldn't this be the most common class?
You're correct, and the text also says the same. Look at the function description at the top :
Target_Attribute is the attribute whose value is to be predicted by the tree
so the value of Target_Attribute is the class/label.
2) That is, if we cannot split the data on an attribute value because no sample takes that value for the particular attribute, then we take the most common class among all samples and use that?
Yes, but not among all samples in your whole dataset; rather, among those samples that reached this point in the tree/recursion. (The ID3 function is recursive, so the current Examples is actually the Examples_vi of the caller.)
3) Also, isn't this just as good as picking a random class?
The training set tells us nothing about the relation between the attribute value and the class labels...
No, picking a random class (with equal chances for each class) is not the same, because the inputs often have an unbalanced class distribution (often called the prior distribution in many texts); for example, 99% of the examples may be positive and only 1% negative. So whenever you really have no information whatsoever to decide on the outcome of some input, it makes sense to predict the most probable class, so that you have the highest probability of being correct. This maximizes your classifier's accuracy on unseen data only under the assumption that the class distribution in your training data is the same as in the unseen data.
The same reasoning holds for the base case when Attributes is empty (see line 4 of your pseudocode): whenever we have no information, we just report the most common class of the data at hand.
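The empty-branch rule can be sketched in a few lines of Python. The toy examples, the attribute names and the helper most_common_label are all hypothetical; the point is only that the label comes from the Examples that reached the current node, not from the (empty) subset.

```python
from collections import Counter


def most_common_label(examples, target_attribute):
    """Return the most frequent value of target_attribute among examples."""
    counts = Counter(example[target_attribute] for example in examples)
    return counts.most_common(1)[0][0]


# Examples that reached the current node (made-up data).
examples = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "rain", "play": "yes"},
]

# Splitting on "outlook": no example has the value "overcast",
# so examples_vi is empty for that branch.
examples_vi = [e for e in examples if e["outlook"] == "overcast"]

if not examples_vi:
    # Fall back to the majority class of the parent's examples.
    leaf_label = most_common_label(examples, "play")
```

Here leaf_label ends up as "yes", the majority class at this node, which is a better guess than a uniformly random class whenever the classes are unbalanced.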
If you have never implemented ID3 yourself but still want to understand the processing details, I suggest reading this paper:
Building Decision Trees in Python
and here is the source code from the paper:
decision tree source code
The paper comes with an example, or you can use the example from your book (replace the "data" file with one in the same format). You can then debug it (with some breakpoints) in Eclipse to check the attribute values while the algorithm is running.
Go over it and you will understand ID3 better.

Erlang: Extracting values from a Key/Value tuple

I have a tuple list that looks like this:
{[{<<"id">>,1},
{<<"alerts_count">>,0},
{<<"username">>,<<"santiagopoli">>},
{<<"facebook_name">>,<<"Santiago Ignacio Poli">>},
{<<"lives">>,{[{<<"quantity">>,8},
{<<"max">>,8},
{<<"unlimited">>,true}]}}]}
I want to know how to extract properties from that tuple. For example:
get_value("id",TupleList), %% should return 1.
get_value("facebook_name",TupleList), %% should return "Santiago Ignacio Poli".
get_value("lives",TupleList), %% should return another TupleList, so I can call
get_value("quantity",get_value("lives",TupleList)).
I tried to match all the "properties" to a record called "User" but I don't know how to do it.
To be more specific: I used the Jiffy library (github.com/davisp/jiffy) to parse a JSON document. Now I want to obtain a value from that JSON.
Thanks!
The first strange thing is that the tuple contains a single-item list: the [{Key, Value}] list is embedded in {} for no apparent reason. So let's reference all that stuff you wrote as a variable called Stuff, and pull it out:
{KVList} = Stuff
Good start. Now we are dealing with a {Key, Value} type list. With that done, we can now do:
lists:keyfind(<<"id">>, 1, KVList)
or alternately:
proplists:get_value(<<"id">>, KVList)
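...and we would get the first answer you asked about. (Note the difference in what the two return before you copypasta some code from here: lists:keyfind/3 returns the whole {Key, Value} tuple, or false if the Key isn't in the KVList, while proplists:get_value/2 returns just the Value, or undefined if the Key is missing.)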
A further examination of this particular style of question gets into two distinctly different areas:
Erlang docs regarding functions that work on {Key, Value} data (hint: the lists, proplists, orddict, and any other modules based on the same concept are good candidates for research, all in the standard library), including basic filter and map.
The underlying concept of data structures as semantically meaningful constructs. Honestly, I don't see a lot of deliberate thought given to this in the functional programming world outside advanced type systems (like in Haskell, or what Dialyzer tries hard to give you). The best place to learn about this is relational database concepts -- once you know what "5NF" really means, then come back to the real world and you'll have a different, more insightful perspective, and problems like this won't just be trivial, they will beg for better foundations.
You should look into the proplists module and its proplists:get_value/2 function.
You just need to think about how it should behave when Key is not present in the list (or whether the default proplists behavior is satisfactory).
And two notes:
since your keys are binaries, you should use <<"id">> in your function
proplists works on lists, but the data you presented is a list inside a one-element tuple. So you first need to extract the list from your Data.
{PropList} = Data,
Id = proplists:get_value(<<"id">>, PropList),
