I read a paper. It states: "For data sets, we used 10 realizations for parameter selection and 20 other realizations for parameter selection."
Does 'realization' in this case mean 'simulation'?
Thanks
It would be better if you provided either (i) a link to the paper or (ii) more context, because at the moment it isn't clear what the sentence refers to.
My best guess is that it means 'realization' in the sense of 'a particular series that might be generated by a specified random process'.
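For example, in code one 'realization' would simply be one simulated series drawn from the process. A minimal sketch (the AR(1) process below is purely illustrative, since the paper's actual model isn't specified):

```python
# One "realization" = one simulated series from a specified random process.
# The AR(1) process here is only an illustrative stand-in.
import numpy as np

rng = np.random.default_rng(0)

def ar1_realization(n=200, phi=0.8, sigma=1.0):
    """Generate a single realization (simulated series) of length n."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(scale=sigma)
    return x

train = [ar1_realization() for _ in range(10)]  # e.g. 10 realizations for parameter selection
test = [ar1_realization() for _ in range(20)]   # 20 further realizations for evaluation
```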
Suppose we're given some sort of graph showing the feasible region of our optimization problem. For example, here is an image:
How would I go about constructing these constraints in an integer optimization problem? Does anyone have any tips? Thanks!
Mate, I agree with the others that you should be a little more specific than that paint-ish picture ;). In particular, you specify neither an objective (nor an optimization direction) nor any context for what in this graph should be related to integer variables, except for the existence of disjunctive feasible sets, which can be modelled with MIP techniques. It seems like your problem is formalizing what you have conceptualized. However, if you are just interested in modelling disjunctive regions, you should look into disjunctive programming techniques such as "big-M" (note: big-M reformulations can be numerically problematic). You should aim for a convex-hull reformulation if you can attain one (fairly easily).
Back to your picture, it is quite clear that you have a problem in two real dimensions (let's say in R^2), where the constraints bounding the feasible set are linear (the lines making up the feasible polygons).
So you have two dimensions and need two continuous variables, say x[1] and x[2], to formulate each of your linear constraints (a[i,1]*x[1] + a[i,2]*x[2] <= rhs[i] for each index i corresponding to one of the lines in your graph). Additionally, your variables seem to be constrained to the first orthant, so x[1] >= 0 and x[2] >= 0 should hold. Now, to add the disjunction you want some constraints to hold only when a certain condition is true. Therefore, you can add two binary decision variables, say y[1] and y[2], plus the constraint y[1] + y[2] = 1, to express that exactly one set of constraints is active at a time. You can implement this with big-M by reformulating the constraints as follows:
If a line bounds the feasible region from above:
a[i,1]*x[1] + a[i,2]*x[2] - rhs[i] <= M*(1-y[1]) if i corresponds to the one polygon,
a[i,1]*x[1] + a[i,2]*x[2] - rhs[i] <= M*(1-y[2]) if i corresponds to the other polygon,
and if a line bounds the feasible region from below:
-M*(1-y[1]) <= a[i,1]*x[1] + a[i,2]*x[2] - rhs[i] if i corresponds to the one polygon,
-M*(1-y[2]) <= a[i,1]*x[1] + a[i,2]*x[2] - rhs[i] if i corresponds to the other polygon.
It is important that M is sufficiently large (larger than any violation that can occur over the region of interest), but not so large that it causes numerical issues.
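For concreteness, here is a small sketch of that big-M disjunction in Python using PuLP. The two polygons, the objective, and the value of M are all made up for illustration; they are not taken from your picture.

```python
# Big-M modelling of "the point lies in polygon 1 OR polygon 2" (illustrative data).
import pulp

M = 1000  # large enough to deactivate a constraint, small enough to stay numerically sane

prob = pulp.LpProblem("disjunctive_regions", pulp.LpMaximize)

x1 = pulp.LpVariable("x1", lowBound=0)
x2 = pulp.LpVariable("x2", lowBound=0)
y1 = pulp.LpVariable("y1", cat="Binary")  # 1 if the point must satisfy polygon 1's constraints
y2 = pulp.LpVariable("y2", cat="Binary")  # 1 if the point must satisfy polygon 2's constraints

prob += x1 + x2                 # arbitrary objective, just to complete the model
prob += y1 + y2 == 1            # exactly one polygon is selected

# Polygon 1 (made up): x1 + x2 <= 4 and x1 <= 3, active only when y1 = 1
prob += x1 + x2 - 4 <= M * (1 - y1)
prob += x1 - 3 <= M * (1 - y1)

# Polygon 2 (made up): x1 >= 6 and x1 + 2*x2 <= 10, active only when y2 = 1
prob += 6 - x1 <= M * (1 - y2)
prob += x1 + 2 * x2 - 10 <= M * (1 - y2)

prob.solve()
print(pulp.LpStatus[prob.status], x1.value(), x2.value(), y1.value(), y2.value())
```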
That being said, I am by no means an expert on these disjunctive programming techniques, so feel free to chime in, add corrections or make things clearer.
Also, a more elaborate question typically yields more elaborate and satisfying answers ;). If you had gone to the effort of making up a small concrete example problem, you likely would have gotten a full formulation of your problem, or even an executable piece of code, in no time.
I want to show the use of the same algorithm as a black box in a two-pass iteration. In the first pass, I would pass the value of a flag f as false and an array of one element, A[1..1]; the output of the first pass would be B[1..N]. In the second pass, the same algorithm would be used with f as true (to indicate the second pass) and an input of A[1..N] (fed from the output B[1..N] of the first pass), while the output of the second pass would be B[1..M].
Please help me draw the UML activity diagram for this.
It's not a good idea to try "programming graphically". The algorithm you describe is better shown in meta code than in an activity diagram, as you have already seen. So what I'd do in your case is to have a single Action (most likely representing a CallOperation of some class), and the corresponding behavior of the operation contains the description in either meta code or plain text (as you already stated above).
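As a rough meta-code sketch (the names process and two_pass are only illustrative placeholders, not part of your model):

```python
# Meta-code sketch of the two-pass use of one black-box algorithm.
def process(values, flag):
    """Placeholder for the black-box algorithm; flag marks the second pass."""
    # ... the real algorithm goes here ...
    return values  # dummy behaviour so the sketch runs

def two_pass(A):
    # First pass: flag is False, input is the one-element array A[1..1].
    B = process(A[:1], flag=False)   # output B[1..N]
    # Second pass: flag is True, the first pass's output B is fed back in.
    return process(B, flag=True)     # output B[1..M]
```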
If, for whatever reason, you really want to "program graphically", you would need to use single actions for the assignments of the flag, like this:
The A and B arrays would be just mentioned in the description of the single actions.
To actually show passing the A and B arrays, you would need to add ActionPins or Objects with ObjectFlows between the single Actions. Honestly, that would make the whole thing even more unreadable and would hinder more than help the reader:
I am trying to implement the ID3 algorithm, and am looking at the pseudo-code:
(Source)
I am confused by the bit where it says:
If examples_vi is empty, create a leaf node with label = most common value of TargetAttribute in Examples.
Unless I am missing out on something, shouldn't this be the most common class?
That is, if we cannot split the data on an attribute value because no sample takes that value for the particular attribute, then we take the most common class among all samples and use that?
Also, isn't this just as good as picking a random class?
The training set tells us nothing about the relation between the attribute value and the class labels...
1) Unless I am missing out on something, shouldn't this be the most common class?
You're correct, and the text also says the same. Look at the function description at the top:
Target_Attribute is the attribute whose value is to be predicted by the tree
so the value of Target_Attribute is the class/label.
2) That is, if we cannot split the data on an attribute value because no sample takes that value for the particular attribute, then we take the most common class among all samples and use that?
Yes, but not among all samples in your whole dataset; rather, among those samples that reached this point in the tree/recursion. (The ID3 function is recursive, so the current Examples is actually the Examples_vi of the caller.)
3) Also, isn't this just as good as picking a random class?
The training set tells us nothing about the relation between the attribute value and the class labels...
No, picking a random class (with equal chances for each class) is not the same, because the inputs often have an unbalanced class distribution (often called the prior distribution in many texts): you may have 99% positive examples and only 1% negative. So whenever you really have no information whatsoever to decide on the outcome of some input, it makes sense to predict the most probable class, so that you have the highest probability of being correct; with a 99/1 split, always predicting the majority class is right about 99% of the time, while a uniformly random guess between the two classes is right only about 50% of the time. This maximizes your classifier's accuracy on unseen data only under the assumption that the class distribution in your training data is the same as in the unseen data.
The same reasoning applies to the base case when Attributes is empty (see line 4 in your pseudocode): whenever we have no information, we just report the most common class of the data at hand.
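To make the two base cases concrete, here is a rough Python sketch of ID3. It is not the code from any particular source; examples are assumed to be dicts, and attribute_values is assumed to map each attribute to its set of possible values (which is what makes an empty branch possible in the first place):

```python
import math
from collections import Counter

def entropy(examples, target_attribute):
    """Shannon entropy of the class labels among the given examples."""
    counts = Counter(ex[target_attribute] for ex in examples)
    total = len(examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, attribute, target_attribute):
    """Entropy reduction obtained by splitting the examples on 'attribute'."""
    total = len(examples)
    remainder = 0.0
    for v in {ex[attribute] for ex in examples}:
        subset = [ex for ex in examples if ex[attribute] == v]
        remainder += len(subset) / total * entropy(subset, target_attribute)
    return entropy(examples, target_attribute) - remainder

def most_common_class(examples, target_attribute):
    """Majority class among the examples that reached the current node."""
    return Counter(ex[target_attribute] for ex in examples).most_common(1)[0][0]

def id3(examples, target_attribute, attributes, attribute_values):
    classes = {ex[target_attribute] for ex in examples}
    if len(classes) == 1:              # all remaining examples share one class
        return classes.pop()
    if not attributes:                 # no attributes left: fall back to the majority class
        return most_common_class(examples, target_attribute)

    best = max(attributes, key=lambda a: information_gain(examples, a, target_attribute))
    tree = {best: {}}
    for v in attribute_values[best]:
        examples_v = [ex for ex in examples if ex[best] == v]
        if not examples_v:
            # Empty branch: label it with the majority class of the *current*
            # examples (the ones that reached this node), not of the whole data set.
            tree[best][v] = most_common_class(examples, target_attribute)
        else:
            remaining = [a for a in attributes if a != best]
            tree[best][v] = id3(examples_v, target_attribute, remaining, attribute_values)
    return tree
```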
If you have never implemented ID3 but still want to know more about the processing details, I suggest you read this paper:
Building Decision Trees in Python
and here is the source code from the paper:
decision tree source code
This paper includes an example; you can also use the example from your book (replace the "data" file with one in the same format). You can just debug it (with some breakpoints) in Eclipse to check the attribute values while the algorithm runs.
Go over it and you will understand ID3 better.
I was running a tutorial today, and we were designing a class diagram to model a road system. One of the constraints of the system is that any one segment of road has a maximum capacity; once it is reached, no new vehicles can enter the segment.
When drawing the class diagram, can I use capacity as one of the multiplicities? This way, instead of having 0..* vehicles on a road segment, I can have 0..capacity vehicles.
I had a look at ISO 1905-1 for inspiration, and I thought that what I want is similar to what they've called a 'multiplicity element'. In the standard, it states:
If the Multiplicity is associated with an element whose notation is a text string (such as an attribute, etc.), the multiplicity string will be placed within square brackets ([]) as part of that text string. Figure 9.33 shows two multiplicity strings as part of attribute specifications within a class symbol. -- section 9.12
However, in the examples it gives, they don't seem to employ this feature in the way I expected - they annotate association links rather than replace the multiplicities.
I would rather get a definitive answer for the students in question, rather than make a guess based on the standard, so I ask here: has anyone else faced this issue? How did you overcome it?
According to the UML specification, you can use a ValueSpecification for the lower and upper bounds of a multiplicity element, and a ValueSpecification can be an expression. So in theory it must be possible, although the correct expression will be more complex; indeed, it mixes the design and instance levels.
In such a case it is more usual to use a constraint like this:
(Image: UML multiplicity constraint, http://app.genmymodel.com/engine/xaelis/roads.jpg)
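In text form, such a constraint could be written OCL-style as something like { vehicles->size() <= capacity } attached to the association between the road segment and vehicle classes (the role and attribute names here are only illustrative; the image above shows the same idea graphically).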
I am receiving as input a "map" represented by strings, where certain nodes of the map have significance (s). For example:
---s--
--s---
s---s-
s---s-
-----s
My question is: what reasonable options are there for representing this input as an object?
The only option that really comes to mind is:
(1) Each position is translated to a node with up, down, left, and right pointers. The whole object contains a pointer to the top-right node.
This seems like just a graph representation specific to this problem.
Thanks for the help.
Additionally, if there are common terms for this type of input, please let me know.
Well, it depends a lot on what you need to delegate to those objects. OOP is basically about asking objects to perform things in order to solve a given problem, so it is hard to tell without knowing what you need to accomplish.
The solution you mention can be a valid one, as can having a matrix (in this case 6x5) where you store in each cell an object representing the node (just as an example, I once used both approaches to model Conway's Game of Life). If you could give some more information about what you need to do with the object representation of your map, then a better design could be discussed.
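As an illustration of the matrix-style option, here is a minimal Python sketch (the class, attribute, and method names are made up for the example):

```python
# Minimal sketch of a grid/matrix representation of the string map.
class GridMap:
    def __init__(self, lines):
        self.rows = [list(line) for line in lines]
        self.height = len(self.rows)
        self.width = len(self.rows[0]) if self.rows else 0
        # Coordinates of the significant nodes, marked 's' in the input.
        self.significant = {(r, c)
                            for r, row in enumerate(self.rows)
                            for c, ch in enumerate(row) if ch == 's'}

    def neighbors(self, r, c):
        """Up/down/left/right neighbours inside the grid (the graph view)."""
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < self.height and 0 <= nc < self.width:
                yield nr, nc

# Example with the map from the question:
grid = GridMap(["---s--", "--s---", "s---s-", "s---s-", "-----s"])
print(sorted(grid.significant))
print(list(grid.neighbors(0, 3)))
```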
HTH