I am reading Category Theory for Programmers by Bartosz Milewski and I did not get the idea of a partial order.
I did not get the context of the following sentences:
You can also have a stronger relation, that satisfies an additional
condition that, if a <= b and b <= a then a must be the same as b.
That’s called a partial order.
Why must a be the same as b? For example, a = 4 and b = 5, so they are not the same at all. If he had written
....if a = b and b = a....
Then yes, I would agree.
The second part, that I also do not understand:
Finally, you can impose the condition that any two objects are in a
relation with each other, one way or another; and that gives you a
linear order or total order.
What does he mean?
if a <= b ...
so a = 4 and b = 5 satisfy the first inequality
and b <= a
but they don't satisfy the second inequality. So, your counterexample is invalid.
Let's forget <= because I suspect it's tricking you into thinking about integers or some other set of numbers you're familiar with. So, we'll re-write it with some arbitrary relation, say ¤
if a ¤ b is true
and b ¤ a is true
and this always implies that a is the same entity as b
then we call relation ¤ a "partial order" (over whatever set a, b are drawn from)
All the author is saying is that for some relation, if the given rule is true, then we call that relation a partial order. This is the author's definition of a partial order. If you find some situation where the rule doesn't hold - that just means you found a type of relation that is not a partial order.
Anyway, the reason for defining a partial order is that sometimes we have collections of objects, and we can't compare all of them to each other.
For example, a set of grades for different subjects: perhaps I can decide whether one student is better at English than another, and I can decide whether one student is better at Music than another, but it doesn't make sense to discuss whether one student's English is better than another's Music.
The last quote just means that if we have a relation which is at least a partial order (it satisfies the rule given) and it can be applied to your whole set (say we're only discussing English grades), then we can call it a total order over that set.
PS. As it happens the rule does hold for the usual <= with integers: hence, we can call the relation <= a partial order over ℤ. Since it is also defined for every pair of integers, we can also call <= a total order on ℤ.
PPS. Yes, a partial order also requires transitivity: my answer really only addresses the fairly informal definition quoted in the question. You can find more complete definitions at Wolfram MathWorld, Wikipedia or wherever.
The divisibility of a positive natural number by another positive natural number is an example of partial order which is not a total order (x divides y if y/x is a natural number).
1) If x divides y and if y divides z, then x divides z (transitivity).
2) If x divides y and y divides x, then x = y (antisymmetry).
3) x divides x (reflexivity).
These are the three properties of a partial order.
But this is not a total order, because you can find two natural numbers x and y such that x does not divide y and y does not divide x.
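The three properties, and the failure of totality, can be checked by brute force on a small sample. This is only a minimal sketch: the range 1..12 and the helper name `divides` are illustrative choices, not part of the answer above.

```python
from itertools import product

def divides(x: int, y: int) -> bool:
    """True when y / x is a natural number, i.e. x divides y."""
    return y % x == 0

numbers = range(1, 13)  # illustrative sample: positive integers 1..12

# Reflexivity: x divides x.
reflexive = all(divides(x, x) for x in numbers)

# Antisymmetry: x | y and y | x together force x == y.
antisymmetric = all(
    x == y
    for x, y in product(numbers, repeat=2)
    if divides(x, y) and divides(y, x)
)

# Transitivity: x | y and y | z together force x | z.
transitive = all(
    divides(x, z)
    for x, y, z in product(numbers, repeat=3)
    if divides(x, y) and divides(y, z)
)

# Pairs where neither number divides the other: the order is not total.
incomparable = [
    (x, y)
    for x, y in product(numbers, repeat=2)
    if x < y and not divides(x, y) and not divides(y, x)
]

print(reflexive, antisymmetric, transitive, incomparable[:3])
```

A finite check like this is not a proof, but it makes the gap between "partial" and "total" concrete: pairs such as (2, 3) are simply not related in either direction.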
To understand the distinction, you need to look at sets other than integers. Consider the complex numbers. A valid preorder on the complex numbers could say z1 <= z2 if and only if real(z1) <= real(z2). Thus, (3, 5) <= (3, 6) and (3, 6) <= (3, 5). This is not a partial order, though, because (3, 5) != (3, 6).
Adding the condition that z1 <= z2 also requires imag(z1) <= imag(z2) makes this a partial order, since now (3, 5) <= (3, 6) but not vice versa. It's still not a total order, because neither (2, 3) <= (3, 2) nor (3, 2) <= (2, 3) is true.
Instead, one could say z1 <= z2 if and only if real(z1) <= real(z2) and abs(z1) <= abs(z2). Now (3, 5) <= (3, 6) is still true, but (3, 6) <= (3, 5) is not, because sqrt(3**2 + 6**2) > sqrt(3**2 + 5**2). We can also say that (2, 3) <= (3, 2), because 2 <= 3 and sqrt(13) <= sqrt(13). This relation compares more pairs, but it is still not a total order: neither (1, 5) <= (2, 0) nor (2, 0) <= (1, 5) holds, since 1 <= 2 but sqrt(26) > 2. It is not antisymmetric either, because (3, 5) and (3, -5) are related in both directions. (Update: checking whether lexicographical ordering on abs and arg, with arg limited to (-pi, pi] and 0 special-cased, is a proper total order is left as an exercise for the reader.)
(Normally, we say the complex numbers are not ordered because there are several ways one could define a total order, but no single "natural" ordering.)
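The first two relations in this answer can be tested on the example points. The sketch below is illustrative only: the function names and the four-point sample are assumptions, and a finite sample can refute a property but cannot prove it in general.

```python
from itertools import product

def le_real(z1: complex, z2: complex) -> bool:
    """Compare real parts only (the preorder from the answer)."""
    return z1.real <= z2.real

def le_both(z1: complex, z2: complex) -> bool:
    """Compare real and imaginary parts component-wise."""
    return z1.real <= z2.real and z1.imag <= z2.imag

def is_antisymmetric(le, items) -> bool:
    """a <= b and b <= a must imply a == b on the sample."""
    return all(a == b for a, b in product(items, repeat=2) if le(a, b) and le(b, a))

def is_total(le, items) -> bool:
    """Every pair of the sample must be related one way or the other."""
    return all(le(a, b) or le(b, a) for a, b in product(items, repeat=2))

sample = [complex(3, 5), complex(3, 6), complex(2, 3), complex(3, 2)]

for name, le in [("real only", le_real), ("component-wise", le_both)]:
    print(name, is_antisymmetric(le, sample), is_total(le, sample))
```

On this sample the real-only relation relates every pair but fails antisymmetry, while the component-wise relation is antisymmetric but leaves (2, 3) and (3, 2) unrelated.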
Consider this Directed Acyclic Graph:
If we say that an arrow on this graph stands for the <= relation, then we can see that a <= c and c <= d. But neither b <= c nor c <= b holds. Hence we have an order, but it is only partial, because it only exists for some pairs of items in the domain.
In general a DAG defines a partial order on its members. Even if the arrow from a to e were not included we could still say that a<=c and c<=e, so therefore a<=e.
Bear in mind that we are not interpreting "x <= y" as meaning anything other than "I can get from x to y by following arrows on the diagram". Now suppose we have two letters x and y, and we know that x <= y and y <= x. If x and y are different and you can get from x to y then you can't get from y to x. Hence it is not possible for x and y to be different items, so they must both be the same item.
A total order, on the other hand, exists for all pairs of items. The integers, for instance, have a total order.
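The "follow the arrows" reading of <= can be sketched as a reachability test. The edge list below is hypothetical, since the original diagram is not reproduced here; it was chosen only to agree with the statements above (a <= c, c <= d, c <= e, b and c unrelated, no direct arrow from a to e).

```python
# Hypothetical DAG: each key has arrows to the nodes in its list.
EDGES = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "d": [], "e": []}

def reachable(graph: dict, start: str, goal: str) -> bool:
    """x <= y means: y can be reached from x by following arrows (x <= x always)."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return False

print(reachable(EDGES, "a", "e"))  # a <= e via c, without a direct arrow
print(reachable(EDGES, "b", "c"), reachable(EDGES, "c", "b"))  # unrelated pair
```

Because the graph has no cycles, two different nodes can never reach each other in both directions, which is exactly the antisymmetry condition.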
I have this problem that I must solve in time that is polynomial in N, K and D given below:
Let N, K be some natural numbers.
Then a, b, c ... are N numbers of exactly K digits each.
a, b, c ... contain only the digits 1 and 2 in some order given by the input.
However, only D digits of each number are visible; the rest are hidden (the hidden digits are denoted by the character "?").
There may be different numbers such that one or more of a, b, c ... are generalizations of the said number:
e.g.
2?122 is a generalization for 21122 and 22122
2?122 is not a generalization for 11111
12??? and ?21?? are both generalizations for 12112
???22 and ???11 cannot be generalizations of the same number
Basically, some number is a generalization of the other if the latter can be one of the "unhidden" versions of the former.
Question:
How many different numbers are there such that at least one of a, b, c, ... is their generalization?
Quick Reminder:
N = nº of numbers
K = nº of digits in each number
D = nº of visible digits in each number
Conditions & Limitations:
N, K, D are natural numbers
1 ≤ N
1 ≤ D < K
Input / Output snippets for verification of the algorithm:
Input:
N = 3, K = 5, D = 3
112??
?122?
1?2?1
Output:
8
Explanation:
The numbers are 11211, 11212, 11221, 11222, 12211, 12221, 21221, 21222, which are 8 numbers.
11211, 11212, 11221, 11222 are the numbers generalized by 112??
11221, 11222, 21221, 21222 are the numbers generalized by ?122?
11211, 11221, 12211, 12221 are the numbers generalized by 1?2?1
Input:
N = 2, K = 3, D = 1
1??
?2?
Output:
6
Explanation:
The numbers are 111, 112, 121, 122, 221, 222, which are 6 numbers.
From my calculations, I found out that there are 2^(K-D) possible numbers in total that have a as their generalization, 2^(K-D) possible numbers in total that have b as their generalization etc., leaving me with N*2^(K-D) numbers.
My big problem is that I found cases where a number has multiple generalizations and therefore it repeats inside N*2^(K-D), so the real nº of different numbers will be, in this case, something smaller than N*2^(K-D).
I don't know how to find only the different numbers and I need your help.
Thank you very much!
EDIT: Answer to «n. 1.8e9-where's-my-share m.»'s question from the comments:
If you have two generalisations, can you find out how many numbers they both generalise?
For two given general numbers a and b (general meaning that they both contain "?"), it is possible to find nº of numbers generalised by both a and b in polynomial time by using the following logic:
1 - we declare some variable q = 1
2 - we start "scanning" the digits of the two numbers simultaneously from left to right:
2.1 - if we find two unhidden digits and they are different, then no numbers are generalized by both a and b and we return 0
2.2 - if we find two hidden digits, then we multiply q by 2: if both general numbers are to generalize some number, that number can have 1 or 2 in place of "?", so each position hidden in both patterns doubles the count of numbers generalized by both a and b, as long as step 2.1 is never true.
3 - if we have scanned all the digits and step 2.1 was never true, then we return q
Therefore, the nº of numbers both a and b generalize is either 0 or q (a power of 2), according to the cases presented above.
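The scan described in the steps above can be sketched as follows. The function name `count_common` is an illustrative choice; the sketch returns the running product q itself, which already equals 2 raised to the number of positions hidden in both patterns.

```python
def count_common(a: str, b: str) -> int:
    """Count the numbers generalized by both patterns a and b (same length)."""
    q = 1
    for da, db in zip(a, b):
        # Step 2.1: two visible digits that differ rule out any common number.
        if da != "?" and db != "?" and da != db:
            return 0
        # Step 2.2: a position hidden in both patterns can be 1 or 2.
        if da == "?" and db == "?":
            q *= 2
    return q

print(count_common("112??", "?122?"))
print(count_common("???22", "???11"))
```

This runs in O(K) time per pair, so it is polynomial for two patterns; the difficulty of the full problem comes from combining many patterns, not from comparing two.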
Unfortunately, this is impossible to do in polynomial time (unless P=NP, and maybe not even then.) Your problem is equivalent to the problem of counting satisfying assignments to a formula in Disjunctive Normal Form, called the DNF counting problem. DNF counting is Sharp-P-hard, so a polynomial time solution could be used to solve all problems in NP in polynomial time too (and more).
To see the relationship, note that each pattern is equivalent to an AND of several literals. If you take '1' in a position to be a literal, and '2' in that position to be that literal negated, you can convert each pattern to a conjunctive clause, and the whole input to the OR of those clauses.
For example:
1 1 2 ? ?
becomes
(x_1 ∧ x_2 ∧ ¬x_3)
? 1 2 2 ?
becomes
(x_2 ∧ ¬x_3 ∧ ¬x_4)
1 ? 2 ? 1
becomes
(x_1 ∧ ¬x_3 ∧ x_5)
The question of how many numbers satisfy at least one of these patterns is exactly the question of how many assignments satisfy at least one of the equivalent clauses.
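For small K the count can still be obtained by enumerating all 2^K digit strings, which is useful for checking the examples from the question. This is a brute-force sketch with illustrative function names; it is exponential in K and therefore does not contradict the hardness argument.

```python
from itertools import product

def matches(pattern: str, digits) -> bool:
    """A pattern generalizes a number if every visible digit agrees."""
    return all(p == "?" or p == d for p, d in zip(pattern, digits))

def count_generalized(patterns: list) -> int:
    """Count K-digit numbers over {1, 2} matched by at least one pattern."""
    k = len(patterns[0])
    return sum(
        any(matches(p, digits) for p in patterns)
        for digits in product("12", repeat=k)
    )

print(count_generalized(["112??", "?122?", "1?2?1"]))
print(count_generalized(["1??", "?2?"]))
```

Each candidate string plays the role of one truth assignment, and `any(...)` is the OR over the clauses.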
In scikit-learn's PolynomialFeatures preprocessor, there is an option to include_bias. This essentially just adds a column of ones to the dataframe. I was wondering what the point of having this was. Of course, you can set it to False. But theoretically how does having or not having a column of ones along with the Polynomial Features generated affect Regression.
This is the explanation in the documentation, but I can't seem to get anything useful out of it in relation to why it should be used or not.
include_bias : boolean
If True (default), then include a bias column, the feature in which
all polynomial powers are zero (i.e. a column of ones - acts as an
intercept term in a linear model).
Suppose you want to perform the following regression:
y ~ a + b x + c x^2
where x is a generic sample. The best coefficients a, b, c are computed via simple matrix calculus. First, let us denote by X = [1 | x | x^2] a matrix with N rows, where N is the number of samples. The first column is a column of 1s, the second column holds the values x_i for all samples i, and the third column holds the values x_i^2 for all samples i. Let us denote by B the column vector B = [a b c]^T. If y is the column vector of the N target values for all samples i, we can write the regression as
y ~ X B
The i-th row of this equation is y_i ~ [1 x_i x_i^2] [a b c]^T = a + b x_i + c x_i^2.
The goal of training a regression is to find B = [a b c]^T such that X B is as close as possible to y.
If you don't add a column of 1s, you are assuming a priori that a = 0, which might not be correct.
In practice, when you write Python code, and you use PolynomialFeatures together with sklearn.linear_model.LinearRegression, the latter takes care by default of adding a column of 1s (since in LinearRegression the fit_intercept parameter is True by default), so you don't need to add it as well in PolynomialFeatures. Therefore, in PolynomialFeatures one usually keeps include_bias=False.
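The situation is different if you use statsmodels.OLS instead of LinearRegression: OLS does not add an intercept column by default, so you have to supply the column of 1s yourself, for example by keeping include_bias=True.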
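The effect of the column of 1s can be seen with a plain least-squares fit. The sketch below assumes NumPy is available and uses made-up data generated from y = 2 + 3x + 0.5x^2; it builds the design matrix by hand rather than calling PolynomialFeatures, purely to show what the bias column changes.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 2 + 3 x + 0.5 x^2
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x + 0.5 * x**2

# Design matrix with the bias column: [1 | x | x^2]
X_with_bias = np.column_stack([np.ones_like(x), x, x**2])
# Design matrix without it: [x | x^2], which forces a = 0
X_no_bias = np.column_stack([x, x**2])

coef_with, *_ = np.linalg.lstsq(X_with_bias, y, rcond=None)
coef_without, *_ = np.linalg.lstsq(X_no_bias, y, rcond=None)

# Residual norms: how far X B ends up from y in each case
resid_with = np.linalg.norm(X_with_bias @ coef_with - y)
resid_without = np.linalg.norm(X_no_bias @ coef_without - y)

print(coef_with, resid_with)
print(coef_without, resid_without)
```

With the bias column the fit recovers a, b and c; without it the model must pass through the origin, so it cannot match the sample at x = 0 and the residual stays well above zero.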
Given a set A of n positive integers a1, a2, ..., an and another positive integer M, I'm going to find a subset of A whose sum is closest to M. In other words, I'm trying to find a subset A′ of A such that the absolute value |M - Σ_{a∈A′} a| is minimized, where Σ_{a∈A′} a is the total sum of the numbers in A′. I only need to return the sum of the elements of the solution subset A′, without reporting the actual subset A′.
For example, if we have A as {1, 4, 7, 12} and M = 15. Then, the solution subset is A′ = {4, 12}, and thus the algorithm only needs to return 4 + 12 = 16 as the answer.
The dynamic programming algorithm for the problem should run in
O(nK) time in the worst case, where K is the sum of all numbers of A.
You construct a Dynamic Programming table of size n*K where
D[i][j] = Can you get sum j using the first i elements ?
The recursive relation you can use is: D[i][j] = D[i-1][j-a[i]] OR D[i-1][j] This relation can be derived if you consider that ith element can be added or left.
Time complexity : O(nK) where K=sum of all elements
Lastly, you iterate over every possible sum you can get, i.e. D[n][j] for j=0..K. Whichever reachable j is closest to M will be your answer.
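A minimal sketch of this recurrence is shown below. As a simplification it keeps a single row of the table and updates it in place from high sums to low sums, which computes the same reachable set as the full n*K table; the function name is an illustrative choice.

```python
def closest_subset_sum(a: list, m: int) -> int:
    """Return the subset sum of a that is closest to m (O(n*K) time)."""
    total = sum(a)  # K in the text
    # reachable[j] plays the role of D[i][j]; only the empty sum exists at first.
    reachable = [True] + [False] * total
    for value in a:
        # Go downwards so each element is used at most once.
        for j in range(total, value - 1, -1):
            if reachable[j - value]:
                reachable[j] = True
    # Scan every reachable sum and keep the one nearest to m.
    return min(
        (j for j in range(total + 1) if reachable[j]),
        key=lambda j: abs(m - j),
    )

print(closest_subset_sum([1, 4, 7, 12], 15))
```

If two reachable sums are equally close to M, this sketch returns the smaller one; the problem statement does not say which to prefer.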
For a dynamic programming algorithm, we:
1) Define the value we will work on
The set of values here is actually a table.
For this problem, we define DP[i, j] as an indicator of whether we can obtain sum j using the first i elements (1 means yes, 0 means no).
Here 0 <= i <= n and 0 <= j <= K, where K is the sum of all elements in A.
2) Define the recursive relation
DP[i+1, j] = 1 if DP[i, j] == 1 or (j >= A[i+1] and DP[i, j-A[i+1]] == 1)
Otherwise, DP[i+1, j] = 0.
Don't forget to initialize the table to 0 first, and then set DP[0, 0] = 1, because the empty subset has sum 0. This handles the boundary and trivial cases.
3) Calculate the value you want
Through a bottom-up implementation, you can fill the whole table.
Now things become easy. You just need to find the value closest to M whose table entry is 1.
Here, just work on DP[n, j], since n covers the whole set. Find the j closest to M whose value is 1.
The time complexity is O(nK), since you iterate n*K times in total.
I have an issue with solver as follows (simplified version):
So I have a nested If statement that describes condition for 2 changing variables(x,y). For example:
In one cell: IF(AND((x<=2),(x>=0.5),(y<=10),(y>=5)),1,0)
The cell below it: IF(AND((x<=2.5),(x>=1.9),(y<=11),(y>=9)),1,0)
The objective function is the sum of these 2 cells.
Solver or Goal Seek (unless I give it the answer) can't seem to get an answer other than 0,0.
My actual problem is that I have 6 of these IF cells and I'm trying to find an (x,y) that maximizes my objective function. I want Excel to go through as many combinations as it can.
Any thoughts or other ways to do this? Thanks.
The reason that the Solver does not find the optimal solution in this toy problem is that the IF and AND statements make the problem non-convex. For non-convex problems, the GRG Nonlinear solution method (the default used by Solver) does not guarantee an optimal solution, as it can get trapped in locally best solutions that are not globally optimal.
Having said that, there is a way to formulate your problem as a mixed integer program, which, although still non-convex, can be solved with the "Simplex LP" method of Solver, and give a guaranteed maximum.
Model Setup
Here is a screenshot of the spreadsheet setup:
For convenience, I have used named ranges for the several quantities.
In particular:
- B2 --> x_var
- C2 --> x_UB1
- D2 --> x_LB1
- E2 --> x_UB2
- F2 --> x_LB2
and for row 3 I use the same convention, but instead of x_ we have y_.
The red cells (B4 and E4) have the conditions you described, and the blue cell (B5) has their sum.
For example, the condition for B4 reads
=IF(AND(x_var<=x_UB1,x_var>=x_LB1,y_var<=y_UB1,y_var>=y_LB1),1,0)
We are going to replace these expressions with two binary variables, which equal one if each expression is satisfied and zero otherwise.
The logic is that instead of an IF expression we can impose the constraints:
LB_x * z <= x <= UB_x * z
LB_y * z <= y <= UB_y * z
z is binary
then z = 1 ==> LB_x <= x <= UB_x
LB_y <= y <= UB_y
and because we maximize the sum of the two z variables, x and y will try to fit in the corresponding ranges so that as many z as possible equal 1.
The green cells H2, J2 have the two new binary variables, called cond1_true, cond2_true respectively. The other cells have the constraints described above:
For example, for the first expression:
J2: =x_var-cond1_true*x_UB1
J3: =y_var-cond1_true*y_UB1
K2: =x_LB1*cond1_true-x_var
K3: =y_LB1*cond1_true-y_var
All these cells need to be <= 0 in the solver model.
Solver Model:
In the model, the objective function cell is the sum of the binary variables. The decision variables are x_var, y_var, cond1_true, cond2_true. The constraints are all in expression <= 0 format. Here is the worksheet: https://www.dropbox.com/s/uek2k9gownhh3ni/excel-solver-is-there-a-way-to-iterate-over-2-changing-variables.xlsx?dl=0
Using this formulation, the solver goes through many combinations of variables and tries to pick up the best one. It can often guarantee an optimal solution (which is almost always the case for small problems)
UPDATE
If the intervals are non-overlapping, we need to modify
LB_x * z <= x <= UB_x * z
to
min(LB_x) * (1-z) + LB_x * z <= x <= UB_x * z + max(UB_x) * (1-z)
Where min(LB_x) is the minimum lower bound across all intervals (likewise for UB and for y). This way, if an x does not fall into the interval (z=0) it is only forced to fall in some other interval.
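To see what these relaxed constraints do, the model can be mimicked outside Excel by enumerating the binary variables over a coarse grid. This is only an illustrative sketch, not what Solver does internally: the bounds are taken from the two IF cells in the question, while the function names, the 0.1 grid step and the grid limits are assumptions.

```python
from itertools import product

# (x_LB, x_UB, y_LB, y_UB) for the two conditions in the question.
BOXES = [(0.5, 2.0, 5.0, 10.0), (1.9, 2.5, 9.0, 11.0)]

def feasible(x, y, zs, boxes=BOXES):
    """Check the relaxed constraints for every interval and its binary z."""
    x_lb_min = min(b[0] for b in boxes)
    x_ub_max = max(b[1] for b in boxes)
    y_lb_min = min(b[2] for b in boxes)
    y_ub_max = max(b[3] for b in boxes)
    for (x_lb, x_ub, y_lb, y_ub), z in zip(boxes, zs):
        # z = 1 enforces this interval; z = 0 only enforces the overall range.
        if not x_lb_min * (1 - z) + x_lb * z <= x <= x_ub * z + x_ub_max * (1 - z):
            return False
        if not y_lb_min * (1 - z) + y_lb * z <= y <= y_ub * z + y_ub_max * (1 - z):
            return False
    return True

def best_on_grid(boxes=BOXES):
    """Maximize sum(z) over all z in {0,1}^n and a 0.1-step grid of (x, y)."""
    best_score, best_point = -1, None
    for zs in product((0, 1), repeat=len(boxes)):
        for i, j in product(range(0, 31), range(0, 121)):
            x, y = i / 10, j / 10
            if feasible(x, y, zs, boxes) and sum(zs) > best_score:
                best_score, best_point = sum(zs), (x, y)
    return best_score, best_point

print(best_on_grid())
```

A real mixed integer solver searches this space far more cleverly, but the feasibility test is the same: a point counts towards the objective only for the intervals whose z is set to 1.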
I hope this helps!
I am reading about string algorithms in Introduction to Algorithms by Cormen et al.
The following is text about some elementary number-theoretic notation.
Note: In the text below, read == as modular equivalence.
Given a well-defined notion of the remainder of one integer when divided by another, it is convenient to provide special notation to indicate equality of remainders. If (a mod n) = (b mod n), we write a == b (mod n) and say that a is equivalent to b, modulo n. In other words, a == b (mod n) if a and b have the same remainder when divided by n. Equivalently, a == b (mod n) if and only if n | (b - a).
For example, 61 == 6 (mod 11). Also, -13 == 22 == 2 (mod 5).
The integers can be divided into n equivalence classes according to their remainders modulo n. The equivalence class modulo n containing an integer a is
[a]n = {a + kn : k ∈ Z}.
For example, [3]7 = {. . . , -11, -4, 3, 10, 17, . . .}; other denotations for this set are [-4]7 and [10]7.
Writing a belongs to [b]n is the same as writing a == b (mod n). The set of all such equivalence classes is
Zn = {[a]n : 0 <= a <= n - 1}.----------> Eq 1
My question about the text above: in equation 1 it is mentioned that "a" should be between 0 and n-1, but in example it is given as -4 which is not between 0 and 6, why?
In addition to above it is mentioned that for Rabin-Karp algorithm we use equivalence of two numbers modulo a third number? What does this mean?
I'll try to nudge you in the right direction, even though it's not about programming.
The example with -4 in it is an example of an equivalence class, which is a set of all numbers equivalent to a given number. Thus, in [3]7, all numbers are equivalent (modulo 7) to 3, and that includes -4 as well as 17 and 710 and an infinity of others.
You could also name the same class [10]7, because every number that is equivalent (modulo 7) to 3 is at the same time equivalent (modulo 7) to 10.
The last definition gives the set of all distinct equivalence classes, and states that for modulo 7 there are exactly 7 of them, which can be produced by the numbers from 0 to 6. You could also say
Zn = {[a]n : n <= a < 2 * n}
and the meaning will remain the same, since [0]7 is the same thing as [7]7, and [6]7 is the same thing as [13]7.
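These identities are easy to check numerically. The sketch below relies on Python's % operator, which returns a result in 0..n-1 for a positive modulus (unlike the remainder operator in some other languages); the helper names are illustrative.

```python
def representative(a: int, n: int) -> int:
    """Canonical name of the class [a]n: the member lying in 0..n-1."""
    return a % n

def sample_members(a: int, n: int, ks=range(-2, 3)) -> list:
    """A few members a + k*n of the class [a]n."""
    return [a + k * n for k in ks]

print(sample_members(3, 7))                            # part of [3]7
print(representative(-4, 7), representative(10, 7))    # same class as [3]7
print(sorted({representative(a, 7) for a in range(7, 14)}))
```

The last line shows that the numbers 7..13 name the same seven classes as 0..6, which is why either range can be used in the definition.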
This is not a programming question, but never mind...
it is mentioned that "a" should be between 0 and n-1, but in example it is given as -4 which is not between 0 and 6, why?
Because [-4]n is the same equivalence class as [x]n for some x such that 0 <= x < n. So equation 1 takes advantage of the fact to "neaten up" the definition and make all the possibilities distinct.
In addition to above it is mentioned that for Rabin-Karp algorithm we use equivalence of two numbers modulo a third number? What does this mean?
The Rabin-Karp algorithm requires you to calculate a hash value for the substring you are searching for. When hashing, it is important to use a hash function that uses the whole of the available domain even for quite small strings. If your hash is a 32-bit integer and you just add the successive Unicode values together, your hash will usually be quite small, resulting in lots of collisions.
So you need a function that can give you large answers. Unfortunately, this also exposes you to the possibility of integer overflow. Hence you use modulo arithmetic to keep the comparisons from being messed up by overflow.
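A minimal rolling-hash sketch shows where the modulo comes in. The base 256 and the modulus 1_000_000_007 are illustrative assumptions, not values prescribed by the book. Python integers do not overflow, so here the modulo only keeps the numbers small; in a language with fixed-width integers it is what keeps the arithmetic well defined.

```python
def rabin_karp(text: str, pattern: str, base: int = 256,
               modulus: int = 1_000_000_007) -> list:
    """Return the start indices of every occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, modulus)  # weight of the window's leading character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % modulus
        t_hash = (t_hash * base + ord(text[i])) % modulus
    found = []
    for start in range(n - m + 1):
        # Equal hashes only mean "equivalent modulo modulus", so confirm the match.
        if p_hash == t_hash and text[start:start + m] == pattern:
            found.append(start)
        if start < n - m:
            # Slide the window: drop text[start], append text[start + m].
            t_hash = ((t_hash - ord(text[start]) * high) * base
                      + ord(text[start + m])) % modulus
    return found

print(rabin_karp("abracadabra", "abra"))
```

Two windows with the same hash are only equivalent modulo the chosen number, which is why the direct string comparison is still needed before reporting a match.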