Let L1 and L2 be the regular languages over the alphabet {a,b}. We define the language L3 as follows:
L3 = {pqr | pr ∈ L1, q ∈ L2}
L3 is obtained by inserting a string from L2 inside a string from L1. Is language L3 is still regular or not?
I am trying to solve this problem. Is it possible to prove this using string substitution property or homomorphism of regular language? Is there any better and easier way to prove this?
Here is a high-level description of a construction to show there is an NFA accepting L3.
Let M1 and M2 be minimal DFAs such that L(M1) = L1 and L(M2) = L2. Copy M1 so there are two copies, M1[1] and M1[2]. Copy M2 so there are |Q1| copies M2[1], M2[2], …, M2[|Q1|]. Also, number the states q1, q2, …, q|Q1| of M1. Now construct an NFA for M3 as follows:
From state qk of M1[1], add a lambda/epsilon transition to the initial state of M2[k]
From the accepting state(s) of M2[k], add lambda/epsilon transitions to state qk of M1[2]
The accepting states are the accepting states of M1[2]
The initial state is the initial state of M1[1].
This NFA reads some input and then nondeterministically jumps to an instance of M2. It then reads a string in M2 and jumps back to where it left off in the next copy of M1 where it can accept. This NFA has a number of states equal to 2|Q1| + |Q1| * |Q2|.
I am dealing with a problem which is a variant of a subset-sum problem, and I am hoping that the additional constraint could make it easier to solve than the classical subset-sum problem. I have searched for a problem with this constraint but I have been unable to find a good example with an appropriate algorithm either on StackOverflow or through googling elsewhere.
The problem:
Assume you have two lists of positive numbers A1,A2,A3... and B1,B2,B3... with the same number of elements N. There are two sums Sa and Sb. The problem is to find the simultaneous set Q where |sum (A{Q}) - Sa| <= epsilon and |sum (B{Q}) - Sb| <= epsilon. So, if Q is {1, 5, 7} then A1 + A5 + A7 - Sa <= epsilon and B1 + B5 + B7 - Sb <= epsilon. Epsilon is an arbitrarily small positive constant.
Now, I could solve this as two completely separate subset sum problems, but removing the simultaneity constraint results in the possibility of erroneous solutions (where Qa != Qb). I also suspect that the additional constraint should make this problem easier than the two NP complete problems. I would like to solve an instance with 18+ elements in both lists of numbers, and most subset-sum algorithms have a long run time with this number of elements. I have investigated the pseudo-polynomial run time dynamic programming algorithm, but this has the problems that a) the speed relies on a short bit-depth of the list of numbers (which does not necessarily apply to my instance) and b) it does not take into account the simultaneity constraint.
Any advice on how to use the simultaneity constraint to reduce the run time? Is there a dynamic programming approach I could use to take into account this constraint?
If I understand your description of the problem correctly (I'm confused about why you have the distance symbols around "sum (A{Q}) - Sa" and "sum (B{Q}) - Sb", it doesn't seem to fit the rest of the explanation), then it is in NP.
You can see this by making a reduction from Subset sum (SUB) to Simultaneous subset sum (SIMSUB).
If you have a SUB problem consisting of a set X = {x1,x2,...,xn} and a target called t and you have an algorithm that solves SIMSUB when given two sets A = {a1,a2,...,an} and B = {b1,b2,...,bn}, two intergers Sa and Sb and a value for epsilon then we can solve SUB like this:
Let A = X and let B be a set of length n consisting of only 0's. Set Sa = t, Sb = 0 and epsilon = 0. You can now run the SIMSUB algorithm on this problem and get the solution to your SUB problem.
This shows that SUBSIM is as least as hard as SUB and therefore in NP.
I am reading the book: introduction to the theory of computation and got stuck on this example.
Convert a DFA to an equivalent expression by converting it first to a GNFA(generalized nondeterministic finite automaton) and then convert GNFA to a regular expression.
here is the example:
enter image description here
I should use this recursively to arrive at the the fourth state:
enter image description here
Unfortunately, I cannot understand what is going on from b to c? I only understand that we are trying to get rid of state 2, but how we arrive at c from b?
Thank you very much!
This can be quite tricky at first but I suggest you check definition 1.64 and see the function CONVERT(G) for more clearance. But as a brief explanation using the function for each possible neighbour state:
First from a to b, add a start state and a new accept state;
Afterwards you need to calculate each new path after qrip is removed, in this case state 1;
So, from start to q2, you get only label a from epsilon and a;
Same goes from start to q3, resulting only in b;
Now from q2 to q2 going trough qrip, you have label a to qrip and label a to get back, so you get (aa U b);
Same goes to q3 to q3 through qrip, so resulting in bb, notice that there is no loop in q3 so no union;
Now from q2 to q3 through qrip, you only need to concatenate a and b resulting in ab label;
Lastly the other way around, from q3 to q2 going through qrip, concatenate b and a resulting in ba but this time making the union with the previous path between q3 and q2;
Now choose a new qrip and proceed to do the same algorithm again.
Hope the explanation was clear enough, but as said before refer to the algorithm in the book for a better and more detailed explanation.
The two popular methods for converting a given DFA to its regular expression are-
Arden’s Method
State Elimination Method
Arden’s Theorem states that:
Let P and Q be two regular expressions over ∑.
To use Arden’s Theorem, the following conditions must be satisfied-
The transition diagram must not have any ∈ transitions.
There must be only a single initial state.
Step-01:
Form an equation for each state considering the transitions which come towards that state.
Add ‘∈’ in the equation of the initial state.
Step-02:
Bring the final state in the form R = Q + RP to get the required regular expression.
If P does not contain a null string ∈, then-
R = Q + RP has a unique solution i.e. R = QP*
I have a little alloy specification as follows:
sig class {parents : set class}
fact f1{all p:class | not p in p.^parents}
run{} for exactly 4 class
First, I thought alloy would translate f1 into boolean matrix, then perform closure operation on it. But it seems it does not do this kind of translation (it looks like it runs something before boolean matrix creation.). So how exactly this f1 gets translated? Does it use relation predicate? I am just very curious about alloy's translation.
Boolean matrices are used to represent expression in Alloy. So, you start with a unary matrix for each sig, a binary matrix for each binary relation, a ternary matrix for each ternary relation, and so on. Then, translation of "complex" expression (e.g., involving relational algebra operators) is done by manipulating (composing) those matrices you started with. For each Alloy operator (e.g., transitive closure (^), relational join (.), in, not, etc.) there is a corresponding algorithm that performs a bunch of matrix operations such that the semantics of that operator is correctly implemented.
So in this example, the all quantifier is first unrolled, meaning that for each atom p of type class the body is translated (something like:
m0 = matrix(p) //returns matrix corresponding to p
m1 = matrix(parents) //returns matrix corresponding to parents
m2 = ^m1
m3 = m0.m2
m4 = m0 in m3
m5 = not m4
), and finally, all those body translations are AND'ed.
My program (Hartree-Fock/iterative SCF) has two matrices F and F' which are really the same matrix expressed in two different bases. I just lost three hours of debugging time because I accidentally used F' instead of F. In C++, the type-checker doesn't catch this kind of error because both variables are Eigen::Matrix<double, 2, 2> objects.
I was wondering, for the Haskell/ML/etc. people, whether if you were writing this program you would have constructed a type system where F and F' had different types? What would that look like? I'm basically trying to get an idea how I can outsource some logic errors onto the type checker.
Edit: The basis of a matrix is like the unit. You can say 1L or however many gallons, they both mean the same thing. Or, to give a vector example, you can say (0,1) in Cartesian coordinates or (1,pi/2) in polar. But even though the meaning is the same, the numerical values are different.
Edit: Maybe units was the wrong analogy. I'm not looking for some kind of record type where I can specify that the first field will be litres and the second gallons, but rather a way to say that this matrix as a whole, is defined in terms of some other matrix (the basis), where the basis could be any matrix of the same dimensions. E.g., the constructor would look something like mkMatrix [[1, 2], [3, 4]] [[5, 6], [7, 8]] and then adding that object to another matrix would type-check only if both objects had the same matrix as their second parameters. Does that make sense?
Edit: definition on Wikipedia, worked examples
This is entirely possible in Haskell.
Statically checked dimensions
Haskell has arrays with statically checked dimensions, where the dimensions can be manipulated and checked statically, preventing indexing into the wrong dimension. Some examples:
This will only work on 2-D arrays:
multiplyMM :: Array DIM2 Double -> Array DIM2 Double -> Array DIM2 Double
An example from repa should give you a sense. Here, taking a diagonal requires a 2D array, returns a 1D array of the same type.
diagonal :: Array DIM2 e -> Array DIM1 e
or, from Matt sottile's repa tutorial, statically checked dimensions on a 3D matrix transform:
f :: Array DIM3 Double -> Array DIM2 Double
f u =
let slabX = (Z:.All:.All:.(0::Int))
slabY = (Z:.All:.All:.(1::Int))
u' = (slice u slabX) * (slice u slabX) +
(slice u slabY) * (slice u slabY)
in
R.map sqrt u'
Statically checked units
Another example from outside of matrix programming: statically checked units of dimension, making it a type error to confuse e.g. feet and meters, without doing the conversion.
Prelude> 3 *~ foot + 1 *~ metre
1.9144 m
or for a whole suite of SI units and quanities.
E.g. can't add things of different dimension, such as volumes and lengths:
> 1 *~ centi litre + 2 *~ inch
Error:
Expected type: Unit DVolume a1
Actual type: Unit DLength a0
So, following the repa-style array dimension types, I'd suggest adding a Base phantom type parameter to your array type, and using that to distinguish between bases. In Haskell, the index Dim
type argument gives the rank of the array (i.e. its shape), and you could do similarly.
Or, if by base you mean some dimension on the units, using dimensional types.
So, yep, this is almost a commodity technique in Haskell now, and there's some examples of designing with types like this to help you get started.
This is a very good question. I don't think you can encode the notion of a basis in most type systems, because essentially anything that the type checker does needs to be able to terminate, and making judgments about whether two real-valued vectors are equal is too difficult. You could have (2 v_1) + (2 v_2) or 2 (v_1 + v_2), for example. There are some languages which use dependent types [ wikipedia ], but these are relatively academic.
I think most of your debugging pain would be alleviated if you simply encoded the bases in which you matrix works along with the matrix. For example,
newtype Matrix = Matrix { transform :: [[Double]],
srcbasis :: [Double], dstbasis :: [Double] }
and then, when you M from basis a to b with N, check that N is from b to c, and return a matrix with basis a to c.
NOTE -- it seems most people here have programming instead of math background, so I'll provide short explanation here. Matrices are encodings of linear transformations between vector spaces. For example, if you're encoding a rotation by 45 degrees in R^2 (2-dimensional reals), then the standard way of encoding this in a matrix is saying that the standard basis vector e_1, written "[1, 0]", is sent to a combination of e_1 and e_2, namely [1/sqrt(2), 1/sqrt(2)]. The point is that you can encode the same rotation by saying where different vectors go, for example, you could say where you're sending [1,1] and [1,-1] instead of e_1=[1,0] and e_2=[0,1], and this would have a different matrix representation.
Edit 1
If you have a finite set of bases you are working with, you can do it...
{-# LANGUAGE EmptyDataDecls #-}
data BasisA
data BasisB
data BasisC
newtype Matrix a b = Matrix { coefficients :: [[Double]] }
multiply :: Matrix a b -> Matrix b c -> Matrix a c
multiply (Matrix a_coeff) (Matrix b_coeff) = (Matrix multiplied) :: Matrix a c
where multiplied = undefined -- your algorithm here
Then, in ghci (the interactive Haskell interpreter),
*Matrix> let m = Matrix [[1, 2], [3, 4]] :: Matrix BasisA BasisB
*Matrix> m `multiply` m
<interactive>:1:13:
Couldn't match expected type `BasisB'
against inferred type `BasisA'
*Matrix> let m2 = Matrix [[1, 2], [3, 4]] :: Matrix BasisB BasisC
*Matrix> m `multiply` m2
-- works after you finish defining show and the multiplication algorithm
While I realize this does not strictly address the (clarified) question – my apologies – it seems relevant at least in relation to Don Stewart's popular answer...
I am the author of the Haskell dimensional library that Don referenced and provided examples from. I have also been writing – somewhat under the radar – an experimental rudimentary linear algebra library based on dimensional. This linear algebra library statically tracks the sizes of vectors and matrices as well as the physical dimensions ("units") of their elements on a per element basis.
This last point – tracking physical dimensions on a per element basis – is rather challenging and perhaps overkill for most uses, and one could even argue that it makes little mathematical sense to have quantities of different physical dimensions as elements in any given vector/matrix. However, some linear algebra applications of interest to me such as kalman filtering and weighted least squares estimation typically use heterogeneous state vectors and covariance matrices.
Using a Kalman filter as an example, consider a state vector x = [d, v] which has physical dimensions [L, LT^-1]. The next (future) state vector is predicted by multiplication by the state transition matrix F, i.e.: x' = F x_. Clearly for this equation to make sense F cannot be arbitrary but must have size and physical dimensions [[1, T], [T^-1, 1]]. The predict_x' function below statically ensures that this relationship holds:
predict_x' :: (Num a, MatrixVector f x x) => Mat f a -> Vec x a -> Vec x a
predict_x' f x_ = f |*< x_
(The unsightly operator |*< denotes multiplication of a matrix on the left with a vector on the right.)
More generally, for an a priori state vector x_ of arbitrary size and with elements of arbitrary physical dimensions, passing a state transition matrix f with "incompatible" size and/or physical dimensions to predict_x' will cause a compile time error.
In F# (which originally evolved from OCaml), you can use units of measure. Andrew Kenned, who designed the feature (and also created a very interesting theory behind it) has a great series of articles that demonstrate it.
This can quite likely be used in your scenario - although I don't fully understand the question. For example, you can declare two unit types like this:
[<Measure>] type litre
[<Measure>] type gallon
Adding litres and gallons gives you a compile time error:
1.0<litre> + 1.0<gallon> // Error!
F# doesn't automatically insert conversion between different units, but you can write a conversion function:
let toLitres gal = gal * 3.78541178<litre/gallon>
1.0<litre> + (toLitres 1.0<gallon>)
The beautiful thing about units of measure in F# is that they are automatically inferred and functions are generic. If you multiply 1.0<gallon> * 1.0<gallon>, the result is 1.0<gallon^2>.
People have used this feature for various things - ranging from conversion of virtual meters to screen pixels (in solar system simulations) to converting currencies (dollars in financial systems). Although I'm not expert, it is quite likely that you could use it in some way for your problem domain too.
If it's expressed in a different base, you can just add a template parameter to act as the base. That will differentiate those types. A float is a float is a float- if you don't want two float values to be the same if they actually have the same value, then you need to tell the type system about it.