How to put two elements in the same sequence? - constraint-programming

In the flexible job shop example model for CP Optimizer, I want to enforce that if a specific mode/element is placed in a sequence, its successor must be placed in the same sequence, which in this case means that both have to be done on the same machine. How can I write such a constraint in the subject to block?
Thank you in advance for your help!
dvar sequence mchs[m in Mchs] in all(md in Modes: md.mch == m) modes[md];
minimize max(j in Jobs, o in Ops: o.pos==jlast[j]) endOf(ops[o]);
subject to {....}

If op_1 and op_2 denote the two operations that must be allocated to the same machine, and if mode_1_i and mode_2_i denote the optional interval variables representing the allocation of op_1 (resp. op_2) on machine i, then all you need to do is post, for each machine i, the constraint presenceOf(mode_1_i)==presenceOf(mode_2_i).

Related

How to create a 100% declarative model?

I am creating a simple model of a network. The network contains nodes. Nodes send and receive data.
Here's one way to model the network: Each node has a "data" field representing the data possessed by the node at time t. Each node also has a "send" field recording the data sent to other nodes at time t.
sig Node {
data: Data -> Time,
send: Data -> Node -> Time
}
In the spectrum between 100% declarative and 100% imperative, I don't think that that signature is at the 100% declarative side. In my first paragraph I said nothing about nodes having data stores, nothing about keeping a record of sent data.
Also, isn't "send" a verb? Isn't that a sign that the model is not as declarative as it could be? Shouldn't declarative models exclusively use nouns?
I want my model to be at the 100% declarative end of the spectrum. To achieve that, I must simply state what "is". Let's do it!
What is is there are nodes:
sig Node {}
What is is that at any time t, a node has data.
sig Node {
data: Data -> Time
}
What is is that the network has wires between nodes.
sig Network {
wire: Node -> Node
}
What is is that data d is on a wire between time t and t' ...
Let me stop there. Is the second approach that I sketched out more declarative? Is there an approach that is even more declarative?
How to model a network in a 100% declarative manner?
I think you need to separate the semantic content of the model, which is one issue, from the names used in it. To me, the essence of a declarative model is that it records observations -- which may be about state, or about dynamic things like transitions -- expressed in logic. The alternative, an operational model, is to describe behavior and states by building up sequences of primitive actions. A simple example: modeling an action that picks an element non-deterministically from a set. An operational spec might treat the set as ordered, and then use a loop to walk through the set, at each step tossing a coin, and returning the given element when the toss first comes up heads. This models the intuitive operational description: "go through the elements of the set one at a time, and pick one to return". A declarative spec would simply say that the returned value is an element of the set -- nothing else needs to be said.
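To make the contrast concrete, here is a minimal Python sketch of the two styles for the "pick an element" example. The function names are hypothetical, and wrapping around when no toss comes up heads is an assumption the description above does not spell out.

```python
import random


def pick_operational(elements, rng):
    """Operational style: order the set, walk it, toss a coin at each step."""
    ordered = sorted(elements)
    if not ordered:
        raise ValueError("cannot pick from an empty set")
    while True:  # assumption: start over if no toss came up heads
        for x in ordered:
            if rng.random() < 0.5:  # heads
                return x


def satisfies_spec(elements, result):
    """Declarative style: the only thing stated is that the result is in the set."""
    return result in elements


s = {1, 2, 3}
print(satisfies_spec(s, pick_operational(s, random.Random(0))))
```

The declarative version says nothing about ordering, loops, or coins; any implementation whose result passes `satisfies_spec` is acceptable.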
Regarding the use of names in your particular model, it seems to me that the larger issue is whether the names directly convey what they mean. So send is, to me, not a very helpful name; a better one would be sentAt.

How to determine whether a given language is regular or not (by just looking at the language)?

Is there any trick to guess if a language is regular by just looking at the language?
In order to choose a proof method, I first need some hypothesis. Do you know any hints or patterns that reduce the time spent on long questions?
For instance, I don't want to spend time on the pumping lemma when the language is actually regular, and I don't want to construct a DFA or grammar just to find out.
For example:
1. L={w ∈ {a,b}*/no of a in (w) < no of b in (w)}
2. L={a^nb^m/n,m>=0}
How to tell which is regular by just looking at the above examples??
In general, when looking at a language, a good rule of thumb for whether the language is regular or not is to think of a program that can read a string and answer the question "is this string in the language?"
To write such a program, do you need to store some arbitrary value in a variable or is the program's state (that is, the combination of all possible variables' values) limited to some finite fixed number of possibilities? If the language can be recognized by a program that only needs a fixed number of variables that can only have a fixed number of values, then you've got a regular language. If not, then not.
Using this, I can see that the first language is not regular, but the second language is. In the first language, I need to remember how many as I've seen, and how many bs. (Or at the very least, I need to keep track of (# of as) - (# of bs), and accept if the string ends while that count is negative). At the same time, there's no limit on the number of as, so this count could go arbitrarily large.
In the second language, I don't care what n and m are at all. So with the second language, my program would just keep track of "have I seen at least one b yet?" to make sure we don't have any a characters that occur after the first b. (So, one variable with only two values - true or false)
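As a rough illustration of the "write a program" test, the two recognizers below are a Python sketch (the function names are made up for this example). The second language needs a single boolean, while the first needs a counter that can grow without bound.

```python
def in_language_2(w: str) -> bool:
    """Accept a^n b^m (n, m >= 0): one boolean of state is enough."""
    seen_b = False  # have we seen at least one 'b' yet?
    for ch in w:
        if ch == "b":
            seen_b = True
        elif ch == "a":
            if seen_b:
                return False  # an 'a' after the first 'b'
        else:
            return False  # symbol outside {a, b}
    return True


def in_language_1(w: str) -> bool:
    """Accept strings with fewer a's than b's: needs an unbounded counter."""
    diff = 0  # (# of a's) - (# of b's); no fixed bound on its size
    for ch in w:
        if ch == "a":
            diff += 1
        elif ch == "b":
            diff -= 1
        else:
            return False
    return diff < 0
```

The fact that `diff` can take arbitrarily many values is only a hint that language 1 is not regular; an actual proof still needs the pumping lemma or a similar argument.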
So one way to make language 1 into a regular language is to change it to be:
1. L={w ∈ {a,b}*/no of a in (w) < no of b in (w), and no of a in (w) < 100}
Now I don't need to keep track of the number of as that I've seen once I hit 100 (since then I know automatically that the string isn't in the language), and likewise with the number of bs - once I hit 100, I can stop counting because I know that'll be enough unless the number of as is itself too large.
One common case you should watch out for with this is when someone asks you about languages where "number of as is a multiple of 13" or "w ∈ {0,1}* and w is the binary representation of a multiple of 13". With these, it might seem like you need to keep track of the whole number to make the determination, but in fact you don't - in both cases, you only need to keep a variable that can count from 0 to 12. So watch out for "multiple of"-type languages. (And the related "is odd" or "is even" or "is 1 more than a multiple of 13")
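For the binary "multiple of 13" case, a small Python sketch (the function name is hypothetical) shows why 13 states suffice: reading one more bit doubles the value and adds the bit, so only the remainder modulo 13 has to be kept.

```python
def is_binary_multiple_of_13(w: str) -> bool:
    """Check whether the binary string w encodes a multiple of 13."""
    remainder = 0  # always in 0..12, so the state space is finite
    for ch in w:
        if ch not in ("0", "1"):
            return False  # symbol outside {0, 1}
        # appending a bit maps value v to 2*v + bit
        remainder = (2 * remainder + int(ch)) % 13
    # convention assumed here: the empty string counts as 0
    return remainder == 0
```

Each of the 13 possible remainders corresponds to one DFA state, with remainder 0 as the accepting state.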
Other mathematical properties though - for example, w ∈ {0,1}* and w is the binary representation of a perfect square - will result in non-regular languages.

Modelling a vending box using Alloy

I am trying to model a vending machine using Alloy. I wish to create a model in which I can insert some money and give the machine a selection for an item, and it provides me that item; if the money supplied is less than the price, nothing should be provided.
Here I am trying to take a coin along with a button as input, and the machine should return the desired item provided that the amount assigned to that item has been inserted. Button a should require 10 Rs, button b 5 Rs, c 1 Re and d 2 Rs. In my model, op is the item returned once the required money is inserted, opc is the balance of coins to be returned, ip is the input button and x is the money input. How can I write the model so that it accepts multiple coins as input and, if the amount is greater than the item cost, returns a number of coins as change? Any help would be greatly appreciated.
If I were you, I'd proceed by asking myself what kinds of entities I care about; you've done that (signatures for coins and items -- do you also need some notion of a customer?).
Next, I'd ask myself what constitutes a legal state for the system -- sometimes it helps to think about it backwards by asking what would constitute an illegal or unacceptable state.
Then I'd try to define operations -- you've already mentioned insertion of money and selection of an item -- as transitions from one legal state of the system to the next.
At each stage I'd use the Analyzer to examine instances of the model and see whether what I'd done so far makes sense. One example of this pattern of defining entities, states, and state transitions in that order is given in the Whirlwind Tour chapter of Daniel Jackson's Software Abstractions -- if you have access to that book, you will find it helpful to review that chapter.
Good luck!
module vending_machines
open util/ordering[Event]
fun fst:Event{ordering/first}
fun nxt:Event->Event{ordering/next}
fun upto[e:Event]:set Event{prevs[e]+e}
abstract sig Event{}
sig Coin extends Event{}
pred no_vendor_loss[product:set (Event-Coin)]
{
all e:Event | let pfx=upto[e] | #(product&pfx)<=#(Coin&pfx)
}

How to understand and add syllable break in this example?

I am new to machine learning and computing probabilities. This is an example from LingPipe for adding syllabification to a word using training data.
We are given a source model p(h) for hyphenated words, and a channel model p(w|h) defined so that p(w|h) = 1 if w is equal to h with the hyphens removed and 0 otherwise. We then seek to find the most likely source message h to have produced message w by:
ARGMAX_h p(h|w) = ARGMAX_h p(w|h) p(h) / p(w)
= ARGMAX_h p(w|h) p(h)
= ARGMAX_{h : strip(h)=w} p(h)
where we use strip(h) = w to mean that w is equal to h with the hyphenations stripped out (in Java terms, h.replaceAll(" ","").equals(w)). Thus with a deterministic channel, we wind up looking for the most likely hyphenation h according to p(h), restricting our search to h that produce w when the hyphens are stripped out.
I do not understand how to use it to build a syllabification model.
If there is a training set containing:
a bid jan
a bide
a bie
a bil i ty
a bim e lech
How do I build a model that will syllabify words? I mean, what has to be computed in order to find the possible syllable breaks of a new word?
What is computed first, and what next? Can you please be specific, with an example?
Thanks a lot.
The method described in the article is based on a statistical law (the noisy channel model) that lets you recover the most probable correct value from an observed noisy value. In other words, the non-syllabified word, like picnic, is treated as the noisy or incorrect observation, and the goal is to find the most probable correct value, which is pic-nic.
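The following Python sketch walks through that computation on the five training lines from the question. It is not LingPipe's implementation: the source model p(h) here is a toy unigram model over syllables with a made-up floor `UNSEEN` for unknown syllables, and all names are hypothetical. The steps are: estimate p(h) from the training data, enumerate every h with strip(h) = w, and return the one with the highest p(h).

```python
from collections import Counter
from itertools import product
from math import prod

TRAINING = ["a bid jan", "a bide", "a bie", "a bil i ty", "a bim e lech"]

# Step 1: estimate the source model from syllable counts (toy assumption).
counts = Counter(s for line in TRAINING for s in line.split())
total = sum(counts.values())
UNSEEN = 1e-6  # assumed probability for syllables not in the training data


def p(h: str) -> float:
    """Toy source model: product of per-syllable relative frequencies."""
    return prod(counts[s] / total if s in counts else UNSEEN for s in h.split())


def strip(h: str) -> str:
    """Remove the syllable breaks (spaces) from h."""
    return h.replace(" ", "")


def candidates(w: str):
    """Step 2: every h with strip(h) == w (w is assumed non-empty)."""
    for breaks in product([False, True], repeat=len(w) - 1):
        h = w[0]
        for ch, brk in zip(w[1:], breaks):
            h += (" " if brk else "") + ch
        yield h


def best_hyphenation(w: str) -> str:
    """Step 3: the candidate with the highest source probability."""
    return max(candidates(w), key=p)


print(best_hyphenation("abid"))
```

Enumerating all 2^(n-1) candidates is only practical for short words; a real system would use a character-level language model and a smarter search.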
Here is an excellent video lesson on this very topic (scroll to 1:25, but the whole set of lectures is worth watching).
This method is specifically useful for word delimiting, but some use it for syllabification as well. Chinese language has space delimiters only for logical constructs, but most words follow each other with no delimiters. However, each character is a syllable, no exception.
There are other languages that have more complicated grammar. For instance, Thai has no spaces between the words, but each syllable may be constructed from several symbols, e.g. สวัสดี -> ส-วัส-ดี. Rule-based syllabification may be hard but possible.
As for English, I would not bother with Markov chains and N-grams and would instead just use several simple rules that give a pretty good match ratio (not perfect, however):
Two consonants between two vowels VCCV - split between them VC-CV as in cof-fee, pic-nic, except the "cluster consonant" that represents a single sound: meth-od, Ro-chester, hang-out
Three or more consonants between the vowels VCCCV - split keeping the blends together as in mon-ster or child-ren (this seems the most difficult as you cannot avoid a dictionary)
One consonant between two vowels VCV - split after the first vowel V-CV as in ba-con, a-rid
The rule above also has an exception based on blends: cour-age, play-time
Two vowels together VV - split between, except they represent a "cluster vowel": po-em, but glacier, earl-ier
I would start with the "main" rules first, and then cover them with "guard" rules that prevent cluster vowels and cluster consonants from being split. There would also be an obvious guard rule to prevent a single consonant from becoming a syllable. When done, I would add another guard rule based on a dictionary.
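A rough Python sketch of the main rules plus the cluster-consonant guard could look like this. The `CLUSTERS` set, treating y as a vowel, and placing the break after a cluster are all assumptions made for the example; vowel pairs are never split and silent e is not handled, so words such as po-em or play-time come out wrong.

```python
VOWELS = set("aeiouy")  # assumption: y is treated as a vowel
CLUSTERS = {"ch", "sh", "th", "ph", "ng", "ck"}  # assumed single-sound pairs


def syllabify(word: str) -> str:
    """Insert hyphens using the VCV, VCCV and VCCCV rules with a cluster guard."""
    w = word.lower()
    cuts = []  # a hyphen goes before w[c] for each c in cuts
    prev_end = None  # index just past the previous run of vowels
    i, n = 0, len(w)
    while i < n:
        if w[i] not in VOWELS:
            i += 1
            continue
        j = i
        while j < n and w[j] in VOWELS:  # a run of vowels is never split here
            j += 1
        if prev_end is not None:
            cons = w[prev_end:i]  # consonants between two vowel runs
            if len(cons) == 1:
                cuts.append(prev_end)  # V-CV
            elif cons[:2] in CLUSTERS:
                cuts.append(prev_end + 2)  # guard: keep the cluster together
            else:
                cuts.append(prev_end + 1)  # VC-CV and VC-CCV
        prev_end = j
        i = j
    parts, start = [], 0
    for c in cuts:
        parts.append(w[start:c])
        start = c
    parts.append(w[start:])
    return "-".join(parts)


for word in ["coffee", "picnic", "bacon", "method", "monster"]:
    print(word, "->", syllabify(word))
```

Because every cut is placed between two vowel runs, no piece can consist of a single consonant, which covers the "obvious" guard rule; the dictionary-based guard is left out.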

cluster short, homogeneous strings (DNA) according to common sub-patterns and extract consensus of classes

Task:
to cluster a large pool of short DNA fragments in classes that share common sub-sequence-patterns and find the consensus sequence of each class.
Pool: ca. 300 sequence fragments
8 - 20 letters per fragment
4 possible letters: a,g,t,c
each fragment is structured in three regions:
5 generic letters
8 or more positions of g's and c's
5 generic letters
(As regex that would be [gcta]{5}[gc]{8,}[gcta]{5})
Plan:
to perform a multiple alignment (i.e. with ClustalW2) to find classes that share common sequences in region 2 and their consensus sequences.
Questions:
Are my fragments too short, and would it help to increase their size?
Is region 2 too homogeneous, with only two allowed letter types, for showing patterns in its sequence?
Which alternative methods or tools can you suggest for this task?
Best regards,
Simon
Yes, 300 is FAR TOO FEW considering that this is the human genome and you're essentially just looking for a particular 8-mer. There are 65,536 possible 8-mers and 3,000,000,000 unique bases in the genome (assuming you're looking at the entire genome and not just genic or coding regions). You'll find G/C containing sequences 3,000,000,000 / 65,536 * 2^8 =~ 12,000,000 times (and probably much more since the genome is full of CpG islands compared to other things). Why only choose 300?
You don't want to use regexes for this task. Just start at chromosome 1, look for the first CG or GC and extend until you get your first non-G-or-C. Then take that sequence and its context and save them (in a DB). Rinse and repeat.
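A minimal Python sketch of that scan, assuming the sequence fits in memory as a string, might look as follows. The function name, the `min_len` and `flank` parameters (taken from the 8 and 5 in the question), and yielding tuples instead of writing to a DB are all assumptions.

```python
def gc_regions(seq: str, min_len: int = 8, flank: int = 5):
    """Yield (start, left_context, gc_run, right_context) for each G/C run."""
    seq = seq.lower()
    i, n = 0, len(seq)
    while i < n:
        if seq[i] in "gc":
            j = i
            while j < n and seq[j] in "gc":  # extend to the first non-G-or-C
                j += 1
            if j - i >= min_len:
                yield i, seq[max(0, i - flank):i], seq[i:j], seq[j:j + flank]
            i = j  # continue after this run
        else:
            i += 1


for hit in gc_regions("atatagcgcggccttata"):
    print(hit)
```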
For this project, Clustal may be overkill -- but I don't know your objectives so I can't be sure. If you're only interested in the GC region, then you can do some simple clustering like so:
Make a database entry for each G/C 8-mer (2^8 = 256 in all).
Take each GC-region and walk it to see which 8-mers it contains.
Tag each GC-region with the sequences it contains.
Now, for each 8-mer, you have thousands of sequences which contain it. I'll leave the analysis of the data up to your own objectives.
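The three steps above can be sketched in Python with an in-memory dictionary standing in for the database (the function name and the use of a dict are assumptions for illustration):

```python
from collections import defaultdict


def index_by_kmer(gc_regions, k=8):
    """Map each k-mer to the set of GC-regions that contain it."""
    index = defaultdict(set)
    for region in gc_regions:
        # walk the region and tag it with every k-mer it contains
        for i in range(len(region) - k + 1):
            index[region[i:i + k]].add(region)
    return index


index = index_by_kmer(["gcgcgcgcg", "cgcgcgcgg"])
for kmer, regions in sorted(index.items()):
    print(kmer, sorted(regions))
```

Regions that share an 8-mer end up in the same bucket, which gives a simple starting point for grouping them into classes.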
Your region two, with only 2 letters, may end up a bit too similar; increasing its length or variability (e.g. allowing more letters) could help.
