How do you interpret it? (u∈Σ∗) - regular-language

Here is the full rule
{a^k u a^k| k≥1, u∈Σ∗}
Does this mean u can be replaced by a single a, a single b, or any combination of a's and b's from the alphabet?
So if k=1, is it aaa | aba, or a(aba)a | a(ba)a?
Thanks
Rahman

This rule means every string in the language has the same number of a's at the beginning as at the end, with whatever you want (including more a's) between.
So aaa, aba, aabaa and abaa are all in the language (assuming b is in Σ).
In fact, it is enough that the string is at least 2 characters long and has an a at each end (left as an exercise).
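The claim in the answer above can be checked by brute force. This is a minimal sketch, assuming Σ = {a, b}; the function names in_language and simple_test are made up for illustration.

```python
from itertools import product

def in_language(s):
    # Direct reading of {a^k u a^k | k >= 1, u in Sigma*}:
    # some k >= 1 with k a's at the start and k a's at the end,
    # where the two runs do not overlap (2k <= len(s)).
    return any(
        s.startswith("a" * k) and s.endswith("a" * k)
        for k in range(1, len(s) // 2 + 1)
    )

def simple_test(s):
    # The shortcut from the answer: length >= 2 and an a at each end.
    return len(s) >= 2 and s[0] == "a" and s[-1] == "a"

def all_strings(max_len, alphabet="ab"):
    # Every string over the alphabet up to the given length.
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)
```

Comparing in_language and simple_test over all_strings(8) is one way to convince yourself that the two descriptions agree, at least on short strings.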

Related

Haskell: Exercise problem (Convert Currencies that consist of two separate Integers)

So I got three datatypes Euro, Dollar and Yen. The datatype Currency is one of those.
data Euro = MkEuro Integer Integer
data Dollar = MkDollar Integer Integer
data Yen = MkYen Integer
data Currency = MkE Euro | MkD Dollar | MkY Yen
Now I want to convert, for example, Dollar to Euro. Let's say 1 Dollar is 0.90 Euro.
I really don't know how to implement that in Haskell. I need a function toEuro that takes a Currency, converts it into Euro, and returns it as a Currency as well. The problem is that, for example, Dollars and Cents are split into two separate Integers, and I am not allowed to use any split or connection functions (if there even are any). I have no idea how to calculate with two separate Integers. Let's say I have 12,20 Dollars and I want it as 10,98 Euros. How do I get it into Euros if 1 Dollar is 0.90 Euro? So I need 12 20 to become 10 98. I just don't see it.
I am not allowed to use any split or connection functions (if there even are any).
It's not clear what you mean by that. I strongly suspect that you're supposed to use pattern matching. Joseph's comment is fine, and possibly helpful, but it sounds like the thing you're missing is how to get the integers you need out of the Currency. Try completing this fragment:
toEuro :: Currency -> Currency
toEuro (MkE e) = MkE e
toEuro (MkD (MkDollar d c)) = let usCents = (100 * d) + c
                              in MkE (MkEuro ... ...)
...
Protips:
That last ellipsis isn't a mistake, there's a whole line missing.
The first pattern seems awkward; we didn't unpack e into MkEuro eE eC, so why did we have to unpack (MkE e)? The answer is that we had to check that it was actually a Euro; obviously we couldn't just write toEuro e = e. But a "better" compromise may have been to use an "as" pattern: toEuro e@(MkE _) = e.
You suggested using 0.9 as a conversion factor; it seems inevitable that you'll want that to be an argument to your function. It should be your first argument; in Haskell your "subject" argument, the most "data-like" argument, should always go last. (Configuration arguments come first.) But it's more complicated than that because you also have to worry about Yen. I don't know how you're going to want to handle that...
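The arithmetic behind the question's example (12,20 Dollars to 10,98 Euros) can be sketched on its own, independent of the Haskell types. This is only an illustration in Python: it assumes the rate is passed as an integer number of euro cents per dollar (90 for 0.90) and that fractions of a cent are rounded down; neither choice comes from the exercise, and the function name is hypothetical.

```python
def dollars_to_euros(dollars, cents, rate_cents_per_dollar=90):
    # Combine the two integers into one amount in US cents.
    us_cents = 100 * dollars + cents
    # Apply the rate with integer arithmetic only.
    # Assumption: fractions of a euro cent are rounded down.
    euro_cents = us_cents * rate_cents_per_dollar // 100
    # Split the result back into whole euros and remaining cents.
    return divmod(euro_cents, 100)
```

For the values in the question, dollars_to_euros(12, 20) works through 1220 US cents, 1098 euro cents, and finally the pair (10, 98). In Haskell the same last step can be done with divMod on Integer.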

Regular expressions for strings not containing specific substring

What is a regular expression for all words over the alphabet Σ = {a,b} that do not contain the substring baa?
Is it:
a* ((aa) * b *)?
Can a string of length 2 be acceptable for the above condition to hold?
a*(ba?)*
The string can start with arbitrarily many a's, but once a b has been introduced, every later a must be a single isolated a that directly follows a b.
a*(b+(ba))*
Here + denotes union. Once a b has been reached, there can be any number of further b's; if an a follows a b, then either the string ends there or the next character is another b (possibly starting another ba).
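Both answers can be checked against the definition by brute force. The sketch below assumes Python's re syntax, where the union written + in the second answer becomes |; the names ANSWER_1, ANSWER_2 and first_mismatch are made up for illustration.

```python
import re
from itertools import product

# The two proposed expressions, in Python regex syntax.
ANSWER_1 = re.compile(r"a*(ba?)*")
ANSWER_2 = re.compile(r"a*(b|(ba))*")

def first_mismatch(pattern, max_len):
    # Return the first string over {a, b} (up to max_len) on which the
    # regex and the plain "does not contain baa" test disagree, or None.
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            accepted = pattern.fullmatch(s) is not None
            if accepted != ("baa" not in s):
                return s
    return None
```

Calling first_mismatch(ANSWER_1, 10) and first_mismatch(ANSWER_2, 10) searches all strings up to length 10. Note that every string of length 2 is accepted, since a string that short cannot contain baa.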

most efficient way to sort strings with only 2 distinct characters?

If I have strings that I know have no more than 2 distinct characters,
example set:
aab
abbbbabb
bbbaa
aaaaaaa
aaaa
abab
a
aa
aaaaa
aaabba
aabbbab
What's the most efficient way to put them into alphabetical order?
the resulting sorted set:
a
aa
aaaa
aaaaa
aaaaaaa
aaabba
aab
aabbbab
abab
abbbbabb
bbbaa
edit:
I know I could just use a normal sorting algorithm (quick sort, merge sort), but the question is: Does the fact that there are not more than 2 distinct characters make something else more efficient?
If the maximum length of the string matters, I would like to know the answer for 2 different scenarios:
maximum length of the string is the same as the number of strings (n strings being sorted, n maximum length of the string)
maximum length of the string is log n, with n as the number of strings being sorted
I can also assume that all of the strings are distinct.
The String compareTo or compareToIgnoreCase method will return a negative integer, 0, or a positive integer depending on the alphabetical ordering of the two Strings being compared. Try that.
A general sorting algorithm based only on comparisons cannot do better than O(n log n) asymptotically. In your case there is additional information (2 distinct characters) that can potentially improve on this. A simple approach that sorts the characters within a single string of length n in O(n):
Check the first character (call it x).
Scan the string to the end:
whenever x is encountered, increment a counter;
when the non-x character (call it y) is encountered for the first time, store it in a dedicated variable.
Compare x and y:
if x < y, fill the string from the beginning with as many x's as the counter says and the rest with y's;
if x > y, fill the first (string length - number of x's) slots with y's and the rest with x's.
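The steps above can be sketched as follows. This is a minimal sketch of the counting approach as described, which orders the characters inside one string; it does not by itself order a set of strings. The function name is made up for illustration.

```python
def sort_two_char_string(s):
    # Sort the characters of a string that has at most 2 distinct
    # characters, in a single pass plus one rebuild (O(n) overall).
    if not s:
        return s
    x = s[0]          # the first character
    count_x = 0       # how many times x occurs
    y = None          # the other character, if there is one
    for ch in s:
        if ch == x:
            count_x += 1
        elif y is None:
            y = ch
    if y is None:     # only one distinct character: already sorted
        return s
    count_y = len(s) - count_x
    if x < y:
        return x * count_x + y * count_y
    return y * count_y + x * count_x
```

For example, sort_two_char_string("bbbaa") counts three b's, finds a as the other character, and rebuilds the string with the two a's first.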

Understanding the Knuth Morris Pratt(KMP) Failure Function

I've been reading the Wikipedia article about the Knuth-Morris-Pratt algorithm and I'm confused about how the values are found in the jump/partial match table.
i    |  0  1  2  3  4  5  6
W[i] |  A  B  C  D  A  B  D
T[i] | -1  0  0  0  0  1  2
Could someone explain the shortcut rule more clearly? The sentence
"let us say that we discovered a proper suffix which is a proper prefix and ending at W[2] with length 2 (the maximum possible)"
is confusing. If the proper suffix ends at W[2], wouldn't it have a length of 3?
I'm also wondering why T[4] isn't 1 when there is a prefix and suffix of size 1: the A.
Thanks for any help that can be offered.
Notice that the failure function T[i] does not use i as an index, but rather as a length. Therefore, T[2] represents the length of the longest proper border (a string that is both a prefix and suffix) of the string formed from the first two characters of W, rather than the longest proper border formed by the string ending at character 2. This is why the maximum possible value of T[2] is 2 rather than 3 - the substring formed from the first two characters of W can't have length any greater than 2.
Using this interpretation, it's also easier to see why T[4] is 0 rather than 1. The substring of W formed from the first four characters of W is ABCD, which has no nonempty proper prefix that is also a suffix.
Hope this helps!
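The "T[i] is about the first i characters" reading can be checked with a short sketch. It follows the convention of the table in the question (T[0] = -1) and is not the Wikipedia pseudocode; the function names are made up for illustration.

```python
def longest_proper_border(s):
    # Length of the longest string that is both a proper prefix and a
    # proper suffix of s (brute force, for cross-checking only).
    for length in range(len(s) - 1, 0, -1):
        if s[:length] == s[-length:]:
            return length
    return 0

def failure_table(w):
    # t[i] = length of the longest proper border of w[:i],
    # with t[0] = -1 by convention.
    t = [-1] * len(w)
    if len(w) > 1:
        t[1] = 0
    k = 0  # border length of the prefix processed so far
    for i in range(2, len(w)):
        # Try to extend the current border with the character w[i-1];
        # fall back to shorter borders while it does not fit.
        while k > 0 and w[i - 1] != w[k]:
            k = t[k]
        if w[i - 1] == w[k]:
            k += 1
        t[i] = k
    return t
```

For W = ABCDABD this gives the row from the question, and T[4] is 0 because it describes ABCD, not ABCDA. The second helper also shows the AAA case: the whole string AAA has a longest proper border of length 2.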
"let us say that we discovered a proper suffix which is a proper prefix and ending at W[2] with length 2 (the maximum possible)"
Okay, the maximum length is indeed 2, and here is why.
One fact: a "proper" prefix can't be the whole string, and the same goes for a "proper" suffix (like a proper subset).
Let W[0]=A, W[1]=A, W[2]=A, i.e. the pattern is "AAA". Then the longest proper prefix is "AA" (read left to right) and the longest proper suffix is "AA" (read right to left).
//yes, the prefix and suffix overlap (the middle "A")
So the value is 2 rather than 3; it would be 3 only if the prefix were not required to be proper.

Algorithm to form a given pattern using some strings

Given are 6 strings of any length. The words are to be arranged in the pattern shown below. They can be arranged either vertically or horizontally.
--------
|      |
|      |
|      |
---------------
       |      |
       |      |
       |      |
       --------
The pattern need not be symmetric, and there must be two empty areas as shown.
For example:
Given strings
PQF
DCC
ACTF
CKTYCA
PGYVQP
DWTP
The pattern can be
DCC...
W.K...
T.T...
PGYVQP
..C..Q
..ACTF
where dots represent empty areas.
The other example is
RVE
LAPAHFUIK
BIRRE
KZGLPFQR
LLHU
UUZZSQHILWB
Pattern is
LLHU....
A..U....
P..Z....
A..Z....
H..S....
F..Q....
U..H....
I..I....
KZGLPFQR
...W...V
...BIRRE
If multiple patterns are possible, the one with the lexicographically smallest first line, then second line, and so on, is to be formed. What algorithm can be used to solve this?
Find strings that satisfy these constraints:
strlen(a) + strlen(b) - 1 = strlen(c)
strlen(d) + strlen(e) - 1 = strlen(f)
After that, try every possible arrangement and check whether it is valid. For example:
aaa.....
d.f.....
d.f.....
d.f.....
cccccccc
..f....e
..f....e
..bbbbbb
There will be 2*2*2 = 8 different arrangements.
There are a number of heuristics that you can apply, but before that, let's go over some properties of the puzzle.
+aa+
c  f
+ee+eee+
   f   d
   +bbb+
Let each letter also denote the length of the string it labels in the diagram above. We have:
a + b - 1 = e
c + d - 1 = f
I will refer to the 2 strings that form the cross in the middle (e and f) as middle strings.
We also know that no string can be shorter than 2. Since b ≥ 2 gives e = a + b - 1 > a (and likewise for the other pairs), we can infer:
e > a, e > b
f > c, f > d
From this, we know that the 2 shortest strings cannot be middle strings.
The 3 longest strings also cannot all have the same length: by the inequalities above, every string of maximal length must be a middle string, and there are only 2 middle strings.
The puzzle is only tricky when the lengths are regular. When the lengths are irregular, you can map directly from length to position.
If the 2 longest strings are equal in length, then by the inequalities above they are the 2 middle strings. The worst case here is a "regular" puzzle, where the lengths a, b, c, d are all equal.
If the 2 longest strings are unequal, the longest string's position can be determined immediately (since its length is unique in the puzzle): it is one of the middle strings. In the worst case there can be 3 candidates for the other middle string; just brute force and check all of them.
Algorithm:
Try to map each string with a unique length to its position.
Brute force the 2 middle strings (taking into account what I mentioned above), then brute force the rest.
Even with naive brute force, there are only 6! = 720 cases if strings can only run left to right and top to bottom (no reversal). There will be 46080 cases (720 * 2^6) if each string may run in either direction.
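The naive 6! brute force mentioned above might look like the sketch below. It makes several assumptions that are not in the question: words run only left to right and top to bottom, every word has at least 3 letters so that both rectangles enclose an empty area, and candidate grids are compared row by row as plain strings with '.' as the filler. The function name solve and the role names are made up for illustration.

```python
from itertools import permutations

def solve(words):
    # Return the smallest valid grid as a list of row strings, or None.
    if len(words) != 6 or min(len(w) for w in words) < 3:
        return None  # assumption: short words cannot enclose an empty area
    best = None
    # Roles: top, left, middle vertical, middle horizontal, right, bottom.
    for h1, v1, v2, h2, v3, h3 in permutations(words):
        # Length constraints imposed by the shape.
        if len(h1) + len(h3) - 1 != len(h2):
            continue
        if len(v1) + len(v3) - 1 != len(v2):
            continue
        rows, cols = len(v2), len(h2)
        grid = [["."] * cols for _ in range(rows)]

        def put(word, r, c, dr, dc):
            # Write the word into the grid; fail on a letter conflict.
            for ch in word:
                if grid[r][c] not in (".", ch):
                    return False
                grid[r][c] = ch
                r, c = r + dr, c + dc
            return True

        placements = [
            (h1, 0, 0, 0, 1),
            (v1, 0, 0, 1, 0),
            (v2, 0, len(h1) - 1, 1, 0),
            (h2, len(v1) - 1, 0, 0, 1),
            (v3, len(v1) - 1, cols - 1, 1, 0),
            (h3, rows - 1, len(h1) - 1, 0, 1),
        ]
        if all(put(*p) for p in placements):
            candidate = ["".join(row) for row in grid]
            if best is None or candidate < best:
                best = candidate
    return best
```

For the first example in the question, solve(["PQF", "DCC", "ACTF", "CKTYCA", "PGYVQP", "DWTP"]) finds two valid grids (the one shown and its transpose) and keeps the one whose first row is DCC... because it compares smaller. The length-based pruning from the answer would reduce the work, but with only 720 permutations it is not needed for correctness.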
