Annihilator set in regular expression - regular-language

Given that ø+anything is an indentity, therefore ø + a = a.
What exactly is the result of (ø+ øbb)?

The empty set ∅ is kinda sorta like the number zero. If you add zero to anything, you get what you started with. (That is, ∅ + x = x.) Similarly, if you multiply anything by zero, you get zero. This is also true for languages: ∅x = ∅ for any x. The reason why is that the concatenation of two languages is the language of all strings you can make by grabbing something from the first set and something from the second set and concatenating them together, and in the case of the empty set there are no strings to pick.
(More abstractly: language union and concatenation form a semiring with the empty set as the zero element and {ε} as an identity element.)

Related

Find the minimal lexographical string formed by merging two strings

Suppose we are given two strings s1 and s2(both lowercase). We have two find the minimal lexographic string that can be formed by merging two strings.
At the beginning , it looks prettty simple as merge of the mergesort algorithm. But let us see what can go wrong.
s1: zyy
s2: zy
Now if we perform merge on these two we must decide which z to pick as they are equal, clearly if we pick z of s2 first then the string formed will be:
zyzyy
If we pick z of s1 first, the string formed will be:
zyyzy which is correct.
As we can see the merge of mergesort can lead to wrong answer.
Here's another example:
s1:zyy
s2:zyb
Now the correct answer will be zybzyy which will be got only if pick z of s2 first.
There are plenty of other cases in which the simple merge will fail. My question is Is there any standard algorithm out there used to perform merge for such output.
You could use dynamic programming. In f[x][y] store the minimal lexicographical string such that you've taken x charecters from the first string s1 and y characters from the second s2. You can calculate f in bottom-top manner using the update:
f[x][y] = min(f[x-1][y] + s1[x], f[x][y-1] + s2[y]) \\ the '+' here represents
\\ the concatenation of a
\\ string and a character
You start with f[0][0] = "" (empty string).
For efficiency you can store the strings in f as references. That is, you can store in f the objects
class StringRef {
StringRef prev;
char c;
}
To extract what string you have at certain f[x][y] you just follow the references. To udapate you point back to either f[x-1][y] or f[x][y-1] depending on what your update step says.
It seems that the solution can be almost the same as you described (the "mergesort"-like approach), except that with special handling of equality. So long as the first characters of both strings are equal, you look ahead at the second character, 3rd, etc. If the end is reached for some string, consider the first character of the other string as the next character in the string for which the end is reached, etc. for the 2nd character, etc. If the ends for both strings are reached, then it doesn't matter from which string to take the first character. Note that this algorithm is O(N) because after a look-ahead on equal prefixes you know the whole look-ahead sequence (i.e. string prefix) to include, not just one first character.
EDIT: you look ahead so long as the current i-th characters from both strings are equal and alphabetically not larger than the first character in the current prefix.

match part of string in R

I'm stuck with something that usually is pretty easily in other programming languages.
I want to test whether a string is inside another one in R. For example I tried:
match("Diagnosi Prenatale,Esercizio Fisico", "Diagnosi Prenatale")
pmatch("Diagnosi Prenatale,Esercizio Fisico", "Diagnosi Prenatale")
grep("Diagnosi Prenatale,Esercizio Fisico", "Diagnosi Prenatale")
And none worked. To make it work I should fist split the first string with strsplit and extract the first element.
NOTE: I'd like to do this on a vector of strings to receive a yes/no vector, so in the function I wrote should go a vector not a single string. But of course if the single string doesn't work, image a full vector of them...
Any ideas?
Try grepl
grepl("Diagnosi Prenatale","Diagnosi Prenatale,Esercizio Fisico" )
[1] TRUE
You can also do this with character vectors, for example:
x <- c("Diagnosi Prenatale,Esercizio Fisico", "Diagnosi Prenatale")
grepl("Diagnosi Prenatale",x)
#[1] TRUE TRUE

Error when take two numbers out of a string

I'm just playing around with Lua trying to make a calculator that uses string manipulation. Basically I take two numbers out of a string, then do something to them (+ - * /). I can successfully take a number out of x, but taking a number out of y always returns nil. Can anyone help?
local x = "5 * 75"
function calculate(s)
local x, y =
tonumber(s:sub(1, string.find(s," ")-1)),
tonumber(s:sub(string.find(s," ")+3), string.len(s))
return x * y
end
print(calculate(x))
You have a simple misplaced parenthesis, sending string.len to tonumber instead of sub.
local x, y =
tonumber(s:sub(1, string.find(s," ")-1)),
tonumber(s:sub(string.find(s," ")+3, string.len(s)))
You actually don't need the string.len, as end of string is the default value for sub if nothing is given.
EDIT:
You can actually do what you want to do way shorter by using string.match instead.
local x,y = string.match(s,"(%d+).-(%d+)")
Match looks for tries to match the string with the pattern given and returns the captured values, in this case the numbers. This pattern translates to "One or more digits, then as few as possible of any character, then one or more digits". %d is 1 digit, + means one or more. . means any character and - means as few as possible. The values within the parentheses are captured, which means that they are returned.

I am trying to display variable names and num2str representations of their values in matlab

I am trying to produce the following:The new values of x and y are -4 and 7, respectively, using the disp and num2str commands. I tried to do this disp('The new values of x and y are num2str(x) and num2str(y) respectively'), but it gave num2str instead of the appropriate values. What should I do?
Like Colin mentioned, one option would be converting the numbers to strings using num2str, concatenating all strings manually and feeding the final result into disp. Unfortunately, it can get very awkward and tedious, especially when you have a lot of numbers to print.
Instead, you can harness the power of sprintf, which is very similar in MATLAB to its C programming language counterpart. This produces shorter, more elegant statements, for instance:
disp(sprintf('The new values of x and y are %d and %d respectively', x, y))
You can control how variables are displayed using the format specifiers. For instance, if x is not necessarily an integer, you can use %.4f, for example, instead of %d.
EDIT: like Jonas pointed out, you can also use fprintf(...) instead of disp(sprintf(...)).
Try:
disp(['The new values of x and y are ', num2str(x), ' and ', num2str(y), ', respectively']);
You can actually omit the commas too, but IMHO they make the code more readable.
By the way, what I've done here is concatenated 5 strings together to form one string, and then fed that single string into the disp function. Notice that I essentially concatenated the string using the same syntax as you might use with numerical matrices, ie [x, y, z]. The reason I can do this is that matlab stores character strings internally AS numeric row vectors, with each character denoting an element. Thus the above operation is essentially concatenating 5 numeric row vectors horizontally!
One further point: Your code failed because matlab treated your num2str(x) as a string and not as a function. After all, you might legitimately want to print "num2str(x)", rather than evaluate this using a function call. In my code, the first, third and fifth strings are defined as strings, while the second and fourth are functions which evaluate to strings.

Representing the strings we use in programming in math notation

Now I'm a programmer who's recently discovered how bad he is when it comes to mathematics and decided to focus a bit on it from that point forward, so I apologize if my question insults your intelligence.
In mathematics, is there the concept of strings that is used in programming? i.e. a permutation of characters.
As an example, say I wanted to translate the following into mathematical notation:
let s be a string of n number of characters.
Reason being I would want to use that representation in find other things about string s, such as its length: len(s).
How do you formally represent such a thing in mathematics?
Talking more practically, so to speak, let's say I wanted to mathematically explain such a function:
fitness(s,n) = 1 / |n - len(s)|
Or written in more "programming-friendly" sort of way:
fitness(s,n) = 1 / abs(n - len(s))
I used this function to explain how a fitness function for a given GA works; the question was about finding strings with 5 characters, and I needed the solutions to be sorted in ascending order according to their fitness score, given by the above function.
So my question is, how do you represent the above pseudo-code in mathematical notation?
You can use the notation of language theory, which is used to discuss things like regular languages, context free grammars, compiler theory, etc. A quick overview:
A set of characters is known as an alphabet. You could write: "Let A be the ASCII alphabet, a set containing the 128 ASCII characters."
A string is a sequence of characters. ε is the empty string.
A set of strings is formally known as a language. A common statement is, "Let s ∈ L be a string in language L."
Concatenating alphabets produces sets of strings (languages). A represents all 1-character strings, AA, also written A2, is the set of all two character strings. A0 is the set of all zero-length strings and is precisely A0 = {ε}. (It contains exactly one string, the empty string.)
A* is special notation and represents the set of all strings over the alphabet A, of any length. That is, A* = A0 ∪ A1 ∪ A2 ∪ A3 ... . You may recognize this notation from regular expressions.
For length use absolute value bars. The length of a string s is |s|.
So for your statement:
let s be a string of n number of characters.
You could write:
Let A be a set of characters and s ∈ An be a string of n characters. The length of s is |s| = n.
Mathematically, you have explained fitness(s, n) just fine as long as len(s) is well-defined.
In CS texts, a string s over a set S is defined as a finite ordered list of elements of S and its length is often written as |s| - but this is only notation, and doesn't change the (mathematical) meaning behind your definition of fitness, which is pretty clear just how you've written it.

Resources