Simple string replacement set of rules


I have an application where users set up a number of objects by filling in text boxes that represent the values these objects will take, much like setting up a Person object that requires you to enter its Name and LastName properties.
Now I want to introduce global variables that the user can assign values to, or whose values will change during the execution of the program. I want the user to be able to use them when filling in any object's properties. My first idea was to choose a special character that marks the beginning of a variable name, and then let the user type that character twice to represent the character itself.
For instance, say I have a global variable called McThing. Then, say the symbol I choose to mark the beginning of a variable is %. The user would then be able to enter as a person's last name the string "Mc. %McThing", which then I'd replace using the value of McThing. If McThing's value is "Donalds", the last name would become "Mc. Donalds".
The problem with this is that if I have a variable called He and another called Hello and the user enters "%Hello", I can't tell which variable needs to be replaced. I could change my rules to, for instance, use the "%" symbol to mark both the beginning and the end of the variable name, but I'm not sure whether this would cause other problems.
What would be the simplest possible set of rules to achieve this such that the user will be able to represent every possible string without ambiguities? Ideally, the variable names can have any character but I could restrict their names to a given set of characters.

Your approach of marking both beginning and end with % has one problem. What happens if the input string is %foo%%bar%? Do I get the value of foo and the value of bar? Or do I get the value of foo%bar? (Of course if % in variable names isn't allowed, this isn't a problem.)
The simplest way I can think of to avoid this problem is to use one symbol for the beginning and another (e.g. #) for the end. That should avoid any ambiguity. If the user wants a # in text or a variable name, he escapes it like so: %#. This causes no problems, since empty variable names are not a thing (at least I hope not).

Using % for both the start and the end will be fine and easy to implement under the following assumptions:
You have no empty variable names (otherwise, is %% a literal % or an empty variable name?)
Variable names cannot contain % (otherwise, is a % inside a variable name the end of the name or a literal %?)
OR
Variable names cannot start or end with % and you cannot have two variable references directly next to each other
(otherwise, is %a%%b% the variables a and b, or the single variable a%b?)
These assumptions ensure that a single % always marks the start or the end of a variable name, and that %% can be read as a literal % character.
The assumptions may not be strictly required, but without them the implementation becomes a lot more difficult (with them, we never have to look more than one character ahead).
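As a rough sketch of how little lookahead this needs, here is one way to write it in Python (the question names no language, so that choice is mine). It assumes the first set of assumptions: names are non-empty and contain no %. The function name expand and the variables mapping are made up for illustration.

```python
from typing import Mapping


def expand(template: str, variables: Mapping[str, str]) -> str:
    """Expand %name% references; outside a name, %% is a literal %.

    Assumes variable names are non-empty and never contain '%'.
    """
    out = []
    i = 0
    n = len(template)
    while i < n:
        ch = template[i]
        if ch != "%":
            # Ordinary character: copy it through.
            out.append(ch)
            i += 1
        elif i + 1 < n and template[i + 1] == "%":
            # One character of lookahead tells us this is an escaped %.
            out.append("%")
            i += 2
        else:
            # Start of a variable reference: the name runs to the next %.
            end = template.find("%", i + 1)
            if end == -1:
                raise ValueError(f"unterminated variable reference at index {i}")
            name = template[i + 1:end]
            if name not in variables:
                raise KeyError(f"unknown variable: {name}")
            out.append(variables[name])
            i = end + 1
    return "".join(out)
```

With variables He and Hello both defined, "%Hello%" and "%He%llo" now pick different variables, and "%He%%Hello%" reads as two references in a row because the scanner is back outside a name once the first reference closes.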
An alternative with no such restrictions, loosely based on the way C, Java and similar languages handle escapes:
Have % take on a role similar to \ in C/Java/etc. and use:
%s to denote the start of a variable name
%e to denote the end of a variable name
%% to denote the % character
You could also use the same escape for both the start and the end, but we may as well make them different so the result is easier to read.
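A minimal sketch of this escape scheme, again in Python by my own choice. The function name expand_escaped and the variables mapping are hypothetical, and I am assuming that any escape other than %s, %e and %% should be rejected, which the scheme above does not specify.

```python
from typing import Mapping


def expand_escaped(template: str, variables: Mapping[str, str]) -> str:
    """Expand %s<name>%e references; %% is a literal % anywhere."""
    out = []      # characters of the expanded result
    name = None   # characters of the name being read, or None outside a name
    i = 0
    while i < len(template):
        ch = template[i]
        if ch != "%":
            # Plain character: goes to the name if we are inside one.
            (out if name is None else name).append(ch)
            i += 1
            continue
        if i + 1 >= len(template):
            raise ValueError("dangling % at end of input")
        code = template[i + 1]
        if code == "%":
            (out if name is None else name).append("%")
        elif code == "s" and name is None:
            name = []  # start collecting a variable name
        elif code == "e" and name is not None:
            out.append(variables["".join(name)])  # KeyError if unknown
            name = None
        else:
            # Assumption: anything else (including nested %s) is an error.
            raise ValueError(f"unexpected escape %{code} at index {i}")
        i += 2
    if name is not None:
        raise ValueError("variable reference is missing its %e")
    return "".join(out)
```

Because every % is followed by exactly one control character, a name can contain a % (written as %%) without any ambiguity, for example "%sa%%b%e" refers to a variable named a%b.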

Related

Using flex to identify variable name without repeating characters

I'm not fully sure how to word my question, so sorry for the rough title.
I am trying to create a pattern that can identify variable names with the following constraints:
Must begin with a letter
First letter may be followed by any combination of letters, numbers, and hyphens
First letter may be followed with nothing
The variable name must not be entirely X's ([xX]+ is a separate identifier in this grammar)
So for example, these would all be valid:
Avariable123
Bee-keeper
Y
E-3
But the following would not be valid:
XXXX
X
3variable
5
I am able to meet the first three requirements with my current identifier, but I am really struggling to change it so that it doesn't pick up variables that are entirely the letter X.
Here is what I have so far: [a-z][a-z0-9\-]* {return (NAME);}
Can anyone suggest a way of editing this to avoid variables that are made up of just the letter X?

The easiest way to handle that sort of requirement is to have one pattern which matches the exceptional string and another pattern, which comes afterwards in the file, which matches all the strings:
[xX]+ { /* matches all-x tokens */ }
[[:alpha:]][[:alnum:]-]* { /* handle identifiers */ }
This works because lex (and almost all lex derivatives) selects the first pattern in the file when two patterns match the same longest token.
Of course, you need to know what you want to do with the exceptional symbol. If you just want to accept it as some token type, there's no problem; you just do that. If, on the other hand, the intention was to break it into subtokens, perhaps individual letters, then you'll have to use yyless(), and you might want to switch to a new lexing state in order to avoid repeatedly matching the same long sequence of Xs. But maybe that doesn't matter in your case.
See the flex manual for more details and examples.
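To illustrate the rule (longest match wins, and the earlier pattern wins a tie), here is a small Python imitation of that selection logic. It is only a sketch of the idea, not how flex is implemented, and the token names XSEQ and NAME and the helper next_token are made up.

```python
import re

# Order matters: the all-X rule comes first, as in the lex file above.
RULES = [
    ("XSEQ", re.compile(r"[xX]+")),
    ("NAME", re.compile(r"[A-Za-z][A-Za-z0-9-]*")),
]


def next_token(text, pos=0):
    """Return (kind, lexeme) for the token starting at pos, or None.

    The longest match wins; on a tie the rule listed first is kept,
    because a later rule only replaces the best one if strictly longer.
    """
    best = None
    for kind, pattern in RULES:
        m = pattern.match(text, pos)
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (kind, m.group())
    return best


print(next_token("XXXX"))        # both rules match 4 characters, first rule wins
print(next_token("Xavier"))      # NAME matches more characters than XSEQ
print(next_token("Bee-keeper"))  # only NAME matches
```

Note that an input such as Xavier still comes out as a NAME, because the identifier pattern matches a longer string than the all-X pattern does.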

LUA -- gsub problems -- passing a variable to the match string isn't working [duplicate]

This question already has an answer here:
How to match a sentence in Lua
(1 answer)
Closed 1 year ago.
Been stuck on this for over a day.
I'm trying to use gsub to extract a portion of an input string. The exact pattern of the input varies in different cases, so I'm trying to use a variable to represent that pattern, so that the same routine - which is otherwise identical - can be used in all cases, rather than separately coding each.
So, I have something along the lines of:
newstring , n = oldstring:gsub(matchstring[i],"%1");
where matchstring[] is an indexed table of the different possible pattern matches, set up so that "%1" will match the target sequence in each matchstring[].
For instance, matchstring[1] might be
"\[User\] <code:%w*>([^<]*)<\\code>.*" -- extract user name from within the <code>...<\code>
while matchstring[2] could be
"\[World\] (%w)* .*" -- extract user name as first word after prefix '[World] '
and matchstring[3] could be
"<code:%w*>([^<]*)<\\code>.*" -- extract username from within <code>...<\code> at start
This does not work.
Yet when, debugging one of the cases, I replace matchstring[i] with the exact same string -- only now passed as a string literal rather than saved in a variable -- it works.
So.. I'm guessing there must be some 'processing' of the string - stripping out special characters or something - when it's sent as a variable rather than a string literal ... but for the life of me I can't figure out how to adjust the matchstring[] entries to compensate!
Help much appreciated...

FACEPALM
Thank you, Piglet, you got me on the right track.
Given how this particular platform processes and passes strings, anything within <...> needed the escape character \ for downstream use, but of course (duh) Lua's gsub itself needs the standard % escape for its own pattern processing.
Much obliged.

How to store part of a string in a variable on Lua

I have a phrase, where only some words will change, and I need to store those words on a variable.
Example:
phrase = "I cannot connect to server XPTO\TEST for the last five hours"
The only part that will change is XPTO\TEST and I need to store it on a variable so that I can use it later.
Any ideas, or is it possible?

It seems like you need some form of placeholders. If that is the case, you can use string.format or string.gsub.
local t = {name="lua", version="5.3"}
x = string.gsub("$name-$version.tar.gz", "%$(%w+)", t)
--> x="lua-5.3.tar.gz"
In PHP, for example, you can achieve this without any extra work, because the language has a feature called string interpolation.
Lua doesn't have one, which is why you can't do this without some extra string post-processing.
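For comparison, here is roughly the same placeholder idea sketched in Python, plus the reverse direction the question asks about (pulling the changing part out of the phrase). The function names fill and extract_server are made up, and the extraction assumes the changing part is a single run of non-space characters that always sits between the words "server" and "for".

```python
import re


def fill(template, values):
    """Replace each $word with values[word]; unknown names are left as-is."""
    return re.sub(
        r"\$([A-Za-z0-9]+)",
        lambda m: str(values.get(m.group(1), m.group(0))),
        template,
    )


def extract_server(phrase):
    """Return the text between 'server ' and ' for', or None if absent."""
    m = re.search(r"server (\S+) for", phrase)
    return m.group(1) if m else None


print(fill("$name-$version.tar.gz", {"name": "lua", "version": "5.3"}))

# Raw string so the backslash in the server name is kept literally.
phrase = r"I cannot connect to server XPTO\TEST for the last five hours"
server = extract_server(phrase)
print(server)
```

In Lua itself the extraction would be done with string.match and a capture, in the same spirit as the gsub call above.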

need guidance with basic function creation in MATLAB

I have to write a MATLAB function with the following description:
function counts = letterStatistics(filename, allowedChar, N)
This function is supposed to open a text file specified by filename and read its entire contents. The contents will be parsed such that any character that isn’t in allowedChar is removed. Finally it will return a count of all N-symbol combinations in the parsed text. This function should be stored in a file named “letterStatistics.m”. I made a list of the commands to use and how the function should be organized, according to my professor's lecture notes:
1. Begin the function by setting the default value of N to 1 in case:
a. The user specifies a 0 or negative value of N.
b. The user doesn’t pass the argument N into the function, i.e., counts = letterStatistics(filename, allowedChar)
2. Using the fopen function, open the file filename for reading in text mode.
3. Using the function fscanf, read in all the contents of the opened file into a string variable.
4. I know there exists a MATLAB function to turn all letters in a string to lower case. Since my analysis will disregard case, I have to use this function on the string of text.
5. Parse this string variable as follows (use logical indexing or regular expressions – do not use for loops):
a. We want to remove all newline characters without this occurring:
e.g.
In my younger and more vulnerable years my father gave me some advice that I've been turning over in my mind ever since.
In my younger and more vulnerableyears my father gave me some advicethat I’ve been turning over in my mindever since.
Replace all newline characters (special character \n) with a single space: ' '.
b. We will treat hyphenated words as two separate words, hence do the same for hyphens '-'.
c. Remove any character that is not in allowedChar. Hint: use regexprep with an empty string '' as an argument for replace.
d. Any sequence of two or more blank spaces should be replaced by a single blank space.
6. Use the provided permsRep function, to create a matrix of all possible N-symbol combinations of the symbols in allowedChar.
7. Using the strfind function, count all the N-symbol combinations in the parsed text into an array counts. Do not loop through each character in your parsed text as you would in a C program.
8. Close the opened file using fclose.
HERE IS MY QUESTION: As you can see, I have made this list of what the function is, what it should do, and which commands it should use (fclose etc.). The trouble is that I know closing the file involves fclose, but beyond that I'm not sure how to carry out #8. The same goes for creating the function as a whole: I have a vague idea of which commands to use, but I'm unable to produce the actual code. How should I begin? Any guidance or hints would be seriously appreciated, because I have programmer's block and am unable to start!

I think you are new to MATLAB, so the documentation may seem complicated. I guess the root of the problem is a basic understanding of file I/O (input/output). When you open a file using fopen, MATLAB returns an identifier for that file, generally called a file ID. When you call fclose, MATLAB needs to know which file you want to close, so you have to call fclose with the correct file ID.
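fid = fopen('test.txt', 'w'); % Open the file for writing; fopen returns the file ID
fprintf(fid,'This is a test.\n'); % Write a line to the file identified by fid
fclose(fid); % Close that same file by passing its file ID
fid = 0; % Optional, this will make it clear that the file is not open,
% but it is not necessary since MATLAB will report an error for a closed file ID anyway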
Regarding the function creation the syntax is something like this:
function out = myFcn(x,y)
z = x*y;
fprintf('z=%.0f\n',z); % Print value of z in the command window
out = z>0;
This function checks whether the product of two numbers is positive (that is, whether both are nonzero and have the same sign) and returns true if it is. If not, it returns false. This may not be the best way to do such a test, but it works as an example I guess.
Please comment if this is not what you want to know.
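To show how the steps from the question fit together, here is a rough outline of the same pipeline in Python rather than MATLAB (so it is not something to hand in as-is). The name letter_statistics is made up, the function takes the text directly instead of a file name so the file handling is left out, itertools.product stands in for the provided permsRep, and a regex lookahead is my stand-in for strfind, which as far as I know also reports overlapping occurrences.

```python
import itertools
import re


def letter_statistics(text, allowed_chars, n=1):
    """Count every n-symbol combination of allowed_chars in text."""
    # Step 1: fall back to N = 1 for zero or negative values.
    if n < 1:
        n = 1
    # Step 4: the analysis ignores case.
    text = text.lower()
    # Steps 5a and 5b: newlines and hyphens become single spaces.
    text = re.sub(r"[\n-]", " ", text)
    # Step 5c: drop every character that is not allowed.
    allowed = set(allowed_chars)
    text = "".join(ch for ch in text if ch in allowed)
    # Step 5d: collapse runs of two or more spaces into one.
    text = re.sub(r" {2,}", " ", text)
    # Steps 6 and 7: build all combinations, then count each one,
    # including overlapping occurrences (zero-width lookahead).
    counts = {}
    for combo in itertools.product(allowed_chars, repeat=n):
        key = "".join(combo)
        counts[key] = len(re.findall("(?=" + re.escape(key) + ")", text))
    return counts


counts = letter_statistics("Ab-ba\nab", "ab ", 2)
print(counts["ab"], counts["ba"], counts["aa"])
```

In MATLAB the corresponding pieces would be lower, regexprep, permsRep and strfind, wrapped between the fopen and fclose calls shown above.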

TCL: detect string map substitution

I have a question about the use of string map in TCL.
Is there a way to detect when this function has changed the previous value of the mapped string?
For example, in this case:
set location "default_user: admin"
set new_user "user"
set new_location [string map [list "admin" $new_user] $location]
In this case, I want to know if new_location has a different value than location (without comparing both variables, maybe there is a more elegant way).
My real case is more complicated than this one: I have a variable with the content of an HTML file, and I want to substitute a specific value with another one, or read from another variable if there was no substitution.
Thanks for your help, I hope everything is clear in the example above.

The string map command does not return the number of replacements it made. To get the number of substitutions, you can use regsub -all instead, which returns that count when you give it a variable name to store the result in. Alternatively, call string first before string map to check whether the value you want to replace occurs in the string at all.
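The same "replace and tell me how many" idea, sketched in Python for illustration only: re.subn plays the role that regsub -all plays in Tcl. The function names replace_and_count and substitute_or_fallback are hypothetical, and the fallback argument is my reading of "read from another variable if there was no substitution".

```python
import re


def replace_and_count(text, old, new):
    """Return (new_text, count) after replacing every literal old with new."""
    # re.escape makes old match literally; the lambda keeps new literal too.
    return re.subn(re.escape(old), lambda _m: new, text)


def substitute_or_fallback(text, old, new, fallback):
    """Use the substituted text if anything changed, else the fallback."""
    result, count = replace_and_count(text, old, new)
    return result if count > 0 else fallback


location = "default_user: admin"
new_location, changed = replace_and_count(location, "admin", "user")
print(new_location, changed)
```

A count of zero means nothing was substituted, so no comparison of the two strings is needed.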
