I'm not a regular expression expert, to say the least. What I'm looking for is a regular expression that extracts multiple values of a certain format from a string.
Example string:
"Customer [record:CustomerID] from [record:CityID] is of type [record:TypeID]"
What I need is an expression that gives me all values in this string that are of the format "[record:XXXXX]". So in this example it would give me:
["CustomerID", "CityID", "TypeID"]
Can it be done?
Actually, something like sed may do the trick, i.e.:
echo "Customer ..." | sed -e 's/\][^[]*\[record:/","/g' -e 's/^.*record:/["/' -e 's/].*$/"]/'
In Javascript:
var pattern = '\\[record:([a-zA-Z0-9]+)\\]';
var records = new RegExp(pattern, 'g');
var extract = new RegExp(pattern);
var string = "Customer [record:CustomerID] from [record:CityID] is of type [record:TypeID]"
var matches = string.match(records);
console.log(matches);
> [ '[record:CustomerID]',
'[record:CityID]',
'[record:TypeID]' ]
var records = [];
for (var i=0; i<matches.length; i++) {
var match = matches[i].match(extract);
records.push(match[1]);
}
console.log(records)
> [ 'CustomerID',
'CityID',
'TypeID' ]
Possibly not the most concise solution, but clean and (hopefully) intelligible.
The square brackets that should not be treated specially are escaped by placing \ in front of them.
The part to be extracted is wrapped in (), forming a regexp group/subpattern.
The pattern [a-zA-Z0-9]+ means "match a string of letters (upper or lower case) or digits", and the + specifies "of length one or more". A * here would mean "of length 0 or more".
Here I am using two regular expressions, based on the same pattern. They are compiled with different options: the g flag tells the regex to look for all matches in the string. With this flag, we don't get the groups that matched with the results, just the whole string that matched. The second regex is compiled without the g flag, so we can use it to extract the matched group.
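In engines that provide String.prototype.matchAll (ES2020), the two-regex approach described above can be collapsed into a single pass, because matchAll returns the capture groups for every match. This is only a minimal sketch under that assumption; the helper name extractRecords is illustrative and not part of the original answer.

```javascript
// Hypothetical helper: returns the XXXXX part of every [record:XXXXX] token.
function extractRecords(text) {
  // matchAll requires the g flag and yields one match object per hit,
  // each with its capture groups included.
  var pattern = /\[record:([a-zA-Z0-9]+)\]/g;
  return Array.from(text.matchAll(pattern), function (m) {
    return m[1]; // group 1 is the text between "record:" and "]"
  });
}

console.log(extractRecords("Customer [record:CustomerID] from [record:CityID] is of type [record:TypeID]"));
```

For the example string this yields the three IDs; a string without any [record:...] token gives an empty array.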
How do I check for the word "Hello" inside a string in an if statement, so that it is detected only when "Hello" stands alone and not as part of "Helloo" or "HHello"?
The easiest way to do such a thing is to use regular expressions. With a regular expression you can define a rule that validates a specific pattern.
Here is the rule for the pattern you required to be matched:
The word must contain the string "hello"
The string "hello" must be preceded by white-space; otherwise it must be found at the beginning of the string to be matched.
The string "hello" must be followed by either a '.' or a white-space; otherwise it must be found at the end of the string to be matched.
Here is a simple js code which implements the above rule:
let string = 'Hello, I am hello. Say me hello.';
const pattern = /(^|\s)hello(\s|\.|$)/gi; // the dot is escaped so it matches a literal '.'
/*const pattern = /\bhello\b/ you can use this pattern, it's easier*/
let matchResult = string.match(pattern);
console.log(matchResult);
In the above code I assumed that the pattern is not case sensitive. That is why I added the case-insensitive modifier ("i") after the pattern. I also added the global modifier ("g") to match all occurrences of the string "hello".
You can change the rule to whatever you want and update the regular expression to conform to the new rule. For example, you can allow the string to be followed by "!". You can do that by simply adding "|!" after "$".
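The word-boundary pattern mentioned in the code comment can be used directly in an if statement. The following is a minimal sketch, not the answer's own code; the helper name containsHello is an assumption made for illustration.

```javascript
// Hypothetical helper: true only when "hello" appears as a whole word.
function containsHello(text) {
  // \b matches between a word character (letter, digit, underscore)
  // and a non-word character or the edge of the string.
  // There is no g flag, so test() keeps no state between calls.
  return /\bhello\b/i.test(text);
}

if (containsHello("Hello, how are you?")) {
  console.log("Found");
}
console.log(containsHello("Helloo HHello")); // false
```

Because \b treats the underscore as a word character, a string such as "hello_world" would not count as a match with this sketch.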
If you are new to regular expressions, I suggest visiting the W3Schools reference:
https://www.w3schools.com/jsref/jsref_obj_regexp.asp
One way to achieve this is to first remove all non-alphanumeric characters from the string. For a string like "hello, how are you", @NatiG's answer will fail, because the word hello is followed by a comma rather than a space. Once all the special characters are removed, you can simply split the string into an array of words and filter 'hello' from there.
let text = "hello how are you doing today? Helloo HHello";
// Remove all characters that are not letters, digits, or spaces
text = text.replace(/[^a-zA-Z0-9 ]/g, '')
// Break the text string to words
const myArray = text.split(" ");
const found = myArray.filter((word) => word.toLowerCase() == 'hello')
// Print the array of found hellos
console.log(found)
//Get the found status
if(found.length > 0) {
console.log('Found')
}
Result
['hello']
Found
I have a large set of JavaScript snippets each containing a line like:
function('some string without numbers', '123,71')
and I'm hoping to get a regex together to pull the numbers from the second argument. The second argument can contain an arbitrary number of comma separated numbers (including zero numbers), so the following are all valid:
''
'2'
'17,888'
'55,1,6000'
...
The regex '(?:\d+|,)*' successfully matches the quoted numbers, but I have no idea how to match each of the numbers. Placing a capture group around the \d+ seems to capture the last number (if there is one present -- it doesn't work if the second argument is just ''), but none of the others.
In your case, you may match and capture the digits inside the single quotes and then split them with a comma:
var s = "function('some string without numbers', '123,71')";
var res = s.match(/'([\d,]+)'/) || ["", ""];
console.log(res[1].split(','));
The /'([\d,]+)'/ regex will match a ', then 1+ digits or commas (placing that value into Group 1) and then a closing '.
If you want to run the regex globally, use
var s = "function('some string without numbers', '123,71')\nfunction('some string without numbers', '13,4,0')";
var rx = /'([\d,]+)'/g;
var res = [], m;
while ((m=rx.exec(s)) !== null) {
res.push(m[1].split(','));
}
console.log(res);
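The pattern above uses +, so it does not match the empty argument '' from the question. One possible variation is sketched below: it uses * so the empty case is captured, ties the match to the closing parenthesis so only the last argument is considered, and converts the pieces to numbers. The helper name lastArgNumbers is hypothetical, and the sketch assumes the call ends with the quoted number list followed by ")".

```javascript
// Hypothetical helper: returns the numbers in the last quoted argument
// of the first call found in `source`, or [] if there are none.
function lastArgNumbers(source) {
  // * instead of + lets the group match the empty argument '';
  // \s*\) ties the match to the final argument of the call.
  var m = source.match(/'([\d,]*)'\s*\)/);
  if (m === null) {
    return [];
  }
  // Drop the empty pieces that split() produces for '' or stray commas,
  // then convert the remaining digit strings to numbers.
  return m[1].split(',').filter(function (piece) {
    return piece !== '';
  }).map(Number);
}

console.log(lastArgNumbers("function('some string without numbers', '123,71')"));
console.log(lastArgNumbers("function('some string without numbers', '')"));
```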
If you have the numbers in a variable x like this:
var x = '55,1,6000';
then use this to have the list of numbers:
var array = x.split(',');
If you can have some whitespace before/after the comma then use:
var array = x.split(/\s*,\s*/); // a regex literal; a quoted string would be matched literally
or something like that.
Sometimes it is easier to match the thing that you don't want and split on that.
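Splitting has two edge cases worth keeping in mind: the separator has to be a regular expression (not a quoted string) if it should tolerate whitespace, and splitting an empty string returns an array containing one empty string. A minimal sketch that covers both, with a helper name (splitNumbers) chosen only for illustration:

```javascript
// Hypothetical helper: splits a comma separated list, ignoring
// whitespace around the commas and at both ends of the input.
function splitNumbers(list) {
  var trimmed = list.trim();
  // ''.split(...) returns [''], so handle the empty list explicitly.
  if (trimmed === '') {
    return [];
  }
  // The separator is a regex literal; a quoted string would be
  // treated as literal text, not as a pattern.
  return trimmed.split(/\s*,\s*/);
}

console.log(splitNumbers('55 , 1,6000'));
console.log(splitNumbers(''));
```

The results are still strings; map them through Number if numeric values are needed.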
I have a large body of text and I print only lines that contain one of several strings. Each line can contain more than one string.
Example of the rule:
(house|mall|building)
I want to mark the found string for making the result easier to read.
Example of the result I want:
New record: Two New York houses under contract for nearly $5 million each.
New record: Two New York #house#s under contract for nearly $5 million each.
I know I can find the location, trim, add marker, add string etc.
I am asking if there is a way to mark the found string in one command.
Thanks.
http://pubs.opengroup.org/onlinepubs/009695399/utilities/awk.html
gsub(ere, repl[, in])
Behave like sub (see below), except that it shall replace all occurrences of the regular expression ...
sub(ere, repl[, in ])
Substitute the string repl in place of the first instance of the
extended regular expression ERE in string in and return the number of
substitutions. An ampersand ( '&' ) appearing in the string repl shall
be replaced by the string from in that matches the ERE ...
BEGIN {
r = "house|mall|building"
s = "Two New York houses under contract for nearly $5 million each."
gsub(r, "#&#", s)
print s
}
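For comparison, the same one-step marking can be sketched in JavaScript, where $& in the replacement string plays the role of awk's &. This is an illustrative translation rather than part of the awk answer; the function name markMatches is an assumption.

```javascript
// Hypothetical helper: wraps every match of the rule in # markers.
function markMatches(line) {
  // $& in the replacement string stands for the matched text,
  // much like & in the repl argument of awk's gsub.
  return line.replace(/house|mall|building/g, '#$&#');
}

console.log(markMatches('Two New York houses under contract for nearly $5 million each.'));
```

As in the awk version, the rule matches inside longer words, so "houses" becomes "#house#s".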
This is not a duplicate because all the other questions were not in AS3.
Here is my problem: I am trying to find out whether any of the substrings in the "storage" string appear in another string. I need to do this because my game server is sending the client random messages that contain one of the strings in the "storage" string. The strings sent from the server will always begin with "AA_".
My code:
private var storage:String = "AA_word1:AA_word2:AA_word3:AA_example1:AA_example2";
if(test.indexOf("AA_") >= 0) {
//i dont even know if this is right...
}
}
If there is a better way to do this, please let me know!
Why not just use String.split():
var storage:String = 'AA_word1:AA_word2:AA_word3:AA_example1:AA_example2';
var a:Array = storage.split('AA_');
// gives : ,word1:,word2:,word3:,example1:,example2
// remove the empty 1st element (the text before the first "AA_")
a.shift();
trace(a); // gives : word1:,word2:,word3:,example1:,example2
Hope that can help.
Regular Expressions are the right tool for this job:
function splitStorage(storage: String){
var re: RegExp = /AA_([\w]+):?/gi;
// Execute the regexp until it
// stops returning results.
var strings = [];
var result: Object; // exec() returns a match array (typed as Object) or null
while(result = re.exec(storage)){
strings.push(result[1]);
}
return strings;
}
The important part of this is the regular expression itself: /AA_([\w]+):?/gi
This says: find a match starting with AA_, followed by one or more word characters (letters, digits, or underscore), which we capture with ([\w]+), optionally followed by a colon.
The match is then made global and case insensitive with /gi.
If you need to capture more than just letters and numbers - like this: "AA_word1 has spaces and [special-characters]:" - then add those characters to the character set inside the capture group.
e.g. ([-,.\[\]\s\w]+) will also match hyphen, comma, full-stop, square brackets, whitespace and alphanumeric characters.
Also you could do it with just one line, with a more advanced regular expression:
var storage:String = 'AA_word1:AA_word2:AA_word3:AA_example1:AA_example2';
const a:Array = storage.match(/(?<=AA_)\w+(?=:|$)/g);
This means: one or more word characters, preceded by "AA_" and followed by ":" or the end of the string. (Note that "AA_" and ":" won't be included in the resulting match.)
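The lookaround pattern carries over to JavaScript almost unchanged, which makes it easy to try outside of Flash. The sketch below assumes an engine with lookbehind support (ES2018 or later); the function name storedWords is hypothetical and not taken from the answers above.

```javascript
// Hypothetical helper: lists the words stored after each "AA_" prefix.
function storedWords(storage) {
  // (?<=AA_) requires "AA_" before the match and (?=:|$) requires ":"
  // or the end of the string after it; neither is part of the result.
  var found = storage.match(/(?<=AA_)\w+(?=:|$)/g);
  // match() returns null when nothing matches, so normalise to [].
  return found === null ? [] : found;
}

console.log(storedWords('AA_word1:AA_word2:AA_word3:AA_example1:AA_example2'));
```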
I am using Horde_Text_Diff to compute the difference between two strings. Sample code is as follows:
$check_diff = new Horde_Text_Diff( 'auto', array('asdf','asd11') );
$renderer = new Horde_Text_Diff_Renderer_Inline();
echo $renderer->render($check_diff);
This echoes nothing. The correct behaviour would be to show a difference at character 4.
If I change the comparison array from array('asdf','asd11') to, for instance, array('asdf','12345'), then it will output a1. In other words, it seems only to be comparing the first character. Any ideas?
When I try this, I get two warnings:
PHP Warning: array_walk() expects parameter 1 to be array, string given in /usr/share/php/Horde/Text/Diff/Engine/Native.php on line 33
PHP Warning: array_walk() expects parameter 1 to be array, string given in /usr/share/php/Horde/Text/Diff/Engine/Native.php on line 34
I.e., something is getting strings where it expects arrays.
That's because, rather than passing (an array containing) two strings to Horde_Text_Diff(), you should pass (an array containing) two arrays-of-strings (where each string represents a line of text).
If the actual strings you're currently trying to pass in contain multiple lines of text, then you can split them into arrays-of-strings using explode(), e.g.:
$a = "foo\nbar\nbaz";
$b = "foo\nqux\nbaz";
$a_lines = explode("\n", $a);
$b_lines = explode("\n", $b);
$check_diff = new Horde_Text_Diff( 'auto', array($a_lines, $b_lines) );
$renderer = new Horde_Text_Diff_Renderer_Inline();
echo $renderer->render($check_diff);
which outputs:
foo
<del>bar</del><ins>qux</ins>
baz