Meaning of the / operator in Solr search

I found that the following characters have a special meaning in a Solr search:
+ - && || ! ( ) { } [ ] ^ " ~ * ? : /
I understand all of them except the / operator.
Can you help?

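The backslash "\" is the character used for escaping special characters in a Solr query. For example, to search for (1+1):2 you should escape it like this: \(1\+1\)\:2. As for "/" itself: since Solr 4.0 (Lucene 4.0) the query parser treats it as the delimiter of a regular-expression query, for example /[mb]oat/, so a literal / in a search term has to be escaped as \/. Let me know if that helps you to understand it :)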
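The escaping described above can be sketched in a few lines of Python. This is only an illustration: `escape_solr_query` is a hypothetical helper, not part of Solr or of any client library, and it makes the simplifying assumption that escaping every single `&` and `|` is acceptable even though only `&&` and `||` are operators.

```python
# Characters listed in the question as special, plus the backslash itself.
# Assumption: escaping a lone & or | is harmless, so each one is escaped.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:/\\')


def escape_solr_query(text):
    """Prefix every special character in text with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


escaped = escape_solr_query("(1+1):2")
print(escaped)  # \(1\+1\)\:2
```

Characters outside the set pass through untouched, so ordinary search terms are left as they are.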

Related

get the string between two char in php using preg_match or Symfony dom crawler

I have a string like this:
Power.S04E10.You......
I want to get the 04 and the 10 from the string.
Note: the S and the E are always followed by exactly two digits.
My project already uses the Symfony DomCrawler, so I would appreciate a solution using either DomCrawler or preg_match.
Performance matters here because the code runs inside a loop over a large number of strings.
So far I have written the code below, which gives me the s04e10 part of the string, but I don't know how to get the 04 and the 10 separately.
$matches = [];
$s = 'Power.S04E10.You.Cant.Fix.This.720p.&.1080p.NF.WEB-DL.DD5.1.x264-NTb';
$t = preg_match('/s([0-9]){2}e([0-9]){2}/i', $s, $matches);
thanks in advance
You may use
$s = 'Power.S04E10.You.Cant.Fix.This.720p.&.1080p.NF.WEB-DL.DD5.1.x264-NTb';
if (preg_match('/\.s([0-9]+)e([0-9]+)\./i', $s, $matches)) {
    echo $matches[1] . " - " . $matches[2]; // => 04 - 10
}
See the PHP demo and the online regex demo.
I assume SXXEYY sits between dots; if not, replace each \. with \b, a word boundary.
Pattern details
\. - a dot
s - an s or S (the i flag makes the pattern case-insensitive)
([0-9]+) - Group 1: one or more digits (use ([0-9]{2}) instead to require exactly two digits)
e - an e or E
([0-9]+) - Group 2: one or more digits (again, ([0-9]{2}) requires exactly two digits)
\. - a dot
$matches[1] holds the contents of Group 1 and $matches[2] holds the contents of Group 2.
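For comparison, here is the same idea as a small Python sketch (the function name `season_episode` is made up for the illustration). It also shows why the pattern in the question returned single digits: in `([0-9]){2}` the quantifier sits outside the group, so the group is repeated and only its last iteration is kept, whereas `([0-9]{2})` captures both digits.

```python
import re

NAME = "Power.S04E10.You.Cant.Fix.This.720p.&.1080p.NF.WEB-DL.DD5.1.x264-NTb"

# Quantifier inside the group: each group captures both digits.
PATTERN = re.compile(r"\.s([0-9]{2})e([0-9]{2})\.", re.IGNORECASE)


def season_episode(name):
    """Return (season, episode) as strings, or None if the name has no SxxEyy part."""
    m = PATTERN.search(name)
    return (m.group(1), m.group(2)) if m else None


print(season_episode(NAME))  # ('04', '10')

# Quantifier outside the group, as in the question: only the last digit survives.
last_digits = re.search(r"s([0-9]){2}e([0-9]){2}", NAME, re.IGNORECASE).groups()
print(last_digits)  # ('4', '0')
```

Compiling the pattern once outside the loop is the usual way to keep the per-string cost low when many names are processed.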

ANTLRv4 : Read double quotes escaped with both \ and "

I'm trying to implement a parser using ANTLRv4 for a language that accepts both "" and \" as ways of escaping " characters in "-delimited strings.
The answers to this question show how to do it for "" escaping. However, when I try to extend it to also cover the \" case, it almost works but becomes too greedy when two strings are on the same line.
Here is my grammar:
grammar strings;
strings : STRING (',' STRING )* ;
STRING
: '"' (~[\r\n"] | '""' | '\"' )* '"'
;
Here is my input of three strings:
"This is ""my string\"",
"cat","fish"
This correctly recognises "This is ""my string\"", but thinks that "cat","fish" is all one string.
If I move "fish" down on to the next line it works correctly.
Can anyone figure out how to make it work if "cat" and "fish" are on the same line?
Make your STRING rule non greedy to stop at the first quote char it encounters, instead of trying to get as much as possible:
STRING
: '"' (~[\r\n"] | '""' | '\"' )*? '"'
;
I've found what I needed to do to get this to work as I wanted, though to be honest I'm still not entirely sure why ANTLR was doing what it did.
Simply adding another backslash to the '\"' clause makes it work!
So my final STRING definition is: '"' (~[\r\n"] | '""' | '\\"' )* '"'
Going back to first principles, I hand-drew a state transition diagram of the problem and realised that the two escaping sequences are not the same and cannot be treated alike. Then, while trying to implement the two patterns in ANTLRWorks, it became apparent that I needed the second backslash, at which point it all started working.
Does a single backslash followed by some arbitrary character simply mean that character?
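The behaviour reported above is consistent with '\"' being read as an escape for a plain " character, so that the third alternative accepted any lone quote and the longest match swallowed "cat","fish" whole. The Python sketch below reproduces the effect with regular expressions. It is only an analogue, under the assumption that a backtracking regex stands in well enough for the ANTLR lexer on this input; the names FIXED and BROKEN are made up for the illustration.

```python
import re

# Analogue of the corrected rule: the third alternative is backslash + quote.
FIXED = re.compile(r'"(?:""|\\"|[^\r\n"])*"')
# Analogue of the original rule if '\"' means just a quote character.
BROKEN = re.compile(r'"(?:""|"|[^\r\n"])*"')

# The three-string input from the question, spread over two lines.
text = '"This is ""my string\\"",\n"cat","fish"'

fixed_tokens = FIXED.findall(text)
broken_tokens = BROKEN.findall(text)

print(len(fixed_tokens), fixed_tokens[1:])    # 3 ['"cat"', '"fish"']
print(len(broken_tokens), broken_tokens[1:])  # 2 ['"cat","fish"']
```

The first line is tokenised the same way by both patterns because the comma is followed by a line break, which matches the observation that moving "fish" to its own line hides the problem.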

Execution of ls command in Linux

I don't understand how the pattern is evaluated when ?? and * are used together.
The following files are in the current working directory:
abc.txt
abcd.txt
bcd.txt
amm.doc
ammc.txt
What is the result of executing the command ls a??.* ?
* matches any string, including the empty string
? matches any single character
For example:
Pattern a??.* matches abc.txt
- (a,a)
- (?,b)
- (?,c)
- (.,.)
- (*,txt)
Pattern a??.* doesn't match abcd.txt
- (a,a)
- (?,b)
- (?,c)
- but . doesn't match d
Pattern a??.* doesn't match bcd.txt because a doesn't match b.
Each question mark matches exactly one character, while the * matches zero or more characters. Your example will list only abc.txt and amm.doc. Look up shell globbing if you want to know more.
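The matching rules above can be checked without touching the file system. The sketch below uses Python's standard fnmatch module, whose ? and * follow the same rules for this pattern; it only mimics the filename expansion that the shell performs before ls runs, and the file list is the one from the question.

```python
from fnmatch import fnmatchcase

files = ["abc.txt", "abcd.txt", "bcd.txt", "amm.doc", "ammc.txt"]

# Keep the names that the pattern a??.* matches in full.
matches = [name for name in files if fnmatchcase(name, "a??.*")]
print(matches)  # ['abc.txt', 'amm.doc']
```

abcd.txt and ammc.txt drop out because their fourth character is not a dot, and bcd.txt drops out on the first character.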

Split a string in Perl

I am trying to split a string in Perl as shown below:
String = "What are you doing these days?"
Split1 - What
Split2 - are
Split3 - you
Split4 - doing these days?
I want the first n words separately and the rest of the line together in a separate variable.
Is there any way to do this? There is no common delimiter I can use. Any help is appreciated! Thanks.
Perl's split has a limit parameter that seems to be just what you want. To split off the first $n words and leave the rest together, use $n+1 as the limit (the result will be at most $n+1 elements):
my $n = 3;
my $string = "What are you doing these days?";
my @words = split / /, $string, $n+1;
print "$_\n" for @words;
my ($string1, $string2, $string3, $rest) = split(/ /, $instring, 4);
You can also use the following regex to split the string as required:
my $ip_string = "What are you doing these days?";
if ($ip_string =~ m/(\S+)\s(\S+)\s(\S+)\s(.*)/) {
    print("1=$1,2=$2,3=$3,4=$4\n");
}
else {
    print("no match...\n");
}
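The limit idea carries over to other languages. As a rough Python counterpart (the helper name `split_first_words` is made up for the illustration), str.split takes a maximum number of splits rather than a maximum number of fields, so the argument is n where Perl's limit is n+1.

```python
def split_first_words(text, n):
    """Return the first n space-separated words plus the untouched remainder."""
    return text.split(" ", n)


parts = split_first_words("What are you doing these days?", 3)
print(parts)  # ['What', 'are', 'you', 'doing these days?']
```

Like the split / / version, this assumes single spaces between words.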

Perl string matching

I am having problems with Perl string matching/searching using both index and the =~ operator. I need to search for the string "RT @zaynmalik: Big cover for @cosmopolitanuk ! Boys looking slick http://example.com/FcWA80HI" in a text file.
if($splitlines[1] =~ /RT @zaynmalik: Big cover for @cosmopolitanuk ! Boys looking slick http://example.com/FcWA80HI/){
    ## Do something ##
}
However, because '@' is a special character in Perl, I am getting compile errors. Could you suggest a way to do this? I tried saving the string in a variable like $str, but it did not work (which is understandable).
So, this is what I am doing now:
$max_freq_tweet = 'RT @zaynmalik: Big cover for @cosmopolitanuk ! Boys looking slick http://example.com/FcWA80HI';
if($splitlines[1] =~ /\Q$max_freq_tweet\E/){
    print FILE5 "$splitlines2[1] \n";
}
But it still doesn't seem to be working.
Either escape the @ with a backslash, or use single quotes.
my $search_string = 'RT @zaynmalik: Big cover for @cosmopolitanuk ! Boys looking slick http://example.com/FcWA80HI';
# or: "RT \@zaynmalik: Big cover for \@cosmopolitanuk ! Boys looking slick http://example.com/FcWA80HI"
if (-1 != index $str, $search_string) {
    # do something
}
If you have a string and want to use it in a regex, you should protect its characters with \Q...\E:
if ($str =~ /\Q$search_string\E/) {
    # do something
}
\Q...\E doesn't prevent array interpolation, but no character in the interpolated string will be treated as special; without it, the . in the string would match any character!
You need to escape the @ in your regexp, as in $str =~ /RT \@.*:/.
Edit: you also need to escape slashes (/) with a backslash (\): $str =~ /RT \@.*: .* http:\/\/.*/.
You need to escape special characters with a preceding \ (backslash).
This is relevant not only for @, but for other characters too.
To be on the safe side, you can escape any character that is not a letter, digit, or underscore.
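The two approaches from the answers, a plain substring search and a regex with every metacharacter neutralised, look like this in Python. It is a sketch for comparison only: re.escape plays the role that \Q...\E plays in Perl, and the sample strings in `line` and `lookalike` are invented for the illustration.

```python
import re

needle = "Boys looking slick http://example.com/FcWA80HI"
line = "tweet text: Boys looking slick http://example.com/FcWA80HI (retweeted)"

# Plain substring search: no regex involved, so nothing needs escaping.
found_plain = needle in line

# Regex search with all metacharacters in the needle escaped.
found_escaped = re.search(re.escape(needle), line) is not None

# Unescaped, the "." matches any character, so a different host matches too.
lookalike = "Boys looking slick http://exampleXcom/FcWA80HI"
false_positive = re.search(needle, lookalike) is not None
rejected = re.search(re.escape(needle), lookalike) is None

print(found_plain, found_escaped, false_positive, rejected)  # True True True True
```

When the text to find is a fixed string, the substring search is the simpler choice; escaping is only needed once the string has to become part of a larger pattern.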
