Sphinx search with any "symbol" breaks the search engine?

For some reason, when I type a symbol anywhere in the query
(. ! ^ , or anything else), the search engine breaks.
But when I type "#", the search engine does not break.
That's because I modified the charset to deal with #hashtags.
charset_table = 0..9, A..Z->a..z, _, a..z, U+23, U+410..U+42F->U+430..U+44F, U+430..U+44F
How can I fix this so that I can include other symbols in the query while keeping the hashtag working? (Right now the hashtag works wonders, and Sphinx treats it as a normal keyword.)
By the way, the Unicode code point for the hashtag is "U+0023".

I am not sure about . and , (Sphinx specific?), but ! and ^ (among others, that is + - && || ! ( ) { } [ ] ^ " ~ * ? : \) are Lucene special characters and you need to escape them. See Escaping Special Characters at the bottom of the page.
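
The escaping described above can be sketched in Python. This is only an illustration: escape_query is a hypothetical helper, the character set is copied from the list in this answer, and escaping "&" and "|" one character at a time (rather than only the pairs "&&" and "||") is an assumption. Whether Sphinx needs exactly this set is not confirmed here.

```python
# Characters listed in the answer above as Lucene special characters.
# Assumption of this sketch: "&&" and "||" are covered by escaping
# every single "&" and "|".
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\')


def escape_query(text: str) -> str:
    """Prefix every listed special character with a backslash."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in text)


if __name__ == "__main__":
    # "#" is not in the list, so a hashtag passes through unchanged.
    print(escape_query("#hashtag what? really! a^b"))
```

The escaped string would then be passed to the search engine in place of the raw user input.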

Related

search a string having special character

I need to search for a string that contains special characters,
e.g.
keyword = alpha#bet+a
db.collection.find({ 'course.title': { $regex: keyword, $options: 'i' } })
but I am getting errors like "keyword cannot be empty" and "invalid regex expression".
I even tried putting a backslash before special characters:
keyword.trim().replace(/[^a-z\d#&]/g, '\$&')
This works for a few characters but not for all.
I am using React.js on the frontend and Node.js with MongoDB on the backend.
The following query searches for a string with special characters in a MongoDB document. Here, we search for the keyword alpha#bet+a, escaping the regex metacharacter + with a backslash:
db.searchDocumentWithSpecialCharactersDemo.find({ field_name: /.*alpha#bet\+a.*/i }).pretty();
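
When the keyword comes from user input, the escaping has to happen programmatically. A minimal Python sketch of that idea follows; build_title_filter is a hypothetical helper, and it assumes that the output of Python's re.escape is also accepted by MongoDB's regex engine. The field name course.title is taken from the question.

```python
import re


def build_title_filter(keyword: str) -> dict:
    """Build a MongoDB-style filter that matches `keyword` literally."""
    keyword = keyword.strip()
    if not keyword:
        raise ValueError("keyword cannot be empty")
    # re.escape puts a backslash in front of regex metacharacters such as "+",
    # so they are matched as ordinary characters.
    return {"course.title": {"$regex": re.escape(keyword), "$options": "i"}}


query = build_title_filter("alpha#bet+a")
pattern = query["course.title"]["$regex"]

# Local check with Python's own regex engine: the literal keyword matches,
# while a string that only fits the unescaped "t+" repetition does not.
print(bool(re.search(pattern, "Intro to ALPHA#BET+A", re.IGNORECASE)))
print(bool(re.search(pattern, "alpha#bettta", re.IGNORECASE)))
```

The resulting dictionary has the same shape as the filter in the question and could be handed to a driver's find call.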

How to escape brackets in SPARQL string?

I'm trying to make a SPARQL query against: http://sparql.lynx-project.eu/
The graph: http://sparql.lynx-project.eu/graph/eurovoc
It contains some entries with brackets in the prefLabel, e.g. "sanction (EU)".
I'm trying to retrieve an exact match for such entries with:
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?c ?label
WHERE {
GRAPH <http://sparql.lynx-project.eu/graph/eurovoc> {
?c a skos:Concept .
?c ?p ?label.
FILTER regex(?label, "^sanction (EU)$", "i" )
FILTER (lang(?label) = "en")
FILTER (?p IN (skos:prefLabel, skos:altLabel ) )
}
}
It doesn't return anything. I also tried escaping the brackets with a backslash, but the query breaks. Do you know how to escape brackets in a SPARQL string? Thanks in advance!
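
One likely cause is that the pattern passes through two layers: the SPARQL string literal and then the regex engine. A literal bracket needs \( at the regex level, and that backslash has to be doubled inside the string literal. The Python sketch below builds such a FILTER clause; escape_regex and sparql_string_literal are hypothetical helpers, the set of metacharacters is an assumption, and the result has not been run against the endpoint above.

```python
# Characters treated as regex metacharacters in this sketch (an assumption).
REGEX_META = set("\\.?*+(){}[]^$|")


def escape_regex(text: str) -> str:
    """Backslash-escape regex metacharacters so `text` matches literally."""
    return "".join("\\" + ch if ch in REGEX_META else ch for ch in text)


def sparql_string_literal(value: str) -> str:
    """Quote `value` as a SPARQL string literal, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


label = "sanction (EU)"
pattern = "^" + escape_regex(label) + "$"  # regex level: ^sanction \(EU\)$
filter_clause = f'FILTER regex(?label, {sparql_string_literal(pattern)}, "i")'
print(filter_clause)
```

If only an exact, case-insensitive match is needed, comparing lcase(str(?label)) with a lower-cased literal may avoid regex escaping altogether.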

How to replace part of a string with an added condition

The problem:
The objective is to convert: "tan(x)*arctan(x)"
Into: "np.tan(x)*np.arctan(x)"
What I've tried:
s = "tan(x)*arctan(x)"
s = s.replace('tan','np.tan')
Out: np.tan(x)*arcnp.tan(x)
However, using Python's replace method resulted in arcnp.tan.
Taking one additional step:
s = s.replace('arcnp.', 'np.arc')
Out: np.tan(x)*np.arctan(x)
Achieves the desired result... but this solution is sloppy and inefficient.
Is there a more efficient solution to this problem?
Any help is appreciated. Thanks in advance.
Here is a way to do the job:
var string = 'tan(x)*arctan(x)';
var res = string.replace(/\b(?:arc)?tan\b/g,'np.$&');
console.log(res);
Explanation:
/ : regex delimiter
\b : word boundary, make sure we don't have any word character before
(?:arc)? : non capture group, literally 'arc', optional
tan : literally 'tan'
\b : word boundary, make sure we don't have any word character after
/g : regex delimiter, global flag
Replace:
$& : means the whole match, i.e. tan or arctan
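
Since the question uses Python, the same pattern can be sketched with re.sub. The function name prefix_tan is made up for this illustration; \g<0> in the replacement plays the role of $& in JavaScript.

```python
import re


def prefix_tan(expr: str) -> str:
    """Prefix standalone tan/arctan calls with 'np.'."""
    # \b(?:arc)?tan\b matches "tan" or "arctan" as whole words only;
    # \g<0> inserts the whole match into the replacement.
    return re.sub(r"\b(?:arc)?tan\b", r"np.\g<0>", expr)


print(prefix_tan("tan(x)*arctan(x)"))
```

Because of the word boundaries, a name such as xtan is left untouched.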
You can use a regular expression to solve your issue. The following code is in JavaScript, since you didn't mention the language you are using.
var string = 'tan(x)*arctan(x)*xxxtan(x)';
console.log(string.replace(/([a-z]+)?(tan)/g,'np.$1$2'));

How to replace unwanted characters

I have some hotel names that contain characters which are not valid in file names, since file naming doesn't allow /, * or ?. I want to use these hotel names as file names, and I also want to know what the error described below means.
text?text
text?text
text**text**
text*text (text)
text *text*
text?
I am trying to use an if/else statement so that if a hotel name contains any of these characters, they are replaced with -. However, I am receiving an error stating a dangling ?. I just want to check whether I am using replace correctly for these characters.
def hotelNameTrim = hotelName.toString()
if (hotelNameTrim.contains("/"))
{
hotelNameTrim.replaceAll("/", "-")
}
else if (hotelNameTrim.contains("*"))
{
hotelNameTrim.replaceAll("*", "-")
}
else if (hotelNameTrim.contains("?"))
{
hotelNameTrim.replaceAll("?", "-")
}
replaceAll accepts a regex as its search pattern. * and ? are special characters in regex and need to be escaped with a backslash, which itself needs to be escaped in a Java string :)
Try this:
hotelNameTrim = hotelNameTrim.replaceAll("[\\*\\?/]", "-")
That will replace all of those characters with a dash. Note that replaceAll returns a new string, so the result has to be assigned.
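
For comparison, here is the same character-class approach as a small Python sketch. It is not the Groovy solution itself; sanitize_hotel_name is a hypothetical helper and the sample names are the ones from the question.

```python
import re


def sanitize_hotel_name(name: str) -> str:
    """Replace /, * and ? with a dash so the name can be used as a file name."""
    # Inside a character class, * and ? are literal characters.
    return re.sub(r"[/*?]", "-", name)


for name in ["text?text", "text**text**", "text*text (text)", "text *text*", "text?"]:
    print(sanitize_hotel_name(name))
```

A single substitution handles all three characters at once, so no if/else chain is needed.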

How to search for unicode characters in records of DB2?

I have a table in DB2, say METAATTRIBUTE, in which a column, say "content", might contain any special character, including Unicode characters.
For any special character, e.g. "#", I can simply search with:
Select * from METAATTRIBUTE where content like '%#%';
But how do I search for Unicode characters like "u201B" or "u201E"?
Thanks in advance.
Assuming you are talking about DB2 LUW, the Unicode string literals are designated by the symbols "u&", followed by a regular string literal in single quotes. Unicode code points are designated by an escape character, backslash by default. For example:
$ db2 "values u&'\201b'"
1
---
‛
1 record(s) selected.
So your query would look like:
Select * from METAATTRIBUTE where content like u&'%\201b%';
I recently had the same problem. This worked for me:
select *
from METAATTRIBUTE
where MEDEDELINGSZONE like '%' || UX'201B' || '%'
