How to find non-ASCII symbols in a string. DB2

Please advise on the following issue.
I have a table column of type VARCHAR. I need to validate that this field doesn't contain any non-ASCII symbols (such as ╥ ї ╡), but I haven't found a way to do this.
Please give me a hand with this. Thanks in advance!
Update:
The example linked in the comments doesn't solve my issue. It checks against a fixed set of Latin letters and digits, but my field accepts Japanese and Chinese symbols.

Time for another silly XML trick:
SELECT
XMLQUERY('matches($X,"^[A-Za-z0-9]+$")'
PASSING XMLTEXT('╥ї╡') AS "X"
)
FROM SYSIBM.SYSDUMMY1
1
-----
false
See https://stackoverflow.com/a/17467695/3434508 for details on using Regular Expressions for DB2
See https://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.xml.doc/doc/xqrregexp.html for advanced RegEx character classes.
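
For comparison outside the database, the same check can be sketched in Python. This is only an illustration, assuming "non-ASCII" means any code point above 127; the function names `non_ascii_chars` and `is_ascii_only` are made up for this example and are not part of DB2.

```python
def non_ascii_chars(text: str) -> list[str]:
    """Return every character of text whose code point is above 127."""
    return [ch for ch in text if ord(ch) > 127]


def is_ascii_only(text: str) -> bool:
    """True when text contains no character above code point 127."""
    return not non_ascii_chars(text)


if __name__ == "__main__":
    print(is_ascii_only("plain text 123"))
    print(non_ascii_chars("ab╥ї╡"))
```

Listing the offending characters, rather than only returning true or false, makes it easier to see why a given value was rejected.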

Related

how to replace the particular string in-between paragraph in DB2

In the string below, I need to replace the text that starts at DND_NCP_TEXT_START and ends at DND_NCP_TEXT_END with a space (' '), without hard-coding the string or storing it in a variable. Is this possible using a regular expression?
String:
'Dear DND_CST_NAME_STARTHARRIET SCOTT
:DND_CST_NAME_END
DND_NCP_NAME_STARTHARRY SHORT
:DND_NCP_NAME_END
DND_ATTORNEY_TEXT_START
Our records indicate that you represent
DND_CST_NAME_STARTHARRIET D SCOTT
DND_CST_NAME_END
DND_NCP_NAME_STARTHARRY A SHORT
DND_NCP_NAME_END in the above referenced child support matter. Please contact your client regarding this matter and advise us as soon as possible.
If you are no longer representing
DND_CST_NAME_STARTHARRIET D SCOTT
DND_CST_NAME_END
DND_NCP_NAME_STARTHARRY A SHORT
DND_NCP_NAME_END, please file a Notice of Withdrawal.
DND_ATTORNEY_TEXT_END'
Needed output:
The start string DND_NCP_NAME_START, the end string DND_NCP_NAME_END, and the text between them need to be removed.

The problem with your expression is that the text contains two such patterns, and since matching is greedy by default, everything from the first DND_NCP_TEXT_START to the last DND_NCP_TEXT_END is removed.
Use non-greedy matching (.*?), as in the example below.
SELECT
REGEXP_REPLACE
(
'DND_ATTORNEY_TEXT_START Dear EDWIN Our records indicate that you represent HARRIET D SCOTT in the above referenced child support matter. DND_NCP_TEXT_START Please contact your client regarding this matter and advise us as soon as possible.DND_NCP_TEXT_END Please co-operate.DND_NCP_TEXT_START Please contact your client regarding this matter and advise us as soon as possible.DND_NCP_TEXT_END We are anticipating your reply. Thanks a lot. DND_ATTORNEY_TEXT_END'
,'DND_NCP_TEXT_START.*?DND_NCP_TEXT_END',''
)
FROM SYSIBM.SYSDUMMY1;
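
The difference between greedy and non-greedy matching can also be seen in a small Python sketch. The markers START and END and the sample text are shortened stand-ins chosen for this illustration, not the real markers from the question. The `re.DOTALL` flag is added on the assumption that the real text spans several lines.

```python
import re

# Shortened stand-in for a text that contains the marker pair twice.
text = "A START x END B START y END C"

# Greedy: .* runs to the LAST "END", so "B" is swallowed as well.
greedy = re.sub(r"START.*END", "", text, flags=re.DOTALL)

# Non-greedy: .*? stops at the NEAREST "END", so each pair is removed separately.
lazy = re.sub(r"START.*?END", "", text, flags=re.DOTALL)

print(repr(greedy))
print(repr(lazy))
```

With the greedy pattern the text between the two marked sections is lost; with the non-greedy pattern it is kept.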

PexObserve only records 255 characters

I am using Pex from the command line to find input values for test case generation.
I use PexObserve to record certain values during execution.
One of the values that I want to record is an XML-String.
However, when parsing the XML I receive "malformed XML" exceptions, since Pex only writes the first 255 characters into the log.
Is there a way to record the full XML string? Or does PexObserve have a different type that will let me record longer texts?

Leaving this here, in case somebody at any point has the same issue.
I've found a solution that helped me.
Unfortunately the 255 character limit is set internally in static readonly fields.
Therefore I needed to use reflection.
My solution works by including the following line in the PUT:
typeof(Microsoft.Pex.Framework.PexObserve.ValueWriterManager).GetField("MaxWrittenElements").SetValue(null, 1000);
Replace 1000 with any value you like.
But remember that this is a quick fix that might not work for you.
It may have unwanted side effects: you're also changing the number of List elements that are written, and perhaps other things.

What does "SEM1:3ENCE_B:NW:NG102:EECT300:120:0900:2" mean?

My project deals with teachers and their timetables. I was given a text file that contains the teacher timetable from my university. They were unable to tell me what syntax or language it uses, so I don't know how to read it and use it in my iPhone app. Can you help me identify what sort of code this is and how I can read it?
Sample:
SEM1:3ENCE_B:NW:NG102:EECT300:120:0900:2
SEM1:3ENCE_B,3ENCE_C:TW:NLG107:EEEL300:120:0900:1
19:3ENCE_A,3ENCE_B,3ENCE_C:TW:CLG.01:EEEL305_L:120:1100:1
19:3ENCE_A,3ENCE_B,3ENCE_C:TW:NLG107:EEEL305:120:0900:1
SEM1:3ENCE_A,3ENCE_B:TW::EEEL300:120:1100:4
SEM1&2:3ENCE_A,3ENCE_B,3ENCE_C,3ENCE_D:SK:CLG.06:EEEL315_L:120:1400:4
SEM1:3CS_A,3CS_B,3CS_C,3CS_D,3ENCE_A,3ENCE_B,3ENCE_C,3ENCE_D:DHE:CLLT:EICG301_L:120:0900:5
SEM1:3CS_A,3CS_B:ABO,DHE:N5.114:EICG301:120:1100:5
SEM1:3CS_A,3CS_B,3CS_C,3CS_D,3ENCE_A,3ENCE_B,3ENCE_C,3ENCE_D:NW:LTS205:EECT300_L:120:1600:2
27:3ENCE_A,3ENCE_B,3ENCE_C,3ENCE_CS::NG100:EEEL320:120:1100:2
SEM1:3CS_A,3CS_B,3CS_C,3CS_D:NW:C2.14:ECSC302_L:120:0900:3
SEM1:3CS_A:NW:NG100:EECT300:120:1400:2

It's not code, it's data. The best way of interpreting it is to compare this representation with another one: think Rosetta Stone.
Obviously, the colon is used to separate the fields, and each line probably represents a single timetable item. Each line appears to have 8 fields.
One field looks like a course ID: EECT300
Another looks like a time: 0900
As for the rest, you'll have to work it out...
University of Westminster, maybe...?

It is not a programming language.
It is just a plain text file that contains data, using colons (:) as separators.
I guess you have to parse it and retrieve the information for each column. You need to know the meaning of each column (if you don't, ask your university).
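
A minimal parsing sketch in Python, assuming only what the sample shows: colons separate the fields, and commas separate several values within one field. The function name `parse_line` is made up for this example, and since the meaning of most columns is unknown, the fields are left unnamed.

```python
def parse_line(line: str) -> list[list[str]]:
    """Split one record on ':' and each field on ','.

    An empty field (two colons in a row) becomes an empty list.
    """
    return [field.split(",") if field else [] for field in line.strip().split(":")]


if __name__ == "__main__":
    sample = "SEM1:3ENCE_A,3ENCE_B:TW::EEEL300:120:1100:4"
    fields = parse_line(sample)
    print(len(fields))  # number of colon-separated fields
    for position, values in enumerate(fields):
        print(position, values)
```

Keeping empty fields in place matters here, because some sample lines have a blank column and the remaining columns would otherwise shift.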

In Sphinx Search, how do I add "hashtag" to the charset_table?

I would like people to be able to search #photography as well as photography. Those should be treated as two different words in Sphinx. By default, #photography maps to photography, and I can't search for hashtags.
I read on this page that you can add the hash tag to the charset_table to accomplish this. I am completely clueless on how to do that. I don't know unicode, and I don't know what my charset_table should be.
Can someone tell me what my charset_table should be? Thanks.
# charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
Note: I plan on using real-time index. (not sure if this makes a difference)

According to the Unicode table, # is U+0023, so the final config should look like this:
charset_table = 0..9, A..Z->a..z, _, a..z, U+23, U+410..U+42F->U+430..U+44F, U+430..U+44F
Don't forget about the charset_type setting: AFAIK, this example charset_table is for utf-8. You should also remove U+23 from the blend_chars setting so that Sphinx indexes it as a regular character.

Good day.
There may be a workaround for your problem, but passing the raw user query directly to the search function is bad practice.
Before calling the search function in the Sphinx engine, you should preprocess the user's string.
For example, you can check the string for special characters and remove them from the query, and then call the search function with the processed query.
Good luck.
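
The preprocessing step described above could look roughly like the following Python sketch. It is an assumption-laden illustration: the function name `sanitize_query` is hypothetical, and the rule "keep letters, digits, underscore, whitespace and #, drop everything else" is just one possible choice, not something prescribed by Sphinx.

```python
import re


def sanitize_query(query: str) -> str:
    """Replace every character except word characters, whitespace and '#'
    with a space, then collapse runs of whitespace."""
    cleaned = re.sub(r"[^\w\s#]", " ", query)
    return " ".join(cleaned.split())


if __name__ == "__main__":
    print(sanitize_query("#photography (sale)!"))
```

The cleaned string would then be passed to the search call in place of the raw user input.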

Handle unknown characters

I need to retrieve a substring from a text. The text is returned by a device, and the problem is that it contains unknown characters. I'm trying to retrieve the value '1' at the end, but the XSLT statement fails because of the junk characters (shown as BS, and as ^H in the vi editor).
Is there a way to remove these keystroke characters from the text so that I can use the regular string functions in XSLT?
Any help would be much appreciated.
Thank you!
<xsl:value-of select="substring-before('show owp onu next-available port gpon_1/2$nu next-available port gpon_1/2 / 3 : 81.' , '.')"/>

If your data contains a Backspace control character then it isn't legal XML, and if it isn't legal XML then you can't process it using XSLT. You have to deal with the problem at the stage when you are turning the text returned by the device into XML.
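
One way to handle this at that earlier stage is to strip the characters that XML 1.0 does not allow before the device output is wrapped in XML. The Python sketch below assumes the cleanup can happen in a script that runs before the XSLT; the function name `strip_illegal_xml_chars` and the sample string are made up for this illustration. It simply deletes the control characters and does not emulate what a backspace would do on a terminal.

```python
import re

# Everything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text: str) -> str:
    """Remove characters that may not appear in an XML 1.0 document."""
    return _ILLEGAL_XML_CHARS.sub("", text)


if __name__ == "__main__":
    raw = "port gpon_1/2 / 3 : \x08\x081."  # \x08 is the backspace (^H)
    print(strip_illegal_xml_chars(raw))
```

Once the text is clean, it can be embedded in XML and functions such as substring-before work as usual.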
