How do you do a LOCATE in Unidata with BASICTYPE 'U' for #AM's?

I typically use BASICTYPE 'P' at our shop but had an occasion to use 'U' for a project and noticed that I could not do a locate on a dynamic array that is delimited by Attribute Marks.
The docs plainly state that in type U, not specifying an attribute expression is a syntax error. This seems like a huge oversight to me.
How would this be done without resorting to a for-loop to search for these items?

If your array is delimited by attribute marks, you simply need to do your locate in the following syntax:
LOCATE expression IN array_name SETTING position_var THEN | ELSE ...
It's a bit trickier to locate within a value mark delimited array, which would be:
LOCATE expression IN array_name<1> SETTING position_var THEN | ELSE ...

There are two forms of the LOCATE statement.
One takes the form of Locate xxx in yyy setting zzz then aaa else bbb
and the other
Locate(xxx;yyy;zzz) then aaa else bbb

When in BASICTYPE 'U', you could use the FIND statement instead.
From the manual:
Syntax
FIND expr IN dyn.array[,occur] SETTING f [,v[,s]] {THEN statements | ELSE statements}
Description
The UniBasic FIND command determines the position of the given expression in a
dynamic array. FIND returns the attribute, value, and subvalue position of the found
string. The expression must match the entire array element to make a match.
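To make the quoted description concrete, here is a rough Python model of what FIND reports. It is only a sketch for illustration, not UniBasic and not UniData's implementation: the helper name `find` is hypothetical, and the delimiter codes (254, 253, 252 for attribute, value, and subvalue marks) are assumed.

```python
# Assumed delimiter codes: attribute, value, and subvalue marks.
AM, VM, SVM = chr(254), chr(253), chr(252)

def find(expr, dyn_array, occur=1):
    """Return the 1-based (attribute, value, subvalue) position of the
    occur-th element that equals expr exactly, or None if there is none."""
    seen = 0
    for a, attr in enumerate(dyn_array.split(AM), start=1):
        for v, val in enumerate(attr.split(VM), start=1):
            for s, sub in enumerate(val.split(SVM), start=1):
                if sub == expr:  # whole-element match, no substrings
                    seen += 1
                    if seen == occur:
                        return (a, v, s)
    return None

# Hypothetical record: three attributes, the second holding two values.
record = AM.join(["apple", "pear" + VM + "plum", "fig"])
print(find("fig", record))   # attribute-level hit
print(find("plum", record))  # value-level hit inside attribute 2
print(find("pl", record))    # partial text does not count as a match
```

The point of the model is that a single search covers every delimiter level, which is why FIND works on an attribute-delimited array without an attribute expression.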

Error missing values using .replace(regex)

I'm trying to replace some messy data with regex in a data frame. The column holds qualification values, but they are inconsistent. For example, the column 'packaging' contains 'plastic', 'plastique', 'Plasticpackage', or 'Karton', 'carton', 'Carton', but these all mean the same thing, that is, 'plastic' or 'Carton'. Therefore I'm trying to replace all these values with .replace and regex. My code looks like this:
dict1={r'[cK]arton':'Carton',r'\W*((?i)plasti(?-i))\W*':'Plastique',r'[cC]onserve':'Conserve'}
df['packaging'].replace(dict1,inplace=True,regex=True)
However, when I execute it, it gives me the error: missing : at position 18
I have checked: lines 1 to 18 contain 17 missing values, not only line 18, so why do I get this error? Should I tell Python to ignore NA values? The replace() function does not seem to have an ignore-NA option.
Thank you very much in advance
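Python's re module does not allow a global inline modifier such as (?i) anywhere but the start of the pattern, and it does not support the (?-i) notation for switching off the effect of a preceding (?i). The "position 18" in the message is a character position inside the regex (where the parser expected a colon after (?-i), not row 18 of your data, so missing values are not the cause.
Instead, you can use an inline modifier group, (?i:...).
So, you need to fix the regex definitions like this: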
dict1={
r'[cK]arton':'Carton',
r'\W*(?i:plasti)\W*':'Plastique',
r'[cC]onserve':'Conserve'
}
Or, r'\W*(?i:plasti)\W*' can also be written as r'(?i)\W*plasti\W*'.
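A quick way to check this without pandas is to compile the patterns with the standard re module. This is a minimal sketch: the helper `compiles` and the sample strings are made up for illustration, and the scoped (?i:...) group needs Python 3.6 or later.

```python
import re

broken = r'\W*((?i)plasti(?-i))\W*'   # pattern from the question
fixed = r'\W*(?i:plasti)\W*'          # scoped, case-insensitive group
equivalent = r'(?i)\W*plasti\W*'      # flag applied to the whole pattern

def compiles(pattern):
    """Return True if re accepts the pattern, False if it raises re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(broken), compiles(fixed), compiles(equivalent))

# Hypothetical sample values, similar to the ones described in the question.
samples = ['plastic', 'PLASTIque', 'Plasticpackage', 'Karton']
matched = [s for s in samples if re.search(fixed, s)]
print(matched)
```

Note that with regex=True only the matched part of each value is substituted, so if the whole cell should become 'Plastique' the pattern has to cover the whole value.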

How to search events for a value from new eval fieldname in Splunk?

I need to search for events that contain a specific value generated from a new field name. This is what I'm trying to accomplish:
index=app sourcetype=source
| eval uri_t = "uri:type:subtype:123-5678:DATA_REFERENCE:DATA1:999:123-5678:DATA2:DATA_REFERENCE2:123456"
| eval uri2=replace(uri_t, "\:", "%3A")
| search uri2
Basically, I'm encoding part of a URL using the replace and eval functions into the field uri2, then I need to search specifically for the resulting encoded value. But it seems that search looks for the literal text "uri2" instead of the entire encoded string.
Note, I had to use replace to encode part of the url because it seems there is no encode function in splunk.
Any assistance will be appreciated.
As you've learned, the search command searches entire events. To find text within a field, use one of these commands.
| where match(uri2, "<regex>")
| regex uri2="<regex>"
Both of them will filter out events that do not match the given regular expression.
If you want to find a substring within the field without filtering events, then use the rex command.
| rex field=uri2 "<regex>"
Note that rex must contain a named capture group. The group name becomes the field into which rex puts the matching text.

Python regular expressions with Foreign characters in python PyQT5

This problem might be very simple but I find it a bit confusing & that is why I need help.
With relevance to this question I posted that got solved, I got a new issue that I just noticed.
Source code:
from PyQt5 import QtCore,QtWidgets
app=QtWidgets.QApplication([])
def scroll():
    #QtCore.QRegularExpression(r'\b'+'cat'+'\b')
    item = listWidget.findItems(r'\bcat\b', QtCore.Qt.MatchRegularExpression)
    for d in item:
        print(d.text())
window = QtWidgets.QDialog()
window.setLayout(QtWidgets.QVBoxLayout())
listWidget = QtWidgets.QListWidget()
window.layout().addWidget(listWidget)
cats = ["love my cat","catirization","cat in the clouds","catść"]
for i,cat in enumerate(cats):
    QtWidgets.QListWidgetItem(f"{i} {cat}", listWidget)
btn = QtWidgets.QPushButton('Scroll')
btn.clicked.connect(scroll)
window.layout().addWidget(btn)
window.show()
app.exec_()
Output GUI:
Now as you can see I am just trying to print out the text data based on the regex r"\bcat\b" when I press the "Scroll" button and it works fine!
Output:
0 love my cat
2 cat in the clouds
3 catść
However... as you can see on the #3, it should not be printed out cause it obviously does not match with the mentioned regular expression which is r"\bcat\b". However it does & I am thinking it has something to do with that special foreign character ść that makes it a match & prints it out (which it shouldn't right?).
I'm expecting an output like:
0 love my cat
2 cat in the clouds
Researches I have tried
I found this question and it says something about this \p{L} & based on the answer it means:
If all you want to match is letters (including "international"
letters) you can use \p{L}.
To be honest, I'm not sure how to apply that with PyQt5. I still made some attempts and tried changing the regex to r'\b'+r'\p{cat}'+r'\b'. However, I got this error:
QString::contains: invalid QRegularExpression object
QString::contains: invalid QRegularExpression object
QString::contains: invalid QRegularExpression object
QString::contains: invalid QRegularExpression object
Obviously the error says it's not a valid regex. Can someone educate me on how to solve this issue? Thank you!
In general, when you need to make your shorthand character classes and word boundaries Unicode-aware, you need to pass the QRegularExpression.UseUnicodePropertiesOption option to the regex compiler. See the QRegularExpression.UseUnicodePropertiesOption reference:
The meaning of the \w, \d, etc., character classes, as well as the meaning of their counterparts (\W, \D, etc.), is changed from matching ASCII characters only to matching any character with the corresponding Unicode property. For instance, \d is changed to match any character with the Unicode Nd (decimal digit) property; \w to match any character with either the Unicode L (letter) or N (digit) property, plus underscore, and so on. This option corresponds to the /u modifier in Perl regular expressions.
In Python, you could declare it as
rx = QtCore.QRegularExpression(r'\bcat\b', QtCore.QRegularExpression.UseUnicodePropertiesOption)
However, since QListWidget.findItems does not accept a QRegularExpression as an argument and only takes the regex as a string, you can use the (*UCP) PCRE verb as an alternative:
r'(*UCP)\bcat\b'
Make sure you define it at the regex beginning.
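The same ASCII-versus-Unicode behaviour of \b can be reproduced outside Qt with Python's own re module. This is only an analogy, not Qt's regex engine: re is Unicode-aware by default for str patterns, and the re.ASCII flag restricts \w and \b to ASCII, which mimics what findItems does without (*UCP).

```python
import re

# The list entries from the question, as they appear in the widget.
items = ["0 love my cat", "1 catirization", "2 cat in the clouds", "3 catść"]
pattern = r'\bcat\b'

# ASCII-only word characters: 'ś' is not a word character, so a boundary
# exists after "cat" and the last item matches.
ascii_hits = [t for t in items if re.search(pattern, t, flags=re.ASCII)]

# Unicode-aware word characters: 'ś' is a letter, so there is no boundary
# after "cat" and the last item is skipped.
unicode_hits = [t for t in items if re.search(pattern, t)]

print(ascii_hits)
print(unicode_hits)
```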

Substitute string by index without using regular expressions

It should be very easy, but I am looking for an efficient way to perform it.
I know that I could split the string into two parts and insert the new value, but I have tried to substitute each line between the indexes 22-26 as follows:
line.replace(line[22:26],new_value)
The Problem
However, that function substitutes everything in the line that is similar to the pattern in line[22:26].
In the example below, I want to replace the marked number 1 with number 17:
Here are the results. Note the replacement of 1 with 17 in several places:
I don't understand this behavior of the replace method. Is there a simple explanation of what I'm doing wrong?
Why I don't want RE
The values between index 22-26 are not unified in form.
Note: I am using python 3.5 on Unix/Linux machines.
str.replace replaces 1 sub-string pattern with another everywhere in the string.
e.g.
'ab cd ab ab'.replace('ab', 'xy')
# produces output 'xy cd xy xy'
similarly,
mystr = 'ab cd ab ab'
mystr.replace(mystr[0:2], 'xy')
# also produces output 'xy cd xy xy'
What you could do instead, to replace just the characters in positions 22-26, is:
line = line[0:22] + new_value + line[26:]
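The difference can be seen side by side in a small sketch. The helper name `replace_span` and the sample line are hypothetical and chosen only so that the same text occurs several times.

```python
def replace_span(line, start, end, new_value):
    """Return line with the characters in [start, end) replaced by new_value."""
    return line[:start] + new_value + line[end:]

# Hypothetical line in which "1" appears at indexes 5, 12 and 19.
line = "aaaa 1 bbbb 1 cccc 1"

by_index = replace_span(line, 12, 13, "17")    # touches only index 12
by_content = line.replace(line[12:13], "17")   # touches every "1"

print(by_index)
print(by_content)
```

If the column widths must be preserved, the new value has to occupy the same number of characters as the slice it replaces, for example by padding it with new_value.rjust(end - start).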
Also, looking at your data, it seems to me to be a fixed-width text file. While my suggestion will work, a more robust way to process this data would be to read it & separate the different fields in the record first, before processing the data.
If you have access to the pandas library, it provides a useful function just for reading fixed-width files

string.gmatch to find a string included between two inequality signs

I'm using Lua and have already searched Google without success: I can't find a way to get the string between inequality signs (< >). Other brackets are easy to match, but not these. Is it possible?
Target: How to grab "name" from string between inequality signs?
String: < name >: Message
If name does not contain >, then <(.-)> works.
You can use the (%b<>) pattern to capture matching <>. Then using that value, you can simply use string.sub to cut off the first and last char:
name,message=('< name<> > : Foo Bar!'):match('(%b<>)%s*:%s*(.*)')
name=name:sub(2,-2)
print(name,'sent message :',message)
As you can see, this also takes care of strings containing other, embedded <> signs.
