How to define a regex to find groups of multi-line patterns and use alternate to match the longest match

I am having a hard time with this regex use case. I have a file in which the same multi-line pattern occurs several times, and I am trying to define a regex that finds each occurrence.
The file contains:
HR1111
B2222
C3333
D4444
HR1111
B2222
C3333
D4444
D4544
sdaf
HR1111
B2222
C3333
D4444
HR1111
B2222
C3333
SD444
I have tried this:
(?sm)^HR(.*?\n)(?=D4|SD)(.*?\n)
gives:
HR1111
B2222
C3333
D4444
HR1111
B2222
C3333
D4444
D4544
sdaf
HR1111
B2222
C3333
D4444
HR1111
B2222
C3333
SD444
This is almost what I want. It matches each group of lines that starts with an HR* line (HR1111 here). The only issue is where the group ends: I need to match through D4544 if it is present, or through D4444 if D4544 is not present. I am not sure how to use alternation so that it prefers the longer match instead of stopping at the first alternative that matches when reading left to right. Thank you.
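One possible way to get the longest group, sketched in Python under an assumption the question does not confirm: a group starts at an HR line, continues over lines that are neither HR, D4 nor SD lines, and ends with every consecutive D4 or SD line. The sample text is a shortened copy of the file above, and the name GROUP is only illustrative.

```python
import re

# Assumption: a group is one HR line, then body lines, then one or more
# consecutive closing lines that start with D4 or SD.
GROUP = re.compile(
    r"^HR.*\n"                      # header line
    r"(?:(?!HR|D4|SD).*\n)*"        # body lines that are not HR/D4/SD
    r"(?:(?:D4|SD).*(?:\n|\Z))+",   # all consecutive closing lines
    re.MULTILINE,
)

text = """HR1111
B2222
C3333
D4444
HR1111
B2222
C3333
D4444
D4544
sdaf
HR1111
B2222
C3333
SD444
"""

groups = GROUP.findall(text)
last_lines = [g.splitlines()[-1] for g in groups]
print(last_lines)  # ['D4444', 'D4544', 'SD444']
```

Repeating the closing-line part with + is what makes the match run on to D4544 when it is there, instead of relying on the order of the alternatives.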

Related

Regex Optional Lookahead with non-greedy

I am currently trying to create a regex that is able to parse the following lines of logs:
[210616|13:46:32.738|00017] --> CONFD_OK
[210616|13:46:32.738|00017] TRACE CDB_EXISTS /managed-element/fault-management/active-alarm/active-alarm-entries{oru oran-vendor-specific-alarm ORU[1]-ORU[1]/carrier0/antenna34/1004}[210616|13:46:32.738|00017] --> CONFD_OK
[210616|13:46:32.738|00017] TRACE CDB_END_SESSION [210616|13:46:32.738|00017] --> CONFD_OK
[210616|13:46:38.361|00270] TRACE Established new CDB session to ConfD
The regex should select each initial [time] block followed by its description.
Matches:
[210616|13:46:32.738|00017] --> CONFD_OK
[210616|13:46:32.738|00017] TRACE CDB_EXISTS /managed-element/fault-management/active-alarm/active-alarm-entries{oru oran-vendor-specific-alarm ORU[1]-ORU[1]/carrier0/antenna34/1004}
[210616|13:46:32.738|00017] --> CONFD_OK
[210616|13:46:32.738|00017] TRACE CDB_END_SESSION [210616|13:46:32.738|00017] --> CONFD_OK
[210616|13:46:38.361|00270] TRACE Established new CDB session to ConfD
I started out with \[[^\[\|]*\|[^\]]*\].* to select the initial part, but I am having trouble handling the variety in the logs:
There can be two logs per line
There can be a [#] in the log
I tried a non-greedy quantifier followed by a positive lookahead to account for these issues, but now I can only select the first item, and only if there are two items in a row.
https://regexr.com/611kh
\[[^\[\|]*\|[^\]]*\](.*?)(?=\[[^\[\|]*\|[^\]]*\])
I think ideally it would be my initial sequence followed by a non-greedy search followed by either a positive lookahead of my initial condition or an end of line.
For context, I am working on this for an Angular electron app in an Angular component.
Any suggestions would be greatly appreciated.

In the pattern \[[^\[\|]*\|[^\]]*\].* the .* at the end matches the rest of the line, including any second log entry on it.
In the pattern \[[^\[\|]*\|[^\]]*\](.*?)(?=\[[^\[\|]*\|[^\]]*\]) you match the start of the log with the square brackets and then capture as few characters as possible until the positive lookahead at the end is true.
If no other log start follows on the line, the assertion never becomes true and the match fails, which is why the last entry on a line is not selected.
What you could do is add an alternation | to the lookahead, so that .*? matches as few characters as possible until it encounters either another log start or the end of the string (or the end of the line, with the multiline flag).
\[[^\[\|]*\|[^\]]*\](.*?)(?=\[[^\[\|]*\|[^\]]*\]|$)
If you want the 2 different parts, you can use 2 capture groups as well.
(\[[^\[\|]*\|[^\]]*\])\s*(.*?)(?=\[[^\[\|]*\|[^\]]*\]|$)
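The question targets JavaScript, but the same two-group pattern can be tried quickly in Python. This is only a sketch: the names LOG_ENTRY, sample and entries are illustrative, and re.MULTILINE is assumed so that $ matches at the end of each line.

```python
import re

# Same pattern as above: header group, lazy description group, and a
# lookahead for either the next header or the end of the line.
LOG_ENTRY = re.compile(
    r"(\[[^\[\|]*\|[^\]]*\])\s*(.*?)(?=\[[^\[\|]*\|[^\]]*\]|$)",
    re.MULTILINE,
)

sample = (
    "[210616|13:46:32.738|00017] --> CONFD_OK\n"
    "[210616|13:46:32.738|00017] TRACE CDB_END_SESSION "
    "[210616|13:46:32.738|00017] --> CONFD_OK\n"
    "[210616|13:46:38.361|00270] TRACE Established new CDB session to ConfD"
)

# One (header, description) pair per log entry, even with two entries per line.
entries = [(m.group(1), m.group(2).strip()) for m in LOG_ENTRY.finditer(sample)]
for header, description in entries:
    print(header, "=>", description)
```

The second line of the sample yields two entries, because the lazy group stops as soon as the lookahead sees the next header.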

Instead of trying to write one regex to rule them all, I would go for a simpler approach: string splitting.
First, split on a pattern that begins a new log entry (/(?=\[\d+\|)/ works just fine), then slice each entry at the first ] and split the bracketed part once more at |:
var fileContents = `[210616|13:46:32.738|00017] --> CONFD_OK
[210616|13:46:32.738|00017] TRACE CDB_EXISTS /managed-element/fault-management/active-alarm/active-alarm-entries{oru oran-vendor-specific-alarm ORU[1]-ORU[1]/carrier0/antenna34/1004}
[210616|13:46:32.738|00017] --> CONFD_OK
[210616|13:46:32.738|00017] TRACE CDB_END_SESSION [210616|13:46:32.738|00017] --> CONFD_OK
[210616|13:46:38.361|00270] TRACE Established new CDB session to ConfD`
var lines = fileContents.split(/(?=\[\d+\|)/).map(line => {
  // the first "]" closes the [date|time|id] header
  var pos = line.indexOf(']');
  return line.slice(1, pos).split('|').concat(line.slice(pos + 1).trim());
});
console.log(lines);
gives
[
["210616", "13:46:32.738", "00017", "--> CONFD_OK"],
["210616", "13:46:32.738", "00017", "TRACE CDB_EXISTS /managed-element/fault-management/active-alarm/active-alarm-entries{oru oran-vendor-specific-alarm ORU[1]-ORU[1]/carrier0/antenna34/1004}"],
["210616", "13:46:32.738", "00017", "--> CONFD_OK"],
["210616", "13:46:32.738", "00017", "TRACE CDB_END_SESSION"],
["210616", "13:46:32.738", "00017", "--> CONFD_OK"],
["210616", "13:46:38.361", "00270", "TRACE Established new CDB session to ConfD"]
]
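For comparison, the same split-then-slice idea might look like this in Python. It is a sketch, not a drop-in replacement: parse_log is a hypothetical helper name, and splitting on a zero-width lookahead assumes Python 3.7 or later. Unlike the JavaScript split, Python returns an empty first chunk when the text starts with a match, so empty chunks are skipped.

```python
import re

def parse_log(text):
    """Split text into log entries and return [date, time, id, message] lists."""
    entries = []
    # Zero-width split: each chunk begins at a "[digits|" header.
    for chunk in re.split(r"(?=\[\d+\|)", text):
        chunk = chunk.strip()
        if not chunk:
            continue  # leading empty chunk produced by the split
        # Drop the opening "[" and cut at the first "]", which closes the header.
        header, _, message = chunk[1:].partition("]")
        entries.append(header.split("|") + [message.strip()])
    return entries

line = ("[210616|13:46:32.738|00017] TRACE CDB_END_SESSION "
        "[210616|13:46:32.738|00017] --> CONFD_OK")
print(parse_log(line))
```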

Groovy: replaceLast() is missing

I need a replaceLast() method in a Groovy script that replaces the last matching substring. It is available in Java, but not in Groovy AFAIK. It must work with a regex in the same way as the following replaceFirst.
replaceFirst(CharSequence self, Pattern pattern, CharSequence replacement)
Replaces the first substring of a CharSequence that matches the given compiled regular expression with the given replacement.
EDIT: Sorry for not being specific enough. The original string is an XML file in which the same key (e.g. Name) is present many times. I want to replace the last one.
<Header>
<TransactionId>1</TransactionId>
<SessionId>1</SessionId>
<User>
<Name>Bob</Name>
...
</User>
<Sender>
<Name>Joe</Name>
...
</Sender>
</Header>
...
<Context>
<Name>Rose</Name>
...
</Context>

No idea what replaceLast in Java is... it's not in the JDK. If it were in the JDK, you could use it in Groovy.
Anyway, how about using an XML parser to change your XML instead of using a regular expression?
Given some xml:
def xml = '''<Header>
<TransactionId>1</TransactionId>
<SessionId>1</SessionId>
<User>
<Name>Bob</Name>
</User>
<Sender>
<Name>Joe</Name>
</Sender>
<Something>
<Name>Tim</Name>
</Something>
</Header>'''
You can parse it using Groovy's XmlParser:
import groovy.xml.*
def parsed = new XmlParser().parseText(xml)
Then, you can do a depth-first search for all nodes with the name Name, and take the last one (index -1):
def lastNameNode = parsed.'**'.findAll { it.name() == 'Name' }[-1]
Then, set the value to a new string:
lastNameNode.value = 'Yates'
And print the new XML:
println XmlUtil.serialize(parsed)
<?xml version="1.0" encoding="UTF-8"?><Header>
<TransactionId>1</TransactionId>
<SessionId>1</SessionId>
<User>
<Name>Bob</Name>
</User>
<Sender>
<Name>Joe</Name>
</Sender>
<Something>
<Name>Yates</Name>
</Something>
</Header>
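The same parse, find-last, set-value steps can be sketched outside Groovy too. The Python version below uses the standard library's xml.etree.ElementTree instead of XmlParser, so it illustrates the idea rather than the Groovy API; the variable names are illustrative and the XML is a trimmed copy of the sample above.

```python
import xml.etree.ElementTree as ET

xml_text = """<Header>
<User>
<Name>Bob</Name>
</User>
<Sender>
<Name>Joe</Name>
</Sender>
<Something>
<Name>Tim</Name>
</Something>
</Header>"""

root = ET.fromstring(xml_text)

# iter() walks the tree in document order, so the last element found is the
# last Name node in the file.
name_nodes = list(root.iter("Name"))
name_nodes[-1].text = "Yates"

updated = ET.tostring(root, encoding="unicode")
print(updated)
```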

Check if string contains substring in VHS 2.4

How can I check if a string contains a specific substring?
E.g. I have an array of objects called {category.files} and output each object name.
<f:for each="{category.files}" as="file">
<p>{file.name}</p>
</f:for>
Now I want to check whether the name contains a specific substring and, if so, output it. In my case I am searching for the substring fire.
I found the VHS contains ViewHelper, but I don't know how to use it, since there is no real example included.
How does it work in my case?
<f:for each="{category.files}" as="file">
??? ??? ??
<v:if.string.contains then="[mixed]" else="[mixed]" haystack="{file}" needle="fire">
<!--output it if found -->
<!-- else, do nothing -->
</v:if.string.contains>
</f:for>
In other words, I am trying to find fire in the haystack.

haystack = string to search in
needle = string to find
If {file.name} may contain the needle, you should do something like this:
<f:for each="{category.files}" as="file">
<v:condition.string.contains haystack="{file.name}" needle="fire">
<f:then>
Needle found
</f:then>
<f:else>
Needle not found
</f:else>
</v:condition.string.contains>
</f:for>
Edited: As mentioned in the comments, the ViewHelper asked about, v:if.string.contains, was renamed to v:condition.string.contains in the version that was used.

How to change particular value from paragraph if value is repeated

I am working on a SUSE machine.
This is one paragraph from my file:
<object class="SaImmMngt">
<dn>safRdn=immManagement,safApp=safImmService</dn>
<attr>
<name>saImmRepositoryInit</name>
<value>2</value>
</attr>
</attr>
</object>
<object class="SaLogStreamConfig">
<dn>safLgStrCfg=saLogAlarm,safApp=safLogService</dn>
where "safRdn=immManagement,safApp=safImmService" is unique. I have to change the value 2 to 1, but these lines are repeated everywhere in the file except for the unique line. I don't want to change it by line number.
I am trying this:
`sed -n '/safRdn=immManagement,safApp=safImmService/,/safLgStrCfg=saLogAlarm,safApp=safLogService/p' /etc/opensaf/imm.xml | sed -i '0,/1/s//2/' /etc/opensaf/imm.xml`
But this command changes the value everywhere, so how can I catch that particular string and change the value in the file?
Note: the string "safRdn=immManagement,safApp=safImmService" is unique in the file, while the other lines are repeated everywhere in it. I don't want to create any extra file; the changes should be made in the file itself.

sed '/<object class="SaImmMngt">/,\#</object># {
\#<dn>safRdn=immManagement,safApp=safImmService</dn>#,\#</object># {
\#<name>saImmRepositoryInit</name>#,\#</attr># {
\#<value># s/2/1/
}
}
}' YourFile
This changes the value only inside the subtree you select: the nested address ranges narrow the selection step by step, so the substitution applies only to the section that matches your criteria.
It could be shorter for a specific known file; this version is generic (reusable for other criteria).
Be careful with the pattern delimiter (mainly # here): it cannot be part of the pattern unless you escape it, just like the default /.

This will change the value from 2 to 1 but only if it is within three lines after your unique string:
$ sed '/safRdn=immManagement,safApp=safImmService/,+3 s/<value>2/<value>1/' file
<object class="SaImmMngt">
<dn>safRdn=immManagement,safApp=safImmService</dn>
<attr>
<name>saImmRepositoryInit</name>
<value>1</value>
</attr>
</attr>
</object>
<object class="SaLogStreamConfig">
<dn>safLgStrCfg=saLogAlarm,safApp=safLogService</dn>
To change the file in place, use the -i option.
sed -i '/safRdn=immManagement,safApp=safImmService/,+3 s/<value>2/<value>1/' file
How it works
/safRdn=immManagement,safApp=safImmService/,+3
This sets a range that starts with the line containing your unique string and continues through the first three lines which follow.
If, in general, three lines are not enough, you can increase the number to whatever you need.
s/<value>2/<value>1/
This command is only applied to lines within the range. For any line within the range, the string <value>2 is changed to <value>1.
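To make the range logic concrete, here is a rough Python sketch of the same idea. It is not what sed does internally: replace_after_marker and its parameters are hypothetical names, and it replaces the first occurrence per line, like s/// without the g flag.

```python
def replace_after_marker(lines, marker, old, new, extra_lines=3):
    """Replace old with new on the marker line and the extra_lines after it."""
    result = []
    remaining = 0  # following lines still inside the current range
    for line in lines:
        if remaining > 0:
            in_range = True
            remaining -= 1
        elif marker in line:
            in_range = True          # the marker line opens a new range
            remaining = extra_lines
        else:
            in_range = False
        result.append(line.replace(old, new, 1) if in_range else line)
    return result

sample = [
    "<value>2</value>",
    "<dn>safRdn=immManagement,safApp=safImmService</dn>",
    "<attr>",
    "<name>saImmRepositoryInit</name>",
    "<value>2</value>",
    "</attr>",
    "<value>2</value>",
]
changed = replace_after_marker(
    sample, "safRdn=immManagement,safApp=safImmService", "<value>2", "<value>1"
)
print("\n".join(changed))
```

Only the value three lines below the unique string changes; the identical lines before the marker and after the range are left alone.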

how to replace string with result of function in Vim?

I want to insert filename and line number into some places in the file. For example this line:
_debug('init');
I want to replace
:s/debug('/debug('(%current_filename_here%:%current_line_number_here%)\ /g
to get this
_debug('(filename.ext:88) init');
I tried to use expand('%:t') to get the filename and line(".") to get the line number, but I don't know how to use them in the replacement expression.
How can I do this?

You can use \=. For example:
:s#_debug('\zs#\=printf('(%s:%d) ', expand('%:t'), line('.'))#
When the {replacement} starts with "\=" it is evaluated as an expression.
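Outside Vim, the closest analogue to an expression replacement is a callable passed as the replacement. The Python sketch below applies the same transformation to a whole text rather than to the current line only; tag_debug_calls is a hypothetical helper, and the filename is passed in by the caller instead of coming from expand('%:t').

```python
import re

def tag_debug_calls(text, filename):
    """Insert '(filename:lineno) ' after the first _debug(' on every line."""
    tagged = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # The replacement is computed per match, like Vim's \= expression.
        tagged.append(
            re.sub(
                r"_debug\('",
                lambda m: f"{m.group(0)}({filename}:{lineno}) ",
                line,
                count=1,
            )
        )
    return "\n".join(tagged)

source = "x = 1;\n_debug('init');"
print(tag_debug_calls(source, "filename.ext"))
```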
