Movable Type: Create index page with MTTags - movabletype

I want to make an index page (like the one at the beginning of a dictionary) with MTTags from a blog using MT5.1. There might be some jQuery solutions, but I would love to accomplish this with Movable Type tags. Here is what I have so far.
<ul>
<mt:Tags sort_by="Name">
<li><mt:TagName></li>
</mt:Tags>
</ul>
I would like the result to look like this:
A
- Apple
- apricot
B
- bee
C
- Cake
- Cinnamon
D
- Dog
- Dragon

First we need to isolate the first character:
<$mt:TagName regex_replace="/(?<=.).*$/","" $>
(that is a zero-width positive look-behind assertion: the pattern matches everything after the first character, and the replacement removes it). We also want the character as a capital letter, saved to a variable:
<$mt:TagName regex_replace="/(?<=.).*$/","" upper_case="1" setvar="current_index" $>
Now we only need to compare it to the last index, to see if we need to output the index header:
<mt:Tags sort_by="Name">
<$mt:TagName regex_replace="/(?<=.).*$/","" upper_case="1" setvar="current_index" $>
<mt:unless name="last_index">
# this is the first time
<mt:else name="current_index" ne="$last_index">
# need to output the new index
</mt:unless>
<mt:var name="current_index" setvar="last_index">
<li><mt:TagName></li>
</mt:Tags>
<mt:if name="last_index">
# close the list
</mt:if>
The html tags are left to the reader. :-)
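The same compare-with-the-previous-index logic can be tried outside Movable Type. The following is only a sketch in Python, not MT template code; the function name render_index and the sample tag list are made up for illustration, and it assumes every tag name is non-empty.

```python
def render_index(tags):
    """Return the index as a list of lines: a header line for each new
    first letter, followed by one "- name" line per tag."""
    lines = []
    last_index = None  # plays the role of the last_index template variable
    # Sort case-insensitively so that "apricot" follows "Apple".
    for name in sorted(tags, key=str.lower):
        current_index = name[0].upper()  # first character, upper-cased
        if current_index != last_index:
            # First tag overall, or first tag of a new letter: emit a header.
            lines.append(current_index)
            last_index = current_index
        lines.append(f"- {name}")
    return lines


sample_tags = ["Dog", "bee", "Apple", "Cake", "apricot", "Dragon", "Cinnamon"]
print("\n".join(render_index(sample_tags)))
```

Here the "first time" case and the "new index" case collapse into one comparison because last_index starts out as None, which never equals a letter.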

Related

Using REGEX to grab the information after the match

I ran a PDF through a series of processes to extract the text from it. I was successful in that regard. However, now I want to extract specific text from documents.
The document appears to be a multi-line string (when I paste it into Word, a paragraph mark appears at the end of each line):
Send Unit: COMPLETE
NOA Selection: 20-0429.07
#for some reason, in this editor, despite the next line having > infront of it, the following line (Pni/Trk) keeps wrapping up to the line above. This doesn't exist in the actual doc.
Pni/Trk: 3 Panel / 3 Track
Panel Stack: STD
Width: 142.0000
The information I want to extract is the numbers following "NOA Selection:".
I know I can do a regex something to the effect of:
pattern = re.compile(r'NOA\sSelection:\s\d*-\d*\.\d*')
but I only want the numbers after the NOA selection, especially because NOA Selection will always be the same but the format of the numbers/letters/./-/etc. can vary pretty wildly. This looked promising but it is in Java and I haven't had much luck recreating it in Python.
I think I need to use (?<=...), but haven't been able to implement it.
Also, several of the examples show the string stored in the python file as a variable, but I'm trying to access it from a .txt file, so I might be going wrong there. This is what I have so far.
with open('export1.txt', 'r') as d:
    contents = d.read()
p = re.compile('(?<=NOA)')
s = re.search(p, contents)
print(s.group())
Thank you for any help you can provide.
With your shown samples, you could try the following. For the sample value 20-0429.07, I have made the .07 part optional in the regex, so values such as 20-0429 will match as well.
import re
val = """Send Unit: COMPLETE
NOA Selection: 20-0429.07"""
matches = re.findall(r'NOA\s+Selection:\s+(\d+-\d+(?:\.\d+)?)', val)
print(matches)
['20-0429.07']
Explanation of the regex (for explanation purposes only):
NOA\s+Selection:\s+ ## matches NOA, one or more whitespace characters, Selection:, then one or more whitespace characters
(\d+-\d+(?:\.\d+)?) ## capturing group: one or more digits, a hyphen, one or more digits,
## then an optional non-capturing group matching a dot followed by digits
Keeping it simple, you could use re.findall here:
inp = """Send Unit: COMPLETE
NOA Selection: 20-0429.07"""
matches = re.findall(r'\bNOA Selection: (\S+)', inp)
print(matches) # ['20-0429.07']
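The (?<=...) approach mentioned in the question can also work, as long as the look-behind contains the whole fixed label. A minimal sketch, assuming the label is always exactly "NOA Selection: " with a single space after the colon (Python look-behinds must be fixed-width) and that the value contains no whitespace; the sample text stands in for the contents of export1.txt.

```python
import re

# Sample text from the question; in practice this would be the result of
# open('export1.txt').read().
contents = """Send Unit: COMPLETE
NOA Selection: 20-0429.07
Pni/Trk: 3 Panel / 3 Track
Panel Stack: STD
Width: 142.0000"""

# The look-behind asserts that the label precedes the match without
# consuming it, so group() returns only the value. \S+ stops at the
# first whitespace character.
pattern = re.compile(r'(?<=NOA Selection: )\S+')

match = pattern.search(contents)
noa_selection = match.group() if match else None
print(noa_selection)
```

The original attempt, '(?<=NOA)', has nothing after the look-behind, so it matches an empty string; the \S+ after the assertion is what captures the value.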

Netsuite Custom Field with REGEXP_REPLACE to strip HTML code except carriage return

I have a custom field with some HTML code in it:
<h1>A H1 Heading</h1>
<h2>A H2 Heading</h2>
<b>Rich Text</b><br>
fsdfafsdaf df fsda f asdfa f asdfsa fa sfd<br>
<ol><li>numbered list</li><li>fgdsfsd f sa</li></ol>Another List<br>
<ul><li>bulleted</li></ul>
I also have another non-stored field where I want to display the plain text version of the above using REGEXP_REPLACE, while preserving the carriage returns/line breaks, maybe even converting <br> and <br/> to \r\n
However the patterns etc... seem to be different in NetSuite fields compared to using ?replace(...) in freemarker... and I'm terrible with remembering regexp patterns :)
Assuming the HTML text is stored in custitem_htmltext, what expression could I use as the default value of the NetSuite Text Area custom field to display the HTML code above as:
A H1 Heading
A H2 Heading
Rich Text
fsdfafsdaf df fsda f asdfa f asdfsa fa sfd
etc...
I understand the bulleted or numbered lists will look crap.
My current non-working formula is:
REGEXP_REPLACE({custitem_htmltext},'<[^<>]*>','')
I've also tried:
REGEXP_REPLACE({custitem_htmltext},'<[^>]+>','') - didn't work
When you use a Text Area type of custom field and input HTML, NetSuite seems to change the control characters ('<' and '>') to HTML entities ('&lt;' and '&gt;'). You can see this if you input the HTML and then change the field type to Long Text.
If you change both fields to Long Text, and re-input the data and formula, the REGEXP_REPLACE() should work as expected.
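Why entity-encoded text defeats the tag pattern can be shown outside NetSuite. This is a sketch in Python rather than a NetSuite formula; it assumes the field content is entity-encoded as described above, and strip_tags is a made-up helper name.

```python
import html
import re

TAG_PATTERN = re.compile(r'<[^<>]*>')


def strip_tags(text):
    """Convert <br> and <br/> to line breaks, then remove all other tags."""
    text = re.sub(r'<br\s*/?>', '\r\n', text, flags=re.IGNORECASE)
    return TAG_PATTERN.sub('', text)


raw = '<h1>A H1 Heading</h1><b>Rich Text</b><br>fsdfafsdaf df'
encoded = html.escape(raw)  # what an entity-encoding field would hold

# No literal '<' or '>' survives encoding, so the pattern finds nothing.
untouched = strip_tags(encoded)

# Once the entities are decoded, the same pattern works as intended.
cleaned = strip_tags(html.unescape(encoded))

print(untouched == encoded)
print(repr(cleaned))
```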
From what I have learned recently, NetSuite HTML-encodes data by default, so < becomes &lt; and > becomes &gt;.
Try using triple handlebars e.g. {{{custitem_htmltext}}}
https://docs.celigo.com/hc/en-us/articles/360038856752-Handlebars-syntax
This should stop the default behaviour and allow you to use in a formula/saved search.

How to get ordered, defined or all columns except or after or before a given column

In BASH
I run the following one-liners to get an individual column/field after splitting on a given character (AWK works as well, for example to split on more than one character or on a word).
# This will give me the first column, 'lori', i.e. the first field after splitting the string on the character '-'
echo "lori-chuck-shenzi" | cut -d'-' -f1
# This will give me 'chuck'
echo "lori-chuck-shenzi" | cut -d'-' -f2
# This will give me 'shenzi'
echo "lori-chuck-shenzi" | cut -d'-' -f3
# This will give me 'chuck-shenzi' i.e. all columns after 2nd and onwards.
echo "lori-chuck-shenzi" | cut -d'-' -f2-
Notice the last command above. How can I do the same thing as that last cut command in Groovy?
For ex: if the contents are in a file and they look like:
1 - a
2 - b
3 - c
4 - d
5 - e
6 - lori-chuck shenzi
7 - columnValue1-columnValue2-columnValue3-ColumnValue4
I tried the following Groovy code, but it doesn't give me lori-chuck shenzi. After ignoring the bullet number 6 and the first occurrence of -, I want the output to be lori-chuck shenzi, but the script returns just lori. (I know that is the correct result for index [1], which is what the code below uses.)
def file = "/path/to/my/file.txt"
File textfile= new File(file)
//now read each line from the file (using the file handle we created above)
textfile.eachLine { line ->
//list.add(line.split('-')[1])
println "Bullet entry full value is: " + line.split('-')[1]
}
// return list
Also, for the last line in the file above, is there an easy way in Groovy to change the order of the columns after they are split, e.g. reverse them or slice them as we do in Python with [1:], [:1], [:-1] and so on?
I don't like this solution, but I did this to get it working. After taking the index range [1..-1] (i.e. from the 1st index onward, excluding the 0th index, which is the left-hand side of the first occurrence of the - character), I had to get rid of the list's [ and ] by using join(',') and then replace every , with a - to get the result I was looking for.
list.add(line.split('-')[1..-1].join(',').replaceAll(',','-'))
I would still like to know what's a better solution and how can this work when we talk about cherry picking individual columns + in a given order (instead of me writing various Groovy statements to pick individual elements from the string/list per statement).
If I'm understanding your question correctly, what you want is:
line.split('-')[1..-1]
This will give you from position 1 to the last. You can do -2 (next to last) and so on, but just be aware that you can get an ArrayIndexOutOfBoundsException moving backwards too, if you go past the beginning of your array!
-- Original answer is above this line --
Adding to my answer, since comments don't allow code formatting. If all you want is to pick specific columns, and you want a string in the end, you could do something like:
def resultList = line.split('-')
def resultString = "${resultList[1]}-${resultList[2]} ${resultList[3]}"
and pick whatever columns you want that way. I thought you were looking for a more generic solution, but if not, specific columns are easy!
If you want the first value, a dash, then the rest joined by spaces, just use:
"${resultList[1]}-${resultList[2..-1].join(" ")}"
I don't know how to give you specific answers for every combination you might want, but basically once you have your values in a list, you can manipulate that however you want, and turn the results back into a string with GStrings or with .join(...).
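Since the question compares this to Python slicing, here is the same idea as a short Python sketch: split once, then slice or reorder the list and join it back. The helper names fields_from and pick_fields are invented for this example, and unlike cut they do not pass through lines that lack the delimiter.

```python
def fields_from(line, start, sep='-'):
    """Rough equivalent of `cut -d<sep> -f<start>-` (start is 1-based)."""
    parts = line.split(sep)
    return sep.join(parts[start - 1:])


def pick_fields(line, order, sep='-'):
    """Return the 1-based fields listed in `order`, rejoined with sep."""
    parts = line.split(sep)
    return sep.join(parts[i - 1] for i in order)


# Everything from the 2nd field onward.
print(fields_from("lori-chuck-shenzi", 2))
# Same for a bullet line from the file; strip() drops the space after " - ".
print(fields_from("6 - lori-chuck shenzi", 2).strip())
# Cherry-pick fields in any order.
print(pick_fields("lori-chuck-shenzi", [3, 1, 2]))
```

Joining with the original separator directly avoids the join(',') followed by replaceAll(',', '-') round trip, which would also corrupt any field that itself contains a comma.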

What is different about these two pairs of strings that makes this sed script work with one and not the other?

This question is related to this other question I asked earlier today:
Find and replace text with all-inclusive wild card
I have a text file like this
I want= to keep this
This is some <text> I want to keep <and "something" in tags that I" want to keep> aff FOO1 WebServices and some more "text" that" should "</be> </deleted>
<this is stuff in tags I want=to begone> and other text I want gone too. </this is stuff in tags I want to begone>
A novice programmer walked into a "BAR2" descript keepthis
and this even more text, let's keep it
<I actually want this>
and this= too.
when I use sed -f script.sed file.txt to run this script:
# Check for "aff"
/\baff\b/ {
# Define a label "a"
:a
# If the line does not contain "desc"
/\bdesc\b/!{
# Get the next line of input and append
# it to the pattern buffer
N
# Branch back to label "a"
ba
}
# Replace everything between aff and desc
s/\(\baff\)\b.*\b\(desc\b\)/\1TEST DATA\2/
}
I get this as my output:
I want= to keep this
This is some <text> I want to keep <and "something" in tags that I" want to keep> aff FOO1 WebServices and some more "text" that" should "</be> </deleted>
<this is stuff in tags I want=to begone> and other text I want gone too. </this is stuff in tags I want to begone>
A novice programmer walked into a "BAR2" descript keepthis
and this even more text, let's keep it
<I actually want this>
and this= too.
However, by simply changing the search strings from aff and desc to FOO1 and BAR2:
# Check for "FOO1"
/\bFOO1\b/ {
# Define a label "a"
:a
# If the line does not contain "BAR2"
/\bBAR2\b/!{
# Get the next line of input and append
# it to the pattern buffer
N
# Branch back to label "a"
ba
}
# Replace everything between FOO1 and BAR2
s/\(\bFOO1\)\b.*\b\(BAR2\b\)/\1TEST DATA\2/
}
gives the expected output:
I want= to keep this
This is some <text> I want to keep <and "something" in tags that I" want to keep> aff FOO1TEST DATABAR2" descript keepthis
and this even more text, let's keep it
<I actually want this>
and this= too.
I am completely stumped about what is going on here. Why should searching between FOO1 and BAR2 work differently from the exact same script with aff and desc?
The end marker should be \bdesc instead of \bdesc\b.
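Note the \b in the pattern: it matches a word boundary. Your text above contains the word descript, but not the standalone word desc, so \bdesc\b never matches and the loop keeps appending lines until the input runs out.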
Your previous question made me assume that you want that. If you don't care about word boundaries, remove the \b escape sequences completely.
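The word-boundary behaviour is easy to check in isolation. This sketch uses Python's re module rather than sed; for this plain ASCII input, \b in Python behaves like the \b GNU extension used in the script.

```python
import re

line = 'A novice programmer walked into a "BAR2" descript keepthis'

# "desc" is followed by "r", so there is no word boundary after it.
whole_word = re.search(r'\bdesc\b', line)

# Dropping the trailing \b lets the pattern match the start of "descript".
prefix = re.search(r'\bdesc', line)

# "BAR2" is surrounded by quote characters, so both boundaries exist.
bar2 = re.search(r'\bBAR2\b', line)

print(whole_word)
print(prefix.group())
print(bar2.group())
```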

How to verify ANY text is present with selenium IDE

I know how to verify if a specific text is present in a web page using Selenium IDE. But what I wanted to know is, can you verify that any text is present in an element?
For example, there's a text box with the title "Top Champion". This text box will be changed daily with the name of a person. I just want to check whether there is any text in this text box, no matter what the text actually is. I've tried the verify text command with a blank value, but it doesn't work. If the command could return a true or false result, that would be really helpful.
BTW, verify value doesn't work either since the element that I'm testing is not a form field
Your best bet is as follows (I have written single tests for this for numbers)
Medium rigour:
waitForText | css=.SELECTORS | regex:.+?
This will wait until there is at least 1 character present.
Strong rigour (only works if you have a subset of characters present):
waitForText | css=.SELECTORS | regex:^[0-9]+$
This will wait until there is text consisting only of digits: it must start with a digit, contain at least one digit, and end with a digit, and it permits no character outside the given subset. An example pattern to match numbersNAMEnumbers would be:
waitForText | css=.SELECTORS | regex:^[0-9]+[a-zA-Z]+[0-9]+$
This would wait for a string such as 253432234BobbySmith332
Luke
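To see what the three patterns above accept and reject, they can be tried outside Selenium. This is only a sketch in Python; it assumes Python's re module treats these simple expressions the same way as the pattern engine behind regex:, and the helper name matches is made up.

```python
import re

ANY_TEXT = r'.+?'                                 # at least one character
DIGITS_ONLY = r'^[0-9]+$'                         # digits and nothing else
DIGITS_NAME_DIGITS = r'^[0-9]+[a-zA-Z]+[0-9]+$'   # numbersNAMEnumbers


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


print(matches(ANY_TEXT, ""))                      # empty element text
print(matches(ANY_TEXT, "Bobby Smith"))
print(matches(DIGITS_ONLY, "12345"))
print(matches(DIGITS_ONLY, "123a"))
print(matches(DIGITS_NAME_DIGITS, "253432234BobbySmith332"))
print(matches(DIGITS_NAME_DIGITS, "BobbySmith332"))
```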
If I have understood your question properly, below is one way to check whether an element contains a string. I'm not sure if this is what you are looking for.
List<WebElement> findElement = webElement.findElements(By.xpath("YOUR_TEXTINPUT_PATH_HERE"));
if( findElement.size() > 0 ){
if( findElement.get(0).getText() != null && findElement.get(0).getText().indexOf("THE_STRING_THAT_YOU_WANT_TO SEARCH") != -1 ) {
// IF IT COMES HERE, THAT MEANS THE ELEMENT IS PRESENT WITH THE TEXT
}
}
store text|[your element]|StoredText
execute script|return ${StoredText}.length > 0|x
assert|x|true
Using these three lines in the Selenium IDE, the first line will extract the text from the element into the variable StoredText.
The second line will store whether the length of that text is greater than zero into the variable x (a true or false result).
The third line asserts that the result was true, failing the test if not. You don't need the third line if all you want is the true or false result.
So if the element contains any text, the extracted text length will be greater than zero, the variable x will be true, and the assert will pass. This verifies that any text is present in the element.
