Find and replace text and wrap in "href" - string

I am trying to find specific words in a div (id="test") that start with "a04" (case-insensitive). I can find and replace the words found, but I am unable to correctly use the word found in an "href" link.
The following code correctly identifies my search criteria and works as expected, but I would like help because I do not know how to use the found word as the URL id.
var test = document.getElementById("test").innerHTML
function replacetxt(){
var str_rep = document.getElementById("test").innerHTML.replace(/a04(\w)+/g,'TEST');
var temp = str_rep;
//alert(temp);
document.getElementById("test").innerHTML = temp;
}
I would like to wrap the found word in an href, but I do not know how to use the found word as the URL id (url.com?id=found word).
Can someone help point out how to reference the found word, please?
Thanks

If you want to use your pattern with the capturing group, move the quantifier + inside the group; otherwise the group only holds the value of its last iteration (a single character).
\ba04(\w+)
\b word boundary to prevent the match being part of a longer word
a04 Match literally
(\w+) Capture group 1, match 1+ times a word character
Then you could use the first capturing group in the replacement by referring to it with $1
If the string is a04word, you would capture word in group 1.
Your code might look like:
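function replacetxt(){
  var elm = document.getElementById("test");
  if (elm) {
    // $1 inserts the text captured by group 1 into the link's id parameter
    elm.innerHTML = elm.innerHTML.replace(/\ba04(\w+)/g,'<a href="url.com?id=$1">TEST</a>');
  }
}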
replacetxt();
<div id="test">This is text a04word more text here</div>
Note that you don't have to create extra variables like var temp = str_rep;
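To see the replacement on a plain string, without a DOM, here is a minimal sketch. The helper name linkify is hypothetical, "url.com?id=" is the placeholder URL from the question, and the i flag is an assumption based on the question's "no case" remark. It uses $& (the whole match) as the link text so the visible word stays unchanged.

```javascript
// Hypothetical helper: wraps every word that starts with "a04" in a link.
// "url.com?id=" is the placeholder URL from the question, not a real endpoint.
function linkify(text) {
  // $1 is the text captured by (\w+); $& is the whole matched word.
  // The i flag makes the match case-insensitive (an assumption taken from the question).
  return text.replace(/\ba04(\w+)/gi, '<a href="url.com?id=$1">$&</a>');
}

console.log(linkify("This is text a04word more text here"));
```

If the id should be the whole word rather than the part after a04, $& can be used inside the href as well.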

Separating an HTML Element String into Multiple Strings

I am webscraping using puppeteer and I am trying to extract the innerText of this h4 element.
<h4 class="loss">
(NA)
<br>
<span class="team-name">TEAMNAME</span>
<br>
<span class="win spoiler-wrap">0</span>
</h4>
I am able to get this element using:
const teamName = await matches.$eval('h4', (h4) => h4.innerHTML);
This will set teamName to:
(NA)<br><span class="team-name">TEAMNAME</span><br><span class="win spoiler-wrap">0</span>
I am trying to get only the inner text of each element.
I can get the (NA) using const s = teamName.substr(0, teamName.indexOf('<'));
But I cannot seem to figure out how to get "TEAMNAME" or "0" out of this string. I have thoughts of using regex, but I am not sure how I would accomplish this.
PS the inner text will not always be the same so I can't look for specific words.
With regex, you can do it like this:
teamName.match(/<span class="team-name">(.*?)<\/span>/)[1]
match returns an array whose first element is the match of the whole regex, whose second element is the match of the first regex group, whose third element is the match of the second regex group (there is none in this case), and so on.
The /.../ delimits a regex, which matches at the first position where it can. . in a regex matches any character except a line break. * means any number of occurrences of the preceding character, including 0 occurrences; the trailing ? makes it lazy, so the group stops at the first </span> instead of running on to the last one. (...) is a regex group, which is what match reports. \ is an escape character, needed because / would otherwise end the regex.
I very much recommend reading the Mozilla docs on match and on regexes for details. You will often find them useful.
However, in the case of puppeteer there probably also is a way of directly matching the selector h4 span, which would be more straightforward than using regexes. I don't know enough about puppeteer to tell you the exact way of doing that. :/
With a bit more thinking, I was able to solve my issue.
Here is a solution:
const teamName = await matches.$eval('h4', (h4) => h4.innerHTML);
const openSpanGT = teamName.indexOf('>', 20);
const closeSpanLT = teamName.indexOf('<', openSpanGT);
const teamTitle = teamName.substr(openSpanGT + 1, closeSpanLT - openSpanGT - 1);
console.log(teamTitle);
This will output "TEAMNAME" no matter how long the team name is, provided the first > at or after index 20 is the one that closes the opening span tag.
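If all three pieces of text are needed at once, one option is to split the string on its tags instead of hunting for each index. This is only a sketch: textSegments is a hypothetical helper, and it assumes the markup is as simple as the sample above (no > characters inside attribute values).

```javascript
// Hypothetical helper: split an HTML fragment on its tags and keep the non-empty text pieces.
function textSegments(html) {
  return html
    .split(/<[^>]*>/)             // cut the string at every tag
    .map((s) => s.trim())         // drop whitespace left around the pieces
    .filter((s) => s.length > 0); // discard the empty pieces between adjacent tags
}

const sample = '(NA)<br><span class="team-name">TEAMNAME</span><br><span class="win spoiler-wrap">0</span>';
const [status, team, score] = textSegments(sample);
console.log(status, team, score);
```

For anything more complex than this fragment, querying the elements in the page is likely to be more reliable than string handling.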

Sitecore 7 content search Starts with function

I am working with sitecore 7 content search.
var webIndex = ContentSearchManager.GetIndex("sitecore_web_index");
using (var context = webIndex.CreateSearchContext())
{
    var results = context.GetQueryable<SearchResultItem>().Where(i =>
        i.Content.Contains(mysearchterm));
}
Sitecore performs a Contains operation on the content string. Content holds the whole content of the page, so the query does not return the results I expect: for example, searching for "hr" also returns results with "through" in the content. I tried StartsWith, but that only matches the start of the whole content string. I tried Equals, but that matches the whole word. Is there any way to search content for words that start with the search term?
Put '^' as the first character of the search phrase; it means "starts with". For example, to find all terms starting with "hr", prefix the search keyword with '^', like this: "^hr".

Find index of a specific character in a string then parse the string

I have strings that look like this: [NAME LASTNAME/NAME.LAST#emailaddress/123456678]. I want to parse strings in this format so that I get only NAME LASTNAME. My pseudocode idea is to find the index of the first /, then take the substring from index 1 up to that index. I want this as a VBScript.
Your way should work. You can also Split() your string on / and just grab the first element of the resulting array:
Const SOME_STRING = "John Doe/John.Doe#example.com/12345678"
WScript.Echo Split(SOME_STRING, "/")(0)
Output:
John Doe
Edit, with respect to comments.
If your string contains the [, you can still Split(). Just use Mid() to grab the first element starting at character position 2:
Const SOME_STRING = "[John Doe/John.Doe#example.com/12345678]"
WScript.Echo Mid(Split(SOME_STRING, "/")(0), 2)
Your idea is good. You should also find the index of "[", which makes the script more robust and flexible. The code below always returns the string between the first occurrence of "[" and the first occurrence of "/".
var = "[John Doe/John.Doe#example.com/12345678]"
WScript.Echo Mid(var, (InStr(var,"[")+1),InStr(var,"/")-InStr(var,"[")-1)
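For readers who want to check the index arithmetic outside VBScript, here is the same idea as a JavaScript sketch. The function name extractName is hypothetical, and note that JavaScript indexes from 0 while InStr counts from 1.

```javascript
// Hypothetical helper mirroring the InStr/Mid approach above, with 0-based indexes.
function extractName(s) {
  // indexOf returns -1 when "[" is absent, so start becomes 0 (the beginning of the string).
  const start = s.indexOf("[") + 1;
  const end = s.indexOf("/", start);
  // When there is no "/", return everything from start onward.
  return end === -1 ? s.slice(start) : s.slice(start, end);
}

console.log(extractName("[John Doe/John.Doe#example.com/12345678]"));
```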

How to find and replace a string in Matlab

So here is my problem:
I have a list of names in Matlab in a cell array.
I automatically create directories and .mat files for each name.
My problem is that some of these names contain '/', and so everything goes wrong when I create the directory…
So I am trying to find an efficient way to find each '/' and replace it.
So far I've tried to find them using the findstr function. It gives me a cell array with the indexes where '/' appears: when the name doesn't contain any '/' it returns {[]}, and when the function finds one it returns {[i]}.
Now I'd like a logical condition that says: if the findstr result is not empty, do something. I've tried the isempty function, but it doesn't work (the result is never empty…).
So does anyone have a solution to this?
Thanks
Use regexprep to replace the character:
list = {'aaa', 'bb/cc', '/dd/'};
replace_from = '/'; %// character to be replaced
replace_to = '_'; %// replacing character
list_replaced = regexprep(list, replace_from, replace_to);
gives
list_replaced =
'aaa' 'bb_cc' '_dd_'

Reading from a string using sscanf in Matlab

I'm trying to read a string in a specific format:
<ahref="/teams/spain/real-sociedad-de-futbol/2028/">RealSociedad</a>
This is one example of the string, and what I want to extract is the name of the team.
I've tried something like this,
houseteam = sscanf(str, '%s');
but it does not work. Why?
You can use regexprep to do this, as you did in your earlier post. Your post asks about sscanf, but your comments say you'd like to see it done with regexprep. It takes two nested regexprep calls to retrieve the team name (i.e. RealSociedad), given that str is in the format that you have provided:
str = '<ahref="/teams/spain/real-sociedad-de-futbol/2028/">RealSociedad</a>';
houseteam = regexprep(regexprep(str, '^<a(.*)">', ''), '</a>$', '')
This looks very intimidating, but let's break this up. First, look at this statement:
regexprep(str, '^<a(.*)">', '')
How regexprep works is you specify the string you want to analyze, the pattern you are searching for, then what you want to replace this pattern with. The pattern we are looking for is:
^<a(.*)">
This says you are looking for patterns where the beginning of the string starts with <a. After this, the (.*)"> performs a greedy match: we want the longest sequence of characters that ends in the characters ">. As such, the regular expression matches the following string:
<ahref="/teams/spain/real-sociedad-de-futbol/2028/">
We then replace this with a blank string. As such, the output of the first regexprep call will be this:
RealSociedad</a>
We want to get rid of the </a> string, and so we would make another regexprep call where we look for the </a> at the end of the string, then replace this with the blank string yet again. The pattern you are looking for is thus:
</a>$
The dollar sign ($) symbolizes that this pattern should appear at the end of the string. If we find such a pattern, we will replace it with the blank string. Therefore, what we get in the end is:
RealSociedad
Found a solution. So, %s stops when it finds a space.
str = regexprep(str, '<', ' <');
str = regexprep(str, '>', '> ');
houseteam = sscanf(str, '%*s %s %*s');
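This puts a space on each side of my desired string, so sscanf skips the opening tag with the first %*s, reads the name with %s, and skips the closing tag with the last %*s.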
