I have a lot of HTML files, and I want to add rel="nofollow" to all the a href tags that are inside a specific div.
I think C# code can do it, but how do I target only part of the markup?
Any suggestions? I really don't know how to approach this.
Here is what I think:
1. Parse the HTML line by line.
2. Look for the "<div" block start; to find where the block ends, look for "</div>".
3. Store all the content between "<div" and "</div>" in a string.
4. Check whether "href=" is found, and how many times.
5. Now parse this string again to search for all "<a" and match the count against the "href=" counter.
6. #5 will give you an array of lines containing "href=" tags.
7. Now you can assume every "href=" tag must have ">" at the end of the "<a" tag.
Finally, this is what you can do:
string s1 = "<a href=\"page.html\"> this is link </a>"; // placeholder anchor; page.html is a made-up URL
string s2 = s1.Insert(s1.IndexOf(">"), " rel=\"nofollow\"");
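The steps above can be sketched end to end. This is only a rough illustration in Python rather than C#, and it makes some loud assumptions: the target div is identified by an id attribute, it contains no nested divs, and the function name add_nofollow is made up for the example. For messy real-world HTML a proper parser would be safer than string searching.

```python
import re

def add_nofollow(html, div_id):
    """Add rel="nofollow" to every <a ...> tag inside the div with the given id."""
    # Find the opening tag of the target div.
    opening = re.search(r'<div\b[^>]*\bid="%s"[^>]*>' % re.escape(div_id), html)
    if opening is None:
        return html  # no such div: leave the document unchanged
    start = opening.end()
    # Assumption: no nested divs, so the next </div> closes the block.
    end = html.find("</div>", start)
    if end == -1:
        return html

    def patch(tag):
        attrs = tag.group(1)
        if re.search(r'\brel\s*=', attrs):
            return tag.group(0)  # already has a rel attribute, keep as is
        return '<a' + attrs + ' rel="nofollow">'

    # Rewrite only the anchors inside the div block.
    block = re.sub(r'<a\b([^>]*)>', patch, html[start:end])
    return html[:start] + block + html[end:]

sample = ('<a href="out.html">x</a>'
          '<div id="links"><a href="a.html">A</a> '
          '<a rel="nofollow" href="b.html">B</a></div>')
result = add_nofollow(sample, "links")
print(result)
```

The link outside the div is untouched, and the anchor that already carries rel is not modified a second time.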
I am trying to find specific words in a div (id="Test") that start with "a04" (no case). I can find and replace the words found, but I am unable to correctly use the word found in an "href" link.
The following code correctly identifies my search criteria and works as expected, but I would like help, as I do not know how to use the found word as the URL id.
var test = document.getElementById("test").innerHTML
function replacetxt(){
var str_rep = document.getElementById("test").innerHTML.replace(/a04(\w)+/g,'TEST');
var temp = str_rep;
//alert(temp);
document.getElementById("test").innerHTML = temp;
}
I would like to wrap the found word in an href, but I do not know how to use the found word as the URL id (url.com?id=found word).
Can someone help point out how to reference the found word, please?
Thanks
If you want to use your pattern with the capturing group, you could move the quantifier + inside the group or else you would only get the value of the last iteration.
\ba04(\w+)
\b word boundary to prevent the match being part of a longer word
a04 Match literally
(\w+) Capture group 1, match 1+ times a word character
Then you could use the first capturing group in the replacement by referring to it with $1
If the string is a04word, you would capture word in group 1.
Your code might look like:
function replacetxt(){
var elm = document.getElementById("test");
if (elm) {
elm.innerHTML = elm.innerHTML.replace(/\ba04(\w+)/g,'<a href="url.com?id=$1">TEST</a>');
}
}
replacetxt();
<div id="test">This is text a04word more text here</div>
Note that you don't have to create extra variables like var temp = str_rep;
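The difference between (\w)+ and (\w+) is easy to check outside the browser too. Here is a small sketch in Python, whose re module treats repeated groups the same way in this respect; the url.com?id= link format is taken from the question, and \1 and \g<0> are Python's spellings of the group reference and the whole match.

```python
import re

text = "This is text a04word more text here"

# Quantifier outside the group: only the last iteration is kept.
last_only = re.search(r"a04(\w)+", text).group(1)
# Quantifier inside the group: the whole run of word characters is kept.
whole = re.search(r"\ba04(\w+)", text).group(1)
print(last_only)  # d
print(whole)      # word

# Use the captured part as the URL id and the full match as the link text.
linked = re.sub(r"\ba04(\w+)", r'<a href="url.com?id=\1">\g<0></a>', text)
print(linked)
```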
I am trying to clean text strings containing any ' or &#39; (the HTML entity for an apostrophe; Stack Overflow renders it as ' again unless it is escaped). The string content contains ' and when it does there is an error.
When I insert the string into my database, I get this error:
psycopg2.ProgrammingError: syntax error at or near "s"
LINE 1: ...tment and has commenced a search for mr. whitnell's
the original string looks like this:
...a search for mr. whitnell's...
To remove the ' and &#39; I use:
stripped_content = stringcontent.replace("'","")
stripped_content = stringcontent.replace("&#39;","")
Any advice is welcome. Best regards
When you call replace("&#39;","") it literally searches for occurrences of "&#39;" in the string. You need to convert "&#39" to its character equivalent. Try this:
s = "That's how we 'roll"
r = s.replace(chr(int('&#39'[2:])), "")
With chr(int('&#39'[2:])) you get the ' character.
Output:
Thats how we roll
Note
If you run s.replace(chr(int('&#39'[2:])), "") without saving the result in a variable, your original string is not affected.
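Another angle, offered only as a sketch: instead of stripping the apostrophe, decode the entity with html.unescape and let the database driver handle quoting through a parameterized query. The example below uses the standard-library sqlite3 module as a stand-in so it runs anywhere; the articles table is hypothetical, and with psycopg2 the placeholder would be %s instead of ?.

```python
import html
import sqlite3

raw = "a search for mr. whitnell&#39;s"
# Turn the HTML entity back into a plain apostrophe.
decoded = html.unescape(raw)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (content TEXT)")
# The value is passed separately from the SQL text, so the apostrophe
# cannot break the statement. sqlite3 uses ?, psycopg2 uses %s.
conn.execute("INSERT INTO articles (content) VALUES (?)", (decoded,))
stored = conn.execute("SELECT content FROM articles").fetchone()[0]
print(stored)
```

This keeps the apostrophe in the stored text rather than deleting it.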
I'm trying to read a string in a specific format:
<ahref="/teams/spain/real-sociedad-de-futbol/2028/">RealSociedad</a>
This is one example of the string, and what I want to extract is the name of the team.
I've tried something like this,
houseteam = sscanf(str, '%s');
but it does not work, why?
You can use regexprep to do this, as you did in your post above. Your post asks for sscanf, but judging from the comments you would also like to see it done with regexprep. This takes two nested regexprep calls, and you can retrieve the team name (i.e. RealSociedad) like so, given that str is in the format you have provided:
str = '<ahref="/teams/spain/real-sociedad-de-futbol/2028/">RealSociedad</a>';
houseteam = regexprep(regexprep(str, '^<a(.*)">', ''), '</a>$', '')
This looks very intimidating, but let's break this up. First, look at this statement:
regexprep(str, '^<a(.*)">', '')
How regexprep works is you specify the string you want to analyze, the pattern you are searching for, then what you want to replace this pattern with. The pattern we are looking for is:
^<a(.*)">
This says you are looking for patterns where the string starts with <a. After this, (.*)"> performs a greedy match: we want the longest sequence of characters up to the characters ">. As such, what the regular expression will match is the following string:
<ahref="/teams/spain/real-sociedad-de-futbol/2028/">
We then replace this with a blank string. As such, the output of the first regexprep call will be this:
RealSociedad</a>
We want to get rid of the </a> string, and so we would make another regexprep call where we look for the </a> at the end of the string, then replace this with the blank string yet again. The pattern you are looking for is thus:
</a>$
The dollar sign ($) symbolizes that this pattern should appear at the end of the string. If we find such a pattern, we will replace it with the blank string. Therefore, what we get in the end is:
RealSociedad
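For comparison, the same two-step stripping can be sketched in Python with re.sub, which takes the pattern first and the input string last. The patterns are the ones from the regexprep calls above; the variable names are only for illustration.

```python
import re

s = '<ahref="/teams/spain/real-sociedad-de-futbol/2028/">RealSociedad</a>'

# Step 1: drop everything from the leading <a up to the last "> (greedy match).
without_open = re.sub(r'^<a(.*)">', '', s)
# Step 2: drop the closing </a> at the end of the string.
houseteam = re.sub(r'</a>$', '', without_open)

print(without_open)  # RealSociedad</a>
print(houseteam)     # RealSociedad
```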
Found a solution. So, %s stops when it finds a space.
str = regexprep(str, '<', ' <');
str = regexprep(str, '>', '> ');
houseteam = sscanf(str, '%*s %s %*s');
This puts a space on either side of my desired string, so %s can pick it out.
Let's say I have some text in rails:
text = "A bunch of data goes in here: %#user.name%#, %#user.email%#, %#company.name%#, %#company.state%# and then some other information as well"
I am looking for the best way to parse through that text looking for all substrings between %# and another %# in order to replace it with actual data. The text should not anticipate that data will be in any particular order and it should ideally be able to turn the substrings into references to local variables that match the substring.
Use the String#scan method.
For your case in particular, the regex I used to match was: /(\%\#.*?\%\#)/
text = "A bunch of data goes in here: %#user.name%#, %#user.email%#, %#company.name%#, %#company.state%# and then some other information as well"
regex = /(\%\#.*?\%\#)/
#here's the one line version
text.scan(regex).each {|match| text.sub!(match[0], eval(match[0].gsub(/[\%\#]/, '')))}
#Here's the more organized version
text.scan(regex).each do |match|
current_match = match[0]
replacement_var = current_match.gsub(/[\%\#]/, '')
text.sub!(current_match, eval(replacement_var))
end
puts text
text = "A bunch of data goes in here: %#user.name%#, %#user.email%#, %#company.name%#, %#company.state%# and then some other information as well"
placeholder = text[/\%\#.*?\%\#/]
while placeholder
case placeholder
when "%#user.name%#"
text.sub!(/\%\#.*?\%\#/,"Steve")
when "%#user.email%#"
text.sub!(/\%\#.*?\%\#/,"steve@example.com")
when "%#company.name%#"
text.sub!(/\%\#.*?\%\#/,"Wayne Industries")
when "%#company.state%#"
text.sub!(/\%\#.*?\%\#/,"Gotham")
else
text.sub!(/\%\#.*?\%\#/,"unknown")
end
placeholder = text[/\%\#.*?\%\#/]
end
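A variation that avoids both eval and the long case statement is to look each placeholder up in a table. Here is a minimal sketch of that idea in Python; the values dictionary holds made-up sample data matching the example above, and the fill helper is a hypothetical name.

```python
import re

text = ("A bunch of data goes in here: %#user.name%#, %#user.email%#, "
        "%#company.name%#, %#company.state%# and then some other information as well")

# Sample data only; in a real app these would come from your objects.
values = {
    "user.name": "Steve",
    "user.email": "steve@example.com",
    "company.name": "Wayne Industries",
    "company.state": "Gotham",
}

def fill(match):
    # group(1) is the text between the two %# markers.
    return values.get(match.group(1), "unknown")

# The non-greedy .*? stops at the nearest closing %#.
result = re.sub(r"%#(.*?)%#", fill, text)
print(result)
```

Because the lookup is by name, the placeholders can appear in any order, and anything not in the table falls back to "unknown".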
I need to import text from txt file with some variables. I use BufferedReader and File Reader. In code I have :
String car = "vw golf";
String color = "nice sunny blue color";
And in my txt file:
I have nice " +car+ " which has "+color+".
My expected output :
I have nice vw golf which has nice sunny blue color.
My actual output is :
I have nice " +car+ " which has "+color+".
If I've understood correctly, what you want to do is replace " + car + " with the value of your car string, and likewise for color. You've tried to do this by writing your text file as if it were code to be evaluated. However, that won't happen: the text will just be output as is. I'm going to assume you are using C#. What you need to do, before outputting your string, is parse it and replace the markers with the variables. I would recommend you get rid of the double quotes in your text file. You could then do something like this:
string text = this.ReadTextFromFile();
string ammended = text.Replace("+car+", car);
As mentioned, this is assuming you remove the double quotes from your text file so it reads:
I have nice +car+ which has +color+.
Also, you don't need to use the + symbols, but I suppose they are a good way of designating a unique token to be replaced. You could use {car} in the file and then likewise in the Replace statement, for example.
I may not have properly understood what you wanted to do, of course!
Edit: In case of confusion,
this.ReadTextFromFile();
was just a short hand way of saying that the text variable contains the contents as read from your text file.
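To make the {car} idea concrete, here is a small sketch in Python (not the asker's language, just an illustration of the same token replacement). The template string stands in for the line read from the text file and is written inline so the example is self-contained.

```python
car = "vw golf"
color = "nice sunny blue color"

# Stand-in for the line read from the text file, using {car}/{color} tokens.
template = "I have nice {car} which has {color}."

# Replace each token with the value of the matching variable.
output = template.replace("{car}", car).replace("{color}", color)
print(output)  # I have nice vw golf which has nice sunny blue color.
```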