BBcode parsing problem - bbcode

I use this function for BBcode parsing:
function bbcode ($message) {
    $search = array(
        '#\[(?i)b\](.*?)\[/(?i)b\]#si',
        '#\[(?i)i\](.*?)\[/(?i)i\]#si',
        '#\[(?i)u\](.*?)\[/(?i)u\]#si',
        '#\[color=rgb(.*?)\](.*?)\[\/color\]#si',
        '#\[quote](.*?)\[\/quote\]#si',
        '#\[li](.*?)\[\/li\]#si',
        '#\[ul](.*?)\[\/ul\]#si',
    );
    // Double quotes need no escaping inside single-quoted PHP strings.
    $replace = array(
        '<b>\\1</b>',
        '<i>\\1</i>',
        '<u>\\1</u>',
        '<span style="color:rgb\\1">\\2</span>',
        '<span class="quote">\\1</span>',
        '<li>\\1</li>',
        '<ul>\\1</ul>',
    );
    return preg_replace($search, $replace, $message);
}
In most cases it works ok, but not always.
For example:
[color=rgb(102, 0, 102)]H[color=rgb(204, 0, 0)]e[/color]llo[/color]
The result is:
<span style="color:rgb(102, 0, 102)">H[color=rgb(204, 0, 0)]e</span>llo[/color]
As you can see, only the first [color=...][/color] pair has been converted to HTML. The second stays as it is. Any ideas?

It's working exactly as you specified it. The problem is nested tags: the lazy (.*?) stops at the first [/color] it finds, so the outer opening tag gets paired with the inner closing tag. The inner opening tag is swallowed into the captured text and the outer closing tag is left without a partner.
I suggest you perform two replaces: one for the opening tags and one for the closing tags.
You might also be able to get away with listing all of the opening tags first and
all of the closing tags last in the array of replacements.
That makes the search and replace values simpler anyway, and for simple tags
like [b] you don't need back-references at all.
That should fix your problem.
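The idea of replacing opening and closing tags independently can be sketched as follows. This is a minimal illustration in JavaScript rather than PHP; the function name bbcodeToHtml and the rule list are assumptions made for the example, not part of the original code. The same pattern/replacement pairs should carry over to preg_replace.

```javascript
// Hypothetical helper: each opening tag and each closing tag is converted
// on its own, so it does not matter how deeply the tags are nested.
function bbcodeToHtml(message) {
  const rules = [
    [/\[b\]/gi, '<b>'],
    [/\[\/b\]/gi, '</b>'],
    [/\[i\]/gi, '<i>'],
    [/\[\/i\]/gi, '</i>'],
    [/\[u\]/gi, '<u>'],
    [/\[\/u\]/gi, '</u>'],
    // Only digits, spaces and commas are accepted inside rgb(...),
    // which keeps arbitrary text out of the style attribute.
    [/\[color=rgb(\([\d\s,]+\))\]/gi, '<span style="color:rgb$1">'],
    [/\[\/color\]/gi, '</span>'],
  ];
  // Apply every rule in turn to the whole message.
  return rules.reduce(
    (text, [pattern, html]) => text.replace(pattern, html),
    message
  );
}

console.log(
  bbcodeToHtml('[color=rgb(102, 0, 102)]H[color=rgb(204, 0, 0)]e[/color]llo[/color]')
);
```

One trade-off of this approach: an unbalanced tag in the input produces unbalanced HTML, so the input may need to be validated separately.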

Related

Is this the right way to implement my own filter in twig? (I want *minimal* whitespace between tags, not *no* whitespace between tags)

I know that apply spaceless "isn't about optimisation", in Symfony's words. But dammit, I dislike having extraneous whitespace in my served files.
So I'm keen to use it.
But... I don't like how it reduces
<span>1</span>
<span>2</span>
to
<span>1</span><span>2</span>
As that results in the browser displaying 12, rather than 1 2.
In my mind whitespace between tags should be reduced to a single space, not to nothing.
<span>1</span>
<span>2</span>
->
<span>1</span> <span>2</span>
So I thought I'd make a custom filter, minimizeWhitespace, and wrap my templates with {% apply minimizeWhitespace %}
This is what I came up with:
function minimizeWhitespace($s)
{
return new \Twig\Markup(preg_replace('/\s+/', ' ', $s->__toString()), 'UTF-8');
}
$TWIG_env->addFilter(
new \Twig\TwigFilter('minimizeWhitespace', 'minimizeWhitespace')
);
This feels messy though - I'm taking their \Twig\Markup object, converting it to a string, running my regexp on it, and then creating a new \Twig\Markup object to return.
Is there a better way?
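One detail worth checking is the pattern itself. '/\s+/' collapses every run of whitespace, including runs inside text nodes and inside <pre> blocks, whereas the goal described above is only about whitespace between tags. A minimal sketch of the difference, written in JavaScript purely to illustrate the two regexes (both function names are made up for the example):

```javascript
// Collapses every run of whitespace, wherever it occurs
// (same behaviour as the '/\s+/' pattern in the filter above).
function collapseAllWhitespace(html) {
  return html.replace(/\s+/g, ' ');
}

// Collapses only whitespace that sits between a closing '>' and an
// opening '<', leaving text content untouched.
function collapseBetweenTags(html) {
  return html.replace(/>\s+</g, '> <');
}

console.log(collapseBetweenTags('<span>1</span>\n<span>2</span>'));
console.log(collapseAllWhitespace('<pre>a\n  b</pre>'));
console.log(collapseBetweenTags('<pre>a\n  b</pre>'));
```

The narrower pattern should translate directly to preg_replace('/>\s+</', '> <', ...) in the PHP filter.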

Separating an HTML Element String into Multiple Strings

I am webscraping using puppeteer and I am trying to extract the innerText of this h4 element.
<h4 class="loss">
(NA)
<br>
<span class="team-name">TEAMNAME</span>
<br>
<span class="win spoiler-wrap">0</span>
</h4>
I am able to get this element using:
const teamName = await matches.$eval('h4', (h4) => h4.innerHTML);
This will set teamName to:
(NA)<br><span class="team-name">TEAMNAME</span><br><span class="win spoiler-wrap">0</span>
I am trying to get only the inner text of each element.
I can get the (NA) using const s = teamName.substr(0, teamName.indexOf('<'));
But I cannot seem to figure out how to get "TEAMNAME" or "0" out of this string. I have thoughts of using regex, but I am not sure how I would accomplish this.
PS the inner text will not always be the same so I can't look for specific words.
With regex, you can do it like this:
teamName.match(/<span class="team-name">(.*?)<\/span>/)[1]
match returns an array, where the first element is the match of the whole regex, the second element is the match of the first regex group, the third element would be the match of a second regex group (there is none in this case), etc.
The /.../ delimits a regex literal. . in a regex matches any character except a line break. * specifies that any number of occurrences of the preceding item is matched, including 0 occurrences. The ? after * makes the match lazy, so it stops at the first </span>; a plain greedy (.*) would run on to the last </span> in the string and capture TEAMNAME</span><br><span class="win spoiler-wrap">0. (...) is a regex group, which is used by match. \ is an escape character, needed because / would otherwise end the regex literal.
I very much recommend reading the Mozilla docs on match and on regexes for details. You will often find them useful.
However, in the case of puppeteer there probably also is a way of directly matching the selector h4 span, which would be more straightforward than using regexes. I don't know enough about puppeteer to tell you the exact way of doing that. :/
With a bit more thinking, I was able to solve my issue.
Here is a solution:
const teamName = await matches.$eval('h4', (h4) => h4.innerHTML);
const openSpanGT = teamName.indexOf('>', 20);
const closeSpanLT = teamName.indexOf('<', openSpanGT);
const teamTitle = teamName.substr(openSpanGT + 1, closeSpanLT - openSpanGT - 1);
console.log(teamTitle);
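This will output "TEAMNAME" no matter how long the team name is, as long as the text before the first <br> is short enough that index 20 falls after that <br> and before the end of the opening span tag.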
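For comparison, here is a sketch that does not depend on a fixed offset. It pulls the leading text and the text of every span out of the innerHTML string with a lazy regex. The function names are illustrative only, and it assumes the spans contain plain text with no nested tags.

```javascript
// Text that appears before the first tag, e.g. "(NA)".
function leadingText(html) {
  const firstTag = html.indexOf('<');
  return (firstTag === -1 ? html : html.slice(0, firstTag)).trim();
}

// Inner text of every <span>...</span>, in document order.
// Assumes the spans hold plain text (no nested elements).
function spanTexts(html) {
  return Array.from(
    html.matchAll(/<span[^>]*>(.*?)<\/span>/g),
    (match) => match[1]
  );
}

const html =
  '(NA)<br><span class="team-name">TEAMNAME</span><br><span class="win spoiler-wrap">0</span>';
console.log(leadingText(html)); // (NA)
console.log(spanTexts(html));   // [ 'TEAMNAME', '0' ]
```

If matches is a Puppeteer ElementHandle, something like matches.$eval('h4 .team-name', (el) => el.textContent) would probably avoid string parsing altogether, though that has not been tried against this page.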

Find and replace text and wrap in "href"

I am trying to find specific words in a div (id="test") that start with "a04" (case-insensitive). I can find and replace the words found, but I am unable to correctly use the word found in an "href" link.
The following code correctly identifies my search criteria and works as expected, but I would like help, as I do not know how to use the found word as the url id.
var test = document.getElementById("test").innerHTML
function replacetxt(){
var str_rep = document.getElementById("test").innerHTML.replace(/a04(\w)+/g,'TEST');
var temp = str_rep;
//alert(temp);
document.getElementById("test").innerHTML = temp;
}
I would like to wrap the found word in an href, but I do not know how to use the found word as the url id (url.com?id=found word).
Can someone help point out how to reference the found word please?
Thanks
If you want to use your pattern with the capturing group, you could move the quantifier + inside the group or else you would only get the value of the last iteration.
\ba04(\w+)
\b word boundary to prevent the match being part of a longer word
a04 Match literally
(\w+) Capture group 1, match 1+ times a word character
Then you could use the first capturing group in the replacement by referring to it with $1
If the string is a04word, you would capture word in group 1.
Your code might look like:
function replacetxt(){
var elm = document.getElementById("test");
if (elm) {
elm.innerHTML = elm.innerHTML.replace(/\ba04(\w+)/g,'TEST');
}
}
replacetxt();
<div id="test">This is text a04word more text here</div>
Note that you don't have to create extra variables like var temp = str_rep;
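To actually wrap the found word in a link, the replacement string can refer to the match itself. A minimal sketch, assuming the whole word (for example a04word) should be used as the id; the function name linkify and the i flag for case-insensitive matching are additions made for the example, and url.com is the placeholder from the question:

```javascript
// $& in the replacement string stands for the whole matched word;
// $1 would stand for capture group 1 if the pattern had one.
function linkify(html) {
  return html.replace(/\ba04\w+/gi, '<a href="url.com?id=$&">$&</a>');
}

console.log(linkify('This is text a04word more text here'));
```

In the page, this would be applied as elm.innerHTML = linkify(elm.innerHTML). Running it twice on the same element would wrap the words again, because the pattern also matches inside the generated href, so it is best called once.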

Gradle: How to filter and search through text?

I'm fairly new to gradle. How do I filter text in the following manner?
Pretend that the output/result I want to filter will be the two URLs below.
"http://localhost/artifactory/appNameIwant/moreStuffHereThatsDynamic"
> I want this URL
"http://localhost/artifactory/differentAppName"
> I don't want this URL
I want to put up a "match" variable that would be something like
variable = http://localhost/artifactory/appnameIwant
So essentially, the string will not be a perfect match. I want it to filter and provide back any URLs that start with the variable listed above. It cannot be a perfect match as the characters after the /appnameIwant/ will be changing.
I want to use a for loop to cycle through an array, with an if then statement to return any matches. For instance.
for (i=0; i < results.length; i++){
if (results[i] strings matches (http://localhost/artifactory/appnameIwant) {
return results[i] }
I am just filtering the URL strings themselves, not anything complicated inside the webpages.
Let me know if further explanation would be helpful.
Thanks so much for your time and help!
I figured it out - I just used
if (string.startsWith("texthere")) { println string }
A lot easier than I thought!

Drupal 6: How do you print Taxonomy Terms as a CSS Body Class?

In Drupal 6, how do you print a taxonomy term as a CSS body class?
I have found this snippet that lets you print almost every aspect of Drupal content as a body class, but it doesn't include taxonomy terms:
http://www.davidnewkerk.com/book/122
Being able to print taxonomy terms as a body class is essential for theming processes, so I am surprised that a solution is not readily available.
Check what variables are passed to the page template, either by calling print_r($vars) or dpm($vars) in your page preprocess function or by using the http://drupal.org/project/devel_themer module. Using dpm requires the Devel module to be installed.
You will find that some themes pass $taxonomy as a variable to page.tpl.php. If that is not the case, you can find the taxonomy terms in the $node variable, which is also available in page.tpl.php in some themes.
(The above holds true for my Fusion-based theme, Acquia Marina: http://drupal.org/project/acquia_marina.) Once you have these taxonomy terms available, you can easily print them out in your body classes.
After much hard work, I found a very easy way to do this.
On Drupal Snippets, there is a snippet that lets you print out the taxonomy terms applied to each page as text.
The only problem is that the snippet will print any spaces or punctuation that are in the taxonomy term, which is no good for body classes.
However, by adding str_replace calls, you can replace the spaces and punctuation with characters that are safe in a class name.
I'm sure there are other people who want to print taxonomy terms as body classes, so to save them the bother, here is the code that I used with the str_replace calls added.
Put the following in template.php:
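function getTerm($label, $vid, $link) {
  // Load the node being viewed (node/NID, so arg(1) is the node ID).
  $node = node_load(array('nid' => arg(1)));
  $link_set = array();
  foreach ((array) $node->taxonomy as $term) {
    // Keep only the terms from the requested vocabulary.
    if ($term->vid == $vid) {
      if ($link) {
        $link_set[] = l($term->name, taxonomy_term_path($term));
      } else {
        $link_set[] = $term->name;
      }
    }
  }
  // No matching terms: return an empty string instead of an undefined value.
  if (empty($link_set)) {
    return '';
  }
  $label = ($label) ? "<strong>$label </strong>" : "";
  $link_set = $label . implode(', ', $link_set);
  // Make the result usable as a CSS class name.
  $link_set = str_replace(' ', '_', $link_set);
  $link_set = str_replace('&', 'and', $link_set);
  $link_set = strtolower($link_set);
  return $link_set;
}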
Put the following in page.tpl.php:
<body class="taxonomy-<?php print getTerm(false, 1, false); ?>">
I hope this helps anyone who has the same problem.
Extra tips:
(1) In the code I have posted, the only punctuation that is handled is the ampersand (i.e. '&'), which is replaced with 'and'.
If you have other punctuation to strip out, use the following:
$link_set = str_replace('INSERT_PUNCTUATION_HERE', 'INSERT_REPLACEMENT_HERE', $link_set);
Place this command under the other $link_set lines in the code I have posted for template.php.
(2) In the page.tpl.php code I have posted, the "taxonomy-" part puts the word taxonomy and a dash in front of the body class term. You can edit this as you wish to get the results you require.

Resources