How to edit a multi-line scalar and print the edits - string

I need to edit a multi-line scalar and print the results, however I am not able to do it neatly.
my $text = "$omething\n nothing\n Everything\n";
What I need to do is check each line, and if there's a capital letter or special charracter - print this line and remove it from the original scalar ($text).
In this example it would print two times, first time:
$omething
Second time:
Everything
And remove both of those strings from the $text scalar.

To include a dollar sign in a double quoted string, you need to escape it by a backslash.
You can remove the matching lines in a while loop:
#!/usr/bin/perl
use warnings;
use strict;
my $text = "\$omething\nnothing\nEverything\n";
while ($text =~ s/(.*[[:upper:]\$].*\n)//) {
print $1;
}
print "Remaining: $text";
A period never matches a newline (unless you specify the /s modifier).

Related

How to remove or replace brackets in a string?

my $book = Spreadsheet::Read->new();
my $book = ReadData
('D:\Profiles\jmahroof\Desktop\Scheduled_Build_Overview.xls');
my $cell = "CD7";
my $n = "1";
my $send = $book->[$n]{$cell};
The above code gets data from a spreadsheet, then prints the content of a cell that I know has text in. It has text of exactly the following format: text(text)
I need to replace the open bracket with a empty space and I need to remove the close bracket. I have tried the below code to substitute the open bracket for an empty space however it does not seem to work.
$send =~ s/(/ /g;
print $send;
The bracket is seen as part of the code, just escape it.
$send =~ s/\(/ /;
print $send;
Since you only replace one char with another, you don't want a substitution, but a transliteration. That's the tr/// function in Perl. Since the pattern is just a list of chars, and not an actual regex, you don't need to escape the open parenthesis (. There is also no /g flag. It just substitutes all occurrences.
$send =~ tr/(/ /;
The main difference to a regular expression substitution is that the transliterations get compiled at compile time, not at run time. That makes the tr/// faster than a s///, especially in a loop.
See the full documentation in perlop.

How do I include new lines in a string in Perl?

I have a string that looks like this
Acanthocolla_cruciata,#8B5F65Acanthocyrta_haeckeli,#8B5F65Acanthometra_fusca,#8B5F65Acanthopeltis_japonica,#FFB5C5
I am trying to added in new lines so get in list format. Like this
Acanthocolla_cruciata,#8B5F65
Acanthocyrta_haeckeli,#8B5F65
Acanthometra_fusca,#8B5F65
Acanthopeltis_japonica,#FFB5C5
I have a perl script
use strict;
use warnings;
open my $new_tree_fh, '>', 'test_match.txt'
or die qq{Failed to open "update_color.txt" for output: $!\n};
open my $file, '<', $ARGV[0]
or die qq{Failed to open "$ARGV[0]" for input: $!\n};
while ( my $string = <$file> ) {
my $splitmessage = join ("\n", ($string =~ m/(.+)+\,+\#+\w{6}/gs));
print $new_tree_fh $splitmessage, "\n";
}
close $file;
close $new_tree_fh;
The pattern match works but it wont print the new line as I want to make the list. Can anyone please suggest anything.
I'd do:
my $str = 'Acanthocolla_cruciata,#8B5F65Acanthocyrta_haeckeli,#8B5F65Acanthometra_fusca,#8B5F65Acanthopeltis_japonica,#FFB5C5';
$str =~ s/(?<=,#\w{6})/\n/g;
say $str;
Output:
Acanthocolla_cruciata,#8B5F65
Acanthocyrta_haeckeli,#8B5F65
Acanthometra_fusca,#8B5F65
Acanthopeltis_japonica,#FFB5C5
OK, I think your problem here is that your regular expression doesn't match properly.
(.+)+
for example - probably doesn't do what you think it does. It's a greedy capture of 1 or more of "anything" which will grab your whole string.
Check it out on regex101.
Try:
#!/usr/bin/perl
use strict;
use warnings;
while ( my $string = <DATA> ) {
my $splitmessage = join( "\n", ( $string =~ m/(\w+,\#+\w{6})/g ) );
print $splitmessage, "\n";
}
__DATA__
Acanthocolla_cruciata,#8B5F65Acanthocyrta_haeckeli,#8B5F65Acanthometra_fusca,#8B5F65Acanthopeltis_japonica,#FFB5C5
Which will print:
Acanthocolla_cruciata,#8B5F65
Acanthocyrta_haeckeli,#8B5F65
Acanthometra_fusca,#8B5F65
Acanthopeltis_japonica,#FFB5C5
Rather than a quickfix solution, let's find the problem in your existing code and hence learn from it. Your problem is in the regular expression, so we'll dissect and fix it.
($string =~ m/(.+)+\,+\#+\w{6}/gs)
First, the two significant mistakes that lead to the bug:
At the beginning, you're doing a .+, followed by matching with , and # and so on. The problem is, .+ is greedy, which means it'll match upto the last , in the input, and not the first one. So when you run this, almost the entire line (except for the last plant's color) gets matched up by this single .+.
There are a few different ways you can fix this, but the easiest is to restrict what you're matching. Instead of saying .+ "match anything", make it [\w\s]+ at the beginning - which means match either "word characters" (which includes alphabets and digits) or space characters (since there is a space in the middle of the plant name).
($string =~ m/([\w\s]+)+\,+\#+\w{6}/gs)
That changes the output, but still not to the fully correct version because:
m/some regex/g returns a list of its matches as a list here, and what we want is for it to return the whole match including both plant name and color. But, when there are paranthesis inside the match anywhere, m/ returns only the part matched by the paranthesis (which is the plant name here), not the whole match. So, remove the paranthesis, and it becomes:
($string =~ m/[\w\s]++\,+\#+\w{6}/gs)
This works, but is quite clumsy and bug-prone, so here's some improvement suggestions:
Since your input has no newline characters, the /s at the end is unnecessary.
($string =~ m/[\w\s]++\,+\#+\w{6}/g)
, and # are not a special character in perl regular expressions, so they don't need a \ before them.
($string =~ m/[\w\s]++,+#+\w{6}/g)
+ is for when you know only that the character will be present, but don't know how many times it'll be there. Here, since we're only trying to match one , and one # characters, the + after them is unnecessary.
($string =~ m/[\w\s]++,#\w{6}/g)
The ++ after [\w\s] means something quite different from + (basically an even greedier match than usual), so let's make it a single +
($string =~ m/[\w\s]+,#\w{6}/g)
Optionally, you can change the last \w to match only the hexadecimal characters which will appear in the colour code:
($string =~ m/[\w\s]+,#[0-9A-F]{6}/g)
That's a pretty solid, working regular expression that does what you want.

How to grep/split a word in middle of %% or $$

I have a variable from which I have to grep the which in middle of %% adn the word which starts with $$. I used split it works... but for only some scenarios.
Example:
#!/usr/bin/perl
my $lastline ="%Filters_LN_RESS_DIR%\ARC\Options\Pega\CHF_Vega\$$(1212_GV_DATE_LDN)";
my #lastline_temp = split(/%/,$lastline);
print #lastline_temp;
my #var=split("\\$\\$",$lastline_temp[2]);
print #var;
I get the o/p as expected. But can i get the same using Grep command. I mean I dont want to use the array[2] or array[1]. So that I can replace the values easily.
I don't really see how you can get the output you expect. Because you put your data in "busy" quotes (interpolating, double, ...), it comes out being stored as:
'%Filters_LN_RESS_DIR%ARCOptionsPegaCHF_Vega$01212_GV_DATE_LDN)'
See Quote and Quote-like Operators and perhaps read Interpolation in Perl
Notice that the backslashes are gone. A backslash in interpolating quotes simply means "treat the next character as literal", so you get literal 'A', literal 'O', literal 'P', ....
That '0' is the value of $( (aka $REAL_GROUP_ID) which you unwittingly asked it to interpolate. So there is no sequence '$$' to split on.
Can you get the same using a grep command? It depends on what "the same" is. You save the results in arrays, the purpose of grep is to exclude things from the arrays. You will neither have the arrays, nor the output of the arrays if you use a non-trivial grep: grep {; 1 } #data.
Actually you can get the exact same result with this regular expression, assuming that the single string in #vars is the "result".
m/%([^%]*)$/
Of course, that's no more than
substr( $lastline, rindex( $lastline, '%' ) + 1 );
which can run 8-10 times faster.
First, be very careful in your use of quotes, I'm not sure if you don't mean
'%Filters_LN_RESS_DIR%\ARC\Options\Pega\CHF_Vega\$$(1212_GV_DATE_LDN)'
instead of
"%Filters_LN_RESS_DIR%\ARC\Options\Pega\CHF_Vega\$$(1212_GV_DATE_LDN)"
which might be a different string. For example, if evaluated, "$$" means the variable $PROCESS_ID.
After trying to solve riddles (not sure about that), and quoting your string
my $lastline =
'%Filters_LN_RESS_DIR%\ARC\Options\Pega\CHF_Vega\$$(1212_GV_DATE_LDN)'
differently, I'd use:
my ($w1, $w2) = $lastline =~ m{ % # the % char at the start
([^%]+) # CAPTURE everything until next %
[^(]+ # scan to the first brace
\( # hit the brace
([^)]+) # CAPTURE everything up to closing brace
}x;
print "$w1\n$w2";
to extract your words. Result:
Filters_LN_RESS_DIR
1212_GV_DATE_LDN
But what do you mean by replace the values easily. Which values?
Addendum
Now lets extract the "words" delimited by '\'. Using a simple split:
my #words = split /\\/, # use substr to start split after the first '\\'
substr $lastline, index($lastline,'\\');
you'll get the words between the backslashes if you drop the last entry (which is the $$(..) string):
pop #words; # remove the last element '$$(..)'
print join "\n", #words; # print the other elements
Result:
ARC
Options
Pega
CHF_Vega
Does this work better with grep? Seems to:
my #words = grep /^[^\$%]+$/, split /\\/, $lastline;
and
print join "\n", #words;
also results in:
ARC
Options
Pega
CHF_Vega
Maybe that is what you are after? What do you want to do with these?
Regards
rbo

Extract the required substring from another string -Perl

I want to extract a substring from a line in Perl. Let me explain giving an example:
fhjgfghjk3456mm 735373653736
icasd 666666666666
111111111111
In the above lines, I only want to extract the 12 digit number. I tried using split function:
my #cc = split(/[0-9]{12}/,$line);
print #cc;
But what it does is removes the matched part of the string and stores the residue in #cc. I want the part matching the pattern to be printed. How do I that?
You can do it with regular expressions:
#!/usr/bin/perl
my $string = 'fhjgfghjk3456mm 735373653736 icasd 666666666666 111111111111';
while ($string =~ m/\b(\d{12})\b/g) {
say $1;
}
Test the regex here: http://rubular.com/r/Puupx0zR9w
use YAPE::Regex::Explain;
print YAPE::Regex::Explain->new(qr/\b(\d+)\b/)->explain();
The regular expression:
(?-imsx:\b(\d+)\b)
matches as follows:
NODE EXPLANATION
----------------------------------------------------------------------
(?-imsx: group, but do not capture (case-sensitive)
(with ^ and $ matching normally) (with . not
matching \n) (matching whitespace and #
normally):
----------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
----------------------------------------------------------------------
( group and capture to \1:
----------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
----------------------------------------------------------------------
) end of \1
----------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
----------------------------------------------------------------------
) end of grouping
----------------------------------------------------------------------
The $1 built-in variable stores the last match from a regex. Also, if you perform a regex on a whole string, it will return the whole string. The best solution here is to put parentheses around your match then print $1.
my $strn = "fhjgfghjk3456mm 735373653736\nicasd\n666666666666 111111111111";
$strn =~ m/([0-9]{12})/;
print $1;
This makes our regex match JUST the twelve digit number and then we return that match with $1.
#!/bin/perl
my $var = 'fhjgfghjk3456mm 735373653736 icasd 666666666666 111111111111';
if($var =~ m/(\d{12})/) {
print "Twelve digits: $1.";
}
#!/usr/bin/env perl
undef $/;
$text = <DATA>;
#res = $text =~ /\b\d{12}\b/g;
print "#res\n";
__DATA__
fhjgfghjk3456mm 735373653736
icasd 666666666666
111111111111

Printing string in Perl

Is there an easy way, using a subroutine maybe, to print a string in Perl without escaping every special character?
This is what I want to do:
print DELIMITER <I don't care what is here> DELIMITER
So obviously it will great if I can put a string as a delimiter instead of special characters.
perldoc perlop, under "Quote and Quote-like Operators", contains everything you need.
While we usually think of quotes as literal values, in Perl they function as operators, providing various kinds of interpolating and pattern matching
capabilities. Perl provides customary quote characters for these behaviors, but also provides a way for you to choose your quote character for any of
them. In the following table, a "{}" represents any pair of delimiters you choose.
Customary Generic Meaning Interpolates
'' q{} Literal no
"" qq{} Literal yes
`` qx{} Command yes*
qw{} Word list no
// m{} Pattern match yes*
qr{} Pattern yes*
s{}{} Substitution yes*
tr{}{} Transliteration no (but see below)
<<EOF here-doc yes*
* unless the delimiter is ''.
$str = q(this is a "string");
print $str;
if you mean quotes and apostrophes with 'special characters'
You can use the __DATA__ directive which will treat all of the following lines as a file that can be accessed from the DATA handle:
while (<DATA>) {
print # or do something else with the lines
}
__DATA__
#!/usr/bin/perl -w
use Some::Module;
....
or you can use a heredoc:
my $string = <<'END'; #single quotes prevent any interpolation
#!/usr/bin/perl -b
use Some::Module;
....
END
The printing is not doing special things to the escapes, double quoted strings are doing it. You may want to try single quoted strings:
print 'this is \n', "\n";
In a single quoted string the only characters that must be escaped are single quotes and a backslash that occurs immediately before the end of the string (i.e. 'foo\\').
It is important to note that interpolation does not work with single quoted strings, so
print 'foo is $foo', "\n";
Will not print the contents of $foo.
You can pretty much use any character you want with q or qq. For example:
#!/usr/bin/perl
use utf8;
use strict; use warnings;
print q∞This is a test∞;
print qq☼\nThis is another test\n☼;
print q»But, what is the point?»;
print qq\nYou are just making life hard on yourself!\n;
print qq¿That last one is tricky\n¿;
You cannot use qq DELIMITER foo DELIMITER. However, you could use heredocs for a similar effect:
print <<DELIMITER
...
DELIMETER
;
or
print <<'DELIMETER'
...
DELIMETER
;
but your source code would be really ugly.
If you want to print a string literally and you have Perl 5.10 or later then
say 'This is a string with "quotes"' ;
will print the string with a newline.. The importaning thing is to use single quotes ' ' rather than double ones " "

Resources