What's a good character to separate strings with leading white-spaces? - string

I'm using the null character (\0) as a separator to preserve the strings' leading whitespace after the sprintf. But strings containing the null character don't work (in this case) with the Curses addstr function.
Is there some suitable character to replace the \0 for this purpose?
#!/usr/bin/env perl
use warnings;
use 5.12.0;
sub routine {
my @list = @_;
@list = map{ "\0".$_."\0"; } @list;
# ...
# ...
@list = map{ sprintf "%35.35s", $_ } @list;
# ...
# ...
my $result = $list[5];
# \s* rather than \s+: the padding may be empty on either side
$result =~ s/\A\s*\0//;
$result =~ s/\0\s*\z//;
return $result;
}

What about using some pretty print module from CPAN?
http://metacpan.org/pod/Data::Format::Pretty::Console
http://metacpan.org/pod/Text::Tabulate

Related

perl number of lines in a string

Using Perl, is there a single command that gives me the number of lines in a string?
my $linenum= .... $str ....
It should work for when the string is empty, single line, and multiple lines.
You can count the number of newline characters (\n) in the string (or \r for classic Mac OS line endings):
my $linenum = $str =~ tr/\n//;
I've adapted @rplantiko's answer into a full subroutine that works the way I picture it, with handling for undef and "". It also accounts for a last line of text that is missing its "\n" and returns the apparent line count (the count of "\n", plus 1 in that case).
# should work on windows + unix but not the old mac
sub count_lines_in_string {
my $str = shift; # lexical copy, so the caller's $_ is not clobbered
return 0 if( !defined $str or $str eq "");
my $lastchar = substr $str, -1,1;
my $numlines = () = $str =~ /\n/g;
# was the last line a whole line ending in "\n"?
return $numlines + ($lastchar ne "\n");
}
say count_lines_in_string("asdf\nasdf\n") ;
say count_lines_in_string undef;
say count_lines_in_string "a";
Try to use a regular expression
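The regular-expression suggestion above does not say which expression to use. One possible reading, as a minimal sketch: run a global match for "\n" in list context and assign the result to a scalar to get the number of matches. The name count_newlines is made up for illustration, and the count is of newline characters, not of "apparent" lines.

```perl
use strict;
use warnings;

# count_newlines is a hypothetical helper name, not from the answers above.
# A global match in list context returns every match; assigning that list
# to a scalar through the empty list "()" yields the number of matches.
sub count_newlines {
    my ($str) = @_;
    return 0 unless defined $str;
    my $count = () = $str =~ /\n/g;
    return $count;
}

print count_newlines("one\ntwo\nthree\n"), "\n";   # three newline characters
print count_newlines(""), "\n";                     # none
```

A string whose last line lacks a trailing "\n" still needs the +1 adjustment described in the subroutine above.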

Perl: Transfer substring positions between two strings

I'm writing a Perl program and I've got the following problem:
I have a large list of start and end positions in a string. These positions correspond to substrings of this string. I now want to transfer these positions to a second string. This second string is identical to the first, except that it contains additional hyphens.
Example for the original string: "ABCDEF" and one substring "BCDE"
What I have:
Positions of the substring in the original string: Start = 1, End = 4
The original string with additional hyphens: "-AB---CD--E-F---"
What I want:
Positions of the substring in the hyphenated string: Start = 2, End = 10
I have a large list of these substring positions.
I strongly suspect that you have shown a reduced version of the problem, in which case any solution may not work for the real situation.
However, it seems simplest to build a regex by interspersing -* (i.e. zero or more hyphens) between characters.
This program works that way, building a regex of B-*C-*D-*E and comparing it to both of your sample strings.
use strict;
use warnings;
my @strings = qw/ ABCDEF -AB---CD--E-F--- /;
my ($start, $end) = (1, 4);
my $substr = substr $strings[0], $start, $end-$start + 1;
my $regex = join '-*', split //, $substr;
$regex = qr/$regex/;
for my $string (@strings) {
if ($string =~ $regex) {
printf "Substring found at %d to %d in string %s\n", $-[0], $+[0]-1, $string;
}
}
output
Substring found at 1 to 4 in string ABCDEF
Substring found at 2 to 10 in string -AB---CD--E-F---
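Does this work for you? It just searches the hyphenated string for the characters found at the start and end positions and returns their indices. Note that index returns the first occurrence, so this is only reliable when those characters are not repeated earlier in the string.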
sub hyphen_substrings {
my $original = shift;
my $hyphenated = shift;
my @substrings = @_;
my @return;
for my $substring (@substrings) {
my ($start, $end) = @{$substring}[0, 1];
my $start_h = index $hyphenated, substr $original, $start, 1;
my $end_h = index $hyphenated, substr $original, $end, 1;
push @return, [$start_h, $end_h];
}
return @return;
}
use strict;
use warnings;
my $theStringGivenAsAnInputExample="-AB---CD--E-F---";
my $start=1;
my $end=4;
my $theStringGivenAsAnotherInput="ABCDEF";
# the third argument of substr is a length, not an end position
my $regexp=join("-*",split("",substr($theStringGivenAsAnotherInput,$start,$end-$start+1)));
$theStringGivenAsAnInputExample =~ /$regexp/p;
print ${^PREMATCH},"\n";
print ${^POSTMATCH},"\n";
print ${^MATCH},"\n";
my $startPosition = length(${^PREMATCH});
my $finishPosition = length(${^PREMATCH})+length(${^MATCH})-1;
print "start, $startPosition finish, $finishPosition\n";
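Since the question mentions a large list of positions, another option is to scan the hyphenated string once and build a lookup table from original index to hyphenated index. The following is only a sketch under two assumptions: the strings differ solely by the inserted '-' characters, and the original string contains no hyphens itself. The name build_position_map and the second range are made up for illustration.

```perl
use strict;
use warnings;

# build_position_map is a hypothetical helper: element N of the returned
# array is the index in the hyphenated string of the N-th non-hyphen character.
sub build_position_map {
    my ($hyphenated) = @_;
    my @map;
    for my $i (0 .. length($hyphenated) - 1) {
        push @map, $i if substr($hyphenated, $i, 1) ne '-';
    }
    return \@map;
}

my $map    = build_position_map('-AB---CD--E-F---');
my @ranges = ([1, 4], [0, 2]);    # illustrative (start, end) pairs

for my $range (@ranges) {
    my ($start, $end) = @$range;
    printf "%d-%d -> %d-%d\n", $start, $end, $map->[$start], $map->[$end];
}
```

Each position then costs one array lookup, and repeated characters are not a problem because no searching is involved.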

compare string variables in perl

I have an if clause in Perl whose condition needs to check whether two variables match as strings. But my code doesn't work and the strings never match:
if(trim($file) eq trim($fields[0])) {
print "OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO";
}
For the definition of trim I have used:
sub trim($)
{
my $string = shift;
$string =~ s/^\s*(.*?)\s*$/$1/;
return $string;
}
Before the comparison, I populate the fields like this:
my @fields= split(/\;/,$_);
Any help? Thanks!
Your code is correct, so your strings are different.
To find the differences, I recommend the following code, since it reveals differences that might not be noticeable when just printing the strings:
use Data::Dumper;
{
local $Data::Dumper::Useqq=1;
print Dumper($file, $fields[0]);
}
By the way, the following is more elegant and possibly faster:
sub trim {
my $string = shift;
$string =~ s/^\s+//;
$string =~ s/\s+\z//;
return $string;
}
And IIRC, the following is even faster (for a drop in readability):
sub trim {
my $string = shift;
$string =~ s/^\s+|\s++\z//g;
return $string;
}

perl: using commas in hash values

I have key-value pairs such as "statement:test,data", where 'test,data' is the value for the hash. While trying to create a hash with such values, Perl splits the values on the comma. Is there a way around this, so that strings with commas can be used as values?
There is nothing in Perl that stops you from using 'test,data' as hash value.
If your incoming string is literally "statement:test,data", you can use this code to add into hash:
my ($key, $value) = ($string =~ /(\w+):(.*)/);
next unless $key and $value; # skip bad stuff - up to you
$hash{$key} = $value;
Perl won't split a string on a comma unless you tell it to.
#!/usr/bin/perl
use v5.16;
use warnings;
use Data::Dump 'ddx';
my $data = "statement:test,data";
my %hash;
my ($key, $value) = split(":", $data);
$hash{$key} = $value;
ddx \%hash;
gives:
# split.pl:14: { statement => "test,data" }
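A related case is a value that itself contains a colon. As a small sketch, the third argument of split limits the number of fields, so only the first ':' acts as the separator. The second input line below is an invented example, not from the question.

```perl
use strict;
use warnings;

my %hash;
for my $line ('statement:test,data', 'note:a,b:c') {   # second line is illustrative
    # Limit of 2: everything after the first ':' stays in $value,
    # commas and further colons included.
    my ($key, $value) = split /:/, $line, 2;
    next unless defined $key and defined $value;
    $hash{$key} = $value;
}

print "$_ => $hash{$_}\n" for sort keys %hash;
```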

how to compare 2 strings by each characters in perl

basically I want to compare
$a = "ABCDE";
$b = "--(-)-";
and get the output CE,
i.e. wherever parentheses occur, the characters of $a at those positions should be taken.
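One of the rare uses of the bitwise string or-operator. (On Perl 5.28 and later, -E enables the bitwise feature, which makes | numeric; use the string form |. there.)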
# magic happens here ↓
perl -E'say (("ABCDE" | "--(-)-" =~ tr/-()/\377\000/r) =~ tr/\377//dr)'
prints CE.
Use this for golfing purposes only, AHA’s solution is much more maintainable.
Simple regex and pos solution:
my $str = "ABCDE";
my $pat = "--(-)-";
my @list;
while ($pat =~ /(?=[()])/g) {
last if pos($pat) >= length($str); # Required to prevent reading outside $str
my $char = substr($str, pos($pat), 1);
push @list, $char;
}
print @list;
Note the use of lookahead to get the position before the matching character.
Combined with Axeman's use of the @- variable we can get an alternative loop:
while ($pat =~ /[()]/g) {
last if $-[0] >= length($str);
my $char = substr($str, $-[0], 1);
push @list, $char;
}
This is pretty much mentioned in the documentation for @-:
After a match against some variable $var :
....
$& is the same as substr($var, $-[0], $+[0] - $-[0])
In other words, the matched string $& equals that substring expression. If you replace $var with another string, you would get the characters matching the same positions.
In my example, the expression $+[0] - $-[0] (offset of end of match minus offset of start of match) would be 1, since that is the max length of the matching regex.
QED.
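To make the identity above concrete, here is a small sketch that applies the same offsets to both strings after a single match. The sub name same_position is hypothetical; the inputs are the two strings from the question.

```perl
use strict;
use warnings;

# same_position is a made-up name for illustration. After a successful match,
# $-[0] is the start offset and $+[0] the end offset of the whole match.
sub same_position {
    my ($str, $pat) = @_;
    return unless $pat =~ /[()]/;
    my ($from, $len) = ($-[0], $+[0] - $-[0]);
    my $matched = substr($pat, $from, $len);   # the same text as $&
    my $taken   = substr($str, $from, $len);   # same offsets, other string
    return ($matched, $taken);
}

my ($matched, $taken) = same_position("ABCDE", "--(-)-");
print "matched '$matched', took '$taken'\n";
```

This handles only the first parenthesis; the while loop with /g above repeats the same step for every match.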
This uses the idea that you can scan one string for positions and just take the characters at those positions from the other string. @s is a reusable product.
use strict;
use warnings;
sub chars {
my $source = shift;
return unless @_;
my @chars = map { substr( $source, $_, 1 ) } @_;
# list in list context, joined string in scalar context
return wantarray ? @chars : join( '', @chars );
}
my $a = "ABCDE";
my $b = "--(-)-";
my @s;
push @s, @- while $b =~ m/[()]/g;
my $res = chars( $a, #s );
Way faster than all the solutions except daxim's, and almost as fast as daxim's without preventing the use of characters 255 and above:
my $pat = $b =~ s/[^()]/.?/gr =~ s/[()]/(.?)/gr;
my $c = join '', $a =~ /^$pat/s;
It changes
--(-)-
to
.?.?(.?).?(.?).?
Then uses the result as regex pattern to extract the desired characters.
This is easy to accomplish using each_array, each_arrayref or pairwise from List::MoreUtils:
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw( min );
use List::MoreUtils qw( each_array );
my $string = 'ABCDE';
my $pattern = '--(-)-';
my @string_chars = split //, $string;
my @pattern_chars = split //, $pattern;
# Equalise length
my $min_length = min $#string_chars, $#pattern_chars;
$#string_chars = $#pattern_chars = $min_length;
my $ea = each_array @string_chars, @pattern_chars;
while ( my ( $string_char, $pattern_char ) = $ea->() ) {
print $string_char if $pattern_char =~ /[()]/;
}
Using pairwise:
{
no warnings qw( once );
print pairwise {
$a if $b =~ /[()]/;
} @string_chars, @pattern_chars;
}
Without using List::MoreUtils:
for ( 0 .. $#string_chars ) {
print $string_chars[$_] if $pattern_chars[$_] =~ /[()]/;
}
Thanks to TLP for discovering the technique of assigning to $#, without which this solution would have been longer and more complicated. :-)
#!/usr/bin/perl
use strict;
use warnings;
my $a = "ABCDE";
my $b = "--(-)-";
my ($i, $c, $x, $y) = 0;
$c .= $y =~ /\(|\)/ ? $x : "" while ($x = substr $a, $i, 1) && ($y = substr $b, $i++, 1);
print "$c\n";
