Should I protect strings given to subroutines?

If I have a variable which has been defined as a string,
my $x = "abc";
sub p { ... }
do I then have to call p("$x"), or can I just do p($x) or p($hash->{x})?
All of them work in my tests. Are there any downsides to not quoting?

Regardless of whether it is used as a subroutine call parameter, it is generally considered bad practice to quote a single scalar variable, as in "$s", for two reasons:
You are unnecessarily making a duplicate of the value.
You may be invoking an overloaded stringify behaviour.
Of course, the second may also be a good reason to do exactly this, if you want the special stringify behaviour.
The only downside of using a bare variable as a subroutine parameter is that, since Perl passes the values by reference, it is possible to modify that value from within the subroutine. However, you would need to modify an element of @_, which is very difficult to do accidentally.
The usual form of a subroutine is this:
sub proc {
    my ($p1, $p2, $p3) = @_;
    # Do stuff with $p1, $p2, $p3
}
in which case you are working with safe copies of the parameters anyway, and modifying them will have no effect on the actual parameters.

p($x) and p($hash->{x}) are fine. You already make a copy of the variable when you do
my ($x) = @_;
or
my $x = shift;
No need to create a copy (using "$x") on the caller's side too.
If you didn't copy the elements, you could have a problem if you changed a global variable in the sub, and you also pass that global variable as an argument to the sub.
$ perl -E'
my $x;
sub f { $x = "def"; say $_[0] }
$x = "abc";
say $x;
f($x);
'
abc
def
But why would you do that? The only plausible instance of this I can think of is the following:
$ perl -E'
sub f { "def" =~ /(.*)/s; say $_[0] }
"abc" =~ /(.*)/s;
say $1;
f($1);
'
abc
def
So maybe f("$1") makes sense sometimes, but that's about it.

Related

Add to integer within string?

I would like to make a string by incrementing a variable within the string.
eg.
$result = "Result: $amount++";
How can this be achieved?
It can be done using trickery.
$result = "Result: ${\( $amount++ )}";
But why would you want to???
$result = "Result: ".$amount++;
If you want to modify a number in a string, you have to use the e modifier of the s operator. This makes Perl evaluate the replacement as an expression.
#! /usr/bin/perl
$_ = "Result: 1\n";
s/\d+/$&+1/e;
print;
It is documented in the Perl manual.
I take it that you have a string that already contains a 'number' (string of digits), and you want to increment that number within.
You'd have to extract the "number" first, in one way or another, since it is merely a string of chars when inside a string; then increment it and join it all back. I'll take it that it is a string of digits bounded by non-digits
my ($pre, $num, $post) = $str =~ m/(\D*)(\d+)(\D*)/;
$str = $pre . ($num+1) . $post;
This makes a critical assumption that the word contains a string of digits in only one place and no digits elsewhere, since if that were not the case the problem would be ill posed.
Just for the curiosity of it, I'd like to add a bit to this. A part of a string can be accessed by substr, and that function can be used as an lvalue (it can be assigned to). So, if you know the starting position and the length of your "number" (which can be found in various ways), you could cram the above process into one statement, if you must
substr($str, $num_beg, $num_len) = substr($str, $num_beg, $num_len) + 1;
or, equally bad
substr($str, $num_beg, $num_len) = ($str =~ m/(\d+)/)[0] + 1;
Now your starting $str string contains the "number" within it incremented. However, this is plain nasty and I cannot recommend any of it. Finally, you can of course find $num_beg and $num_len on the fly, inside of substr, but that is just too much as the poor string would be processed three times in a single statement. (Also, this changes your $str in place, which your question hints is not what you want.)
Added: Regex provides the capability to run code in the replacement part, by using the /e modifier.
my $str = "ah20bah";
$str =~ s/(\d+)/$1+1/e;
say $str; # it's 'ah21bah'
See this in perlrequick and in perlop.

Perl if condition parameters

I have a log file which looks like below:
4680 p4exp/v68 PJIANG-015394 25:34:19 IDLE none
8869 unnamed p4-python R integration semiconductor-project-trunktip-turbolinuxclient 01:33:52 IDLE none
8870 unnamed p4-python R integration remote-trunktip-osxclient 01:33:52
There are many such entries in the same log file; some contain IDLE none at the end while some do not. I would like to retain the ones having "R integration" and "IDLE none" in a hash and ignore the rest. I have tried the following code but am not getting the desired results.
#!/usr/bin/perl
open (FH,'/root/log.txt');
my %stat;
my ($killid, $killid_details);
while ($line = <FH>) {
if ($line =~ m/(\d+)/){
$killid = $1;
}
if ($line =~ /R integration/ and $line =~ /IDLE none/){
$killid_details = $line;
}
$stat{$killid} = {
killid => $killid_details
};
}
close (FH);
I am getting all the lines with R integration (for example I get 8869, 8870 lines) which should not be the case as 8870 should be ignored.
Please let me know if I have made any mistake. I am still learning Perl. Thank you.
I made a few changes in your program:
Always put in use strict; and use warnings;. These will catch 90% of your errors. (Although not this time).
When you open a file, you need to either use or die as in open my $fh, "<", $file or die qq(blah, blah, blah); or use use autodie; (which is now preferred). In your case, if the file didn't open, your program would have continued merrily along. You need to test whether or not the open statement worked.
Note my open statement. I use a variable for the file handle. This is preferred because it's not global, and it's easier to pass into subroutines. Also note I use the three parameter open. This way, you don't run into trouble if your file name begins with some strange character.
When you declare a variable, it's best to do it in scope. This way, variables go out of scope when you no longer need them. I moved the declarations of $killid and $killid_details inside the loop. That way, they no longer exist outside the loop.
You need to be more careful with your regular expressions. What if the phrase IDLE none appears elsewhere in your line? You only want it if it's at the end of the line.
Now, for the issues you had:
You need to chomp lines when you read them. In Perl, the NL at the end of the line is read in. The chomp command removes it.
Your logic was a bit strange. You set $killid if your line had a digit in it (I modified it to look only for digits at the beginning of the line). However, you simply went on your merry way even if killid was not set. In your version, because you declared $killid outside of the loop, it had a value in each loop. Here I go to the next statement if $killid isn't defined.
You had a weird definition for your hash. You were storing a hash reference within a hash. No need for that. I made it a simple hash.
Here it is:
#! /usr/bin/env perl
use strict;
use warnings;
use feature qw(say);
use autodie;
use Data::Dumper;
open my $log_fh, '<', '/root/log.txt';
my %stat;
while (my $line = <$log_fh>) {
chomp $line;
next if not $line =~ /^(\d+)\s+/;
my $killid = $1;
if ($line =~ /R\s+integration/ and $line =~ /IDLE\s+none$/){
my $killid_details = $line;
$stat{$killid} = $killid_details;
}
}
close $log_fh;
say Dumper \%stat;
I think this is probably what you want:
while (<FH>) {
next unless /^(\d+).*R integration.*IDLE none/;
$stat{$1} = $_;
}
The regexp should be anchored to the beginning of the line, so you don't match a number anywhere on the line. There's no need to do multiple regexp matches, assuming the order of R integration and IDLE none is always as in the example. You need to use next when there's no match, so you don't process non-matching lines.
And I suspect that you just want to set the value of the hash entry to the string, not a reference to another hash.

Use of uninitialized value in a Perl script

I have the below program for checking the file availability in a Unix directory.
my $numbera = "c://";
my $numberb = "test1.txt";
check_file_exist($numbera, $numberb);
sub check_file_exist {
my $download_filename;
my ($numbera,$numberb) = @_;
$download_filename = $numbera.$numberb;
print "*** $download_filename ****";
my $mtime = (stat $download_filename)[9];
my $filedatetime = scalar localtime $mtime;
if (-e $download_filename) {
print "Data File Exist which is created on $filedatetime";
}
unless (-e $download_filename) {
print "File not exists";
}
}
while running the program I am getting the below error:
*** data_file=HASH(0xa912f0)/home1/saravanan/ ****
Use of uninitialized value in localtime at /home1/saravanan/data_file.pl
First, always put these in your program:
use strict;
use warnings;
When you use strict, you will have to declare your variables with either my or our (HINT: You use my about 99.99% of the time).
These will catch all sorts of errors in your script.
Also, use indentation. It makes your script easier to read. It is also bad form to print output inside your subroutine (unless that is the purpose of your subroutine). Instead, have your subroutine return a value (or return nothing), and then display that.
Your problem is that you were attempting to stat a file before you knew whether it exists or not. You need to put your stat inside your if statement where you check for the file's existence.
I've made a few changes besides what I stated above:
I use say instead of print. If you use print, you have to put in a terminating \n. The say command does this for you.
I pull in my parameters as soon as I enter the subroutine (and use better variable names than $numbera and $numberb).
I use if/then/else instead of doing an if and then an unless with the same test. I no longer use unless in most circumstances. It's simply clearer to say if ( not ... ).
The subroutine either returns a datestamp or returns nothing. I check for the return value of the subroutine with my if statement.
Here's your program updated a bit:
use warnings;
use strict;
use autodie;
use feature qw(say);
use Data::Dumper;
my $numbera = "/Users/david";
my $numberb = ".profile";
if ( my $timestamp = check_file_exist( $numbera, $numberb ) ) {
say "The file was downloaded at $timestamp";
}
else {
say "The file does not exist";
}
sub check_file_exist {
    my $directory = shift;
    my $file_name = shift;
    my $download_filename = "$directory/$file_name";
    my @stat = stat($download_filename);
    if (not @stat) {
        return;
    }
    my $mtime = $stat[9];
    return scalar localtime $mtime;
}

Bash, referring to array by value?

Is there some way to access a variable by referring to it by a value?
BAR=("hello", "world")
function foo() {
DO SOME MAGIC WITH $1
// Output the value of the array $BAR
}
foo "BAR"
Perhaps what you're looking for is indirect expansion. From man bash:
If the first character of parameter is an exclamation point (!), a level of variable indirection is introduced. Bash uses the
value of the variable formed from the rest of parameter as the name of the variable; this variable is then expanded and that value
is used in the rest of the substitution, rather than the value of parameter itself. This is known as indirect expansion. The
exceptions to this are the expansions of ${!prefix*} and ${!name[@]} described below. The exclamation point must immediately follow
the left brace in order to introduce indirection.
Related docs: Shell parameter expansion (Bash Manual) and Evaluating indirect/reference variables (BashFAQ).
Here's an example.
$ MYVAR="hello world"
$ VARNAME="MYVAR"
$ echo ${!VARNAME}
hello world
Note that indirect expansion for arrays is slightly cumbersome (because ${!name[@]} means something else; see the linked docs above):
$ BAR=("hello" "world")
$ v="BAR[@]"
$ echo ${!v}
hello world
$ v="BAR[0]"
$ echo ${!v}
hello
$ v="BAR[1]"
$ echo ${!v}
world
To put this in context of your question:
BAR=("hello" "world")
function foo() {
ARR="${1}[@]"
echo ${!ARR}
}
foo "BAR" # prints out "hello world"
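One thing the unquoted echo above hides is word splitting: an element containing spaces would be broken apart. A minimal sketch of the same idea with the expansion quoted, so each element comes through intact. The function name print_array and the sample data are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each element of the array whose name is in $1.
print_array() {
    local ref="${1}[@]"          # build "NAME[@]" for indirect expansion
    local item
    for item in "${!ref}"; do    # quoted, so elements are not word-split
        printf '%s\n' "$item"
    done
}

BAR=("hello world" "again")
print_array BAR                  # prints one element per line
```

Here "hello world" stays a single element because "${!ref}" is expanded inside double quotes.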
Caveats:
Indirect expansion of the array syntax will not work in older versions of bash (pre v3). See BashFAQ article.
It appears you cannot use it to retrieve the array size: ARR="#${1}[@]" will not work. You can, however, work around this by making a copy of the array if it is not prohibitively large. For example:
function foo() {
    ORI_ARRNAME="${1}[@]"
    local -a ARR=("${!ORI_ARRNAME}") # make a local copy of the array (quoted to keep elements intact)
    # you can now use ARR as the array
    echo ${#ARR[@]} # get size
    echo ${ARR[1]} # print 2nd element
}
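If you can assume Bash 4.3 or later (older versions do not have namerefs), a nameref is another way to get at the array, and it gives you the size without making a copy. This is a different technique from indirect expansion, shown only as a sketch. The name array_info is made up:

```shell
#!/usr/bin/env bash
# Hypothetical helper; needs Bash 4.3+ for `local -n`.
array_info() {
    local -n arr_ref="$1"        # arr_ref becomes an alias for the named array
    echo "${#arr_ref[@]}"        # number of elements
    echo "${arr_ref[1]}"         # second element
}

BAR=("hello" "world")
array_info BAR
```

One thing to watch: the caller's array must not itself be called arr_ref, or the nameref ends up pointing at itself.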
BAR=("hello" "world")
function foo() {
    eval echo "\${$1[@]}"
}
foo "BAR"
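Since eval runs whatever string it is handed, it may be worth checking that the argument really looks like a variable name before using it. A sketch of that idea. The name check and the function name safe_foo are my additions, not part of the answer above:

```shell
#!/usr/bin/env bash
# Hypothetical variant of foo that rejects anything but a plain identifier.
safe_foo() {
    if [[ ! $1 =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]; then
        echo "invalid array name: $1" >&2
        return 1
    fi
    eval "echo \"\${$1[@]}\""    # after the check, this becomes: echo "${NAME[@]}"
}

BAR=("hello" "world")
safe_foo BAR
```

Something like safe_foo 'BAR; rm -rf ~' is then refused instead of being executed.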
You can put your arrays into a dictionary matched with their names. Then you can look up this dictionary to find your array and display its contents.

How explicitly resolve variables in a perl string?

In my perl script I want to have both versions of $config directory:
my $config='$home/client/config';
and
my $config_resolved="$home/client/config";
But I want to get $config_resolved from $config, i.e. something like this:
my $config_resolved=resolve_vars($config);
How can I do such thing in perl?
From the Perl FAQ (which every Perl programmer should read at least once):
How can I expand variables in text strings?
(contributed by brian d foy)
If you can avoid it, don't, or if you can
use a templating system, such as Text::Template or Template Toolkit,
do that instead. You might even be able to get the job done with
sprintf or printf:
my $string = sprintf 'Say hello to %s and %s', $foo, $bar;
However, for the one-off simple case where I don't want to pull out a
full templating system, I'll use a string that has two Perl scalar
variables in it. In this example, I want to expand $foo and $bar to
their variable's values:
my $foo = 'Fred';
my $bar = 'Barney';
$string = 'Say hello to $foo and $bar';
One way I can do this involves the substitution operator and a double /e flag. The
first /e evaluates $1 on the replacement side and turns it into $foo. The
second /e starts with $foo and replaces it with its value. $foo,
then, turns into 'Fred', and that's finally what's left in the string:
$string =~ s/(\$\w+)/$1/eeg; # 'Say hello to Fred and Barney'
The /e will also silently ignore violations of strict, replacing undefined
variable names with the empty string. Since I'm using the /e flag
(twice even!), I have all of the same security problems I have with
eval in its string form. If there's something odd in $foo, perhaps
something like @{[ system "rm -rf /" ]}, then I could get myself in
trouble.
To get around the security problem, I could also pull the
values from a hash instead of evaluating variable names. Using a
single /e, I can check the hash to ensure the value exists, and if it
doesn't, I can replace the missing value with a marker, in this case
??? to signal that I missed something:
my $string = 'This has $foo and $bar';
my %Replacements = (
foo => 'Fred',
);
# $string =~ s/\$(\w+)/$Replacements{$1}/g;
$string =~ s/\$(\w+)/
exists $Replacements{$1} ? $Replacements{$1} : '???'
/eg;
print $string;
I use eval for this.
So, you must replace all scalars (their names) with their values.
$config = 'stringone';
$boo = '$config/any/string';
$boo =~ s/(\$\w+)/eval($1)/eg;
print $boo;
Because you are using my to declare it as a private variable, you might as well use the /ee modifier. This can find variables declared in the local scope:
$boo =~ s/(\$\w+)/$1/eeg;
This is most tidily and safely done by the double-eval modifier on s///.
In the program below, the first /e evaluates the string $1 to get $home, while the second evaluates $home to get the variable's value HOME.
use strict;
my $home = 'HOME';
my $config = '$home/client/config';
my $config_resolved = resolve_vars($config);
print $config_resolved, "\n";
sub resolve_vars {
(my $str = shift) =~ s/(\$\w+)/$1/eeg;
return $str;
}
output
HOME/client/config

Resources