How to tell apart numeric scalars and string scalars in Perl? - string

Perl usually converts numeric to string values and vice versa transparently. Yet there must be something which allows e.g. Data::Dumper to discriminate between both, as in this example:
use Data::Dumper;
print Dumper('1', 1);
# output:
$VAR1 = '1';
$VAR2 = 1;
Is there a Perl function which allows me to discriminate in a similar way whether a scalar's value is stored as number or as string?

A scalar has a number of different fields. When using Perl 5.8 or higher, Data::Dumper inspects if there's anything in the IV (integer value) field. Specifically, it uses something similar to the following:
use B qw( svref_2object SVf_IOK );
sub create_data_dumper_literal {
my ($x) = @_; # This copying is important as it "resolves" magic.
return "undef" if !defined($x);
my $sv = svref_2object(\$x);
my $iok = $sv->FLAGS & SVf_IOK;
return "$x" if $iok;
$x =~ s/(['\\])/\\$1/g;
return "'$x'";
}
Checks:
Signed integer (IV): ($sv->FLAGS & SVf_IOK) && !($sv->FLAGS & SVf_IVisUV)
Unsigned integer (IV): ($sv->FLAGS & SVf_IOK) && ($sv->FLAGS & SVf_IVisUV)
Floating-point number (NV): $sv->FLAGS & SVf_NOK
Downgraded string (PV): ($sv->FLAGS & SVf_POK) && !($sv->FLAGS & SVf_UTF8)
Upgraded string (PV): ($sv->FLAGS & SVf_POK) && ($sv->FLAGS & SVf_UTF8)
You could use similar tricks. But keep in mind,
It'll be very hard to stringify floating point numbers without loss.
You need to properly escape certain bytes (e.g. NUL) in string literals.
A scalar can have more than one value stored in it. For example, !!0 contains a string (the empty string), a floating point number (0) and a signed integer (0). As you can see, the different values aren't even always equivalent. For a more dramatic example, check out the following:
$ perl -E'open($fh, "non-existent"); say for 0+$!, "".$!;'
2
No such file or directory
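As a rough illustration of the flag checks listed above, the following sketch reports which of the IOK, NOK and POK flags are set on a copy of a value. The helper name storage_flags is made up for this example; only svref_2object and the SVf_* constants come from B.

```perl
use strict;
use warnings;
use B qw( svref_2object SVf_IOK SVf_NOK SVf_POK );

# Hypothetical helper: list the "public" value flags set on a copy of the argument.
sub storage_flags {
    my ($x) = @_;                       # copy, so magic is resolved first
    return 'undef' if !defined $x;
    my $flags = svref_2object(\$x)->FLAGS;
    my @set;
    push @set, 'IOK' if $flags & SVf_IOK;   # integer slot is valid
    push @set, 'NOK' if $flags & SVf_NOK;   # floating-point slot is valid
    push @set, 'POK' if $flags & SVf_POK;   # string slot is valid
    return join ',', @set;
}

print storage_flags($_), "\n" for '1', 1, 1.5;
```

Keep in mind that the result reflects how the scalar happens to be stored at that moment, not what the value means.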

It is more complicated. Perl changes the internal representation of a variable depending on the context the variable is used in:
perl -MDevel::Peek -e '
$x = 1; print Dump $x;
$x eq "a"; print Dump $x;
$x .= q(); print Dump $x;
'
SV = IV(0x794c68) at 0x794c78
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 1
SV = PVIV(0x7800b8) at 0x794c78
REFCNT = 1
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 1
PV = 0x785320 "1"\0
CUR = 1
LEN = 16
SV = PVIV(0x7800b8) at 0x794c78
REFCNT = 1
FLAGS = (POK,pPOK)
IV = 1
PV = 0x785320 "1"\0
CUR = 1
LEN = 16

There's no way to find this out using pure Perl. Data::Dumper uses its C (XS) implementation to achieve it. If forced to use its pure-Perl implementation, it doesn't discriminate strings from numbers if they look like decimal numbers.
use Data::Dumper;
$Data::Dumper::Useperl = 1;
print Dumper(['1',1])."\n";
#output
$VAR1 = [
1,
1
];

Based on your comment that this is to determine whether quoting is needed for an SQL statement, I would say that the correct solution is to use placeholders, which are described in the DBI documentation.
As a rule, you should not interpolate variables directly in your query string.

One simple solution that wasn't mentioned was Scalar::Util's looks_like_number. Scalar::Util is a core module since 5.7.3 and looks_like_number uses the perlapi to determine if the scalar is numeric.
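Note that looks_like_number inspects the value rather than how it is stored, so the string '1' and the number 1 both count as numeric. A small sketch (the helper name describe and the sample values are chosen only for illustration):

```perl
use strict;
use warnings;
use Scalar::Util qw( looks_like_number );

# Hypothetical helper: classify a value by its content, not its storage.
sub describe {
    my ($value) = @_;
    return looks_like_number($value) ? 'numeric' : 'not numeric';
}

printf "%-6s %s\n", "'$_'", describe($_) for '1', 1, '1e5', 'abc', '';
```

This answers "can this be used as a number?", which is a different question from the one Data::Dumper answers.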

The autobox::universal module, which comes with autobox, provides a type function which can be used for this purpose:
use autobox::universal qw(type);
say type("42"); # STRING
say type(42); # INTEGER
say type(42.0); # FLOAT
say type(undef); # UNDEF

When a variable is used as a number, that causes the variable to be presumed numeric in subsequent contexts. However, the reverse isn't exactly true, as this example shows:
use Data::Dumper;
my $foo = '1';
print Dumper $foo; #character
my $bar = $foo + 0;
print Dumper $foo; #numeric
$bar = $foo . ' ';
print Dumper $foo; #still numeric!
$foo = $foo . '';
print Dumper $foo; #character
One might expect the third operation to put $foo back in a string context (reversing $foo + 0), but it does not.
If you want to check whether something is a number, the standard way is to use a regex. What you check for varies based on what kind of number you want:
if ($foo =~ /^\d+$/) { print "positive integer" }
if ($foo =~ /^-?\d+$/) { print "integer" }
if ($foo =~ /^\d+\.\d+$/) { print "Decimal" }
And so on.
It is not generally useful to check how something is stored internally--you typically don't need to worry about this. However, if you want to duplicate what Dumper is doing here, that's no problem:
if ((Dumper $foo) =~ /'/) {print "character";}
If the output of Dumper contains a single quote, that means it is showing a variable that is represented in string form.

You might want to try Params::Util::_NUMBER:
use Params::Util qw<_NUMBER>;
unless ( _NUMBER( $scalar ) or $scalar =~ /^'.*'$/ ) {
$scalar =~ s/'/''/g;
$scalar = "'$scalar'";
}

The following function returns true (1) if the input is numeric and false ("") if it is a string. The function also returns true (-1) if the input is a numeric Inf or NaN. Similar code can be found in the JSON::PP module.
sub is_numeric {
my $value = shift;
no warnings 'numeric';
# string & "" -> ""
# number & "" -> 0 (with warning)
# nan and inf can detect as numbers, so check with * 0
return unless length((my $dummy = "") & $value);
return unless 0 + $value eq $value;
return 1 if $value * 0 == 0; # finite number
return -1; # inf or nan
}

I don't think there is a Perl function to find the type of a value. You can find the type of a data structure (scalar, array, hash), and you can use a regex to classify a value.

Related

Linux Perl - Conversion of Hex Value to Decimal

I am developing a script that converts collected hex values (I don't know their bit width) to decimal values.
One example is the hex value: fef306da
If I convert it, I get 4277339866.
The website where I found the expected value (under "Decimal from signed 2's complement"):
https://www.rapidtables.com/convert/number/hex-to-decimal.html
Does anyone have a solution for converting hex fef306da to decimal -17627430?
Note: I get the wrong value whenever the hex value represents a negative decimal number.
Thanks all!
Look at pack and use modifiers for unsigned and signed values.
my $hex_value = "fef306da";
my $output_num = unpack('l', pack('L', hex($hex_value)));
print $output_num; ## -17627430
Perform a test on each hex value to determine if it is a 16bit or 32bit value.
Then use the correct modifier with pack for long or short values.
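A minimal sketch of that idea, assuming the width can be inferred from the number of hex digits (up to 4 digits is treated as 16-bit, up to 8 as 32-bit). That rule and the helper name hex_to_signed are assumptions for illustration; if your data has fixed-width fields, it is safer to pass the width explicitly.

```perl
use strict;
use warnings;

# Hypothetical helper: reinterpret a hex string as a signed 16- or 32-bit integer.
sub hex_to_signed {
    my ($hex) = @_;
    die "not a hex value of at most 8 digits: $hex\n"
        unless $hex =~ /\A[0-9a-fA-F]{1,8}\z/;
    # 's'/'S' are signed/unsigned 16-bit, 'l'/'L' are signed/unsigned 32-bit.
    my ($signed, $unsigned) = length($hex) <= 4 ? ('s', 'S') : ('l', 'L');
    return unpack($signed, pack($unsigned, hex($hex)));
}

print hex_to_signed($_), "\n" for qw( fef306da ffff 7fff );
```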
It seems that you expect your decimal to be a 32-bit signed integer, but hex($n) returns a 64-bit one,
so you may try repacking it:
perl -e 'print unpack "l", pack "L", hex( "fef306da" )'
If you are interested in the binary conversion, check the following code (fef306da is a 32-bit number):
use strict;
use warnings;
use feature 'say';
my $input = 'fef306da';
my $hex = hex($input);
my $dec;
if( $hex & 0x80000000 ) {
$dec = -1 * ((~$hex & 0x7fffffff)+1);
} else {
$dec = $hex;
}
say $dec;
Output
-17627430
Tip: Two's complement
You could use pack
my $hex = "fef306da";
my $num = hex($hex);
$num = unpack("l", pack("L", $num));
say $num; # -17627430
or
my $hex = "fef306da";
$hex = substr("00000000$hex", -8); # Pad to 8 chars
my $num = unpack("l>", pack("H8", $hex));
say $num; # -17627430
But simple arithmetic will do.
my $hex = "fef306da";
my $num = hex($hex);
$num -= 0x1_0000_0000 if $num >= 0x8000_0000;
say $num; # -17627430

convert 0 into string

I'm working on a script in Perl.
This script reads a DB and generates config files for other devices.
I have a problem with "0".
From my database, I get a 0 (int) and I want this 0 to become a "0" in the config file. When I get any other value (1, 2, 3, etc.), the script generates "1", "2", "3", etc. But the 0 becomes an empty string "".
I know that, for Perl:
- undef
- 0
- ""
- "0"
are false.
How can I convert a 0 to "0"? I tried qw, qq, sprintf, $x = $x || 0, and many more solutions.
I just want to make an explicit conversion instead of an implicit conversion.
Thank you for your help.
If you think you have zero, but the program thinks you have an empty string, you are probably dealing with a dualvar. A dualvar is a scalar that contains both a string and a number. Perl usually returns a dualvar when it needs to return false.
For example,
$ perl -we'my $x = 0; my $y = $x + 1; CORE::say "x=$x"'
x=0
$ perl -we'my $x = ""; my $y = $x + 1; CORE::say "x=$x"'
Argument "" isn't numeric in addition (+) at -e line 1.
x=
$ perl -we'my $x = !1; my $y = $x + 1; CORE::say "x=$x"'
x=
As you can see, the value returned by !1 acts as zero when used as a number, and acts as an empty string when used as a string.
To convert this dualvar into a number (leaving other numbers unchanged), you can use the following:
$x ||= 0;
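To see the effect, here is a small sketch; the helper name false_to_zero is hypothetical. Be aware that ||= also turns undef and the empty string into 0, which may or may not be what you want.

```perl
use strict;
use warnings;

# Hypothetical helper: replace any false value (including the special
# false value returned by !1) with a plain 0; true values pass through.
sub false_to_zero {
    my ($x) = @_;
    $x ||= 0;
    return $x;
}

my $false = !1;
print "before: '$false'\n";                        # prints before: ''
print "after:  '", false_to_zero($false), "'\n";   # prints after:  '0'
print "other:  '", false_to_zero(3), "'\n";        # prints other:  '3'
```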

Perl - Searching values in a log file and store/print them as a string.

I would like to search for the values that follow a specific phrase (Current value = ) in a log file and build a string from those values.
vcs_output.log: a log file
** Fault injection **
Count = 1533
0: Path = cmp_top.iop.sparc0.exu.alu.byp_alu_rcc_data_e[6]
0: Current value = x
1: Path = cmp_top.iop.sparc0.exu.alu.byp_alu_rs3_data_e[51]
1: Current value = x
2: Path = cmp_top.iop.sparc0.exu.alu.byp_alu_rs1_data_e[3]
2: Current value = 1
3: Path = cmp_top.iop.sparc0.exu.alu.shft_alu_shift_out_e[18]
3: Current value = 0
4: Path = cmp_top.iop.sparc0.exu.alu.byp_alu_rs3_data_e[17]
4: Current value = x
5: Path = cmp_top.iop.sparc0.exu.alu.byp_alu_rs1_data_e[43]
5: Current value = 0
6: Path = cmp_top.iop.sparc0.exu.alu.byp_alu_rcc_data_e[38]
6: Current value = x
7: Path = cmp_top.iop.sparc0.exu.alu.byp_alu_rs2_data_e_l[30]
7: Current value = 1
.
.
.
If I store the values after "Current value = ", I get x,x,1,0,x,0,x,1. I ultimately want to save/print them as a string such as xx10x0x1.
Here is my code
code.pl:
#!/usr/bin/perl
use strict;
use warnings;
##### Read input
open ( my $input_fh, '<', 'vcs_output.log' ) or die $!;
chomp ( my @input = <$input_fh> );
my $i=0;
my @arr;
while (@input) {
if (/Current value = /)
$arr[i]= $input; # put the matched value to array
}
}
## make a string from the array using an additional loop
close ( $input_fh );
I think there is a way to build the string in one loop (or even without a loop). Please advise me on how to do it. Any suggestion is appreciated.
You can do both of the things you ask for.
To build a string directly, just append to it what you capture in the regex
my $string = '';
while (<$input_fh>)
{
my ($val) = /Current\s*value\s*=\s*(.*)/;
$string .= $val if defined $val;
}
If the match fails then $val is undef, so those lines are skipped. You can also write the whole while loop in one line
$string .= (/Current\s*value\s*=\s*(.*)/)[0] // '' while <$input_fh>;
but I don't see why that would be necessary. Note that this reads from the filehandle, and line by line. There is no reason to first read all lines into an array.
To avoid (explicit) looping, you can read all lines and pass them through map, naively as
my $string = join '',
map { (/Current\s*value\s*=\s*(.*)/) ? $1 : () } <$input_fh>;
Since map needs a list, the filehandle is in list context, returning the list of all lines in the file. Then each is processed by code in map's block, and its output list is then joined.
The trick map { ($test) ? $val : () } uses map to also do grep's job, to filter -- the empty list that is returned if $test fails is flattened into the output list, thus disappearing. The "test" here is the regex match, which in the scalar context returns true/false, while the capture sets $1.
But, like above, we can return the first element of the list that match returns, instead of testing whether the match was successful. And since we are in map we can in fact return the "whole" list
my $string = join '',
map { /Current\s*value\s*=\s*(.*)/ } <$input_fh>;
which may be clearer here.
Comments on the code in the question
the while (@input) is an infinite loop, since @input never gets depleted. You'd need foreach (@input) -- but it is better to just read the filehandle, while (<$input_fh>)
your regex does match on a line with that string, but it doesn't attempt to match the pattern that you need (what follows =). Once you add that, it needs to be captured as well, by ()
you can assign to the i-th element (which should be $i) but then you'd have to increment $i as you go. Most of the time it is better to just push @array, $value
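Putting the pieces together, here is a self-contained sketch that reads from an in-memory filehandle so it can be run without the log file. The sample lines are taken from the question, the helper name collect_values is made up, and \S+ is used instead of .* on the assumption that each value is a single token.

```perl
use strict;
use warnings;

# Hypothetical helper: join every value that follows "Current value =".
# In list context a successful match returns its captures and a failed
# match returns an empty list, so non-matching lines simply disappear.
sub collect_values {
    my ($fh) = @_;
    return join '', map { /Current\s*value\s*=\s*(\S+)/ } <$fh>;
}

my $log = <<'END';
0: Path = cmp_top.iop.sparc0.exu.alu.byp_alu_rcc_data_e[6]
0: Current value = x
2: Path = cmp_top.iop.sparc0.exu.alu.byp_alu_rs1_data_e[3]
2: Current value = 1
3: Path = cmp_top.iop.sparc0.exu.alu.shft_alu_shift_out_e[18]
3: Current value = 0
END

open my $fh, '<', \$log or die "cannot open in-memory file: $!";
print collect_values($fh), "\n";    # prints x10
close $fh;
```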
You can use capturing parentheses to grab the string you want:
use strict;
use warnings;
my @arr;
open ( my $input_fh, '<', 'vcs_output.log' ) or die $!;
while (<$input_fh>) {
if (/Current value = (.)/) {
push @arr, $1;
}
}
close ( $input_fh );
print "@arr\n";
__END__
x x 1 0 x 0 x 1
Use grep and perlre
http://perldoc.perl.org/functions/grep.html
http://perldoc.perl.org/perlre.html
If on a non-Unix environment then...
#!/usr/bin/perl -w
use strict;
open (my $fh, '<', "vcs_output.log");
chomp (my @lines = <$fh>);
# Filter for lines which contain string 'Current value'
@lines = grep{/Current value/} @lines;
# Substitute out what we don't want... leaving us with the 'xx10x0x1'
@lines = map { $_ =~ s/.*Current value = //;$_} @lines;
my $str = join('', @lines);
print $str;
Otherwise...
#!/usr/bin/perl -w
use strict;
my $output = `grep "Current value" vcs_output.log | sed 's/.*Current value = //'`;
$output =~ s/\n//g;
print $output;

Perl: Transfer substring positions between two strings

I'm writing a Perl program and I've got the following problem:
I have a large list of start and end positions in a string. These positions correspond to substrings of this string. I now want to transfer these positions to a second string. This second string is identical to the first string, except that it has additional hyphens.
Example for the original string: "ABCDEF" and one substring "BCDE"
What I have:
Positions of the substring in the original string: Start = 1, End = 4
The original string with additional hyphens: "-AB---CD--E-F---"
What I want:
Positions of the substring in the hyphenated string: Start = 2, End = 10
I have a large list of these substring positions.
I strongly suspect that you have shown a reduced version of the problem, in which case any solution may not work for the real situation.
However, it seems simplest to build a regex by interspersing -* (i.e. zero or more hyphens) between characters.
This program works that way, building a regex of B-*C-*D-*E and comparing it to both of your sample strings.
use strict;
use warnings;
my @strings = qw/ ABCDEF -AB---CD--E-F--- /;
my ($start, $end) = (1, 4);
my $substr = substr $strings[0], $start, $end-$start + 1;
my $regex = join '-*', split //, $substr;
$regex = qr/$regex/;
for my $string (@strings) {
if ($string =~ $regex) {
printf "Substring found at %d to %d in string %s\n", $-[0], $+[0]-1, $string;
}
}
output
Substring found at 1 to 4 in string ABCDEF
Substring found at 2 to 10 in string -AB---CD--E-F---
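Since the question mentions a large list of positions, a different approach (not the regex technique above) is to build a lookup table once, mapping each position in the original string to its position in the hyphenated string. This sketch assumes the two strings differ only by the added hyphens and that the original contains no hyphens itself; the helper name build_position_map is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: element N of the returned array is the index, in the
# hyphenated string, of the N-th non-hyphen character.
sub build_position_map {
    my ($hyphenated) = @_;
    my @map;
    for my $i (0 .. length($hyphenated) - 1) {
        push @map, $i if substr($hyphenated, $i, 1) ne '-';
    }
    return \@map;
}

my $map = build_position_map('-AB---CD--E-F---');
my ($start, $end) = (1, 4);
printf "Start=%d, End=%d\n", $map->[$start], $map->[$end];   # Start=2, End=10
```

Each further pair of positions is then two array lookups, with no additional matching.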
Does this work for you? It just searches for the characters specified by start and end in the hyphenated string and returns their indices.
sub hyphen_substrings {
my $original = shift;
my $hyphenated = shift;
my @substrings = @_;
my @return;
for my $substring (@substrings) {
my ($start, $end) = @{$substring}[0, 1];
my $start_h = index $hyphenated, substr $original, $start, 1;
my $end_h = index $hyphenated, substr $original, $end, 1;
push @return, [$start_h, $end_h];
}
return @return;
}
use strict;
use warnings;
my $theStringGivenAsAnInputExample="-AB---CD--E-F---";
my $start=1;
my $end=4;
my $theStringGivenAsAnotherInput="ABCDEF";
# substr takes a length as its third argument, not an end position
my $regexp=join("-*",split("",substr($theStringGivenAsAnotherInput,$start,$end-$start+1)));
$theStringGivenAsAnInputExample =~ /$regexp/p;
print ${^PREMATCH},"\n";
print ${^POSTMATCH},"\n";
print ${^MATCH},"\n";
my $startPosition = length(${^PREMATCH});
my $finishPosition = length(${^PREMATCH})+length(${^MATCH})-1;
print "start, $startPosition finish, $finishPosition\n";

Why is my word frequency counter example written in Perl failing to produce useful output?

I am very new to Perl, and I am trying to write a word frequency counter as a learning exercise.
However, I am not able to figure out the error in my code below, after working on it. This is my code:
$wa = "A word frequency counter.";
@wordArray = split("",$wa);
$num = length($wa);
$word = "";
$flag = 1; # 0 if previous character was an alphabet and 1 if it was a blank.
%wordCount = ("null" => 0);
if ($num == -1) {
print "There are no words.\n";
} else {
print "$length";
for $i (0 .. $num) {
if(($wordArray[$i]!=' ') && ($flag==1)) { # start of a new word.
print "here";
$word = $wordArray[$i];
$flag = 0;
} elsif ($wordArray[$i]!=' ' && $flag==0) { # continuation of a word.
$word = $word . $wordArray[$i];
} elsif ($wordArray[$i]==' '&& $flag==0) { # end of a word.
$word = $word . $wordArray[$i];
$flag = 1;
$wordCount{$word}++;
print "\nword: $word";
} elsif ($wordArray[$i]==" " && $flag==1) { # series of blanks.
# do nothing.
}
}
for $i (keys %wordCount) {
print " \nword: $i - count: $wordCount{$i} ";
}
}
It's neither printing "here", nor the words. I am not worried about optimization at this point, though any input in that direction would also be much appreciated.
This is a good example of a problem where Perl will help you work out what's wrong if you just ask it for help. Get used to always adding the lines:
use strict;
use warnings;
to the top of your Perl programs.
First off,
$wordArray[$i]!=' '
should be
$wordArray[$i] ne ' '
according to the Perl documentation for comparing strings and characters. Basically use numeric operators (==, >=, …) for numbers, and string operators for text (eq, ne, lt, …).
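This also explains why "here" is never printed: with !=, both operands are converted to numbers, and non-numeric strings such as 'A' and ' ' both become 0, so they never compare as different. A small sketch (the helper names are made up for illustration):

```perl
use strict;
use warnings;

# Compare with the numeric operator; non-numeric strings are treated as 0.
sub differs_numerically {
    my ($left, $right) = @_;
    no warnings 'numeric';              # silence "isn't numeric" for the demo
    return $left != $right ? 1 : 0;
}

# Compare with the string operator.
sub differs_as_strings {
    my ($left, $right) = @_;
    return $left ne $right ? 1 : 0;
}

print differs_numerically('A', ' '), "\n";   # prints 0
print differs_as_strings('A', ' '), "\n";    # prints 1
```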
Also, you could do
@wordArray = split(" ",$wa);
instead of
@wordArray = split("",$wa);
and then you wouldn't need to do the wonky character checking and you would never have had the problem. @wordArray will already contain the words and you'll just have to count the occurrences.
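A minimal sketch of that approach, counting whitespace-separated words with a hash. The helper name word_frequencies is hypothetical, and punctuation is left attached to the words (so "counter." is counted as written).

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each whitespace-separated word occurs.
sub word_frequencies {
    my ($text) = @_;
    my %count;
    $count{$_}++ for split ' ', $text;   # split ' ' also skips leading blanks
    return \%count;
}

my $freq = word_frequencies("A word frequency counter.");
print "word: $_ - count: $freq->{$_}\n" for sort keys %$freq;
```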
You seem to be writing C in Perl. The difference is not just one of style. By exploding a string into an array of individual characters, you cause the memory footprint of your script to explode as well.
Also, you need to think about what constitutes a word. Below, I am not suggesting that any \w+ is a word, rather pointing out the difference between \S+ and \w+.
#!/usr/bin/env perl
use strict; use warnings;
use YAML;
my $src = '$wa = "A word frequency counter.";';
print Dump count_words(\$src, 'w');
print Dump count_words(\$src, 'S');
sub count_words {
my $src = shift;
my $class = sprintf '\%s+', shift;
my %counts;
while ($$src =~ /(?<sequence> $class)/gx) {
$counts{ $+{sequence} } += 1;
}
return \%counts;
}
Output:
---
A: 1
counter: 1
frequency: 1
wa: 1
word: 1
---
'"A': 1
$wa: 1
=: 1
counter.";: 1
frequency: 1
word: 1

Resources