Get a hash's key as a string in Perl

I do not understand why $test is '1' and not 'foo'. Is there any way to write 'foo' (written in %hash) to $test?
#!/usr/bin/perl
use warnings;
use strict;
use utf8;
my %hash;
$hash{'test'} = {'foo' => 1};
print keys %{$hash{'test'}}; #foo
print "\n";
print values %{$hash{'test'}}; #1
print "\n";
print %{$hash{'test'}}; #foo1
print "\n";
# here is the part I do not understand
my $test = keys %{$hash{'test'}};
print "$test\n"; #1 but why? I was expecting #foo
How can I push 'foo' in $test?

keys() returns a list. But your assignment to a scalar (my $test = keys ...) puts it into scalar context. Therefore it is evaluated to the length of the list, which is 1 in your case.

my $test = keys %{$hash{'test'}};
When a call like keys is assigned to a scalar, what gets assigned is the number of elements it would return as a list, which is 1 in this case.

When the return value from a Perl function confuses you, it's always worth checking the documentation for that function - paying particular attention to the section talking about how the return value varies in different contexts.
The very start of perldoc -f keys says this (emphasis mine):
keys HASH
keys ARRAY
Called in list context, returns a list consisting of all the keys of the named hash, or in Perl 5.12 or later only, the indices of an array. Perl releases prior to 5.12 will produce a syntax error if you try to use an array argument. In scalar context, returns the number of keys or indices.
You're assigning the results of the expression to a scalar variable. The expression is, therefore, evaluated in scalar context and you get the number of keys in the hash (i.e. 1).
To fix that, force the expression to be evaluated in list context by putting parentheses around the variable:
my ($test) = keys %{$hash{'test'}};
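To make the difference between the two contexts concrete, here is a minimal sketch using the same hash as in the question. The variable names $count, $first and @all are only illustrative and do not appear in the original code.

```perl
use strict;
use warnings;

my %hash = ( test => { foo => 1 } );

# Scalar context: keys returns the number of keys.
my $count = keys %{ $hash{test} };

# List context (note the parentheses): the first key returned is assigned.
my ($first) = keys %{ $hash{test} };

# List context: every key ends up in the array.
my @all = keys %{ $hash{test} };

print "$count $first @all\n";   # prints "1 foo foo"
```

With more than one key in the inner hash, the order in which keys returns them is not guaranteed, so the parenthesized form only makes sense when you know there is exactly one key or do not care which one you get.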

Related

TCL string map for ( to \(

how to do this
puts [string map { ( ) "\(" "\)"} (3.8.001)]
The output I'm getting:
$ tclsh main.tcl
(3.8.001)
I'm expecting
\(3.8.001\)
help me to do this
You should use the string map as follows,
puts [string map { ( "\\(" ) "\\)"} (3.8.001)]
Backslash has to be used twice, to have a single backslash when used inside double quotes in Tcl.
Whenever I'm confused about how exactly to write a complex string map with backslashes involved, I try building the mapping list with list. I might then use the literal it produces rather than having my script contain the actual list command call, but that's purely optimisation on my part. (And a very low value one; the bytecode compiler does it for me if all the arguments to list are literals.) In particularly tricky cases, I'll build it by stages with lappend, but that's only where what is going on is a true head-scratcher!
Also, the mapping is supposed to be “replaceA withA replaceB withB ...”; you were putting ) and "\(" in the wrong order, and the result would not have been expected to work at all.
set mapping [list "(" "\\(" ")" "\\)"]
# puts "mapping is “$mapping”"; # Yay for unicode quote characters!
puts [string map $mapping (3.8.001)]
The sequence you were looking for is this, with a few more braces and fewer double-quotes, but I encourage you to learn how to work this out for yourself…
puts [string map {( {\(} ) {\)}} "(3.8.001)"]

Splitting a String with Perl

I was following along with this tutorial on how to split strings when I came across a quote that confused me.
Words about Context
Put to its normal use, split is used in list context. It may also be
used in scalar context, though its use in scalar context is
deprecated. In scalar context, split returns the number of fields
found, and splits into the @_ array. It's easy to see why that might
not be desirable, and thus, why using split in scalar context is
frowned upon.
I have the following script that I've been working with:
#!/usr/bin/perl
use strict;
use warnings;
use v5.24;
doWork();
sub doWork {
my $str = "This,is,data";
my @splitData = split(/,/, $str);
say $splitData[1];
return;
}
I don't fully understand how you would use split on a list.
From my understanding, using the split function on my $str variable is frowned upon? How then would I go about splitting a string with the comma as the delimiter?
The frowned-upon behaviour documented by that passage was deprecated at least as far back as 5.8.8 (11 years ago) and was removed from Perl in 5.12 (7 years ago).
The passage documents that
my $n = split(...);
is equivalent to
my $n = do { @_ = split(...); @_ }; # <5.12
The assignment to @_ is unexpected. This type of behaviour is called "surprising action at a distance", and it can result in malfunctioning code. As such, before 5.12, using split in scalar context was frowned-upon. Since 5.12, however,
my $n = split(...);
is equivalent to
my $n = do { my @anon = split(...); @anon }; # ≥5.12
The surprising behaviour having been removed, it's no longer frowned-upon to use split in scalar context for the reason stated in the passage you quoted.
It should probably still be avoided, not just for backwards compatibility, but because there are far better ways of counting the number of substrings. I would use the following:
my $n = 1 + tr/,//; # Faster than: my $n = split(/,/, $_, -1);
You are using split in list context, so it does not exercise the frowned-upon behaviour, no matter what version of Perl you use. In other words, your usage is fine.
It's fine unless you are trying to handle CSV data, that is. In that case, you should be using Text::CSV_XS.
use Text::CSV_XS qw( );
my $csv = Text::CSV_XS->new({ auto_diag => 2, binary => 1 });
while (my $row = $csv->getline($fh)) { ... } # Parsing CSV
for (...) { $csv->say($fh, $row); } # Generating CSV
Calling split in scalar context isn't very useful. It effectively returns the number of separators plus one, and there are better ways of doing that.
For example,
my $str = "This,is,data";
my $splitData = split(/,/, $str);
say $splitData;
will print 3 as it counts the substrings after the split.
split in scalar context used to also return the split parts in @_, but that frowned-upon behaviour was removed because it's rather unexpected.
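The following is a small sketch that puts the list-context and counting variants from these answers side by side. It assumes Perl 5.12 or later (so scalar-context split no longer touches @_); the names @fields, $by_split and $by_tr are made up for the example.

```perl
use strict;
use warnings;

my $str = "This,is,data";

# List context: the usual way to get the pieces.
my @fields = split /,/, $str;

# Scalar context: only the number of fields is returned.
my $by_split = split /,/, $str, -1;

# Counting the separators with tr and adding one gives the same number.
my $by_tr = 1 + ( $str =~ tr/,// );

print "$fields[1] $by_split $by_tr\n";   # prints "is 3 3"
```

The -1 limit keeps trailing empty fields, which is what makes the split count agree with the separator count for inputs such as "a,b,".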
Using it as an array is perfect.
my $str = "This,is,data";
The above line is a single string.
my @splitData = split(/,/, $str);
You are now splitting $str into an array, or a list of values. So effectively you now have @splitData, which is in fact:
"This" "is" "data"
So you can either use them all, say @splitData, or use each of them as a scalar. A single element can be written as @splitData[1], but that is a one-element slice which we never use, as it is always better to write it as $splitData[1].
The tutorial says it nicely. Use split on a string to create a list of substrings.
You can then obviously automatically assign each of the list values in a loop without having to print each list value.
my $str = "This,is,data";
my @splitData = split(/,/, $str);
foreach my $value (@splitData) {
    say $value;
}
This basically assigns $splitData[0], $splitData[1], etc. to $value in turn, one scalar at a time.
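Strictly speaking, foreach does not copy each element: the loop variable is an alias for the current array element, so assigning to it changes the array itself. A minimal sketch of that behaviour follows; the uc call is only there to make the change visible and is not part of the original answer.

```perl
use strict;
use warnings;

my $str = "This,is,data";
my @splitData = split /,/, $str;

foreach my $value (@splitData) {
    # $value is an alias for the current element, not a copy,
    # so this assignment modifies @splitData in place.
    $value = uc $value;
}

print "@splitData\n";   # prints "THIS IS DATA"
```

If you want to leave the array untouched, work on a copy inside the loop (for example my $copy = $value;) and modify that instead.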

Distance between matched substrings

I have a chromosome sequence and have to find subsequences in it and the distances between them.
For example:
string:
AACCGGTTACGTTTGGCCAAACGTTTTTTGGGGAAACCCACGTACGTAAAGCCGGTTAAACGT
Substring:
ACGT
I have to find the distance between all occurrences of ACGT.
I normally do not recommend answering posts where it is obvious the OP just wants other people to do their work. However, there is already one answer the use of which will be problematic if input strings are largish, so here is something that uses Perl builtins.
The special variable @- stores the start offsets of the last successful match; $-[0] is the offset at which the whole match begins.
use strict;
use warnings;
use Data::Dumper;
my $string = 'AACCGGTTACGTTTGGCCAAACGTTTTTTGGGGAAACCCACGTACGTAAAGCCGGTTAAACGT';
my @pos;
while ( $string =~ /ACGT/g ) {
    push @pos, $-[0];
}
my @dist;
for my $i (1 .. $#pos) {
    push @dist, $pos[$i] - $pos[$i - 1];
}
print Dumper(\@pos, \@dist);
This method uses less memory than splitting the original string (which may be a problem if the original string is large enough). Its memory footprint can be further reduced, but I focused on clarity by showing the accumulation of match positions and the calculation of deltas separately.
One open question is whether you want the index of the first match from the beginning of the string. Strictly speaking, "distances between matches" excludes that.
use strict;
use warnings;
use Data::Dumper;
my $string = 'AACCGGTTACGTTTGGCCAAACGTTTTTTGGGGAAACCCACGTACGTAAAGCCGGTTAAACGT';
my @dist;
my $last;
while ($string =~ /ACGT/g) {
    no warnings 'uninitialized';
    push @dist, $-[0] - $last;
    $last = $-[0];
}
# Do we want the distance of the first
# match from the beginning of the string?
shift @dist;
print Dumper \@dist;
Of course, it is possible to use index for this as well, but it looks considerably uglier.
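For comparison, here is a rough sketch of what the index-based variant might look like. The short input string is made up so the offsets are easy to check by hand, and $needle, $from and $at are illustrative names. Like the /g loop, it resumes searching after the end of each match, so overlapping matches are not reported.

```perl
use strict;
use warnings;

my $string = 'xxACGTyyyACGTACGT';   # made-up short input, not the original sequence
my $needle = 'ACGT';

my @pos;
my $from = 0;
# index returns -1 once there are no further occurrences.
while ( ( my $at = index( $string, $needle, $from ) ) >= 0 ) {
    push @pos, $at;
    $from = $at + length $needle;   # continue after the end of this match
}

# Differences between consecutive start offsets.
my @dist = map { $pos[$_] - $pos[ $_ - 1 ] } 1 .. $#pos;

print "positions: @pos\n";   # positions: 2 9 13
print "distances: @dist\n";  # distances: 7 4
```

This avoids the regex engine entirely, which only matters if the thing you are searching for is a fixed string rather than a pattern.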
You may split your input string by "ACGT" and remove the first and the last elements of the returned array to get all fragments between "ACGT". Then calculate the lengths of these fragments:
my $input = "AACCGGTTACGTTTGGCCAAACGTTTTTTGGGGAAACCCACGTACGTAAAGCCGGTTAAACGT";
my @fragments = split("ACGT", $input, -1);
@fragments = @fragments[1..$#fragments - 1];
my @dist_arr = map {length} @fragments;
Demo: https://ideone.com/AqEwGu

How to initiate array element to 0 in bash?

declare -a MY_ARRAY=()
Does declaring an array this way in bash initialize all the array elements to 0?
If not, how do I initialize array elements to 0?
Your example will declare/initialize an empty array.
If you want to initialize array members, you do something like this:
declare -a MY_ARRAY=(0 0 0 0) # this initializes an array with four members
If you want to initialize an array with 100 members, you can do this:
declare -a MY_ARRAY=( $(for i in {1..100}; do echo 0; done) )
Keep in mind that arrays in bash are not fixed length (nor do indices have to be consecutive). Therefore you can't initialize all members of the array unless you know what the number should be.
Default Values with Associative Arrays
Bash arrays are not fixed-length arrays, so you can't pre-initialize all elements. Indexed arrays can also be sparse, and an unset element has no built-in default, so you can't really use default values the way you're thinking.
However, you can use associative arrays with an expansion for missing values. For example:
declare -A foo
echo "${foo[bar]:-baz}"
This will return "baz" for any missing key. As an alternative, rather than just returning a default value, you can actually set one for missing keys. For example:
echo "${foo[bar]:=baz}"
This alternate invocation will not just return "baz," but it will also store the value into the array for later use. Depending on your needs, either method should work for the use case you defined.
It initializes an empty array and assigns it to MY_ARRAY, so there are no elements that could be set to 0. You could verify with something like this:
#!/bin/bash
declare -a MY_ARRAY=()
echo "${#MY_ARRAY[@]}" # this prints the number of elements in the array (0)

Conditionals for data type in gnuplot functions

I would like to define a function which returns the string "NaN" or sprintf("%g",val) depending on whether val is a string or a numeric value. Initially I was trying to test if val was defined (using the gnuplot "exists" function) but it seems that I cannot pass any undefined variable to a function (an error is issued before the function is evaluated). Therefore: is there a way to test inside a function whether the argument is a string or numeric?
I am looking for a function isstring that I can use like this:
myfunc(val)=(isstring(val)?"NaN":sprintf("%g",val))
The goal is to output the values of variables without risking errors in case they are undefined. However I need it as a function if I want a compact code for many variables.
Gnuplot doesn't really have the introspection abilities that many other languages have. In fact, it treats strings and numbers (at least integers) very similarly:
print "1"+2 #prints 3
a=1
print "foo".a #prints foo1
I'm not exactly sure how this is implemented internally. However, what you're asking is very tricky to get to work.
Actually, I think your first attempt (checking if a variable exists) is more sensible as type-checking in gnuplot is impossible*. You can pass the variable name to the function as a string, but the problem is that you don't seem to have a handle on the value. All seems lost -- But wait, gnuplot has an eval statement which when given a string will evaluate it. This seems great! Unfortunately, it's a statement, not a function (so it can't be used in a function -- argv!). The best solution I can come up with is to write a function which returns an expression that can be evaluated using eval. Here goes:
exists_func(result,var)=sprintf("%s=exists('%s')?sprintf('%%g',%s):'NaN'",result,var,var)
Now when you want to use it, you just prefix it with eval
a=3
eval exists_func("my_true_result","a")
print my_true_result #3
eval exists_func("my_false_result","b")
print my_false_result #NaN
This goes against the grain a little bit. In most programming languages, you'd probably want to do something like this:
my_true_result=exists_func(a)
But alas, I can't figure out how to make that form work.
Of course, the same thing goes here that always goes with eval. Don't use this function with untrusted strings.
*I don't actually know that it's impossible, but I've never been able to get it to work
EDIT
In response to your comment above on the question, I think a function like this would be a little more intuitive:
fmt(x)=(x==x)?sprintf("%g",x):"NaN"
With this function, your sentinel/default value should be NaN instead of "undefined", but that shouldn't make much of a difference. (Really, if you're willing to live with "nan" instead of "NaN", you don't need this function at all; sprintf will do just fine.) Note that this works because, according to IEEE, NaN doesn't equal anything, not even itself.
You helped me a lot these days with gnuplot. I want to give you something back because I have found a solution to check if a variable is numeric or not. This helps to decide which operators can be used on it (e.g. == for numbers, eq for strings).
The solution is not very simple, but it works. It redirects gnuplot's print command to a temp file, writes the variable to the file with print myvar and evaluates the file's first line with system("perl -e '<isnumeric(line#1 in temp file)?>' ") (<> is pseudo-code). Let me know if there's room for improvements and let me hear your suggestions!
Example: myvar is a float. Any integer (1 or "1") or string value ("*") works too!
myvar = -2.555
# create temporary file for checking if variables are numeric
Int_tmpfle = "tmp_isnumeric_check"
# redirect print output into temp file (existing file is overwritten)
set print Int_tmpfle
# save variable's value to file
print myvar
# check if file is numeric with Perl's 'looks_like_number' function
isnumeric = system("perl -e 'use Scalar::Util qw(looks_like_number); \
open(FLE,".Int_tmpfle."); $line = <FLE>; \
if (looks_like_number($line) > 0) {print qq(y)} ' ")
# reset print output to <STDOUT> (terminal)
set print "-"
# make sure to use "," when printing string and numeric values
if (isnumeric eq "y") {print myvar," is numeric."} else {print myvar," is not numeric."}
