How do I print a hash in Perl?

How do I print $stopwords? It seems to be a string ($) but when I print it I get: "HASH(0x8B694)" with the memory address changing on each run.
I am using Lingua::StopWords and I simply want to print the stop words that it's using so I know for sure what stop words are there. I would like to print these to a file.
Do I need to dereference $stopwords somehow?
Here is the code:
use Lingua::StopWords qw( getStopWords );
open(TEST, ">results_stopwords.txt") or die("Unable to open requested file.");
my $stopwords = getStopWords('en');
print $stopwords;
I've tried:
my @temp = $stopwords;
print "@temp";
But that doesn't work. Help!
Last note: I know there is a published list of stop words for Lingua::StopWords, but I am using the (en) list and I want to make absolutely sure which stop words I am using. That is why I want to print them, ideally to a file; I should already know how to do the file part.

$ doesn't mean string. It means a scalar, which could be a string, number or reference.
$stopwords is a hash reference. To use it as a hash, you would use %$stopwords.
Use Data::Dumper as a quick way to print the contents of a hash (pass by reference):
use Data::Dumper;
...
print Dumper($stopwords);

To dereference a hashref:
my %hash = %{$hashref}; # makes a copy
To iterate over the keys and values:
while (my ($key, $value) = each %{$hashref}) {
    print "$key => $value\n";
}
or (less efficient, but clearer for teaching purposes):
for my $key (keys %{$hashref}) {
    print "$key => $hashref->{$key}\n";
}

Have a look at Data::Printer as a nice alternative to Data::Dumper. It will give you pretty-printed output as well as information on methods which the object provides (if you're printing an object). So, whenever you don't know what you've got:
use Data::Printer;
p( $some_thing );
You'll be surprised at how handy it is.

getStopWords returns a hashref — a reference to a hash — so you would dereference it by prepending %. And you actually only want its keys, not its values (which are all 1), so you would use the keys function. For example:
print "$_\n" foreach keys %$stopwords;
or
print join(' ', keys %$stopwords), "\n";
You can also skip the temporary variable $stopwords, but then you need to wrap the getStopWords call in curly-brackets {...} so Perl can tell what's going on:
print join(' ', keys %{getStopWords('en')}), "\n";

Related

How can I print "\n" using exec()?

ab = open("bonj.txt","w")
exec(f'''print("Hi I'm Mark\n", file=ab)
print("\tToday I'm tired", file=ab)
''')
ab.close()
I absolutely need to use exec() to print some information to a text file. The problem is that when I use exec(), I lose the ability to put newlines or tabs in my text, and I don't understand why. Could you help me?
This is the error message that I receive: "SyntaxError: EOL while scanning string literal"
You just need to escape \n and \t properly:
ab = open("bonj.txt","w")
exec(f'''print("Hi I'm Mark\\n", file=ab)
print("\\tToday I'm tired", file=ab)
''')
ab.close()
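You need to prevent Python from interpreting the \n before exec sees it; otherwise the line break lands inside the inner string literal and splits it across two lines.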
This can be done by specifying the string as a raw string, using the r prefix:
ab = open("bonj.txt","w")
exec(rf'''print("Hi I'm Mark\n", file=ab)
print("\tToday I'm tired", file=ab)
''')
ab.close()
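To see the difference between the two string forms in isolation, here is a minimal sketch. It assumes an in-memory io.StringIO buffer (named buf here purely for illustration) in place of the real file, and passes it to exec through an explicit namespace dictionary:

```python
import io

buf = io.StringIO()  # in-memory stand-in for the opened file

# Normal string: Python turns \n into a real line break while building the
# source text, so the inner "..." literal is split across two lines.
broken_source = '''print("Hi\n", file=buf)'''
try:
    exec(broken_source, {"buf": buf})
    raised = False
except SyntaxError:
    raised = True
print("normal string raised SyntaxError:", raised)

# Raw string: the backslash and the n reach exec untouched and are only
# interpreted when the inner print call is compiled.
working_source = r'''print("Hi\n", file=buf)'''
exec(working_source, {"buf": buf})
print(repr(buf.getvalue()))
```

The buffer ends up holding two newlines: one from the \n in the inner string and one that print appends by default.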
That said, using exec is odd here. See whether you can write your code as something like this instead:
lines = ["Hi I'm Mark\n", "\tToday I'm tired"]
with open("bonj.txt", "w") as f:
    f.write("\n".join(lines))
Note that you need "\n".join to reproduce the line break that print adds after each call by default (see its end="\n" argument). The one difference is that the join version does not write a final newline after the last line.
Also, when handling files, using the context manager syntax (with open ...) is good practice.
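As a quick check of how print and "\n".join compare, the following sketch writes the same two lines both ways. The io.StringIO buffer (printed) is an assumption made for illustration so that nothing touches the disk:

```python
import io

lines = ["Hi I'm Mark\n", "\tToday I'm tired"]

# print appends end="\n" after every call.
printed = io.StringIO()
for line in lines:
    print(line, file=printed)

# join only puts the separator between items, not after the last one.
joined = "\n".join(lines)

print(repr(printed.getvalue()))
print(repr(joined))
```

The two results differ only by the trailing newline that the last print call adds.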

set function with a file (Python 3)

I have a text file with given below content
Credit
Debit
21/12/2017
09:10:00
I have written Python code to convert the text into a set and discard \n:
with open('text_file_name', 'r') as file1:
    same = set(file1)
print(same)
print(same.discard('\n'))
For the first print statement, print(same), I get the correct result:
{'Credit\n','Debit\n','21/12/2017\n','09:10:00\n'}
But for the second print statement, print(same.discard('\n')), I get:
None
Can anybody help me figure out why I am getting None? I am using same.discard('\n') to discard \n in the set.
Note:
I am trying to understand the discard function with respect to set.
The discard method only removes an element that is equal to its argument. Your set contains strings such as 'Credit\n', but no element that is exactly '\n', so there is nothing for it to discard. What you are looking for is a map that strips the \n from each element, like so:
set(map(lambda x: x.rstrip('\n'), same))
which will return {'Credit', 'Debit', '09:10:00', '21/12/2017'} as the set. This works by using the map builtin, which applies its first argument to each element in the set. The first argument in our map usage is lambda x: x.rstrip('\n'), which removes any occurrences of \n on the right-hand side of each string.
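A small sketch of this behaviour, using the four strings from the question as a literal set instead of reading a file. The set comprehension is an equivalent spelling of the map call, chosen here only for readability:

```python
same = {"Credit\n", "Debit\n", "21/12/2017\n", "09:10:00\n"}

# No element is exactly "\n", so nothing is removed.
result = same.discard("\n")
print(result)      # discard always returns None
print(len(same))   # the set still has all 4 elements

# Strip the trailing newline from every element instead.
stripped = {line.rstrip("\n") for line in same}
print(sorted(stripped))
```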
discard removes the given element from the set only if it is present.
In addition, the method doesn't return a value (which is why print shows None), because it modifies the set it is called on in place.
with open('text_file_name', 'r') as file1:
    same = set(file1)
print(same)
# Strip the trailing newline from every element; a last line without one is kept as-is.
same = {elem.rstrip('\n') for elem in same}
print(same)
There are 4 elements in the set, and none of them are newline.
It would be more usual to use a list in this case: a list preserves order and keeps duplicate lines, while a set is not guaranteed to preserve order and discards duplicates. Perhaps you have your reasons.
You seem to be looking for rstrip('\n'). Consider processing the file in this way:
s = set()  # {} would create an empty dict, which has no add method
with open('text_file_name') as file1:
    for line in file1:
        s.add(line.rstrip('\n'))
s.discard('Credit')
print(s) # This displays 3 elements, without trailing newlines.
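The earlier remark about lists versus sets can be illustrated with a short sketch. The file contents below are hypothetical (a repeated "Credit" line is added to show the effect) and are held in an io.StringIO object rather than a real file:

```python
import io

# Hypothetical file contents with a repeated line, held in memory.
fake_file = io.StringIO("Credit\nDebit\nCredit\n21/12/2017\n")

as_list = [line.rstrip("\n") for line in fake_file]
fake_file.seek(0)  # rewind so the same lines can be read again
as_set = {line.rstrip("\n") for line in fake_file}

print(as_list)      # order and the duplicate "Credit" are kept
print(len(as_set))  # the duplicate collapses into one element
```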

How to get the value after a string using perl reg expression

I have the following string :
{\"id\":01,\"start_time\":\"1477954800000\",\"stop_time\":\"1485817200000\",\"url\":http:://www.example.com\}
and I'd like to get for example the value of start_time (1477954800000).
I tried several things in https://regex101.com/ but I could not find a way to deal with the special characters (\":\") between the string and the value.
If, for example, the string were like start_time = 1477954800000
I know that by using
start_time\":\"(\w+)/)
I'll get the value.
Any idea on how to get the value when \":\" are involved?
Your sample data looks like a stringified JSON object. If that is the case, you should use a JSON parser, not a regular expression:
#!perl
use strict;
use warnings;
use feature qw(say);
use JSON;
my $json_string = <DATA>;
chomp($json_string);
my $json_object = decode_json $json_string;
# get the value of the start_time key
say $json_object->{start_time};
# 1477954800000
__DATA__
{"id":1,"start_time":"1477954800000","stop_time":"1485817200000","url":"http://www.example.com"}

Capture Output to stream and store as a string variable

Although this question relates to 'BioPerl', the question, I believe, is probably more general than that.
Basically I have produced a Bio::Tree::TreeI object and I am trying to convert that into a string variable.
The only way I can come close to converting that to a string variable is to write that tree to a stream using:
# a $tree = Bio::Tree::TreeI->new() (which I know is an actual tree as it prints to the terminal console)
my $treeOut = Bio::TreeIO->new(-format => 'newick');
$treeOut->write_tree($tree);
->write_tree is described as "Writes a tree onto the stream", but how do I capture that in a string variable? I can't find another way of returning a string from any of the functions in Bio::TreeIO.
You can redirect standard output to a variable:
my $captured;
{
    local *STDOUT = do { open my $fh, ">", \$captured; $fh };
    $treeOut->write_tree($tree);
}
print $captured;
There is an easier way to accomplish the same goal by setting the file handle for BioPerl objects, and I think it is less of a hack. Here is an example:
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::TreeIO;
my $treeio = Bio::TreeIO->new(-format => 'newick', -fh => \*DATA);
my $treeout = Bio::TreeIO->new(-format => 'newick', -fh => \*STDOUT);
while (my $tree = $treeio->next_tree) {
    $treeout->write_tree($tree);
}
__DATA__
(A:9.70,(B:8.234,(C:7.932,(D:6.321,((E:2.342,F:2.321):4.231,((((G:4.561,H:3.721):3.9623,
I:3.645):2.341,J:4.893):4.671)):0.234):0.567):0.673):0.456);
Running this script prints the Newick string to your terminal, as you would expect. If you use Bio::Phylo (which I recommend), there is a to_string method (IIRC), so you don't have to create an object just to print your trees; you can just do say $tree->to_string.

Perl - basic STDIN issue

I'm new to Perl.
I have the following code, whose purpose is to extract the software name by parsing text.
The software name in this case is "ddd":
print "Please provide full installation path (Ex:/a/b/c/ddd)\n";
my $installPath = <STDIN>;
#going to extract software name
my @soft = split '/', $installPath;
my $softName = print "@soft[4]\n";
print "$softName\n";
But instead of getting "ddd" as the software name, I got:
ddd
1
I don't understand where the '1' comes from.
Thanks for the help.
The error comes from this:
my $softName = print "@soft[4]\n";
#              ^^^^^
The function print returns 1 (true) when it succeeds, which it does here. The 1 is assigned to your variable, which you then print.
print "$softName\n";
Short recap:
my $installPath = <STDIN>; # "/a/b/c/ddd"
my @soft = split '/', $installPath; # 5th element is "ddd"
my $softName = print "@soft[4]\n"; # this prints "ddd", but "1" is returned
#              ^^^^^ print returns 1, which is assigned to $softName
print "$softName\n"; # "1" is printed
What you want is:
my $softName = $soft[4];
Which is just taking the 5th element of the array. You should use $ and not @ when referring to a single element. You can use @ when referring to a slice, that is, multiple elements.
A better way to do what you are trying to do is using File::Basename:
use File::Basename;
my $softName = basename($installPath);
File::Basename is a core module in Perl 5.
my $softName = print "@soft[4]\n"; is a bad way of treating an array, and this is what is causing the issue.
When referring to an array as a whole, the @ sigil should be used. By writing @soft[4] you do point at a particular value in the array, but you are asking for a one-element array slice rather than a single scalar. To make it clear to Perl that you want one specific item and not a list, use $ instead: $soft[4]. Perl knows which element you mean because you also specify [4].
In addition, what is being assigned to $softName is not that array value, but the result of the print which is the status code (this is where the "1" comes from).
To correct your code, change that line to:
my $softName = $soft[4];
