Perl - How to split input into an array? - linux

I have a small Perl script that takes user input on the command line and then splits the input into an array, where each service is its own element.
My goal is to accept a single argument as well as several arguments separated by commas. Examples of valid input would be:
ssh
ssh,named,cups
My code looks as follows. It compiles without errors, but the output is not what I intended.
print "Please provide the service name that you wish to analyze (Named service):\n";
print "Several services may be provided, using a comma sign as a separator.\n";
my $service_name_input = <>;
chomp($service_name_input);
my @service_list = split(/,/, $service_name_input);
foreach my $line (@service_list)
{
open(my $service_input, "service @service_list status|");
}
foreach my $line (@service_list)
{
#Matches for "running".
if ($line =~ m/(\(running\))/)
{
print "The service @service_list is running\n";
}
#Matches for "dead".
elsif ($line =~ m/(dead)/)
{
print "The service @service_list is dead\n";
}
}
The program should print whether each service is running or dead, but I only get the following error message. When I issue the service command manually, it works just fine.
The service command supports only basic LSB actions (start, stop,
restart, try-restart, reload, force-reload, status). For other
actions, please try to use systemctl.
Any help regarding the steps I should take in order to achieve a working program will be much appreciated. Thank you for reading.

foreach my $line (@service_list)
{
open(my $service_input, "service @service_list status|");
}
This loop doesn't use $line. You're passing the whole array @service_list to the service command, i.e. you're running service ssh named cups status. This causes an error because service thinks you want to execute the named action on the ssh service. To fix that, write "service $line status |".
But there's another issue: You never do anything with $service_input either. If you want to loop over each line of each service command output, you need something more like:
foreach my $service (@service_list)
{
    open(my $service_input, '-|', 'service', $service, 'status')
        or die "$0: can't run 'service': $!\n";
    while (my $line = readline $service_input)
    {
        #Matches for "running".
        if ($line =~ m/(\(running\))/)
        {
            print "The service $service is running\n";
        }
        #Matches for "dead".
        elsif ($line =~ m/(dead)/)
        {
            print "The service $service is dead\n";
        }
    }
}
I've changed your open line to use a separate mode argument -| (which is good style in general) and the list form of process open, which avoids the shell, which is also good in general (no issues with "special characters" in $service, for example).
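One more detail worth knowing: with a piped open, a failure of the external command often only surfaces when you close the handle. This sketch (using $^X, the path of the running perl binary, as a stand-in for the real service command) shows how to inspect the child's exit status:

```perl
use strict;
use warnings;

# Run an external command with the list form of open (no shell), read its
# output, then check $? after close to learn how the child exited.
# $^X (the running perl binary) stands in for the real 'service' command.
open my $fh, '-|', $^X, '-e', 'print "ok\n"; exit 3'
    or die "can't run command: $!";
my @out = <$fh>;
close $fh;                   # for a pipe, close waits and sets $?
my $exit = $? >> 8;          # the high byte of $? is the exit code
print "got: $out[0]";        # got: ok
print "exit code: $exit\n";  # exit code: 3
```

The open itself only fails if the fork fails; a service name that doesn't exist shows up as a nonzero exit code here instead.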

Related

search multi line string from multiple files in a directory

The string to be searched for is:
the file_is being created_automaically {
period=20ns }
The Perl script I am using follows (it works fine for a single-line string, but not for a multi-line one):
#!/usr/bin/perl
my $dir = "/home/vikas";
my @files = glob( $dir . '/*' );
#print "@files";
system ("rm -rf $dir/log.txt");
my $list;
foreach $list(@files){
if( !open(LOGFILE, "$list")){
open (File, ">>", "$dir/log.txt");
select (File);
print " $list \: unable to open file";
close (File);
else {
while (<LOGFILE>){
if($_ =~ /".*the.*automaically.*\{\n.*period\=20ns.*\}"/){
open (File, ">>", "$dir/log.txt");
select (File);
print " $list \: File contain the required string\n";
close (File);
break;
}
}
close (LOGFILE);
}
}
This code does not compile; it contains errors that cause it to fail to execute. You should never post code that you have not first tried to run.
The root of your problem is that for a multiline match, you cannot read the file line by line; you have to slurp the whole file into a variable. However, your program contains many flaws. I will demonstrate. What follows are excerpts of your code (with fixed indentation and missing curly braces).
First off, always use:
use strict;
use warnings;
This will save you many headaches and long searches for hidden problems.
system ("rm -rf $dir/log.txt");
This is better done in Perl, where you can control for errors:
unlink "$dir/log.txt" or die "Cannot delete '$dir/log.txt': $!";
foreach my $list (@files) {
# ^^
Declare the loop variable in the loop itself, not before it.
if( !open(LOGFILE, "$list")){
open (File, ">>", "$dir/log.txt");
select (File);
print " $list \: unable to open file";
close (File);
You never have to explicitly select a file handle before you print to it. You just print to the file handle: print File "....". What you are doing is just changing the STDOUT file handle, which is not a good thing to do.
Also, this is error logging, which should go to STDERR instead. That can be done simply by reopening STDERR to a file at the beginning of your program. Why do this? Because when the program is not run at a terminal, for example via the web or from some other process, STDERR does not show up on your screen, so you want it captured in a file; when you are debugging at a terminal, the messages reach you either way.
open STDERR, ">", "$dir/log.txt" or die "Cannot open 'log.txt' for overwrite: $!";
This has the added benefit of you not having to delete the log first. And now you do this instead:
if (! open LOGFILE, $list ) {
warn "Unable to open file '$list': $!";
} else ....
warn goes to STDERR, so it is basically the same as print STDERR.
Speaking of open, you should use three argument open with explicit file handle. So it becomes:
if (! open my $fh, "<", $list )
} else {
while (<LOGFILE>) {
Since you are looking for a multiline match, you need to slurp the file(s) instead. This is done by setting the input record separator to undef. Typically like this:
my $file = do { local $/; <$fh> }; # $fh is our file handle, formerly LOGFILE
Next how to apply the regex:
if($_ =~ /".*the.*automaically.*\{\n.*period\=20ns.*\}"/) {
$_ =~ is optional. A regex automatically matches against $_ if no other variable is used.
You should probably not use " in the regex, unless the target string actually contains ". I don't know why you put it there; maybe you think strings need to be quoted inside a regex. If so, that is wrong. To match the string you have above, you do:
if( /the.*automaically.*{.*period=20ns.*}/s ) {
You don't have to escape curly braces {} or the equals sign =, and you don't need quotes. The /s modifier makes . (the wildcard character) also match newline, so we can remove \n. We can also remove .* from the start and end of the pattern, because that is implied: regex matches are always partial unless anchors are used.
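The effect of /s is easy to see with the string from the question; without it, . refuses to cross the newline between the two lines:

```perl
use strict;
use warnings;

my $text = "the file_is being created_automaically {\nperiod=20ns }";

# Without /s, '.' stops at the newline after '{', so the match fails.
my $without = ($text =~ /\{.*period=20ns/)  ? "match" : "no match";
# With /s, '.' also matches the newline, so the pattern spans both lines.
my $with    = ($text =~ /\{.*period=20ns/s) ? "match" : "no match";

print "without /s: $without\n";  # without /s: no match
print "with /s: $with\n";        # with /s: match
```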
break;
The break keyword is only used with the switch feature, which is experimental, plus you don't use it, or have it enabled. So it is just a bareword, which is wrong. If you want to exit a loop prematurely, you use last. Note that we don't have to use last because we slurp the file, so we have no loop.
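For the line-by-line case where you would want to bail out early, a minimal sketch of last:

```perl
use strict;
use warnings;

# 'last' exits the enclosing loop immediately (what 'break' does in C).
my @lines = ("no match here", "period=20ns", "never examined");
my $found = "";
for my $line (@lines) {
    if ($line =~ /period=20ns/) {
        $found = $line;
        last;              # stop scanning at the first match
    }
}
print "found: $found\n";   # found: period=20ns
```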
Also, you generally should pick suitable variable names. If you have a list of files, the variable that contains the file name should not be called $list, I think. It is logical that it is called $file. And the input file handle should not be called LOGFILE, it should be called $input, or $infh (input file handle).
This is what I get if I apply the above to your program:
use strict;
use warnings;

my $dir = "/home/vikas";
my @files = glob( $dir . '/*' );
my $logfile = "$dir/log.txt";

open STDERR, ">", $logfile or die "Cannot open '$logfile' for overwrite: $!";

foreach my $file (@files) {
    if (! open my $input, "<", $file) {
        warn "Unable to open '$file': $!";
    } else {
        my $txt = do { local $/; <$input> };
        if ($txt =~ /the.*automaically.*{.*period=20ns.*}/s) {
            print " $file : File contains the required string\n";
        }
    }
}
Note that the print goes to STDOUT, not to the error log. It is not common practice to have STDOUT and STDERR to the same file. If you want, you can simply redirect output in the shell, like this:
$ perl foo.pl > output.txt
The following sample code demonstrates the use of a regex for the multiline case, with a logger($fname, $msg) subroutine.
The snippet assumes that the input files are relatively small and can each be read into a variable $data (that is, that the computer has enough memory to hold one file at a time).
NOTE: the input data files should be distinguishable from the other files in the home directory $ENV{HOME}; in this sample they are assumed to match the pattern test_*.dat. You probably do not intend to scan absolutely every file in your home directory (there could be many thousands of files when you are interested in only a few).
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
my($dir,$re,$logfile);
$dir = '/home/vikas/';
$re = qr/the file_is being created_automaically \{\s+period=20ns\s+\}/;
$logfile = $dir . 'logfile.txt';
unlink $logfile if -e $logfile;
for ( glob($dir . "test_*.dat") ) {
if( open my $fh, '<', $_ ) {
my $data = do { local $/; <$fh> };
close $fh;
logger($logfile, "INFO: $_ contains the required string")
if $data =~ /$re/gsm;
} else {
logger($logfile, "WARN: unable to open $_");
}
}
exit 0;
sub logger {
my $fname = shift;
my $text = shift;
open my $fh, '>>', $fname
or die "Couldn't open $fname";
say $fh $text;
close $fh;
}
Reference: regex modifiers, unlink, perlvar

Setting Binary Transfer mode

My Perl script below is very basic. It goes and copies a .zip file located on one server and transfers it to another server.
#!/usr/bin/perl -w
use strict;
use warnings;
my $remotehost ="XXXXXX";
my $remotepath = "/USA/Fusion_Keyword_Reports";
my $remoteuser = "XXXXXXX";
my $remotepass = "XXXXXXX";
my $inputfile ="/fs/fs01/crmdata/SYWR/AAM/list8.txt";
my $remotefile1;
#my $DIR="/fs/fs01/crmdata/SYWR/AAM";
open (FILEIN, "<", $inputfile) or die "can't open list8 file";
while (my $line =<FILEIN>) {
if ($line =~ m /Keywords-Report(.*?)/i && $line !~ m/Keywords-Report-loopback/i) {
print $line;
$remotefile1 =$line;
last;
}
}
close FILEIN;
print "remotefile $remotefile1\n";
my $DIR1="/fs/fs01/crmdata/SYWR/AAM/$remotefile1";
my $cmd= "ftp -in";
my $ftp_command = "open $remotehost
user $remoteuser $remotepass
cd $remotepath
asc
get $remotefile1
bye
";
open (CMD, "|$cmd");
print CMD $ftp_command;
close (CMD);
exit(0);
When I run the script it does work but I get an error and the file that gets transferred is corrupted as a result.
226 Transfer complete.
WARNING! 40682 bare linefeeds received in ASCII mode.
File may not have transferred correctly.
I did some reading and I think I need to set the transfer mode to binary. However I am really not sure how to do that in my script. Additionally, I am not sure that is the right solution either.
I would really appreciate your thoughts about this error. If setting the transfer mode to Binary will fix this problem can you please show me where I would do that?
my $ftp_command = "open $remotehost
user $remoteuser $remotepass
cd $remotepath
binary
get $remotefile1
bye
";
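Replacing asc with binary in the FTP command script, as shown above, is the direct fix for the bare-linefeed corruption. An alternative worth considering is Net::FTP (bundled with Perl), which makes the mode switch explicit and lets you check every step for errors. A sketch, with the host, credentials, and file name as placeholders (this obviously needs a reachable FTP server to run):

```perl
use strict;
use warnings;
use Net::FTP;

# Placeholders; substitute your real host, credentials and file name.
my ($host, $user, $pass) = ('XXXXXX', 'XXXXXXX', 'XXXXXXX');
my $remotefile1 = 'Keywords-Report.zip';   # hypothetical file name

my $ftp = Net::FTP->new($host, Timeout => 30)
    or die "Cannot connect to $host: $@";
$ftp->login($user, $pass)                or die "Login failed: " . $ftp->message;
$ftp->cwd('/USA/Fusion_Keyword_Reports') or die "cwd failed: " . $ftp->message;
$ftp->binary;                            # binary mode: no linefeed mangling
$ftp->get($remotefile1)                  or die "get failed: " . $ftp->message;
$ftp->quit;
```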

bash - How to find string from file and get its position?

File services - contains many records like this one:
define service {
host_name\t\t\t\tHOSTNAME
...
...
}
File hosts - contains records:
define host {
host_name\t\t\t\tHOSTNAME
...
...
}
and I need to go through hosts, get the HOSTNAME from each record, then go to the file services, find all records with that HOSTNAME, and put them into another file. Then do this for every HOSTNAME in hosts.
What I don't know is primarily how to get the HOSTNAME from the hosts file, and then how to get a whole record from the services file into a variable. I have prepared a regex (I hope it's right): ^define.*host_name\t\t\t\t$HOSTNAME.*}
Please give me some advice or examples of how to get the wanted result.
The files you provide look very much like nagios configuration files.
sed might be your friend here, as it allows you to slice the file into smaller parts, eg:
:t
/^define service {/,/}$/ { # For each line between these block markers..
/}$/!{ # If we are not at the /end/ marker
$!{ # nor the last line of the file,
N; # add the Next line to the pattern space
bt
} # branch (loop back) to the :t label.
} # This line matches the /end/ marker.
/host_name[ \t]\+HOSTNAME\b/!d; # delete the block if wrong host.
}
That example lifted from the sed faq 4.21, and adapted slightly. You could also look at question 4.22 which appears to address this directly:
http://sed.sourceforge.net/sedfaq4.html#s4.22
Like the previous answer, I'm also inclined to say you're probably better off using another scripting language. If you need a different interpreter to get this done anyway, might as well use something you know.
This task is a bit too complex for a bash script. I would use Perl:
#!/usr/bin/perl
use warnings;
use strict;
open my $SRV, '<', 'services' or die $!;
open my $HST, '<', 'hosts' or die $!;
my %services;
{ local $/ = "\n}";
while (my $service = <$SRV>) {
my ($hostname) = $service =~ /^\s*host_name\t+(.+?)\s*$/m;
push @{ $services{$hostname} }, $service if defined $hostname;
}
}
while (my $line = <$HST>) {
if (my ($host) = $line =~ /^\s*host_name\t+(.+?)\s*$/) {
if (exists $services{$host}) {
print "===== $host =====\n";
print "$_\n" for @{ $services{$host} };
} else {
warn "$host not found in services!\n";
}
}
}
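The key trick in the script above is setting the input record separator $/ to "\n}", so each read from the services file returns one whole define block instead of one line. A self-contained sketch of that behaviour, reading from an in-memory string instead of the real file:

```perl
use strict;
use warnings;

# Two records in the services-file format, read from an in-memory "file".
my $text = "define service {\n\thost_name\tweb01\n}\n"
         . "define service {\n\thost_name\tdb01\n}\n";
open my $fh, '<', \$text or die $!;

my @hosts;
{
    local $/ = "\n}";                  # one "define ... }" block per read
    while (my $record = <$fh>) {
        if (my ($h) = $record =~ /^\s*host_name\t+(.+?)\s*$/m) {
            push @hosts, $h;
        }
    }
}
print "@hosts\n";   # web01 db01
```

local confines the change to $/ to the enclosing block, so any later reads revert to line-by-line behaviour.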

Issues with reducing duplicate output from log file search

This website has been a great help since I'm getting back into programming. I'm attempting to write a simple Perl script that analyzes Apache log files from a directory (multiple domains), pulls the last 1000 lines of each log file, strips out the IP addresses, and then compares them with a known block list of bot spammers.
So far I've got the script working except for one issue. Say the IP address 10.128.45.5 appears in two log files. The script analyzes each log file in turn, reducing the IPs to one occurrence PER log file, but what I'm trying to do is narrow that down even further: one occurrence per run of this script, regardless of whether the same IP appears across multiple log files.
Here's the code I've gotten so far, sorry if it's a bit messy.
#!/usr/bin/perl
# Extract IP's from apache access logs for the last hour and matches with forum spam bot list.
# The fun work of Daniel Pearson
use strict;
use warnings;
use Socket;
# Declarations
my ($file,$list,@files,%ips,$match,$path,$sort);
my $timestamp = localtime(time);
# Check to see if matching file exists
$list ='list';
if (-e $list) {
# Delete the file so we can download a new one if it exists
print "File Exists!";
print "Deleting File $list\n";
unlink($list);
}
sleep(5);
system ("wget http://www.domain.com/list");
sleep(5);
my $dir = $ARGV[0] or die "Need to specify the log file directory\n";
opendir(DIR, "$dir");
@files = grep(/\.*$/,readdir(DIR));
closedir(DIR);
foreach my $file(@files) {
my $sum = 0;
if (-d $file) {
print "Skipping Directory $file\n";
}
else {
$path = "$dir$file";
open my $path, "-|", "/usr/bin/tail", "-1000", "$path" or die "could not start tail on $path: $!";
my %ips;
while (my $line = <$path>) {
chomp $line;
if ($line =~ m/(?!0+\.0+\.0+\.0+$)(([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5]))/g) {
my $ip = $1;
$ips{$ip} = $ip;
}
}
}
foreach my $key (sort keys %ips) {
open ("files","$list");
while (my $sort = <files>) {
chomp $sort;
if ($key =~ $sort) {
open my $fh, '>>', 'banned.out';
print "Match Found we need to block it $key\n";
print $fh "$key:$timestamp\n";
close $fh;
}
}
}
}
Any advice that could be given I would be grateful for.
To achieve the task:
Move my %ips outside of (above) the foreach my $file (@files) loop.
Move foreach my $key ( sort keys %ips ) outside of (below) the foreach my $file (@files) loop.
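In other words: one %ips hash shared by all files, and one reporting pass after the file loop ends. A stripped-down sketch of that shape, with inline stand-in data in place of the tail output:

```perl
use strict;
use warnings;

my %ips;    # declared once, before the per-file loop

# Stand-ins for the IP addresses stripped from each log file.
my @per_file = (
    [ '10.128.45.5', '192.168.1.9' ],   # log file 1
    [ '10.128.45.5' ],                  # log file 2 repeats an address
);

for my $file_ips (@per_file) {          # the per-file loop
    $ips{$_} = 1 for @$file_ips;        # a hash key absorbs duplicates
}

# Reporting happens once, after all files are processed.
for my $ip (sort keys %ips) {
    print "$ip\n";                      # each address appears once per run
}
```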

Simple perl opendir

I am completely new to perl and have just been learning it. I came across this script I need to run that has some network Tstat trace data. However, I get an error 'Cannot parse date.'
The code that generates this is here
foreach my $dir (@trace_dirs) {
undef @traces;
opendir(DIR, $dir) || die "Can't open dir: $dir \n";
@traces = grep { /.out$/ && -d "$dir/$_" } readdir(DIR);
foreach my $trace (#traces) {
$trace =~ /^(\d\d)_(\d\d)_(\d\d)_(\w\w\w)_(\d\d\d\d)/;
$trace_date=&ParseDate("$3/$4/$5 $1:$2") || die "Cannot parse date \n";
$traces{$trace_date} = $trace;
$trace_dir{$trace_date} = $dir;
}
closedir DIR;
}
Can someone tell me what this code is looking for?
When you run into problems like this, throw yourself a bone by looking at the data you are trying to play with. Make sure that the value in $trace is what you expect and that the date string you create is what you expect:
print "Trace is [$trace]\n";
if( $trace =~ /^(\d\d)_(\d\d)_(\d\d)_(\w\w\w)_(\d\d\d\d)/ ) {
my $date = "$3/$4/$5 $1:$2";
print "date is [$date]\n";
$trace_date= ParseDate( $date ) || die "Cannot parse date [$date]\n";
}
I'm guessing that the value in $4, which apparently is a string like 'Jan', 'Feb', and so on, isn't something that ParseDate likes.
Note that you should only use the capture variables after a successful pattern match, lest they be left over from a different match.
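That warning about capture variables is easy to demonstrate: a failed match leaves $1 holding whatever the last successful match put there.

```perl
use strict;
use warnings;

# A successful match sets $1 ...
"01_02_03_Jan_2014" =~ /^(\d\d)_(\d\d)_(\d\d)_(\w\w\w)_(\d\d\d\d)/;
print "after success, \$1 = $1\n";                    # after success, $1 = 01

# ... and a FAILED match leaves the captures untouched.
my $matched = "README.out" =~ /^(\d\d)_(\d\d)_(\d\d)_(\w\w\w)_(\d\d\d\d)/;
print "matched: ", ($matched ? "yes" : "no"), "\n";   # matched: no
print "but \$1 is still: $1\n";                       # but $1 is still: 01
```

This is exactly how a misnamed file in the directory ends up feeding a stale date string to ParseDate.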
However, I get an error 'Cannot parse date.'
You get the error due to the line:
$trace =~ /^(\d\d)_(\d\d)_(\d\d)_(\w\w\w)_(\d\d\d\d)/;
The script expects that all files in the directory with the extension .out have proper timestamps at the beginning of their names, and this line of the script lacks any error handling.
Try adding some check here, e.g.:
unless($trace =~ /^(\d\d)_(\d\d)_(\d\d)_(\w\w\w)_(\d\d\d\d)/) {
warn "WRN: Malformed file name: $trace\n";
next;
}
That checks whether the file name matches; if it doesn't, a warning is printed and the file is skipped.
Alternatively you can also add the check to the grep {} readdir() line:
@traces = grep { /.out$/ && /^(\d\d)_(\d\d)_(\d\d)_(\w\w\w)_(\d\d\d\d)/ && -d "$dir/$_" } readdir(DIR);
to filter out misplaced .out files (hm, actually directories) before they reach the loop which calls the ParseDate function.
