How can I consolidate several Perl one-liners into a single script?

I would like to move several one-liners into a single script.
For example:
perl -i.bak -pE "s/String_ABC/String_XYZ/g" Cities.Txt
perl -i.bak -pE "s/Manhattan/New_England/g" Cities.Txt
Above works well for me but at the expense of two disk I/O operations.
I would like to move the aforementioned logic into a single script so that all substitutions are effectuated with the file opened and edited only once.
EDIT1: Based on your recommendations, I wrote this snippet in a script, which simply hangs when invoked from a Windows batch file:
#!/usr/bin/perl -i.bak -p Cities.Txt
use strict;
use warnings;
while( <> ){
    s/String_ABC/String_XYZ/g;
    s/Manhattan/New_England/g;
    print;
}
EDIT2: OK, so here is how I implemented your recommendation. Works like a charm!
Batch file:
perl -i.bak MyScript.pl Cities.Txt
MyScript.pl:
#!/usr/bin/perl
use strict;
use warnings;
while( <> ){
    s/String_ABC/String_XYZ/g;
    s/Manhattan/New_England/g;
    print;
}
Thanks a lot to everyone that contributed.

The -p wraps the argument to -E with:
while( <> ) {
    # argument to -E
    print;
}
So, take all the arguments to -E and put them in the while:
while( <> ) {
    s/String_ABC/String_XYZ/g;
    s/Manhattan/New_England/g;
    print;
}
The -i sets the $^I variable, which turns on some special magic in the handling of ARGV:
$^I = ".bak";
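Since $^I is just a variable, you can also set it (together with @ARGV) from inside a script and skip the command-line switches entirely. A minimal sketch, assuming the same Cities.Txt file from the question:
#!/usr/bin/perl
use strict;
use warnings;
# Emulate -i.bak by hand: with $^I set and @ARGV populated, the <> loop
# below edits the file in place and leaves a Cities.Txt.bak backup behind.
$^I   = '.bak';
@ARGV = ('Cities.Txt');
while (<>) {
    s/String_ABC/String_XYZ/g;
    s/Manhattan/New_England/g;
    print;
}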
The -E turns on the new features for that version of Perl. You can do the same thing by just specifying the version:
use v5.10;
However, you don't use anything loaded with that, at least in what you've shown us.
If you want to see everything a one-liner does, put a -MO=Deparse in there:
% perl -MO=Deparse -i.bak -pE "s/Manhattan/New_England/g" Cities.Txt
BEGIN { $^I = ".bak"; }
BEGIN {
    $^H{'feature_unicode'} = q(1);
    $^H{'feature_say'} = q(1);
    $^H{'feature_state'} = q(1);
    $^H{'feature_switch'} = q(1);
}
LINE: while (defined($_ = <ARGV>)) {
    s/Manhattan/New_England/g;
}
continue {
    die "-p destination: $!\n" unless print $_;
}
-e syntax OK

You can put arguments on the #! line. Perl will read them, even on Windows.
#!/usr/bin/perl -i.bak -p
s/String_ABC/String_XYZ/g;
s/Manhattan/New_England/g;
or you can keep it a one-liner, as @ephemient said in the comments:
perl -i.bak -pE "s/String_ABC/String_XYZ/g; s/Manhattan/New_England/g" Cities.Txt
-i + -p basically puts a while loop around your program. Each line comes in as $_, your code runs, and $_ is printed out at the end. Repeat. So you can have as many statements as you want.

Related

Perl script to search a word inside the directory

I'm looking for a Perl script to grep for a string in all files inside a directory. The equivalent bash command:
Code:
grep -r 'word' /path/to/dir
This is a fairly canonical task, yet I couldn't find a straight answer using what is possibly the easiest and simplest tool for the job, the handy Path::Tiny:
use warnings;
use strict;
use feature 'say';
use Data::Dump; # dd
use Path::Tiny; # path
my $dir = shift // '.';
my $pattern = qr/word/;
my $ret = path($dir)->visit(
    sub {
        my ($entry, $state) = @_;
        return if not -f;
        for ($entry->lines) {
            if (/$pattern/) {
                print "$entry: $_";
                push @{$state->{$entry}}, $_;
            }
        }
    },
    { recurse => 1 }
);
dd $ret; # print the returned complex data structure
The way a file is read here, using lines, is just one way to do that. It may not be suitable for extremely large files, since it reads all lines at once; for those it is better to read line by line.
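If memory is a concern, a line-by-line variant of the same callback can use Path::Tiny's openr method, which returns a read filehandle. A sketch, under the same $pattern and $state as above:
# Inside the visit callback: read line by line instead of slurping.
my $fh = $entry->openr;
while ( my $line = <$fh> ) {
    if ( $line =~ /$pattern/ ) {
        print "$entry: $line";
        push @{ $state->{$entry} }, $line;
    }
}
close $fh;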
The visit method is based on the iterator method, which accomplishes this task cleanly as well:
my $iter = path($dir)->iterator({ recurse => 1 });
my $info;
while (my $e = $iter->()) {
    next if not -f $e;
    # process the file $e as needed
    # /$pattern/ and push @{$info->{$e}}, $_ and print "$e: $_"
    #     for $e->lines;
}
Here we have to provide a data structure to accumulate information, but we get more flexibility.
The -f filetest used above, for a "plain" file, is still somewhat permissive; it allows swap files, for example, which some editors keep during a session (vim, for instance). Those will result in all kinds of matches. To stay with purely ASCII or UTF-8 files, use the -T test.
Otherwise, there are libraries for recursive traversal and searching, for example File::Find (or File::Find::Rule) or Path::Iterator::Rule.
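For a taste of how compact those can be, here is a sketch with File::Find::Rule, assuming the same $dir and $pattern as above:
use File::Find::Rule;
# Collect every plain file under $dir, then grep each one for the pattern.
for my $file ( File::Find::Rule->file->in($dir) ) {
    open my $fh, '<', $file or next;
    while (<$fh>) {
        print "$file: $_" if /$pattern/;
    }
    close $fh;
}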
For completeness, here is a take with the core File::Find
use warnings;
use strict;
use feature 'say';
use File::Find;
my @dirs = @ARGV ? @ARGV : '.';
my $pattern = qr/word/;
my %res;
find( sub {
    return if not -T;  # ASCII or UTF-8 only
    open my $fh, '<', $_ or do {
        warn "Error opening $File::Find::name: $!";
        return;
    };
    while (<$fh>) {
        if (/$pattern/) {
            chomp;
            push @{$res{$File::Find::name}}, $_;
        }
    }
}, @dirs
);
for my $k (keys %res) {
    say "In file $k:";
    say "\t$_" for @{$res{$k}};
}

finding a file in directory using perl script

I'm trying to develop a Perl script that looks through all of the user's directories for a particular file name, without the user having to specify the entire pathname to the file.
For example, let's say the file of interest is data.list, located in /home/path/directory/project/userabc/data.list. At the command line, the user would normally have to specify the full pathname in order to access it, like so:
cd /home/path/directory/project/userabc/data.list
Instead, I want the user to just enter script.pl ABC at the command line; the Perl script should then run automatically and retrieve the information in data.list, which in my case means counting the number of lines and uploading the count using curl. The rest is done; all that's missing is the part that automatically locates the file.
Even though this is very feasible in Perl, it looks more appropriate for Bash:
#!/bin/bash
filename=$(find ~ -name "$1" )
wc -l "$filename"
curl .......
The main issue would of course be if you have multiple files with the same name, say /home/user/dir1/data1 and /home/user/dir2/data1. You will need a way to handle that, and how you handle it depends on your specific situation.
In Perl that would be much more complicated:
#! /usr/bin/perl -w
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
    if 0; # $running_under_some_shell
use strict;
# Import the module File::Find, which will do all the real work
use File::Find ();
# Set the variable $File::Find::dont_use_nlink if you're using AFS,
# since AFS cheats.
# for the convenience of &wanted calls, including -eval statements:
# Here, we "import" specific variables from the File::Find module
# The purpose is to be able to just type '$name' instead of the
# complete '$File::Find::name'.
use vars qw/*name *dir *prune/;
*name = *File::Find::name;
*dir = *File::Find::dir;
*prune = *File::Find::prune;
# We declare the sub here; the content of the sub will be created later.
sub wanted;
# This is a simple way to get the first argument. There is no
# checking on validity.
our $filename=$ARGV[0];
# Traverse desired filesystem. /home is the top directory where we
# start our search. The sub wanted will be executed for every file
# we find.
File::Find::find({wanted => \&wanted}, '/home');
exit;
sub wanted {
    # Check if the file is our desired filename
    if ( /^$filename\z/ ) {
        # Open the file, read it and count its lines
        my $lines = 0;
        open(my $F, '<', $name) or die "Cannot open $name";
        while (<$F>) { $lines++; }
        print("$name: $lines\n");
        # Your curl command here
    }
}
You will need to look at the argument parsing, for which I simply used $ARGV[0], and I don't know what your curl command looks like.
A simpler (though not recommended) way would be to abuse Perl as a sort of shell:
#!/usr/bin/perl
my $fn = `find /home -name '$ARGV[0]'`;
chomp $fn;
my $wc = `wc -l '$fn'`;
print "$wc\n";
system("your curl command");
The following code snippet demonstrates one of many ways to achieve the desired result.
The code takes one parameter, a word to look for inside files named data.list in all subdirectories, and prints a list of the files found to the terminal.
The code uses the subroutine lookup($dir,$filename,$search), which calls itself recursively whenever it comes across a subdirectory.
The search starts from the current working directory (the question did not specify a starting directory).
use strict;
use warnings;
use feature 'say';
my $search = shift || die "Specify what to look for";
my $fname = 'data.list';
my $found = lookup('.',$fname,$search);
if( @$found ) {
    say for @$found;
} else {
    say 'Not found';
}
exit 0;
sub lookup {
    my $dir    = shift;
    my $fname  = shift;
    my $search = shift;
    my $files;
    my @items = glob("$dir/*");
    for my $item (@items) {
        if( -f $item && $item =~ /\b$fname\b/ ) {
            my $found;
            open my $fh, '<', $item or die $!;
            while( my $line = <$fh> ) {
                $found = 1 if $line =~ /\b$search\b/;
                if( $found ) {
                    push @{$files}, $item;
                    last;
                }
            }
            close $fh;
        }
        if( -d $item ) {
            my $ret = lookup($item, $fname, $search);
            push @{$files}, $_ for @$ret;
        }
    }
    return $files;
}
Run as script.pl search_word
Output sample
./capacitor/data.list
./examples/data.list
./examples/test/data.list
Reference:
glob,
Perl file test operators

perl shell command variable error

I am trying the following code in one of my Perl scripts and getting an error. How do I execute the following shell command and store its output in a variable?
#!/usr/bin/perl -w
my $p = $( PROCS=`echo /proc/[0-9]*|wc -w|tr -d ' '`; read L1 L2 L3 DUMMY < /proc/loadavg ; echo ${L1}:${L2}:${L3}:${PROCS} );
print $p;
Error:
./foo.pl
Bareword found where operator expected at /tmp/foo.pl line 3, near "$( PROCS"
(Missing operator before PROCS?)
syntax error at /tmp/foo.pl line 3, near "$( PROCS"
Unterminated <> operator at /tmp/foo.pl line 3.
What is wrong?
This:
my $p = $( PROCS=`echo /proc/[0-9]*|wc -w|tr -d ' '`; read L1 L2 L3 DUMMY < /proc/loadavg ; echo ${L1}:${L2}:${L3}:${PROCS} );
isn't Perl; it's how you'd execute a command in bash.
To run a command in Perl, you can:
use system,
put your command in backticks, or
use qx (quote-execute): http://perldoc.perl.org/perlop.html#Quote-Like-Operators
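A quick sketch of all three, with a trivial echo standing in for your command:
use strict;
use warnings;
# system: runs the command and returns its exit status;
# the command's output goes straight to the terminal.
system('echo hello from system');
# backticks: capture the command's standard output as a string.
my $out = `echo hello from backticks`;
print $out;
# qx//: the same as backticks, just a different quoting syntax.
my $out2 = qx(echo hello from qx);
print $out2;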
However, you're enumerating a directory there, word-counting, tr-ing and reading, and you don't actually need a shell command for any of that. Indeed, I'd discourage you from using one, because it just makes a mess with no productive benefit.
Looks like what you're after as an end result is the 3 load-average samples and a count of the number of processes. Is that right?
In which case:
my $proc_count = scalar ( () = glob ( "/proc/[0-9]*" ));
open ( my $la, "<", "/proc/loadavg" ) or warn $!;
print join ( ":", split ( /\s+/, <$la> ), $proc_count ),"\n";
Something like that, anyway.
Simply printing a shell command in your Perl script won't actually execute it. You have to tell Perl that it's an external command, which you can do with system:
use strict;
use warnings;
my $command = q{
    PROCS=`echo /proc/[0-9]*|wc -w|tr -d ' '`;
    read L1 L2 L3 DUMMY < /proc/loadavg;
    echo ${L1}:${L2}:${L3}:${PROCS}
};
system($command);
(Note that you should put use strict; use warnings; at the top of every Perl script you write.)
However, it's generally better to use native Perl functionality instead of system. All you're doing is reading from files, which Perl is perfectly capable of doing:
use strict;
use warnings;
use 5.010;
my @procs = glob '/proc/[0-9]*';
my $file = '/proc/loadavg';
open my $fh, '<', $file or die "Failed to open '$file': $!";
my $load = <$fh>;
say(join ':', (split ' ', $load)[0..2], scalar @procs);
Even better might be to use the Proc::ProcessTable module, which provides a consistent interface to the /proc filesystem across different flavors of *nix. It got some bad reviews early on but is supposedly getting bugfixes now; I haven't used it myself but you might take a look.
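For reference, a minimal sketch of the process count via the module's documented new/table interface; treat it as a starting point rather than tested code:
use strict;
use warnings;
use Proc::ProcessTable;
# table() returns an array reference with one entry per running process.
my $t = Proc::ProcessTable->new;
print scalar @{ $t->table }, "\n";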

How to get Perl to loop over all files in a directory?

I have a Perl script with contains
open (FILE, '<', "$ARGV[0]") || die "Unable to open $ARGV[0]\n";
while (defined (my $line = <FILE>)) {
    # do stuff
}
close FILE;
and I would like to run this script on all .pp files in a directory, so I have written a wrapper script in Bash
#!/bin/bash
for f in /etc/puppet/nodes/*.pp; do
    /etc/puppet/nodes/brackets.pl $f
done
Question
Is it possible to avoid the wrapper script and have the Perl script do it instead?
Yes.
The for f in ...; translates to the Perl
for my $f (...) { ... } (in the case of lists) or
while (my $f = ...) { ... } (in the case of iterators).
The glob expression that you use (/etc/puppet/nodes/*.pp) can be evaluated inside Perl via the glob function: glob '/etc/puppet/nodes/*.pp'.
Together with some style improvements:
use strict; use warnings;
use autodie; # automatic error handling
while (defined( my $file = glob '/etc/puppet/nodes/*.pp' )) {
    open my $fh, "<", $file;  # lexical file handles, automatic error handling
    while (defined( my $line = <$fh> )) {
        # do stuff
    }
    close $fh;
}
Then:
$ /etc/puppet/nodes/brackets.pl
This isn’t quite what you asked, but another possibility is to use <>:
while (<>) {
    my $line = $_;
    # do stuff
}
Then you would put the filenames on the command line, like this:
/etc/puppet/nodes/brackets.pl /etc/puppet/nodes/*.pp
Perl opens and closes each file for you. (Inside the loop, the current filename and line number are $ARGV and $. respectively.)
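A small sketch that uses both, printing a grep -n style prefix; note that $. does not reset between files on its own, so the usual idiom is to close ARGV at each end-of-file:
#!/usr/bin/perl
use strict;
use warnings;
while (my $line = <>) {
    print "$ARGV:$.: $line";
} continue {
    close ARGV if eof;  # resets $. at the end of each file
}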
Jason Orendorff has the right answer:
From perlop ("I/O Operators"):
The null filehandle <> is special: it can be used to emulate the behavior of sed and awk, and any other Unix filter program that takes a list of filenames, doing the same to each line of input from all of them. Input from <> comes either from standard input, or from each file listed on the command line.
This doesn't require opendir. It doesn't require using globs or hard coding stuff in your program. This is the natural way to read in all files that are found on the command line, or piped from STDIN into the program.
With this, you could do:
$ myprog.pl /etc/puppet/nodes/*.pp
or
$ myprog.pl /etc/puppet/nodes/*.pp.backup
or even:
$ cat /etc/puppet/nodes/*.pp | myprog.pl
Take a look at the documentation for opendir and readdir; it explains all you need to know:
#!/usr/bin/perl
use strict;
use warnings;
my $dir = '/tmp';
opendir(DIR, $dir) or die $!;
while (my $file = readdir(DIR)) {
    # We only want files
    next unless (-f "$dir/$file");
    # Use a regular expression to find files ending in .pp
    next unless ($file =~ m/\.pp$/);
    # readdir returns bare names, so prepend the directory
    open (FILE, '<', "$dir/$file") || die "Unable to open $dir/$file\n";
    while (defined (my $line = <FILE>)) {
        # do stuff
    }
    close FILE;
}
closedir(DIR);
exit 0;
I would suggest putting all the filenames into an array and then using that array as the parameter list for your Perl method or script. Please see the following code:
use strict;
use warnings;
use Data::Dumper;
my $dirname = "/etc/puppet/nodes";
opendir( DIR, $dirname ) || die "Error in opening dir $dirname\n";
my @files = grep { /\.pp$/ } readdir(DIR);
print Dumper(\@files);
closedir(DIR);
Now you can pass \@files as a parameter to any Perl method.
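For instance, a hypothetical process_files() consumer (the sub name is just for illustration, not part of any module):
sub process_files {
    # Takes an array reference of bare file names and processes each one;
    # $dirname from above is prepended because readdir returns bare names.
    my ($files) = @_;
    for my $file (@$files) {
        print "processing $dirname/$file\n";
    }
}
process_files(\@files);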
my @x = <*>;
foreach (@x) {
    chomp;
    if ( -f $_ ) {
        print "process $_\n";
        # do stuff
        next;
    }
}
Perl can shell out to execute system commands in various ways, the most straightforward is using backticks ``
use strict;
use warnings FATAL => 'all';
my @ls = `ls /etc/puppet/nodes/*.pp`;
for my $f (@ls) {
    chomp $f;  # each line from ls ends with a newline
    open (my $FILE, '<', $f) || die "Unable to open $f\n";
    while (defined (my $line = <$FILE>)) {
        # do stuff
    }
    close $FILE;
}
(Note: you should always use strict; and use warnings;)

Reading specified line using Perl program and command-line arguments

So, let's say I am writing a Perl program:
./program.perl 10000 < file
I want it to read the 10000th line of "file" only. How could I do it using input redirection in this form? It seems that I keep getting something along the lines of "10000 is not a file".
I thought this would work:
#!/usr/bin/perl -w
$line_num = 0;
while ( defined ($line = <>) && $line_num < $ARGV[0] ) {
    ++$line_num;
    if ($line_num == $ARGV[0]) {
        print "$line\n";
        exit 0;
    }
}
But it failed miserably.
If there are command-line arguments, then <> opens the so-named files and reads from them, and if not, then it takes from standard-input. (See "I/O Operators" in the perlop man-page.)
If, as in your case, you want to read from standard-input whether or not there are command-line arguments, then you need to use <STDIN> instead:
while ( defined ($line = <STDIN>) && $line_num < $ARGV[0]) {
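Putting it together, a minimal sketch of the corrected script under the original usage (./program.perl 10000 < file):
#!/usr/bin/perl
use strict;
use warnings;
# The line number comes from @ARGV; the data comes from standard input,
# so <STDIN> never tries to open "10000" as a file.
my $target = $ARGV[0];
my $line_num = 0;
while ( defined(my $line = <STDIN>) ) {
    ++$line_num;
    if ($line_num == $target) {
        print $line;
        exit 0;
    }
}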
Obligatory one-liner:
perl -ne 'print if $. == 10000; exit if $. > 10000'
$. counts lines read from stdin. -n implicitly wraps program in:
while (<>) {
    ...program...
}
You could use Tie::File
use Tie::File;
my ($name, $num) = @ARGV;
tie my @file, 'Tie::File', $name or die $!;
print $file[$num - 1], "\n";  # line numbers are 1-based, the array is 0-based
untie @file;
Usage:
perl script.pl file.csv 10000
You could also do this very simply using awk:
awk 'NR==10000' < file
or sed:
sed -n '10000p' file
