Reading specified line using Perl program and command-line arguments - linux

So, let's say I am writing a Perl program:
./program.perl 10000 < file
I want it to read the 10000th line of "file" only. How could I do it using input redirection in this form? It seems that I keep getting something along the lines of 10000 is not a file.
I thought this would work:
#!/usr/bin/perl -w
$line_num = 0;
while ( defined ($line = <>) && $line_num < $ARGV[0]) {
++$line_no;
if ($line_no == $ARGV[0]) {
print "$line\n";
exit 0;
}
}
But it failed miserably.

If there are command-line arguments, then <> opens the so-named files and reads from them, and if not, then it takes from standard-input. (See "I/O Operators" in the perlop man-page.)
If, as in your case, you want to read from standard-input whether or not there are command-line arguments, then you need to use <STDIN> instead:
while ( defined ($line = <STDIN>) && $line_num < $ARGV[0]) {

Obligatory one-liner:
perl -ne 'print if $. == 10000; exit if $. > 10000'
$. counts lines read from stdin. -n implicitly wraps program in:
while (<>) {
...program...
}

You could use Tie::File
use Tie::File;
my ($name, $num) = #ARGV;
tie my #file, 'Tie::File', $name or die $!;
print $file[$num];
untie #file;
Usage:
perl script.pl file.csv 10000

You could also do this very simply using awk:
awk 'NR==10000' < file
or sed:
sed -n '10000p' file

Related

Need to open a file and replace multiple strings

I have a really big xml file. It has certain incrementing numbers inside, which i would like to replace with a different incrementing number. I've looked and here is what someone suggested here before. Unfortunately i cant get it to work :(
In the code below all instances of 40960 should be replaced with 41984, all instances of 40961 with 41985 etc. Nothing happens. What am i doing wrong?
use strict;
use warnings;
my $old = 40960;
my $new = 41984;
my $string;
my $file = 'file.txt';
rename($file, $file.'.bak');
open(IN, '<'.$file.'.bak') or die $!;
open(OUT, '>'.$file) or die $!;
$old++;
$new++;
for (my $i = 0; $i < 42; $i++) {
while(<IN>) {
$_ =~ s/$old/$new/g;
print OUT $_;
}
}
close(IN);
close(OUT);
Other answers give you better solutions to your problem. Mine concentrates on explaining why your code didn't work.
The core of your code is here:
$old++;
$new++;
for (my $i = 0; $i < 42; $i++) {
while(<IN>) {
$_ =~ s/$old/$new/g;
print OUT $_;
}
}
You increment the values of $old and $new outside of your loops. And you never change those values again. So you're only making the same substitution (changing 40961 to 41985) 42 times. You never try to change any other numbers.
Also, look at the while loop that reads from IN. On your first iteration (when $i is 0) you read all of the data from IN and the file pointer is left at the end of the file. So when you go into the while loop again on your second iteration (and all subsequent iterations) you read no data at all from the file. You need to reset the file pointer to the start of your file at the end of each iteration.
Oh, and the basic logic is wrong. If you think about it, you'll end up writing each line to the output file 42 times. You need to do all possible substitutions before writing the line. So your inner loop needs to be the outer loop (and vice versa).
Putting those suggestions together, you need something like this:
my $old = 40960;
my $change = 1024;
while (<IN>) {
# Easier way to write your loop
for my $i ( 1 .. 42 ) {
my $new = $old + $change;
# Use \b to mark word boundaries
s/\b$old\b/$new/g;
$old++;
}
# Print each output line only once
print OUT $_;
}
Here's an example that works line by line, so the size of file is immaterial. The example assumes you want to replace things like "45678", but not "fred45678". The example also assumes that there is a range of numbers, and you want them replaced with a new range offset by a constant.
#!/usr/bin/perl
use strict;
use warnings;
use constant MIN => 40000;
use constant MAX => 90000;
use constant DIFF => +1024;
sub repl { $_[0] >= MIN && $_[0] <= MAX ? $_[0] + DIFF : $_[0] }
while (<>) {
s/\b(\d+)\b/repl($1)/eg;
print;
}
exit(0);
Invoked with the file you want to transform as an argument, it produces altered output on stdout. With the following input ...
foo bar 123
40000 50000 60000 99999
fred60000
fred 60000 fred
... it produces this output.
foo bar 123
41024 51024 61024 99999
fred60000
fred 61024 fred
There are a couple of classic Perlisms here, but the example shouldn't be hard to follow if you RTFM appropriately.
Here is an alternative way which reads the input file into a string and does all the substitutions at once:
use strict;
use warnings;
{
my $old = 40960;
my $new = 41984;
my ($regexp) = map { qr/$_/ } join '|', map { $old + $_ } 0..41;
my $file = 'file.txt';
rename($file, $file.'.bak');
open(IN, '<'.$file.'.bak') or die $!;
my $str = do {local $/; <IN>};
close IN;
$str =~ s/($regexp)/do_subst($1, $old, $new)/ge;
open(OUT, '>'.$file) or die $!;
print OUT $str;
close OUT;
}
sub do_subst {
my ( $old, $old_base, $new_base ) = #_;
my $i = $old - $old_base;
my $new = $new_base + $i;
return $new;
}
Note: Can probably be made more efficient by using Regexp::Assemble

perl shell command variable error

I am trying following code in one of my perl script and getting error, how do i execute following shell command and store in variable
#!/usr/bin/perl -w
my $p = $( PROCS=`echo /proc/[0-9]*|wc -w|tr -d ' '`; read L1 L2 L3 DUMMY < /proc/loadavg ; echo ${L1}:${L2}:${L3}:${PROCS} );
print $p;
Error:
./foo.pl
Bareword found where operator expected at /tmp/foo.pl line 3, near "$( PROCS"
(Missing operator before PROCS?)
syntax error at /tmp/foo.pl line 3, near "$( PROCS"
Unterminated <> operator at /tmp/foo.pl line 3.
What is wrong?
This:
my $p = $( PROCS=`echo /proc/[0-9]*|wc -w|tr -d ' '`; read L1 L2 L3 DUMMY < /proc/loadavg ; echo ${L1}:${L2}:${L3}:${PROCS} );
Isn't perl. It's how you'd execute a command in bash.
To run a command in perl you can:
use system.
put your command in backticks
qx (quote-execute): http://perldoc.perl.org/perlop.html#Quote-Like-Operators
However, you're enumerating a directory there, wordcounting, tr-ing and reading. So you don't actually need to do all that using a shell command. And indeed, I'd discourage you from doing so, because that's just a way to make a mess with no productive benefit.
Looks like what you're after as an end result is the 3 load average samples and a count of number of processes. Is that right?
In which case:
my $proc_count = scalar ( () = glob ( "/proc/[0-9]*" ));
open ( my $la, "<", "/proc/loadavg" ) or warn $!;
print join ( ":", split ( /\s+/, <$la> ), $proc_count ),"\n";
Something like that, anyway.
Simply printing a shell command in your Perl script won't actually execute it. You have to tell Perl that it's an external command, which you can do with system:
use strict;
use warnings;
my $command = q{
PROCS=`echo /proc/[0-9]*|wc -w|tr -d ' '`;
read L1 L2 L3 DUMMY < /proc/loadavg;
echo ${L1}:${L2}:${L3}:${PROCS}
};
system($command);
(Note that you should put use strict; use warnings; at the top of every Perl script you write.)
However, it's generally better to use native Perl functionality instead of system. All you're doing is reading from files, which Perl is perfectly capable of doing:
use strict;
use warnings;
use 5.010;
my #procs = glob '/proc/[0-9]*';
my $file = '/proc/loadavg';
open my $fh, '<', $file or die "Failed to open '$file': $!";
my $load = <$fh>;
say(join ':', (split ' ', $load)[0..2], scalar #procs);
Even better might be to use the Proc::ProcessTable module, which provides a consistent interface to the /proc filesystem across different flavors of *nix. It got some bad reviews early on but is supposedly getting bugfixes now; I haven't used it myself but you might take a look.

How to get Perl to loop over all files in a directory?

I have a Perl script with contains
open (FILE, '<', "$ARGV[0]") || die "Unable to open $ARGV[0]\n";
while (defined (my $line = <FILE>)) {
# do stuff
}
close FILE;
and I would like to run this script on all .pp files in a directory, so I have written a wrapper script in Bash
#!/bin/bash
for f in /etc/puppet/nodes/*.pp; do
/etc/puppet/nodes/brackets.pl $f
done
Question
Is it possible to avoid the wrapper script and have the Perl script do it instead?
Yes.
The for f in ...; translates to the Perl
for my $f (...) { ... } (in the case of lists) or
while (my $f = ...) { ... } (in the case of iterators).
The glob expression that you use (/etc/puppet/nodes/*.pp) can be evaluated inside Perl via the glob function: glob '/etc/puppet/nodes/*.pp'.
Together with some style improvements:
use strict; use warnings;
use autodie; # automatic error handling
while (defined(my $file = glob '/etc/puppet/nodes/*.pp')) {
open my $fh, "<", $file; # lexical file handles, automatic error handling
while (defined( my $line = <$fh> )) {
do stuff;
}
close $fh;
}
Then:
$ /etc/puppet/nodes/brackets.pl
This isn’t quite what you asked, but another possibility is to use <>:
while (<>) {
my $line = $_;
# do stuff
}
Then you would put the filenames on the command line, like this:
/etc/puppet/nodes/brackets.pl /etc/puppet/nodes/*.pp
Perl opens and closes each file for you. (Inside the loop, the current filename and line number are $ARGV and $. respectively.)
Jason Orendorff has the right answer:
From Perlop (I/O Operators)
The null filehandle <> is special: it can be used to emulate the behavior of sed and awk, and any other Unix filter program that takes a list of filenames, doing the same to each line of input from all of them. Input from <> comes either from standard input, or from each file listed on the command line.
This doesn't require opendir. It doesn't require using globs or hard coding stuff in your program. This is the natural way to read in all files that are found on the command line, or piped from STDIN into the program.
With this, you could do:
$ myprog.pl /etc/puppet/nodes/*.pp
or
$ myprog.pl /etc/puppet/nodes/*.pp.backup
or even:
$ cat /etc/puppet/nodes/*.pp | myprog.pl
take a look at this documentation it explains all you need to know
#!/usr/bin/perl
use strict;
use warnings;
my $dir = '/tmp';
opendir(DIR, $dir) or die $!;
while (my $file = readdir(DIR)) {
# We only want files
next unless (-f "$dir/$file");
# Use a regular expression to find files ending in .pp
next unless ($file =~ m/\.pp$/);
open (FILE, '<', $file) || die "Unable to open $file\n";
while (defined (my $line = <FILE>)) {
# do stuff
}
}
closedir(DIR);
exit 0;
I would suggest to put all filenames to array and then use this array as parameters list to your perl method or script. Please see following code:
use Data::Dumper
$dirname = "/etc/puppet/nodes";
opendir ( DIR, $dirname ) || die "Error in opening dir $dirname\n";
my #files = grep {/.*\.pp/} readdir(DIR);
print Dumper(#files);
closedir(DIR);
Now you can pass \#files as parameter to any perl method.
my #x = <*>;
foreach ( #x ) {
chomp;
if ( -f "$_" ) {
print "process $_\n";
# do stuff
next;
};
};
Perl can shell out to execute system commands in various ways, the most straightforward is using backticks ``
use strict;
use warnings FATAL => 'all';
my #ls = `ls /etc/puppet/nodes/*.pp`;
for my $f ( #ls ) {
open (my $FILE, '<', $f) || die "Unable to open $f\n";
while (defined (my $line = <$FILE>)) {
# do stuff
}
close $FILE;
}
(Note: you should always use strict; and use warnings;)

How can I consolidate several Perl one-liners into a single script?

I would like to move several one liners into a single script.
For example:
perl -i.bak -pE "s/String_ABC/String_XYZ/g" Cities.Txt
perl -i.bak -pE "s/Manhattan/New_England/g" Cities.Txt
Above works well for me but at the expense of two disk I/O operations.
I would like to move the aforementioned logic into a single script so that all substitutions are effectuated with the file opened and edited only once.
EDIT1: Based on your recommendations, I wrote this snippet in a script which when invoked from a windows batch file simply hangs:
#!/usr/bin/perl -i.bak -p Cities.Txt
use strict;
use warnings;
while( <> ){
s/String_ABC/String_XYZ/g;
s/Manhattan/New_England/g;
print;
}
EDIT2: OK, so here is how I implemented your recommendation. Works like a charm!
Batch file:
perl -i.bal MyScript.pl Cities.Txt
MyScript.pl
#!/usr/bin/perl
use strict;
use warnings;
while( <> ){
s/String_ABC/String_XYZ/g;
s/Manhattan/New_England/g;
print;
}
Thanks a lot to everyone that contributed.
The -p wraps the argument to -E with:
while( <> ) {
# argument to -E
print;
}
So, take all the arguments to -E and put them in the while:
while( <> ) {
s/String_ABC/String_XYZ/g;
s/Manhattan/New_England/g;
print;
}
The -i sets the $^I variable, which turns on some special magic handling ARGV:
$^I = "bak";
The -E turns on the new features for that versions of Perl. You can do that by just specifying the version:
use v5.10;
However, you don't use anything loaded with that, at least in what you've shown us.
If you want to see everything a one-liner does, put a -MO=Deparse in there:
% perl -MO=Deparse -i.bak -pE "s/Manhattan/New_England/g" Cities.Txt
BEGIN { $^I = ".bak"; }
BEGIN {
$^H{'feature_unicode'} = q(1);
$^H{'feature_say'} = q(1);
$^H{'feature_state'} = q(1);
$^H{'feature_switch'} = q(1);
}
LINE: while (defined($_ = <ARGV>)) {
s/Manhattan/New_England/g;
}
continue {
die "-p destination: $!\n" unless print $_;
}
-e syntax OK
You can put arguments on the #! line. Perl will read them, even on Windows.
#!/usr/bin/perl -i.bak -p
s/String_ABC/String_XYZ/g;
s/Manhattan/New_England/g;
or you can keep it a one-liner as #ephemient said in the comments.
perl -i.bak -pE "s/String_ABC/String_XYZ/g; s/Manhattan/New_England/g" Cities.Txt
-i + -p basically puts a while loop around your program. Each line comes in as $_, your code runs, and $_ is printed out at the end. Repeat. So you can have as many statements as you want.

Perl: adding a string to $_ is producing strange results

I wrote a super simple script:
#!/usr/bin/perl -w
use strict;
open (F, "<ids.txt") || die "fail: $!\n";
my #ids = <F>;
foreach my $string (#ids) {
chomp($string);
print "$string\n";
}
close F;
This is producing an expected output of all the contents of ids.txt:
hello
world
these
annoying
sourcecode
lines
Now I want to add a file-extension: .txt for every line. This line should do the trick:
#!/usr/bin/perl -w
use strict;
open (F, "<ids.txt") || die "fail: $!\n";
my #ids = <F>;
foreach my $string (#ids) {
chomp($string);
$string .= ".txt";
print "$string\n";
}
close F;
But the result is as follows:
.txto
.txtd
.txte
.txtying
.txtcecode
Instead of appending ".txt" to my lines, the first 4 letters of my string will be replaced by ".txt" Since I want to check if some files exist, I need the full filename with extension.
I have tried to chop, chomp, to substitute (s/\n//), joins and whatever. But the result is still a replacement instead of an append.
Where is the mistake?
Chomp does not remove BOTH \r and \n if the file has DOS line endings and you are running on Linux/Unix.
What you are seeing is actually the original string, a carriage return, and the extension, which overwrites the first 4 characters on the display.
If the incoming file has DOS/Windows line endings you must remove both:
s/\R+$//
A useful debugging technique when you are not quite sure why your data is getting set to what it is is to dump it with Data::Dumper:
#!/usr/bin/perl -w
use strict;
use Data::Dumper ();
$Data::Dumper::Useqq = 1; # important to be able to actually see differences in whitespace, etc
open (F, "<ids.txt") || die "fail: $!\n";
my #ids = <F>;
foreach my $string (#ids) {
chomp($string);
print "$string\n";
print Data::Dumper::Dumper( { 'string' => $string } );
}
close F;
have you tried this?
foreach my $string (#ids) {
chomp($string);
print $string.".txt\n";
}
I'm not sure what's wrong with your code though. these results are strange

Resources