Replace character with other character in a text file using perl - linux

I am having a problem parsing the output from a text file. I want to add a pipe symbol between the strings so I can do a multiple-pattern search similar to egrep. The text file is as follows:
service entered the stopped state,critical
service entered the running state,clear
Code:
open(my $data, '<', $Config_File) or die "Could not open '$Config_File'";
my $reg_exp;
my $severity;
my @fields=();
while (my $line = <$data>)
{
chomp $line;
if(!$line =~ /^$/)
{
@fields = split "," , $line;
$reg_exp = $fields[0];
$severity = $fields[1];
print $reg_exp;
}
}
#print $fields[0];
#last unless defined $line;
close($data);
expected output
service entered the stopped state|service entered the running state

You are not far off, you just need to actually concatenate the strings. The simplest way would be to push the $fields[0] to an array, and wait until the input is done to print it. I.e.:
my @data;
while (my $line = <$data>) {
next if $line =~ /^$/; # no need to chomp
my @fields = split /,/, $line;
push @data, $fields[0];
}
print join("|", @data), "\n";
I sense that you are trying to achieve something else with this code, and that this is a so-called XY-problem.

Related

Need to open a file and replace multiple strings

I have a really big XML file. It contains certain incrementing numbers, which I would like to replace with different incrementing numbers. I've looked around, and below is what someone suggested here before. Unfortunately I can't get it to work :(
In the code below, all instances of 40960 should be replaced with 41984, all instances of 40961 with 41985, etc. Nothing happens. What am I doing wrong?
use strict;
use warnings;
my $old = 40960;
my $new = 41984;
my $string;
my $file = 'file.txt';
rename($file, $file.'.bak');
open(IN, '<'.$file.'.bak') or die $!;
open(OUT, '>'.$file) or die $!;
$old++;
$new++;
for (my $i = 0; $i < 42; $i++) {
while(<IN>) {
$_ =~ s/$old/$new/g;
print OUT $_;
}
}
close(IN);
close(OUT);
Other answers give you better solutions to your problem. Mine concentrates on explaining why your code didn't work.
The core of your code is here:
$old++;
$new++;
for (my $i = 0; $i < 42; $i++) {
while(<IN>) {
$_ =~ s/$old/$new/g;
print OUT $_;
}
}
You increment the values of $old and $new outside of your loops. And you never change those values again. So you're only making the same substitution (changing 40961 to 41985) 42 times. You never try to change any other numbers.
Also, look at the while loop that reads from IN. On your first iteration (when $i is 0) you read all of the data from IN and the file pointer is left at the end of the file. So when you go into the while loop again on your second iteration (and all subsequent iterations) you read no data at all from the file. You need to reset the file pointer to the start of your file at the end of each iteration.
Oh, and the basic logic is wrong. If you think about it, you'll end up writing each line to the output file 42 times. You need to do all possible substitutions before writing the line. So your inner loop needs to be the outer loop (and vice versa).
Putting those suggestions together, you need something like this:
my $change = 1024;
while (<IN>) {
# Start from the first number again on every line
my $old = 40960;
# Easier way to write your loop
for my $i ( 1 .. 42 ) {
my $new = $old + $change;
# Use \b to mark word boundaries
s/\b$old\b/$new/g;
$old++;
}
# Print each output line only once
print OUT $_;
}
Here's an example that works line by line, so the size of file is immaterial. The example assumes you want to replace things like "45678", but not "fred45678". The example also assumes that there is a range of numbers, and you want them replaced with a new range offset by a constant.
#!/usr/bin/perl
use strict;
use warnings;
use constant MIN => 40000;
use constant MAX => 90000;
use constant DIFF => +1024;
sub repl { $_[0] >= MIN && $_[0] <= MAX ? $_[0] + DIFF : $_[0] }
while (<>) {
s/\b(\d+)\b/repl($1)/eg;
print;
}
exit(0);
Invoked with the file you want to transform as an argument, it produces altered output on stdout. With the following input ...
foo bar 123
40000 50000 60000 99999
fred60000
fred 60000 fred
... it produces this output.
foo bar 123
41024 51024 61024 99999
fred60000
fred 61024 fred
There are a couple of classic Perlisms here, but the example shouldn't be hard to follow if you RTFM appropriately.
Here is an alternative way which reads the input file into a string and does all the substitutions at once:
use strict;
use warnings;
{
my $old = 40960;
my $new = 41984;
my ($regexp) = map { qr/$_/ } join '|', map { $old + $_ } 0..41;
my $file = 'file.txt';
rename($file, $file.'.bak');
open(IN, '<'.$file.'.bak') or die $!;
my $str = do {local $/; <IN>};
close IN;
$str =~ s/($regexp)/do_subst($1, $old, $new)/ge;
open(OUT, '>'.$file) or die $!;
print OUT $str;
close OUT;
}
sub do_subst {
my ( $old, $old_base, $new_base ) = @_;
my $i = $old - $old_base;
my $new = $new_base + $i;
return $new;
}
Note: Can probably be made more efficient by using Regexp::Assemble

How to read "<somestring>" in input string in perl

Below is my code. It still produces same string with no "<init>"
input string :
1: invokespecial #1 // Method java/lang/Object."<init>":()V
my $file = "Hello.javap";
open my $fh, '<', $file or die "Could not open '$file' $!";
while (my $line = <$fh>) {
if (index(uc($line), uc("Code:")) != -1) {
$code_block_started=1;
}
if(index($line,":")==-1)
{
if (my ($method) = $line =~ /.* \/\/ Method (.*);/) {
print "Method: $method\n";
}
print $line;
$code_block_started=0;
}
if($code_block_started){
if ($line =~/[0-9]/) {
my @num_strip = split(':',$line);
my @get_command = split(' ',$num_strip[1]);
# print "\n $get_command[0]";
$count{$get_command[0]}++;
}
}
Are you simply asking how to escape the " in perl? If so, write \"<init>\" just like in most languages.
Are you asking for a regular expression? If so, $str =~ /.* \/\/ Method (.*)/ will put java/lang/Object."<init>":()V into $1.
while (my $str = <>) {
if (my ($method) = $str =~ m{// Method (.*)}) {
print "$method\n";
}
}
When Perl sees the double-quote just before the word "name" it thinks that was the end of the string, and then it complains about the word name being a bareword.
You might have already guessed, we need to escape the embedded " character:
use strict;
use warnings;
my $name = 'foo';
print "The \"name\" is \"$name\"\n";
http://perlmaven.com/quoted-interpolated-and-escaped-strings-in-perl

Perl Inserting a string from a file after every occurence of a slash in a url

I have the following URLs:
FILE1.txt
http://www.stackoveflow.com/dog/cat/rabbit/hamster/
192.168.192.168/lion/tiger/elephant/
FILE2.txt
HELLO
GOODBYE
The output I am trying to achieve:
http://www.stackoveflow.com/dogHELLO/cat/rabbit/hamster/
http://www.stackoveflow.com/dog/catHELLO/rabbit/hamster/
http://www.stackoveflow.com/dog/cat/rabbitHELLO/hamster/
http://www.stackoveflow.com/dog/cat/rabbit/hamsterHELLO/
http://www.stackoveflow.com/dog/cat/rabbit/hamster/HELLO
http://www.stackoveflow.com/dogGOODBYE/cat/rabbit/hamster/
http://www.stackoveflow.com/dog/catGOODBYE/rabbit/hamster/
http://www.stackoveflow.com/dog/cat/rabbitGOODBYE/hamster/
http://www.stackoveflow.com/dog/cat/rabbit/hamsterGOODBYE/
http://www.stackoveflow.com/dog/cat/rabbit/hamster/GOODBYE
192.168.192.168/lionHELLO/tiger/elephant/
192.168.192.168/lion/tigerHELLO/elephant/
192.168.192.168/lion/tiger/elephantHELLO/
192.168.192.168/lion/tiger/elephant/HELLO
192.168.192.168/lionGOODBYE/tiger/elephant/
192.168.192.168/lion/tigerGOODBYE/elephant/
192.168.192.168/lion/tiger/elephantGOODBYE/
192.168.192.168/lion/tiger/elephant/GOODBYE
As you can see the strings HELLO and GOODBYE are inserted after every slash, and if there is already a string after the slash it will append the HELLO and GOODBYE after that (e.g http://www.stackoveflow.com/dogHELLO/cat/rabbit/hamster/ and so on).
What I have tried
use strict;
use warnings;
my @f1 = do {
open my $fh, '<', 'FILE1.txt';
<$fh>;
};
chomp @f1;
my @f2 = do {
open my $fh, '<', 'FILE2.txt';
<$fh>;
};
chomp @f2;
for my $f1 (@f1) {
my @fields = $f1 =~ m{[^/]+}g;
for my $f2 (@f2) {
for my $i (0 .. $#fields) {
my @new = @fields;
$new[$i] .= $f2;
print qq{/$_/\n}, for join '/', @new;
}
print "\n\n";
}
}
#courtesy of Borodin
However, this code does not cater for URLs that have slashes in the http:// part, as these become http:HELLO/ when they should be left alone.
Also, it does not put HELLO or GOODBYE after a slash if there is no string already there, e.g. http://www.stackoveflow.com/dog/cat/rabbit/hamster/<--SHOULD PUT HELLO AFTER THIS SLASH AS WELL BUT DOESN'T
It appears that this code removes then re-inserts the slashes with the strings from FILE2.txt, as opposed to inserting HELLO and GOODBYE in the correct place to start with.
My question
Is there a better method of going about achieving the output I require or is there something I can do to my existing code to cater for the problems described above?
Your help is much appreciated, many thanks
Here is the algorithm in prose:
Open File2.txt. Read in all lines, removing the newline. We call the array @words.
Open File1.txt. We call the file handle $fh.
As long as we can read a $line from $fh:
  Remove the newline, remove starting and ending slashes.
  Split the $line at every slash, call the array @animals.
  Loop through the @words, calling each element $word:
    Loop through the indices of the @animals, calling each index $i:
      Make a @copy of the @animals.
      Append the $word to the $i-th element of @copy.
      Join the @copy with slashes, surround it with slashes, and print with newline.
    Print an empty line.
This program will do what you ask.
use strict;
use warnings;
use autodie;
my @f1 = do {
open my $fh, '<', 'FILE1.txt';
<$fh>;
};
chomp @f1;
my @f2 = do {
open my $fh, '<', 'FILE2.txt';
<$fh>;
};
chomp @f2;
for my $f1 (@f1) {
my @fields = $f1 =~ m{[^/]+}g;
for my $f2 (@f2) {
for my $i (0 .. $#fields) {
my @new = @fields;
$new[$i] .= $f2;
print qq{/$_/\n}, for join '/', @new;
}
print "\n\n";
}
}
output
/dogHELLO/cat/rabbit/hamster/
/dog/catHELLO/rabbit/hamster/
/dog/cat/rabbitHELLO/hamster/
/dog/cat/rabbit/hamsterHELLO/
/dogGOODBYE/cat/rabbit/hamster/
/dog/catGOODBYE/rabbit/hamster/
/dog/cat/rabbitGOODBYE/hamster/
/dog/cat/rabbit/hamsterGOODBYE/
/lionHELLO/tiger/elephant/
/lion/tigerHELLO/elephant/
/lion/tiger/elephantHELLO/
/lionGOODBYE/tiger/elephant/
/lion/tigerGOODBYE/elephant/
/lion/tiger/elephantGOODBYE/
Rather than splitting the line on every slash, you can do it all with a regex.
Updated version:
#!/usr/bin/perl
use strict;
use warnings;
my @insert_words = qw/HELLO GOODBYE/;
my $word = 0;
while (<DATA>)
{
chomp;
foreach my $word (@insert_words)
{
my $repeat = 1;
while ((my $match=$_) =~ s|(?<!/)(?:/(?!/)[^/]*){$repeat}[^/]*\K|$word|)
{
print "$match\n";
$repeat++;
}
print "\n";
}
}
__DATA__
/dog/cat/rabbit/hamster/
http://www.stackoverflow.com/dog/cat/rabbit/hamster/
The key is the substitution operator: s|(?<!/)(?:/(?!/)[^/]*){$repeat}[^/]*\K|$word|.
(?<!/) and (?!/) are negative look-behind and look-ahead, respectively. They ensure that we are only matching a single /, thus ignoring http://.
(?:/(?!/)[^/]*){$repeat} is a non-capturing group that must match a specified number of times, and we increase that number until it no longer matches.
I had to use [^/]* instead of [^/]+ to meet your requirement of matching at the end of the string. That is why both the look-behind and the look-ahead are needed.
\K means "match everything up to this point, but don't include it in the match itself." Thus we don't have to worry about including the whole beginning of the string that matched in the replacement.
Note: The r option is another way to perform substitution without modifying the original string. However, it requires Perl 5.14 (thanks Amon). Thus I removed it from the example.
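To see the lookarounds and \K at work in isolation, here is a minimal sketch that wraps the same substitution in a function. The name insert_after_each is made up for illustration, and the redundant second [^/]* has been dropped; otherwise the pattern is the one explained above.

```perl
use strict;
use warnings;

# Return one copy of $url per path segment, each with $word inserted
# after a different segment. insert_after_each is a hypothetical name.
sub insert_after_each {
    my ($url, $word) = @_;
    my @variants;
    my $n = 1;
    # Work on a fresh copy each time; stop once $n segments no longer match.
    while ((my $copy = $url) =~ s|(?<!/)(?:/(?!/)[^/]*){$n}\K|$word|) {
        push @variants, $copy;
        $n++;
    }
    return @variants;
}

my @variants = insert_after_each('http://www.stackoverflow.com/dog/cat/', 'HELLO');
print "$_\n" for @variants;
```

The double slash in http:// is skipped because neither of its slashes passes both lookarounds, and the trailing slash still counts as a (possibly empty) segment, so the last variant ends in /HELLO.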

My array is showing empty after I insert huge data into it in perl

#!/usr/bin/perl -w
################################################################################
##Get_Duration.pl
#
# This is a perl script which is used to parse the audio files
# present in the device and build's the xml containing all the
# track i.e both audio and video files duration
#
# The xml file is created in the name of ParsedMetadataInformation.xml
# in <ATAF Path>/tmp/ directory.
#
#
# CHANGE HISTORY
# --------------------------------------------------------------------------
use strict;
use warnings;
use Env;
use File::Find;
use XML::TreePP;
use Data::Dumper;
my $data;
if (not defined $ATAF){
print "=====================================================\n";
print "ERROR: ATAF Path is not set.\n";
print "(Example: export ATAF=/home/roopa/ATAF)\n";
print "=====================================================\n";
exit 1;
}
print "Enter the Absolute path for the device to be scanned\n";
print "(Example: /media/RACE_1.6A)\n";
$DB::single=1;
my @metadataInfo = ();
print "Enter Path:";
my $configDir = <STDIN>;
chomp $configDir;
my @configFiles;
find( sub {push @configFiles, "$File::Find::name$/" if (/\.mp3|\.wma|\.wav|\.ogg| \.flac| \.m4a|\.mp4|\.avi|\.mpg|\.mpeg|\.mov|\.wmv|\.m4b$/i)}, $configDir);
chomp @configFiles;
if (!@configFiles){
print "=====================================================\n";
print "ERROR: No Files Found!!!\n";
print "=====================================================\n";
exit -1;
}
my $tpp = XML::TreePP->new();
my $metadataHashTree1 = ();
print "=====================================================\n";
print "Extracting the Metadata Information\n";
print "=====================================================\n";
foreach my $file (@configFiles){
print "Currently in: $file\n";
(my $fileName = $file) =~ s/^.*\///g;
$file =~ s/([\!\$\^\*\&\(\)\|\}\{\[\]\:\"\;\'\?\>\<\,\=\`\s])/\\$1/g;
@metadataInfo = (`ffmpeg -i $fileName`);
my $size= scalar (@metadataInfo);
#chomp @metadataInfo;
foreach my $eachfile (@metadataInfo){
if ($eachfile =~ m/^Duration: /i){
$eachfile =~ m/Duration:(.*?),/;
$data= $1;
$metadataHashTree1->{$fileName}->{'Duration'}=$data;
}
}
}
print "=====================================================\n";
print "Building XML tree\n";
print "=====================================================\n\n";
my $xml = $tpp->write($metadataHashTree1);
sleep 5;
print "=====================================================================\n";
print "Writing the XML tree in <ATAF Path>/tmp/ParsedMetadataInformation.xml\n";
print "=====================================================================\n\n";
open (FILEHANDLE, ">$ATAF/tmp/ParsedDurationInformation.xml") or die "ERROR: $!\n";
print FILEHANDLE $xml;
close FILEHANDLE;
sleep 5;
print "=====================================================\n";
print "Successfully Completed!!!\n";
print "=====================================================\n\n";
########################################################################################
In the above program I am trying to get the duration of a track using the ffmpeg command and saving the output in @metadataInfo. But the array size shows 0 when I print it using
$size= scalar (@metadataInfo);
"$File::Find::name$/"
should be
$File::Find::name
Appending $/ makes no sense.
You don't convert the file name to a shell literal.
`ffmpeg -i $fileName`
should be
use String::ShellQuote qw( shell_quote );
my $cmd = shell_quote('ffmpeg', '-i', $fileName);
`$cmd`
This will handle problems such as spaces in the file name.
You don't check if the backticks succeeded. What's the value of $?? And if that's -1, what's the value of $!?
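A minimal sketch of that check follows. run_capture is a hypothetical helper name, and the running perl ($^X) stands in for ffmpeg so the example works anywhere. The 2>&1 is an addition that the points above do not mention: backticks capture only standard output, and as far as I know ffmpeg -i prints its stream information, including the Duration line, to standard error, which would also leave the array empty.

```perl
use strict;
use warnings;

# Run a shell command, capture stdout and stderr, and die with the reason
# if it could not be run. run_capture is a hypothetical helper name.
sub run_capture {
    my ($cmd) = @_;
    my @lines = `$cmd 2>&1`;    # 2>&1 folds stderr into the captured output
    if ($? == -1) {
        die "could not run '$cmd': $!\n";
    }
    if ($? & 127) {
        die sprintf "'%s' died from signal %d\n", $cmd, $? & 127;
    }
    return ($? >> 8, @lines);   # exit status first, then the output lines
}

# Stand-in for ffmpeg: the running perl prints to stderr and exits with 3.
my $cmd = $^X . q{ -e 'print STDERR "Duration: 00:03:25.10,\n"; exit 3'};
my ($status, @output) = run_capture($cmd);
print "exit status $status, captured ", scalar(@output), " line(s)\n";
```

A nonzero exit status is returned rather than treated as fatal here, on the assumption that a command can exit nonzero and still print output worth parsing.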

Perl: adding a string to $_ is producing strange results

I wrote a super simple script:
#!/usr/bin/perl -w
use strict;
open (F, "<ids.txt") || die "fail: $!\n";
my @ids = <F>;
foreach my $string (@ids) {
chomp($string);
print "$string\n";
}
close F;
This is producing an expected output of all the contents of ids.txt:
hello
world
these
annoying
sourcecode
lines
Now I want to add a file-extension: .txt for every line. This line should do the trick:
#!/usr/bin/perl -w
use strict;
open (F, "<ids.txt") || die "fail: $!\n";
my @ids = <F>;
foreach my $string (@ids) {
chomp($string);
$string .= ".txt";
print "$string\n";
}
close F;
But the result is as follows:
.txto
.txtd
.txte
.txtying
.txtcecode
Instead of appending ".txt" to my lines, the first 4 letters of my string will be replaced by ".txt" Since I want to check if some files exist, I need the full filename with extension.
I have tried to chop, chomp, to substitute (s/\n//), joins and whatever. But the result is still a replacement instead of an append.
Where is the mistake?
Chomp does not remove BOTH \r and \n if the file has DOS line endings and you are running on Linux/Unix.
What you are seeing is actually the original string, a carriage return, and the extension, which overwrites the first 4 characters on the display.
If the incoming file has DOS/Windows line endings you must remove both:
s/\R+$//
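As a small illustration, assuming the lines arrive with a mix of DOS and Unix endings (the sample data here is made up), stripping with \R yields the expected file names:

```perl
use strict;
use warnings;

# Made-up sample lines: two with DOS endings (\r\n), one with a Unix ending.
my @ids = ("hello\r\n", "world\r\n", "lines\n");

my @names;
for my $string (@ids) {
    # \R matches \r\n, \n, or \r, so this strips either kind of ending.
    (my $clean = $string) =~ s/\R+\z//;
    push @names, "$clean.txt";
}
print "$_\n" for @names;
```

With chomp alone, the first two names would still contain a carriage return before ".txt", which is what makes the terminal overwrite the start of the line.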
A useful debugging technique when you are not quite sure why your data is getting set to what it is is to dump it with Data::Dumper:
#!/usr/bin/perl -w
use strict;
use Data::Dumper ();
$Data::Dumper::Useqq = 1; # important to be able to actually see differences in whitespace, etc
open (F, "<ids.txt") || die "fail: $!\n";
my @ids = <F>;
foreach my $string (@ids) {
chomp($string);
print "$string\n";
print Data::Dumper::Dumper( { 'string' => $string } );
}
close F;
Have you tried this?
foreach my $string (@ids) {
chomp($string);
print $string.".txt\n";
}
I'm not sure what's wrong with your code, though. These results are strange.

Resources