Replace a substring on a matching line in Perl

I am iterating over a file; there is a string I have to search for, and within that string I have to replace a substring with another string. Can anyone help me solve this problem?
I tried this:
while(<FH>)
{
    if($_ =~ /AndroidAPIEventLogging=false/i)
    {
        if($& =~ s/false/True/)
        {
            print("Changed successfully\n");
        }
    }
}
Now it dies saying the value is read-only (the match variable $& cannot be modified). I tried opening the file in every possible mode.

Match-then-substitute is something of a Perl anti-pattern, since you match the same string twice. So, back to your question:
while (<FH>) {
    # everything before '\K' is kept out of the replacement (like a look-behind)
    if (s/AndroidAPIEventLogging=\Kfalse/True/i) {    # matches against $_
        print("Changed successfully\n");
    }
}

You can do that in place with the -i option and a Perl one-liner:
perl -i -pe 's/AndroidAPIEventLogging=false/AndroidAPIEventLogging=true/i' file1 file2...
As an alternative way, take a look at Tie::File. It seems to be designed for quick in-place file edits.
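Tie::File (a core module) maps the file's lines onto an array, so an ordinary substitution writes straight back to disk. A minimal sketch, assuming a hypothetical config.txt with the setting from the question:

```shell
printf 'AndroidAPIEventLogging=false\nOtherSetting=1\n' > config.txt
perl -MTie::File -e '
    # each element of @lines is one line of the file; writes go straight back to disk
    tie my @lines, "Tie::File", shift or die "tie failed: $!";
    s/(AndroidAPIEventLogging=)false/${1}true/i for @lines;
    untie @lines;
' config.txt
cat config.txt
```

After the run, config.txt contains AndroidAPIEventLogging=true with all other lines untouched.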

Related

How to split the data in a Unix file

I have a file on a Unix (Solaris) system with data like below:
[TYPEA]:/home/typeb/file1.dat
[TYPEB]:/home/typeb/file2.dat
[TYPEB]:/home/typeb/file3.dat
[TYPE_C]:/home/type_d/file4.dat
[TYPE_C]:/home/type_d/file5.dat
[TYPE_C]:/home/type_d/file6.dat
I want to separate the headings like below
[TYPEA]
/home/typeb/file1.dat
[TYPEB]
/home/typeb/file2.dat
/home/typeb/file3.dat
[TYPE_C]
/home/type_d/file4.dat
/home/type_d/file5.dat
/home/type_d/file6.dat
Files of the same type have to be grouped under one heading.
Please help me with logic to achieve this without hardcoding.
Assuming the input is sorted by type like in your example,
awk -F : '$1 != prev { print $1 } { print $2; prev=$1 }' file
If there are more than 2 fields you will need to adjust the second clause.
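One way to make that adjustment (a sketch, not the only way): instead of printing $2, strip everything up to the first ':' from the whole line, so paths containing extra ':' characters survive intact.

```shell
cat > data.txt <<'EOF'
[TYPEA]:/home/typeb/file1.dat
[TYPE_C]:/home/type_d/file4.dat
[TYPE_C]:/home/type_d/file5.dat
EOF
# print the heading on change of $1; print everything after the first ':' for each line
awk -F: '$1 != prev { print $1 } { line = $0; sub(/^[^:]*:/, "", line); print line; prev = $1 }' data.txt
```

This prints each heading once, followed by its paths, exactly as in the desired output.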
sed 'H;$!b
x
s/\(\(\n\)\(\[[^]]\{1,\}]\):\)/\1\2\1/g
:cycle
s/\(\n\[[^]]\{1,\}]\)\(.*\)\1/\1\2/g
t cycle
s/^\n//' YourFile
This POSIX sed version is a bit unreadable due to the presence of [ in the pattern. It:
- allows : in the label or file path
- fails if lines with the same label are separated by a line with another label (the sample appears to be ordered)
If you can use perl you will be able to make use of hashes to create a simple data structure:
#! /usr/bin/perl
use warnings;
use strict;

my %h;
while (<>) {
    chomp;
    my ($key, $value) = split /:/;
    $h{$key} = [] unless exists $h{$key};
    push @{ $h{$key} }, $value;
}
foreach my $key (sort keys %h) {
    print "$key\n";
    foreach my $value (@{ $h{$key} }) {
        print "$value\n";
    }
}
In action:
perl script.pl file
[TYPEA]
/home/typeb/file1.dat
[TYPEB]
/home/typeb/file2.dat
/home/typeb/file3.dat
[TYPE_C]
/home/type_d/file4.dat
/home/type_d/file5.dat
/home/type_d/file6.dat
If you like it, there is a whole tutorial on solving this simple problem. It's worth reading.

Extract text between braces

I have a string as:
MESSAGES { "Instance":[{"InstanceID":"i-098098"}] } ff23710b29c0220849d4d4eded562770 45c391f7-ea54-47ee-9970-34957336e0b8
I need to extract the part { "Instance":[{"InstanceID":"i-098098"}] }, i.e. from the first occurrence of '{' to the last occurrence of '}', and keep it in a separate file.
If you have this in a file,
sed 's/^[^{]*//;s/[^}]*$//' file
(This will print to standard output. Redirect to a file or capture into a variable or do whatever it is that you want to do with it.)
If you have this in a variable called MESSAGES,
EXTRACTED=${MESSAGES#*{}
EXTRACTED="{${EXTRACTED%\}*}}"
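To satisfy the "keep it in a separate file" part, the sed variant can simply be redirected. A quick sketch using the sample string (extracted.txt is an assumed output filename):

```shell
MESSAGES='MESSAGES { "Instance":[{"InstanceID":"i-098098"}] } ff23710b29c0220849d4d4eded562770 45c391f7-ea54-47ee-9970-34957336e0b8'
# strip everything up to the first '{' and everything after the last '}'
printf '%s\n' "$MESSAGES" | sed 's/^[^{]*//;s/[^}]*$//' > extracted.txt
cat extracted.txt
```

extracted.txt now holds just { "Instance":[{"InstanceID":"i-098098"}] }.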
I would suggest either sed or awk from this article. But initial testing shows it's a little more complicated, and you will probably have to use a combination or a pipe:
echo 'MESSAGES { "Instance":[{"InstanceID":"i-098098"}] } ff23710b29c0220849d4d4eded562770 45c391f7-ea54-47ee-9970-34957336e0b8' | sed 's/^\(.*\)}.*$/\1}/' | sed 's/^[^{]*{/{/'
So the first sed deletes everything after the last } and replaces it with a } so it still shows; the second sed deletes everything up to the first { and replaces it with a { so it still shows.
This is the output I got:
{ "Instance":[{"InstanceID":"i-098098"}] }

Efficient way to replace strings in one file with strings from another file

Searched for similar problems and could not find anything that suits my needs exactly:
I have a very large HTML file scraped from multiple websites and I would like to replace all
class="key->from 2nd file"
with
style="xxxx"
At the moment I use sed - it works well but only with small files
while read key; do
    sed -i "s/class=\"$key\"/style=\"xxxx\"/g" file_to_process
done < keys
When I try to process something larger, it takes ages. Example:
keys - Count: 1233 lines
file_to_process - Count: 1946 lines
It takes about 40 s to complete only 1/10 of the processing I need:
real 0m40.901s
user 0m8.181s
sys 0m15.253s
Untested since you didn't provide any sample input and expected output:
awk '
NR==FNR { keys = keys sep $0; sep = "|"; next }
{ gsub("class=\"(" keys ")\"","style=\"xxxx\"") }
1' keys file_to_process > tmp$$ &&
mv tmp$$ file_to_process
I think it's time for Perl (untested):
my $keyfilename = 'somekeyfile';    # or pick up from script arguments
open KEYFILE, '<', $keyfilename or die("Could not open key file $keyfilename\n");
my %keys = map { chomp; $_ => 1 } <KEYFILE>;    # construct a hash for lookup speed
close KEYFILE;

my $htmlfilename = 'somehtmlfile';    # or pick up from script arguments
open HTMLFILE, '<', $htmlfilename or die("Could not open html file $htmlfilename\n");
my $newchunk = qq/style="xxxx"/;
while (my $line = <HTMLFILE>) {
    my $newline = $line;
    while ($line =~ m/(class="([^"]+)")/g) {    # /g so the match advances instead of looping forever
        if (defined($keys{$2})) {
            $newline =~ s/\Q$1\E/$newchunk/g;
        }
    }
    print $newline;
}
close HTMLFILE;
This uses a hash for lookups of keys, which should be reasonably fast, and does this only on the key itself when the line contains a class statement.
Try to generate a very long sed script with all sub commands from the keys file, something like:
s/class=\"key1\"/style=\"xxxx\"/g; s/class=\"key2\"/style=\"xxxx\"/g ...
and use this file.
This way you will read the input file only once.
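A sketch of that generation step, using sed itself to build the script (it assumes the keys contain no slashes or regex metacharacters; the filenames keys and file_to_process are taken from the question):

```shell
printf 'key1\nkey2\n' > keys
printf '<div class="key1"><p class="other">x</p></div>\n' > file_to_process
# turn every key into its own s/// command; '&' in the replacement is the whole matched line, i.e. the key
sed 's|.*|s/class="&"/style="xxxx"/g|' keys > replace.sed
sed -f replace.sed file_to_process
```

The input file is then read once, against all 1233 substitutions at a time, instead of once per key.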
Here's one way using GNU awk:
awk 'FNR==NR { array[$0]++; next } { for (i in array) { a = "class=\"" i "\""; gsub(a, "style=\"xxxx\"") } }1' keys.txt file.txt
Note that the keys in keys.txt are taken as whole lines, including whitespace. If leading or trailing whitespace could be a problem, use $1 instead of $0. Unfortunately I cannot test this properly without some sample data. HTH.
First convert your keys file into a sed or-pattern which looks like this: key1|key2|key3|.... This can be done using the tr command. Once you have this pattern, you can use it in a single sed command.
Try the following:
sed -i -r "s/class=\"($(tr '\n' '|' < keys | sed 's/|$//'))\"/style=\"xxxx\"/g" file

Using Perl to remove n characters from the end of multiple lines

I want to remove n characters from each line using Perl.
For example, I have the following input:
catbathatxx (length 11; 11%3=2 characters) (Remove 2 characters from this line)
mansunsonx (length 10; 10%3=1 character) (Remove 1 character from this line)
#!/usr/bin/perl -w
open FH, "input.txt";
@array = <FH>;
foreach $tmp (@array)
{
    $b = length($tmp) % 3;
    my $c = substr($tmp, 0, length($tmp) - $b);
    print "$c\n";
}
I want to output the final string (after the characters have been removed).
However, this program is not giving the correct result. Can you please guide me on what the mistake is?
Thanks a lot. Please let me know if there are any doubts/clarifications.
I am assuming trailing whitespace is not significant.
#!/usr/bin/env perl
use strict; use warnings;
use constant MULTIPLE_OF => 3;
while (my $line = <DATA>) {
    $line =~ s/\s+\z//;
    next unless my $length = length $line;
    my $chars_to_remove = $length % MULTIPLE_OF;
    $line =~ s/.{$chars_to_remove}\z//;
    print $line, "\n";
}
__DATA__
catbathatxx
mansunsonx
0123456789
012345678
The \K regex sequence makes this a lot clearer; it was introduced in Perl v5.10.0.
The code looks like this
use 5.10.0;
use warnings;
for (qw/ catbathatxx mansunsonx /) {
    (my $s = $_) =~ s/^ (?:...)* \K .* //x;
    say $s;
}
output
catbathat
mansunson
In general you should post the result you are getting. That being said...
Each line in the file has a \n (or \r\n on Windows) on the end of it that you're not accounting for. You need to chomp() the line.
Edit to add: my Perl is getting rusty from non-use, but if memory serves you can chomp() the entire array after reading the file: chomp(@array)
You should use chomp() on your array, like this:
@array = <FH>;
chomp(@array);
perl -plwe 'chomp; $c = length($_) % 3; chop while $c--' < /tmp/zock.txt
Look up the options in perlrun. Note that line endings are characters, too. Get them out of the way using chomp; re-add them on output using the -l option. Use chop to efficiently remove characters from the end of a string.
Reading your code, you are trying to print just the first 3n characters of each line, for the largest n that fits.
The following code does this using a simple regular expression.
For each line, it first removes the line ending, then greedy matches
as many .{3} as it can (. matches any character, {3} asks for exactly 3 of them).
The memory requirement of this approach (compared with using an array the size of your file) is fixed. Not too important if your file is small compared with your free memory, but sometimes files are gigabytes, and sometimes memory is very small.
It's always worth using variable names that reflect the purpose of the variable, rather than things like $a or @array. In this case I used only one variable, which I called $line.
It's also good practice to close files as soon as you have finished with them.
#!/usr/bin/perl
use strict;
use warnings;    # This will apply warnings even if you use the command perl to run it

open FH, '<', 'input.txt';    # Use the three-part open - single quotes where no interpolation is required
while (my $line = <FH>) {     # read line by line so memory use stays fixed
    chomp($line);
    $line =~ s/((.{3})*).*/$1\n/;
    print $line;
}
close FH;

Best way to check if argument is a filename or a file containing a list of filenames?

I'm writing a Perl script, and I'd like a way to have the user pass either a file, or a file containing a list of files, in $ARGV[0].
The current way that I'm doing it is to check whether the filename starts with an @; if it does, I treat that file as a list of filenames.
This is definitely not the ideal way to do it, because I've noticed that @ is a special character in bash (what does it do, by the way? I've only seen it used in $@ in bash).
You can pass an additional parameter on the command line to have the file treated differently, e.g.
perl script.pl file
for reading the file's content, or
perl script.pl -l file
for reading a list of files from file.
You can use a Getopt module (such as the core Getopt::Long) for easier parsing of input arguments.
First, you could have your shell expand the list for you:
perl script.pl $(cat list)
If you don't want to do that, perhaps because you are running up against the maximum command-line length, you could use the following before you use @ARGV or ARGV (including <>):
@ARGV = map {
    if (my ($qfn) = /^\@(.*)/s) {
        if (open(my $fh, '<', $qfn)) {
            chomp( my @args = <$fh> );
            @args;
        } else {
            warn("Can't open $qfn: $!\n");
            ();
        }
    } else {
        $_;
    }
} @ARGV;
Keep in mind that you'll have unintended side effects if you have a file whose name starts with "@".
'@' is special in Perl, so you need to escape it in your Perl strings, unless you use a non-interpolating quote such as 'a non-interpolating $string' or q(another non-interpolating $string). In a regex you escape it like so:
if ( $arg =~ /^\@/ ) {
    ...
}
Interpolating delimiters are any of the following:
"..." or qq/.../
`...` or qx/.../
/.../ or qr/.../
For all of those, you will have to escape any literal @.
Otherwise, a filename starting with an @ has pretty good precedent in command-line arguments.
