sed command working on command line but not in perl script - linux

I have a file in which I have to replace all words like $xyz, making substitutions like these:
$xyz with ${xyz}
$abc_xbs with ${abc_xbs}
$ab,$cd with ${ab},${cd}
The file also has some words like ${abcd} which I must leave unchanged.
I am using this command
sed -i 's?\$([A-Z_]+)?\${\1}?g' file
It works fine on the command line, but not inside a Perl script as
sed -i 's?\$\([A-Z_]\+\)?\$\{\1\}?g' file;
What am I missing?
I think adding some backslashes would help; I tried adding some, but with no success.
Thanks

In a Perl script you need valid Perl code, just like you need valid C text in a C program. In the terminal, sed ... is understood and run by the shell as a command, but in a Perl program it is just a bunch of words, and that sed ... line isn't valid Perl.
You would need to run it inside qx() (backticks) or with system() so that it executes as an external command. Then you'd indeed need "some backslashes," which is where things get a bit picky.
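For example, a minimal sketch with system() (assuming GNU sed and the literal filename file), using a non-interpolating q{} string so Perl leaves the backslashes alone for the shell:

my $cmd = q{sed -i 's?\$\([A-Z_]\+\)?\${\1}?g' file};
system($cmd) == 0 or die "sed failed: $?";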
But why run a sed command from a Perl script? Do the job with Perl
use warnings;
use strict;
use File::Copy 'move';
my $file = 'filename';
my $out_file = 'new_' . $file;
open my $fh, '<', $file or die "Can't open $file: $!";
open my $fh_out, '>', $out_file or die "Can't open $out_file: $!";
while (<$fh>) {
    s/\$( [^{] [a-z_]* )/\${$1}/gix;
    print $fh_out $_;
}
close $fh_out;
close $fh;
move $out_file, $file or die "Can't move $out_file to $file: $!";
The regex uses a negated character class, [^...], to match any character other than { following the $, thus excluding already-braced words. It then matches a sequence of letters or underscores, as in the question (possibly none, since the first non-{ character already provides at least one).
With 5.14+ you can use the non-destructive /r modifier
print $fh_out s/\$([^{][a-z_]*)/\${$1}/gir;
with which the changed string is returned (and original is unchanged), right for the print.
The output file, in the end moved over the original, should be made using File::Temp. Overwriting the original this way changes $file's inode number; if that's a concern, see this post, for example, for how to update the original inode.
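For instance, a minimal sketch with File::Temp (creating the temp file in the current directory is an assumption of this example, so that the final move stays on one filesystem):

use File::Temp qw(tempfile);
my ($fh_out, $out_file) = tempfile('new_XXXXXX', DIR => '.');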
A one-liner (command-line) version, to readily test
perl -wpe's/\$([^{][a-z_]*)/\${$1}/gi' file
This only prints to the console. To change the original, add -i (in-place), or -i.bak to keep a backup.
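For example, to change the file in place while keeping a backup:

perl -i.bak -wpe's/\$([^{][a-z_]*)/\${$1}/gi' file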
A reasonable question came up: isn't there a shorter way?
Here is one, using the handy Path::Tiny, for a file that isn't huge so that we can read it into a string.
use warnings;
use strict;
use Path::Tiny;
my $file = 'filename';
my $out_file = 'new_' . $file;
my $new_content = path($file)->slurp =~ s/\$([^{][a-z_]*)/\${$1}/gir;
path($file)->spew( $new_content );
The first line reads the file into a string, on which the replacement runs; the changed text is returned and assigned to a variable. Then that variable with new text is written out over the original.
The two lines can be squeezed into one by putting the expression from the first in place of the variable in the second. But opening the same file twice in one (complex) statement isn't exactly solid practice, and I wouldn't recommend such code.
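That squeezed version would look something like this, shown only to illustrate what is being advised against:

path($file)->spew( path($file)->slurp =~ s/\$([^{][a-z_]*)/\${$1}/gir );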
However, since the module's version 0.077 you can nicely do
path($file)->edit_lines( sub { s/\$([^{][a-z_]*)/\${$1}/gi } );
or use edit to slurp the file into a string and apply the callback to it.
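For example:

path($file)->edit( sub { s/\$([^{][a-z_]*)/\${$1}/gi } );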
So this cuts it to one nice line after all.
I'd like to add that shaving off lines of code mostly isn't worth the effort, while it sure can lead to trouble if it disturbs the focus on code structure and correctness even a bit. However, Path::Tiny is a good module, this use of it is legitimate, and it does shorten things quite a bit.

Related

Perl splits string incorrectly when loaded from file

I am probably missing something because I started Perl today, so please excuse me if it's something very obvious.
I would like to load a string from a file and then split it character by character.
I have done the following:
use strict;
open my $fh, "<", "hello.txt" || die "Cannot open file!\n";
my $data = do { local $/ ; <$fh>};
print $data;
print join( ', ',(split( //, $data)));
close $fh;
When I execute this script, the first print statement prints $data without a problem; however, the second print prints only the join string:
Hello, world!
,
I am running on a Windows 7 machine with Strawberry Perl; I don't have access to a Unix/Linux machine at the moment, so I could not test it elsewhere.
This is probably an issue with the carriage return character "\r" – Windows line endings are \r\n, and a \r on its own moves back to the start of the line, overwriting what you have already written.
You could chomp $data first to remove the line ending, though this only removes the last one.
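Alternatively, a global substitution on the slurped string normalizes every line ending at once (a small sketch):

$data =~ s/\r\n/\n/g;    # convert all Windows line endings to Unix ones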
You can also have Perl convert the Windows \r\n line endings to Unix \n line endings when reading in the file, by applying the :crlf IO layer:
open my $fh, "<:crlf", "hello.txt" or die "Cannot open file!\n";
(Note that it must be open … or die … or open(…) || die … but not open … || die …, because of operator precedence rules.)
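A quick sketch of the difference:

# || binds tighter than the comma, so the die attaches to the filename,
# which is always true; open's failure is never checked:
open my $fh, "<", "hello.txt" || die "Cannot open file!\n";

# "or" binds more loosely than the whole open call, so this checks open:
open my $fh2, "<", "hello.txt" or die "Cannot open file!\n";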

"read" command not executing in "while read line" loop [duplicate]

First post here! I really need help on this one. I looked up the issue on Google, but couldn't find an answer that was useful to me. So here's the problem.
I'm having fun coding something like a framework in bash. Everyone can create their own module and add it to the framework. BUT. To know what arguments a script requires, I created an "args.conf" file that must be in every module, and it looks something like this:
LHOST;true;The IP the remote payload will connect to.
LPORT;true;The port the remote payload will connect to.
The first column is the argument name, the second defines whether it's required, and the third is the description. Anyway, long story short, the framework is supposed to read the args.conf file line by line and ask the user for a value for each argument. Here's the piece of code:
info "Reading module $name argument list..."
while read line; do
    echo $line > line.tmp
    arg=`cut -d ";" -f 1 line.tmp`
    requ=`cut -d ";" -f 2 line.tmp`
    if [ $requ = "true" ]; then
        echo "[This argument is required]"
    else
        echo "[This argument isn't required, leave a blank space if you don't want to use it]"
    fi
    read -p " $arg=" answer
    echo $answer >> arglist.tmp
done < modules/$name/args.conf
tr '\n' ' ' < arglist.tmp > argline.tmp
argline=`cat argline.tmp`
info "Launching module $name..."
cd modules/$name
$interpreter $file $argline
cd ../..
rm arglist.tmp
rm argline.tmp
rm line.tmp
succes "Module $name execution completed."
As you can see, it's supposed to ask the user a value for every argument... But:
1) The read command seems not to execute. It just gets skipped, and the argument gets no value.
2) Despite the fact that the args.conf file contains 3 lines, the loop seems to run just a single time. All I see on the screen is "[This argument is required]" once, and then the module just launches (and crashes, because it doesn't have the required arguments...).
I really don't know what to do here... I hope someone has an answer ^^'.
Thanks in advance!
(and sorry for any mistakes, I'm French)
Alpha.
As @that other guy pointed out in a comment, the problem is that all of the read commands in the loop are reading from the args.conf file, not the user. The way I'd handle this is by redirecting the conf file over a different file descriptor than stdin (fd #0); I like to use fd #3 for this:
while read -u3 line; do
    ...
done 3< modules/$name/args.conf
(Note: if your shell's read command doesn't understand the -u option, use read line <&3 instead.)
There are a number of other things in this script I'd recommend against:
Variable references without double-quotes around them, e.g. echo $line instead of echo "$line", and < modules/$name/args.conf instead of < "modules/$name/args.conf". Unquoted variable references get split into words (if they contain whitespace) and any wildcards that happen to match filenames will get replaced by a list of matching files. This can cause really weird and intermittent bugs. Unfortunately, your use of $argline depends on word splitting to separate multiple arguments; if you're using bash (not a generic POSIX shell) you can use arrays instead; I'll get to that.
You're using relative file paths everywhere, and cding in the script. This tends to be fragile and confusing, since file paths are different at different places in the script, and any relative paths passed in by the user will become invalid the first time the script cds somewhere else. Worse, you aren't checking for errors when you cd, so if any cd fails for any reason, the entire rest of the script will run in the wrong place and fail bizarrely. You'd be far better off figuring out where your module system's root directory is (as an absolute path), then referencing everything from it (e.g. < "$module_root/modules/$name/args.conf").
Actually, you're not checking for errors anywhere. It's generally a good idea, when writing any sort of program, to try to think of what can go wrong and how your program should respond (and also to expect that things you didn't think of will also go wrong). Some people like to use set -e to make their scripts exit if any simple command fails, but this doesn't always do what you'd expect. I prefer to explicitly test the exit status of the commands in my script, with something like:
command1 || {
    echo 'command1 failed!' >&2
    exit 1
}

if command2; then
    echo 'command2 succeeded!' >&2
else
    echo 'command2 failed!' >&2
    exit 1
fi
You're creating temp files in the current directory, which risks random conflicts (with other runs of the script at the same time, any files that happen to have names you're using, etc). It's better to create a temp directory at the beginning, then store everything in it (again, by absolute path):
module_tmp="$(mktemp -dt module-system)" || {
echo "Error creating temp directory" >&2
exit 1
}
...
echo "$answer" >> "$module_tmp/arglist.tmp"
(BTW, note that I'm using $() instead of backticks. They're easier to read, and don't have some subtle syntactic oddities that backticks have. I recommend switching.)
Speaking of which, you're overusing temp files; a lot of what you're doing can be done just fine with shell variables and built-in shell features. For example, rather than reading each line from the config file, storing it in a temp file, and using cut to split it into fields, you can simply echo to cut:
arg="$(echo "$line" | cut -d ";" -f 1)"
...or better yet, use read's built-in ability to split fields based on whatever IFS is set to:
while IFS=";" read -u3 arg requ description; do
(Note that since the assignment to IFS is a prefix to the read command, it only affects that one command; changing IFS globally can have weird effects, and should be avoided whenever possible.)
Similarly, storing the argument list in a file, converting newlines to spaces into another file, then reading that file back... you can skip all of these steps. If you're using bash, store the arg list in an array:
arglist=()
while ...
    arglist+=("$answer")    # or ("$arg=$answer")? Not sure of your syntax.
done ...
"$module_root/modules/$name/$interpreter" "$file" "${arglist[@]}"
(That messy syntax, with the double-quotes, curly braces, square brackets, and at-sign, is the generally correct way to expand an array in bash.)
If you can't count on bash extensions like arrays, you can at least do it the old messy way with a plain variable:
arglist=""
while ...
arglist="$arglist $answer" # or "$arglist $arg=$answer"? Not sure of your syntax.
done ...
"$module_root/modules/$name/$interpreter" "$file" $arglist
... but this runs the risk of arguments being word-split and/or expanded to lists of files.

How to remove only the last word from a file

I created the following Perl one-liner in order to remove a word from a file.
It also escapes special characters such as # or $ or *, so every word that contains a special character will be removed from the file as well.
How can I change the Perl syntax so that it deletes only the last matched word in the file, and not all of them?
Example
more file
Kuku
Toto
Kuku
kuku
export REPLACE_NAME="Kuku"
export REPLACE_WITH=""
perl -i -pe 'next if /^#/; s/(^|\s)\Q$ENV{REPLACE_NAME}\E(\s|$)/$1$ENV{REPLACE_WITH}$2/' file
expected results
more file
Kuku
Toto
Kuku
another example
when - export REPLACE_NAME="mark#$!"
more file
mark#$!
hgst##
hhfdd##
expected results
hgst##
hhfdd##
Use Tie::File to make this easier.
$ perl -MTie::File -E'tie @file, "Tie::File", shift or die $!; $file[-1] =~ s/\b\Q$ENV{REPLACE_NAME}\E\b/$ENV{REPLACE_WITH}/' file
Update: Rewriting as a program in order to explain it.
# Load the Tie::File module
use Tie::File;
# Tie::File allows you to create a link between an array and a file,
# so that any changes you make to the array are reflected in the file.
# The "tie()" function connects the file (passed as an argument and
# therefore accessible using shift()) to a new array (called @file).
tie my @file, 'Tie::File', shift
    or die $!;
# The last line of the file will be in $file[-1].
# We use s/.../.../ to make a substitution on that line.
$file[-1] =~ s/\b\Q$ENV{REPLACE_NAME}\E\b/$ENV{REPLACE_WITH}/;
Update: So now you've changed your requirements spec. You want to remove the last occurrence of the string, which is not necessarily on the last line of the file.
Honestly, I think you've moved past the kind of task that I'd write with command-line switches. I'd write a separate program that looks something like this:
#!/usr/bin/perl
use strict;
use warnings;
use Tie::File;
tie my @file, 'Tie::File', shift
    or die $!;

foreach (reverse @file) {
    if (s/\b\Q$ENV{REPLACE_NAME}\E\b/$ENV{REPLACE_WITH}/) {
        last;
    }
}
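A hypothetical invocation, assuming the program is saved as remove-last.pl (mirroring the environment variables used in the question):

REPLACE_NAME="Kuku" REPLACE_WITH="" perl remove-last.pl file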

Read and Write Operation in perl script

I am a newbie to Perl scripting.
I want to do read and write operations on a file. I open the file in read-write mode (+<) and write into it. Now I want to read back whatever I have just written to it. Below is my code:
#!/usr/bin/perl
`touch file.txt`; # create the file first, as opening in +< mode requires an existing file
open (OUTFILE, "+<file.txt") or die "Can't open file : $!";
print OUTFILE "Hello, welcome to File handling operations in perl\n"; #write into the file
$line = <OUTFILE>; #read from the file
print "$line\n"; #display the read contents.
When I display the read contents, it shows a blank line. But the file "file.txt" has the data:
Hello, welcome to File handling operations in perl
Why am I not able to read the contents? Is my code wrong, or am I missing something?
The problem is that your filehandle position is located after the line you have written. Use the seek function to move the "cursor" back to the top before reading again.
An example, with some extra comments:
#!/usr/bin/env perl
# use some recommended safeguards
use strict;
use warnings;
my $filename = 'file.txt';
`touch $filename`;
# use indirect filehandle, and 3 argument form of open
open (my $handle, "+<", $filename) or die "Can't open file $filename : $!";
# btw good job on checking open success!
print $handle "Hello, welcome to File handling operations in perl\n";
# seek back to the top of the file
seek $handle, 0, 0;
my $line = <$handle>;
print "$line\n";
If you will be doing lots of reading and writing, you may want to try (though not everyone suggests it) Tie::File, which lets you treat a file like an array: you access lines by line number, and newlines are written automatically.
#!/usr/bin/env perl
# use some recommended safeguards
use strict;
use warnings;
use Tie::File;
my $filename = 'file.txt';
tie my @file, 'Tie::File', $filename
    or die "Can't open/tie file $filename : $!";
# note file not emptied if it already exists
push @file, "Hello, welcome to File handling operations in perl";
push @file, "Some more stuff";
print "$file[0]\n";
This is a seemingly common beginner mistake. Most often you will find that reading and writing to the same file, while possible, is not worth the trouble. As Joel Berger says, you can seek to the beginning of the file. You can also simply re-open the file. Seeking is not as straightforward as reading line by line, and will present you with difficulties.
Also, you should note that creating an empty file beforehand is not required. Simply do:
open my $fh, ">", "file.txt" or die $!;
print $fh "Hello\n";
open $fh, "<", "file.txt" or die $!;
print <$fh>;
Note that:
re-using open on a file handle that is already open will automatically close it first.
I use three-argument open, and a lexical (defined by my) file handle, which is the recommended way.
you do not need to add a newline when printing a variable read in line-by-line mode, as it will already have a newline at the end (except possibly the last line, if the file has no final newline).
You can use print <$fh>: since print supplies list context, it will extract all the lines from the file handle (printing the entire file).
If you only want to print one line, you can do:
print scalar <$fh>; # put <$fh> in scalar context

How to clean a data file from binary junk?

I have this data file, which is supposed to be a normal ASCII file. However, it has some junk at the end of the first line. It only shows when I look at it with vi or less:
y mon d h XX11 XX22 XX33 XX44 XX55 XX66^#
2011 6 6 10 14.0 15.5 14.3 11.3 16.2 16.1
grep is also saying that it's a binary file: Binary file data.dat matches
This is causing some trouble in my parsing script. I split each line and put the fields into an array. The last element (XX66) of the first line is somehow corrupted because of the junk, and I can't match against it.
How can I clean that line or the array? I have tried running dos2unix on the file and substituting on the array members with s/\s+$//. What is that junk, anyway? Unfortunately I have no control over the data; it comes from a third party.
Any ideas?
Grep is trying to be smart and, when it sees an unprintable character, switches to "binary" mode. Add "-a" or "--text" to force grep to stay in "text" mode.
As for sed, try sed -e 's/\([^ -~]*\)//g', which says, "change everything not between space and tilde (chars 0x20 and 0x7E, respectively) into nothing". That'll strip tabs, too, but you can insert a tab character before the space to include them (or any other special character).
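For reference, a rough Perl equivalent of that sed cleanup, as a sketch (unlike the sed version, this one keeps tabs and newlines), using the data.dat name from the question:

perl -i -pe 's/[^ -~\t\n]//g' data.dat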
The "^#" is one way to represent an NUL (aka "ascii(0)" or "\0"). Some programs may also see that as an end-of-file if they were implemented in a naive way.
If it's always the same codes (e.g. ^# or related) then you can find/replace them.
In Vim for example:
:%s/^#//g, entered from normal mode, will clear out any of those characters.
To enter a character such as ^#, press and hold the Ctrl key, press 'v', and then press the character you need; in the above case, remember to hold Shift down to get the # key. Keep Ctrl held down until the end.
The ^# looks like a control character. I can't figure out which character it should be, but I guess that's not important.
You can use s/^#//g to get rid of them, but you have to actually COPY the character; just typing ^ and # together won't do it.
I created this small script to remove all binary, non-ASCII, and some annoying characters from a file. Notice that the character ranges are octal-based:
#!/usr/bin/perl
use strict;
use warnings;

my $filename = $ARGV[0];

open my $fh, '<', $filename or die "File not found: $!";
open my $fh2, '>', 'report.txt' or die "Can't open report.txt: $!";
binmode($fh);

my $buffer = '';

# read 1 byte at a time until end of file ...
while (read($fh, $buffer, 1) != 0) {
    # keep the byte only if it falls outside the unwanted octal ranges
    # (control characters, some punctuation, and DEL)
    print $fh2 $buffer
        unless $buffer =~ /[\0-\11\13-\14\16-\37\41-\55\176-\177]/;
}

# finally, clean all the characters that are not ASCII.
system("perl -plne 's/[^[:ascii:]]//g' report.txt > $filename.clean.txt");
Stripping individual characters using sed is going to be very slow, perhaps several minutes for a 100MB file.
As an alternative, if you know the format/structure of the file, e.g. a log file where the "good" lines of the file start with a timestamp, then you can grep out the good lines and redirect those to a new file.
For example, if we know that all good lines start with a timestamp with the year 2021, we can use this expression to only output those lines to a new file:
grep -a "^2021" mylog.log > mylog2.log
Note that you must use the -a or --text option with grep to force grep to output lines when it detects that the file is binary.
