Shell Script to parse/retrieve a string found after another string/match

The shell script will be passed a string of arguments. The position of the key/value I am looking to parse out may change over time, i.e. it may come before or after another key at any time so parsing between two keys wouldn't be an option.
I am looking to parse the domain key out of a string like this:
maxpark 0 maxsub n domain sample.foo maxlst n max_defer_fail_percentage user oli force no_cache_update 0 maxpop n maxaddon 0 locale en contactemail
The key would be "domain" and the value would be "sample.foo". The domain value could have more than one '.' in it, so I would need to grab the entire value.
I am not the best with regular expressions but I imagine using 'sed' is what I'm going to need to do.
I am accessing this full string using $*. If I could simply reference the key by accessing $DOMAIN that would be great, but my only option is to access by position (e.g. $3), and since the position could change, that isn't reliable.

Solved the problem using Perl.
#!/usr/bin/perl -w
use strict;

# The alternating key/value arguments become a hash, so "domain" can be
# looked up by name regardless of its position.
my %OPTS = @ARGV;

# NOTE: $LOCAL_IP and $PUBLIC_IP are assumed to be defined elsewhere in the real script.
open(FILE, "</var/named/$OPTS{'domain'}.db") || die "File not found";
my @lines = <FILE>;
close(FILE);

my @newlines;
foreach (@lines) {
    $_ =~ s/$LOCAL_IP/$PUBLIC_IP/g;
    push(@newlines, $_);
}

open(FILE, ">/var/named/$OPTS{'domain'}.db") || die "File not found";
print FILE @newlines;
close(FILE);

If you do have perl, just use this one-liner from your shell script.
domain=$( echo $* | perl -ne '/domain\s([^\s]+)\s/ and print "$1"' )
Or if you'd rather just do it with sed:
domain=$( echo $* | sed 's/.*\<domain \([^ ]\+\).*/\1/' )
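If the key/value words reach your script as separate positional parameters (which using $* suggests), plain bash can also do the lookup without sed or perl at all: walk the arguments two at a time into an associative array and reference the value by key, which is essentially the $DOMAIN-style access mentioned in the question. A minimal sketch, assuming bash 4+ for declare -A:
#!/bin/bash
# Load the alternating key/value arguments into an associative array.
declare -A opts
while (( $# >= 2 )); do   # ">= 2" tolerates a trailing key with no value (like "contactemail")
    opts[$1]=$2
    shift 2
done
echo "${opts[domain]}"    # prints "sample.foo" for the sample input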

Related

Finding a file in a directory using a Perl script

I'm trying to develop a perl script that looks through all of the user's directories for a particular file name without the user having to specify the entire pathname to the file.
For example, let's say the file of interest was data.list. It's located in /home/path/directory/project/userabc/data.list. At the command line, the user would normally have to specify the full pathname in order to access it, like so:
cd /home/path/directory/project/userabc/data.list
Instead, I want the user to just enter script.pl ABC on the command line; the Perl script should then automatically locate data.list and retrieve the information from it, which in my case means counting the number of lines and uploading the result using curl. The rest is done; the only missing piece is automatically locating the file.
Even though this is very feasible in Perl, it looks more appropriate for Bash:
#!/bin/bash
filename=$(find ~ -name "$1" )
wc -l "$filename"
curl .......
The main issue would of course be if you have multiple files with the same name, say for example /home/user/dir1/data1 and /home/user/dir2/data1. You will need a way to handle that, and how you handle it would depend on your specific situation.
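One way to cope with that in bash is to collect every match first and then decide what to do with them; a rough sketch, assuming bash 4+ for mapfile (the "use the first match" policy is only a placeholder):
#!/bin/bash
# Collect every match into an array instead of a single string.
mapfile -t matches < <(find ~ -name "$1")
if (( ${#matches[@]} == 0 )); then
    echo "No file named $1 found under $HOME" >&2
    exit 1
elif (( ${#matches[@]} > 1 )); then
    echo "Warning: ${#matches[@]} matches found, using the first one" >&2
fi
wc -l "${matches[0]}"
# curl .......   (as in the sketch above)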
In Perl that would be much more complicated:
#! /usr/bin/perl -w
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
if 0; #$running_under_some_shell
use strict;
# Import the module File::Find, which will do all the real work
use File::Find ();
# Set the variable $File::Find::dont_use_nlink if you're using AFS,
# since AFS cheats.
# for the convenience of &wanted calls, including -eval statements:
# Here, we "import" specific variables from the File::Find module
# The purpose is to be able to just type '$name' instead of the
# complete '$File::Find::name'.
use vars qw/*name *dir *prune/;
*name = *File::Find::name;
*dir = *File::Find::dir;
*prune = *File::Find::prune;
# We declare the sub here; the content of the sub will be created later.
sub wanted;
# This is a simple way to get the first argument. There is no
# checking on validity.
our $filename=$ARGV[0];
# Traverse desired filesystem. /home is the top-directory where we
# start our search. The sub wanted will be executed for every file
# we find
File::Find::find({wanted => \&wanted}, '/home');
exit;
sub wanted {
    # Check if the file is our desired filename
    if ( /^$filename\z/ ) {
        # Open the file, read it and count its lines
        my $lines = 0;
        open(my $F, '<', $name) or die "Cannot open $name";
        while (<$F>) { $lines++; }
        print("$name: $lines\n");
        # Your curl command here
    }
}
You will need to look at the argument parsing, for which I simply used $ARGV[0], and I don't know what your curl command looks like.
A simpler (though not recommended) way would be to abuse Perl as a sort of shell:
#!/usr/bin/perl
#
my $fn=`find /home -name '$ARGV[0]'`;
chomp $fn;
my $wc=`wc -l '$fn'`;
print "$wc\n";
system ("your curl command");
The following code snippet demonstrates one of many ways to achieve the desired result.
The code takes one parameter, a word to look for inside files named data.list in all subdirectories, and prints a list of the matching files to the terminal.
The code utilizes the subroutine lookup($dir,$filename,$search), which calls itself recursively whenever it comes across a subdirectory.
The search starts from the current working directory (the question did not specify a starting directory).
use strict;
use warnings;
use feature 'say';

my $search = shift || die "Specify what to look for";
my $fname  = 'data.list';
my $found  = lookup('.', $fname, $search);

if( @$found ) {
    say for @$found;
} else {
    say 'Not found';
}

exit 0;

sub lookup {
    my $dir    = shift;
    my $fname  = shift;
    my $search = shift;

    my $files;
    my @items = glob("$dir/*");

    for my $item (@items) {
        if( -f $item && $item =~ /\b$fname\b/ ) {
            my $found;
            open my $fh, '<', $item or die $!;
            while( my $line = <$fh> ) {
                $found = 1 if $line =~ /\b$search\b/;
                if( $found ) {
                    push @{$files}, $item;
                    last;
                }
            }
            close $fh;
        }
        if( -d $item ) {
            my $ret = lookup($item, $fname, $search);
            push @{$files}, $_ for @$ret;
        }
    }

    return $files || [];    # always return an array ref, even when nothing matched
}
Run as script.pl search_word
Output sample
./capacitor/data.list
./examples/data.list
./examples/test/data.list
Reference:
glob,
Perl file test operators

How to use `diff` on files whose paths contain whitespace

I am trying to find the differences between files, but the filename and directory name contain white space. I am trying to execute the command in a Perl script.
diff /home/users/feroz/logs/back_up20161112/Security File/General Security.csv /home/users/feroz/logs/back_up20161113/Security File/General Security.csv
Perl
open( my $FH, '>', $logfile ) or die "Cannot open the file '$logfile' $!";
foreach $filename ( keys %filenames ) {
    $old_file = $parent_directory . $previous_date . $search_directory . "$filenames{$filename}";
    $new_file = $parent_directory . $current_date . $search_directory . "$filenames{$filename}";
    if ( !-e $old_file ) {
        #print ("\nFile does not exist in previous date backup");
        print $FH "\nERROR:'$old_file' ---- does not exist in the backup directory ";
    }
    elsif ( !-e $new_file ) {
        #print ("\nThe file does not exist in current directory");
        print $FH "\nERROR:'$new_file' --- does not exist in the present directory ";
    }
    else {
        # print $FH "\nDifference between the files $filenames{$filename} of $previous_date and $current_date ";
        my $cmd = 'diff $old_file $new_file| xargs -0';
        open( my $OH, '|-', $cmd ) or die "Failed to read the output";
        while ( <$OH> ) {
            print $FH "$_";
        }
        close $OH;
    }
}
To be absolutely safe, use String::ShellQuote:
use String::ShellQuote;
my $old_file2 = shell_quote($old_file);
my $new_file2 = shell_quote($new_file);
`diff $old_file2 $new_file2`;
Thank you for showing your Perl code.
Single quotes don't interpolate, so that will pass the literal strings $old_file and $new_file to the command instead of those variables' contents. The shell will then try to interpret them as shell variables.
I suggest that you write this instead:
my $cmd = qq{diff '$old_file' '$new_file' | xargs -0};
open( my $OH, '-|', $cmd ) or die "Failed to read the output";
That will use double quotes (qq{...}) around the command string so that the variables are interpolated. The file paths have single quotes around them to indicate that the shell should treat them as individual strings.
This won't work if there's a chance that your file paths could contain a single quote, but that's highly unusual.
Pass arguments out-of-band to avoid the need to shell-quote them, rather than interpolating them into a string which is parsed by a shell as a script. Substituting filenames as literal text into a script generates exposure to shell injection attacks -- the shell-scripting equivalent to the family of database security bugs known as SQL injection.
Without Any Shell At All
The pipe to xargs -0 appears to be serving no purpose here. Eliminating it allows this to be run without any shell involved at all:
open(my $fh, "-|", "diff", $old_file, $new_file)
With Shell Arguments Passed Out-Of-Band From Script Text
If you really do want the shell to be invoked, the safe thing to do is to keep the script text an audited constant, and have it retrieve arguments from either the argv list passed to the shell or the environment.
# Putting $1 and $2 in double quotes ensures that the shell treats their contents as literal;
# the "_" is used for $0 in the shell.
my $shell_script = 'diff "$1" "$2" | xargs -0';
open(my $fh, "-|",
     "sh", "-c", $shell_script,
     "_", $old_file, $new_file);
You can either
Put the whitespace path segment inside quotes
diff /home/users/feroz/logs/back_up20161112/"Security File"/General Security.csv /home/users/feroz/logs/back_up20161113/"Security File"/General Security.csv
or escape the whitespace
diff /home/users/feroz/logs/back_up20161112/Security\ File/General\ Security.csv /home/users/feroz/logs/back_up20161113/Security\ File/General\ Security.csv

How to extract key value pairs from a file when values span multiple lines?

I'm a few weeks into bash scripting and I haven't advanced enough yet to get my head wrapped around this problem. Any help would be appreciated!
I have a "script.conf" file that contains the following:
key1=value1
key2=${HOME}/Folder
key3=( "k3v1" "k3 v2" "k3v3")
key4=( "k4v1"
"k4 v2"
"k4v3"
)
key5=value5
#key6="Do Not Include Me"
In a bash script, I want to read the contents of this script.conf file into an array. I've learned how to handle the scenarios for keys 1, 2, 3, and 5, but the key4 scenario throws a wrench into it with it spanning across multiple lines.
I've been exploring the use of sed -n '/=\s*[(]/,/[)]/{/' which does capture key4 and its value, but I can't figure out how to mix this so that the other keys are also captured in the matches. The range syntax is also new to me, so I haven't figured out how to separate the key/value.
I feel like there is an easy regex that would accomplish what I want... in plain-text: "find and group the pattern ^(.*)= (for the key), then group everything after the '=' char until another ^(.*)= match is found, rinse and repeat". I guess if I do this, I need to change the while read line to not handle the key/value separation for me (I'll be looking into this while I'm waiting for a response).
BTW, I think a solution where the value of key4 is flattened (new lines removed) would be acceptable; I know for key3 I have to store the value as a string and then convert it to an array later when I want to iterate over it, since an array element apparently can't contain a list.
Am I on the right path with sed or is this a job for awk or some other tool? (I haven't ventured into awk yet). Is there an easier approach that I'm missing because I'm too deep into the forest (like changing the while read line in the LoadConfigFile function)?
Here is the code that I have so far in script.sh for processing and capturing the other pairs into the $config array:
__AppDir=$(dirname $0)
__AppName=${__ScriptName%.*}
typeset -A config #init config array
config=( #Setting Default Config values
[key1]="defaultValue1"
[key2]="${HOME}/defaultFolder"
[QuietMode]=0
[Verbose]=0 #Ex. Usage: [[ "${config[Verbose]}" -gt 0 ]] && echo ">>>Debug print"
)
function LoadConfigFile() {
    local cfgFile="${1}"
    shopt -s extglob    #Needed to remove trailing spaces
    if [ -f ${cfgFile} ]; then
        while IFS='=' read -r key value; do
            if [[ "${key:0:1}" == "#" ]]; then
                :   # no-op; comment lines are skipped
                #echo "Skipping Comment line: ${key}"
            elif [ "${key:-EMPTY}" != "EMPTY" ]; then
                value="${value%%\#*}"       # Delete in-line, right comments
                value="${value%%*( )}"      # Delete trailing spaces
                value="${value%%( )*}"      # Delete leading spaces
                #value="${value%\"*}"       # Delete opening string quotes
                #value="${value#\"*}"       # Delete closing string quotes
                #Manipulate any variables included in the value so that they can be expanded correctly
                # - value must be stored in the format: "${var1}". `backticks`, "$var2", and "doubleQuotes" are left as is
                value="${value//\"/\\\"}"   # Escape double quotes for eval
                value="${value//\`/\\\`}"   # Escape backticks for eval
                value="${value//\$/\\\$}"   # Escape ALL '$' for eval
                value="${value//\\\${/\${}" # Undo the protection of '$' if it was followed by a '{'
                value=$(eval "printf '%s\n' \"${value}\"")
                config[${key}]=${value}     #Store the value into the config array at the specified key
                echo " >>>DBG: Key = ${key}, Value = ${value}"
            #else
            #    echo "Skipped Empty Key"
            fi
        done < "${cfgFile}"
    fi
}
CONFIG_FILE=${__AppDir}/${__AppName}.conf
echo "Config File # ${CONFIG_FILE}"
LoadConfigFile ${CONFIG_FILE}
#Print elements of $config
echo "Script Config Values:"
echo "----------------------------"
for key in "${!config[@]}"; do #The '!' char gets an array of the keys, without it, we would get an array of the values
    printf " %-20s = %s\n" "${key}" "${config[${key}]}"
done
echo "------ End Script Config ------"
#To convert to an array...
declare -a valAsArray=${config[RequiredAppPackages]} #Convert the value from a string to an array
echo "Count = ${#valAsArray[@]}"
for itemCfg in "${valAsArray[@]}"; do
    echo " item = ${itemCfg}"
done
As I mentioned before, I'm just starting to learn bash and Linux scripting in general, so if you see that I'm doing some taboo things in other areas of my code too, please feel free to provide feedback in the comments... I don't want to start bad habits early on :-).
*If it matters, the OS is Ubuntu 14.04.
EDIT:
As requested, after reading the script.conf file, I would like the elements of ${config[@]} to be equivalent to the following:
typeset -A config #init config array
config=(
[key1]="value1"
[key2]="${HOME}/Folder"
[key3]="( \"k3v1\" \"k3 v2\" \"k3v3\" )"
[key4]="( \"k4v1\" \"k4 v2\" \"k4v3\" )"
[key5]="value5"
)
I want to be able to convert the values of elements 'key4' and 'key3' into an array and iterated over them the same way in the following code:
declare -a keyValAsArray=${config[keyN]} #Convert the value from a string to an array
echo "Count = ${#keyValAsArray[@]}"
for item in "${keyValAsArray[@]}"; do
    echo " item = ${item}"
done
I don't think it matters if \n is preserved for key4's value or not... that depends on if declare has a problem with it.
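One way to get there without leaving bash is to make the read loop join continuation lines until the closing ')' before splitting the key from the value. A rough sketch only, under the assumptions that values never contain '#' or an embedded ')' and that every top-level line is a key=value pair; the cfgFile and config names follow the question's code:
#!/bin/bash
declare -A config
cfgFile="script.conf"
key='' value=''
while IFS= read -r line; do
    line="${line%%#*}"                            # drop comments
    [[ -z "${line//[[:space:]]/}" ]] && continue  # skip blank lines
    if [[ -n $key ]]; then                        # still inside a multi-line "( ... )" value
        value+=" $line"
    else
        key="${line%%=*}"
        value="${line#*=}"
    fi
    if [[ $value == \(* && $value != *\) ]]; then
        continue                                  # keep reading until the closing ")"
    fi
    config[$key]=$value
    key='' value=''
done < "$cfgFile"
for k in "${!config[@]}"; do
    printf ' %-5s = %s\n' "$k" "${config[$k]}"
done
This flattens key4 onto one line, which the question says is acceptable.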
A shell is an environment from which to call tools with a language to sequence those calls. It is NOT a tool to manipulate text. The standard UNIX tool to manipulate text is awk. Trying to manipulate text in shell IS a bad habit, see why-is-using-a-shell-loop-to-process-text-considered-bad-practice for SOME of the reasons why
You still didn't post the expected result of populating the config array so I'm not sure but I think this is what you wanted:
$ cat tst.sh
declare -A config="( $(awk '
{ gsub(/^[[:space:]]+|([[:space:]]+|#.*)$/,"") }
!NF { next }
/^[^="]+=/ {
name = gensub(/=.*/,"",1)
value = gensub(/^[^=]+=/,"",1)
n2v[name] = value
next
}
{ n2v[name] = n2v[name] OFS $0 }
END {
for (name in n2v) {
value = gensub(/"/,"\\\\&","g",n2v[name])
printf "[%s]=\"%s\"\n", name, value
}
}
' script.conf
) )"
declare -p config
$ ./tst.sh
declare -A config='([key5]="value5" [key4]="( \"k4v1\" \"k4 v2\" \"k4v3\" )" [key3]="( \"k3v1\" \"k3 v2\" \"k3v3\")" [key2]="/home/Ed/Folder" [key1]="value1" )'
The above uses GNU awk for gensub(), with other awks you'd use [g]sub() instead.
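If you then want to turn one of the stored "( ... )" values (key3 or key4) back into a real bash array, as the edit to the question asks for, one option is to let the shell re-parse the string with eval; a minimal sketch, only reasonable when script.conf is trusted, since eval executes whatever ends up in the value:
# Rebuild a bash array from a stored string such as ( "k4v1" "k4 v2" "k4v3" ).
declare -a keyValAsArray
eval "keyValAsArray=${config[key4]}"
echo "Count = ${#keyValAsArray[@]}"   # 3
for item in "${keyValAsArray[@]}"; do
    echo " item = ${item}"
done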

Why can't I print a very long string?

I'm writing a Perl script that searches a kml file and I need to print a very long line of latitude/longitude coordinates. The following script successfully finds the string I'm looking for, but just prints a blank line instead of the value of the string:
#!/usr/bin/perl
# Strips unsupported tags out of a QGIS-generated kml and writes a new one
$file = $ARGV[0];
# read existing kml file
open( INFO, $file );    # Open the file
@lines = <INFO>;        # Read it into an array
close(INFO);            # Close the file
#print @lines;          # Print the array
$x = 0;
$coord_string = "<coordinates>";
# go through each line looking for above string
foreach $line (@lines) {
    $x++;
    if ( $x > 12 ) {
        if ( $line =~ $coord_string ) {
            $thisCooordString = $line;
            $var_startX = $x;
            print "Found coord string: $thisCoordString\n";
            print " on line: $var_startX\n";
        }
    }
}
The file that it's reading is here
and this is the output I get:
-bash-4.3$ perl writekml.pl HUC8short.kml
Found coord string:
on line: 25
Found coord string:
on line: 38
Is there some cap on the maximum length that a string can be in Perl? The longest line in this file is ~151,000 characters long. I've verified that all the lines in the file are read successfully.
You've misspelled the variable name (two o's vs three o's):
$thisCooordString = $line;
...
print "Found coord string: $thisCoordString\n";
Add use strict and use warnings to your script to prevent these sorts of errors.
Always include use strict and use warnings in EVERY perl script.
If you had done this, you would've gotten the following error message to clue you into your bug:
Global symbol "$thisCoordString" requires explicit package name
Adding these pragmas and simplifying your code results in the following:
#!/usr/bin/env perl
# Strips unsupported tags out of a QGIS-generated kml and writes a new one
use strict;
use warnings;
local @ARGV = 'HUC8short.kml';
while (<>) {
    if ( $. > 12 && /<coordinates>/ ) {
        print "Found coord string: $_\n";
        print " on line: $.\n";
    }
}
You can even try with perl one liners as shown below:
Perl One liner on windows command prompt:
perl -lne "if($_ =~ /<coordinates>/is && $. > 12) { print \"Found coord string : $_ \n"; print \" on line : $. \n\";}" HUC8short.kml
Perl One liner on unix prompt:
perl -lne 'if($_ =~ /<coordinates>/is && $. > 12) { print "Found coord string : $_ \n"; print " on line : $. \n";}' HUC8short.kml
As others have pointed out, you need to (no, you MUST) always use use strict; and use warnings;.
If you used strict, you would have gotten an error message telling you that your variable $thisCoordString or $thisCooordString was not declared with my. Using warnings would have warned you that you're printing an undefined string.
Your whole program is written in a very old (and obsolete) Perl programming style. This is the type of program writing I would have done back in Perl 3.0 days about two decades ago. Perl has changed quite a bit since then, and using the newer syntax will allow you to write easier to read and maintain programs.
Here's your basic program written in a more modern syntax:
#! /usr/bin/env perl
#
use strict;              # Lets you know when you misspell variable names
use warnings;            # Warns of issues (e.g. using undefined variables)
use feature qw(say);     # Lets you use 'say' instead of 'print' (no \n needed)
use autodie;             # Program automatically dies on bad file operations
use IO::File;            # Lots of nice file activity.

# Make Constants constant
use constant {
    COORD_STRING => qr/<coordinates>/,   # qr is a regular expression quoted string
};

my $file = shift;

# read existing kml file
open my $fh, '<', $file;                 # Three part open with scalar filehandle
while ( my $line = <$fh> ) {
    chomp $line;                         # Always "chomp" on read
    next unless $line =~ COORD_STRING;   # Skip non-coord lines
    say "Found coord string: $line";
    say " on line: " . $fh->input_line_number;
}
close $fh;
Many Perl developers are self taught. There is nothing wrong with that, but many people learn Perl from looking at other people's obsolete code, or from reading old Perl manuals, or from developers who learned Perl from someone else back in the 1990s.
So, get some books on Modern Perl and learn the new syntax. You might also want to learn about things like references which can lead you to learn Object Oriented Perl. References and OO Perl will allow you to write longer and more complex programs.

How to explicitly resolve variables in a Perl string?

In my Perl script I want to have both versions of the $config directory string:
my $config='$home/client/config';
and
my $config_resolved="$home/client/config";
But I want to get $config_resolved from $config, i.e. something like this:
my $config_resolved=resolve_vars($config);
How can I do such thing in perl?
From the Perl FAQ (which every Perl programmer should read at least once):
How can I expand variables in text strings?
(contributed by brian d foy)
If you can avoid it, don't, or if you can
use a templating system, such as Text::Template or Template Toolkit,
do that instead. You might even be able to get the job done with
sprintf or printf:
my $string = sprintf 'Say hello to %s and %s', $foo, $bar;
However, for the one-off simple case where I don't want to pull out a
full templating system, I'll use a string that has two Perl scalar
variables in it. In this example, I want to expand $foo and $bar to
their variable's values:
my $foo = 'Fred';
my $bar = 'Barney';
$string = 'Say hello to $foo and $bar';
One way I can do this involves the substitution operator and a double /e flag. The
first /e evaluates $1 on the replacement side and turns it into $foo. The
second /e starts with $foo and replaces it with its value. $foo,
then, turns into 'Fred', and that's finally what's left in the string:
$string =~ s/(\$\w+)/$1/eeg; # 'Say hello to Fred and Barney'
The /e will also silently ignore violations of strict, replacing undefined
variable names with the empty string. Since I'm using the /e flag
(twice even!), I have all of the same security problems I have with
eval in its string form. If there's something odd in $foo, perhaps
something like @{[ system "rm -rf /" ]}, then I could get myself in
trouble.
To get around the security problem, I could also pull the
values from a hash instead of evaluating variable names. Using a
single /e, I can check the hash to ensure the value exists, and if it
doesn't, I can replace the missing value with a marker, in this case
??? to signal that I missed something:
my $string = 'This has $foo and $bar';
my %Replacements = (
foo => 'Fred',
);
# $string =~ s/\$(\w+)/$Replacements{$1}/g;
$string =~ s/\$(\w+)/
exists $Replacements{$1} ? $Replacements{$1} : '???'
/eg;
print $string;
I use eval for this.
So, you must replace all scalar variable names with their values.
$config = 'stringone';
$boo = '$config/any/string';
$boo =~ s/(\$\w+)/eval($1)/eg;
print $boo;
Because you are using my to declare it as a private variable, you might as well use the /ee modifier. This can find variables declared in the enclosing lexical scope:
$boo =~ s/(\$\w+)/$1/eeg;
This is most tidily and safely done by the double-eval modifier on s///.
In the program below, the first /e evaluates the string $1 to get $home, while the second evaluates $home to get the variable's value HOME.
use strict;
my $home = 'HOME';
my $config = '$home/client/config';
my $config_resolved = resolve_vars($config);
print $config_resolved, "\n";
sub resolve_vars {
(my $str = shift) =~ s/(\$\w+)/$1/eeg;
return $str;
}
output
HOME/client/config
