passing awk variable to bash script - linux

I am writing a bash/awk script to process hundreds of files under one directory. They all have name suffix of "localprefs". The purpose is to extract two values from each file (they are quoted by ""). I also want to use the same file name, but without the name suffix.
Here is what I did so far:
#!/bin/bash
for file in * # Traverse all the files in current directory.
read -r name < <(awk ' $name=substr(FILENAME,1,length(FILENAME)-10a) END {print name}' $file) #get the file name without suffix and pass to bash. PROBLEM TO SOLVE!
echo $name # verify if passing works.
do
awk 'BEGIN { FS = "\""} {print $2}' $file #this one works fine to extract two values I want.
done
exit 0
I could use
awk '{print substr(FILENAME,1,length(FILENAME)-10)}' to extract the file name without the suffix, but I am stuck on how to pass that to bash as a variable, which I will use as the output file name (I read through all the posts on this subject here, but none of them worked for me).
If anyone can shed light on this, especially on the line that starts with "read", I would really appreciate it.
Many thanks.

Try this one:
#!/bin/bash
dir="/path/to/directory"
for file in "$dir"/*localprefs; do
name=${file%localprefs} ## Or if it has a .: name=${file%.localprefs}
name=${name##*/} ## To exclude the dir part.
echo "$name"
awk 'BEGIN { FS = "\""} {print $2}' "$file" ## I think you could also use cut: cut -f 2 -d '"' "$file"
done
exit 0

To just take the base name, you don't even need awk:
for file in * ; do
name="${file%.*}"
etc
done
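The parameter-expansion approach from the answers above can be extended to write each file's quoted values to an output file named after the stripped base name, which is what the question asks for. This is a minimal sketch, not a definitive implementation. It assumes the suffix is exactly ".localprefs" and that the output files should get a ".txt" extension; the function names strip_suffix, extract_quoted and process_dir are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: strip the directory part and an assumed ".localprefs" suffix.
strip_suffix() {
    local base=${1##*/}              # drop everything up to the last "/"
    printf '%s\n' "${base%.localprefs}"
}

# Hypothetical helper: print the second double-quote-delimited field of each line.
extract_quoted() {
    awk -F'"' '{ print $2 }' "$1"
}

# Process every matching file in a directory (default: current directory).
process_dir() {
    local dir=${1:-.} file name
    for file in "$dir"/*.localprefs; do
        [ -e "$file" ] || continue   # skip if the glob matched nothing
        name=$(strip_suffix "$file")
        extract_quoted "$file" > "$dir/$name.txt"   # ".txt" is an assumed extension
    done
}
```

Calling process_dir /path/to/directory would then create one .txt file per input file. No awk variable has to be passed back to bash, because the name is computed in the shell itself.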


Split and rename single file into multiple files using keywords present in file

I am new to awk-like commands. I have a single text file holding SQL DDL statements in the format below.
DROP TABLE IF EXISTS $database.TABLE_A ;
...
...
DROP TABLE IF EXISTS $database.TABLE_B ;
...
...
I would like to split this single file into multiple files named
TABLE_A.SQL
TABLE_B.SQL
TABLE_X.SQL
I am able to get the table names from the single file with the awk command below, but I am still struggling to split the file and name each piece TABLE_X.SQL.
awk 'FNR==1 {split($5,a,"."); print a[2]}' *.SQL
I am using Windows 10 DOS shell.
I finally achieved the desired output with the shell script below, which can be run in the Windows bash shell:
#!/bin/bash
# Split the single input file into F1.TXT, F2.TXT, ... at each DROP statement
awk '/DROP/{x="F"++i;}{print > (x".TXT");}' "$1"
# Create the output directory
mkdir -p ./_output
# Move each piece, renaming it after its table and changing the extension
for f in *.TXT ; do
    newfilename=$(awk 'FNR==1 {split($5,a,"."); print a[2]}' "$f")
    echo "Processed $f ... new file is $newfilename.SQL ..."
    mv "$f" "./_output/$newfilename.SQL"
done
Could you please try the following.
awk '/DROP/{if(file){close(file)};match($0,/TABLE_[^ ]*/);file=substr($0,RSTART,RLENGTH)".SQL"} {print > (file)}' Input_file
awk -F "[. ]" '{print >($(NF-1)".SQL")}' file.sql
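The one-liners above can be unpacked into a commented function. This is only a sketch, assuming every section starts with a line of the form "DROP TABLE IF EXISTS <db>.<TABLE> ;" as in the question. The function name split_ddl and the output-directory argument are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: split a DDL file into one <TABLE>.SQL file per DROP statement.
# $1 = input file, $2 = output directory (created if missing, default ".").
split_ddl() {
    local input=$1 outdir=${2:-.}
    mkdir -p "$outdir"
    awk -v outdir="$outdir" '
        /^DROP TABLE/ {
            if (out) close(out)       # avoid running out of open file handles
            split($5, parts, ".")     # $5 looks like "$database.TABLE_A"
            out = outdir "/" parts[2] ".SQL"
        }
        out { print > out }           # lines before the first DROP are skipped
    ' "$input"
}
```

Closing each output file before opening the next matters when the input holds many tables, since some awk implementations allow only a limited number of open files.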

How to get the filename from a http link in linux?

In a shell script I have a variable $FILE_LINK, which contains the following string:
http://links.twibright.com/download/links-2.13.tar.gz
What I need is to get the filename from the link and store it in a different variable, so the process would look similar to this:
Set variable $FILE_LINK
Get the string after the last "/", in this case 'links-2.13.tar.gz'
Store the string in a variable $FILE_LINK_NAME
How could I achieve that?
If using BASH use:
file_link='http://links.twibright.com/download/links-2.13.tar.gz'
file_link_name="${file_link##*/}"
echo "$file_link_name"   # prints: links-2.13.tar.gz
Or else use basename:
file_link_name=$(basename "$file_link")
If neither is an option, use this awk:
file_link_name=$(awk -F / '{print $NF}' <<< "$file_link")
Or using sed:
file_link_name=$(sed 's~.*/~~' <<< "$file_link")
PS: I'm avoiding all uppercase variable names in order to avoid clash with ENV variables.
LINK=http://links.twibright.com/download/links-2.13.tar.gz
FILE=$(echo "$LINK" | awk -F "/" '{print $NF}')
echo "$FILE"
The output is links-2.13.tar.gz
awk is a good tool for text processing.
https://en.wikipedia.org/wiki/AWK
-F sets the field separator
$NF means the last field (column)
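The parameter-expansion answer above can be wrapped in a small reusable function. This is a sketch only: the function name url_filename is hypothetical, and stripping a trailing "?query" part is an extra assumption that the original question does not require.

```shell
#!/bin/bash
# Hypothetical helper: print the file-name part of a URL.
url_filename() {
    local name=${1##*/}    # remove the longest prefix ending in "/"
    name=${name%%\?*}      # assumption: also drop any "?query" part
    printf '%s\n' "$name"
}

file_link='http://links.twibright.com/download/links-2.13.tar.gz'
file_link_name=$(url_filename "$file_link")
echo "$file_link_name"
```

Because both expansions are done by the shell itself, no external process such as awk, sed or basename is started.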

Changing the file names and copying into different directory

I have about 1000 files. I want to rename them by cutting a few characters out of each file name, and copy them to another directory.
Example of the original file names:
vfcon062562~19.xml
vfcon058794~29.xml
vfcon072009~3.xml
vfcon071992~10.xml
vfcon071986~2.xml
vfcon071339~4.xml
vfcon069979~43.xml
The required output drops the ~ and the characters that follow it, up to the extension.
Example output:
vfcon058794.xml
vfcon062562.xml
vfcon069979.xml
vfcon071339.xml
vfcon071986.xml
vfcon071992.xml
vfcon072009.xml
But I want to place them in a different directory.
If you are using bash or similar you can use the following simple loop:
for input in vfcon*xml
do
mv "$input" "targetDir/$(echo "$input" | awk -F~ '{print $1".xml"}')"
done
Or in a single line:
for input in vfcon*xml; do mv "$input" "targetDir/$(echo "$input" | awk -F~ '{print $1".xml"}')"; done
This uses awk with ~ as the field separator, prints the first field, and appends ".xml" to create the output file name. The result is prefixed with targetDir, which can be a full path.
If you are using csh / tcsh then the syntax of the loop will be slightly different but the commands will be the same.
I like to make sure that my data set is correct prior to changing anything so I would put that into a variable first and then check over it.
files=$(ls vfcon*xml)
echo $files | less
Then, like @Stefan said, use a loop:
for i in $files
do
mv "$i" "$(echo "$i" | sed 's/~[0-9]*//')"
done
If you need help with bash you can use http://www.shellcheck.net/
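The same rename can be done with shell parameter expansion alone, and with cp instead of mv so the originals are kept, as the question asks for a copy. This is a sketch under assumptions: the function names strip_tilde_part and copy_renamed are hypothetical, and the source and target directories are passed as arguments.

```shell
#!/bin/bash
# Hypothetical helper: map "vfcon062562~19.xml" to "vfcon062562.xml".
strip_tilde_part() {
    local file=$1
    local stem=${file%%\~*}    # text before the first "~"
    local ext=${file##*.}      # text after the last "."
    printf '%s\n' "$stem.$ext"
}

# Copy every vfcon*~*.xml file from directory $1 into $2 under its shortened name.
copy_renamed() {
    local src=$1 dest=$2 path
    mkdir -p "$dest"
    for path in "$src"/vfcon*~*.xml; do
        [ -e "$path" ] || continue    # the glob matched nothing
        cp "$path" "$dest/$(strip_tilde_part "${path##*/}")"
    done
}
```

If two source files share the same prefix before the ~, the later copy overwrites the earlier one, so it is worth checking the file list first.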

Unable to delete a line in file in shell script

I have to delete a line in a file from inside a shell script.
I am trying this:
linenumber=0
##CHeck If server IP exists
if grep -wq $serverip $FILE; then
echo "IP exists"
linenumber=$(awk -v serverip="$serverip" '$0 ~ serverip {print NR}' $FILE)
echo "$linenumber"
sed -e '${$linenumber}d' $FILE
fi
Basically I extract the line number and then want to delete it.
sed -e '1d' $FILE works on the command line, but the sed command inside the script does not.
Why? How to get it working ?
This is simply a case of using the incorrect quotes around your sed command, so the variable isn't being used. Ignoring the fact that you're unnecessarily using 3 tools when 1 would suffice, the fix is this:
sed -e "${linenumber}d" "$FILE"
Perhaps your requirement is more complex than it appears but I would suggest changing your entire script to this:
awk -v serverip="$serverip" '!($0 ~ serverip)' "$FILE"
This prints every line that doesn't contain the shell variable $serverip. It is assumed that you have escaped any regex meta-characters present in the variable.
Alternatively (and more succinctly):
sed "/$serverip/d" "$FILE"
If you actually want the messages to be printed out (I assumed that they were for debugging), then that's easy enough to achieve:
awk -v serverip="$serverip" '$0 ~ serverip { print "IP exists"; print NR; next } 1' "$FILE"
If you're not familiar with the 1 at the end, it's just a common shorthand which causes awk to print each line (1 is always true and the default action is { print }).
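The commands above treat $serverip as a regular expression, so the dots in an IP address match any character. A sketch that compares fields as plain strings instead, and writes the result back to the file, might look like this. It assumes the IP appears as a whole whitespace-separated field; the function name delete_ip_lines is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: remove every line that has a field equal to the given IP.
# Fields are compared as strings, so "." is not treated as a regex wildcard.
delete_ip_lines() {
    local ip=$1 file=$2 tmp status
    tmp=$(mktemp) || return 1
    awk -v ip="$ip" '{
        for (i = 1; i <= NF; i++)
            if (($i "") == ip) next    # skip lines with a matching field
        print
    }' "$file" > "$tmp" && cat "$tmp" > "$file"   # rewrite only if awk succeeded
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Writing through a temporary file is needed because neither awk nor plain sed modifies the input file; they only print the filtered text to standard output.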

Unix command to remove everything after first column

I have a text file in which I have something like this-
10.2.57.44 56538154 3028
120.149.20.197 28909678 3166
10.90.158.161 869126135 6025
In that text file, I have around 1,000,000 rows exactly as above. I am working in SunOS environment. I needed a way to remove everything from that text file leaving only IP Address (first column in the above text file is IP Address). So after running some unix command, file should look like something below.
10.2.57.44
120.149.20.197
10.90.158.161
Can anyone please help me out with some Unix command that can remove all the thing leaving only IP Address (first column) and save it back to some file again.
So output should be something like this in some file-
10.2.57.44
120.149.20.197
10.90.158.161
If the delimiter is a space character, use
cut -d " " -f 1 filename
If the delimiter is a tab character, there is no need for the -d option, as tab is the default delimiter for the cut command
cut -f 1 filename
-d: delimiter; the character immediately following the -d option is the field delimiter.
-f: specifies the list of fields to print (for example 1, or 1,3).
nawk '{print $1}' file > newFile && mv newFile file
OR
cut -f1 file > newFile && mv newFile file
As you're using SunOS, you'll want to get familiar with nawk (not awk, which is the old, and cranky version of awk, while nawk= new awk ;-).
In either case, you're printing the first field in the file to newFile.
(n)awk is a complete programming language designed for the easy manipulation of text files. The $1 means the first field on each line, $9 would mean the ninth field, etc., while $0 means the whole line. You can tell (n)awk what to use to separate the fields: it might be a tab character, a '|' character, or multiple spaces. By default, all versions of awk use whitespace, i.e. any run of spaces or tabs, to delimit the columns/fields on each line of a file.
For a very good intro to awk, see Grymoire's Awk page
The && means: execute the next command only if the previous command finished without a problem. This way you don't accidentally erase your good data file because of some error.
IHTH
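The "write to a new file, then mv" pattern from the answer above can be sketched as a function that also cleans up after a failure. The function name keep_first_column is hypothetical, and the sketch assumes a POSIX-style awk and the mktemp utility are available (on SunOS, nawk would take the place of awk).

```shell
#!/bin/bash
# Hypothetical helper: keep only the first whitespace-separated field of each line.
keep_first_column() {
    local file=$1 tmp
    tmp=$(mktemp "${file}.XXXXXX") || return 1   # temp file next to the original
    if awk '{ print $1 }' "$file" > "$tmp"; then
        mv "$tmp" "$file"                        # replace only after awk succeeded
    else
        rm -f "$tmp"                             # leave the original untouched
        return 1
    fi
}
```

Creating the temporary file next to the original keeps the final mv on the same filesystem, so the replacement is a rename rather than a copy.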
If you have vim, open the file with it. Then, in command mode, run the substitution :%s:<delimiter>.*$::g, where <delimiter> is a tab, a space, or whatever separates the columns. Now save the file with :wq.
Using sed, run a command like this, where input.txt stands for your original file: sed -e 's/<delimiter>.*$//' input.txt > file.txt
How about a perl script ;)
#!/usr/bin/perl -w
use strict;
my $file = shift;
die "Missing file or can't read it" unless $file and -r $file;
sub edit_in_place
{
    my $file = shift;
    my $code = shift;
    {
        local @ARGV = ($file);
        local $^I = '';
        while (<>) {
            &$code;
        }
    }
}
edit_in_place $file, sub {
    my @columns = split /\s+/;
    print "$columns[0]\n";
};
This will edit the file in place since you say it is a large one. You can also create a backup by modifying local $^I = ''; to local $^I = '.bak';
Try this
awk '{$1=$1; print $1}' temp.txt
Output
10.2.57.44
120.149.20.197
10.90.158.161
awk '{ print $1 }' file_name.txt > tmp_file_name.txt
mv tmp_file_name.txt file_name.txt
'> tmp_file_name.txt' means redirecting STDOUT of awk '{ print $1 }' file_name.txt to a file named tmp_file_name.txt
FYI :
$1 means first column based on delimiter. The default delimiter is whitespace
$2 means second column based on delimiter. The default delimiter is whitespace
..
..
$NF means last column based on delimiter (NF is the number of fields; NR is the current line number). The default delimiter is whitespace
If you want to change delimiter, use awk with -F
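A short illustration of -F together with the field variables described above. The sample record is made up for the example, and ":" is chosen as the delimiter only for demonstration.

```shell
#!/bin/bash
line='alice:x:1000:/home/alice'   # made-up sample record

first=$(printf '%s\n' "$line" | awk -F: '{ print $1 }')    # first field
last=$(printf '%s\n' "$line" | awk -F: '{ print $NF }')    # last field
count=$(printf '%s\n' "$line" | awk -F: '{ print NF }')    # number of fields
recno=$(printf '%s\n' "$line" | awk -F: '{ print NR }')    # record (line) number

echo "$first $last $count $recno"   # prints: alice /home/alice 4 1
```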
