How can I cat output of a system("pwd") with text in awk? - linux

Please bear with me as I'm very new to shell scripting and awk. I've been trying to create a playlist to play some of my music in Mplayer, but it only works if the full path to each file is specified. I've been trying to set up a little shell script to insert the output of $pwd before each filename and write it to a playlist, like so:
ls | awk '{system("pwd") | getline x; grep -v 0; gsub("\n",""); print x"/"$1}' > playlist.txt
(the grep is to get rid of the "0" status output from system("pwd")). However, there is a newline in x, so I get the output
/home/(directory)
/Song_1.mp3
/home/(directory)
/Song_2.mp3
and what I want is
/home/(directory)/Song_1.mp3
/home/(directory)/Song_2.mp3

As others have answered, there are other ways to get the fully-qualified paths requested in the OP.
For those who are using Awk for other purposes and require the working directory, Awk has access to environment variables through the ENVIRON array. You can access it via ENVIRON["PWD"].
The directory comes back fully qualified, so the output may not look exactly like the /home/(directory)/filename.mp3 requested. Even so, here is a one-liner that should work for the OP's purpose:
ls -1 | awk '{ print(ENVIRON["PWD"] "/" $0); }' > playlist.txt
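If you are curious what else is in there, a quick sketch (any POSIX awk should do) that reads a couple of common variables out of ENVIRON:

awk 'BEGIN {
    # ENVIRON holds the environment awk was started with
    print "cwd:  " ENVIRON["PWD"]
    print "home: " ENVIRON["HOME"]
}'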

You can use printf like this:
printf "%s\n" "$PWD"/*
OR else your awk:
printf "%s\n" * | awk 'BEGIN{"pwd"|getline d} {print d "/" $0}'

Not using awk or anything, but I think this does what you want?
find -printf "\"$PWD/%f\"\n"

Are you perhaps trying to do this?
ls -1 ${PWD}/*.mp3
or
find "${PWD}" -maxdepth 1 -name "*.mp3" -print
or
for f in ${PWD}/*.mp3
do
echo "${f}"
done

You could also do
find $(pwd) -maxdepth 1

Related

Bash Script to Pull Employee Name

I've recently enrolled in a cybersecurity bootcamp and am having a little trouble figuring out where I'm going wrong writing this script with a grep command. I'm supposed to be pulling employee names from a schedule and the script is supposed to be able to accept 2 arguments representing a specific date and a specific time.
If I type the following line below, it successfully goes into the schedule file with the date of 0310 and pulls the name of the employee that was working at 5am.
find -type f -iname *0310* | grep "05:00:00 AM" ./* | awk -F" " '{print $5, $6}'
However when I turn it into a script like this:
#!/bin/bash
find -type f -iname *$1* | grep $2 ./* | awk -F" " '{print $3, $4}'
And execute like this:
./script.sh 0310 "05:00:00 AM"
It gives me the following error and prints the employees who were working at 5am and also the ones working at 5pm.
grep: AM: No such file or directory
I also get this error if I have another file with "0310" in the name
find: paths must precede expression: `random_file_with_0310.txt'
find: possible unquoted pattern after predicate `-iname'?
Where am I going wrong with this script? Very new to BASH
I think what you actually want is:
#!/bin/bash
find -type f -iname "*${1}*" -exec awk -v i="${2}" '$0 ~ i {print $5, $6}' "{}" +
Note that awk by default uses any number of whitespace (spaces, tabs) as a separator, so your field-separator may not actually be what you need/want, either.
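For example, on a made-up schedule line (your real layout may differ), default whitespace splitting gives:

echo "Sat 0310 05:00:00 AM Jane Doe" | awk '{ print $5, $6 }'   # prints: Jane Doe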
And a different approach:
#!/bin/bash
grep "${2}" $( find -type f -iname "*${1}*" ) | awk '{print $5, $6}'
Slightly shorter (less typing), but more processes involved.
Your first problem is quoting.
grep: AM: No such file or directory
This is because what grep $2 ./* is running is
grep 05:00:00 AM
making AM the file argument, followed by the expansion of ./*, which is every file in whatever directory you ran the command from; that is also not what you want. You quoted it correctly in your CLI example, but you have to quote it in your script as well.
grep "$2" ./* # still not looking at the right file
This will pass the "05:00:00 AM" correctly, but it isn't going to search for it in the file(s) returned by find.
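You can see the difference for yourself with a sketch that fakes the script's positional parameters:

set -- 0310 "05:00:00 AM"   # simulate the $1 and $2 the script receives
printf '[%s]\n' $2          # unquoted: word-split into [05:00:00] and [AM]
printf '[%s]\n' "$2"        # quoted: stays one argument, [05:00:00 AM]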
Assuming there is only one file (I wouldn't, but for simplicity's sake...) - try
file=`find -type f -iname *"$1"*` # note the quoting here also
Personally, I prefer the improved syntax for the same thing -
file="$(find -type f -iname *"$1"*)" # note the quoting here also
If there is any chance you are going to get multiple files, then this is likely way beyond the scope of a bootcamp unless they are really doing it right, in which case c.f. this discussion of why filenames are not to be trusted.
ANYWAY - once you have your filename, you still don't need grep.
awk -v ts="$2" '$0~ts{print $5, $6}' "$file"
or even, in one step,
awk -v ts="$2" '$0~ts{print $5, $6}' "$(find -type f -iname *"$1"*)"
...but if you just felt the need to add a redundant pattern parser antipattern, then
grep "$2" "$(find -type f -iname *"$1"*)" | awk '{print $5, $6}'
A possible alternative, with no promises on performance...
#!/bin/bash
ts="$1"; # save the string search pattern
shift; # and shift it off the argument list
shopt -s globstar; # make ** match an arbitrary depth of folders
awk -v ts="$ts" '$0~ts{print $5, $6}' "$@" # just use awk
Call it with
./script.sh "05:00:00 AM" **/*0310* # pass search pattern first, then file list
This should let the interpreter locate matching files for you and pass that list to the script. awk itself will only open the files it is passed as arguments, so you no longer need the find. awk can also pattern match for lines in those files, so you no longer need the separate grep.
(This does run the possibility of returning directories and other weirdness as well as just plain files; we can add lines to accommodate that, but I'm trying to keep it fairly simple for the given problem.)
I omitted the -F" " - you probably don't need that, but be sure to test to see if it changes your actual output dataset. If what you literally meant was that you want every space to delimit a field, so that consecutive spaces mean empty fields, use -F'[ ]'.
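A quick illustration of that difference:

printf 'a  b\n' | awk '{ print NF }'          # default FS: runs of blanks collapse, NF is 2
printf 'a  b\n' | awk -F'[ ]' '{ print NF }'  # every single space delimits, so the empty field counts, NF is 3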
If that's too fancy for your context, tink's answer is probably what you want.

How to print only the filename part of files that contain a certain string?

I need to print out the filename (e.g. A001.txt) that contains the string "XYZ".
I tried this:
grep -H -R "XYZ" ~/directory/*.txt | cut -d':' -f1
It would output the entire path (e.g. ~/directory/A001.txt). How can I make it so that it would only output the filename (e.g. A001.txt)?
Why oh why did the GNU guys give grep an option to recursively find files when there's a perfectly good tool designed for the job and with an extremely obvious name. Sigh...
find . -type f -exec awk '/XYZ/{print gensub(/.*\//,"",1,FILENAME); nextfile}' {} +
The above uses GNU awk which I assume you have since you were planning to use GNU grep.
grep -lr term dir/to/search/ | awk -F'/' '{print $NF}' should do the trick.
-l just lists filenames, including their directories.
-r is recursive to go through the directory tree and all files in the dir specified.
This all gets piped to awk, which is told to use / as a delimiter (not allowed in file names, so not as brittle as it could be) and to print the last field (NF is the field count, so $NF is the last field)
grep -Rl "content" | xargs -d '\n' basename -a
This should do the trick and print only the filename without the path.
basename prints filename NAME with any leading directory components removed.
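For example (paths made up):

basename /home/user/directory/A001.txt    # -> A001.txt
basename -a dir1/A001.txt dir2/B002.txt   # -a strips each of several names at once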
Reference: https://linux.die.net/man/1/basename

Concatenation of huge number of selective files from a directory in Shell

I have more than 50000 files in a directory such as file1.txt, file2.txt, ....., file50000.txt. I would like to concatenate some of the files, whose file numbers are listed in the following text file (need.txt).
need.txt
1
4
35
45
71
.
.
.
I tried the following. Though it works, I am looking for a simpler and shorter way.
n1=1
n2=$(wc -l < need.txt)
while [ $n1 -le $n2 ]
do
f1=$(awk -v n1="$n1" 'NR==n1 {print $1}' need.txt)
cat file$f1.txt >> out.txt
(( n1++ ))
done
This might also work for you:
sed 's/.*/file&.txt/' < need.txt | xargs cat > out.txt
Something like this should work for you:
sed -e 's/.*/file&.txt/' need.txt | xargs cat > out.txt
It uses sed to translate each line into the appropriate file name and then hands the file names to xargs, which passes them to cat.
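You can check what the sed stage produces before handing it over:

printf '1\n4\n35\n' | sed 's/.*/file&.txt/'
# file1.txt
# file4.txt
# file35.txt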
Using awk it could be done this way:
awk 'NR==FNR{ARGV[ARGC]="file"$1".txt"; ARGC++; next} {print}' need.txt > out.txt
This adds each file to the ARGV array of files to process and then prints every line it sees.
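To see the ARGV mechanism in isolation, here is a small sketch (the file names are hypothetical); awk happily processes files appended to ARGV at startup:

awk 'BEGIN { ARGV[ARGC++] = "extra.txt" }   # queue a second input file
     { print FILENAME ": " $0 }' first.txt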
It is possible to do it without any sed or awk command, using only bash and cat (of course).
for i in $(cat need.txt); do cat file${i}.txt >> out.txt; done
And, as you wanted, it is quite simple.

Rename directories from abc.folder.xyz to folder.xyz

Say I have a directory with a bunch of site names in it.
i.e.
dev.domain.com
dev.domain2.com
dev.domain3.com
How can I rename those to <domain>.com on the Linux CLI using piping and/or redirection in bash?
I get to a point and then get stuck.
find . -maxdepth 1 | grep -v "all" | cut --delimiter="." -f3 | awk '{ print $0 }'
Gives me the domain part, but I can't get past that. Not sure awk is the answer either. Any advice is appreciated.
To strip the leading 'dev.' from names it should be like this:
for i in $(find * -maxdepth 1 -type d); do mv $i $(echo $i | sed 's/dev.\(.*\)/\1/'); done
for i in *; do mv $i $( echo $i | sed 's/\([^.]*\)\.\([^.]*\)\.\([^.]*\)/\2.\3/' ); done
Explained:
for i in *; do ....; done
do it for every file
echo $i | sed 's/\([^.]*\)\.\([^.]*\)\.\([^.]*\)/\2.\3/'
takes three groups of "every character except ." and keeps only the last two
\2.\3 means: print second group, a dot, third group
the $( ... ) takes output of sed and "pastes" it after mv $i and is called "command substitution" http://www.gnu.org/software/bash/manual/bashref.html#Command-Substitution
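A trivial standalone example of command substitution:

now="$(date +%H:%M)"      # capture the command's stdout into a variable
echo "the time is $now"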
Try the rename command. It can take a regular expression like this:
rename 's/^dev\.//' dev.*
Under the directory you want to work with, try:
ls -dF *|grep "/$"|awk 'BEGIN{FS=OFS="."} {print "mv "$0" "$2,$3}'
This will print the mv commands. If you want to actually perform the rename, add "|sh" at the end:
ls -dF *|grep "/$"|awk 'BEGIN{FS=OFS="."} {print "mv "$0" "$2,$3}'|sh

Problems with Grep Command in bash script

I'm having some rather unusual problems using grep in a bash script. Below is an example of the bash script code that I'm using that exhibits the behaviour:
UNIQ_SCAN_INIT_POINT=1
cat "$FILE_BASENAME_LIST" | uniq -d >> $UNIQ_LIST
sed '/^$/d' $UNIQ_LIST >> $UNIQ_LIST_FINAL
UNIQ_LINE_COUNT=`wc -l $UNIQ_LIST_FINAL | cut -d \ -f 1`
while [ -n "`cat $UNIQ_LIST_FINAL | sed "$UNIQ_SCAN_INIT_POINT"'q;d'`" ]; do
CURRENT_LINE=`cat $UNIQ_LIST_FINAL | sed "$UNIQ_SCAN_INIT_POINT"'q;d'`
CURRENT_DUPECHK_FILE=$FILE_DUPEMATCH-$CURRENT_LINE
grep $CURRENT_LINE $FILE_LOCTN_LIST >> $CURRENT_DUPECHK_FILE
MATCH=`grep -c $CURRENT_LINE $FILE_BASENAME_LIST`
CMD_ECHO="$CURRENT_LINE matched $MATCH times," cmd_line_echo
echo "$CURRENT_DUPECHK_FILE" >> $FILE_DUPEMATCH_FILELIST
let UNIQ_SCAN_INIT_POINT=UNIQ_SCAN_INIT_POINT+1
done
On numerous occasions, when grepping for the current line in the file location list, it has put no output to the current dupechk file even though there have definitely been matches to the current line in the file location list (I ran the command in terminal with no issues).
I've rummaged around the internet to see if anyone else has had similar behaviour, and thus far all I have found is that it is something to do with buffered and unbuffered outputs from other commands operating before the grep command in the Bash script....
However no one seems to have found a solution, so basically I'm asking you guys if you have ever come across this, and any idea/tips/solutions to this problem...
Regards
Paul
The `problem' is the standard I/O library. When it is writing to a terminal it is unbuffered, but if it is writing to a pipe then it sets up buffering.
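If buffering really is the culprit, GNU coreutils' stdbuf can force line-buffered output when a command writes into a pipe (a generic sketch, not specific to your script):

stdbuf -oL grep "pattern" somefile | while read -r line; do
    echo "got: $line"   # lines arrive as they are matched, not in block-sized chunks
done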
try changing
CURRENT_LINE=`cat $UNIQ_LIST_FINAL | sed "$UNIQ_SCAN_INIT_POINT"'q;d'`
to
CURRENT_LINE=`sed "$UNIQ_SCAN_INIT_POINT"'q;d' $UNIQ_LIST_FINAL`
Are there any directories with spaces in their names in $FILE_LOCTN_LIST? Because if there are, those spaces will need to be escaped somehow. Some combination of find and xargs can usually deal with that for you, especially xargs -0.
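Something along these lines (the pattern and paths are placeholders):

find . -type f -print0 |          # NUL-terminate each file name
xargs -0 grep -l "somestring"     # xargs -0 splits on NUL, so spaces survive intact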
A small bash script using md5sum and sort that detects duplicate files in the current directory:
CURRENT=""
md5sum * |
sort |
while read md5sum filename;
do
[[ $CURRENT == $md5sum ]] && echo $filename is duplicate;
CURRENT=$md5sum;
done
You tagged linux, so I assume you have tools like GNU find, md5sum, uniq, sort etc. Here's a simple example to find duplicate files:
$ echo "hello world">file
$ md5sum file
6f5902ac237024bdd0c176cb93063dc4 file
$ cp file file1
$ md5sum file1
6f5902ac237024bdd0c176cb93063dc4 file1
$ echo "blah" > file2
$ md5sum file2
0d599f0ec05c3bda8c3b8a68c32a1b47 file2
$ find . -type f -exec md5sum "{}" \; |sort -n | uniq -w32 -D
6f5902ac237024bdd0c176cb93063dc4 ./file
6f5902ac237024bdd0c176cb93063dc4 ./file1
