Reading the path of files as string in shell script - linux

My Aim -->
Files Listing from a command has to be read line by line and be used as part of another command.
Description -->
A command in linux returns
archive/Crow.java
archive/Kaka.java
mypmdhook.sh
which is stored in changed_files variable. I use the following while loop to read the files line by line and use it as part of a pmd command
while read each_file
do
echo "Inside Loop -- $each_file"
done<$changed_files
I am new to writing shell script but my assumption was that the lines would've been separated in the loop and printed in each iteration but instead I get the following error --
mypmdhook.sh: 7: mypmdhook.sh: cannot open archive/Crow.java
archive/Kaka.java
mypmdhook.sh: No such file
Can you tell me how I can just get the value as a string, and not as a file to be opened, so that I can later use it inside a command? By the way, the file does exist, which confused me even more. I'd be happy with any kind of answer that helps me understand and resolve this issue.

Since you have data stored in a variable, use a "here string" instead of file redirection:
changed_files="archive/Crow.java
archive/Kaka.java
mypmdhook.sh"
while read each_file
do
echo "Inside Loop -- $each_file"
done <<< "$changed_files"
Inside Loop -- archive/Crow.java
Inside Loop -- archive/Kaka.java
Inside Loop -- mypmdhook.sh
It is extremely important to quote "$changed_files" in order to preserve the newlines, so the while-read loop works as you expect. A rule of thumb: always quote variables, unless you know exactly why you want to leave the quotes off.
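The `<<<` here string is a bash feature. The error format in the question ("mypmdhook.sh: 7: ... cannot open") resembles what dash prints, so the hook may be running under plain sh; that is only a guess. A minimal sketch, assuming a POSIX shell, that feeds the variable through a here-document instead (the function name `print_each` is made up for illustration):

```shell
#!/bin/sh
changed_files="archive/Crow.java
archive/Kaka.java
mypmdhook.sh"

# print_each is a hypothetical helper name used only in this sketch.
print_each() {
    # IFS= and -r keep leading whitespace and backslashes intact.
    while IFS= read -r each_file; do
        echo "Inside Loop -- $each_file"
    done <<EOF
$changed_files
EOF
}

print_each
```

The here-document terminator (EOF) has to start in the first column, even when the loop itself is indented.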

What happens here is that the value of your variable $changed_files is substituted into your command, and you get something like
while read each_file
do
echo "Inside Loop -- $each_file"
done < archive/Crow.java
archive/Kaka.java
mypmdhook.sh
then the shell tries to open the file for redirecting the input and obviously fails.
The point is that redirections (e.g. <, >, >>) in most cases accept filenames, but what you really need is to give the contents of the variable to the stdin. The most obvious way to do that is
echo "$changed_files" | while read each_file; do echo "Inside Loop -- $each_file"; done
You can also use the for loop instead of while read:
for each_file in $changed_files; do echo "inside Loop -- $each_file"; done
I prefer using while read ... if there is a chance that some filename may contain spaces, but in most cases for ... in will work for you.
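To make the difference concrete, here is a small sketch with a made-up file list in which one name contains a space (the variable names `files`, `count_for` and `count_while` are assumptions for this example). The for loop splits on every blank, while the read loop splits on newlines only:

```shell
#!/bin/sh
# Hypothetical file list; the first name contains a space.
files="my file.txt
other.txt"

count_for=0
for f in $files; do                 # unquoted on purpose: word splitting
    count_for=$((count_for + 1))
done

count_while=0
while IFS= read -r f; do            # one iteration per line
    count_while=$((count_while + 1))
done <<EOF
$files
EOF

echo "for: $count_for, while read: $count_while"
# prints: for: 3, while read: 2
```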

Rather than storing the command's output in a variable, use a while loop like this:
mycommand | while read -r each_file; do echo "Inside Loop -- $each_file"; done
If you're using BASH you can use process substitution:
while read -r each_file; do echo "Inside Loop -- $each_file"; done < <(mycommand)
btw your attempt of done<$changed_files will assume that changed_files represents a file.
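One practical difference between the two forms: in bash with default settings, the loop on the right side of a pipe runs in a subshell, so variables set inside it are lost afterwards, while the process substitution form keeps the loop in the current shell. A sketch, assuming bash, with `mycommand` defined as a stand-in function:

```shell
#!/bin/bash
# mycommand is a stand-in for whatever produces the file list.
mycommand() { printf '%s\n' archive/Crow.java archive/Kaka.java mypmdhook.sh; }

count_pipe=0
mycommand | while read -r each_file; do
    count_pipe=$((count_pipe + 1))      # changes a copy inside the subshell
done

count_procsub=0
while read -r each_file; do
    count_procsub=$((count_procsub + 1))
done < <(mycommand)

echo "pipe: $count_pipe, process substitution: $count_procsub"
# prints: pipe: 0, process substitution: 3
```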


How do you append a string built with interpolation of vars and STDIN to a file?

Can someone fix this for me.
It should copy a version log file to backup after moving to a repo directory
Then it automatically appends line given as input to the log file with some formatting.
That's it.
Assume existence of log file and test directory.
#!/bin/bash
cd ~/Git/test
cp versionlog.MD .versionlog.MD.old
LOGDATE="$(date --utc +%m-%d-%Y)"
read -p "MSG > " VHMSG |
VHENTRY="- **${LOGDATE}** | ${VHMSG}"
cat ${VHENTRY} >> versionlog.MD
shell output
virufac@box:~/Git/test$ ~/.logvh.sh
MSG > testing script
EOF
EOL]
EOL
e
E
CTRL + C to get out of stuck in reading lines of input
virufac@box:~/Git/test$ cat versionlog.MD
directly outputs the markdown
# Version Log
## version 0.0.1 established 01-22-2020
*Working Towards Working Mission 1 Demo in 0.1 *
- **01-22-2020** | discovered faker.Faker and deprecated old namelessgen
EOF
EOL]
EOL
e
E
I finally got it to save the damned input lines to the file instead of just echoing the command I wanted to enter on the screen and not executing it. But... why isn't it adding the lines built from the VHENTRY variable... and why doesn't it stop reading after one line sometimes and this time not. You could see I was trying to do something to tell it to stop reading the input.
After realizing that something I had done in the script was an accident, I tried to fix it and saw that the | at the end of the read command was seemingly the only reason the script saved anything to the file in the first place.
I would have done this in python3 if I had known this script wouldn't be the simplest thing I had ever done. Now I just have to know how you do it after all the time spent on it so that I can remember never to think a shell script will save time again.
Use printf to write a string to a file. cat tries to read from a file named in the argument list. And when the argument is - it means to read from standard input until EOF. So your script is hanging because it's waiting for you to type all the input.
Don't put quotes around the path when it starts with ~, as the quotes make it a literal instead of expanding to the home directory.
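Get rid of | at the end of the read line. read doesn't write anything to stdout, so there's nothing to pipe to the following command. The pipe also makes both read and the VHENTRY assignment on the next line run in subshells, so neither variable is set in the main script by the time cat runs.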
There isn't really any need for the VHENTRY variable, you can do that formatting in the printf argument.
#!/bin/bash
cd ~/Git/test
cp versionlog.MD .versionlog.MD.old
LOGDATE="$(date --utc +%m-%d-%Y)"
read -p "MSG > " VHMSG
printf -- '- **%s** | %s\n' "${LOGDATE}" "$VHMSG" >> versionlog.MD
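Once the stray pipe is removed and VHENTRY actually holds the entry, the unquoted `cat ${VHENTRY}` still waits for input, because the shell splits the value into separate words and the first one is a lone `-`. A small sketch that prints the words cat would receive (`split_words` is a hypothetical helper, and the date is just sample data):

```shell
#!/bin/bash
VHENTRY="- **01-22-2020** | testing script"

# split_words is a made-up helper: it expands its argument unquoted, the same
# way `cat ${VHENTRY}` does, and prints each resulting word in angle brackets.
split_words() {
    set -f          # disable globbing so ** is not matched against file names
    set -- $1       # unquoted on purpose: this is the word splitting step
    set +f
    printf '<%s>\n' "$@"
}

split_words "$VHENTRY"
```

The first word printed is `<->`, which cat treats as "read standard input"; the remaining words would be looked up as file names.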

"read" command not executing in "while read line" loop [duplicate]

First post here! I really need help on this one. I looked the issue up on Google, but can't manage to find a useful answer. So here's the problem.
I'm having fun coding some kind of framework in bash. Everyone can create their own module and add it to the framework. BUT. To know what arguments the script requires, I created an "args.conf" file that must be in every module, and that looks something like this:
LHOST;true;The IP the remote payload will connect to.
LPORT;true;The port the remote payload will connect to.
The first column is the argument name, the second defines if it's required or not, the third is the description. Anyway, long story short, the framework is supposed to read the args.conf file line by line to ask the user a value for every argument. Here's the piece of code:
info "Reading module $name argument list..."
while read line; do
echo $line > line.tmp
arg=`cut -d ";" -f 1 line.tmp`
requ=`cut -d ";" -f 2 line.tmp`
if [ $requ = "true" ]; then
echo "[This argument is required]"
else
echo "[This argument isn't required, leave a blank space if you don't wan't to use it]"
fi
read -p " $arg=" answer
echo $answer >> arglist.tmp
done < modules/$name/args.conf
tr '\n' ' ' < arglist.tmp > argline.tmp
argline=`cat argline.tmp`
info "Launching module $name..."
cd modules/$name
$interpreter $file $argline
cd ../..
rm arglist.tmp
rm argline.tmp
rm line.tmp
succes "Module $name execution completed."
As you can see, it's supposed to ask the user a value for every argument... But:
1) The read command seems to not be executing. It just skips it, and the argument has no value
2) Despite the fact that the args.conf file contains 3 lines, the loop seems to execute just a single time. All I see on the screen is "[This argument is required]" just one time, and the module just launches (and crashes because it doesn't have the required arguments...).
I really don't know what to do here... I hope someone here has an answer ^^'.
Thanks in advance!
(and sorry for any mistakes, I'm French)
Alpha.
As @that other guy pointed out in a comment, the problem is that all of the read commands in the loop are reading from the args.conf file, not the user. The way I'd handle this is by redirecting the conf file over a different file descriptor than stdin (fd #0); I like to use fd #3 for this:
while read -u3 line; do
...
done 3< modules/$name/args.conf
(Note: if your shell's read command doesn't understand the -u option, use read line <&3 instead.)
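A runnable sketch of the idea, assuming bash: the config lines arrive on fd 3, so the inner read is free to take its answer from stdin. The temp file, the `ask_all` function and the piped-in answers are made up so the example can run without a terminal; in the real script the inner read would prompt the user.

```shell
#!/bin/bash
# Stand-in for modules/$name/args.conf, written to a temp file for the demo.
conf="$(mktemp)"
printf '%s\n' \
    'LHOST;true;The IP the remote payload will connect to.' \
    'LPORT;true;The port the remote payload will connect to.' > "$conf"

# ask_all is a hypothetical name for the loop from the question.
ask_all() {
    local line arg answer
    while read -r -u3 line; do          # config lines come from fd 3
        arg=${line%%;*}                 # text before the first ';'
        read -r answer                  # answer comes from stdin (fd 0)
        printf '%s=%s\n' "$arg" "$answer"
    done 3< "$conf"
}

# Simulate a user typing two answers.
printf '%s\n' 10.0.0.1 4444 | ask_all
# prints:
# LHOST=10.0.0.1
# LPORT=4444
```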
There are a number of other things in this script I'd recommend against:
Variable references without double-quotes around them, e.g. echo $line instead of echo "$line", and < modules/$name/args.conf instead of < "modules/$name/args.conf". Unquoted variable references get split into words (if they contain whitespace) and any wildcards that happen to match filenames will get replaced by a list of matching files. This can cause really weird and intermittent bugs. Unfortunately, your use of $argline depends on word splitting to separate multiple arguments; if you're using bash (not a generic POSIX shell) you can use arrays instead; I'll get to that.
You're using relative file paths everywhere, and cding in the script. This tends to be fragile and confusing, since file paths are different at different places in the script, and any relative paths passed in by the user will become invalid the first time the script cds somewhere else. Worse, you aren't checking for errors when you cd, so if any cd fails for any reason, then the entire rest of the script will run in the wrong place and fail bizarrely. You'd be far better off figuring out where your system's root directory is (as an absolute path), then referencing everything from it (e.g. < "$module_root/modules/$name/args.conf").
Actually, you're not checking for errors anywhere. It's generally a good idea, when writing any sort of program, to try to think of what can go wrong and how your program should respond (and also to expect that things you didn't think of will also go wrong). Some people like to use set -e to make their scripts exit if any simple command fails, but this doesn't always do what you'd expect. I prefer to explicitly test the exit status of the commands in my script, with something like:
command1 || {
echo 'command1 failed!' >&2
exit 1
}
if command2; then
echo 'command2 succeeded!' >&2
else
echo 'command2 failed!' >&2
exit 1
fi
You're creating temp files in the current directory, which risks random conflicts (with other runs of the script at the same time, any files that happen to have names you're using, etc). It's better to create a temp directory at the beginning, then store everything in it (again, by absolute path):
module_tmp="$(mktemp -d -t module-system.XXXXXX)" || {
echo "Error creating temp directory" >&2
exit 1
}
...
echo "$answer" >> "$module_tmp/arglist.tmp"
(BTW, note that I'm using $() instead of backticks. They're easier to read, and don't have some subtle syntactic oddities that backticks have. I recommend switching.)
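Since the original script removed its temp files by hand at the end, a temp directory also needs cleaning up. One common way, not something the original script does, is an EXIT trap. A minimal sketch, assuming a plain `mktemp -d` is acceptable on your system:

```shell
#!/bin/bash
module_tmp="$(mktemp -d)" || {
    echo "Error creating temp directory" >&2
    exit 1
}
# Assumption: everything in the directory is disposable, so remove it
# whenever the script exits, including after an error.
trap 'rm -rf "$module_tmp"' EXIT

echo "example answer" >> "$module_tmp/arglist.tmp"
cat "$module_tmp/arglist.tmp"
```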
Speaking of which, you're overusing temp files; a lot of what you're doing with them can be done just fine with shell variables and built-in shell features. For example, rather than reading lines from the config file, then storing them in a temp file and using cut to split them into fields, you can simply echo to cut:
arg="$(echo "$line" | cut -d ";" -f 1)"
...or better yet, use read's built-in ability to split fields based on whatever IFS is set to:
while IFS=";" read -u3 arg requ description; do
(Note that since the assignment to IFS is a prefix to the read command, it only affects that one command; changing IFS globally can have weird effects, and should be avoided whenever possible.)
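A quick way to try the field splitting on its own, assuming bash (for the here string) and using one line from the args.conf sample in the question:

```shell
#!/bin/bash
line='LHOST;true;The IP the remote payload will connect to.'

# IFS is changed only for this one read; the last variable receives
# everything that is left on the line.
IFS=';' read -r arg requ description <<< "$line"

printf 'arg=%s\nrequ=%s\ndescription=%s\n' "$arg" "$requ" "$description"
```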
Similarly, storing the argument list in a file, converting newlines to spaces into another file, then reading that file... you can skip any or all of these steps. If you're using bash, store the arg list in an array:
arglist=()
while ...
arglist+=("$answer") # or ("$arg=$answer")? Not sure of your syntax.
done ...
"$module_root/modules/$name/$interpreter" "$file" "${arglist[@]}"
(That messy syntax, with the double-quotes, curly braces, square brackets, and at-sign, is the generally correct way to expand an array in bash).
If you can't count on bash extensions like arrays, you can at least do it the old messy way with a plain variable:
arglist=""
while ...
arglist="$arglist $answer" # or "$arglist $arg=$answer"? Not sure of your syntax.
done ...
"$module_root/modules/$name/$interpreter" "$file" $arglist
... but this runs the risk of arguments being word-split and/or expanded to lists of files.
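The word-splitting risk is easy to see by counting arguments. In this sketch (bash, with a made-up `count_args` helper and sample values), the second answer contains a space:

```shell
#!/bin/bash
# count_args is a hypothetical helper that reports how many arguments it got.
count_args() { echo "$#"; }

# Array version: each answer stays one argument.
arglist=()
arglist+=("10.0.0.1")
arglist+=("two words")
from_array="$(count_args "${arglist[@]}")"

# Plain-string version: the space inside the answer splits it in two.
argstring=""
argstring="$argstring 10.0.0.1"
argstring="$argstring two words"
from_string="$(count_args $argstring)"

echo "array: $from_array, string: $from_string"
# prints: array: 2, string: 3
```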

How to understand and avoid non-interactive mode errors when running ispell from script?

Background
Ispell is a basic command line spelling program in linux, which I want to call for a previously collected list of file names. These file names are recursively collected from a latex root file, for example. This is useful when you need to spell-check all recursively included latex files, and no other files. However, calling ispell from the command line turns out to be non-trivial, as ispell gives errors of the form
"Can't deal with non-interactive use yet." in some cases.
(As a side note, ideally I would like to call ispell programmatically from java using the ProcessBuilder class, and without requiring bash. The same error seems to pester this approach however.)
Question
Why is it that ispell gives the error "Can't deal with non-interactive use yet." in certain cases, when called in bash from a loop involving the read method, but not in other cases, as shown in the below code example?
The below minimal code example creates two small files
(testFileOne.txt, testFileTwo.txt) and a file containing the paths of the two created files (testFilesListTemp.txt).
Next, ispell is called for testFilesListTemp.txt in three different ways:
1. With the help of "cat"
2. By first collecting the names as a string, then looping over the substrings in the collected string, and calling ispell for each of them.
3. By looping over the contents of testFilesListTemp.txt directly, and
calling ispell for the extracted paths.
For some reason the third method does not work, and yields an error
"Can't deal with non-interactive use yet.". Why exactly does this error
occur, and how can it be prevented, and/or is there perhaps another variation
of the third approach that would work without errors?
#!/bin/bash
#ispell ./testFiles/ispellTestFile1.txt
# Creating two small files and a file with file paths for testing
printf "file 1 contents" > testFileOne.txt
printf "file 2 contents. With a spelling eeeeror." > testFileTwo.txt
printf "./testFileOne.txt\n./testFileTwo.txt\n" > testFilesListTemp.txt
COLLECTED_LATEX_FILE_NAMES_FILE=testFilesListTemp.txt
# Approach 1: produce list of file names with cat and
# pass as argument to ispell
# WORKS
ispell $(cat $COLLECTED_LATEX_FILE_NAMES_FILE)
# Second approach, first collecting file names as long string,
# then looping over substrings and calling ispell for each one of them
FILES=""
while read p; do
echo "read file $p"
FILES="$FILES $p"
done < $COLLECTED_LATEX_FILE_NAMES_FILE
printf "files list: $FILES\n"
for latexName in $FILES; do
echo "filename: $latexName"
ispell $latexName
done
# Third approach, not working
# ispell complains in this case about not working in non-interactive
# mode
#: "Can't deal with non-interactive use yet."
while read p; do
ispell "$p"
done < $COLLECTED_LATEX_FILE_NAMES_FILE
The third example does not work, because you redirect standard input. ispell needs a terminal and a user interaction. When you write code like this:
while read p; do
ispell "$p"
done < $COLLECTED_LATEX_FILE_NAMES_FILE
everything that is read from standard input by any program within the loop will be taken from the $COLLECTED_LATEX_FILE_NAMES_FILE file. ispell detects that and refuses to operate. However, you can use "descriptor redirection" to make read p read from the file, and ispell "$p" read from the "real" terminal. Just do:
exec 3<&0
while read p; do
ispell "$p" 0<&3
done < $COLLECTED_LATEX_FILE_NAMES_FILE
exec 3<&0 "copies" (saves) your standard input (0, the "terminal") to descriptor 3. And later on you redirect standard input (0) to ispell from that descriptor, by typing 0<&3 (you can omit 0 if you like).
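Another way to keep stdin untouched, assuming bash 4 or later, is to load the list into an array first and loop over the array, so that no redirection is active while the program runs. In this sketch `fake_ispell` is a made-up stand-in that reads one line from stdin, so the example can run without ispell or a terminal; replace it with ispell in real use.

```shell
#!/bin/bash
list="$(mktemp)"
printf '%s\n' ./testFileOne.txt ./testFileTwo.txt > "$list"

# Hypothetical stand-in for ispell: consumes one line of stdin per file.
fake_ispell() {
    local reply
    read -r reply
    printf '%s: got "%s" from stdin\n' "$1" "$reply"
}

check_all() {
    local files f
    mapfile -t files < "$list"      # the file list is read once, up front
    for f in "${files[@]}"; do
        fake_ispell "$f"            # stdin here is still the caller's stdin
    done
}

# Simulate a user typing two lines.
printf '%s\n' first second | check_all
# prints:
# ./testFileOne.txt: got "first" from stdin
# ./testFileTwo.txt: got "second" from stdin
```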

Unix: What does cat by itself do?

I saw the line data=$(cat) in a bash script (just declaring an empty variable) and am mystified as to what that could possibly do.
I read the man pages, but it doesn't have an example or explanation of this. Does this capture stdin or something? Any documentation on this?
EDIT: Specifically how the heck does doing data=$(cat) allow for it to run this hook script?
#!/bin/bash
# Runs all executable pre-commit-* hooks and exits after,
# if any of them was not successful.
#
# Based on
# http://osdir.com/ml/git/2009-01/msg00308.html
data=$(cat)
exitcodes=()
hookname=`basename $0`
# Run each hook, passing through STDIN and storing the exit code.
# We don't want to bail at the first failure, as the user might
# then bypass the hooks without knowing about additional issues.
for hook in $GIT_DIR/hooks/$hookname-*; do
test -x "$hook" || continue
echo "$data" | "$hook"
exitcodes+=($?)
done
https://github.com/henrik/dotfiles/blob/master/git_template/hooks/pre-commit
cat will catenate its input to its output.
In the context of the variable capture you posted, the effect is to assign the statement's (or containing script's) standard input to the variable.
The command substitution $(command) will return the command's output; the assignment will assign the substituted string to the variable; and in the absence of a file name argument, cat will read and print standard input.
The Git hook script you found this in captures the commit data from standard input so that it can be repeatedly piped to each hook script separately. You only get one copy of standard input, so if you need it multiple times, you need to capture it somehow. (I would use a temporary file, and quote all file name variables properly; but keeping the data in a variable is certainly okay, especially if you only expect fairly small amounts of input.)
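A stripped-down sketch of that capture-and-replay pattern, with two ordinary commands standing in for the hook scripts (`run_hooks` is a made-up name):

```shell
#!/bin/bash
# run_hooks is a hypothetical stand-in for the hook runner in the question.
run_hooks() {
    local data
    data=$(cat)                                   # read all of stdin once
    # Replay the captured text to two separate consumers.
    printf '%s\n' "$data" | grep -c ''            # first consumer: count lines
    printf '%s\n' "$data" | tr '[:lower:]' '[:upper:]'   # second: upper-case
}

printf 'hello how\nare you?\n' | run_hooks
# prints:
# 2
# HELLO HOW
# ARE YOU?
```

Command substitution strips trailing newlines, which is why the sketch adds one back with printf when replaying.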
Doing:
t@t:~# temp=$(cat)
hello how
are you?
t@t:~# echo $temp
hello how are you?
(A single Control-D on a line by itself following "are you?" terminates the input.)
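The output above is on one line only because $temp is expanded without quotes; the variable itself still contains the newline. A short sketch that shows both forms side by side:

```shell
#!/bin/sh
temp="hello how
are you?"

unquoted="$(echo $temp)"      # word splitting joins the lines with a space
quoted="$(echo "$temp")"      # the newline is preserved

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted lines: %s\n' "$(printf '%s\n' "$quoted" | grep -c '')"
# prints:
# unquoted: hello how are you?
# quoted lines: 2
```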
As manual says
cat - concatenate files and print on the standard output
Also
cat Copy standard input to standard output.
Here, cat will copy your STDIN to its output, and the command substitution assigns that output, as a single string, to the variable temp.
Say your bash script script.sh is:
#!/bin/bash
data=$(cat)
Then, the following commands will store the string STR in the variable data:
echo STR | bash script.sh
bash script.sh < <(echo STR)
bash script.sh <<< STR

All files in one dir, linux

Today I tried a script in linux to get all files in one dir. It was pretty straightforward, but I found something interesting.
#!/bin/bash
InputDir=/home/XXX/
for file in $InputDir'*'
do
echo $file
done
The output is:
/home/XXX/fileA /home/XXX/fileB
But when I just input the dir directly, like:
#!/bin/bash
InputDir=/home/XXX/
for file in /home/XXX/*
do
echo $file
done
The output is:
/home/XXX/fileA
/home/XXX/fileB
It seems, in the first script, there was only one loop and all the file names were stored in the variable $file in the FIRST loop, separated by space. But in the second script, one file name was stored in $file just in one loop, and there were more than one loop. What is exactly the difference between these two scripts?
Thanks very much, maybe my question is a little bit naive..
The behavior is correct and "as expected".
for file in $InputDir'*' means assign "/home/XXX/*" to $file (note the quotes). Since you quoted the asterisk, it will not be expanded at this point. When the shell sees echo $file, it first expands the variables and then it does glob expansion. So after the first step, it sees
echo /home/XXX/*
and after glob expansion, it sees:
echo /home/XXX/fileA /home/XXX/fileB
Only now, it will execute the command.
In the second case, the pattern /home/XXX/* is expanded before the for is executed and thus, each file in the directory is assigned to file and then the body of the loop is executed.
This will work:
for file in "$InputDir"*
but it's brittle; it will fail, for example, when you forget to add a / to the end of the variable $InputDir.
for file in "$InputDir"/*
is a little bit better (Unix will ignore double slashes in a path) but it can cause trouble when $InputDir is not set or empty: You'll suddenly list files in the / (root) folder. This can happen, for example, because of a typo:
inputDir=...
for file in "$InputDir"/*
Case matters on Unix :-)
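One way to guard against the empty-variable case, which the answer itself does not mention, is the `${var:?message}` expansion: it makes the shell stop with an error if the variable is unset or empty. A sketch, with `list_dir` and the temp directory made up for the demo:

```shell
#!/bin/bash
# list_dir is a hypothetical helper; the parentheses make the body run in a
# subshell, so a failed ${...:?} check ends only that subshell.
list_dir() (
    InputDir=$1
    for file in "${InputDir:?InputDir is empty or unset}"/*; do
        echo "$file"
    done
)

demo_dir="$(mktemp -d)"
touch "$demo_dir/fileA" "$demo_dir/fileB"

list_dir "$demo_dir"                                  # lists the two files
list_dir "" 2>/dev/null || echo "refused to list /"   # empty name is rejected
```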
To help you understand code like this, use set -x ("enable tracing") in a line before the code you want to debug.
The difference is the quoting of '*'. In the first case the loop only executes once, with $file equal to /home/XXX/* which then expands to all the files in the directory when passed to echo. In the second case it executes once per file, with $file equal to each file name in turn.
Bottom line - change:
for file in $InputDir'*'
to:
for file in $InputDir*
or, better, and to make it more readable - change:
InputDir=/home/XXX/
for file in $InputDir'*'
to:
InputDir=/home/XXX
for file in $InputDir/*
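To check the explanation yourself, you can count loop iterations for both spellings. This sketch uses a throwaway directory with two files instead of /home/XXX (the directory and variable names are assumptions for the demo):

```shell
#!/bin/bash
demo_dir="$(mktemp -d)"
touch "$demo_dir/fileA" "$demo_dir/fileB"

quoted_iterations=0
for file in "$demo_dir/"'*'; do         # quoted *: one literal word
    quoted_iterations=$((quoted_iterations + 1))
done

unquoted_iterations=0
for file in "$demo_dir"/*; do           # unquoted *: expanded before the loop
    unquoted_iterations=$((unquoted_iterations + 1))
done

echo "quoted: $quoted_iterations, unquoted: $unquoted_iterations"
# prints: quoted: 1, unquoted: 2
```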
