I run a command that generates IP addresses as output, but I am developing a workflow where I need those IP addresses to be written twice. Below is the sample command and its output.
$ some command >> out.txt
$ cat out.txt
10.241.1.85
hdfs://10.241.1.236/
hdfs://10.241.1.237/
What I want is to duplicate the output so it looks like this.
10.241.1.85
hdfs://10.241.1.236/
hdfs://10.241.1.237/
10.241.1.85
hdfs://10.241.1.236/
hdfs://10.241.1.237/
Any help please?
The solution given by #ott in a comment seems fine:
var=$(some cmd); echo -e "$var\n$var"
This does not assign the command to a variable; it assigns the output of the command to a variable.
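If you want the doubled output in out.txt rather than on the terminal, a minimal sketch of the same idea (assuming the output fits comfortably in a variable; some_command is just a placeholder for your real command) would be:
var=$(some_command)
printf '%s\n%s\n' "$var" "$var" > out.txt
Note that command substitution strips trailing newlines, which is why printf adds them back.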
If you do not want this, you can use tee (though this may cause some ordering problems) or duplicate it differently:
some_command > out.txt.tmp
cat out.txt.tmp out.txt.tmp > out.txt
rm out.txt.tmp
This way all the copied lines appear after all the original lines. If instead you want each line duplicated immediately (each line printed twice in a row), you can use
some_command | sed 'p' > out.txt
some command | tee -a out.txt out.txt
Or
some command | tee -a out.txt >> out.txt
Or
some command | tee -a out.txt out.txt >/dev/null
That is: run the command, pipe it to tee, enable append mode with -a, and append to the same file twice.
Write the output to a temporary file, then duplicate the temporary file into the destination and remove the temporary:
some command > /tmp/out.txt; cat /tmp/out.txt /tmp/out.txt > out.txt; rm /tmp/out.txt
Here are some more options you could play around with. If the output is too large to store in a variable, I'd probably go with tee and a temp file, plus gzip if disk write speed is a bottleneck.
someCommand > tmp.txt && cat tmp.txt tmp.txt > out.txt && rm tmp.txt
Now, if the disk read/write speed is a bottleneck, you can tee the output of someCommand and redirect one of the pipelines through gzip initially.
someCommand | tee >(gzip > tmp.gz) > out.txt && gunzip -c tmp.gz >> out.txt && rm tmp.gz
Additionally, if you don't need random access abilities for out.txt and plan on processing it through some other pipeline, you could always keep it stored gzipped until you need it.
someCommand | gzip > tmp.gz && cat tmp.gz tmp.gz > out.txt.gz && rm tmp.gz
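If you do keep it gzipped, you can later stream it straight into the next stage without ever writing the uncompressed copy to disk (someOtherPipeline is just a placeholder here):
gunzip -c out.txt.gz | someOtherPipeline
This works because a concatenation of gzip streams is itself a valid gzip file, so gunzip decompresses both copies in sequence.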
I would suggest this:
(someCommand | tee tmp.txt; cat tmp.txt) > out.txt; rm tmp.txt
I'm not sure there's a way to do this safely without resorting to a temporary file. You could capture the output in a variable, as some have suggested, but then you have to be careful about quoting to make sure whitespace doesn't get mangled, and you might also run into problems if the output is particularly large.
Related
I want to send a program the most recent lines from a text file using tail as stdin.
First, I echo to the program some input that will be the same every time, then send in tail input from an inputfile which should first be processed through sed. The following is the command line that I expect to work. But when the program runs it only receives the echo input, not the tail input.
(echo "new" && tail -f ~/inputfile 2> /dev/null | sed -n -r 'some regex' && cat) | ./program
However, the following works exactly as expected, printing everything out to the terminal:
echo "new" && tail -f ~/inputfile 2> /dev/null | sed -n -r 'some regex' && cat
So I tried another type of output, and again the echoed text appeared but the tail text did not show up anywhere:
(echo "new" && tail -f ~/inputfile 2> /dev/null | sed -n -r 'some regex') | tee out.txt
This made me think it is a problem with buffering, but I tried the unbuffer program and all other advice here (https://superuser.com/questions/59497/writing-tail-f-output-to-another-file) without results. Where is the tail output going and how can I get it to go into my program as expected?
The buffering problem was resolved when I prefixed the sed command with the following:
stdbuf -i0 -o0 -e0
This is much preferable to using unbuffer, which didn't even work for me. Dave M's suggestion of using sed's relatively new -u option also seems to do the trick.
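In context, that means the pipeline from the question would look something like this (keeping the placeholder 'some regex' and the same input file):
(echo "new" && tail -f ~/inputfile 2> /dev/null | stdbuf -i0 -o0 -e0 sed -n -r 'some regex') | ./program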
One thing you may be getting confused by: | (pipe) has higher precedence than && (sequential execution). So when you say
(echo "new" && tail -f ~/inputfile 2> /dev/null | sed -n -r 'some regex' && cat) | ./program
that is equivalent to
(echo "new" && (tail -f ~/inputfile 2> /dev/null | sed -n -r 'some regex') && cat) | ./program
So the cat isn't really doing anything, and the sed output is probably buffered a bit. You can try using the -u option to sed to get it to use unbuffered output:
(echo "new" && (tail -f ~/inputfile 2> /dev/null | sed -n -u -r 'some regex')) | ./program
I believe some versions of sed default to -u when the output is a terminal and not when it is a pipe, so that may be the source of the difference you're seeing.
You can use the i command in sed (see the command list in the manpage for details) to do the inserting at the beginning:
tail -f inputfile | sed -e '1inew file' -e 's/this/that/' | ./program
I'm building a little bash script to run another bash script that's found in multiple directories. Here's the code:
cd /home/mainuser/CaseStudies/
grep -R -o --include="Auto.sh" [\w] | wc -l
When I execute just that part, it finds the same file 5 times in each folder. So instead of getting 49 results, I get 245. I've written a recursive bash script before and I used it as a template for this problem:
grep -R -o --include=*.class [\w] | wc -l
This code has always worked perfectly, without any duplication. I've tried running the first command with and without the quotes, and I've tried -r as well. I've read through the bash documentation and I can't seem to find a way to prevent this duplication, or even figure out why I'm getting it. Any thoughts on how to get around this?
As a separate but related question: could I launch Auto.sh inside each directory so that the output of Auto.sh is dumped into that directory, without having to place Auto.sh in each folder? That would probably be much more efficient than what I'm currently doing, and it would also probably fix my current duplication problem.
This is the code for Auto.sh:
#!/bin/bash
index=1
cd /home/mainuser/CaseStudies/
grep -R -o --include=*.class [\w] | wc -l
grep -R -o --include=*.class [\w] |awk '{print $3}' > out.txt
while read LINE; do
echo 'Path '$LINE > 'Outputs/ClassOut'$index'.txt'
javap -c $LINE >> 'Outputs/ClassOut'$index'.txt'
index=$((index+1))
done <out.txt
Preferably I would like it to dump only the javap outputs for the application it is currently looking at. Since those .class files could be in any number of sub-directories, I'm not sure how to make them all dump into the top folder without executing a modified Auto.sh in the top directory of each application.
OK, so to fix the multiple matches:
grep -R -o --include="Auto.sh" [\w] | wc -l
Should be:
grep -R -l --include=Auto.sh '\w' | wc -l
The reason this was happening was that it was matching every instance of the letter w in Auto.sh, which occurred five times in the file.
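A tiny demonstration of the difference between -o (one output line per match) and -l (one line per matching file), run in an otherwise empty directory purely for illustration:
printf 'w w w w w\n' > Auto.sh
grep -R -o --include=Auto.sh '\w' . | wc -l    # 5, one line per matched character
grep -R -l --include=Auto.sh '\w' . | wc -l    # 1, one line per matching file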
However, the overall fix that doesn't require having to place Auto.sh in every directory, is something like this:
MAIN_DIR=/home/mainuser/CaseStudies/
cd $MAIN_DIR
ls -d */ > DirectoryList.txt
while read LINE; do
cd $LINE
mkdir ProjectOutputs
bash /home/mainuser/Auto.sh
cd $MAIN_DIR
done <DirectoryList.txt
That calls this Auto.sh code:
index=1
grep -R -o --include=*.class '\w' | wc -l
grep -R -o --include=*.class '\w' | awk '{print $3}' > ProjectOutputs.txt
while read LINE; do
echo 'Path '$LINE > 'ProjectOutputs/ClassOut'$index'.txt'
javap -c $LINE >> 'ProjectOutputs/ClassOut'$index'.txt'
index=$((index+1))
done <ProjectOutputs.txt
Thanks again for everyone's help!
Usually we redirect a command's output to a file, like this:
cat a.txt >> output.txt
As I found, if cat fails, output.txt is still created, which isn't what I expected. I know I could test like this:
if [ "$?" -ne "0"]; then
rm output.txt
fi
But this adds overhead, and it causes problems when output.txt already existed before my cat execution.
So I would also need to record the state of output.txt before running cat: if output.txt already existed before the cat execution, I should not rm it by mistake. But even then there could be a race condition: what if some other process creates output.txt right before my cat runs?
So is there any simple way to ensure that, if the command fails, the redirection target output.txt is removed, or never created at all?
Fixed output file names are bad news; don't use them.
You should probably redesign the processing so that you have a date-stamped file name. Failing that, you should use the mktemp command to create a temporary file, have the command you want executed write to that, and when the command is successful, you can move the temporary to the 'final' output — and you can automatically clean up the temporary on failure.
outfile="./output-$(date +%Y-%m-%d.%H:%M:%S).txt"
tmpfile="$(mktemp ./gadget-maker.XXXXXXXX)"
trap "rm -f '$tmpfile'; exit 1" 0 1 2 3 13 15
if cat a.txt > "$tmpfile"
then mv "$tmpfile" "$outfile"
else rm "$tmpfile"
fi
trap 0
You can simplify outfile to output.txt if you insist (but it isn't safe). You can use any prefix you like with the mktemp command. Note that by creating the temporary file in the current directory, where the final output file will be created too, you avoid cross-device file copying in the mv phase of operations: it is only a link() and an unlink() system call (or maybe even a rename() system call if such a thing exists on your machine; it does on Mac OS X).
You can't tell that the command has failed until it terminates, and by then it might have produced some output.
Probably a more useful condition is to avoid creating the output file until the command actually produces some output, and not worry about its status code.
This comes close:
command | { IFS= read -rn1 -d '' a &&
  { printf %s "$a" >> output.txt
    cat >> output.txt
  }
}
However, if the first character output by command is a NUL byte, the NUL won't be written to the output file. Since the extension of the output file is .txt, that's unlikely in this particular case, but it could be handled by adding the command
[[ -z $a ]] && printf '\0' >> output.txt
after the printf and before the cat.
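Putting those pieces together, the full sketch (command is still a placeholder) would read:
command | { IFS= read -rn1 -d '' a &&
  { printf %s "$a" >> output.txt
    [[ -z $a ]] && printf '\0' >> output.txt
    cat >> output.txt
  }
}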
I think this will work; check it out.
[ -e output.txt ] && (mv output.txt output.txt_bkp)
cat a.txt > /dev/null 2>&1; [ $? -eq 0 ] && (cat a.txt > output.txt)
Another way, as suggested by Jonathan:
[ -e output.txt ] && (mv output.txt output.txt_bkp)
if cat a.txt > /dev/null 2>&1
then
cat a.txt > output.txt
fi
I want to grep the output of my script, which itself contains calls to different binaries...
Since the script runs multiple binaries within it, I can't simply put an exec in and dump the output to a file (that does not capture the output from the binaries)...
And just so you know, I am monitoring the script output to determine whether the system has gotten stuck!
Why don't you append instead?
mybin1 | grep '...' >> mylog.txt
mybin2 | grep '...' >> mylog.txt
mybin3 | grep '...' >> mylog.txt
Does this not work?
#!/bin/bash
# Save the original stdout and stderr on FDs 11 and 12, then send both through tee.
exec 11>&1 12>&2 > >(exec tee /var/log/somewhere) 2>&1 ## Or add -a option to tee to append.
# call your binaries here
# Restore the original stdout and stderr, then close the saved FDs.
exec >&- 2>&- >&11 2>&12 11>&- 12>&-
I'm making a script and every time something is done I would like to write into my custom .log file. How do I do that?
And in the end I'd just like to read it with Bash. Do I just use cat?
Thanks.
The simplest syntax I always use is 2>&1 | tee -a file_name.log.
The syntax can be used after a command or execution of a file. e.g.
find . -type f 2>&1 | tee -a file_name.log
or
./test.sh 2>&1 | tee -a file_name.log
Just echo '<log message here>' >> custom.log.
The >> appends to the end of the file, whereas > would delete the contents of the file and then write the message.
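For example, a minimal sketch of such a logging helper (the function name log and the timestamp format are illustrative, not from the question):
#!/bin/bash
log() {
    # Append a timestamped message to the custom log file
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> custom.log
}

log "step 1 done"
log "step 2 done"

# Read the log back, as the question suggests
cat custom.log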