Linux: how to use tee in a piped command

time curl http://www.google.com | tee | wc | gzip > google.gz
Why doesn't this command work? It creates the file, and times the operation, but does not print the number of lines, words, and characters (wc).
time curl http://www.google.com | tee | wc
This will print the line, word, and character counts, but obviously the tee portion is pointless.
Is it because I'm sending the word count of the url to google.gz?
I have to use tee, gzip, time, and curl to download the Google web page to a gzipped file, print the word count, and show how long it took.
It is an assignment, so I'm not looking for someone to do it for me. My problem is that I can't tee to a utility, and I can't tee and gzip at the same time.
Maybe there is a way to use gzip with curl?

Well, wc outputs the number of lines, words, and characters, but then you send that output to gzip, which compresses it. Eventually, the compressed information ends up in google.gz. If you decompress the file, e.g. with
gunzip google.gz
you'll see the three numbers.
Also, normally when one uses tee, they specify a file where the tee'ed data is supposed to be stored.

time curl http://www.google.com | tee /dev/tty | gzip > google.gz

I'm going to guess that something like this is what you want:
time curl http://www.google.com | tee /tmp/z | gzip > google.gz; wc /tmp/z; rm /tmp/z
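If your shell is bash, process substitution offers a variant that avoids the temporary file entirely (a sketch; not portable to plain sh):
time curl http://www.google.com | tee >(wc) | gzip > google.gz
Here tee's standard output continues on to gzip, while the copy goes to the >(wc) process, so the line/word/character counts appear on the terminal alongside time's report.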

Related

how to use zcat without the warning when using pipe

I'm trying to silence zcat's warning with the -q option or with 2>/dev/null,
but so far nothing is working; I keep getting the same warning when a file name is missing.
I'm looping through hundreds of compressed files to extract specific data. The idea is that if zcat encounters a bad or missing file name, it should just stay quiet and wait for the next cycle, but currently this is what I get with both options:
zcat -q $ram | head -n1
or
zcat $ram | head -n1 2>/dev/null
gzip: compressed data not read from a terminal. Use -f to force decompression.
For help, type: gzip -h
Any idea how to solve this, or a faster way to read a .gz file with silencing that actually works?
Thanks
At present, you're redirecting only stderr from head; you're not redirecting from zcat at all. If you want to redirect stderr from zcat, then you need to put the redirection before the pipe symbol, like so:
zcat $ram 2>/dev/null | head -n1
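Applied to a loop over many files, as described in the question (a minimal sketch; the glob and the $ram variable name are assumptions based on the question):
for ram in *.gz; do
    # redirect zcat's stderr, not head's, so a bad or missing file stays quiet
    zcat "$ram" 2>/dev/null | head -n1
done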

Grep files in between wget recursive downloads

I am trying to recursively download several files using wget -m, and I intend to grep all of the downloaded files to find specific text. Currently, I can wait for wget to fully complete and then run grep. However, the wget process is time-consuming, as there are many files; instead I would like to show progress by grepping each file as it downloads and printing to stdout, all before the next file downloads.
Example:
download file1
grep file1 >> output.txt
download file2
grep file2 >> output.txt
...
Thanks for any advice on how this could be achieved.
As c4f4t0r pointed out
wget -m -O - <websites> | grep --color 'pattern'
Using grep's color option to highlight the pattern can be helpful, especially when dealing with bulky output to the terminal.
EDIT:
Below is a command line you can use. It creates a file called file and saves wget's output messages to it. Afterwards it tails the message file.
awk finds any line containing "saved" and extracts the filename, then grep searches that file for the pattern.
wget -m websites &> file & tail -f -n1 file|awk -F "\'|\`" '/saved/{system( ("grep --colour pattern ") $2)}'
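The same pipeline broken out with comments (this assumes wget quotes saved filenames with backquote/quote, as older versions did):
# mirror in the background, sending all of wget's messages to 'file'
wget -m websites &> file &
# follow the message file; split each line on ` and ' so $2 is the filename,
# and on every "saved" line run grep over that freshly downloaded file
tail -f -n1 file | awk -F "\'|\`" '/saved/ { system("grep --colour pattern " $2) }'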
Based on Xorg's solution I was able to achieve my desired effect with some minor adjustments:
wget -m -O file.txt http://google.com 2> /dev/null & sleep 1 && tail -f -n1 file.txt | grep pattern
This will print out all lines that contain pattern to stdout, and wget itself will produce no output visible from the terminal. The sleep is included because otherwise file.txt would not be created by the time the tail command executed.
As a note, this command will miss any results that wget downloads within the first second.

bash standard output can not be redirected into file

I am reading 'Advanced Bash Scripting', and in Chapter 31 there is a problem I cannot figure out.
tail -f /var/log/msg | grep 'error' >> logfile
Why does nothing get written to logfile?
Can you offer an explanation?
Thank you in advance.
As @chepner comments, grep is using a larger buffer (perhaps 4k or more) to buffer its stdout. Most of the standard utilities do this when piping or redirecting to a file. They typically only switch to line-buffered mode when outputting directly to the terminal.
You can use the stdbuf utility to force grep to do line buffering of its output:
tail -f /var/log/msg | stdbuf -oL grep 'error' >> logfile
As an easily observable demonstration of this effect, you can try the following two commands:
for ((i=0;;i++)); do echo $i; sleep 0.001; done | grep . | cat
and
for ((i=0;;i++)); do echo $i; sleep 0.001; done | stdbuf -oL grep . | cat
In the first command, the output from grep . (i.e. match all lines) will be buffered going into the pipe to cat. On mine the buffer appears to be about 4k. You will see the ascending numbers output in chunks as the buffer gets filled and then flushed.
In the second command, grep's output into the pipe to cat is line-buffered, so you should see terminal output for every line, i.e. more-or-less continuous output.
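If your grep is GNU grep, its --line-buffered option achieves the same effect without stdbuf (an equivalent sketch):
tail -f /var/log/msg | grep --line-buffered 'error' >> logfile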

Why no output is shown when using grep twice?

Basically I'm wondering why this doesn't output anything:
tail --follow=name file.txt | grep something | grep something_else
You can assume that it should produce output; I have run another command to confirm:
cat file.txt | grep something | grep something_else
It seems like you can't pipe the output of tail more than once!? Anyone know what the deal is and is there a solution?
EDIT:
To answer the questions so far, the file definitely has contents that should be displayed by the grep. As evidence if the grep is done like so:
tail --follow=name file.txt | grep something
Output shows up correctly, but if this is used instead:
tail --follow=name file.txt | grep something | grep something
No output is shown.
If at all helpful, I am running Ubuntu 10.04.
You might also run into a problem with grep buffering when inside a pipe.
i.e., you don't see the output from
tail --follow=name file.txt | grep something > output.txt
since grep will buffer its own output.
Use the --line-buffered switch for grep to work around this:
tail --follow=name file.txt | grep --line-buffered something > output.txt
This is useful if you want to get the results of the follow into the output.txt file as rapidly as possible.
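For the two-grep pipeline from the question, the flag belongs on the first grep, since that is the one writing into a pipe; the last grep line-buffers on its own when its output is the terminal:
tail --follow=name file.txt | grep --line-buffered something | grep something_else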
Figured out what was going on here. It turns out that the command is working; it's just that the output takes a long time to reach the console (approx. 120 seconds in my case). This is because the standard-output buffer is flushed per block rather than per line. So instead of getting every line from the file as it was written, I would get a giant block every 2 minutes or so.
It should be noted that this works correctly:
tail file.txt | grep something | grep something
It is the following of the file with --follow=name that is problematic.
For my purposes I found a way around it. What I was intending to do was capture the output of the first grep to a file, so the command would be:
tail --follow=name file.txt | grep something > output.txt
A way around this is to use the script command like so:
script -c 'tail --follow=name file.txt | grep something' output.txt
Script captures the output of the command and writes it to file, thus avoiding the second pipe.
This has effectively worked around the issue for me, and I have explained why the command wasn't working as I expected. Problem solved.
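A related alternative, assuming the expect package is installed, is its unbuffer utility; unbuffer -p makes grep believe it is writing to a terminal, so it line-buffers on its own:
tail --follow=name file.txt | unbuffer -p grep something > output.txt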
FYI, these other Stack Overflow questions are related:
Trick an application into thinking its stdin is interactive, not a pipe
Force another program's standard output to be unbuffered using Python
You do know that tail starts by default with the last ten lines of the file? My guess is everything the cat version found is well into the past. Try tail -n+1 --follow=name file.txt to start from the beginning of the file.
Works for me on Mac without --follow=name:
bash-3.2$ tail delme.txt | grep po
position.bin
position.lrn
bash-3.2$ tail delme.txt | grep po | grep lr
position.lrn
grep pattern filename | grep pattern | grep pattern | grep pattern ......

CURL Progress Bar: How to pipe and extract numbers only using grep?

This is what I have so far:
[my1#graf home]$ curl -# -o f1.flv 'http://osr.com/f1.flv' | grep -o '*[0-9]*'
####################################################################### 100.0%
I wish to use grep and only extract the percentage from that progress bar that CURL outputs.
I think my regex is not correct, and I am also not sure whether this grep will cope with the percentage being continuously updated.
What I am trying to do is basically get CURL only to give me the percentage number as the output and nothing else.
Thank you for any help.
With curl 7.36.0 (should also work for other versions) you can extract the percentage in the following way:
curl ... 2>&1 -# | stdbuf -oL tr '\r' '\n' | grep -o '[0-9]*\.[0-9]'
Here ... stands for options/filenames. This outputs a sequence of percentage numbers.
Curl uses carriage returns \r in its output, so you need tr to transform them into \n first, because grep is line-oriented. You also need to modify the output buffering with stdbuf to get the percentage numbers immediately after curl outputs them.
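One way to consume that stream, acting on each percentage as it arrives (a sketch reusing the filenames from the question; note that grep itself also needs line buffering once its output goes into a pipe):
curl -# -o f1.flv 'http://osr.com/f1.flv' 2>&1 | stdbuf -oL tr '\r' '\n' |
    grep --line-buffered -o '[0-9]*\.[0-9]' |
    while read -r pct; do
        echo "progress: $pct%"   # replace this with whatever should react to each update
    done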
You can't get the progress info like that through grep; it doesn't make sense.
curl writes the progress bar to stderr, so you have to redirect to stdout before you can grep it:
$ curl -# -o f1.flv 'http://osr.com/f1.flv' 2>&1 | grep 1 | less
results in:
^M 0.0
%^M######################################################################## 100.
0%^M######################################################################## 100
.0%^M######################################################################## 10
0.0%
Are you expecting a continual stream of numbers that you are redirecting somewhere else? Or do you expect to grab the numbers at a single point?
If it's the former, this sort of half-assedly works on a small file:
$ curl -# -o f1.flv 'http://osr.com/f1.flv' 2>&1 | sed 's/#//g' -
100.0% 0.0%
But it's useless on a large file. The output doesn't print until the download is finished, probably because curl seems to be sending ^H's to the terminal. There might be a better way to sed it, but I wouldn't hold my breath.
$ curl -# -o l.tbz 'ftp://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2009/06/2009-06-02-05-mozilla-1.9.1/firefox-3.5pre.en-US.linux-x86_64.tar.bz2' 2>&1 | sed 's/#//g' -
100.0%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Try this:
curl source -o dest -# 2> tmp&
grep -o ".....%" tmp | tail -n1
You need to use .* not * in your regex.
grep -o '.*[0-9].*'
That will catch all text though, so maybe try:
grep -P '[0-9]+'
