I see that when I use rsync with the -v option it prints the changed files list and some useful infos at the end, like the total transfer size.
Is it somewhat possible to cut out the first (long) part and just print the stats? I am using it in a script, and the log shouldn't be so long. Only the stats are useful.
Thank you.
As I was looking for an answer and came across this question:
rsync also supports the --stats option.
Best solution for now i think :
rsync --info=progress0,name0,flist0,stats2 ...
progress0 hides progress
progress2 display progress
name0 hides file names
stats2 displays stats at the end of transfer
This solution is more a "hack" than the right way to do it because the output is generated but only filtered afterwards. You can use the option --out-format.
rsync ... --out-format="" ... | grep -v -E "^sending|^created" | tr -s "\n"
The grep filter should probably be updated with unwanted lines you see in the output. The tr is here to filter the long sequence of carriage returns.
grep -E for extended regexes
grep -v to invert the match. "Selected lines are those not matching any of the specified patterns."
tr -s to squeeze the repeated carriage returns into a single one
Related
Is there a way to make grep output "words" from files that match the search expression?
If I want to find all the instances of, say, "th" in a number of files, I can do:
grep "th" *
but the output will be something like (bold is by me);
some-text-file : the cat sat on the mat
some-other-text-file : the quick brown fox
yet-another-text-file : i hope this explains it thoroughly
What I want it to output, using the same search, is:
the
the
the
this
thoroughly
Is this possible using grep? Or using another combination of tools?
Try grep -o:
grep -oh "\w*th\w*" *
Edit: matching from Phil's comment.
From the docs:
-h, --no-filename
Suppress the prefixing of file names on output. This is the default
when there is only one file (or only standard input) to search.
-o, --only-matching
Print only the matched (non-empty) parts of a matching line,
with each such part on a separate output line.
Cross distribution safe answer (including windows minGW?)
grep -h "[[:alpha:]]*th[[:alpha:]]*" 'filename' | tr ' ' '\n' | grep -h "[[:alpha:]]*th[[:alpha:]]*"
If you're using older versions of grep (like 2.4.2) which do not include the -o option, then use the above. Else use the simpler to maintain version below.
Linux cross distribution safe answer
grep -oh "[[:alpha:]]*th[[:alpha:]]*" 'filename'
To summarize: -oh outputs the regular expression matches to the file content (and not its filename), just like how you would expect a regular expression to work in vim/etc... What word or regular expression you would be searching for then, is up to you! As long as you remain with POSIX and not perl syntax (refer below)
More from the manual for grep
-o Print each match, but only the match, not the entire line.
-h Never print filename headers (i.e. filenames) with output lines.
-w The expression is searched for as a word (as if surrounded by
`[[:<:]]' and `[[:>:]]';
The reason why the original answer does not work for everyone
The usage of \w varies from platform to platform, as it's an extended "perl" syntax. As such, those grep installations that are limited to work with POSIX character classes use [[:alpha:]] and not its perl equivalent of \w. See the Wikipedia page on regular expression for more
Ultimately, the POSIX answer above will be a lot more reliable regardless of platform (being the original) for grep
As for support of grep without -o option, the first grep outputs the relevant lines, the tr splits the spaces to new lines, the final grep filters only for the respective lines.
(PS: I know most platforms by now would have been patched for \w.... but there are always those that lag behind)
Credit for the "-o" workaround from #AdamRosenfield answer
It's more simple than you think. Try this:
egrep -wo 'th.[a-z]*' filename.txt #### (Case Sensitive)
egrep -iwo 'th.[a-z]*' filename.txt ### (Case Insensitive)
Where,
egrep: Grep will work with extended regular expression.
w : Matches only word/words instead of substring.
o : Display only matched pattern instead of whole line.
i : If u want to ignore case sensitivity.
You could translate spaces to newlines and then grep, e.g.:
cat * | tr ' ' '\n' | grep th
Just awk, no need combination of tools.
# awk '{for(i=1;i<=NF;i++){if($i~/^th/){print $i}}}' file
the
the
the
this
thoroughly
grep command for only matching and perl
grep -o -P 'th.*? ' filename
I was unsatisfied with awk's hard to remember syntax but I liked the idea of using one utility to do this.
It seems like ack (or ack-grep if you use Ubuntu) can do this easily:
# ack-grep -ho "\bth.*?\b" *
the
the
the
this
thoroughly
If you omit the -h flag you get:
# ack-grep -o "\bth.*?\b" *
some-other-text-file
1:the
some-text-file
1:the
the
yet-another-text-file
1:this
thoroughly
As a bonus, you can use the --output flag to do this for more complex searches with just about the easiest syntax I've found:
# echo "bug: 1, id: 5, time: 12/27/2010" > test-file
# ack-grep -ho "bug: (\d*), id: (\d*), time: (.*)" --output '$1, $2, $3' test-file
1, 5, 12/27/2010
cat *-text-file | grep -Eio "th[a-z]+"
You can also try pcregrep. There is also a -w option in grep, but in some cases it doesn't work as expected.
From Wikipedia:
cat fruitlist.txt
apple
apples
pineapple
apple-
apple-fruit
fruit-apple
grep -w apple fruitlist.txt
apple
apple-
apple-fruit
fruit-apple
I had a similar problem, looking for grep/pattern regex and the "matched pattern found" as output.
At the end I used egrep (same regex on grep -e or -G didn't give me the same result of egrep) with the option -o
so, I think that could be something similar to (I'm NOT a regex Master) :
egrep -o "the*|this{1}|thoroughly{1}" filename
To search all the words with start with "icon-" the following command works perfect. I am using Ack here which is similar to grep but with better options and nice formatting.
ack -oh --type=html "\w*icon-\w*" | sort | uniq
You could pipe your grep output into Perl like this:
grep "th" * | perl -n -e'while(/(\w*th\w*)/g) {print "$1\n"}'
grep --color -o -E "Begin.{0,}?End" file.txt
? - Match as few as possible until the End
Tested on macos terminal
$ grep -w
Excerpt from grep man page:
-w: Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character.
ripgrep
Here are the example using ripgrep:
rg -o "(\w+)?th(\w+)?"
It'll match all words matching th.
I make extensive use of piping multiple linux shell commands, for example:
grep BLAH file1 | sed 's/old/new/' | sort -k 1,1 > file3
My files often have a header line, and often I have to preserve it throughout the pipeline. So, for example, I would want to grep, sed and sort from line 2 and on, while keeping the 1st line unchanged.
I am looking for some general solution that given some command(s) would preserve the header. I usually write the header to a file before the pipe and then cat it back after the pipe ends. I have started using zshell, so I was wondering if that might help to get a more streamlined solution.
Perhaps something like this:
(arrows are pipes in the image)
but I am not sure how to get that to work in zshell or if it is even possible. One problem is that I need to follow up the first pipe split with a command on both pipes.
Any creative solutions?
Vaughn and devnull have already directed you towards the solution. They both contain typos though and I have some remarks to add and would advise to use this instead:
{ head -n 1 file1; tail -n +2 file1 | grep BLAH | sed 's/old/new/' | sort -k 1,1; } >file3
What it does is take the first line of file1 in one command (your header) and does your grep/sed/whatever magic in a second command on the rest of the file (sans the header, tail -n +2) and redirects the combined output to file3.
Notes:
If your shell supports { } it is preferred over the ( ) construct in this case as it does not spawn a subshell (sometimes it is desirable to have the subshell, though).
head -2 is deprecated, you should use the -n parameter like head -n 2
You can skip the tail -n +2 file1 part if you absolutely know that what you are grepping for cannot be found in your header, but it is certainly cleaner this way.
This should work in most recent shells, btw (bash, ksh, zsh).
I am working on a Java EE application where its logs will be generated inside a Linux server .
I have used the command tail -f -n -10000 MyLog
It displayed last 1000 lines from that log file .
Now I pressed Ctrl + c in Putty to disconnect the logs updation ( as i am feared it may be updated with new requests and I will loose my data )
In the displayed result, how can I search for a particular keyword ?? (Used / String name to search but it's not working)
Pipe your output to PAGER.
tail -f -n LINE_CNT LOG_FILE | less
then you can use
/SEARCH_STRING
Two ways:
tail -n 10000 MyLog| grep -i "search phrase"
tail -f -n 10000 MyLog | less
The 2nd method will allow you to search with /. It will only search down but you can press g to go back to the top.
Edit: On testing it seems method 2 doesn't work all that well... if you hit the end of the file it will freeze till you ctrl+c the tail command.
You need to redirect the output from tail into a search utility (e.g. grep). You could do this in two steps: save the output to a file, then search in the file; or in one go: pipe the ouput to the search utility
To see what goes into the file (so you can hit Ctlr+c) you can use the tee command, which duplicates the output to the screen and to a file:
tail -f -n -10000 MyLog | tee <filename>
Then search within the file.
If you want to pipe the result into the search utility, you can use the same trick as above, but use your search program instead of tee
Controlling terminal output on the fly
While running any command in a terminal such as Putty you can use CTRL-S and CTRL-Q to stop and start output to the Putty terminal.
Excluding lines using grep
If you want to exclude lines that contain a specific pattern use grep -v the following would remove all line that contain the string INFO
tail -f logfile | grep -v INFO
Show lines that do not contain the words INFO or DEBUG
tail -f logfile | grep -v -E 'INFO|DEBUG'
Finally, the MOTHER AND FATHER of all tailing tools is xtail.pl
If you have perl on your host xtail.pl is a very nice tool to learn and in a nutshell you can use it to tail multiple files. Very handy.
You can just open it with less command
less logfile_name
when you open the file you can use this guide here
Tip: I suggest, first to use G to go to the end of the file and then to you use Backward Search
It's a well-known task, simple to describe:
Given a text file foo.txt, and a blacklist file of exclusion strings, one per line, produce foo_filtered.txt that has only the lines of foo.txt that do not contain any exclusion string.
A common application is filtering compiler warnings from a build log, but to ignore warnings on files that are not yours. The file foo.txt is the warnings file (itself filtered from the build log), and a blacklist file excluded_filenames.txt with file names, one per line.
I know how it's done in procedural languages like Perl or AWK, and I've even done it with combinations of Linux commands such as cut, comm, and sort.
But I feel that I should be really close with xargs, and just can't see the last step.
I know that if excluded_filenames.txt has only 1 file name in it, then
grep -v foo.txt `cat excluded_filenames.txt`
will do it.
And I know that I can get the filenames one per line with
xargs -L1 -a excluded_filenames.txt
So how do I combine those two into a single solution, without explicit loops in a procedural language?
Looking for the simple and elegant solution.
You should use the -f option (or you can use fgrep which is the same):
grep -vf excluded_filenames.txt foo.txt
You could also use -F which is more directly the answer to what you asked:
grep -vF "`cat excluded_filenames.txt`" foo.txt
from man grep
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file contains zero patterns, and therefore matches nothing.
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched.
I have a set of 10 CSV files, which normally have a an entry of this kind
a,b,c,d
d,e,f,g
Now due to some error entries in this file have become of this kind
a,b,c,d
d,e,f,g
,,,
h,i,j,k
Now I want to remove the line with only commas in all the files. These files are on a Linux filesystem.
Any command that you recommend that can replaces the erroneous lines in all the files.
It depends on what you mean by replace. If you mean 'remove', then a trivial variant on #wnoise's solution is:
grep -v '^,,,$' old-file.csv > new-file.csv
Note that this deletes just those lines with exactly three commas. If you want to delete mal-formed lines with any number of commas (including zero) - and no other characters on the line, then:
grep -v '^,*$' ...
There are endless other variations on the regex that would deal with other scenarios. Dealing with full CSV data with commas inside quotes starts to need something other than a regex machine. It can be done, within broad limits, especially in more complex regex systems such as PCRE or Perl. But it requires more work.
Check out Mastering Regular Expressions.
sed 's/,,,/replacement/' < old-file.csv > new-file.csv
optionally followed by
mv new-file.csv old-file.csv
Replace or remove, your post is not clear... For replacement see wnoise's answer. For removing, you could use
awk '$0 !~ /,,,/ {print}' <old-file.csv > new-file.csv
What about trying to keep only lines which are matching the desired format instead of handling one exception ?
If the provided input is what you really want to match:
grep -E '[a-z],[a-z],[a-z],[a-z]' < oldfile.csv > newfile.csv
If the input is different, provide it, the regular expression should not be too hard to write.
Do you want to replace them with something, or delete them entirely? Either way, it can be done with sed. To delete:
sed -i -e '/^,\+$/ D' yourfile1.csv yourfile2.csv ...
To replace: well, see wnoise's answer, or if you don't want to create new files with the output,
sed -i -e '/^,\+$/ s//replacement/' yourfile1.csv yourfile2.csv ...
or
sed -i -e '/^,\+$/ c\
replacement' yourfile1.csv yourfile2.csv ...
(that should be entered exactly as is, including the line break). Of course, you can also do this with awk or perl or, if you're only deleting lines, even grep:
egrep -v '^,+$' < oldfile.csv > newfile.csv
I tested these to make sure they work, but I'd advise you to do the same before using them (just in case). You can omit the -i option from sed, in which case it'll print out the results (rather than writing them back to the file), or omit the output redirection >newfile.csv from grep.
EDIT: It was pointed out in a comment that some features of these sed commands only work on GNU sed. As far as I can tell, these are the -i option (which can be replaced with shell redirection, sed ... <infile >outfile ) and the \+ modifier (which can be replaced with \{1,\} ).
Most simply:
$ grep -v ,,,, oldfile > newfile
$ mv newfile oldfile
yes, awk or grep are very good option if you are working in linux platform. However you can use perl regex for other platform. using join & split options.