Remove empty lines in a text file via grep - linux

FILE:
hello

world

foo

bar
How can I remove all the empty lines in this FILE?
Desired output:
FILE:
hello
world
foo
bar

grep . FILE
(And if you really want to do it in sed, then: sed -e '/^$/d' FILE)
(And if you really want to do it in awk, then: awk '/./' FILE)

Try the following:
grep -v -e '^$'
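For example, applied to the FILE from the question, this should leave only the non-empty lines:
$ grep -v -e '^$' FILE
hello
world
foo
bar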

With awk, just check the number of fields; no regex needed.
$ more file
hello

world

foo

bar
$ awk 'NF' file
hello
world
foo
bar
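Note that awk 'NF' also drops lines that contain only spaces or tabs, since those lines have zero fields. A quick check (the printf input is just an illustration):
$ printf 'hello\n \n\t\nworld\n' | awk 'NF'
hello
world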

Here is a solution that removes all lines that are either blank or contain only space characters:
grep -v '^[[:space:]]*$' foo.txt

If removing empty lines also means removing lines that contain only whitespace, use:
grep '\S' FILE
For example:
$ printf "line1\n\nline2\n \nline3\n\t\nline4\n" > FILE
$ cat -v FILE
line1

line2
 
line3
	
line4
$ grep '\S' FILE
line1
line2
line3
line4
$ grep . FILE
line1
line2
 
line3
	
line4
Note that grep . drops only the truly empty line; the space-only and tab-only lines are still printed, whereas grep '\S' drops those as well.
See also:
How to remove empty/blank lines (including spaces) in a file in Unix?
How to remove blank lines from a file in shell?
With sed: Delete empty lines using sed
With awk: Remove blank lines using awk

Simplest answer:
[root@node1 ~]# cat /etc/sudoers | grep -v -e '^#' -e '^$'
Defaults !visiblepw
Defaults always_set_home
Defaults match_group_by_gid
Defaults always_query_group_plugin
Defaults env_reset
Defaults env_keep = "COLORS DISPLAY HOSTNAME HISTSIZE KDEDIR LS_COLORS"
Defaults env_keep += "MAIL PS1 PS2 QTDIR USERNAME LANG LC_ADDRESS LC_CTYPE"
Defaults env_keep += "LC_COLLATE LC_IDENTIFICATION LC_MEASUREMENT LC_MESSAGES"
Defaults env_keep += "LC_MONETARY LC_NAME LC_NUMERIC LC_PAPER LC_TELEPHONE"
Defaults env_keep += "LC_TIME LC_ALL LANGUAGE LINGUAS _XKB_CHARSET XAUTHORITY"
Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin
root ALL=(ALL) ALL
%wheel ALL=(ALL) ALL
[root@node1 ~]#
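For the record, the cat is not needed here; grep can read the file directly and should produce the same output:
grep -v -e '^#' -e '^$' /etc/sudoers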

Try this: sed -i '/^[ \t]*$/d' file-name
It deletes all blank lines, including lines containing any number of whitespace characters (spaces or tabs).
Note: there is a 'space' followed by '\t' inside the square brackets.
The -i flag writes the updated contents back to the file in place. Without it, the filtered output is only printed to the screen and the actual file is not affected.
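Note for macOS/BSD sed: there -i requires an explicit (possibly empty) backup suffix, and \t inside a bracket expression is a GNU extension, so a more portable variant (a sketch) is:
sed -i '' '/^[[:space:]]*$/d' file-name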

grep '^..' my_file
Example my_file contents:
THIS

IS
THE

FILE
EOF_MYFILE
It gives as output only the lines with at least 2 characters:
THIS
IS
THE
FILE
EOF_MYFILE
See also the results with grep '^' my_file, which outputs every line, including the empty ones:
THIS

IS
THE

FILE
EOF_MYFILE
and with grep '^.' my_file, which outputs the lines with at least 1 character:
THIS
IS
THE
FILE
EOF_MYFILE

Try the ex way:
ex -s +'v/\S/d' -cwq test.txt
For multiple files (edit in-place):
ex -s +'bufdo!v/\S/d' -cxa *.txt
Without modifying the file (just print on the standard output):
cat test.txt | ex -s +'v/\S/d' +%p +q! /dev/stdin

Perl might be overkill, but it works just as well.
Removes all lines which are completely blank:
perl -ne 'print if /./' file
Removes all lines which are completely blank, or only contain whitespace:
perl -ne 'print if ! /^\s*$/' file
Variation which edits the original and makes a .bak file:
perl -i.bak -ne 'print if ! /^\s*$/' file

If you want to know the total number of lines of code in your Xcode project and you are not interested in a per-file count for each Swift file, then this will give you the answer. It removes lines with no code at all and lines that contain a // comment.
Run it at the root level of your Xcode project.
find . \( -iname \*.swift \) -exec grep -v '^[[:space:]]*$' {} + | grep -v -e '//' | wc -l
If you have comment blocks in your code beginning with /* and ending with */, such as:
/*
This is a comment block
*/
then these will get included in the count (excluding them cleanly is too hard for this one-liner; a rough workaround is sketched below).
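A rough workaround for /* ... */ blocks, assuming the opening and closing delimiters sit on lines of their own as in the example above (anything sharing a line with a delimiter will be miscounted, so treat this as an approximation):
find . -iname '*.swift' -exec cat {} + | sed '/\/\*/,/\*\//d' | grep -v '^[[:space:]]*$' | grep -v '//' | wc -l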

Related

How to replace a specific line in a file with a string variable using the line number in a bash script?

Here are the contents of a target.txt file:
line1
line2
Environment=xLink=https://11111/route
line4
line5
I am trying to write a bash script that will find the number of the line containing 'https' and then replace that entire line with a new string variable built within the script. Here is the bash script without the replacement line:
#!/bin/bash
x="12345"
route="/route"
x_route="${x}${route}"
x_init="Environment=xLink=https://"
new_line="${x_init}${x_route}"
echo "${new_line}"
to_replace_line_number=$(find target.txt -type f | xargs grep -n 'https' | cut -c1-2)
echo "${to_replace_line_number}"
targetfile=target.txt
echo "${targetfile}"
Invoking this script outputs the following as expected:
Environment=xLink=https://12345/route
3:
target.txt
Now, without the bash script, if I invoked:
sudo sed -i '3 c\Environment=xLink=https://12345/route' target.txt
The target.txt changes as desired to:
line1
line2
Environment=xLink=https://12345/route
line4
line5
But the goal is to automate this, so I am trying to use the sed command to do the job inside the bash script. So far I have tried two methods, and neither of them worked.
Method 1:
I added the following line to the bash script:
sudo sed -i "${to_replace_line_number}s/.*/${new_line}/" ${targetfile}
When I ran the script, it didn't work and I got this error:
sed: -e expression #1, char 2: : doesn't want any addresses
Method 2:
I added the following command to the bash script:
sudo sed -i "${to_replace_line_number} c\${new_line}" ${targetfile}
When I ran the script, it didn't work and I got this error:
sed: -e expression #1, char 2: : doesn't want any addresses
What is that I am missing exactly? Any help is very much appreciated.
This is because you are taking out two characters when reading the line number. As a result, an extra ':' ends up in the variable. Instead, take only the first character and it should work fine.
Replace
to_replace_line_number=$(find target.txt -type f | xargs grep -n 'https' | cut -c1-2)
with
to_replace_line_number=$(find target.txt -type f | xargs grep -n 'https' | cut -c1)
\ is a special character, so when you use it in double quotes you have to escape it:
# Set example values and create a test file:
to_replace_line_number="3"
new_line="Environment=xLink=https://12345/route"
targetfile="test.txt"
printf 'line%d\n' {1..5} > "$targetfile"
# The actual command
sudo sed -i "${to_replace_line_number} c\\${new_line}" "${targetfile}"
This would make it equivalent to your manual invocation.
If you have wondered why the documentation for c appears to be weirdly formatted compared to r or y, it's because the linefeed after the \ is intentional. This is the POSIX way of doing it:
sudo sed -i "${to_replace_line_number} c\\
${new_line}" "${targetfile}"
There are a couple of issues with the current code:
to_replace_line_number == 3: - NOTICE the colon (:); this is fed into the sed command as sed -i "3: c\..." and generates the error message complaining about the 2nd character, i.e. the :
as @thatotherguy has pointed out in his answer, the c\ command requires an escaped backslash and an embedded newline ... or ...
Minimal changes to the OP's current code:
# parse the `grep -n` output by having `cut` pull everything before the (first) `:`
$ to_replace_line_number=$(find target.txt -type f | xargs grep -n 'https' | cut -d":" -f1)
$ echo "${to_replace_line_number}"
3
# modification to @thatotherguy's `sed/c` suggestion to allow all code to go on a single line:
$ sed -i -e "${to_replace_line_number} c\\" -e "${new_line}" ${targetfile}
$ cat "${targetfile}"
line1
line2
Environment=xLink=https://12345/route
line4
line5
Instead of spawning the sub-process calls to get the line number, there are several ways sed can be used to find and replace the desired line.
One sed idea:
sed -i "s|^.*https.*$|${new_line}|" ${targetfile}
Where:
| - use pipe as sed delimiter since ${new_line} contains forward slashes
^.*https.*$ - match any line that contains the string https
${new_line} - replace the line with the contents of ${new_line}
After running the above:
$ cat target.txt
line1
line2
Environment=xLink=https://12345/route
line4
line5
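An equivalent awk sketch, for comparison (awk has no portable in-place flag, so this writes to a temporary file first; target.tmp is just a name chosen for this example):
awk -v repl="${new_line}" '/https/ {print repl; next} {print}' "${targetfile}" > target.tmp && mv target.tmp "${targetfile}"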

grep - print all lines containing 'cat' as the second word

Ok so considering i have a file containing the following text:
lknsglkn cat lknrhlkn lsrhkn
cat lknerylnk lknaselk cat
awiooiyt lkndrhlk dhlknl
blabla cat cat bla bla
I need to use grep to print only the lines containing 'cat' as the second word on the line, namely lines 1 and 4. I've tried multiple grep -e 'regex' <file> commands but can't seem to get the right one. I don't know how to match the N'th word of a line.
This may work for you:
grep -E '^\w+\s+cat\s' file
if the first "word" can contain some non-word characters, e.g. "#, (,[..", you could also try:
grep -E '^\S+\s+cat\s' file
with your example input:
kent$ echo "lknsglkn cat lknrhlkn lsrhkn
cat lknerylnk lknaselk cat
awiooiyt lkndrhlk dhlknl
blabla cat cat bla bla"|grep -E '^\S+\s+cat\s'
lknsglkn cat lknrhlkn lsrhkn
blabla cat cat bla bla
What constitutes a word?
grep '^[a-z][a-z]* *cat '
This will work if there is at least a blank after cat. If that's not guaranteed, then:
grep -E '^[a-z]+ +cat( |$)'
which looks for cat followed by a blank or end of line.
If you want a more extensive definition of 'first word' (upper case, digits, punctuation), change the character class. If you want to allow for blanks or tabs, there are changes that can be made. If you can have leading blanks, add '*' at the caret. Variations as required.
These variations will work with any version of grep that supports the -E option. POSIX does not mandate notations such as \S to mean 'non-white-space', though GNU grep does support that as an extension. The grep -E version will work with regular egrep if grep -E does not work but egrep exists (don't use the -E option with egrep).
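If you want to stay strictly within POSIX character classes rather than the \S/\s shorthands, something along these lines should behave the same way:
grep -E '^[^[:space:]]+[[:space:]]+cat([[:space:]]|$)' file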
The following should work:
grep -e '^\S\+\scat\s'
The line should start with a non-whitespace of length at least 1, followed by a whitespace and the word "cat" followed by a whitespace.
Will be slower, but perhaps more readable:
awk '$2 == "cat"' file

How can I move all lines beginning with 'foobar' to the end of a file?

Say I have a script with a number of lines beginning with foobar.
I would like to move all of those lines to the end of the document, while keeping their order.
e.g. go from:
# There's a Polar Bear
# In our Frigidaire--
foobar['brangelina'] <- 2
# He likes it 'cause it's cold in there.
# With his seat in the meat
foobar['billybob'] <- 1
# And his face in the fish
to
# There's a Polar Bear
# In our Frigidaire--
# He likes it 'cause it's cold in there.
# With his seat in the meat
# And his face in the fish
foobar['brangelina'] <- 2
foobar['billybob'] <- 1
This is as far as I have gotten:
grep foobar file.txt > newfile.txt
sed -i 's/foobar//g' foo.txt
cat newfile.txt > foo.txt
This might work:
sed '/^foobar/{H;$!d;s/.*//};$G;s/\n*//' input_file
EDIT: Amended for the corner case when foobar is on the last line
This will do:
grep -v ^foobar file.txt > tmp1.txt
grep ^foobar file.txt > tmp2.txt
cat tmp1.txt tmp2.txt > newfile.txt
rm tmp1.txt tmp2.txt
The -v option returns all the lines which do not match the given pattern. The ^ marks the beginning of a line, so ^foobar matches lines beginning with foobar.
grep -v ^foobar file.txt > file1.txt
grep ^foobar file.txt > file2.txt
cat file2.txt >> file1.txt
grep -v ^foobar file.txt >newfile.txt
grep ^foobar file.txt >>newfile.txt
no need for temporary file
You can also do:
vim file.txt -c 'g/^foobar/m$' -c 'wq'
The -c switch means an Ex command follows, the g command operates on all lines containing the given pattern, and the action here is m$, which means "move to the end of the file" (it preserves order). wq means "save and exit vim".
If this is too slow you can also prevent vim from reading vimrc:
vim -u NONE file.txt -c 'g/^foobar/m$' -c 'wq'
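A single-pass awk sketch does the same thing: it prints the non-matching lines as they come, buffers the foobar lines, and emits them at the end in their original order (output goes to a new file here, since awk does not edit in place):
awk '/^foobar/ {saved = saved $0 ORS; next} {print} END {printf "%s", saved}' file.txt > newfile.txt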

Removing lines that contain more than one word

I need to remove any line in a specified file that has more than one word in it, using a bash script on Linux.
e.g. file:
$ cat testfile
This is a text
file

This line should be deleted
this-should-not.
awk 'NF<=1{print}' testfile
a word being a run of non-whitespace.
Just for fun, here's a pure bash version which doesn't call any other executable (since you asked for it in bash):
$ while read a b; do if [ -z "$b" ]; then echo $a;fi;done <testfile
awk '!/[ \t]/{print $1}' testfile
This reads "print the first element of lines that don't contain a space or a tab".
Empty lines will be output (since they don't contain more than one word).
Easy enough:
$ egrep -v '\S\s+\S' testfile
$ sed '/ /d' << EOF
> This is a text
> file
>
> This line should be deleted
> this-should-not.
> EOF
file

this-should-not.
If you want to edit files in-place (without any backups), you may also use ed (see man ed):
cat <<-'EOF' | ed -s testfile
H
,g/^[[:space:]]*/s///
,g/[[:space:]]*$/s///
,g/[[:space:]]/.d
wq
EOF
This should satisfy your needs:
cat filename | sed -n '/^\S*$/p'
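The same idea without the cat, and with a POSIX character class instead of the GNU-only \S shorthand (a minor variation, not a different technique):
sed -n '/^[^[:space:]]*$/p' filename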

Remove blank lines with grep

I tried grep -v '^$' in Linux and that didn't work. This file came from a Windows file system.
Try the following:
grep -v -e '^$' foo.txt
The -e option allows regex patterns for matching.
The single quotes around ^$ make it work for csh. Other shells will be happy with either single or double quotes.
UPDATE: This works for me for a file with blank lines or "all whitespace" lines (such as Windows lines with \r\n style line endings), whereas the command above only removes lines that are truly empty and have Unix style line endings:
grep -v -e '^[[:space:]]*$' foo.txt
Keep it simple.
grep . filename.txt
Use:
$ dos2unix file
$ grep -v "^$" file
Or just simply awk:
awk 'NF' file
If you don't have dos2unix, then you can use tools like tr:
tr -d '\r' < "$file" > t ; mv t "$file"
grep -v "^[[:space:]]*$"
The -v makes it print lines that do not completely match
===Each part explained===
^ matches the start of the line
[[:space:]] matches whitespace - spaces, tabs, carriage returns, etc.
* the previous match (whitespace) may occur 0 or more times
$ matches the end of the line
Running the code:
$ echo "
> hello
>
> ok" |
> grep -v "^[[:space:]]*$"
hello
ok
To understand more about how/why this works, I recommend reading up on regular expressions. http://www.regular-expressions.info/tutorial.html
If you have sequences of multiple blank lines in a row, and would like only one blank line per sequence, try
grep -v "unwantedThing" foo.txt | cat -s
cat -s suppresses repeated empty output lines.
Your output would go from
match1



match2
to
match1

match2
The three blank lines in the original output would be compressed or "squeezed" into one blank line.
The same as the previous answers:
grep -v -e '^$' foo.txt
Here, -e tells grep that the next argument is a pattern to match (note that -e is not the same as -E, which enables extended regular expressions). '^$' means that there isn't any character between ^ (start of line) and $ (end of line). '^' and '$' are regex anchors.
So the command with grep -v will print all the lines that do not match this pattern (no characters between ^ and $).
This way, blank lines are eliminated.
I prefer using egrep, though in my test with a genuine file with blank lines your approach worked fine (though without quotation marks in my test). This worked too:
egrep -v "^(\r?\n)?$" filename.txt
Do lines in the file have whitespace characters?
If so then
grep "\S" file.txt
Otherwise
grep . file.txt
Answer obtained from:
https://serverfault.com/a/688789
This code removes blank lines and lines that start with "#":
grep -v "^#" file.txt | grep -v "^[[:space:]]*$"
awk 'NF' file-with-blank-lines > file-with-no-blank-lines
It's true that grep -v -e '^$' can work; however, it does not remove blank lines that contain one or more spaces. I found the simplest answer for removing blank lines is awk. The following is modified a bit from the awk answers above:
awk 'NF' foo.txt
But since this question is for using grep I'm going to answer the following:
grep -v '^ *$' foo.txt
Note: the blank space between the ^ and *.
Or you can use the \s to represent blank space like this:
grep -v '^\s*$' foo.txt
I tried hard, but this seems to work (assuming \r is biting you here):
printf "\r" | egrep -xv "[[:space:]]*"
Using Perl:
perl -ne 'print if /\S/'
\S means match non-blank characters.
egrep -v "^\s\s+"
egrep already does regex matching, and \s matches whitespace.
The + repeats the previous pattern one or more times.
The ^ anchors the match to the start of the line.
Use:
grep pattern filename.txt | uniq
Here is another way of removing the empty lines and the lines starting with the # sign. I think this is quite useful for reading configuration files.
[root@localhost ~]# cat /etc/sudoers | egrep -v '^(#|$)'
Defaults requiretty
Defaults !visiblepw
Defaults always_set_home
Defaults env_reset
Defaults env_keep = "COLORS DISPLAY HOSTNAME HISTSIZE INPUTRC KDEDIR
LS_COLORS"
root ALL=(ALL) ALL
%wheel ALL=(ALL) ALL
stack ALL=(ALL) NOPASSWD: ALL
Read lines from a file, excluding empty lines:
grep -v '^$' folderlist.txt
where folderlist.txt contains:
folder1/test

folder2

folder3

folder4/backup

folder5/backup
Results will be:
folder1/test
folder2
folder3
folder4/backup
folder5/backup
