Delete empty lines from a text file via Bash, including lines with only space characters [duplicate]

I tried to use 'sed' command to remove the empty lines.
sed -i '/^$/d' file.txt
My sample txt file looks like this. The second line contains only space characters. The sed command removes the empty lines but not the lines that contain only white space.
Sample text
Sample text
So is there a way to accomplish this via bash?
My intended output is
Sample text
Sample text

Use character class [:blank:] to indicate space or tab:
With sed:
sed -i '/^[[:blank:]]*$/ d' file.txt
With perl:
perl -ne 'print if !/^[[:blank:]]*$/' file.txt
With awk:
awk '!/^[[:blank:]]*$/' file.txt
With grep:
grep -v '^[[:blank:]]*$' file.txt
If the tool does not support editing in-place, leverage a temporary file e.g. for grep:
grep -v '^[[:blank:]]*$' file.txt >file.txt.tmp && mv file.txt{.tmp,}

sed -i '/^ *$/d' file.txt
or, to also match other whitespace characters such as tabs:
sed -i '/^[[:space:]]*$/d' file.txt
The * character matches zero or more instances of the preceding character.
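To make the difference between these patterns concrete, here is a minimal sketch on a throwaway file. The temporary file from mktemp and its sample content are assumptions made for illustration, not part of the original question.

```shell
# Throwaway sample: one truly empty line, one line of spaces, one line with a tab.
tmp=$(mktemp)
printf 'Sample text\n\n   \n\t\nSample text\n' > "$tmp"

# Count the lines that survive each pattern (tr strips the padding some wc versions add).
empty_only=$(sed '/^$/d' "$tmp" | wc -l | tr -d ' ')
spaces_too=$(sed '/^ *$/d' "$tmp" | wc -l | tr -d ' ')
all_blank=$(sed '/^[[:space:]]*$/d' "$tmp" | wc -l | tr -d ' ')

echo "$empty_only $spaces_too $all_blank"   # prints: 4 3 2
rm -f "$tmp"
```

Only the [[:space:]] (or [[:blank:]]) form removes the tab-only line as well, which is why it is usually the safer choice.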


Replace real newlines with \n in text file [duplicate]

On a Pi, in a text file like this
line1
line2
line3
...
how can I translate that to a file with just one line formatted like this
line1\nline2\nline3\n......
NB The real file is 50MB and 200000 lines long
You can use sed
sed ':a;N;$!ba;s/\n/\\n/g' my.txt > new_my.txt
This reads the whole file into the pattern space in a loop, then replaces each newline with a literal "\n" and stores the result in a new file.
With GNU sed you can:
sed -z -i -e 's/\n/\\n/g' file
This replaces every newline with a literal \n. It can use a lot of memory, because with -z sed may read the whole file into memory.
With awk you can print each line with a literal \n on the end:
awk '{printf "%s\\n", $0}'
You can use xargs to split the input on newlines and run printf:
cat file | xargs -d $'\n' printf '%s\\n'
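As a quick check of the awk approach above, here is a small sketch on a three-line sample. The temporary file and the variable name joined are assumptions for illustration.

```shell
# Three-line sample standing in for the real 200000-line file.
tmp=$(mktemp)
printf 'line1\nline2\nline3\n' > "$tmp"

# awk prints each line followed by a literal backslash and "n", never a real newline,
# and processes one line at a time instead of loading the whole file.
joined=$(awk '{printf "%s\\n", $0}' "$tmp")

printf '%s\n' "$joined"   # prints: line1\nline2\nline3\n
rm -f "$tmp"
```

Because awk streams line by line, this variant should stay cheap on memory even for a 50MB input.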

Combining SED commands [duplicate]

Combine sed commands into one command
I am currently doing these two commands
removes the first character of each line
sed -i 's/\(.\{1\}\)//'
removes extra spaces in each line
sed -i 's/  / /g'
There are 3.4 BILLION lines in the 237GB file it is parsing, and I don't want it to run through the file twice.
The sed command below combines both. Use ; as a separator to chain two sed operations:
sed -i 's/\(.\{1\}\)//;s/  / /g' file
Another way:
sed -i -e 's/\(.\{1\}\)//' -e 's/  / /g' file
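You can also try awk:
awk '{sub(/./,"");$1=$1}1' file
sub(/./,"") removes the first character.
$1=$1 forces awk to rebuild the line from its fields, which collapses every run of spaces or tabs into a single space and also strips leading and trailing whitespace.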
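The sed and awk variants are not strictly equivalent, which the following sketch tries to show on one sample line. The sample text is an assumption, and the sed expressions here are simplified stand-ins (s/^.// for the first-character removal, s/[ ][ ]*/ /g to collapse runs of spaces), not the exact commands from the question.

```shell
# One sample line: a leading character to drop, then uneven runs of spaces.
line='x  foo   bar  baz'

# Single sed pass: drop the first character, then collapse each run of spaces to one.
sed_out=$(printf '%s\n' "$line" | sed -e 's/^.//' -e 's/[ ][ ]*/ /g')

# awk: drop the first character, then rebuild the record from its fields.
awk_out=$(printf '%s\n' "$line" | awk '{sub(/./,""); $1=$1} 1')

printf '[%s]\n[%s]\n' "$sed_out" "$awk_out"
# prints:
# [ foo bar baz]
# [foo bar baz]
```

The sed version keeps one leading space, while awk trims it, so the choice depends on whether leading whitespace matters in the output.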

How do I replace single quotes with another character in sed?

I have a flat file where I have multiple occurrences of strings that contains single quote, e.g. hari's and leader's.
I want to replace all occurrences of the single quote with space, i.e.
all occurrences of hari's to hari s
all occurrences of leader's to leader s
I tried
sed -e 's/"'"/ /g' myfile.txt
and
sed -e 's/"'"/" "/g' myfile.txt
but they are not giving me the expected result.
Try to keep sed commands as simple as possible.
Otherwise you'll be confused by what you wrote when you read it later.
#!/bin/bash
sed "s/'/ /g" myfile.txt
This will do what you want:
echo "hari's"| sed 's/\x27/ /g'
It will replace single quotes anywhere in your file or text, even those used for quoting. If you only want to replace quotes inside a word, not at a word boundary, you can use the following:
echo "hari's"| sed -re 's/(\<.+)\x27(.+\>)/\1 \2/g'
HTH
Just close the single-quoted string, add an escaped single quote, and reopen the string:
sed 's/'\''/ /g' input
also possible with a variable:
quote=\'
sed "s/$quote/ /g" input
Here is what I found from my own experience.
Notice how I use ' versus " after sed.
This won't work; the unbalanced ' leaves the shell waiting for more input:
$ echo "1'2'3'4'5" | sed 's/'/ /g'
>
>
>
but this works:
$ echo "1'2'3'4'5" | sed "s/'/ /g"
1 2 3 4 5
The -i flag makes sed apply the replacement to the file in place:
sed -i 's/“/"/g' filename.txt
if you want backups you can do
sed -i.bak 's/“/"/g' filename.txt
I had to replace "0x" string with "32'h" and resolved with:
sed 's/ 0x/ 32\x27h/'
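The quoting styles suggested in the answers above can be compared side by side. This is only a sketch: the sample string is taken from the question's wording, and the variable names a, b and c are made up for illustration.

```shell
input="hari's and leader's"

# 1. Double-quote the sed script so the single quote needs no escaping.
a=$(printf '%s\n' "$input" | sed "s/'/ /g")

# 2. Stay in single quotes: close the string, add an escaped quote, reopen it.
b=$(printf '%s\n' "$input" | sed 's/'\''/ /g')

# 3. Keep the quote character in a variable.
quote=\'
c=$(printf '%s\n' "$input" | sed "s/$quote/ /g")

printf '%s\n' "$a" "$b" "$c"   # prints "hari s and leader s" three times
```

The \x27 escape shown earlier is left out here because it relies on GNU sed; the three forms above only depend on shell quoting.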

Delete empty lines using sed

I am trying to delete empty lines using sed:
sed '/^$/d'
but I have no luck with it.
For example, I have these lines:
xxxxxx
yyyyyy
zzzzzz
and I want it to be like:
xxxxxx
yyyyyy
zzzzzz
What should be the code for this?
You may have spaces or tabs in your "empty" line. Use POSIX classes with sed to remove all lines containing only whitespace:
sed '/^[[:space:]]*$/d'
A shorter version for GNU sed, using the \s shorthand (a GNU extension) for whitespace:
sed -r '/^\s*$/d'
(Note that sed does NOT support PCRE.)
I am missing the awk solution:
awk 'NF' file
Which would return:
xxxxxx
yyyyyy
zzzzzz
How does this work? Since NF stands for "number of fields", those lines being empty have 0 fields, so that awk evaluates 0 to False and no line is printed; however, if there is at least one field, the evaluation is True and makes awk perform its default action: print the current line.
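A minimal sketch of the NF behaviour described above, using inline sample input (the blank lines and the whitespace-only line are assumptions added to exercise both cases):

```shell
# Input has an empty line, a line of spaces and a tab, and another empty line.
kept=$(printf 'xxxxxx\n\n \t \nyyyyyy\n\nzzzzzz\n' | awk 'NF')

printf '%s\n' "$kept"
# prints:
# xxxxxx
# yyyyyy
# zzzzzz
```

Since awk splits fields on whitespace by default, a line containing only spaces or tabs also has NF equal to 0 and is dropped along with the truly empty lines.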
sed
'/^[[:space:]]*$/d'
'/^\s*$/d'
'/^$/d'
-n '/^\s*$/!p'
grep
.
-v '^$'
-v '^\s*$'
-v '^[[:space:]]*$'
awk
/./
'NF'
'length'
'/^[ \t]*$/ {next;} {print}'
'!/^[ \t]*$/'
sed '/^$/d' should be fine, are you expecting to modify the file in place? If so you should use the -i flag.
Maybe those lines are not actually empty. If that's the case, look at the question "Remove empty lines from txtfiles, remove spaces from start and end of line"; I believe that's what you're trying to achieve.
I believe this is the easiest and fastest one:
cat file.txt | grep .
If you need to ignore all white-space lines as well then try this:
cat file.txt | grep '\S'
Example:
s=$(printf '\n\na\nb\n\nBelow is TAB:\n\t\nBelow is space:\n \nc\n')
echo "$s" | grep . | wc -l; echo "$s" | grep '\S' | wc -l
outputs
7
5
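Another option without sed, awk, perl, etc.:
strings $file > $output
strings prints the sequences of printable characters in a file. Note that by default it only prints sequences that are at least 4 characters long, so shorter lines are dropped as well.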
With help from the accepted answer here and the accepted answer above, I have used:
$ sed 's/^ *//; s/ *$//; /^$/d; /^\s*$/d' file.txt > output.txt
`s/^ *//` => left trim
`s/ *$//` => right trim
`/^$/d` => remove empty line
`/^\s*$/d` => delete lines which may contain white space
This covers all the bases and works perfectly for my needs. Kudos to the original posters @Kent and @kev
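A small sketch of the trim-then-delete idea on inline input. The sample lines are assumptions, and [[:space:]] is swapped in for \s here so the example does not depend on GNU sed.

```shell
# Lines: padded text, an empty line, a space-plus-tab line, plain text.
out=$(printf '  padded  \n\n \t\nplain\n' |
  sed 's/^ *//; s/ *$//; /^[[:space:]]*$/d')

printf '[%s]\n' $out
# prints:
# [padded]
# [plain]
```

The single /^[[:space:]]*$/d expression already covers the empty-line case, so a separate /^$/d is not needed in this variant.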
The command you are trying is correct, just use -E flag with it.
sed -E '/^$/d'
The -E flag makes sed use extended regular expressions. Note that the pattern /^$/ means the same thing in basic and extended syntax.
You can say:
sed -n '/ / p' filename #there is a space between '//'
You are most likely seeing the unexpected behavior because your text file was created on Windows, so the end of line sequence is \r\n. You can use dos2unix to convert it to a UNIX style text file before running sed or use
sed -r "/^\r?$/d"
to remove blank lines whether or not the carriage return is there.
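To illustrate the carriage-return problem, here is a sketch that builds a small CRLF file and compares the naive command with a variant that strips the \r first. The temporary file is an assumption, and tr -d '\r' is used as a portable stand-in for dos2unix; it removes every carriage return in the input, not only those at line ends.

```shell
# Sample with Windows line endings: text, an "empty" line, text.
tmp=$(mktemp)
printf 'xxxxxx\r\n\r\nyyyyyy\r\n' > "$tmp"

# The middle line still holds a \r, so /^$/ does not match it.
naive=$(sed '/^$/d' "$tmp" | wc -l | tr -d ' ')

# Strip carriage returns first, then delete the now truly empty line.
fixed=$(tr -d '\r' < "$tmp" | sed '/^$/d' | wc -l | tr -d ' ')

echo "$naive $fixed"   # prints: 3 2
rm -f "$tmp"
```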
This works in awk as well.
awk '!/^$/' file
xxxxxx
yyyyyy
zzzzzz
You can do something like that using "grep", too:
egrep -v "^$" file.txt
My bash-specific answer is to recommend using perl substitution operator with the global pattern g flag for this, as follows:
$ perl -pe s'/^\n|^[\ ]*\n//g' $file
xxxxxx
yyyyyy
zzzzzz
This answer illustrates accounting for whether or not the empty lines have spaces in them ([\ ]*), as well as using | to separate multiple search terms/fields. Tested on macOS High Sierra and CentOS 6/7.
FYI, the OP's original code sed '/^$/d' $file works just fine in bash Terminal on macOS High Sierra and CentOS 6/7 Linux at a high-performance supercomputing cluster.
If you want to use modern Rust tools, you can consider:
ripgrep:
cat datafile | rg '.'    # a line with only spaces is considered non-empty
cat datafile | rg '\S'   # a line with only spaces is considered empty
rg '\S' datafile         # a line with only spaces is considered empty (-N can be added to remove line numbers for on-screen display)
sd
cat datafile | sd '^\n' ''      # a line with only spaces is considered non-empty
cat datafile | sd '^\s*\n' ''   # a line with only spaces is considered empty
sd '^\s*\n' '' datafile         # in-place edit
Using vim editor to remove empty lines
:%s/^$\n//g
On FreeBSD 10.1, only this sed solution worked for me:
sed -e '/^[ ]*$/d' "testfile"
Inside the [] there are a space and a tab character.
test file contains:
fffffff next 1 tabline ffffffffffff
ffffffff next 1 Space line ffffffffffff
ffffffff empty 1 lines ffffffffffff
============ EOF =============
NF is an awk built-in variable (the number of fields) that you can use to delete empty lines in a file:
awk NF filename
and by using sed
sed -r "/^\r?$/d"

How to remove lines from text file not starting with certain characters (sed or grep)

How do I delete all lines in a text file which do not start with the characters #, & or *? I'm looking for a solution using sed or grep.
Deleting lines:
With grep
From http://lowfatlinux.com/linux-grep.html :
The grep command selects and prints lines from a file (or a bunch of files) that match a pattern.
I think you can do something like this:
grep -v '^[#&*]' yourFile.txt > output.txt
You can also use sed to do the same thing (check http://lowfatlinux.com/linux-sed.html ):
sed '/^[#&*]/d' yourFile.txt > output.txt
It's up to you to decide
Filtering lines:
My mistake, I understood you wanted to delete the lines. But if you want to "delete" all other lines (or filter the lines starting with the specified characters), then grep is the way to go:
grep '^[#&*]' yourFile.txt > output.txt
sed -n '/^[#&*].*/p' input.txt > output.txt
this should work.
sed -ni '/^[#&*].*/p' input.txt
This one edits the input file directly, so be careful.
egrep '^(&|#|\*)' input.txt > output.txt
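The filtering commands above can be tried on a small sample. This is a sketch under assumptions: the sample lines (including one that starts with a backslash) are invented for illustration, to show why the characters inside the bracket expression should not be backslash-escaped.

```shell
# Sample: three lines to keep, one plain line, one starting with a backslash.
tmp=$(mktemp)
printf '# comment\nplain\n&ref\n*star\n\\backslash\n' > "$tmp"

# Keep only lines starting with #, & or *.
strict=$(grep -c '^[#&*]' "$tmp")

# Inside [...] a backslash is a literal character, so this also keeps "\backslash".
loose=$(grep -c '^[\#\&\*]' "$tmp")

# sed -n with p selects the same lines as the strict grep.
grep_kept=$(grep '^[#&*]' "$tmp")
sed_kept=$(sed -n '/^[#&*]/p' "$tmp")

echo "$strict $loose"   # prints: 3 4
rm -f "$tmp"
```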
