shell duplicate spaces in file - linux

Is it possible to remove multiple spaces from a text file and save the changes in the same file using awk or grep?
Input example:
aaa bbb ccc
ddd yyyy
Output I want:
aaa bbb ccc
ddd yyyy

Simply reset value of $1 to again $1 which will allow OFS to come into picture and will add proper spaces into lines.
awk '{$1=$1} 1' Input_file
EDIT: Since OP mentioned that what if we want to keep only starting spaces then try following.
awk '
match($0,/^ +/){
spaces=substr($0,RSTART,RLENGTH)
}
{
$1=$1
$1=spaces $1
spaces=""
}
1
' Input_file

Using sed
sed -i -E 's#[[:space:]]+# #g' < input file
For removing spaces at the start
sed -i -E 's#[[:space:]]+# #g; s#^ ##g' < input file
Demo:
$cat test.txt
aaa bbb ccc
ddd yyyy
Output I want:
aaa bbb ccc
ddd yyyy
$sed -i -E 's#[[:space:]]+# #g' test.txt
$cat test.txt
aaa bbb ccc
ddd yyyy
Output I want:
aaa bbb ccc
ddd yyyy
$

Related

In Linux command line console, how to get the sub-string from a file?

The content of the file is fixed.
Example:
2016-03-28T00:02 AAA 2016-03-28T00:03 ADASDASD
2016-03-28T00:03 BBB 2016-03-28T00:04 FAFAFDAS
2016-03-28T00:05 CCC 2016-03-28T00:06 SDAFAFAS
....
Which command can I use to get all sub-strings, AAA, BBB, CCC, etc.
you can use cut and awk and perl for this.
cat >> file.data << EOF
2016-03-28T00:02 AAA 2016-03-28T00:03 ADASDASD
2016-03-28T00:03 BBB 2016-03-28T00:04 FAFAFDAS
2016-03-28T00:05 CCC 2016-03-28T00:06 SDAFAFAS
EOF
AWK
awk '{ print $2 }' file.data
AAA
BBB
CCC
CUT
cut -d " " -f2 file.data
AAA
BBB
CCC
PERL
perl -alne 'print $F[1] ' file.data
AAA
BBB
CCC
You can use cut:
cut -d' ' -f 2 file
You can use AWK for this:
jayforsythe$ cat > file
2016-03-28T00:02 AAA 2016-03-28T00:03 ADASDASD
2016-03-28T00:03 BBB 2016-03-28T00:04 FAFAFDAS
2016-03-28T00:05 CCC 2016-03-28T00:06 SDAFAFAS
jayforsythe$ awk '{ print $2 }' file
AAA
BBB
CCC
To save the result to another file, simply add the redirection operator:
jayforsythe$ awk '{ print $2 }' file > file2

awk how to print the rest

my file contains lines like this
any1 aaa bbb ccc
The delimiter is space. the number of words in the line is unknown
I want to put the first word into a var1. It's simple with
awk '{print $1}'
Now I want to put the rest of the line into a var2 with awk.
How I can print the rest of the line with awk ?
Better to use read here:
s="any1 aaa bbb ccc"
read var1 var2 <<< "$s"
echo "$var1"
any1
echo "$var2"
aaa bbb ccc
For awk only solution use:
echo "$s" | awk '{print $1; print substr($0, index($0, " ")+1)}'
any1
aaa bbb ccc
$ var=$(awk '{sub(/^[^[:space:]]+[[:space:]]+/,"")}1' file)
$ echo "$var"
aaa bbb ccc
or in general to skip some number of fields use a RE interval:
$ awk '{sub(/^[[:space:]]*([^[:space:]]+[[:space:]]+){1}/,"")}1' file
aaa bbb ccc
$ awk '{sub(/^[[:space:]]*([^[:space:]]+[[:space:]]+){2}/,"")}1' file
bbb ccc
$ awk '{sub(/^[[:space:]]*([^[:space:]]+[[:space:]]+){3}/,"")}1' file
ccc
Note that doing this gets much more complicated if you have a FS that's more than a single char, and the above is just for the default FS since it additionally skips any leading blanks if present (remove the first [[:space:]]* if you have a non-default but still single-char FS).
awk solution:
awk '{$1 = ""; print $0;}'`

merge specific line using awk and sed

I want to merge specific line
Input :
AAA
BBB
CCC
DDD
EEE
AAA
BBB
DDD
CCC
EEE
Output Should be
AAA
BBB
CCC DDD
EEE
AAA
BBB
DDD
CCC EEE
I want to search CCC and merge next line with it.
I have tried with awk command but didn't get success
Use awk patterns, if the line matches /CCC/ then print the line with a space at the end and go on to the next line. Otherwise (1), print the line.
awk '/CCC/ { printf("%s ", $0); next } 1' file
Using sed:
sed '/CCC/ { N; s/\n/ / }' file
Using awk:
awk '{ ORS=(/CCC/ ? FS : RS) }1' file

grep -v except pattern

I want to grep -v file except pattern.
this is my file content (test.txt):
a
aaa
bbb
ccc
I want to this result:
aaa
bbb
ccc
And cat test.txt |grep -v "a" --exclude="aaa" is not correctly work and return this:
bbb
ccc
You need to use word boundary \b which matches between a word character and a non-word character.
$ grep -v '\ba\b' file
aaa
bbb
ccc
OR
$ grep -v '^a$' file
aaa
bbb
ccc
^ Asserts that we are at the start of a line and $ asserts that we are at the end of a line.
$ grep -w -v "a" test.txt
aaa
bbb
ccc
From the man page
-w, --word-regexp
Select only those lines containing matches that form whole
words.

Remove trailing letters at the end of string

I have some strings like below:
ffffffffcfdeee^dddcdeffffffffdddcecffffc^cbcb^cb`cdaba`eeeeeefeba[NNZZcccYccaccBBBBBBBBBBBBBBBBBBBBBB
eedeedffcc^bb^bccccbadddba^cc^e`eeedddda`deca_^^\```a```^b^`I^aa^bb^`_b\a^b```Y_\`b^`aba`cM[SS\ZY^BBB
Each string may (or may not) end with a stretch of trailing B of varied length.
I'm just wondering if we can simply use Bash code to remove the B stretch?
You could try something like
sed 's/\(.\)B*$/\1/' file
Input
aaa BBBBB
aaa BBBBB cccc
aaa bbb ccc BBBBBBB
Output
aaa
aaa BBBBB cccc
aaa bbb ccc
just with bash
shopt -s extglob
str="a.zxn;lqwyerpyqgha;lsdnBBBBB"
str=${str%%+(B)}
echo $str # ==> a.zxn;lqwyerpyqgha;lsdn
This might work for you:
sed 's/B*$//' file

Resources