Remove trailing letters at the end of string - string

I have some strings like below:
ffffffffcfdeee^dddcdeffffffffdddcecffffc^cbcb^cb`cdaba`eeeeeefeba[NNZZcccYccaccBBBBBBBBBBBBBBBBBBBBBB
eedeedffcc^bb^bccccbadddba^cc^e`eeedddda`deca_^^\```a```^b^`I^aa^bb^`_b\a^b```Y_\`b^`aba`cM[SS\ZY^BBB
Each string may (or may not) end with a stretch of trailing B of varied length.
I'm just wondering if we can simply use Bash code to remove the B stretch?

You could try something like
sed 's/\(.\)B*$/\1/' file
Input
aaa BBBBB
aaa BBBBB cccc
aaa bbb ccc BBBBBBB
Output
aaa
aaa BBBBB cccc
aaa bbb ccc

just with bash
shopt -s extglob
str="a.zxn;lqwyerpyqgha;lsdnBBBBB"
str=${str%%+(B)}
echo "$str" # ==> a.zxn;lqwyerpyqgha;lsdn

This might work for you:
sed 's/B*$//' file
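The answers above can also be checked against edge cases such as a string with no trailing B or a string made only of B. The following is a minimal sketch that needs neither sed nor extglob; the helper name strip_trailing_b is made up for illustration, and it relies only on standard parameter expansion.

```shell
# Hypothetical helper: print $1 with any trailing run of B removed.
strip_trailing_b() {
    # ${1##*[!B]} is whatever follows the last non-B character,
    # i.e. the trailing run of B (or the whole string if it is all B).
    # Removing that as a suffix leaves the rest of the string.
    printf '%s\n' "${1%"${1##*[!B]}"}"
}

strip_trailing_b 'ccaccBBBBBB'      # trailing B run removed
strip_trailing_b 'aaa BBBBB cccc'   # no trailing B, printed unchanged
strip_trailing_b 'BBBB'             # all B, prints an empty line
```

Unlike the first sed command, which always keeps one character before the trailing run, this version reduces an all-B string to an empty string, the same as sed 's/B*$//'.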

Related

shell duplicate spaces in file

Is it possible to remove multiple spaces from a text file and save the changes in the same file using awk or grep?
Input example:
aaa    bbb   ccc
ddd      yyyy
Output I want:
aaa bbb ccc
ddd yyyy
Assign $1 to itself. This forces awk to rebuild the record, joining the fields with OFS (a single space by default), so every run of whitespace collapses to one space.
awk '{$1=$1} 1' Input_file
EDIT: Since the OP asked how to keep only the leading spaces, try the following.
awk '
match($0,/^ +/){
spaces=substr($0,RSTART,RLENGTH)
}
{
$1=$1
$1=spaces $1
spaces=""
}
1
' Input_file
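The question also asks to save the result in the same file, which awk does not do by itself since it writes to standard output. One common workaround is a temporary file, sketched below; the file name spaces_demo.txt and its contents are made up for illustration.

```shell
# Hypothetical sample file with runs of spaces
printf 'aaa    bbb   ccc\nddd      yyyy\n' > spaces_demo.txt

# Write the squeezed output to a temp file, then replace the original.
# The && chain keeps the original untouched if awk fails.
tmp=$(mktemp) &&
awk '{$1=$1} 1' spaces_demo.txt > "$tmp" &&
mv "$tmp" spaces_demo.txt

cat spaces_demo.txt
# aaa bbb ccc
# ddd yyyy
```

Note that mv replaces the file, so the original permissions are not carried over; use cat "$tmp" > spaces_demo.txt followed by rm "$tmp" if that matters.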
Using sed
sed -i -E 's#[[:space:]]+# #g' input_file
To also remove the space left at the start of a line:
sed -i -E 's#[[:space:]]+# #g; s#^ ##' input_file
Demo:
$ cat test.txt
aaa bbb ccc
ddd yyyy
Output I want:
aaa bbb ccc
ddd yyyy
$ sed -i -E 's#[[:space:]]+# #g' test.txt
$ cat test.txt
aaa bbb ccc
ddd yyyy
Output I want:
aaa bbb ccc
ddd yyyy
$

get paragraph with awk, and start-of-line regexp

I use awk to get paragraphs from a textfile, like so:
awk -v RS='' -v ORS='\n\n' '/pattern/' ./textfile
Say I have the following textfile:
aaa bbb ccc
aaa bbb ccc
aaa bbb ccc

aaa ccc
bbb aaa ccc
bbb aaa ccc

ccc bbb aaa
ccc bbb aaa
ccc bbb aaa
Now I only want the paragraph in which one of the (original) lines starts with "bbb" (hence the second paragraph). However, the regexp anchor ^ no longer works, (I presume) because of RS=''; awk now only matches at the beginning of the paragraph.
Is there another way?
^ means start-of-string. You want start-of-line which is (^|\n), e.g.:
$ awk -v RS='' -v ORS='\n\n' '/(^|\n)bbb/' file
aaa ccc
bbb aaa ccc
bbb aaa ccc
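To see the difference between the two anchors, the sample can be recreated with blank lines between the three paragraphs. This is a small sketch; the file name paragraphs_demo.txt is made up for illustration.

```shell
# Recreate the sample: three paragraphs separated by blank lines
printf '%s\n' 'aaa bbb ccc' 'aaa bbb ccc' 'aaa bbb ccc' '' \
              'aaa ccc' 'bbb aaa ccc' 'bbb aaa ccc' '' \
              'ccc bbb aaa' 'ccc bbb aaa' 'ccc bbb aaa' > paragraphs_demo.txt

# ^ is tested once per record (paragraph), and no paragraph starts
# with bbb, so this prints nothing
awk -v RS='' -v ORS='\n\n' '/^bbb/' paragraphs_demo.txt

# (^|\n)bbb also matches right after a newline inside the record,
# so this prints the second paragraph
awk -v RS='' -v ORS='\n\n' '/(^|\n)bbb/' paragraphs_demo.txt
```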

grep -v except pattern

I want to use grep -v to exclude lines matching a pattern, but keep lines that merely contain it.
this is my file content (test.txt):
a
aaa
bbb
ccc
I want this result:
aaa
bbb
ccc
But cat test.txt |grep -v "a" --exclude="aaa" does not work correctly and returns this:
bbb
ccc
You need to use word boundary \b which matches between a word character and a non-word character.
$ grep -v '\ba\b' file
aaa
bbb
ccc
OR
$ grep -v '^a$' file
aaa
bbb
ccc
^ asserts that we are at the start of a line and $ asserts that we are at the end of a line.
$ grep -w -v "a" test.txt
aaa
bbb
ccc
From the man page
-w, --word-regexp
Select only those lines containing matches that form whole
words.
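The answers above give the same result on the sample file, but whole-word and whole-line matching differ once a line holds more than one word. The sketch below illustrates this, assuming a grep that supports -w (GNU and BSD grep do) and the standard -x option, which is equivalent to anchoring with ^ and $. The file name grep_demo.txt and the extra line "a b" are made up for illustration.

```shell
printf '%s\n' 'a' 'aaa' 'a b' 'bbb' > grep_demo.txt

# Whole-line match: only the line that is exactly "a" is dropped
grep -vx 'a' grep_demo.txt
# aaa
# a b
# bbb

# Whole-word match: "a b" is dropped too, because it contains the word "a"
grep -vw 'a' grep_demo.txt
# aaa
# bbb
```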

Add a line counter to lines matching a pattern

I need to prepend a line counter to lines matching specific patterns in a file, while still outputting the lines that do not match this pattern.
For example, if my file looks like this:
aaa 123
bbb 456
aaa 666
ccc 777
bbb 999
and the patterns I want to count are 'aaa' and 'ccc', I'd like to get the following output:
1:aaa 123
bbb 456
2:aaa 666
3:ccc 777
bbb 999
Preferably I'm looking for a Linux one-liner. Shell or tool doesn't matter as long as it's installed by default in most distros.
With awk:
awk '{if ($1=="aaa" || $1=="ccc") {a++; $0=a":"$0}} {print}' file
1:aaa 123
bbb 456
2:aaa 666
3:ccc 777
bbb 999
Explanation
Loop through the lines, checking whether the first field is aaa or ccc. If so, increment the counter a and prepend it, followed by a colon, to the line ($0). Finally, print the line in all cases: a matching line gets the counter at the beginning, otherwise the original line is printed unchanged.
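The same idea can be written a little more compactly with a regex test on the first field. This is a sketch of an equivalent form rather than a different method; the counter name n and the file name counter_demo.txt are made up for illustration.

```shell
printf '%s\n' 'aaa 123' 'bbb 456' 'aaa 666' 'ccc 777' 'bbb 999' > counter_demo.txt

# If the first field is exactly aaa or ccc, bump n and prepend "n:";
# the trailing 1 prints every line, modified or not
awk '$1 ~ /^(aaa|ccc)$/ { $0 = (++n) ":" $0 } 1' counter_demo.txt
# 1:aaa 123
# bbb 456
# 2:aaa 666
# 3:ccc 777
# bbb 999
```

Adding another pattern then only means extending the alternation, for example /^(aaa|ccc|ddd)$/.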
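The following approach uses Perl:
use strict;
use warnings;

# Open the input file for reading and stop with a message if that fails
open my $fh, '<', 'abc.txt' or die "Cannot open abc.txt: $!";
my $incremental_val = 1;
while (my $line = <$fh>) {
    chomp($line);
    # Prefix lines starting with "aaa " or "ccc " with the running counter
    if ($line =~ m/^(?:aaa|ccc) /) {
        print "$incremental_val : $line\n";
        $incremental_val++;
        next;
    }
    print "$line\n";
}
close $fh;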
The output will be as follows.
1 : aaa 123
bbb 456
2 : aaa 666
3 : ccc 777
bbb 999

Delete whole line containing given string

Is there a way to delete a whole line if it contains a specific word using sed? i.e.
I have the following:
aaa bbb ccc
qqq fff yyy
ooo rrr ttt
kkk ccc www
I want to delete lines that contain 'ccc' and leave other lines intact. In this example the output would be:
qqq fff yyy
ooo rrr ttt
All this using sed. Any hints?
sed -n '/ccc/!p'
or
sed '/ccc/d'
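Both commands read standard input when no file is given. The sketch below runs them on the sample data; the file name sed_demo.txt is made up for illustration.

```shell
printf '%s\n' 'aaa bbb ccc' 'qqq fff yyy' 'ooo rrr ttt' 'kkk ccc www' > sed_demo.txt

# d deletes every line matching /ccc/; all other lines are printed as usual
sed '/ccc/d' sed_demo.txt
# qqq fff yyy
# ooo rrr ttt

# -n turns off automatic printing; !p prints only the lines NOT matching /ccc/
sed -n '/ccc/!p' sed_demo.txt
# qqq fff yyy
# ooo rrr ttt
```

Note that /ccc/ matches anywhere in the line, so a line containing "cccc" or "xcccx" would be deleted as well.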
