Exclude one string from bash output - linux

I'm working on a project in which I need to exclude the first string (line) that matches a pattern from some output (or a file). The difficulty is that I need to exclude just one line, only the first match in the stream.
For example, if I have:
1 abc
2 qwerty
3 open
4 abc
5 talk
After running the script I should get this:
2 qwerty
3 open
4 abc
5 talk
NOTE: I don't know anything about the digits before the words, so I can't filter the output based on them.
I've written a small script with grep, but it cuts out every line that matches the pattern:
'some program' | grep -v "abc"
I've read about awk, sed, etc., but couldn't work out whether they can solve my problem.
Anything helps, thank you.

Using awk:
some program | awk '{ if (/abc/ && !seen) { seen = 1 } else print }'
Alternatively, using only filters:
some program | awk '!/abc/ || seen { print } /abc/ && !seen { seen = 1 }'
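As a sketch, the same idea can be wrapped in a small shell function so that the pattern is passed as an argument instead of being hard-coded. The name drop_first_match and the awk variable pat are illustrative and not part of the answer above; the pattern is treated as an awk regular expression.

```shell
# drop_first_match PATTERN
# Copies stdin to stdout, skipping only the first line that matches PATTERN.
# (Hypothetical helper name; PATTERN is an awk regular expression.)
drop_first_match() {
    awk -v pat="$1" '
        !seen && $0 ~ pat { seen = 1; next }   # swallow the first match only
        { print }                              # everything else passes through
    '
}

# Example with the data from the question:
printf '%s\n' '1 abc' '2 qwerty' '3 open' '4 abc' '5 talk' | drop_first_match abc
```

Because awk -v interprets backslash escapes in the assigned value, a pattern that contains backslashes would need them doubled.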

You can use the ex editor. For example, to remove the first line that matches the pattern from a file:
ex +"/abc/d" -scwq file.txt
From the input (replace cat with your program):
ex +"/abc/d" +%p -scq! <(cat file.txt)
You can also read from stdin by replacing <(cat file.txt) with /dev/stdin.
Explanation:
+cmd - execute Ex/Vim command
/pattern/d - find the first line matching the pattern and delete it
%p - print the current buffer
-s - silent mode
-cq! - execute quit without saving (!)
<(cmd) - shell process substitution

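With sed you can delete lines by giving their line numbers:
sed 1,2d
Replace 1 and 2 with the first and last line numbers of the range you want to delete.
Alternatively, you can delete every line that matches a pattern:
sed '/pattern to match/d'
To delete only the first matching line, use the following (the 0,/abc/ address form is a GNU sed extension):
sed '0,/abc/{//d;}'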

You can also use a list of commands { list; } to read the first line and print the rest:
command | { read first_line; cat -; }
Simple example:
$ cat file
1 abc
2 qwerty
3 open
4 abc
5 talk
$ cat file | { read first_line; cat -; }
2 qwerty
3 open
4 abc
5 talk

awk '!/1/' file
2 qwerty
3 open
4 abc
5 talk
That's all!


How can I fix my bash script to find a random word from a dictionary?

I'm studying bash scripting and I'm stuck fixing an exercise of this site: https://ryanstutorials.net/bash-scripting-tutorial/bash-variables.php#activities
The task is to write a bash script to output a random word from a dictionary whose length is equal to the number supplied as the first command line argument.
My idea was to create a sub-dictionary, give each word a line number, pick a random number from those line numbers and filter the output. That worked for a similar, simpler script, but not for this one.
This is the code I used:
6 DIC='/usr/share/dict/words'
7 SUBDIC=$( egrep '^.{'$1'}$' $DIC )
8
9 MAX=$( $SUBDIC | wc -l )
10 RANDRANGE=$((1 + RANDOM % $MAX))
11
12 RWORD=$(nl "$SUBDIC" | grep "\b$RANDRANGE\b" | awk '{print $2}')
13
14 echo "Random generated word from $DIC which is $1 characters long:"
15 echo $RWORD
and this is the error I get using as input "21":
bash script.sh 21
script.sh: line 9: counterintelligence's: command not found
script.sh: line 10: 1 + RANDOM % 0: division by 0 (error token is "0")
nl: 'counterintelligence'\''s'$'\n''electroencephalograms'$'\n''electroencephalograph': No such file or directory
Random generated word from /usr/share/dict/words which is 21 characters long:
I tried in bash to split the code in smaller pieces obtaining no error (input=21):
egrep '^.{'21'}$' /usr/share/dict/words | wc -l
3
but once they are in the script, lines 9 and 10 give errors.
Where do you think the error is?
problems
SUBDIC=$( egrep '^.{'$1'}$' $DIC ) stores all words of the given length in the SUBDIC variable, so its content is now something like foo bar baz.
MAX=$( $SUBDIC | ... ) tries to run the command foo bar baz, which is obviously bogus; it should be more like MAX=$(echo $SUBDIC | ... )
MAX=$( ... | wc -l ) counts lines; with the echo $SUBDIC mentioned above you get multiple words, but all on one line...
RWORD=$(nl "$SUBDIC" | ...) has the same problem as above: there's only one line (also note @armali's answer that nl requires a file or stdin)
RWORD=$(... | grep "\b$RANDRANGE\b" | ...) might match the dictionary entry catch 22
RWORD=$(... | awk '{print $2}') likely won't handle lines containing spaces
a simple solution
Doing a "random sort" over all the possible words and taking the first line should be sufficient:
egrep "^.{$1}$" "${DIC}" | sort -R | head -1
MAX=$( $SUBDIC | wc -l ) - A pipe is used for connecting a command's output, while $SUBDIC isn't a command; an appropriate syntax is MAX=$( <<<$SUBDIC wc -l ).
nl "$SUBDIC" - The argument to nl has to be a filename, which "$SUBDIC" isn't; an appropriate syntax is nl <<<"$SUBDIC".
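Putting these corrections together, a repaired version of the script might look like the sketch below. It keeps the question's approach (build the sub-list, count it, pick a random line number) but feeds the variable to wc and sed on standard input instead of running it as a command or passing it as a file name. The function name random_word and its optional second argument are assumptions made for illustration.

```shell
# random_word LENGTH [DICTIONARY]
# Prints one randomly chosen word of exactly LENGTH characters.
# (random_word and the optional DICTIONARY argument are illustrative names.)
random_word() {
    len=$1
    dic=${2:-/usr/share/dict/words}

    # All words of the requested length, one per line.
    subdic=$(grep -E "^.{$len}\$" "$dic")
    [ -n "$subdic" ] || return 1          # nothing of that length

    # Quote the variable so the newlines survive, and pipe it in as data.
    max=$(printf '%s\n' "$subdic" | wc -l)

    # $RANDOM is a bash feature that yields a value from 0 to 32767.
    n=$((1 + RANDOM % max))

    # Print line number n of the sub-list.
    printf '%s\n' "$subdic" | sed -n "${n}p"
}
```

Called as random_word 21, it prints one of the 21-character entries. Since $RANDOM never exceeds 32767, a length with more candidates than that would need a different source of randomness.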
This code will do it. My test dictionary of words is in a file named file. It's a good idea to get all the words of the given length first, but put them in an array rather than in a plain variable. Then pick a random index and echo that element.
dic=( $(sed -n "/^.\{$1\}$/p" file) )
ind=$((0 + RANDOM % ${#dic[@]}))
echo "${dic[$ind]}"
I am also doing this activity and came up with a simple solution.
I created this script:
#!/bin/bash
awk "NR==$1 {print}" /usr/share/dict/words
To get a random word, run the script from the terminal like this:
./script.sh $RANDOM
To print the word on a specific line number, run it like this:
./script.sh 465
cat /usr/share/dict/american-english | head -n $RANDOM | tail -n 1
$RANDOM - returns a different random number each time it is referenced (an integer from 0 to 32767, so only the first 32767 lines can ever be selected).
This simple line outputs a random word from the dictionary mentioned above.
Otherwise, as umläute mentioned, you can do:
cat /usr/share/dict/american-english | sort -R | head -1

Extract information (subset) from a main files using a list of identifiers saved in another file

I have one file containing a list of names (referred to as file 1):
Apple
Bat
Cat
I have another file (referred to as file 2) containing names together with their details:
Apple bla blaa
aaaaaaaaaggggggggggttttttsssssssvvvvvvv
ssssssssiiuuuuuuuuuueeeeeeeeeeennnnnnnn
sdasasssssssssssssssssssssswwwwwwwwwwww
Aeroplane dsafgeq dasfqw dafsad
vvvvvvvvvvvvvvvvuuuuuuuuuuuuuuuuuuuuuus
fcsadssssssssssssssssssssssssssssssssss
ddddddddddddddddwwwwwwwwwwwwwwwwwwwwwww
sdddddddddddddddddddddddddddddwwwwwwwww
Bat sdasdas dsadw dasd
sssssssssssssssssssssssssssssssssssswww
ssssssssssssssssswwwwwwwwwwwwwwwwwwwwwf
aaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwddddddd
sadddddddddddddddddd
Cat dsafw fasdsa dawwdwaw
sssssssssssssssssssssssssssssssssssssss
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwssss
I need to extract info out of file 2 using the list of names in file 1.
Output file should be something like below:
Apple bla blaa
aaaaaaaaaggggggggggttttttsssssssvvvvvvv
ssssssssiiuuuuuuuuuueeeeeeeeeeennnnnnnn
sdasasssssssssssssssssssssswwwwwwwwwwww
Bat sdasdas dsadw dasd
sssssssssssssssssssssssssssssssssssswww
ssssssssssssssssswwwwwwwwwwwwwwwwwwwwwf
aaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwddddddd
sadddddddddddddddddd
Cat dsafw fasdsa dawwdwaw
sssssssssssssssssssssssssssssssssssssss
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwssss
Are there any commands for doing this on Linux (Ubuntu)? I am a new Linux user.
This might work for you (GNU sed):
sed 's#.*#/^&/bb#' file1 |
sed -e ':a' -f - -e 'd;:b;n;/^[A-Z]/!bb;ba' file2
Generate a string of sed commands from the first file and pipe them into another sed script which is run against the second file.
The first file creates a regexp for each line which when matched jumps to a piece of common code. If none of the regexps are matched the lines are deleted. If a regexp is matched then further lines are printed until a new delimiter is found at which point the code then jumps to the start and the process is repeated.
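To make the first stage concrete, it may help to look at what it generates on its own. The sketch below wraps that first sed call in a function (gen_branches is a made-up name) and runs it on the three names from file 1.

```shell
# gen_branches: turn each name into a sed command that branches to label b
# when a line starts with that name. (Hypothetical helper name.)
gen_branches() {
    sed 's#.*#/^&/bb#' "$@"
}

printf '%s\n' Apple Bat Cat | gen_branches
# prints:
#   /^Apple/bb
#   /^Bat/bb
#   /^Cat/bb
```

These generated lines are what -f - reads in the second sed call. Note that a generated regexp such as /^Cat/ would also match a header that merely begins with those letters, so names that are prefixes of other names may need a tighter pattern.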
$ awk 'NR==FNR{a[$1];next} NF>1{f=($1 in a)} f' file1 file2
Apple bla blaa
aaaaaaaaaggggggggggttttttsssssssvvvvvvv
ssssssssiiuuuuuuuuuueeeeeeeeeeennnnnnnn
sdasasssssssssssssssssssssswwwwwwwwwwww
Bat sdasdas dsadw dasd
sssssssssssssssssssssssssssssssssssswww
ssssssssssssssssswwwwwwwwwwwwwwwwwwwwwf
aaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwddddddd
sadddddddddddddddddd
Cat dsafw fasdsa dawwdwaw
sssssssssssssssssssssssssssssssssssssss
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwssss
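The awk one-liner above comes without an explanation, so here is the same program spelled out with comments, as a sketch. The function name extract_sections and the array and flag names are illustrative. It relies on the layout of the sample data: header lines have more than one field, while the lines below them have exactly one.

```shell
# extract_sections NAMES_FILE DATA_FILE
# Prints every section of DATA_FILE whose header starts with a name
# listed in NAMES_FILE. (Hypothetical helper name.)
extract_sections() {
    awk '
        # First file: remember every name as an array key.
        NR == FNR { wanted[$1]; next }
        # Header lines have several fields; decide whether to keep the section.
        NF > 1    { keep = ($1 in wanted) }
        # A bare condition prints the line while the flag is set.
        keep
    ' "$1" "$2"
}
```

With the files from the question, extract_sections file1 file2 should print the Apple, Bat and Cat sections and skip Aeroplane. The NR == FNR test assumes the first file is not empty.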
Taking into consideration that each section has to be separated by an empty line, this solution with awk works ok:
while read -r pat; do
    pat="^\\\<${pat}\\\>"
    awk -v pattern="$pat" '$0 ~ pattern{p=1} $0 ~ /^$/{p=0} p==1' file2
done <file1
For this solution to work, the file needs to look like this:
Apple bla blaa
1 aaaaaaaaaggggggggggttttttsssssssvvvvvvv
2 ssssssssiiuuuuuuuuuueeeeeeeeeeennnnnnnn
3 sdasasssssssssssssssssssssswwwwwwwwwwww
Aeroplane dsafgeq dasfqw dafsad
4 vvvvvvvvvvvvvvvvuuuuuuuuuuuuuuuuuuuuuus
5 fcsadssssssssssssssssssssssssssssssssss
6 ddddddddddddddddwwwwwwwwwwwwwwwwwwwwwww
7 sdddddddddddddddddddddddddddddwwwwwwwww
Bat sdasdas dsadw dasd
8 sssssssssssssssssssssssssssssssssssswww
9 ssssssssssssssssswwwwwwwwwwwwwwwwwwwwwf
10 aaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwddddddd
11 sadddddddddddddddddd
Cat dsafw fasdsa dawwdwaw
12 sssssssssssssssssssssssssssssssssssssss
13 wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwssss
PS: Numbering has been applied by me in order to be able to "check" that awk will return the correct results per section. Numbering is not required in your real file.
If there are no empty lines separating the sections, it is much harder to get the correct result.

Use sed or awk to replace line after match

I'm trying to create a little script that basically uses dig +short to find the IP of a website, and then pipe that to sed/awk/grep to replace a line. This is what the current file looks like:
#Server
123.455.1.456
246.523.56.235
So, basically, I want to search for the '#Server' line in a text file, and then replace the two lines underneath it with an IP address acquired from dig.
I understand some of the syntax of sed, but I'm really having trouble figuring out how to replace two lines underneath a match. Any help is much appreciated.
Based on the OP, it's not 100% clear exactly what needs to be replaced where, but here's a one-liner for the general case, using GNU sed and bash. Replace the two lines after "3" with standard input:
echo Hoot Gibson | sed -e '/3/{r /dev/stdin' -e ';p;N;N;d;}' <(seq 7)
Outputs:
1
2
3
Hoot Gibson
6
7
Note: sed's r command is opaquely documented (in Linux anyway). For more about r, see:
"5.9. The 'r' command isn't inserting the file into the text" in this sed FAQ.
here's how in awk:
newip=12.34.56.78
awk -v newip="$newip" '{
    if ($1 == "#Server") {
        l = NR;
        print $0
    }
    else if (l > 0 && NR == l+1) {
        print newip
    }
    else if (l == 0 || NR != l+2) {
        print $0
    }
}' file > file.tmp
mv -f file.tmp file
explanation:
pass $newip to awk
if the first field of the current line is #Server, let l = current line number.
else if the current line is one past #Server, print the new ip.
else if the current row is not two past #Server, print the line.
overwrite original file with modified version.
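The steps above can also be packaged as a function that takes the file and the new address as arguments. This is only a sketch: replace_after_server is a made-up name, it uses a countdown instead of comparing line numbers, and it writes the result back with cat so that the file keeps its permissions.

```shell
# replace_after_server FILE NEWIP
# Replaces the two lines following "#Server" in FILE with NEWIP.
# (Hypothetical helper name; a variant of the awk program above.)
replace_after_server() {
    file=$1
    tmp=$(mktemp) || return 1
    awk -v newip="$2" '
        $1 == "#Server" { print; print newip; skip = 2; next }
        skip > 0        { skip--; next }   # drop the two old lines
        { print }
    ' "$file" > "$tmp" &&
    cat "$tmp" > "$file"      # overwrite in place, keeping permissions
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Tying it back to the question, a call could look like replace_after_server servers.txt "$(dig +short example.com | head -n 1)", where servers.txt and example.com are placeholders.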

find a pattern and print line based on finding the first pattern sed, awk grep

I have a rather large file. What all sections have in common is a HOSTNAME line that starts each section, for example:
HOSTNAME:host1
data 1
data here
data 2
text here
section 1
text here
part 4
data here
comm = 2
HOSTNAME:host-2
data 1
data here
data 2
text here
section 1
text here
part 4
data here
comm = 1
The above prints
As you can see above, each section contains further subsections marked by keywords or by lines that have specific values.
I would like to use a one-liner to print the hostname for each section and then print whichever lines I want to extract under each hostname section.
Can you please help? Right now I am using grep -C 10 HOSTNAME | grep -C pattern
but this assumes that there are 10 lines in each section. This is not an optimal way to do it; can someone show a better way? I also need to be able to print more than one line under each pattern that I find. So if I find data 1 and there are additional lines under it, I would like to grab and print them.
So the output of a command would be like
grep -C 10 HOSTNAME | grep data 1
grep -C 10 HOSTNAME | grep -A 2 data 1
HOSTNAME:Host1
data 1
HOSTNAME:Hoss2
data 1
Besides grep, I use this sed command to print my output:
sed -r '/HOSTNAME|shared/!d' filename
The only problem with this sed command is that it only prints the lines that contain the patterns shared and HOSTNAME. I also need to specify how many lines to print below each line that matches the pattern shared. So I would like to print the HOSTNAME lines and give the number of lines to print under the second search pattern, shared.
Thanks
awk to the rescue!
$ awk -v lines=2 '/HOSTNAME/{c=lines} NF&&c&&c--' file
HOSTNAME:host1
data 1
HOSTNAME:host-2
data 1
This prints the number of lines given by the lines variable, counting the matching line itself, and skips empty lines.
If you want to specify a secondary keyword instead of a number of lines:
$ awk -v key='data 1' '/HOSTNAME/{h=1; print} h&&$0~key{print; h=0}' file
HOSTNAME:host1
data 1
HOSTNAME:host-2
data 1
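The two variants can be combined when both a secondary keyword and a number of trailing lines are wanted, which seems to be what the question asks for. The following is a sketch under that reading; host_grep, key, after and left are illustrative names, and the keyword is treated as an awk regular expression.

```shell
# host_grep KEY AFTER FILE
# Prints every HOSTNAME line, every line matching KEY, and the AFTER
# lines that follow each KEY match. (Hypothetical helper name.)
host_grep() {
    awk -v key="$1" -v after="$2" '
        /^HOSTNAME:/ { print; left = 0; next }       # section header
        $0 ~ key     { print; left = after; next }   # keyword line
        left > 0     { print; left-- }               # lines after the keyword
    ' "$3"
}
```

For example, host_grep 'data 1' 2 file would print each HOSTNAME line followed by the data 1 line and the two lines below it; the counter is reset at every new HOSTNAME so output never runs over into the next section.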
Here is a sed two-liner:
sed -n -r '/HOSTNAME/ { p }
/^\s+data 1/ {p }' hostnames.txt
It prints (p)
when the line contains a HOSTNAME
when the line starts with some whitespace (\s+) followed by your search criterion (data 1)
non-matching lines are not printed (due to the sed -n option)
Edit: Some remarks:
this was tested with GNU sed 4.2.2 under Linux
you don't need -r: if your sed version does not support it, replace the second pattern with /^.*data 1/
we can squash everything in one line with ;
Putting it all together, here is a revised version in one line, without the need for the extended regex ( i.e without -r):
sed -n '/HOSTNAME/ { p } ; /^.*data 1/ {p }' hostnames.txt
The OP requirements seem to be very unclear, but the following is consistent with one interpretation of what has been requested, and more importantly, the program has no special requirements, and the code can easily be modified to meet a variety of requirements. In particular, both search patterns (the HOSTNAME pattern and the "data 1" pattern) can easily be parameterized.
The main idea is to print all lines in a specified subsection, or at least a certain number up to some limit.
If there is a limit on how many lines in a subsection should be printed, specify a value for limit, otherwise set it to 0.
awk -v limit=0 '
/^HOSTNAME:/ { subheader=0; hostname=1; print; next}
/^ *data 1/ { subheader=1; print; next }
/^ *data / { subheader=0; next }
subheader && (limit==0 || (subheader++ < limit)) { print }'
Given the lines provided in the question, the output would be:
HOSTNAME:host1
data 1
HOSTNAME:host-2
data 1
(Yes, I know the variable 'hostname' in the awk program is currently unused, but I included it to make it easy to add a test to satisfy certain obvious requirements regarding the preconditions for identifying a subheader.)
sed -n -e '/hostname/,+p' -e '/Duplex/,+p'
The simplest way to do it is to combine two sed commands ..

How can I swap two lines using sed?

Does anyone know how to replace line a with line b and line b with line a in a text file using the sed editor?
I can see how to replace a line in the pattern space with a line that is in the hold space (i.e., /^Paco/x or /^Paco/g), but what if I want to take the line starting with Paco and replace it with the line starting with Vinh, and also take the line starting with Vinh and replace it with the line starting with Paco?
Let's assume for starters that there is one line with Paco and one line with Vinh, and that the line Paco occurs before the line Vinh. Then we can move to the general case.
#!/bin/sed -f
/^Paco/ {
:notdone
N
s/^\(Paco[^\n]*\)\(\n\([^\n]*\n\)*\)\(Vinh[^\n]*\)$/\4\2\1/
t
bnotdone
}
After matching /^Paco/ we read into the pattern buffer until s// succeeds (or EOF: the pattern buffer will be printed unchanged). Then we start over searching for /^Paco/.
cat input | tr '\n' 'ç' | sed 's/\(ç__firstline__\)\(ç__secondline__\)/\2\1/g' | tr 'ç' '\n' > output
Replace __firstline__ and __secondline__ with your desired regexps. Be sure to substitute any instances of . in your regexp with [^ç]. If your text actually has ç in it, substitute with something else that your text doesn't have.
Try this awk script:
s1="$1"
s2="$2"
awk -vs1="$s1" -vs2="$s2" '
{ a[++d]=$0 }
$0~s1{ h=$0;ind=d}
$0~s2{
a[ind]=$0
for(i=1;i<d;i++ ){ print a[i]}
print h
delete a;d=0;
}
END{ for(i=1;i<=d;i++ ){ print a[i] } }' file
output
$ cat file
1
2
3
4
5
$ bash test.sh 2 3
1
3
2
4
5
$ bash test.sh 1 4
4
2
3
1
5
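The awk script above appears to rely on the first pattern occurring before the second. For the general case raised in the question, one option is to read the file twice: the first pass records where each pattern first matches, the second pass prints the lines with those two exchanged. This is a sketch of that two-pass technique, not a rewrite of the answer; swap_lines and its variable names are made up, and only the first match of each pattern is swapped.

```shell
# swap_lines PATTERN1 PATTERN2 FILE
# Swaps the first line matching PATTERN1 with the first line matching
# PATTERN2, in whichever order they occur. (Hypothetical helper name.)
swap_lines() {
    awk -v p1="$1" -v p2="$2" '
        # Pass 1: remember line number and text of each first match.
        NR == FNR {
            if (!n1 && $0 ~ p1)      { n1 = FNR; l1 = $0 }
            else if (!n2 && $0 ~ p2) { n2 = FNR; l2 = $0 }
            next
        }
        # Pass 2: exchange the two lines, but only if both were found.
        n1 && n2 && FNR == n1 { print l2; next }
        n1 && n2 && FNR == n2 { print l1; next }
        { print }
    ' "$3" "$3"
}
```

Because the file is named twice on the awk command line, this works on regular files but not on a pipe. If either pattern is missing, the input is printed unchanged.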
Use sed only for simple substitution (or not at all). For anything more complicated, use a programming language.
A simple example from the GNU sed texinfo doc:
Note that on implementations other than GNU `sed' this script might
easily overflow internal buffers.
#!/usr/bin/sed -nf
# reverse all lines of input, i.e. first line became last, ...
# from the second line, the buffer (which contains all previous lines)
# is *appended* to current line, so, the order will be reversed
1! G
# on the last line we're done -- print everything
$ p
# store everything on the buffer again
h
