What does this command do? grep ^..[l-z]$ hello.txt - linux

I am wondering what this command does:
grep ^..[l-z]$ hello.txt
I know grep, but I don't understand what the "^..[l-z]$" part does.
I am using Ubuntu Linux.

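Your command prints only the lines that consist of exactly three characters, where the first two can be any character (each . matches any single character, not just alphanumeric ones) and the last one is in the range l to z. The ^ and $ anchor the match to the start and end of the line. It is safer to quote the pattern ('^..[l-z]$') so the shell cannot expand it as a glob.
See the example below: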
cat hello.txt
hello vipin
..l
1ll
1la
abl
..l
#.z
1.a
..l..
..vipin
grep ^..[l-z]$ hello.txt
..l
1ll
abl
..l
#.z

It matches all lines that consist of exactly 3 characters, where the last one is a letter from l through z.
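To try the pattern without touching an existing file, here is a minimal sketch. The sample file content and the temporary directory are assumptions made for illustration, and LC_ALL=C is set so that the range l-z is interpreted as plain ASCII.

```shell
# Build a small sample file (hypothetical content) in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'hello vipin' '..l' '1la' 'abl' '#.z' '..l..' > "$tmp/hello.txt"

# Quote the pattern so the shell cannot glob-expand it.
# ^ and $ anchor the match, each . is any single character,
# and [l-z] is one character in the range l to z.
matches=$(LC_ALL=C grep '^..[l-z]$' "$tmp/hello.txt")

printf '%s\n' "$matches"
rm -r "$tmp"
```

Only the three-character lines ending in l through z are printed; "1la" and "..l.." are rejected.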

Related

Linux tail command includes more lines than intended

I want to get into Linux scripting and started with a simple example from a book. In this book, the author wants me to grab the five lines before "Step #6: Configure output plugins" from snort.conf.
Like the author, I determined the number of the line I want, which is 445 for me. If I then use tail, the result contains more text than I expect, and the line I searched for, which should be in line 5, is at line 88. I don't understand why more text is included when I tell tail to start at a specific line.
To search for the line I used
nl /etc/snort/snort.conf | grep output.
To get the 5 lines before including the searched line:
tail -n+440 /etc/snort/snort.conf | head -n+6
The tail statement seems to be the problem. Any help on why this is not working is appreciated!
Your tail command is correct in principle.
The problem lies in the way you obtain the line number using nl. The nl command does not count empty lines by default, while the tail command does. You should tell nl to count the empty lines as well, which you can do with the -b (--body-numbering) option and the style a. This would look as follows:
nl -ba /etc/snort/snort.conf | grep output.
From nl --help:
Usage: nl [OPTION]... [FILE]...
Write each FILE to standard output, with line numbers added.
With no FILE, or when FILE is -, read standard input.
Mandatory arguments to long options are mandatory for short options too.
-b, --body-numbering=STYLE use STYLE for numbering body lines
[...]
By default, selects -v1 -i1 -l1 -sTAB -w6 -nrn -hn -bt -fn. CC are
two delimiter characters for separating logical pages, a missing
second character implies :. Type \\ for \. STYLE is one of:
a number all lines
t number only nonempty lines
Number all lines and use that line number in tail.
Hello, I am working through the same book and did not find a great solution with tail or nl, but grep has simple -B (before) and -A (after) switches for this.
I solved it by typing:
grep -B 5 "Step #6: Configure output plugins" /etc/snort/snort.conf
This prints the 5 lines before the matching line; -A does the same for the lines after it.
Hope this helps someone. Stay safe and happy learning 🙂
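The numbering mismatch is easy to reproduce on a small file. The following is a sketch with a made-up five-line file (not snort.conf); it compares the line number reported by nl, nl -ba, and grep -n, then uses the grep -n result with tail and head.

```shell
# Hypothetical sample: lines 2 and 4 are empty, the target is line 5.
tmp=$(mktemp)
printf 'one\n\ntwo\n\ntarget line\n' > "$tmp"

# nl skips empty lines by default (-bt); nl -ba and grep -n count them.
nl_default=$(nl "$tmp" | awk '/target/ {print $1}')
nl_all=$(nl -ba "$tmp" | awk '/target/ {print $1}')
grep_n=$(grep -n 'target' "$tmp" | cut -d: -f1)
echo "nl: $nl_default, nl -ba: $nl_all, grep -n: $grep_n"

# Print the two lines before the match plus the match itself.
start=$((grep_n - 2))
context=$(tail -n +"$start" "$tmp" | head -n 3)
printf '%s\n' "$context"
rm "$tmp"
```

Only the numbers that count empty lines (nl -ba or grep -n) line up with what tail -n +N expects.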

How to extract a specific text from gz file?

I need to extract characters 5 to 10 of each sequence from my fastq.gz data; the data is too large to process in R. Can I do this directly on the Linux command line?
The fastq file looks like this:
#NB501399:67:HFKTCBGX5:1:11101:13202:1044 1:N:0:CTTGTA
GAGGTNACGGAGTGGGTGTGTGCAGGGCCTGGTGGGAATGGGGAGACCCGTGGACAGAGCTTGTTAGAGTGTCCTAGAGCCAGGGGGAACTCCAGGCAGGGCAAATTGGGCCCTGGATGTTGAGAAGCTGGGTAACAAGTACTGAGAGAAC
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAAEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAE6
#NB501399:67:HFKTCBGX5:1:11101:1109:1044 1:N:0:CTTGTA
TAGGCNACCTGGTGGTCCCCCGCTCCCGGGAGGTCACCATATTGATGCCGAACTTAGTGCGGACACCCGATCGGCATAGCGCACTACAGCCCAGAACTCCTGGACTCAAGCGATCCTCCAGCCTCAGCCTCCCGAGTAGCTGGGACTACAG
+
I only want to extract characters 5 to 10 of the sequence line of each record (TNACGG for the first one, CNACCT for the second) and write them to a new txt file. Can I do that?
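You can use GNU sed with zcat:
zcat fastq.gz | sed -n '2~4{s/.\{4\}\(.\{6\}\).*/\1/;p}'
-n means lines are not printed by default
2~4 means start with line 2 and match every fourth line after it, because a FASTQ record as shown above is four lines (header, sequence, +, quality). If your file has a blank separator line between records, use 2~5 instead.
when the "address" matches, the substitution remembers the fifth to tenth character in \1 and replaces the whole line with it, p prints the result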
Another using zgrep and positive lookbehind:
$ zgrep -oP "(?<=^[ACTGN]{4})[ACTGN]{6}" foo.gz
TNACGG
CNACCT
Explained:
zgrep: from man zgrep, search possibly compressed files for a regular expression
-o: print only the matched (non-empty) parts of a matching line
-P: interpret the pattern as a Perl-compatible regular expression (PCRE)
(?<=^[ACTGN]{4}): positive lookbehind, requires four of these characters at the start of the line right before the match
[ACTGN]{6}: match 6 of the named characters that are preceded by the above
foo.gz: my test file
Another option, using awk (the sequence is the second line of every four-line record):
$ zcat fastq.gz | awk 'NR % 4 == 2 {print substr($0, 5, 6)}'
TNACGG
CNACCT
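To check any of these commands before running them on a large file, it can help to build a tiny compressed test file first. The sketch below assumes standard four-line FASTQ records; the read names, the shortened sequences, and the output file name barcodes.txt are made up for illustration. It uses gzip -dc, which behaves like zcat, and cut -c5-10 to pick characters 5 through 10.

```shell
# Create a two-record FASTQ file (hypothetical, shortened reads) and compress it.
tmp=$(mktemp -d)
printf '%s\n' \
  '@read1' 'GAGGTNACGGAGTG' '+' 'AAAAA#EEEEEEEE' \
  '@read2' 'TAGGCNACCTGGTG' '+' 'AAAAA#EEEEEEEE' \
  | gzip -c > "$tmp/sample.fastq.gz"

# Keep only the sequence line of each record, then cut characters 5-10.
gzip -dc "$tmp/sample.fastq.gz" | awk 'NR % 4 == 2' | cut -c5-10 > "$tmp/barcodes.txt"

extracted=$(cat "$tmp/barcodes.txt")
printf '%s\n' "$extracted"
rm -r "$tmp"
```

Because everything is streamed through a pipe, the decompressed data never has to fit in memory or on disk.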

How can I find which lines in a file do not start with any line from another file, using bash?

I have two text files, A and B:
A:
a start
b stop
c start
e start
B:
b
c
How can I find the lines in A that do not start with any line from B, using a shell (bash...) command? In this case, I want to get this answer:
a start
e start
Can I do this with a single command line?
This should do:
sed '/^$/d;s/^/^/' B | grep -vf - A
The sed command takes all non-empty lines from the file B (observe the /^$/d command), prepends a caret ^ to each line (so as to obtain an anchor for grep's regexp), and writes the result to stdout. Then grep, with the -f option (which means take all patterns from a file, which happens to be stdin here, thanks to the - symbol), does an inverted match (thanks to the -v option) on file A. Note that the lines of B are interpreted as regular expressions, so special characters in B would need escaping. Done.
I think this should do it:
sed 's/^/\^/g' B > C.tmp
grep -vEf C.tmp A
rm C.tmp
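You can build a bracket expression from the first letter of each line in B and let grep exclude the lines of A that start with one of those letters.
Save the first letters of each line into FIRSTLETTERLIST, for example with cut and tr:
FIRSTLETTERLIST=$(cut -c1 B | tr -d '\n')
The idea is to take this blacklist and then match it against the interesting file:
grep -v "^[$FIRSTLETTERLIST]" A
This only compares the first character of each line, and it breaks if lines in B start with characters that are special inside brackets (such as ], ^ or -), but it should point you in the right direction.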
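The approaches above can be tried on the example files from the question. The sketch below recreates A and B in a scratch directory and runs the sed and grep pipeline; it also shows an awk variant, which is not from the answers above and is added here as an assumption-labeled alternative that treats each line of B as a literal prefix rather than a regular expression.

```shell
# Recreate the example files A and B in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'a start' 'b stop' 'c start' 'e start' > "$tmp/A"
printf '%s\n' 'b' 'c' > "$tmp/B"

# Regex approach: anchor each non-empty line of B with ^, then invert the match.
by_grep=$(sed '/^$/d;s/^/^/' "$tmp/B" | grep -vf - "$tmp/A")

# Literal-prefix alternative: index() returns 1 when the line starts with k.
by_awk=$(awk 'NR == FNR { if ($0 != "") prefix[$0]; next }
              { for (k in prefix) if (index($0, k) == 1) next; print }' \
              "$tmp/B" "$tmp/A")

printf '%s\n' "$by_grep"
rm -r "$tmp"
```

Both variables hold the same two lines here; the results would only differ if B contained regex metacharacters.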

How can I see which line ending characters are used in a file?

This has been asked a dozen times already, but the answer is always "see what type of file vi says it is and deduce from that" or "run it through cat and see if the Windows line endings are rendered" or "run it through egrep to see if egrep finds instances of one type of line ending or another".
Is there not a reasonably easy way to just directly view which characters are used? Ideally I would just have a flag on cat that printed escape characters in their human-readable representation instead of rendering them as whitespace.
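You can also use "cat -v". It doesn't show everything, but it does show the carriage return of a "\r\n" line ending as "^M" (with GNU cat, "cat -A" additionally marks the end of each line with "$"):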
$ cat -v WKB2.gff | head
##gff-version 2^M
# seqname source feature start end score strand frame attributes^M
CP007446 - source 1 2527978 . + . organism "Snodgrassella alvi wkB2" ; mol_type "genomic DNA" ; strain "wkB2" ; db_xref "taxon:1196094"^M
$ cat -v PAO1.gff | head
##gff-version 3
#!gff-spec-version 1.20
#!processor NCBI annotwriter
##sequence-region AE004091.2 1 6264404
##species http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=208964
AE004091.2 Genbank region 1 6264404 . + . ID=id0;Dbxref=taxon:208964;Is_circular=true;gbkey=Src;genome=chromosome;mol_type=genomic DNA;strain=PAO1
Try od (http://en.wikipedia.org/wiki/Od_(Unix)). For example, od -c file.txt prints every byte, with line-ending characters shown as \r and \n.
head -1 file.txt | hexdump -C
Look at the last bytes printed out and you should be able to tell what the line endings are.
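As a quick illustration of the od suggestion, the sketch below writes two small files (hypothetical names dos.txt and unix.txt) with CRLF and LF endings, dumps the first one with od -c, and counts the carriage returns in each with tr and wc.

```shell
# Two sample files: one with Windows (CRLF) and one with Unix (LF) endings.
tmp=$(mktemp -d)
printf 'a\r\nb\r\n' > "$tmp/dos.txt"
printf 'a\nb\n' > "$tmp/unix.txt"

# od -c prints each byte; carriage return and newline appear as \r and \n.
dos_dump=$(od -c "$tmp/dos.txt")
printf '%s\n' "$dos_dump"

# Count carriage returns: tr -cd deletes every byte except \r.
dos_cr=$(( $(tr -cd '\r' < "$tmp/dos.txt" | wc -c) ))
unix_cr=$(( $(tr -cd '\r' < "$tmp/unix.txt" | wc -c) ))
echo "CR bytes: dos.txt=$dos_cr unix.txt=$unix_cr"
rm -r "$tmp"
```

A non-zero carriage return count that equals the number of lines is a strong hint that the file uses CRLF endings throughout.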

"sed" command in bash

Could someone explain this command to me:
cat << EOF | sed -e 's,%,$,g' | sudo tee /etc/init.d/dropbox
echo "Hello World"
EOF
What does the "sed" command do?
sed is the Stream EDitor. It can do a whole pile of really cool things, but the most common is text replacement.
The s,%,$,g part of the command line is the sed command to execute. The s stands for substitute, and the , characters are delimiters (other characters can be used; /, : and # are popular). The % is the pattern to match (here a literal percent sign) and the $ is the replacement text (here a literal dollar sign). The g at the end means to replace every match on each line (otherwise only the first match on each line would be replaced).
Here sed is replacing all occurrences of % with $ in its standard input.
As an example
$ echo 'foo%bar%' | sed -e 's,%,$,g'
will produce "foo$bar$".
It reads the here-document (the line echo "Hello World") with cat, replaces all (g) occurrences of % with $, and (over)writes the result to /etc/init.d/dropbox as root.
sed is a stream editor. I would suggest reading man sed. If the man page is not installed on your system, refer to this URL:
http://unixhelp.ed.ac.uk/CGI/man-cgi?sed
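One plausible reason for writing % and converting it to $ afterwards is to keep the shell from expanding variables inside the here-document; that motivation is an assumption, not something stated in the question. The sketch below shows the same pipeline writing to a temporary file instead of /etc/init.d/dropbox, so it needs no sudo. The quoted 'EOF' delimiter and the sample line are illustrative choices.

```shell
# Write to a scratch file instead of a system path.
tmp=$(mktemp)

# The here-document is fed to cat, sed turns every % into a literal $,
# and the result is saved. Quoting 'EOF' disables expansion in the body.
cat << 'EOF' | sed -e 's,%,$,g' > "$tmp"
echo "Home is %HOME"
EOF

result=$(cat "$tmp")
printf '%s\n' "$result"
rm "$tmp"
```

The generated file contains the line echo "Home is $HOME", with the dollar sign left for the generated script to expand when it runs.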

Resources