Linux grep only n,m character strings - linux

I have a file with the following contents:
TESTING
TSET24D
DSWEDFBG
WTSETO
MSDWEHLKGY
and I want to grep only the strings that are exactly 6 or 8 characters long.
I tried the following:
[root#server ~]# cat listfile | grep -o -w -E '^[[:alnum:]]{6,8}'
TESTING
TSET24D
DSWEDFBG
WTSETO
[root#server ~]#
This seems to work on some servers, but it also returns the strings whose length falls between 6 and 8 (7 characters).
Any ideas?

In your regex, {6,8} matches between 6 and 8 repetitions (so 7 is included). Use alternation (|, the regex OR) to match exactly 6 or exactly 8 repetitions.
$ grep -o -w -E '^[[:alnum:]]{6}|^[[:alnum:]]{8}' listfile
DSWEDFBG
WTSETO
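
As a possible alternative, grep's -x option (match the whole line) removes the need for -o, -w and the repeated ^ anchors. The sketch below assumes the sample data from the question; the function name exact_6_or_8 is made up for illustration.

```shell
# Hypothetical helper: print only lines made of exactly 6 or exactly 8
# alphanumeric characters. -x requires the pattern to match the whole line.
exact_6_or_8() {
    grep -x -E '[[:alnum:]]{6}|[[:alnum:]]{8}' "$@"
}

# Sample data from the question, fed on standard input.
printf '%s\n' TESTING TSET24D DSWEDFBG WTSETO MSDWEHLKGY | exact_6_or_8
# prints:
# DSWEDFBG
# WTSETO
```

Called with a file name (exact_6_or_8 listfile), it reads that file instead of standard input.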

Using Awk with its POSIX compliant length() function,
awk '(length($0)==6 || length($0)==8) && $0 ~ /^[[:alnum:]]+$/' file
DSWEDFBG
WTSETO
This works for an input file such as:
TESTING
TSET24D
DSWEDFBG
WTSETO
MSDWEHLKGY
------
--------
------a
#1234$21
Or, more simply, as Ed Morton suggests:
awk '/^([[:alnum:]]{6}|[[:alnum:]]{8})$/' file

In awk:
awk '/^......(..)?$/' file
DSWEDFBG
WTSETO
The same, restricted to [[:alnum:]] characters:
$ awk '/^[[:alnum:]]{6}([[:alnum:]]{2})?$/' file
DSWEDFBG
WTSETO

Related

How to grep full words based on partial input?

I have a file text.txt which contains the below words.
1. moon,one
2. sun,two
3. well,three
4. doll,four
If I grep this file for sun:
grep -i sun text.txt
I get the output
sun,two
However, I need to search with a word that merely starts with sun, not exactly sun:
grep -i sunlight text.txt
Here I need the same output as for grep -i sun text.txt.
You don't need awk or gawk, nor sed. Just do
grep -o 'sun.*' text.txt
Other more complex / elegant solutions may be available depending on the system you are using.
What you are looking for are regular expressions.
In your case, it would be
grep -i 'sun.*' text.txt
Try using -o, as shown in the documentation.
The -o option makes grep return only the matched part. You can also use regular expressions.
grep -io sun text.txt
Is this what you're looking for?
awk -F ',' '/^[Ss][Uu][Nn]/ {print $0}' text.txt
or, if you want to search for the pattern "sun" anywhere in the input file, use this (IGNORECASE is specific to GNU awk):
awk -F ',' 'BEGIN{IGNORECASE=1} /sun/ {print $0}' text.txt
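
If the goal is the reverse lookup described in the question (searching with sunlight and still finding the sun line), one option is to test whether the first field of each line is a prefix of the search word. This is only a sketch: it assumes the file holds plain word,value lines without the list numbering, and lines_for_word is a hypothetical name.

```shell
# Hypothetical helper: print lines whose first comma-separated field is a
# case-insensitive prefix of the given search word.
lines_for_word() {
    word=$1
    shift
    # index(s, t) == 1 means that s starts with t; the length() check
    # skips lines whose first field is empty.
    awk -F, -v w="$word" 'length($1) && index(tolower(w), tolower($1)) == 1' "$@"
}

printf '%s\n' 'moon,one' 'sun,two' 'well,three' 'doll,four' | lines_for_word sunlight
# prints: sun,two
```

With a file, the call would be lines_for_word sunlight text.txt.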

grep a particular content before a period

I am trying to read/grep a particular word or content that is before a period (.).
e.g. file1 has abinaya.ashok and I want to grep whatever is before the period (.) without hardcoding anything.
If I try
grep \.\ file1
it gives abinaya.ashok.
I've also tried: grep\*\.\ file1
but it doesn't give anything. Can this be done with grep, or only with awk? Any thoughts?
Using GNU grep with PCRE (-P, for the non-greedy quantifier and positive look-ahead), you can do:
echo 'abinaya.ashok' | grep -oP '.*?(?=\.)'
abinaya
Using awk:
echo 'abinaya.ashok' | awk -F\. '{print $1}'
abinaya
Check the following simple examples.
Including the dot:
$ echo abinaya.ashok | grep -o '.*[.]'
abinaya.
Without the dot:
$ echo abinaya.ashok | grep -o '^[^.]\+'
abinaya
Hope I understand you correctly:
sed -n 's/\..*//p' file1 | grep whatever
The sed expression prints only the part before the first dot (lines without a dot are not printed).
Then use grep to search for what you need.
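
When the value is already in a shell variable, no external tool is needed: POSIX parameter expansion can strip everything from the first dot onward. A minimal sketch, with before_dot as a made-up helper name:

```shell
# Hypothetical helper: print the part of a string before its first dot.
# ${1%%.*} removes the longest suffix that matches ".*" (a dot and the rest).
before_dot() {
    printf '%s\n' "${1%%.*}"
}

before_dot abinaya.ashok
# prints: abinaya
```

A string without a dot is printed unchanged, which differs from the sed -n version above.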

How to return substring from a linux command

I'm connecting to Exadata machines and want to get the "ORACLE_HOME" variable on each of them, so I'm using this command:
ls -l /proc/<pid>/cwd
This is the output:
2 oracle oinstall 0 Jan 23 21:20 /proc/<pid>/cwd -> /u01/app/database/11.2.0/dbs/
I need to get the last part:
/u01/app/database/11.2.0 (I don't want the "/dbs/" there)
I will be running this command several times on different machines. How can I extract this substring from the whole output?
Awk and grep are good for these types of issues.
New:
ls -l /proc/<pid>/cwd | awk '{print ($NF) }' | sed 's#/dbs/##'
Old:
ls -l /proc/<pid>/cwd | awk '{print ($NF) }' | egrep -o '^.+[.0-9]'
Awk prints the last column of its input, which is the output of your ls command, and then grep grabs the beginning of that string up to the last occurrence of digits and dots. This is a situational solution and perhaps not the best.
Parsing the output of ls is generally considered sub-optimal. I would use something more like this instead:
dirname "$(readlink -f /proc/<pid>/cwd)"
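
Since the command will be run repeatedly on several machines, it may be convenient to wrap it in a small function. The sketch below assumes a readlink that supports -f (as GNU coreutils does); link_parent is a hypothetical name, and the shell's own PID ($$) stands in for the real process ID.

```shell
# Hypothetical helper: resolve a symlink and print the parent directory of
# its target, e.g. /proc/<pid>/cwd -> /u01/app/database/11.2.0/dbs gives
# /u01/app/database/11.2.0.
link_parent() {
    dirname "$(readlink -f "$1")"
}

# Example: parent of the current shell's working directory (Linux /proc).
link_parent "/proc/$$/cwd"
```

The quotes around the command substitution keep paths containing spaces intact.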

Linux cut string

In Linux (CentOS) I have a file that contains additional information that I want to remove. I want to generate a new file containing, for each line, all characters up to the first |.
The file has the following information:
ALFA12345|7890
Beta0-XPTO-2|30452|90 385|29
ZETA2334423 435; 2|2|90dd5|dddd29|dqe3
The expected output is:
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
That is, all characters from the first | onward (inclusive) are removed.
Any suggestions for a script that reads File1 and generates File2 to meet this requirement?
Try
cut -d'|' -f1 oldfile > newfile
And, to round out the "big 3", here's the awk version:
awk -F\| '{print $1}' in.dat
You can use a simple sed script.
sed 's/^\([^|]*\).*/\1/g' in.dat
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
Redirect to a file to capture the output.
sed 's/^\([^|]*\).*/\1/g' in.dat > out.dat
And with grep:
$ grep -o '^[^|]*' file1
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
$ grep -o '^[^|]*' file1 > file2
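
For completeness, the same result can be sketched in pure shell, which may help if the logic has to live inside a larger script. The function name first_field is an assumption made for this example.

```shell
# Hypothetical helper: copy standard input to standard output, keeping only
# the text before the first | on each line.
first_field() {
    # The "|| [ -n "$line" ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${line%%|*}"   # strip the longest suffix starting at |
    done
}

printf '%s\n' 'ALFA12345|7890' 'Beta0-XPTO-2|30452|90 385|29' | first_field
# prints:
# ALFA12345
# Beta0-XPTO-2
```

To read File1 and write File2, the call would be first_field < File1 > File2. For large files, cut, awk, sed or grep will usually be faster than a shell loop.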

Grep - returning both the line number and the name of the file

I have a number of log files in a directory. I am trying to write a script to search all the log files for a string and echo the name of the files and the line number that the string is found.
I figure I will probably have to use 2 grep's - piping the output of one into the other since the -l option only returns the name of the file and nothing about the line numbers. Any insight in how I can successfully achieve this would be much appreciated.
Many thanks,
Alex
$ grep -Hn root /etc/passwd
/etc/passwd:1:root:x:0:0:root:/root:/bin/bash
Combining -H and -n does what you expect.
If you want to print only the file name and line number, without the matching text:
$ grep -Hn root /etc/passwd | cut -d: -f1,2
/etc/passwd:1
or with awk (FILENAME and FNR stay correct when several files are given):
$ awk -F: '/root/{print "file=" FILENAME "\nline=" FNR}' /etc/passwd
file=/etc/passwd
line=1
If you want to create shell variables (piping into bash would set them only in that child shell, so use eval in the current shell):
$ eval "$(awk -F: '/root/{print "file=" FILENAME "\nline=" FNR}' /etc/passwd)"
$ echo $line
1
$ echo $file
/etc/passwd
Use -H. If you are using a grep that does not have -H, specify two filenames. For example:
grep -n pattern file /dev/null
My version of grep kept returning the text of the matching line, which I wasn't sure you were after. You can also pipe the output to awk so that it prints ONLY the file name and line number:
grep -Hn "text" ./* | awk -F: '{print $1 ":" $2}'
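
Putting the pieces together for the original request (search every log file and report file name plus line number), a script could look roughly like the sketch below. It assumes a grep that supports -H, treats the search text as a fixed string (-F), and breaks on file names that contain a colon; match_locations and the sample log files are invented for the demonstration.

```shell
# Hypothetical helper: print "file:line" for every line that contains the
# given fixed string in any of the given files.
match_locations() {
    pattern=$1
    shift
    grep -HnF -- "$pattern" "$@" | cut -d: -f1,2
}

# Demo with two throwaway log files in a temporary directory.
logdir=$(mktemp -d)
printf 'ok\nerror here\n' > "$logdir/a.log"
printf 'error first\nok\n' > "$logdir/b.log"
match_locations error "$logdir"/*.log   # a.log matches on line 2, b.log on line 1
rm -rf "$logdir"
```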
