How to specify more spaces for the delimiter using cut? - linux

Is there any way to specify a field delimiter for more spaces with the cut command? (like " "+) ?
For example: In the following string, I like to reach value '3744', what field delimiter I should say?
$ps axu | grep jboss
jboss 2574 0.0 0.0 3744 1092 ? S Aug17 0:00 /bin/sh /usr/java/jboss/bin/run.sh -c example.com -b 0.0.0.0
cut -d' ' is not what I want, for it's only for one single space.
awk is not what I am looking for either, but how to do with 'cut'?
thanks.

Actually awk is exactly the tool you should be looking into:
ps axu | grep '[j]boss' | awk '{print $5}'
or you can ditch the grep altogether since awk knows about regular expressions:
ps axu | awk '/[j]boss/ {print $5}'
But if, for some bizarre reason, you really can't use awk, there are other simpler things you can do, like collapse all whitespace to a single space first:
ps axu | grep '[j]boss' | sed 's/\s\s*/ /g' | cut -d' ' -f5
That grep trick, by the way, is a neat way to only get the jboss processes and not the grep jboss one (ditto for the awk variant as well).
The grep process will have a literal grep [j]boss in its process command so will not be caught by the grep itself, which is looking for the character class [j] followed by boss.
This is a nifty way to avoid the | grep xyz | grep -v grep paradigm that some people use.

awk version is probably the best way to go, but you can also use cut if you firstly squeeze the repeats with tr:
ps axu | grep jbos[s] | tr -s ' ' | cut -d' ' -f5
# ^^^^^^^^^^^^ ^^^^^^^^^ ^^^^^^^^^^^^^
# | | |
# | | get 5th field
# | |
# | squeeze spaces
# |
# avoid grep itself to appear in the list

I like to use the tr -s command for this
ps aux | tr -s [:blank:] | cut -d' ' -f3
This squeezes all white spaces down to 1 space. This way telling cut to use a space as a delimiter is honored as expected.

I am going to nominate tr -s [:blank:] as the best answer.
Why do we want to use cut? It has the magic command that says "we want the third field and every field after it, omitting the first two fields"
cat log | tr -s [:blank:] |cut -d' ' -f 3-
I do not believe there is an equivalent command for awk or perl split where we do not know how many fields there will be, ie out put the 3rd field through field X.

Shorter/simpler solution: use cuts (cut on steroids I wrote)
ps axu | grep '[j]boss' | cuts 4
Note that cuts field indexes are zero-based so 5th field is specified as 4
http://arielf.github.io/cuts/
And even shorter (not using cut at all) is:
pgrep jboss

One way around this is to go:
$ps axu | grep jboss | sed 's/\s\+/ /g' | cut -d' ' -f3
to replace multiple consecutive spaces with a single one.

Personally, I tend to use awk for jobs like this. For example:
ps axu| grep jboss | grep -v grep | awk '{print $5}'

As an alternative, there is always perl:
ps aux | perl -lane 'print $F[3]'
Or, if you want to get all fields starting at field #3 (as stated in one of the answers above):
ps aux | perl -lane 'print #F[3 .. scalar #F]'

If you want to pick columns from a ps output, any reason to not use -o?
e.g.
ps ax -o pid,vsz
ps ax -o pid,cmd
Minimum column width allocated, no padding, only single space field separator.
ps ax --no-headers -o pid:1,vsz:1,cmd
3443 24600 -bash
8419 0 [xfsalloc]
8420 0 [xfs_mru_cache]
8602 489316 /usr/sbin/apache2 -k start
12821 497240 /usr/sbin/apache2 -k start
12824 497132 /usr/sbin/apache2 -k start
Pid and vsz given 10 char width, 1 space field separator.
ps ax --no-headers -o pid:10,vsz:10,cmd
3443 24600 -bash
8419 0 [xfsalloc]
8420 0 [xfs_mru_cache]
8602 489316 /usr/sbin/apache2 -k start
12821 497240 /usr/sbin/apache2 -k start
12824 497132 /usr/sbin/apache2 -k start
Used in a script:-
oldpid=12824
echo "PID: ${oldpid}"
echo "Command: $(ps -ho cmd ${oldpid})"

Another way if you must use cut command
ps axu | grep [j]boss |awk '$1=$1'|cut -d' ' -f5
In Solaris, replace awk with nawk or /usr/xpg4/bin/awk

I still like the way Perl handles fields with white space.
First field is $F[0].
$ ps axu | grep dbus | perl -lane 'print $F[4]'

My approach is to store the PID to a file in /tmp, and to find the right process using the -S option for ssh. That might be a misuse but works for me.
#!/bin/bash
TARGET_REDIS=${1:-redis.someserver.com}
PROXY="proxy.somewhere.com"
LOCAL_PORT=${2:-6379}
if [ "$1" == "stop" ] ; then
kill `cat /tmp/sshTunel${LOCAL_PORT}-pid`
exit
fi
set -x
ssh -f -i ~/.ssh/aws.pem centos#$PROXY -L $LOCAL_PORT:$TARGET_REDIS:6379 -N -S /tmp/sshTunel$LOCAL_PORT ## AWS DocService dev, DNS alias
# SSH_PID=$! ## Only works with &
SSH_PID=`ps aux | grep sshTunel${LOCAL_PORT} | grep -v grep | awk '{print $2}'`
echo $SSH_PID > /tmp/sshTunel${LOCAL_PORT}-pid
Better approach might be to query for the SSH_PID right before killing it, since the file might be stale and it would kill a wrong process.

Related

Why does `ps | grep -o RandomString` always print `RandomString`? [duplicate]

I am using this command to get the process ID of another command:
ps aux | grep 7000.conf | awk '{print $2}'
This will return two PIDs:
7731
22125
I only want the first one. The second is the PID for grep in the above command. Thanks in advance to any one who knows how to alter the above command to return just the first pid.
p.s. open to a new command that does the same thing
In this particular case, escaping the . to what I assume it was meant to do should work:
ps aux | grep '7000\.conf' | awk '{print $2}'
Alternatively, exclude grep:
ps aux | grep 7000.conf | grep -v grep | awk '{print $2}'
ps aux | grep "[7]000.conf" will work as well.

grep a variable containing special characters

i'm new to bash scripting and i have to determine if a process is running in a linux environment.
Actually i use the follow command to do the job:
#ps -ef | awk '{print substr($0, index($0,$8))}' | grep -v grep | grep -w -F $PROCESSNAME
where
awk '{print substr($0, index($0,$8))}'
allow me to ignore UID PID PPID C STIME TTY TIME fields and
grep -v grep
allow me to ignore the row that contains the command itself. So at this point i have a list of all processes running on the system.
Finally:
grep -w -F $PROCESSNAME
read a variable which contains the name of the process that i want to check.
For what i understand the full command should return only the row that has the exact value of $PROCESSNAME
Actually this doesn't works correctly for processes that follow the pattern "[processname]", and probably also for other patterns.
For example to simplify, if i have a running process named "[vmmemctl]" and i run:
#ps -ef | grep -v grep | grep -w -F "vmmemctl]"
it actually returns a result:
#root 615 2 0 Feb26 ? 00:01:00 [vmmemctl]
but the actual process name in the command is different from the process name in the result.
What is the correct command that doesn't have this behavior?
Thank you
awk to the rescue!
ps -ef | awk '$8=="[command]"{NF=8;print}'
or
ps -ef | awk -v c="vmmemctl]" '$8==c{NF=8;print}'
note that this is for an exact match not pattern.
since this is an exact match the command can have spaces and other special chars in it (it's not pattern match but string equality). Using your variable name it will look like this
ps -ef | awk -v c="$PROCESSNAME" '$8==c{NF=8;print}'

Bash grep output filename and line no without matches

I need to get a list of matches with grep including filename and line number but without the match string
I know that grep -Hl will give only file names and grep -Hno will give filename with only matching string. But those not ideal for me. I need to get a list without match but with line no. For this grep -Hln doesn't work. I tried with grep -Hn 'pattern' | cut -d " " -f 1 But it doesn't cut the filename and line no properly.
awk can do that in single command:
awk '/pattern/ {print FILENAME ":" NR}' *.txt
You were pointing it well with cut, only that you need the : field separator. Also, I think you need the first and second group. Hence, use:
grep -Hn 'pattern' files* | cut -d: -f1,2
Sample
$ grep -Hn a a*
a:3:are
a:10:bar
a:11:that
a23:1:hiya
$ grep -Hn a a* | cut -d: -f1,2
a:3
a:10
a:11
a23:1
I guess you want this, just line numbers:
grep -nh PATTERN /path/to/file | cut -d: -f1
example output:
12
23
234
...
Unfortunately you'll need to use cut here. There is no way to do it with pure grep.
Try
grep -RHn Studio 'pattern' | awk -F: '{print $1 , ":", $2}'

AWK - Remove last item out of list

I'm trying to get the PID's of a certain service. I'm trying to do that with the following command:
ps aux | grep 'DynamoDBLocal' | awk '{print $2}'
Gives output:
1021
1022
1161
This returns me 3 PID's, 2 of the service that I want, and 1 for the grep it just did. I would like to remove the last PID (the one from the grep) out of the list.
How can I achieve this?
Just use pgrep, it is the correct tool in this case:
pgrep 'DynamoDBLocal'
Using grep -v:
ps aux | grep 'DynamoDBLocal' | grep -v grep | awk '{print $2}'
If you have pgrep in your system
pgrep DynamoDBLocal
You can say:
ps aux | grep '[D]ynamoDBLocal' | awk '{print $2}'
With a single call to awk
ps aux | awk '!/awk/ && /DynamoDBLocal/{print $2}'
Try pidof, it should give you the pid directly.
pidof DynamoDBLocal
Answering the original question: How to remove lines from output:
ps aux | grep 'DynamoDBLocal' | awk '{print $2}' | head --lines=-1
head allows you to view the X (default 10) first lines of whatever comes in. Given a value X with minus prepended it shows all but the last X lines. The 'inverse' is tail, btw (when you are interested in the last X lines).
However, given your specific problem of looking for PIDs, I recommend pgrep (perreals answer).
I'm not sure that ps aux | grep '...' is the right way.
But assuming that it is right way, you could do
ps aux | grep '...' | awk '{ if (prev) print prev; prev=$2 }'

How to extract version from a single command line in linux?

I have a product which has a command called db2level whose output is given below
I need to extract 8.1.1.64 out of it, so far i came up with,
db2level | grep "DB2 v" | awk '{print$5}'
which gave me an output v8.1.1.64",
Please help me to fetch 8.1.1.64. Thanks
grep is enough to do that:
db2level| grep -oP '(?<="DB2 v)[\d.]+(?=", )'
Just with awk:
db2level | awk -F '"' '$2 ~ /^DB2 v/ {print substr($2,6)}'
db2level | grep "DB2 v" | awk '{print$5}' | sed 's/[^0-9\.]//g'
remove all but numbers and dot
sed is your friend for general extraction tasks:
db2level | sed -n -e 's/.*tokens are "DB2 v\([0-9.]*\)".*/\1/p'
The sed line does print no lines (the -n) but those where a replacement with the given regexp can happen. The .* at the beginning and the end of the line ensure that the whole line is matched.
Try grep with -o option:
db2level | grep -E -o "[0-9]+\.[0-9]+\.[0-9]\+[0-9]+"
Another sed solution
db2level | sed -n -e '/v[0-9]/{s/.*DB2 v//;s/".*//;p}'
This one desn't rely on the number being in a particular format, just in a particular place in the output.
db2level | grep -o "v[0-9.]*" | tr -d v
Try s.th. like db2level | grep "DB2 v" | cut -d'"' -f2 | cut -d'v' -f2
cut splits the input in parts, seperated by delimiter -d and outputs field number -f

Resources