I'm using ps, grep, and sed to try to identify some java processes that are uniquely identified by some specific argument, e.g. -DAppService=DDDABC_456 or -DAppService=DDDXYZ_456_cazorla. I want to return a comma separated list: PID,argument,process
I'm working on CentOS7. So far I'm only about half way down the line but getting tangled up.
I'm shooting for this:
1234,-DAppService=DDDABC_456,/usr/java/jdk1.8.0_112/bin/java
2345,-DAppService=DDDABC_456_cazorla,/usr/java/jdk1.8.0_112/bin/java
3456,-DAppService=DDDXYZ_789,/usr/java/jdk1.8.0_112/bin/java
4567,-DAppService=DDDXYZ_789_cazorla,/usr/java/jdk1.8.0_112/bin/java
Note that the argument may or may not have a suffix of "_cazorla".
I tried this but it loses the arguments (and the number of arguments may vary so I don't think I can continue with $9, $10, etc.):
ps -ef | grep DAppService=DDD[A-Z]*_[0-9]*(?:_[a-z]*)? | grep -v grep | awk '{OFS=","; print $2,$8}'
Gives me:
1234,/usr/java/jdk1.8.0_112/bin/java
2345,/usr/java/jdk1.8.0_112/bin/java
3456,/usr/java/jdk1.8.0_112/bin/java
4567,/usr/java/jdk1.8.0_112/bin/java
Also this which comma separates all the grep column results and all arguments too which I don't want:
ps -aef | grep DAppService=DDD[A-Z]*_[0-9]*(?:_[a-z]*)? | grep -v grep | sed -e "s/\s\+/,/g"
Actual result too much to list here but e.g.
user,1234,1,0,Jul03,pts/0,00:03:21,/usr/java/jdk1.8.0_112/bin/java,arg1,arg2,arg3,argn...
user,2345,1,0,Jul03,pts/0,00:03:21,/usr/java/jdk1.8.0_112/bin/java,arg1,arg2,arg3,argn...
user,3456,1,0,Jul03,pts/0,00:03:21,/usr/java/jdk1.8.0_112/bin/java,arg1,arg2,arg3,argn...
user,4567,1,0,Jul03,pts/0,00:03:21,/usr/java/jdk1.8.0_112/bin/java,arg1,arg2,arg3,argn...
My sed knowledge is pretty poor (as is awk but would be open to that as an option too). Once I'm happy with the commands I want to put them into a bash script that I can call from elsewhere.
ps -eo pid=,args= |\
awk '
{
for (i=3; i<=NF; i++)
if ($i ~ regex) {
print $1, $i, $2
next
}
}
' OFS=, regex='awk re to match arg'
ask ps to output just pid and the commandline
specify a regex to awk and have it check each argument (fields 3 to NF) for a match
if found, output pid ($1), the relevant argument ($i), and the command ($2), then move on to the next process
Notes:
awk can't distinguish cmd "arg1 with spaces" from cmd arg1 arg2 arg3 but that may not matter here
spaces in the command (eg. in a directory name in the path) will cause the command to be truncated at the first space
commas in the command (or the relevant argument) will break the csv output
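As a concrete sketch of the approach above, the placeholder regex can be filled in for the -DAppService arguments from the question. The function name list_app_services and the exact pattern are assumptions made for illustration. The pattern is anchored so that it has to match a whole argument, and the optional lowercase group covers the "_cazorla" suffix.

```shell
# Hypothetical helper: reads lines shaped like `ps -eo pid=,args=` output
# (PID, command, then arguments) on stdin and prints PID,argument,command.
list_app_services() {
  awk -v OFS=, -v regex='^-DAppService=DDD[A-Z]+_[0-9]+(_[a-z]+)?$' '
    {
      # Fields 3..NF are the arguments; stop at the first one that matches.
      for (i = 3; i <= NF; i++)
        if ($i ~ regex) {
          print $1, $i, $2
          next
        }
    }'
}

# On a live system: ps -eo pid=,args= | list_app_services
```

Wrapping the awk call in a function keeps it easy to call from a larger bash script. The caveats in the notes above about spaces and commas still apply.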
To find the 2nd character it was grep -e '^.[aA]'. Then what will be for the 4th character? I tried grep -e'^...[aA]'. But it went wrong.
grep processes the input line by line. ^.[aA] is true if a or A is the second character on any line.
You can combine grep with head to only inspect the first line:
head -n1 filename | grep '^...[aA]'
But it still wouldn't work for a file whose first line is shorter than four characters:
x
ya
To really check the fourth character in a file, grep is not the best tool.
#! /bin/bash
read -N4 chars < filename
if [[ "${chars:3:1}" == [aA] ]] ; then
echo Found
fi
But if you try hard enough, you can still use grep. E.g., use tr to replace newlines with spaces, then run your grep on the result:
tr '\n' ' ' < filename | grep '^...[aA]'
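For shells without read -N, one possible alternative is to cut out the fourth byte with head and tail. This is a different technique from the bash snippet above, and the function name fourth_char_is_a is made up for this sketch. It assumes head supports -c, which GNU and BSD versions do.

```shell
# Hypothetical helper: prints "Found" if the 4th byte of the file is a or A.
fourth_char_is_a() {
  # head keeps at most the first four bytes; tail -c +4 keeps everything
  # from byte 4 onward, so c is the 4th byte or empty for a short file.
  c=$(head -c 4 "$1" | tail -c +4)
  case $c in
    [aA]) echo Found ;;
    *)    echo "Not found" ;;
  esac
}

# Usage: fourth_char_is_a filename
```

Newlines count as characters here, so a file containing the two lines x and ya is reported as a match, just as with the read -N4 version.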
I want to pipe the output of some command to awk and use the system call in awk, but awk does not seem to accept flags with a hyphen. For example, let's say I have a bunch of files and I want to "cat" them. I would use ls -1 | awk '{ system(" cat " $1)}'
Now, if I also want to print line numbers with -n, it does not work: ls -1 | awk '{ system(" cat -n" $1)}'
You need a space between -n and the file name:
ls -1 | awk '{ system(" cat -n " $1)}'
Notes
-1 is not needed. ls implicitly prints 1 file per line when its output goes to a pipe.
Any file name with whitespace in it will cause this code to fail.
Parsing the output of ls is generally a bad idea. Both find and the shell offer superior handling of difficult file names.
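As a sketch of the shell-based alternative mentioned in the last note, a glob loop hands each file name to cat as a single argument, so whitespace in names is not a problem. The function name number_files is hypothetical.

```shell
# Hypothetical helper: number the lines of every regular file in a directory.
number_files() {
  for f in "$1"/*; do
    # Skip directories, and the unexpanded pattern when nothing matches.
    [ -f "$f" ] || continue
    cat -n -- "$f"
  done
}

# Usage: number_files .
```

Quoting "$f" is what keeps a name like "a b.txt" intact, and the -- guards against names that begin with a hyphen.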
John1024's helpful answer fixes your problem and contains helpful advice, but let me focus on the syntax aspects:
As a command string, cat -n <file> requires at least 1 space (or tab) between the n, which is an option, and <file>, which is an operand.
String concatenation works differently in awk than in the shell:
" cat -n" $1, despite the presence of a space between " cat -n" and $1, does not insert that space in the resulting string, because awk's string concatenation works by directly joining strings placed next to one another irrespective of intervening whitespace.
For instance, the following commands all yield string literal ab, irrespective of any whitespace between the operands of the string concatenation:
awk 'BEGIN { print "a""b" }'
awk 'BEGIN { print "a" "b" }'
awk 'BEGIN { s = "b"; print "a"s }'
awk 'BEGIN { s = "b"; print "a" s }'
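To see the effect on the command from the question, a small sketch can print the command strings rather than run them. The function name show_cmds and the file name notes.txt are only examples.

```shell
# Hypothetical helper: for each input line, print the command string that awk
# would build without and with the explicit space inside the quotes.
show_cmds() {
  awk '{ print "cat -n" $1; print "cat -n " $1 }'
}

echo notes.txt | show_cmds
# cat -nnotes.txt
# cat -n notes.txt
```

Printing the string before passing it to system() is a cheap way to check what awk will actually execute.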
This is not a proper use case for awk; you're better off with something like this:
find . -maxdepth 1 -type f -exec cat -n {} \;
How do I join the result of ls -1 into a single line and delimit it with whatever I want?
paste -s -d joins lines with a delimiter (e.g. ","), and does not leave a trailing delimiter:
ls -1 | paste -sd "," -
EDIT: Simply "ls -m" If you want your delimiter to be a comma
Ah, the power and simplicity !
ls -1 | tr '\n' ','
Change the comma "," to whatever you want. Note that this includes a "trailing comma" (for lists that end with a newline)
This replaces the last comma with a newline:
ls -1 | tr '\n' ',' | sed 's/,$/\n/'
ls -m includes newlines at the screen-width character (80th for example).
Mostly Bash (only ls is external):
saveIFS=$IFS; IFS=$'\n'
files=($(ls -1))
IFS=,
list=${files[*]}
IFS=$saveIFS
Using readarray (aka mapfile) in Bash 4:
readarray -t files < <(ls -1)
saveIFS=$IFS
IFS=,
list=${files[*]}
IFS=$saveIFS
Thanks to gniourf_gniourf for the suggestions.
I think this one is awesome
ls -1 | awk 'ORS=","'
ORS is the "output record separator" so now your lines will be joined with a comma.
Parsing ls is generally not advised, so a better alternative is to use find, for example:
find . -type f -print0 | tr '\0' ','
Or by using find and paste:
find . -type f | paste -d, -s
For general joining multiple lines (not related to file system), check: Concise and portable “join” on the Unix command-line.
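The find and paste variant can be wrapped in a small function. This is only a sketch: the name list_files_csv is made up, sort is added so the order is predictable, and file names that contain newlines or commas will still produce ambiguous output.

```shell
# Hypothetical helper: comma-separated list of regular files under a directory.
list_files_csv() {
  # The trailing "-" tells paste to read standard input explicitly, which
  # some paste implementations require.
  find "$1" -type f | LC_ALL=C sort | paste -sd, -
}

# Usage: list_files_csv .
```

Unlike the tr '\0' ',' form, paste does not leave a trailing delimiter.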
The combination of setting IFS and use of "$*" can do what you want. I'm using a subshell so I don't interfere with this shell's $IFS
(set -- *; IFS=,; echo "$*")
To capture the output,
output=$(set -- *; IFS=,; echo "$*")
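The same IFS and "$*" idea can be packaged as a reusable function. The name join_by is an assumption for this sketch. Its body is written as a subshell so that the IFS change does not leak into the caller.

```shell
# Hypothetical helper: join_by DELIM ARG... prints the arguments joined by
# the first character of DELIM.
join_by() (
  IFS=$1
  shift
  printf '%s\n' "$*"
)

# Usage: join_by , *
```

Only the first character of IFS is used when "$*" is expanded, so this approach is limited to single-character delimiters.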
Adding on top of majkinetor's answer, here is the way of removing trailing delimiter(since I cannot just comment under his answer yet):
ls -1 | awk 'ORS=","' | head -c -1
Just remove as many trailing bytes as your delimiter counts for.
I like this approach because I can use multi character delimiters + other benefits of awk:
ls -1 | awk 'ORS=", "' | head -c -2
EDIT
As Peter has noticed, negative byte count is not supported in native MacOS version of head. This however can be easily fixed.
First, install coreutils. "The GNU Core Utilities are the basic file, shell and text manipulation utilities of the GNU operating system."
brew install coreutils
Commands also provided by MacOS are installed with the prefix "g". For example gls.
Once you have done this you can use ghead which has negative byte count, or better, make alias:
alias head="ghead"
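If installing coreutils is not an option, one possible workaround is to have awk emit the delimiter before each line except the first, so there is nothing to trim afterwards. This is a sketch of a different technique from the ORS one-liner above, and the function name join_lines is hypothetical.

```shell
# Hypothetical helper: join stdin lines with a (possibly multi-character)
# delimiter, without a trailing delimiter.
join_lines() {
  awk -v d="$1" '
    NR > 1 { printf "%s", d }    # delimiter before every line but the first
           { printf "%s", $0 }
    END    { if (NR) print "" }  # finish with a single newline
  '
}

# Usage: ls | join_lines ", "
```

Note that awk interprets backslash escapes in values passed with -v, so a delimiter containing a backslash would need extra care.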
Don't reinvent the wheel.
ls -m
It does exactly that.
just bash
mystring=$(printf "%s|" *)
echo ${mystring%|}
This command is for the PERL fans :
ls -1 | perl -l40pe0
Here 40 is the octal ascii code for space.
-p will process line by line and print
-l will take care of replacing the trailing \n with the ascii character we provide.
-e is to inform PERL we are doing command line execution.
0 means that there is actually no command to execute.
perl -e0 is same as perl -e ' '
To avoid potential newline confusion for tr we could add the -b flag to ls:
ls -1b | tr '\n' ';'
It looks like the answers already exist.
If you want
a, b, c format, use ls -m ( Tulains Córdova’s answer)
Or if you want a b c format, use ls | xargs (simplified version of Chris J’s answer)
Or if you want any other delimiter like |, use ls | paste -sd'|' (application of Artem’s answer)
The sed way,
sed -e ':a; N; $!ba; s/\n/,/g'
# :a # label called 'a'
# N # append next line into Pattern Space (see info sed)
# $!ba # if this is not (!) the last line ($), branch (b) to label a; on the last line, fall through and leave the loop
# s/\n/,/g # any substitution you want
Note:
This is linear in complexity, substituting only once after all lines are appended into sed's Pattern Space.
@AnandRajaseka's answer, and some other similar answers, such as here, are O(n²), because sed has to run the substitution every time a new line is appended to the Pattern Space.
To compare,
seq 1 100000 | sed ':a; N; $!ba; s/\n/,/g' | head -c 80
# linear, in less than 0.1s
seq 1 100000 | sed ':a; /$/N; s/\n/,/; ta' | head -c 80
# quadratic, hung
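For a quick check on small input, the linear version can be wrapped as below. The function name join_with_sed is made up for this sketch, and the script is split across several -e options because some sed implementations do not accept a semicolon directly after a label.

```shell
# Hypothetical helper: join stdin lines with commas using the slurp-then-
# substitute approach. The whole input is held in the pattern space.
join_with_sed() {
  sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g'
}

printf '1\n2\n3\n' | join_with_sed
# 1,2,3
```

Because the entire input is collected before the single substitution runs, memory use grows with the input size.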
sed -e :a -e '/$/N; s/\n/\\n/; ta' [filename]
Explanation:
-e - denotes a command to be executed
:a - is a label
/$/N - defines the scope of the match for the current and the (N)ext line
s/\n/\\n/; - replaces all EOL with \n
ta; - goto label a if the match is successful
Taken from my blog.
If your version of xargs supports the -d flag, then this should work:
ls | xargs -d, -L 1 echo
-d is the delimiter flag
If you do not have -d, then you can try the following
ls | xargs -I {} echo {}, | xargs echo
The first xargs allows you to specify your delimiter which is a comma in this example.
ls produces one column output when connected to a pipe, so the -1 is redundant.
Here's another perl answer using the builtin join function which doesn't leave a trailing delimiter:
ls | perl -F'\n' -0777 -anE 'say join ",", @F'
The obscure -0777 makes perl read all the input before running the program.
sed alternative that doesn't leave a trailing delimiter
ls | sed '$!s/$/,/' | tr -d '\n'
The Python answer above is interesting, but the language itself can make the output look nice:
ls -1 | python -c "import sys; print(sys.stdin.read().splitlines())"
You can use:
ls -1 | perl -pe 's/\n$/some_delimiter/'
If Python3 is your cup of tea, you can do this (but please explain why you would?):
ls -1 | python -c "import sys; print(','.join(sys.stdin.read().splitlines()))"
ls has the option -m to delimit the output with ", " a comma and a space.
ls -m | tr -d ' ' | tr ',' ';'
Piping this result to tr to remove the spaces lets you pipe it to tr again to replace the delimiter.
In my example I replace the delimiter , with the delimiter ;
Replace ; with whatever single-character delimiter you prefer; tr maps characters one to one, so it cannot produce a multi-character delimiter.
You can use chomp to merge multiple lines into a single line:
perl -e 'while (<>) { if (/\$/ ) { chomp; } print ;}' bad0 >test
Put the line-break condition in the if statement. It can be a special character or any delimiter.
Quick Perl version with trailing slash handling:
ls -1 | perl -E 'say join ", ", map {chomp; $_} <>'
To explain:
perl -E: run Perl with optional features enabled (say, ...)
say: print with a trailing newline
join ", ", ARRAY_HERE: join an array with ", "
map {chomp; $_} ROWS: strip the trailing newline from each line and return the result
<>: stdin; each line is a ROW, and combined with map it produces a list of ROWs