Bash sort files by filename containing year and abbreviated month [closed] - linux

I have files for many years and all months, with example names: dir1/dir2/dir3/file_name_2017_v_2017.Jan.exp.var.txt, dir1/dir2/dir3/file_name_2017_v_2017.Feb.exp.var.txt, …, and dir1/dir2/dir3/file_name_2017_v_2017.Dec.exp.var.txt.
There is a script executing a one line command to store a list of files in an array.
ls dir1/dir2/dir3/file_name_2017_v_*.exp.var.txt
This works, however they are out of order by month. I would like them to be sorted by YYYY.MMM. I have tried various sort commands using -M to lastly sort by month, however nothing is working. What am I missing to sort these files? I prefer a one-line command to sort these.
Edit 1:
Using ls *.txt | sort --field-separator='.' -k 1,2M -r reverses the year order and the alphabetical order of the months. Removing the -r puts the years in chronological order, but the months are still in alphabetical order. This is not what I want; I want the files in chronological order.

Try this command:
ls dir1/dir2/dir3/file_name_2017_v_*.exp.var.txt | sort -t '.' -k 1.33,1.36n -k 2,2M
Or use _ as the field-separator:
ls dir1/dir2/dir3/file_name_2017_v_*.exp.var.txt | sort -t '_' -k 5.1,5.4n -k 5.6,5.8M
If the years before and after the v can differ, add another -k so that both are used as sort keys:
ls dir1/dir2/dir3/file_name_*_v_*.exp.var.txt | sort -t '_' -k 3.1,3.4n -k 5.1,5.4n -k 5.6,5.8M
Example(Update):
$ mkdir -p dir1/dir2/dir3
$ touch dir1/dir2/dir3/file_name_2017_v_201{5..7}.{Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec}.exp.var.txt
$ ls dir1/dir2/dir3/file_name_2017_v_*.exp.var.txt | sort -t '.' -k 1.33,1.36n -k 2,2M
$ ls dir1/dir2/dir3/file_name_2017_v_*.exp.var.txt | sort -t '_' -k 5.1,5.4n -k 5.6,5.8M
dir1/dir2/dir3/file_name_2017_v_2015.Jan.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Feb.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Mar.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Apr.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.May.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Jun.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Jul.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Aug.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Sep.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Oct.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Nov.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Dec.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2016.Jan.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2016.Feb.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2016.Mar.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2016.Apr.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2016.May.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2016.Jun.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2016.Jul.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2016.Aug.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2016.Sep.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2016.Oct.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2016.Nov.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2016.Dec.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2017.Jan.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2017.Feb.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2017.Mar.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2017.Apr.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2017.May.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2017.Jun.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2017.Jul.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2017.Aug.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2017.Sep.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2017.Oct.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2017.Nov.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2017.Dec.exp.var.txt
P.S. You need to count the start and end character positions of the year within the field, e.g. it's 1.33,1.36n in your example.
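To check the key positions without touching real data, the commands above can be replayed in a scratch directory. This is only a sketch: the scratch directory, the four sample year/month pairs and the variable names are assumptions made for illustration, and LC_ALL=C is set so that sort -M recognises the English month abbreviations. The field numbers only hold as long as the relative path contains no additional underscores.

```shell
#!/bin/sh
# Sketch: replay the "_"-separated sort on a few scratch files.
export LC_ALL=C                      # assumption: English month names for sort -M
scratch=$(mktemp -d)                 # hypothetical scratch location
mkdir -p "$scratch/dir1/dir2/dir3"
cd "$scratch" || exit 1

# Create empty sample files in deliberately shuffled order.
for ym in 2017.Feb 2016.Dec 2017.Jan 2016.Apr; do
  : > "dir1/dir2/dir3/file_name_2017_v_${ym}.exp.var.txt"
done

# Field 5 (split on "_") is "YYYY.Mon.exp.var.txt":
# characters 1-4 are the year, characters 6-8 the month abbreviation.
sorted_files=$(
  ls dir1/dir2/dir3/file_name_2017_v_*.exp.var.txt |
    sort -t '_' -k 5.1,5.4n -k 5.6,5.8M
)
printf '%s\n' "$sorted_files"
```

In bash 4 or later, mapfile -t files <<< "$sorted_files" would then load the sorted names into an array.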

Just for fun, this one-liner sorts independently of the directory names and, with a bit of work, also independently of the file names.
First, add the year and month as fields at the beginning of each record and sort by them:
find dir1/ -name '*.exp.var.txt' | sed -re 's/^.*_v_(201[5-7])\.([A-Za-z]{3})\.exp\.var\.txt$/\1 \U\2 \E&/' | LC_TIME=en_US sort -k 1n -k 2M
This will return
2015 JAN dir1/dir2/dir3/file_name_2017_v_2015.Jan.exp.var.txt
2015 FEB dir1/dir2/dir3/file_name_2017_v_2015.Feb.exp.var.txt
2015 MAR dir1/dir2/dir3/file_name_2017_v_2015.Mar.exp.var.txt
2015 APR dir1/dir2/dir3/file_name_2017_v_2015.Apr.exp.var.txt
2015 MAY dir1/dir2/dir3/file_name_2017_v_2015.May.exp.var.txt
Then, just print the needed field:
find dir1/ -name '*.exp.var.txt' | \
sed -re 's/^.*_v_(201[5-7])\.([A-Za-z]{3})\.exp\.var\.txt$/\1 \U\2 \E&/' | \
LC_TIME=en_US sort -k 1n -k 2M | \
gawk '{ print $3 }'
Result:
dir1/dir2/dir3/file_name_2017_v_2015.Jan.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Feb.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Mar.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Apr.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.May.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Jun.exp.var.txt
dir1/dir2/dir3/file_name_2017_v_2015.Jul.exp.var.txt
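The same decorate, sort, undecorate idea can also be sketched without the GNU-only \U escape and without relying on sort -M, by letting awk turn the month abbreviation into a number. This is a swapped-in variant, not the command from the answer: the function name sort_by_year_month and the sample paths are made up for illustration, and it assumes paths without spaces that end in YYYY.Mon.exp.var.txt.

```shell
#!/bin/sh
# Sketch: decorate each path with "YYYY MM ", sort numerically, strip the prefix.
export LC_ALL=C

sort_by_year_month() {
  awk '{
    # Locate the trailing "YYYY.Mon.exp.var.txt" part of the path.
    if (match($0, /[0-9][0-9][0-9][0-9]\.[A-Z][a-z][a-z]\.exp\.var\.txt$/)) {
      year = substr($0, RSTART, 4)
      mon  = substr($0, RSTART + 5, 3)
      # Position of the abbreviation in the lookup string gives the month number.
      num  = (index("JanFebMarAprMayJunJulAugSepOctNovDec", mon) + 2) / 3
      printf "%s %02d %s\n", year, num, $0
    }
  }' | sort -k 1,1n -k 2,2n | cut -d ' ' -f 3-
}

# Hypothetical sample input, deliberately out of order.
result=$(printf '%s\n' \
  a/file_v_2017.Feb.exp.var.txt \
  b/file_v_2016.Dec.exp.var.txt \
  a/file_v_2017.Jan.exp.var.txt | sort_by_year_month)
printf '%s\n' "$result"
```

On real data the function would be fed by find, for example find dir1/ -name '*.exp.var.txt' | sort_by_year_month.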

Related

Combine number of lines of more files with filename [closed]

Specify a command / command set that displays the number of lines of code in the .c and .h files in the current directory, displaying each file in alphabetical order followed by ":" and the number of lines in the files, and finally the total of the lines of code.
An example that might be displayed would be :
test.c: 202
example.c: 124
example.h: 43
Total: 369
I'd like to find a solution in the shortest form possible. I've experimented many commands like:
find . -name '*.c' -o -name '*.h' | xargs wc -l
== it shows 0 ./path/test.c and the total, but isn't close enough
stat -c "%n:%s" *
== it shows test.c:0, but it shows all file types and doesn't show the number of lines or the total
wc -l *.c *.h | tr ' ' ':'
== it shows 0:test.c and the total, but doesn't search in sub-directories and the order is reversed compared to the problem (filename: number_of_lines).
This one is closer to the answer but I'm out of ideas after searching most commands I saw in similar problems.
This should do it:
wc -l *.c *.h | awk '{print $2 ": " $1}'
Run a subshell in xargs
xargs -n1 sh -c 'printf "%s: %s\n" "$1" "$(wc -l <"$1")"' --
xargs -n1 sh -c 'echo "$1 $(wc -l <"$1")"' --
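The xargs commands above read the file names from standard input, so they need something like find in front of them, and they do not print the total. A possible complete pipeline is sketched below; the scratch directory and the three sample files are invented for the demonstration, and it assumes file names without spaces and a wc that prints the bare count (as GNU wc does when reading standard input).

```shell
#!/bin/sh
# Sketch: "name: lines" for every .c/.h file, sorted by name, plus a total.
export LC_ALL=C
scratch=$(mktemp -d)                 # hypothetical scratch location
cd "$scratch" || exit 1
printf 'int main(void)\n{\n}\n' > test.c                                 # 3 lines
printf 'int f(void);\n' > example.h                                      # 1 line
printf '#include "example.h"\nint f(void) { return 0; }\n' > example.c   # 2 lines

report=$(
  find . -type f \( -name '*.c' -o -name '*.h' \) | sort |
    xargs -n1 sh -c 'printf "%s: %s\n" "${1#./}" "$(wc -l <"$1")"' -- |
    awk -F': ' '{ total += $2; print } END { print "Total: " total }'
)
printf '%s\n' "$report"
```

The awk stage passes every line through unchanged while summing the second field, so the total is computed without a second call to wc.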

Bash descending filename sorting [duplicate]

This question already has answers here:
Sort files numerically in bash
(3 answers)
Closed 8 years ago.
I've been trying to sort my filenames with commands similar to ls -1 | sort -n -t "_" -k1 but just can't get it to work. Please help.
I have:
10_filename
11_filename
12_filename
1_filename
2_filename
I want to get:
1_filename
2_filename
...
10_filename
11_filename
Try the following; it should solve the issue:
ls -1v
The -v option sorts naturally by the (version) numbers embedded in the names.
Try this,
ls -1 *\_filename | sort -n
or
ls -1 | sort -n
ls -1 | sort -t '_' +1 +0n
The following is a bit heavy, but it works with a simple string sort even when sort does not accept field ordering:
ls -1 | sed 's/^\([0-9]*\)_\(.*\)/000\1_\1_\2/;s/^0*\([0-9]\{3\}\)/\1/;s/\([0-9]\{1,\}_[0-9]\{1,\}_\)\(.*\)/\2_\1/' | sort -n | sed 's/\(.*\)_[0-9]\{1,\}_\([0-9]\{1,\}\)_$/\2_\1/'
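For comparison, here is a small sketch that restricts the numeric key to the first "_"-separated field using the -k syntax. The names are the ones from the question, fed through printf instead of ls so that the example runs anywhere; the variable name sorted is only illustrative.

```shell
#!/bin/sh
# Sketch: numeric sort on the prefix before the first underscore.
export LC_ALL=C
sorted=$(
  printf '%s\n' 10_filename 11_filename 12_filename 1_filename 2_filename |
    sort -t '_' -k 1,1n
)
printf '%s\n' "$sorted"
```

Writing the key as 1,1 stops it at the end of the first field, so the rest of the name plays no part in the numeric comparison.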

Linux sorting "ls -al" output by date

I want to sort the output of the "ls -al" command according to date. I am able to easily do that for one column with command:
$ ls -al | sort -k6 -M -r
But how can I do it for both columns 6 and 7 simultaneously? The command:
$ ls -al | sort -k6 -M -r | sort -k7 -r
prints out results I do not understand.
The final goal would be to see all the files from the most recently modified to the oldest (or vice versa).
Here is the attached example for the data to be sorted and the command used:
With sort, if you specify -k6, the key starts at field 6 and extends to the end of the line. To truncate it and only use field 6, you should specify -k6,6. To sort on multiple keys, just specify -k multiple times. Also, you need to apply the M modifier only to the month, and the n modifier to the day. Note that a key carrying its own ordering modifiers does not inherit a global -r, so the r has to be attached to each key as well. So:
ls -al | sort -k 6,6Mr -k 7,7nr
Do note Charles' comment about abusing ls though. Its output cannot be reliably parsed. A good demonstration of this is that the image you've posted shows the month/date in columns 4 and 5, so it's not clear why you want to sort on columns 6 and 7.
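The effect of bounded, per-key modifiers can be tried on canned text instead of live ls output. In the sketch below the four listing lines are invented sample data with the month in field 6 and the day in field 7, and the r modifier is attached to each key because keys that carry their own modifiers do not pick up a global -r.

```shell
#!/bin/sh
# Sketch: month (field 6) and day (field 7) as separate, reversed keys.
export LC_ALL=C                      # assumption: English month names for the M modifier
listing='-rw-r--r-- 1 user group 10 Mar  9 file_a
-rw-r--r-- 1 user group 10 Jan 15 file_b
-rw-r--r-- 1 user group 10 Mar  2 file_c
-rw-r--r-- 1 user group 10 Feb 28 file_d'

newest_first=$(
  printf '%s\n' "$listing" |
    sort -k 6,6Mr -k 7,7nr |
    awk '{ print $8 }'               # keep only the file name for readability
)
printf '%s\n' "$newest_first"
```

This only orders by month and day; it knows nothing about years or times, which is one more reason to prefer ls -t when the goal is modification order.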
The final goal would be to see all the files from the most recently modified
ls -t
or (for reverse, most recent at bottom):
ls -tr
The ls man page describes this in more detail, and lists other options.
You could try ls -lsa -it -r

Scripting with unix to get the processes run by users [closed]

Suppose two users (UserA and UserB) are logged in to the system right now. How do I find the processes run by those two users? The trick is that the script must run as an unattended batch job, without any keyboard input other than being invoked.
I know the first part of the script would be
who | awk '{print $1}'
the output of this would be
UserA
UserB
What I would like to know is, how can I use this output and shove it with some ps command automatically and get the required result.
I finally figured out the one-liner I was searching for, with the help of the other answers (updated for case where no users logged in - see comments).
ps -fU "`who | cut -d' ' -f1 | uniq | xargs echo`" 2> /dev/null
The thing inside the backticks is executed and "inserted at the spot". It works as follows:
who : lists the logged-in users, one line per session
cut -d' ' : splits each line into fields, using ' ' as the separator
-f1 : and returns only field 1 (the user name)
uniq : drops adjacent duplicate entries
xargs echo : passes all the values piped in to a single echo, which joins them on one line separated by spaces (this strips the \n)
2> /dev/null : redirects any error messages (sent to 2: stderr) to /dev/null, i.e. "dump them, never to be seen again"
The output of all that is
user1 user2 user3
...however many there are. You then call ps with the -fU flags, requesting all processes for these users in full format (you can of course change these flags to get the formatting you want; just keep the -U right before the backquoted command):
ps -fU "user1 user2 user3"
Get a list of users (using who), save it to a file, then list all processes and grep them against the file you just created:
tempfile=/tmp/wholist.$$
who | cut -f1 -d' ' | sort -u > "$tempfile"
ps -ef | grep -f "$tempfile"
rm "$tempfile"
LOGGED_IN=$( who | awk '{print $1}' | sort -u | xargs echo )
[ "$LOGGED_IN" ] && ps -fU "$LOGGED_IN"
The standard switch -U will restrict output to only those processes whose real user ID corresponds to any given as its argument. (E.g., ps -f -U "UserA UserB".)
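The user-list step can be checked in isolation before it is handed to ps. The sketch below wraps it in a small function and feeds it canned who-style lines; the function name unique_users, the terminal names and the timestamps are all invented for the demonstration.

```shell
#!/bin/sh
# Sketch: reduce who-style output to a single line of unique user names.
export LC_ALL=C

unique_users() {
  # First column is the user name; sort -u removes repeats from multiple
  # sessions, and xargs echo joins the names on one line.
  awk '{ print $1 }' | sort -u | xargs echo
}

# Hypothetical who output: UserA has two sessions.
sample_who='UserA pts/0 2024-01-01 10:00
UserB pts/1 2024-01-01 10:05
UserA pts/2 2024-01-01 10:07'

logged_in=$(printf '%s\n' "$sample_who" | unique_users)
printf '%s\n' "$logged_in"
```

On a live system the function would be fed by who itself, with the result passed to ps as a single quoted argument: ps -f -U "$(who | unique_users)".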
Not sure if I'm understanding your question correctly, but you can pipe the output of ps through grep to get the processes run by a particular user, like so:
ps -ef | grep '^xxxxx '
where xxxxx is the user.

Sort logs by date field in bash

Suppose we have:
126 Mar 8 07:45:09 nod1 /sbin/ccccilio[12712]: INFO: sadasdasdas
2 Mar 9 08:16:22 nod1 /sbin/zzzzo[12712]: sadsdasdas
1 Mar 8 17:20:01 nod1 /usr/sbin/cron[1826]: asdasdas
4 Mar 9 06:24:01 nod1 /USR/SBIN/CRON[27199]: aaaasdsd
1 Mar 9 06:24:01 nod1 /USR/SBIN/CRON[27201]: aaadas
I would like to sort this output by date and time key.
Thank you very much.
Martin
For GNU sort: sort -k2M -k3n -k4
-k2M sorts by second column by month (this way "March" comes before "April")
-k3n sorts by third column in numeric mode (so that " 9" comes before "10")
-k4 sorts by the fourth column.
See more details in the manual.
A little off-topic, but anyway: this is only useful when working with file trees.
ls -l -r --sort=time
from this you could create a one-liner which for example deletes the oldest backup in town.
ls -l -r --sort=time | grep backup | head -n1 | while read -r line; do oldbackup=$(echo "$line" | awk '{print $8}'); rm "$oldbackup"; done
Days need a numeric (not lexical) sort, so it should be sort -s -k 2M -k 3n -k 4,4
See more details here.
You can use the sort command:
cat $logfile | sort -M -k 2
That means: Sort by month (-M) beginning from second column (-k 2).
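Putting the suggestions together, the sample log from the question can be sorted with one bounded key per column. This is a sketch rather than a definitive recipe: the log lines are copied from the question, the variable names are illustrative, LC_ALL=C is an assumption so that the M modifier recognises "Mar", and -s keeps lines with identical timestamps in their input order.

```shell
#!/bin/sh
# Sketch: month (field 2), day (field 3, numeric), then time (field 4).
export LC_ALL=C
log='126 Mar 8 07:45:09 nod1 /sbin/ccccilio[12712]: INFO: sadasdasdas
2 Mar 9 08:16:22 nod1 /sbin/zzzzo[12712]: sadsdasdas
1 Mar 8 17:20:01 nod1 /usr/sbin/cron[1826]: asdasdas
4 Mar 9 06:24:01 nod1 /USR/SBIN/CRON[27199]: aaaasdsd
1 Mar 9 06:24:01 nod1 /USR/SBIN/CRON[27201]: aaadas'

sorted_log=$(printf '%s\n' "$log" | sort -s -k 2,2M -k 3,3n -k 4,4)
printf '%s\n' "$sorted_log"
```

The HH:MM:SS field is zero-padded, so a plain string comparison on field 4 is enough to order the times within a day.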
