How to zip particular folders in Linux using shell

Let's say we have a directory structure like:
A
|---B
| |---C
| | |--f1.txt
| |
| |---D
| | |--f2.txt
|
|---E
| |---f3.txt
| |
|
|---F
| |---f4.txt
Now, how do I create a file BE.zip that excludes the F directory, i.e. when I unzip BE.zip the result should be:
|---B
| |---C
| | |--f1.txt
| |
| |---D
| | |--f2.txt
|
|---E
| |---f3.txt

OK, got it. Run this from inside A:
zip -r BE.zip B/ E/

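You can also use this; note that -r is still needed to recurse into subdirectories such as B/C, and that the * glob skips hidden files directly under B and E:
zip -r BE.zip B/* E/*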

Related

Recursively add prefix to file names and moving these files from all subdirectories to a specified directory (linux environment)

I'd like to rename the files with the unique sample name (which is the title of the subdirectory 2 levels above the files).
Here is a snippet of the directory structure:
|-RNAdata
| |-Sample1
| | |-cufflinks
| | | |-genes.fpkm_tracking
| | | |-skipped.gtf
| | | |-isoforms.fpkm_tracking
| | | |-transcripts.gtf
| |-Sample2
| | |-cufflinks
| | | |-genes.fpkm_tracking
| | | |-skipped.gtf
| | | |-isoforms.fpkm_tracking
| | | |-transcripts.gtf
There are about 1000 files like this. I'd like to be able to see something like this:
|-RNAdata
| |-Sample1_genes.fpkm_tracking
| |-Sample1_skipped.gtf
| |-Sample1_isoforms.fpkm_tracking
| |-Sample1_transcripts.gtf
| |-Sample2_genes.fpkm_tracking
| |-Sample2_skipped.gtf
| |-Sample2_isoforms.fpkm_tracking
| |-Sample2_transcripts.gtf
I'm working in a Linux environment and have only very basic knowledge of file management in the shell. Any advice or suggestions for resources on this type of work would be great! I'd like to learn this so I can be more independent. Thank you!
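One possible approach, as a minimal sketch rather than a definitive answer: loop over every file two levels below the top directory and move it up, prefixed with its sample directory name. The function name flatten_samples is hypothetical, and the sketch assumes the layout shown above (RNAdata/<sample>/cufflinks/<file>). It skips any file whose target name already exists and leaves the emptied Sample directories in place.

```shell
#!/bin/sh
# Sketch: move RNAdata/<sample>/cufflinks/<file> to RNAdata/<sample>_<file>.
# flatten_samples is a hypothetical helper name; $1 is the top-level directory.
flatten_samples() {
    root=$1
    for path in "$root"/*/cufflinks/*; do
        [ -f "$path" ] || continue           # skip if the glob matched nothing
        file=${path##*/}                     # e.g. genes.fpkm_tracking
        sample_dir=${path%/cufflinks/*}      # e.g. RNAdata/Sample1
        sample=${sample_dir##*/}             # e.g. Sample1
        target=$root/${sample}_$file
        [ -e "$target" ] && continue         # never overwrite an existing file
        mv -- "$path" "$target"
    done
}
```

It would be called as flatten_samples RNAdata from the directory that contains RNAdata. Trying it on a copy of the data first is a sensible precaution.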

Moving a split Vim window to the other half of the screen

Let's say we have 3 buffers (A, B, C) open in Vim arranged as follows
-----------------------------------------
| | |
| | |
| | |
| A | |
| | |
| | |
|------------------| B |
| | |
| | |
| C | |
| | |
| | |
-----------------------------------------
and we want to rearrange it as
-----------------------------------------
| | |
| | |
| | |
| | B |
| | |
| | |
| A |--------------------|
| | |
| | |
| | C |
| | |
| | |
-----------------------------------------
I know I can do this by closing C and reopening it after splitting B. Is there a simple way to do this where I don't have to close buffers and I can rearrange the windows directly?
You wouldn't "close" the buffer C, only the window that displays it.
Vim has dedicated normal mode commands for:
switching a window and the next one in a row or column (CTRL-W x),
rotating the whole window layout (CTRL-W r),
pushing a window to the far top, far right, far bottom, and far left (CTRL-W K, CTRL-W L, CTRL-W J, CTRL-W H),
but it doesn't have one for moving a window to an arbitrary point so, assuming the window you want to move has the focus, the command should look like this:
:q|winc w|sp c
which is not too shabby. You might be able to find a plugin that provides the level of control you are after on https://www.vim.org.

Excel if statement to display if anything is in the row

I don't really know how to search for this question or an appropriate title, so I hope that this will make sense.
I'm trying to construct an Excel spreadsheet to keep track of which functions of a piece of software currently have tests written for them. The spreadsheet looks something like the one below, where A-F are placeholders for the tests and 1-5 are placeholders for functions.
| | A | B | C | D | E | F |
|:-:|---|---|---|---|---|---|
| 1 | X | | | | | X |
| 2 | | | | | | |
| 3 | | X | | | | |
| 4 | | | X | | | |
| 5 | | | | X | X | |
I would like to have another column at the end that would do something like this:
| | A | B | C | D | E | F | Tested? |
|:-:|---|---|---|---|---|---|---------|
| 1 | X | | | | | X | Yes |
| 2 | | | | | | | No |
| 3 | | X | | | | | Yes |
| 4 | | | X | | | | Yes |
| 5 | | | | X | X | | Yes |
where the final column is an IF statement that will display a conditional string based on whether there are any entries in the row. I know that Excel's IF statements work something like this =IF(A1=10,"YES","NO") but I can't think how I would construct an IF statement that would print YES or NO based on whether there are any entries at all in the row.
EDIT: To add a little more detail. I've thought about constructing an IF statement like this: =IF(SUM(C3:AI3)>0, "YES", "NO") and this works essentially if I use 1s or 0s instead of X or O but I'd rather use the latter. Or really I'd just rather use strings instead of integers.
You can use the following formula:
=IF(COUNTA(A1:F1)>0,"Yes","No")
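You're looking for the ISBLANK function. It tests one cell at a time, so for a range the per-cell results have to be combined, and a non-blank cell is what should produce "Yes".
Your solution should be something like this:
=IF(SUMPRODUCT(--NOT(ISBLANK(A1:F1)))>0,"Yes","No")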

How to join two files with awk/sed/grep/bash similar to SQL JOIN

how to join two files with awk/sed/grep/bash similar to SQL JOIN?
I have a file that looks like this (image not preserved),
and another one that looks like this (image not preserved).
i've also a text version of the image above:
+----------+------------------+------+------------+----+---------------------------------------------------+---------------------------------------------------+-----+-----+-----+------+-------+-------+--------------+------------+--+--+---+---+----+--+---+---+----+------------+------------+------------+------------+
| 21548598 | DSND001906102.2 | 0107 | 001906102 | 02 | FROZEN / O.S.T. | FROZEN / O.S.T. | 001 | 024 | | | 11.49 | 13.95 | 050087295745 | 11/25/2013 | | | N | N | 30 | | 1 | E | 1 | 10/07/2013 | 02/27/2014 | 10/07/2013 | 10/07/2013 |
| 25584998 | WD1194190DVD | 0819 | 1194190 | 18 | FROZEN / (WS DOL DTS) | FROZEN / (WS DOL DTS) | 050 | 110 | | G | 21.25 | 29.99 | 786936838961 | 03/18/2014 | | | N | N | 0 | | 1 | A | 2 | 12/20/2013 | 03/13/2014 | 12/20/2013 | 12/20/2013 |
| 25812794 | WHV1000292717BR | 0526 | 1000292717 | BR | GRAVITY / (UVDC) | GRAVITY / (UVDC) | 050 | 093 | | PG13 | 29.49 | 35.99 | 883929244577 | 02/25/2014 | | | N | N | 30 | | 1 | E | 3 | 01/16/2014 | 02/11/2014 | 01/16/2014 | 01/16/2014 |
| 24475594 | SNY303251.2 | 0085 | 303251 | 02 | BEYONCE | BEYONCE | 001 | 004 | | | 14.99 | 17.97 | 888430325128 | 12/20/2013 | | | N | N | 30 | | 1 | A | 4 | 12/19/2013 | 01/02/2014 | 12/19/2013 | 12/19/2013 |
| 25812787 | WHV1000284958DVD | 0526 | 1000284958 | 18 | GRAVITY (2PC) / (UVDC SPEC 2PK) | GRAVITY (2PC) / (UVDC SPEC 2PK) | 050 | 093 | | PG13 | 21.25 | 28.98 | 883929242528 | 02/25/2014 | | | N | N | 30 | | 1 | E | 5 | 01/16/2014 | 02/11/2014 | 01/16/2014 | 01/16/2014 |
| 21425462 | PBSDMST64400DVD | E349 | 64400 | 18 | MASTERPIECE CLASSIC: DOWNTON ABBEY SEASON 4 (3PC) | MASTERPIECE CLASSIC: DOWNTON ABBEY SEASON 4 (3PC) | 050 | 095 | 094 | | 30.49 | 49.99 | 841887019705 | 01/28/2014 | | | N | N | 30 | | 1 | A | 6 | 09/06/2013 | 01/15/2014 | 09/06/2013 | 09/06/2013 |
| 25584974 | WD1194170BR | 0819 | 1194170 | BR | FROZEN (2PC) (W/DVD) / (WS AC3 DTS 2PK DIGC) | FROZEN (2PC) (W/DVD) / (WS AC3 DTS 2PK DIGC) | 050 | 110 | | G | 27.75 | 39.99 | 786936838923 | 03/18/2014 | | | N | N | 0 | | 2 | A | 7 | 12/20/2013 | 03/13/2014 | 01/15/2014 | 01/15/2014 |
| 21388262 | HBO1000394029DVD | 0203 | 1000394029 | 18 | GAME OF THRONES: SEASON 3 | GAME OF THRONES: SEASON 3 | 050 | 095 | 093 | | 47.99 | 59.98 | 883929330713 | 02/18/2014 | | | N | N | 30 | | 1 | E | 8 | 08/29/2013 | 02/28/2014 | 08/29/2013 | 08/29/2013 |
| 25688450 | WD11955700DVD | 0819 | 11955700 | 18 | THOR: THE DARK WORLD / (AC3 DOL) | THOR: THE DARK WORLD / (AC3 DOL) | 050 | 093 | | PG13 | 21.25 | 29.99 | 786936839500 | 02/25/2014 | | | N | N | 30 | | 1 | A | 9 | 12/24/2013 | 02/20/2014 | 12/24/2013 | 12/24/2013 |
| 23061316 | PRT359054DVD | 0818 | 359054 | 18 | JACKASS PRESENTS: BAD GRANDPA / (WS DUB SUB AC3) | JACKASS PRESENTS: BAD GRANDPA / (WS DUB SUB AC3) | 050 | 110 | | R | 21.75 | 29.98 | 097363590545 | 01/28/2014 | | | N | N | 30 | | 1 | E | 10 | 12/06/2013 | 03/12/2014 | 12/06/2013 | 12/06/2013 |
| 21548611 | DSND001942202.2 | 0107 | 001942202 | 02 | FROZEN / O.S.T. (BONUS CD) (DLX) | FROZEN / O.S.T. (BONUS CD) (DLX) | 001 | 024 | | | 14.09 | 19.99 | 050087299439 | 11/25/2013 | | | N | N | 30 | | 1 | E | 11 | 10/07/2013 | 02/06/2014 | 10/07/2013 | 10/07/2013 |
+----------+------------------+------+------------+----+---------------------------------------------------+---------------------------------------------------+-----+-----+-----+------+-------+-------+--------------+------------+--+--+---+---+----+--+---+---+----+------------+------------+------------+------------+
The 2nd column from the first file can be joined to the 14th column of the second file!
here's what i've been trying to do:
join <(sort awk -F"\t" '{print $14,$12}' aecprda12.tab) <(sort awk -F"\t" '{print $2,$1}' output1.csv)
but i am getting these errors:
$ join <(sort awk -F"\t" '{print $14,$12}' aecprda12.tab) <(sort awk -F"\t" '{print $2,$1}' output1.csv)
sort: unknown option -- F
Try `sort --help' for more information.
sort: unknown option -- F
Try `sort --help' for more information.
-700476409 [waitproc] -bash 10336 sig_send: error sending signal 20 to pid 10336, pipe handle 0x84, Win32 error 109
the output i would like would be something like this:
+-------+-------+---------------+
| 12.99 | 14.77 | 3383510002151 |
| 13.97 | 17.96 | 3383510002175 |
| 13.2 | 13 | 3383510002267 |
| 13.74 | 14.19 | 3399240165349 |
| 9.43 | 9.52 | 3399240165363 |
| 12.99 | 4.97 | 3399240165479 |
| 7.16 | 7.48 | 3399240165677 |
| 11.24 | 9.43 | 4011550620286 |
| 13.86 | 13.43 | 4260182980316 |
| 13.98 | 12.99 | 4260182980507 |
| 10.97 | 13.97 | 4260182980514 |
| 11.96 | 13.2 | 4260182980545 |
| 15.88 | 13.74 | 4260182980552 |
+-------+-------+---------------+
what am i doing wrong?
You can do all the work in join and sort
join -1 2 -2 14 -t $'\t' -o 2.12,1.1,0 \
<( sort -t $'\t' -k 2,2 output1.csv ) \
<( sort -t $'\t' -k 14,14 aecprda12.tab )
Notes:
$'\t' is a bash ANSI-C quoted string which is a tab character: neither join nor sort seem to recognize the 2-character string "\t" as a tab
-k col,col sorts the file on the specified column
join has several options to control how it works; see the join(1) man page.
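To see how the join options fit together, here is a small sketch on made-up data. The file names left.tsv and right.tsv, their contents, and the column positions (key in column 2 and column 3) are assumptions for illustration only. It avoids the bash-only $'\t' and process substitution by storing a tab in a variable and sorting into temporary files, so it should also run under plain sh.

```shell
#!/bin/sh
# Hypothetical sample data: left.tsv has the key in column 2,
# right.tsv has the key in column 3.
tab=$(printf '\t')
work=$(mktemp -d)
printf '9.99\tK2\n7.50\tK1\n' > "$work/left.tsv"
printf 'x\t1.25\tK1\ny\t2.50\tK3\nz\t3.75\tK2\n' > "$work/right.tsv"

# join needs both inputs sorted on their join field.
sort -t "$tab" -k 2,2 "$work/left.tsv"  > "$work/left.sorted"
sort -t "$tab" -k 3,3 "$work/right.tsv" > "$work/right.sorted"

# -1 2 and -2 3 name the join field of each file; -o picks the output columns:
# 2.2 = column 2 of the second file, 1.1 = column 1 of the first, 0 = the key.
joined=$(join -t "$tab" -1 2 -2 3 -o 2.2,1.1,0 "$work/left.sorted" "$work/right.sorted")
printf '%s\n' "$joined"
rm -r "$work"
```

Only keys present in both files (K1 and K2 here) appear in the output, which is the behaviour of an SQL inner join.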
sort awk -F...
is not a valid command; it tells sort to sort a file named awk and, as the error message says, sort has no -F option. The syntax you are looking for is
awk -F ... | sort
However, you might be better off doing the joining in Awk directly.
awk -F"\t" 'NR==FNR{k[$14]=$12; next}
($2 in k) { print $2, $1, k[$2] }' aecprda12.tab output1.csv
I am assuming that you don't know whether every item in the first file has a corresponding item in the second file - and that you want only "matching" items. There is indeed a good way to do this in awk. Create the following script (as a text file, call it myJoin.txt):
BEGIN {
    FS="\t"
}
# NR==FNR holds only while the total number of records read
# equals the number of records read in the current file,
# in other words: while reading the first file
NR==FNR {
    a[$2]=$1 # store $1 from the first file under its key $2
    next
}
# every record that reaches this block comes from the second file,
# since the first file's records were consumed above
{
    gsub(/ /,"",$14) # remove spaces (e.g. padding) from the key
    if ($14 in a) { # was the value in $14 seen in the first file?
        # print out the three values you care about:
        print $12 " " a[$14] " " $14
    }
}
Now execute this with
awk -f myJoin.txt file1 file2
Seems to work for me...
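The two-file awk idiom used above can be tried on a tiny example first. This is only a sketch: first.tsv and second.tsv and their contents are made up, with the key in column 2 of the first file and column 3 of the second. It uses the in operator for the lookup, which reports a match even when the stored value is 0 or empty.

```shell
#!/bin/sh
# Hypothetical sample data; only keys K1 and K2 occur in both files.
work=$(mktemp -d)
printf '9.99\tK2\n7.50\tK1\n' > "$work/first.tsv"
printf 'x\t1.25\tK1\ny\t2.50\tK3\nz\t3.75\tK2\n' > "$work/second.tsv"

matched=$(awk -F '\t' -v OFS='\t' '
    NR == FNR { seen[$2] = $1; next }        # first file: remember column 1 by key
    ($3 in seen) { print $2, seen[$3], $3 }  # second file: print only matching keys
' "$work/first.tsv" "$work/second.tsv")
printf '%s\n' "$matched"
rm -r "$work"
```

Unlike join, this needs no sorting, and the output follows the order of the second file. The first file is held in memory, so the smaller file is usually the better choice for that role.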

Performing file type counting in all directories

I have a bash script that gives me counts of files in all of the directories recursively that were edited in the last 45 days
find . -type f -mtime -45| rev | cut -d . -f1 | rev | sort | uniq -ic | sort -rn
I have a directory called
\parent
and in parent I have:
\parent\a
\parent\b
\parent\c
I would run the above script once on folder a, once on b and once on c.
The current output is:
91 xls
85 xlsx
49 doc
46 db
31 docx
24 jpg
22 pub
10 pdf
4 msg
2 xml
2 txt
1 zip
1 thmx
1 htm
1 /ic
I would like to run the script from \parent on all the folders inside \parent and get an output like this:
+-------+------+--------+
| count | ext | folder |
+-------+------+--------+
| 91 | xls | a |
| 85 | xlsx | a |
| 49 | doc | a |
| 46 | db | a |
| 31 | docx | a |
| 24 | jpg | a |
| 22 | pub | a |
| 10 | pdf | a |
| 4 | msg | a |
| 98 | jpg | b |
| 92 | pub | b |
| 62 | pdf | b |
| 2 | xml | b |
| 2 | txt | b |
| 1 | zip | b |
| 1 | thmx | b |
| 1 | htm | b |
| 1 | /ic | b |
| 66 | txt | c |
| 48 | msg | c |
| 44 | xml | c |
| 30 | zip | c |
| 12 | doc | c |
| 6 | db | c |
| 6 | docx | c |
| 3 | jpg | c |
+-------+------+--------+
How can I accomplish this with bash?
Put it into a script, make it executable: chmod +x script.sh and run it with: ./script.sh
#!/bin/sh
find . -type f -mtime -45 2>/dev/null \
| sed 's|^\./\([^/]*\)/|\1/|; s|/.*/|/|; s|/.*.\.| |p; d' \
| sort | uniq -ic \
| sort -b -k2,2 -k1,1rn \
| awk '
BEGIN{
sep = "+-------+------+--------+"
print sep "\n| count | ext | folder |\n" sep
}
{ printf("| %5d | %-4s | %-6s |\n", $1, $3, $2) }
END{ print sep }'
sed 's|^\./\([^/]*\)/|\1/|; s|/.*/|/|; s|/.*.\.| |p; d'
s|^\./\([^/]*\)/|\1/| substitutes ./a/file.xls with a/file.xls.
s|/.*/|/| substitutes b/some/dir/file.mp3 with b/file.mp3.
s|/.*.\.| |p substitutes a/file.xls with a xls; if s///p is successful, it also prints the line to standard output (this skips files without an extension).
d deletes the line (to avoid printing matching (again) or non-matching lines).
sort | uniq -ic counts each group of extension and directory name.
sort -b -k2,2 -k1,1rn sorts first by directory (field 2), small -> large, and then by count (field 1) in reverse order (large -> small) and numerically. -b makes sort(1) ignore blanks (spaces/tabs).
the last awk part pretty prints the output, maybe you want to put this into a separate script.
If you want to see how each pipe filters the results just try to remove each and you will see the output.
Here you can find good tutorials about sh/awk/sed, etc.
http://www.grymoire.com/Unix/
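For comparison, the folder and extension extraction can also be done in awk instead of sed. The following is a sketch under stated assumptions, not a replacement for the script above: count_ext_by_folder is a hypothetical function name, the output is plain "count ext folder" lines without the table borders, and folder names and extensions are assumed to contain no spaces.

```shell
#!/bin/sh
# count_ext_by_folder DIR (hypothetical helper): for each immediate subfolder
# of DIR, count files modified in the last 45 days, grouped by extension.
count_ext_by_folder() {
    ( cd "$1" && find . -type f -mtime -45 -name '*.*' ) |
    awk -F/ '
        NF >= 3 {                          # ./folder/.../name.ext, skip top-level files
            folder = $2
            n = split($NF, parts, ".")     # last dot-separated piece is the extension
            count[parts[n] " " folder]++
        }
        END { for (key in count) print count[key], key }
    ' | sort -k3,3 -k1,1rn                 # by folder, then by count (largest first)
}
```

It would be run as count_ext_by_folder /path/to/parent. Unlike uniq -ic, this sketch treats extensions as case-sensitive, so xls and XLS are counted separately.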
