Perforce - find the most recently updated files in a branch

How can I order files by "most recently modified" or get a list of the most recently modified files in Perforce?

One way to approach this is to start with p4 changes (which orders its output most recent first) and pipe the result into p4 files:
C:\Perforce\test>p4 -ztag -F #=%change% changes -m10 -ssubmitted | p4 -x - files
//stream/main/move/bar#1 - move/add change 276 (text)
//stream/main/move/foo#2 - move/delete change 276 (text)
//stream/main/move/foo#1 - add change 275 (text)
//stream/main/seongchan-test/B/01.txt#3 - move/add change 274 (text)
//stream/main/seongchan-test/B/02.txt#2 - move/delete change 274 (text)
//stream/main/seongchan-test/B/01.txt#2 - move/delete change 273 (text)
//stream/main/seongchan-test/B/legacy_01.txt#1 - move/add change 273 (text)
//stream/main/seongchan-test/main/01.txt#3 - move/add change 272 (text)
//stream/main/seongchan-test/main/02.txt#2 - move/delete change 272 (text)
//stream/main/seongchan-test/main/01.txt#2 - move/delete change 271 (text)
//stream/main/seongchan-test/main/legacy_01.txt#1 - move/add change 271 (text)
//stream/main/seongchan-test/A/01.txt#3 - move/add change 269 (text)
//stream/main/seongchan-test/A/02.txt#2 - move/delete change 269 (text)
//stream/main/seongchan-test/A/01.txt#2 - move/delete change 268 (text)
//stream/main/seongchan-test/A/legacy_01.txt#1 - move/add change 268 (text)
//stream/main/seongchan-test/A/01.txt#1 - branch change 267 (text)
//stream/main/seongchan-test/A/02.txt#1 - branch change 267 (text)
//stream/main/seongchan-test/B/01.txt#1 - branch change 267 (text)
//stream/main/seongchan-test/B/02.txt#1 - branch change 267 (text)
//stream/main/seongchan-test/main/01.txt#1 - add change 266 (text)
//stream/main/seongchan-test/main/02.txt#1 - add change 266 (text)
Note that the file revisions are ordered descending by change, because we got each batch via a p4 files #=CHANGE command.
Another approach, if you just want "recently updated files" for some arbitrary value of "recent", is to do a simple p4 files command with a revision range, e.g. a date range:
C:\Perforce\test>p4 files #2021/07/30,now
//stream/main/main/A/01.txt#3 - move/add change 265 (text)
//stream/main/main/A/02.txt#2 - move/delete change 265 (text)
//stream/main/main/A/legacy_01.txt#1 - move/add change 264 (text)
//stream/main/main/B/01.txt#1 - branch change 263 (text)
//stream/main/main/B/02.txt#1 - branch change 263 (text)
//stream/main/move/bar#1 - move/add change 276 (text)
//stream/main/move/foo#2 - move/delete change 276 (text)
//stream/main/seongchan-test/A/01.txt#3 - move/add change 269 (text)
//stream/main/seongchan-test/A/02.txt#2 - move/delete change 269 (text)
//stream/main/seongchan-test/A/legacy_01.txt#1 - move/add change 268 (text)
//stream/main/seongchan-test/B/01.txt#3 - move/add change 274 (text)
//stream/main/seongchan-test/B/02.txt#2 - move/delete change 274 (text)
//stream/main/seongchan-test/B/legacy_01.txt#1 - move/add change 273 (text)
//stream/main/seongchan-test/main/01.txt#3 - move/add change 272 (text)
//stream/main/seongchan-test/main/02.txt#2 - move/delete change 272 (text)
//stream/main/seongchan-test/main/legacy_01.txt#1 - move/add change 271 (text)
In both cases, you can include a file specification to limit the results to a particular branch (e.g. add //depot/my_branch/... to the p4 changes command, or use //depot/my_branch/...#date,now as the argument to the p4 files command).

Related

Calculating AUDPC Using Spotfire

I was following this question to address a similar situation:
How to Calculate Area Under the Curve in Spotfire?
My data is in the following format:
PLANT  OBS_DATE_RECORDED  TRAIT_VALUE  period
A      3/16/2021          225          A3/16/2021
A      3/23/2021          227          A3/23/2021
A      3/30/2021          220          A3/30/2021
A      4/7/2021           240          A4/7/2021
A      4/13/2021          197          A4/13/2021
A      4/20/2021          197          A4/20/2021
A      4/27/2021          218          A4/27/2021
B      3/16/2021          253          B3/16/2021
B      3/23/2021          274          B3/23/2021
B      3/30/2021          271          B3/30/2021
B      4/7/2021           257          B4/7/2021
B      4/13/2021          250          B4/13/2021
A      4/20/2021          241          A4/20/2021
B      4/27/2021          255          B4/27/2021
Following the answer's formula as a calculated column:
([TRAIT_VALUE] + Avg([TRAIT_VALUE]) over (Intersect(NextPeriod([period]),[PLANT]))) / 2 * (Avg([OBS_DATE_RECORDED]) over (Intersect(NextPeriod([period]),[PLANT])) - [OBS_DATE_RECORDED])
However, the results don't appear correct.
AUDPC
1603.19:59:59.928
1608.17:59:59.956
2924.20:0:0.100
7732.21:0:0.000
1395.14:41:44.404
1461.23:30:0.050
-4393.7:59:59.712
I think the problem might be the date format, but I don't understand the formula well enough to troubleshoot. In Excel I usually compute the AUDPC with SUMPRODUCT, multiplying the days between two dates by the average TRAIT_VALUE between those two dates.
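As a sanity check of what the numbers should come out to, the trapezoid sum described above (average of two consecutive TRAIT_VALUEs times the days between their dates) can be computed outside Spotfire. This is only an illustrative sketch using plant A's values, with the day gaps between its observation dates (7, 7, 8, 6, 7, 7) worked out by hand:

```shell
# Sketch: trapezoid AUDPC for plant A's TRAIT_VALUEs; the day gaps
# between consecutive observation dates are precomputed by hand.
echo "225 227 220 240 197 197 218" | awk '{
  split("7 7 8 6 7 7", gap, " ")          # days between consecutive dates
  audpc = 0
  for (i = 1; i < NF; i++)
    audpc += ($i + $(i+1)) / 2 * gap[i]   # avg of neighbours * day gap
  print audpc                              # prints 9129
}'
```

If the Spotfire expression is correct, the value for plant A should be a plain number like this, not a TimeSpan, which suggests the date subtraction in the calculated column is producing a TimeSpan rather than a day count.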

How can I use multiple operations in awk to edit a text file

I have a text file like this small example:
chr10:103909786-103910082 147 148 24 BA
chr10:103909786-103910082 149 150 11 BA
chr10:103909786-103910082 150 151 2 BA
chr10:103909786-103910082 152 153 1 BA
chr10:103909786-103910082 274 275 5 CA
chr10:103909786-103910082 288 289 15 CA
chr10:103909786-103910082 294 295 4 CA
chr10:103909786-103910082 295 296 15 CA
chr10:104573088-104576021 2925 2926 134 CA
chr10:104573088-104576021 2926 2927 10 CA
chr10:104573088-104576021 2932 2933 2 CA
chr10:104573088-104576021 58 59 1 BA
chr10:104573088-104576021 689 690 12 BA
chr10:104573088-104576021 819 820 33 BA
In this file there are 5 tab-separated columns. The first column is the ID; for example, in the first row the whole "chr10:103909786-103910082" is the ID.
1- In the 1st step I would like to filter out rows based on the 4th column: if the number in the 4th column is less than 10 and the 5th column is BA, that row is filtered out; likewise, if the number in the 4th column is less than 5 and the 5th column is CA, that row is filtered out.
2- In the 2nd step I want to summarize each group per ID: the first column has repeated values that represent the same ID, and each ID has both BA and CA in the 5th column. To get one CA value, I add up all values in the 4th column that belong to the same ID and are classified as CA, and likewise for BA.
3- In the 3rd step I want one value per ID, so in the output every ID appears only once: the ratio CA/BA of the two sums. The expected output for the small example would look like this:
1- after filtration:
chr10:103909786-103910082 147 148 24 BA
chr10:103909786-103910082 149 150 11 BA
chr10:103909786-103910082 274 275 5 CA
chr10:103909786-103910082 288 289 15 CA
chr10:103909786-103910082 295 296 15 CA
chr10:104573088-104576021 2925 2926 134 CA
chr10:104573088-104576021 2926 2927 10 CA
chr10:104573088-104576021 689 690 12 BA
chr10:104573088-104576021 819 820 33 BA
2- after summarizing each group (CA and BA):
chr10:103909786-103910082 147 148 35 BA
chr10:103909786-103910082 274 275 35 CA
chr10:104573088-104576021 2925 2926 144 CA
chr10:104573088-104576021 819 820 45 BA
3- the final output (this ratio is made using the values in the 4th column):
chr10:103909786-103910082 1
chr10:104573088-104576021 3.2
in the above lines, 1 = 35/35 and 3.2 = 144/45.
I am trying to do that in awk
awk -F "\t" '{ (if($4 < -10 & $5==BA)), (if($4 < -5 & $5==CA)) ; print $2 = BA/CA} file.txt > out.txt
I tried to implement the steps mentioned above in this code but did not succeed. Do you know how to solve the problem?
If the records with the same ID are always consecutive, you can do it like this:
awk 'ID != $1 {                # new ID: flush totals for the previous one
    if (ID) {
        print ID, a["CA"]/a["BA"]
        a["CA"] = a["BA"] = 0
    }
    ID = $1
}
$5=="BA" && $4>=10 || $5=="CA" && $4>=5 { a[$5] += $4 }
END { print ID, a["CA"]/a["BA"] }' file.txt
The first block tests whether the ID has changed; if so, it prints the previous ID with its ratio and resets the totals.
The second block filters out unwanted records and sums the 4th column per group.
The END block prints the result for the last ID.
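Run end to end on the sample data (spaces stand in for the tabs here; awk's default field splitting handles both), the script produces exactly the expected step-3 output:

```shell
# Demonstration: the answer's script applied to the question's sample rows.
awk '
ID != $1 {                # new ID: flush totals for the previous one
    if (ID) { print ID, a["CA"]/a["BA"]; a["CA"] = a["BA"] = 0 }
    ID = $1
}
$5=="BA" && $4>=10 || $5=="CA" && $4>=5 { a[$5] += $4 }
END { print ID, a["CA"]/a["BA"] }
' <<'EOF'
chr10:103909786-103910082 147 148 24 BA
chr10:103909786-103910082 149 150 11 BA
chr10:103909786-103910082 150 151 2 BA
chr10:103909786-103910082 152 153 1 BA
chr10:103909786-103910082 274 275 5 CA
chr10:103909786-103910082 288 289 15 CA
chr10:103909786-103910082 294 295 4 CA
chr10:103909786-103910082 295 296 15 CA
chr10:104573088-104576021 2925 2926 134 CA
chr10:104573088-104576021 2926 2927 10 CA
chr10:104573088-104576021 2932 2933 2 CA
chr10:104573088-104576021 58 59 1 BA
chr10:104573088-104576021 689 690 12 BA
chr10:104573088-104576021 819 820 33 BA
EOF
# prints:
# chr10:103909786-103910082 1
# chr10:104573088-104576021 3.2
```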

Filter a large text file using ID in another text file

I have two text files: one with about 60,000 rows and 14 columns, and another with a single column containing a subset of one of the columns (the first column) of the first file. I would like to filter File 1 based on the ID names in File 2. I tried some commands found on the net but none of them were useful. Here are a few lines of the two text files (I'm on a Linux system):
File 1:
Contig100 orange1.1g013919m 75.31 81 12 2 244 14 2 78 4e-29 117 1126 435
Contig1000 orange1.1g045442m 65.50 400 130 2 631 1809 2 400 1e-156 466 2299 425
Contig10005 orange1.1g003445m 83.86 824 110 2 3222 808 1 820 0.0 1322 3583 820
Contig10006 orange1.1g047384m 81.82 22 4 0 396 331 250 271 7e-05 41.6 396 412
File 2:
Contig1
Contig1000
Contig10005
Contig10017
Please let me know your great suggestion to solve this issue.
Thanks in advance.
You can do this with Python (reading the IDs into a set, so the first column is compared exactly and a short ID like Contig1 cannot accidentally match inside a longer one like Contig1000):
with open('filter.txt') as f:
    ids = set(line.strip() for line in f)   # exact IDs from file 2
with open('data.txt') as f:
    for line in f:
        if line.split(' ')[0] in ids:       # match the first column exactly
            print(line, end='')
If you're on Linux/Mac, you can do it on the command line (the $ symbolizes the command prompt; don't type it).
First, we create a file2-patterns from your file2 by appending .* to each line:
$ while read line; do echo "$line .*"; done < file2 > file2-patterns
And have a look at that file:
$ cat file2-patterns
Contig1 .*
Contig1000 .*
Contig10005 .*
Contig10017 .*
Now we can use these patterns to filter out lines from file1.
$ grep -f file2-patterns file1
Contig1000 orange1.1g045442m 65.50 400 130 2 631 1809 2 400 1e-156 466 2299 425
Contig10005 orange1.1g003445m 83.86 824 110 2 3222 808 1 820 0.0 1322 3583 820
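As an aside (an alternative not in the answers above): grep -f treats each pattern as an unanchored regex, so a pattern could in principle match in the middle of a line. An exact match on the first field with awk avoids that:

```shell
# Keep only lines of file1 whose first column appears verbatim in file2.
# NR==FNR is true only while reading the first argument (file2): store its IDs.
awk 'NR==FNR { ids[$1]; next } $1 in ids' file2 file1
```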

Linux (command) | rename | trim leading characters after x vs. keep only numbers

I want to delete some parts from file names such that
101 - title [1994].mp4
102 - title [1994].mp4
103 - title [1994].mp4
104 - title [1994].mp4
105 - title [1994].mp4
becomes
101.mp4
102.mp4
103.mp4
104.mp4
105.mp4
There are two or more ways to handle this, either by:
keeping the numbers and removing the non-number characters
trimming all characters after the first 3 characters
How would I use the Linux command rename to keep only the first 3 characters and trim the rest, while keeping the extension of course?
I would like to avoid the mv command; what are the ways to do this with rename?
This is the expression you want: s/(\d{3}).*$/$1.mp4/. Take a look at the output (-n makes rename show what it would do without actually renaming anything):
rename -n 's/(\d{3}).*$/$1.mp4/' *mp4
101 - title [1994].mp4 renamed as 101.mp4
102 - title [1994].mp4 renamed as 102.mp4
103 - title [1994].mp4 renamed as 103.mp4
104 - title [1994].mp4 renamed as 104.mp4
105 - title [1994].mp4 renamed as 105.mp4

R: How to filter through a string of characters in the header of a table

I have a table, here's the start:
TargetID SM_H1462 SM_H1463 SM_K1566 SM_X1567 SM_V1568 SM_K1534 SM_K1570 SM_K1571
ENSG00000000419.8 290 270 314 364 240 386 430 329
ENSG00000000457.8 252 230 242 220 106 234 343 321
ENSG00000000460.11 154 158 162 136 64 152 206 432
ENSG00000000938.7 20106 18664 19764 15640 19024 18508 45590 32113
I want to write code that goes through the names of each column (the SM_... ones) and only looks at the fourth character in each name. There are 4 different options for the 4th character: the letters H, K, X or V, as can be seen in the table above (e.g. SM_H1462, SM_K1571, etc.). Names with H or K as the 4th character are Controls, and names with X or V as the 4th character are Cases.
I want the code to separate the column names based on the 4th letter and group them into two groups: either Case and Control.
Essentially, we can ignore the data for now, I just want to work with the col names first.
You could try checking the fourth character and get case and control as two separate data frames, if that helps you:
my.df <- data.frame(matrix(rep(seq(1,8),3), ncol = 8))
colnames(my.df) <- c('SM_H1462','SM_H1463','SM_K1566','SM_X1567', 'SM_V1568', 'SM_K1534', 'SM_K1570','SM_K1571')
my.df
control = my.df[,(substr(colnames(my.df),4,4) == 'H' | substr(colnames(my.df),4,4) == 'K')]
case = my.df[,(substr(colnames(my.df),4,4) == 'X' | substr(colnames(my.df),4,4) == 'V')]
