Use MATLAB to search an Excel data file for a time range and copy data into a variable - excel

In my Excel file I have a time column in 12-hour clock format and several data columns. I have pasted a snippet of it in this post as code, since I can't attach a file. I am trying to build a GUI that takes input from the user like so:
Start time: 7:29:32 AM
End time: 7:29:51 AM
Then do the following:
calculate the time that has passed in seconds (should be just a row count, since data is gathered once a second)
copy the data in the time range from the "Data 3" column into a variable
perform other calculations on the copied data as needed
I am having trouble figuring out how to search the time data and find its location, since it imports as text with xlsread. Any ideas?
The data looks like this:
Time Data 1 Data 2 Data 3 Data 4 Data 5
7:29:25 AM 0.878556385 0.388400561 0.076890401 0.93335277 0.884750618
7:29:26 AM 0.695838393 0.712762566 0.014814069 0.81264949 0.450303694
7:29:27 AM 0.250846937 0.508617941 0.24802015 0.722457624 0.47119616
7:29:28 AM 0.206189924 0.82970364 0.819163787 0.060932817 0.73455323
7:29:29 AM 0.161844331 0.768214077 0.154097877 0.988201094 0.951520263
7:29:30 AM 0.704242494 0.371877481 0.944482485 0.79207359 0.57390951
7:29:31 AM 0.072028024 0.120263127 0.577396985 0.694153791 0.341824004
7:29:32 AM 0.241817775 0.32573323 0.484644494 0.377938298 0.090122672
7:29:33 AM 0.500962945 0.540808907 0.582958676 0.043377373 0.041274613
7:29:34 AM 0.087742217 0.596508236 0.020250297 0.926901109 0.45960323
7:29:35 AM 0.268222071 0.291034947 0.598887588 0.575571111 0.136424853
7:29:36 AM 0.42880255 0.349597405 0.936733938 0.232128788 0.555528823
7:29:37 AM 0.380425154 0.162002488 0.208550466 0.776866494 0.79340504
7:29:38 AM 0.727940393 0.622546124 0.716007768 0.660480612 0.02463804
7:29:39 AM 0.582772435 0.713406643 0.306544291 0.225257421 0.043552277
7:29:40 AM 0.371156954 0.163821476 0.780515577 0.032460418 0.356949005
7:29:42 AM 0.484167263 0.377878242 0.044189636 0.718147456 0.603177625
7:29:43 AM 0.294017186 0.463360581 0.962296024 0.504029061 0.183131098
7:29:44 AM 0.95635086 0.367849494 0.362230918 0.984421096 0.41587606
7:29:45 AM 0.198645523 0.754955312 0.280338922 0.79706146 0.730373691
7:29:46 AM 0.058483961 0.46774544 0.86783339 0.147418954 0.941713252
7:29:47 AM 0.411193343 0.340857813 0.162066261 0.943124515 0.722124394
7:29:48 AM 0.389312994 0.129281042 0.732723258 0.803458815 0.045824426
7:29:49 AM 0.549633038 0.73956852 0.542532728 0.618321989 0.358525184
7:29:50 AM 0.269925317 0.501399748 0.938234302 0.997577871 0.318813506
7:29:51 AM 0.798825842 0.24038537 0.958224157 0.660124357 0.07469288
7:29:52 AM 0.963581196 0.390150081 0.077448543 0.294604314 0.903519943
7:29:53 AM 0.890540963 0.50284339 0.229976565 0.664538451 0.926438543
7:29:54 AM 0.46951573 0.192568637 0.506730373 0.060557482 0.922857391
7:29:55 AM 0.56552394 0.952136998 0.739438663 0.107518765 0.911045415
7:29:56 AM 0.433149875 0.957190309 0.475811126 0.855705733 0.942255155
and this is the code I am using:
[Data,Text] = xlsread('C:\Users\data.xlsx',2);
IndexStart=strmatch('7:29:29 AM',Text,'exact'); %start time
IndexEnd=strmatch('2:30:29 PM',Text,'exact'); %end time
seconds = IndexEnd-IndexStart;
TestData = Data([IndexStart: IndexEnd],:);

You probably need to:
Use strfind to find the relevant string in the data imported
Use datenum to convert the date to serial date numbers, to be able to calculate the elapsed time between the two points.
It would help if you posted your code so far though.
EDIT based on comments:
Here's what I would do for cycling through the list of start and end times:
[Data,Text] = xlsread('C:\Users\data.xlsx',2);
start_times = {'7:29:29 AM','7:29:35 AM','7:29:44 AM','7:29:49 AM'}; % etc...
end_times = {'2:30:29 PM','2:30:59 PM','2:31:22 PM','2:32:49 PM'}; % etc...
elapsed_time = zeros(length(start_times),1);
TestData = cell(length(start_times),1); % need a cell array because data can/will be of unequal lengths
for k=1:length(start_times)
    % strcmp replaces strmatch(...,'exact'), which MATLAB no longer recommends
    IndexStart = find(strcmp(Text(:,1),start_times{k}),1); % start time
    IndexEnd = find(strcmp(Text(:,1),end_times{k}),1); % end time
    elapsed_time(k) = IndexEnd-IndexStart; % row count; equals seconds only if no samples are missing
    % if Text contains a header row that Data lacks, subtract 1 from both indices here
    TestData{k} = Data(IndexStart:IndexEnd,:);
end
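The sample data in the question skips 7:29:41, so a row count and the elapsed time in seconds can disagree, which is one reason to convert the time strings to real time values (the datenum suggestion above). As a rough illustration of that idea outside MATLAB, here is a minimal Python sketch; the function name `select_range` and the short hard-coded lists (taken from the "Data 3" column of the snippet) are assumptions made for the example, not part of the original code.

```python
from datetime import datetime

# 12-hour clock format used in the spreadsheet, e.g. "7:29:32 AM"
TIME_FORMAT = "%I:%M:%S %p"

def select_range(time_strings, values, start, end):
    """Return (elapsed seconds, values whose time lies in [start, end])."""
    start_t = datetime.strptime(start, TIME_FORMAT)
    end_t = datetime.strptime(end, TIME_FORMAT)
    parsed = [datetime.strptime(s, TIME_FORMAT) for s in time_strings]
    selected = [v for t, v in zip(parsed, values) if start_t <= t <= end_t]
    # Elapsed time comes from the timestamps, not from the row count
    elapsed = (end_t - start_t).total_seconds()
    return elapsed, selected

# Four rows from the question's snippet; note the missing 7:29:41 sample
times = ["7:29:39 AM", "7:29:40 AM", "7:29:42 AM", "7:29:43 AM"]
data3 = [0.306544291, 0.780515577, 0.044189636, 0.962296024]

elapsed, selected = select_range(times, data3, "7:29:39 AM", "7:29:43 AM")
print(elapsed, selected)
```

Here the index difference between the first and last row would be 3, while the timestamps give 4 seconds.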

Use "Import Data" in the Variable section of the Home tab. There you can choose how the data is imported, for example with or without headers, and in which format.

Related

Looping through columns to perform analysis on each

I have more than 338 columns, each named for a different drug. I want to loop through all of them using the code below, which currently handles one specific drug. The problem is that I have 338 different drug names. The code is:
NPTESTS
/INDEPENDENT TEST (PLIN3 CYFIP2 IL2RA HSD3B1 IL2RB PYROXD1 ZBED4 MCTP1 LAMA3 CTSC EDEM1 LIF PIM3
PPARA SLC6A11 THNSL2 ZNF697) GROUP (drug_1) MANN_WHITNEY KRUSKAL_WALLIS(COMPARE=PAIRWISE)
/MISSING SCOPE=ANALYSIS USERMISSING=EXCLUDE
/CRITERIA ALPHA=0.05 CILEVEL=95
Is there any way I can loop through the columns and perform the test without running the code block over and over again?
One thing you can try is restructuring the file to long format so that all the drugs are in one column. You can then run the test on all the drugs at once by splitting the file:
varstocases /make drugval from drug_1 to drug_338/index=drugname(drugval).
sort cases by drugname.
split file by drugname.
*now your code .
NPTESTS
/INDEPENDENT TEST (PLIN3 CYFIP2 IL2RA HSD3B1 IL2RB PYROXD1 ZBED4 MCTP1 LAMA3 CTSC EDEM1 LIF PIM3
PPARA SLC6A11 THNSL2 ZNF697) GROUP (drugval) MANN_WHITNEY KRUSKAL_WALLIS(COMPARE=PAIRWISE)
/MISSING SCOPE=ANALYSIS USERMISSING=EXCLUDE
/CRITERIA ALPHA=0.05 CILEVEL=95.
split file off.
Alternatively, you can use an SPSS macro to loop through all the drugs and test them one by one:
define testdrugs ()
!do !drg=1 !to 338
NPTESTS
/INDEPENDENT TEST (PLIN3 CYFIP2 IL2RA HSD3B1 IL2RB PYROXD1 ZBED4 MCTP1 LAMA3 CTSC EDEM1 LIF PIM3
PPARA SLC6A11 THNSL2 ZNF697) GROUP !concat("(drug_", !drg, ")") MANN_WHITNEY KRUSKAL_WALLIS(COMPARE=PAIRWISE)
/MISSING SCOPE=ANALYSIS USERMISSING=EXCLUDE
/CRITERIA ALPHA=0.05 CILEVEL=95.
!doend
!enddefine.
* the macro is defined, now we can call it.
testdrugs .

What is the simplest way to complete a function on every row of a large table?

I want to run a one-sided Fisher exact test on every row of a table with more than 3000 rows, in a format matching the example below:
gene   sample_alt  sample_ref  population_alt  population_ref
One    4           556         770             37000
Two    5           555         771             36999
Three  6           554         772             36998
I would ideally like to make another column of the table equivalent to
[(4+556)!(4+770)!(770+37000)!(556+37000)!]/[4!(556!)770!(37000!)(4+556+770+37000)!]
for the first row of data, and so on and so forth for each row of the table.
I know how to do a Fisher test in R for simple 2x2 tables, but I don't know how to apply the fisher.test() function to each row of a large table. I also can't use an Excel formula, because the factorials get so big that they exceed Excel's digit limit and produce a #NUM error. What's the best way to do this simply? Thanks in advance!
This starts from a tab-delimited text file on the desktop (table.txt) with the same format as shown in the question:
if(!require(psych)){install.packages("psych")}
multiFisher = function(file="Desktop/table.txt", saveit=TRUE,
                       outfile="Desktop/table.csv", progress=TRUE,
                       verbose=FALSE, digits=3, ... )
{
  require(psych)
  Data = read.table(file, skip=1, header=FALSE,
                    col.names=c("Gene", "MD", "WTD", "MC", "WTC"), ...)
  if(verbose){print(str(Data))}
  Data$Fisher.p = NA
  Data$phi = NA
  Data$OR1 = format(0.123, nsmall=3)
  Data$OR2 = NA
  if(progress){cat("\n")}
  for(i in 1:length(Data$Gene)){
    Matrix = matrix(c(Data$WTC[i],Data$MC[i],Data$WTD[i],Data$MD[i]), nrow=2)
    Fisher = fisher.test(Matrix, alternative = 'greater')
    Data$Fisher.p[i] = signif(Fisher$p.value, digits=digits)
    Data$phi[i] = phi(Matrix, digits=digits)
    OR1 = (Data$WTC[i]*Data$MD[i])/(Data$MC[i]*Data$WTD[i])
    OR2 = 1 / OR1
    Data$OR1[i] = format(signif(OR1, digits=digits), nsmall=3)
    Data$OR2[i] = signif(OR2, digits=digits)
    if(progress) {cat(".")}
  }
  if(progress){cat("\n"); cat("\n")}
  if(saveit){write.csv(Data, outfile)}
  return(Data)
}
multiFisher()
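The factorial formula from the question overflows in a spreadsheet, but it can be evaluated safely by working with logarithms of factorials. The following is a minimal Python sketch of that idea using only the standard library. The function names are hypothetical, and it assumes the 2x2 table is laid out as [[sample_alt, sample_ref], [population_alt, population_ref]], with "one-sided" meaning the probability of a top-left count at least as large as the observed one.

```python
import math

def log_factorial(n):
    """log(n!) via the log-gamma function, which avoids overflow."""
    return math.lgamma(n + 1)

def table_probability(a, b, c, d):
    """Probability of the 2x2 table [[a, b], [c, d]] with fixed margins.

    This is the expression from the question:
    (a+b)!(c+d)!(a+c)!(b+d)! / (a! b! c! d! (a+b+c+d)!)
    """
    log_p = (log_factorial(a + b) + log_factorial(c + d)
             + log_factorial(a + c) + log_factorial(b + d)
             - log_factorial(a) - log_factorial(b)
             - log_factorial(c) - log_factorial(d)
             - log_factorial(a + b + c + d))
    return math.exp(log_p)

def fisher_greater(a, b, c, d):
    """One-sided p-value: sum over tables whose top-left count is >= a."""
    max_shift = min(b, c)  # keeps every cell non-negative
    return sum(table_probability(a + k, b - k, c - k, d + k)
               for k in range(max_shift + 1))

# First row of the example table from the question
point = table_probability(4, 556, 770, 37000)
p_value = fisher_greater(4, 556, 770, 37000)
print(point, p_value)
```

Applying either function to each row of the table then only requires a loop over the rows, as in the R code above.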

How to choose certain elements of a matrix to create a new one with np.array?

I have a matrix called "times" of shape (1, 517) holding the times of a whole 24-hour day (in seconds, Epoch time). I want to create a new matrix with the time at each half hour: starting from the first time, then the one that corresponds to half an hour later, and so on until all 48 half hours in the day are covered.
I created a delta of time with
dt = timedelta (hours = 0.5)
dts = timedelta.total_seconds (dt)
but I do not know how to make my new matrix take those elements:
print(times.shape)
Out[4]: (1, 517)
print(times)
array([[1.55079361e+09, 1.55079377e+09, 1.55079394e+09, 1.55079410e+09,
1.55079430e+09, 1.55079446e+09, 1.55079462e+09, 1.55079479e+09,
1.55079495e+09, 1.55079512e+09, 1.55079528e+09, 1.55079544e+09,
1.55079561e+09, 1.55079577e+09, 1.55079594e+09, 1.55079614e+09,
1.55079630e+09, 1.55079646e+09, 1.55079663e+09, 1.55079679e+09,
1.55079695e+09, 1.55079712e+09, 1.55079728e+09, 1.55079744e+09,
1.55079761e+09, 1.55079781e+09, 1.55079797e+09, 1.55079814e+09,
1.55079830e+09, 1.55079846e+09, 1.55079863e+09, 1.55079879e+09,
1.55079895e+09, 1.55079912e+09, 1.55079928e+09, 1.55079945e+09,
1.55079964e+09, 1.55079981e+09, 1.55079997e+09, 1.55080014e+09,
1.55080030e+09, 1.55080046e+09, 1.55080063e+09, 1.55080079e+09,
1.55080096e+09, 1.55080112e+09, 1.55080128e+09, 1.55080148e+09,
1.55080164e+09, 1.55080181e+09, 1.55080197e+09, 1.55080214e+09,
1.55080230e+09, 1.55080246e+09, 1.55080263e+09, 1.55080279e+09,
1.55080296e+09, 1.55080312e+09, 1.55080332e+09, 1.55080348e+09,
1.55080364e+09, 1.55080381e+09, 1.55080397e+09, 1.55080414e+09,
1.55080430e+09, 1.55080446e+09, 1.55080463e+09, 1.55080479e+09,
1.55080496e+09, 1.55080516e+09, 1.55080532e+09, 1.55080548e+09,
1.55080565e+09, 1.55080581e+09, 1.55080597e+09, 1.55080614e+09,
1.55080630e+09, 1.55080646e+09, 1.55080663e+09, 1.55080683e+09,
1.55080699e+09, 1.55080716e+09, 1.55080732e+09, 1.55080748e+09,
1.55080765e+09, 1.55080781e+09, 1.55080797e+09, 1.55080814e+09,
1.55080830e+09, 1.55080847e+09, 1.55080866e+09, 1.55080883e+09,
1.55080899e+09, 1.55080916e+09, 1.55080932e+09, 1.55080948e+09,
1.55080965e+09, 1.55080981e+09, 1.55080998e+09, 1.55081014e+09,
1.55081030e+09, 1.55081050e+09, 1.55081066e+09, 1.55081083e+09,
1.55081099e+09, 1.55081116e+09, 1.55081132e+09, 1.55081148e+09,
1.55081165e+09, 1.55081181e+09, 1.55081198e+09, 1.55081214e+09,
1.55081234e+09, 1.55081250e+09, 1.55081266e+09, 1.55081283e+09,
1.55081299e+09, 1.55081316e+09, 1.55081332e+09, 1.55081348e+09,
1.55081365e+09, 1.55081381e+09, 1.55081398e+09, 1.55081418e+09,
1.55081434e+09, 1.55081450e+09, 1.55081467e+09, 1.55081483e+09,
1.55081499e+09, 1.55081516e+09, 1.55081532e+09, 1.55081548e+09,
1.55081565e+09, 1.55081585e+09, 1.55081601e+09, 1.55081618e+09,
1.55081634e+09, 1.55081650e+09, 1.55081667e+09, 1.55081683e+09,
1.55081699e+09, 1.55081716e+09, 1.55081732e+09, 1.55081749e+09,
1.55081768e+09, 1.55081785e+09, 1.55081801e+09, 1.55081818e+09,
1.55081834e+09, 1.55081850e+09, 1.55081867e+09, 1.55081883e+09,
1.55081900e+09, 1.55081916e+09, 1.55081932e+09, 1.55081952e+09,
1.55081968e+09, 1.55081985e+09, 1.55082001e+09, 1.55082018e+09,
1.55082034e+09, 1.55082050e+09, 1.55082067e+09, 1.55082083e+09,
1.55082100e+09, 1.55082116e+09, 1.55082136e+09, 1.55082152e+09,
1.55082168e+09, 1.55082185e+09, 1.55082201e+09, 1.55082218e+09,
1.55082234e+09, 1.55082250e+09, 1.55082267e+09, 1.55082283e+09,
1.55082300e+09, 1.55082320e+09, 1.55082336e+09, 1.55082352e+09,
1.55082369e+09, 1.55082385e+09, 1.55082401e+09, 1.55082418e+09,
1.55082434e+09, 1.55082450e+09, 1.55082467e+09, 1.55082487e+09,
1.55082503e+09, 1.55082520e+09, 1.55082536e+09, 1.55082552e+09,
1.55082569e+09, 1.55082585e+09, 1.55082601e+09, 1.55082618e+09,
1.55082634e+09, 1.55082651e+09, 1.55082670e+09, 1.55082687e+09,
1.55082703e+09, 1.55082720e+09, 1.55082736e+09, 1.55082752e+09,
1.55082769e+09, 1.55082785e+09, 1.55082802e+09, 1.55082818e+09,
1.55082834e+09, 1.55082854e+09, 1.55082870e+09, 1.55082887e+09,
1.55082903e+09, 1.55082920e+09, 1.55082936e+09, 1.55082952e+09,
1.55082969e+09, 1.55082985e+09, 1.55083002e+09, 1.55083018e+09,
1.55083038e+09, 1.55083054e+09, 1.55083070e+09, 1.55083087e+09,
1.55083103e+09, 1.55083120e+09, 1.55083136e+09, 1.55083152e+09,
1.55083169e+09, 1.55083185e+09, 1.55083202e+09, 1.55083222e+09,
1.55083238e+09, 1.55083254e+09, 1.55083271e+09, 1.55083287e+09,
1.55083303e+09, 1.55083320e+09, 1.55083336e+09, 1.55083352e+09,
1.55083369e+09, 1.55083389e+09, 1.55083405e+09, 1.55083422e+09,
1.55083438e+09, 1.55083454e+09, 1.55083471e+09, 1.55083487e+09,
1.55083503e+09, 1.55083520e+09, 1.55083536e+09, 1.55083553e+09,
1.55083572e+09, 1.55083589e+09, 1.55083605e+09, 1.55083622e+09,
1.55083638e+09, 1.55083654e+09, 1.55083671e+09, 1.55083687e+09,
1.55083704e+09, 1.55083720e+09, 1.55083736e+09, 1.55083756e+09,
1.55083772e+09, 1.55083789e+09, 1.55083805e+09, 1.55083822e+09,
1.55083838e+09, 1.55083854e+09, 1.55083871e+09, 1.55083887e+09,
1.55083904e+09, 1.55083920e+09, 1.55083940e+09, 1.55083956e+09,
1.55083972e+09, 1.55083989e+09, 1.55084005e+09, 1.55084022e+09,
1.55084038e+09, 1.55084054e+09, 1.55084071e+09, 1.55084087e+09,
1.55084104e+09, 1.55084124e+09, 1.55084140e+09, 1.55084156e+09,
1.55084173e+09, 1.55084189e+09, 1.55084205e+09, 1.55084222e+09,
1.55084238e+09, 1.55084254e+09, 1.55084271e+09, 1.55084291e+09,
1.55084307e+09, 1.55084324e+09, 1.55084340e+09, 1.55084356e+09,
1.55084373e+09, 1.55084389e+09, 1.55084405e+09, 1.55084422e+09,
1.55084438e+09, 1.55084455e+09, 1.55084474e+09, 1.55084491e+09,
1.55084507e+09, 1.55084524e+09, 1.55084540e+09, 1.55084556e+09,
1.55084573e+09, 1.55084589e+09, 1.55084606e+09, 1.55084622e+09,
1.55084638e+09, 1.55084658e+09, 1.55084674e+09, 1.55084691e+09,
1.55084707e+09, 1.55084724e+09, 1.55084740e+09, 1.55084756e+09,
1.55084773e+09, 1.55084789e+09, 1.55084806e+09, 1.55084822e+09,
1.55084842e+09, 1.55084858e+09, 1.55084874e+09, 1.55084891e+09,
1.55084907e+09, 1.55084924e+09, 1.55084940e+09, 1.55084956e+09,
1.55084973e+09, 1.55084989e+09, 1.55085006e+09, 1.55085026e+09,
1.55085042e+09, 1.55085058e+09, 1.55085075e+09, 1.55085091e+09,
1.55085107e+09, 1.55085124e+09, 1.55085140e+09, 1.55085156e+09,
1.55085173e+09, 1.55085193e+09, 1.55085209e+09, 1.55085226e+09,
1.55085242e+09, 1.55085258e+09, 1.55085275e+09, 1.55085291e+09,
1.55085307e+09, 1.55085324e+09, 1.55085340e+09, 1.55085357e+09,
1.55085376e+09, 1.55085393e+09, 1.55085409e+09, 1.55085426e+09,
1.55085442e+09, 1.55085458e+09, 1.55085475e+09, 1.55085491e+09,
1.55085508e+09, 1.55085524e+09, 1.55085540e+09, 1.55085560e+09,
1.55085576e+09, 1.55085593e+09, 1.55085609e+09, 1.55085626e+09,
1.55085642e+09, 1.55085658e+09, 1.55085675e+09, 1.55085691e+09,
1.55085708e+09, 1.55085724e+09, 1.55085744e+09, 1.55085760e+09,
1.55085776e+09, 1.55085793e+09, 1.55085809e+09, 1.55085826e+09,
1.55085842e+09, 1.55085858e+09, 1.55085875e+09, 1.55085891e+09,
1.55085908e+09, 1.55085928e+09, 1.55085944e+09, 1.55085960e+09,
1.55085977e+09, 1.55085993e+09, 1.55086009e+09, 1.55086026e+09,
1.55086042e+09, 1.55086058e+09, 1.55086075e+09, 1.55086095e+09,
1.55086111e+09, 1.55086128e+09, 1.55086144e+09, 1.55086160e+09,
1.55086177e+09, 1.55086193e+09, 1.55086209e+09, 1.55086226e+09,
1.55086242e+09, 1.55086259e+09, 1.55086278e+09, 1.55086295e+09,
1.55086311e+09, 1.55086328e+09, 1.55086344e+09, 1.55086360e+09,
1.55086377e+09, 1.55086393e+09, 1.55086410e+09, 1.55086426e+09,
1.55086442e+09, 1.55086462e+09, 1.55086478e+09, 1.55086495e+09,
1.55086511e+09, 1.55086528e+09, 1.55086544e+09, 1.55086560e+09,
1.55086577e+09, 1.55086593e+09, 1.55086610e+09, 1.55086626e+09,
1.55086646e+09, 1.55086662e+09, 1.55086678e+09, 1.55086695e+09,
1.55086711e+09, 1.55086728e+09, 1.55086744e+09, 1.55086760e+09,
1.55086777e+09, 1.55086793e+09, 1.55086810e+09, 1.55086830e+09,
1.55086846e+09, 1.55086862e+09, 1.55086879e+09, 1.55086895e+09,
1.55086911e+09, 1.55086928e+09, 1.55086944e+09, 1.55086960e+09,
1.55086977e+09, 1.55086997e+09, 1.55087013e+09, 1.55087030e+09,
1.55087046e+09, 1.55087062e+09, 1.55087079e+09, 1.55087095e+09,
1.55087111e+09, 1.55087128e+09, 1.55087144e+09, 1.55087161e+09,
1.55087180e+09, 1.55087197e+09, 1.55087213e+09, 1.55087230e+09,
1.55087246e+09, 1.55087262e+09, 1.55087279e+09, 1.55087295e+09,
1.55087312e+09, 1.55087328e+09, 1.55087344e+09, 1.55087364e+09,
1.55087380e+09, 1.55087397e+09, 1.55087413e+09, 1.55087430e+09,
1.55087446e+09, 1.55087462e+09, 1.55087479e+09, 1.55087495e+09,
1.55087512e+09, 1.55087528e+09, 1.55087548e+09, 1.55087564e+09,
1.55087580e+09, 1.55087597e+09, 1.55087613e+09, 1.55087630e+09,
1.55087646e+09, 1.55087662e+09, 1.55087679e+09, 1.55087695e+09,
1.55087712e+09, 1.55087732e+09, 1.55087748e+09, 1.55087764e+09,
1.55087781e+09, 1.55087797e+09, 1.55087813e+09, 1.55087830e+09,
1.55087846e+09, 1.55087862e+09, 1.55087879e+09, 1.55087899e+09,
1.55087915e+09, 1.55087932e+09, 1.55087948e+09, 1.55087964e+09,
1.55087981e+09]])
First we create an array with a date range between the first and last entry of times
t = np.arange(np.datetime64(datetime.datetime.fromtimestamp(times[0,0])), np.datetime64(datetime.datetime.fromtimestamp(times[0,-1])), np.timedelta64(30, 'm'))
Output for t
array(['2019-02-22T01:00:10.000000', '2019-02-22T01:30:10.000000',
'2019-02-22T02:00:10.000000', '2019-02-22T02:30:10.000000',
'2019-02-22T03:00:10.000000', '2019-02-22T03:30:10.000000',
'2019-02-22T04:00:10.000000', '2019-02-22T04:30:10.000000',
'2019-02-22T05:00:10.000000', '2019-02-22T05:30:10.000000',
'2019-02-22T06:00:10.000000', '2019-02-22T06:30:10.000000',
'2019-02-22T07:00:10.000000', '2019-02-22T07:30:10.000000',
'2019-02-22T08:00:10.000000', '2019-02-22T08:30:10.000000',
'2019-02-22T09:00:10.000000', '2019-02-22T09:30:10.000000',
'2019-02-22T10:00:10.000000', '2019-02-22T10:30:10.000000',
'2019-02-22T11:00:10.000000', '2019-02-22T11:30:10.000000',
'2019-02-22T12:00:10.000000', '2019-02-22T12:30:10.000000',
'2019-02-22T13:00:10.000000', '2019-02-22T13:30:10.000000',
'2019-02-22T14:00:10.000000', '2019-02-22T14:30:10.000000',
'2019-02-22T15:00:10.000000', '2019-02-22T15:30:10.000000',
'2019-02-22T16:00:10.000000', '2019-02-22T16:30:10.000000',
'2019-02-22T17:00:10.000000', '2019-02-22T17:30:10.000000',
'2019-02-22T18:00:10.000000', '2019-02-22T18:30:10.000000',
'2019-02-22T19:00:10.000000', '2019-02-22T19:30:10.000000',
'2019-02-22T20:00:10.000000', '2019-02-22T20:30:10.000000',
'2019-02-22T21:00:10.000000', '2019-02-22T21:30:10.000000',
'2019-02-22T22:00:10.000000', '2019-02-22T22:30:10.000000',
'2019-02-22T23:00:10.000000', '2019-02-22T23:30:10.000000',
'2019-02-23T00:00:10.000000', '2019-02-23T00:30:10.000000'],
dtype='datetime64[us]')
Now, we want to calculate this back to seconds. To do this, we create a lambda function which does this for a single element of the array and use np.apply_along_axis to perform this operation element-wise on the array.
f = lambda x: (x - np.datetime64('1970-01-01T00:00:00'))/np.timedelta64(1,'s')
np.apply_along_axis(f, 0, t)
output
array([1.55079721e+09, 1.55079901e+09, 1.55080081e+09, 1.55080261e+09,
1.55080441e+09, 1.55080621e+09, 1.55080801e+09, 1.55080981e+09,
1.55081161e+09, 1.55081341e+09, 1.55081521e+09, 1.55081701e+09,
1.55081881e+09, 1.55082061e+09, 1.55082241e+09, 1.55082421e+09,
1.55082601e+09, 1.55082781e+09, 1.55082961e+09, 1.55083141e+09,
1.55083321e+09, 1.55083501e+09, 1.55083681e+09, 1.55083861e+09,
1.55084041e+09, 1.55084221e+09, 1.55084401e+09, 1.55084581e+09,
1.55084761e+09, 1.55084941e+09, 1.55085121e+09, 1.55085301e+09,
1.55085481e+09, 1.55085661e+09, 1.55085841e+09, 1.55086021e+09,
1.55086201e+09, 1.55086381e+09, 1.55086561e+09, 1.55086741e+09,
1.55086921e+09, 1.55087101e+09, 1.55087281e+09, 1.55087461e+09,
1.55087641e+09, 1.55087821e+09, 1.55088001e+09, 1.55088181e+09])
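The values above form a regular half-hour grid, but they are not necessarily elements of the original times array. If the goal is to pick the recorded samples closest to each half-hour mark, one possible approach is a nearest-neighbour lookup with np.searchsorted. This is a minimal sketch under the assumption that times is sorted in ascending order and has at least two entries; the function name `pick_every_half_hour` and the small test array are made up for illustration.

```python
import numpy as np

def pick_every_half_hour(times, step_seconds=1800.0):
    """Return (indices, values) of the samples nearest to each step mark."""
    flat = np.ravel(times)  # accept shape (1, N) as in the question
    targets = np.arange(flat[0], flat[-1], step_seconds)
    # Position where each target would be inserted to keep flat sorted
    idx = np.searchsorted(flat, targets)
    idx = np.clip(idx, 1, len(flat) - 1)
    left = flat[idx - 1]
    right = flat[idx]
    # Step back one position where the left neighbour is at least as close
    idx = idx - ((targets - left) <= (right - targets)).astype(int)
    return idx, flat[idx]

# Small irregular example in seconds (not the data from the question)
example = np.array([[0.0, 1000.0, 1700.0, 2000.0, 3500.0, 3700.0]])
indices, selected = pick_every_half_hour(example)
print(indices, selected)
```

Working directly in epoch seconds also avoids the round trip through local-time datetime objects.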

Resampling Time Series Data (Pandas Python 3)

Trying to convert data at daily frequency to weekly frequency.
In:
weeklyaapl = pd.DataFrame()
weeklyaapl['Open'] = aapl.Open.resample('W').iloc[0]
#here I am trying to take the first value of the aapl.Open,
#that falls within the week.
Out:
ValueError: .resample() is now a deferred operation
use .resample(...).mean() instead of .resample(...)
I want the true open (the first open that prints for the week) (the open of the first day in that week).
It instead wants me to take the mean of the daily open values for a given week using .mean(), which is not the information I need.
Can't seem to interpret the error, documentation isn't helping either.
I think you need:
aapl.resample('W').first()
Output:
Open High Low Close Volume
Date
2010-01-10 30.49 30.64 30.34 30.57 123432050
2010-01-17 30.40 30.43 29.78 30.02 115557365
2010-01-24 29.76 30.74 29.61 30.72 182501620
2010-01-31 28.93 29.24 28.60 29.01 266424802
2010-02-07 27.48 28.00 27.33 27.82 187468421
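Note that .first() takes the first value of every column in each week. If the other columns should be aggregated differently (for example the weekly high as a maximum), a per-column aggregation is one option. The sketch below assumes a DataFrame with a DatetimeIndex and the column names shown in the output above; the helper name `weekly_ohlc` and the synthetic prices are invented for the example.

```python
import pandas as pd

def weekly_ohlc(daily):
    """Resample daily bars to weekly bars with one rule per column."""
    return daily.resample("W").agg({
        "Open": "first",   # open of the first trading day in the week
        "High": "max",
        "Low": "min",
        "Close": "last",
        "Volume": "sum",
    })

# Ten business days starting Monday 2010-01-04 (synthetic values)
index = pd.date_range("2010-01-04", periods=10, freq="B")
opens = [float(i) for i in range(10)]
daily = pd.DataFrame({
    "Open": opens,
    "High": [o + 1.0 for o in opens],
    "Low": [o - 1.0 for o in opens],
    "Close": [o + 0.5 for o in opens],
    "Volume": [100] * 10,
}, index=index)

weekly = weekly_ohlc(daily)
print(weekly)
```

For the Open column alone, aapl.Open.resample('W').first() gives the same values as the Open column here.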

Converting time format

There must be a quick solution for this, but after 30min I gave up and need help.
This is the format of source data
0h56m40s 0h57m10s 1h00m40s 1h02m15s 1h02m25s
52m47s 54m25s 54m52s 57m23s 57m43s
49m30s 54m31s 54m34s 56m35s 56m36s
47m45s 48m03s 51m02s 52m23s 53m05s
46m54s 49m29s 50m51s 51m02s 51m03s
46m09s 47m56s 50m16s 51m20s 51m53s
46m55s 47m08s 47m13s 48m16s 50m11s
and I need it in time format, for example 0h56m40s converted to 0:56:40.
I tried search and replace (h to :, m to :, and removing s). That works when the hour is present, but breaks when only minutes are given.
Any tips?
You can concatenate 0: if the input string is too short:
=(IF(LEN(A1)<7,"0:","") & SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"h",":"),"m",":"),"s",""))+0
The +0 part converts string to time value (change cell format to h:mm:ss). If you prefer to keep it in text format, remove +0.
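The same rule (treat a missing hour part as zero) can be written outside Excel as well. Here is a minimal Python sketch using a regular expression; the function name `to_clock` is hypothetical, and the pattern assumes every value has a minutes and a seconds part, as in the source data shown in the question.

```python
import re

# Optional "<digits>h", then mandatory "<digits>m" and "<digits>s"
_PATTERN = re.compile(r"^(?:(\d+)h)?(\d+)m(\d+)s$")

def to_clock(text):
    """Convert e.g. '0h56m40s' or '52m47s' to 'h:mm:ss' text."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    hours = int(match.group(1) or 0)  # missing hour part counts as 0
    minutes = int(match.group(2))
    seconds = int(match.group(3))
    return f"{hours}:{minutes:02d}:{seconds:02d}"

converted = [to_clock(v) for v in ["0h56m40s", "1h02m15s", "52m47s"]]
print(converted)
```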
