I'm wondering how to select the first, say, 20 values from a column in Julia (DataFrames).
For example, if I have a DataFrame
data = DataFrame(X=[1,2,3,4,5,6,7], Y=[2,4,7,9,10,11,14])
how can I get the first 3 values and the last 3 values of X in a subset?
Source: https://testdataframesjl.readthedocs.io/en/readthedocs/subsets/
You can use the first(x, n) and last(x, n) functions to obtain the first or last n values of x, where x is a vector or a DataFrame. For example:
julia> data = DataFrame(X=[1,2,3,4,5,6,7], Y=[2,4,7,9,10,11,14])
7×2 DataFrame
Row │ X Y
│ Int64 Int64
─────┼──────────────
1 │ 1 2
2 │ 2 4
3 │ 3 7
4 │ 4 9
5 │ 5 10
6 │ 6 11
7 │ 7 14
julia> first(data,3)
3×2 DataFrame
Row │ X Y
│ Int64 Int64
─────┼──────────────
1 │ 1 2
2 │ 2 4
3 │ 3 7
julia> first(data.X,3)
3-element Vector{Int64}:
1
2
3
julia> last(data,3)
3×2 DataFrame
Row │ X Y
│ Int64 Int64
─────┼──────────────
1 │ 5 10
2 │ 6 11
3 │ 7 14
julia> last(data.X,3)
3-element Vector{Int64}:
5
6
7
A rationale for using first and last is found here: https://bkamins.github.io/julialang/2021/06/18/first.html
I have the table below:
╭───╥────────────┬────────────────╮
│ ║ A │ B │
╞═══╬════════════╪═════════════════╡
│ 1 ║ Jack │ 1 year 6 months │
│ 2 ║ Emily │ 6 months │
│ 3 ║ Carl │ 2 years 3 months│
│ 4 ║ │ │
│ 5 ║ Team avg: │ 1 years 5 months│
└───╨────────────┴─────────────────┘
I would like to get the time span average from column B. Something like 1.42 years or 1 year 5 months.
Is there a way to input time periods in Excel in terms of years, months and days? I could not figure out how to use date formats for this case.
I would prefer a non-macro solution if possible.
Any ideas? Thanks in advance.
Since you wrote you could enter the numbers differently, and want to do it in a single column, you could for example, enter your data as y.mm, and then use the array formula below to present a human readable output.
Be sure to retain a leading zero for single digit months, as shown in the screen shot below.
=TEXT(INT(DOLLARFR(AVERAGE(INT(myRange)*12+MOD(myRange,1)*100)/12,12)),"[=1]0 ""year "";0"" years """) & TEXT(INT(MOD(DOLLARFR(AVERAGE(INT(myRange)*12+MOD(myRange,1)*100)/12,12),1)*100), "[=1]0"" month"";0"" months""")
To enter/confirm an array formula, hold down ctrl + shift while hitting enter. If you do this correctly, Excel will place braces {...} around the formula seen in the formula bar.
If you want the output in the same format as the entries, you can use the much simpler formula (also entered as an array formula)
=DOLLARFR(AVERAGE(INT(myRange)*12+MOD(myRange,1)*100)/12,12)
And, if you don't mind having a dot in the displayed data, and plural months/years even if months/years are one (1), you can custom format it as suggested by @RaulDurand:
0" years". 0 "months ";;
If you are going to add days into the mix, you can use a similar algorithm. I did not provide one because, although years contain a fixed number of months, both contain a variable number of days and you will need to decide how you are going to treat that situation. A VBA UDF would be much simpler to construct.
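The arithmetic inside the formulas above can be checked outside Excel. The following Python sketch assumes the same y.mm encoding (1.06 means 1 year 6 months). The function names are made up for illustration and do not correspond to anything in Excel.

```python
def ymm_to_months(value):
    """Convert a y.mm value (1.06 = 1 year 6 months) to a month count."""
    years = int(value)
    # round() guards against floating-point noise such as 2.9999999 months
    months = round((value - years) * 100)
    return years * 12 + months


def average_months(values):
    """Average a list of y.mm values and return the mean in months."""
    return sum(ymm_to_months(v) for v in values) / len(values)


def describe(months):
    """Format a month count as years and months, dropping any fraction of a month."""
    years, rest = divmod(int(months), 12)
    year_word = "year" if years == 1 else "years"
    month_word = "month" if rest == 1 else "months"
    return f"{years} {year_word} {rest} {month_word}"


spans = [1.06, 0.06, 2.03]      # Jack, Emily, Carl
mean = average_months(spans)    # (18 + 6 + 27) / 3 months
print(describe(mean), "=", round(mean / 12, 2), "years")
```

Converting everything to months first is what lets a plain average work; the y.mm values themselves cannot be averaged directly because the fractional part is not a true decimal fraction of a year.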
I wouldn't bother looking for an intensely programmed solution: you only seem to have 4 data rows? I'd just type new data in instead
We don't do time spans in Excel, but we can denote a time span by a start date and an end date. Some time spans may be dynamic, dependent on the current time.
If you have 2 dates you can do simple math: later_date - earlier_date
Excel will return a result in days and fractions of a day, so for example 2000-02-01 18:00 minus 2000-01-01 00:00 will give an answer of 31.75: a time span of 31.75 days.
You can format 31.75 as days and hours and Excel will represent it as e.g. 31 days 18 hours 0 minutes, but you should appreciate that this is only a formatting of the numeric value 31.75. Get used to working with and thinking of spans of time as a decimal number of days.
You can add these up, average them, etc.
You can add a number to a date to advance the date by that number of days.
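The same "span as a decimal number of days" idea can be reproduced outside Excel. This is a small Python sketch using the standard datetime module and the two timestamps from the example above. It is only an illustration of the arithmetic, not of how Excel stores dates internally.

```python
from datetime import datetime, timedelta

start = datetime(2000, 1, 1, 0, 0)
end = datetime(2000, 2, 1, 18, 0)

span = end - start                          # a timedelta object
span_days = span.total_seconds() / 86400    # decimal days: 31 days + 18 hours

print(span_days)                            # 31.75

# Adding a number of days to a date advances the date by that amount
print(start + timedelta(days=span_days))    # 2000-02-01 18:00:00
```

Because the spans are plain numbers of days, summing or averaging several of them needs nothing more than ordinary arithmetic.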
Just use three columns. Column B is Years. Column C is Months. Column D is the sum of B and a twelfth of C. So, D2 would be =B2 + C2/12.
╭───╥────────────┬───┬────┬──────────────╮
│ ║ A | B | C | D │
╞═══╬════════════╪═══╪════╪══════════════╡
│ 1 ║ Jack │ 1 | 6 | = B1 + C1/12 │
│ 2 ║ Emily │ 0 | 6 | = B2 + C2/12 │
│ 3 ║ Carl │ 2 | 3 | = B3 + C3/12 │
│ 4 ║ │ | | │
│ 5 ║ Team avg: │ * | ** | = AVERAGE(D1:D3) │
└───╨────────────┴───┴────┴──────────────┘
* B5 can be INT(D5) to give years
** C5 can be (D5 - INT(D5)) * 12 to give months
Based on @RonRosenfeld's comments I managed to use numbers. In this case the time spans in column B are written as:
╭───╥────────────┬─────────────────╮
│ ║ A │ B │
╞═══╬════════════╪═════════════════╡
│ 1 ║ Jack │ 1.06 │
│ 2 ║ Emily │ 0.06 │
│ 3 ║ Carl │ 2.03 │
│ 4 ║ │ │
│ 5 ║ Team avg: │ │
└───╨────────────┴─────────────────┘
and using the following custom format
0" years". 00 "months ";;
it becomes:
╭───╥────────────┬───────────────────╮
│ ║ A │ B │
╞═══╬════════════╪═══════════════════╡
│ 1 ║ Jack │ 1 years. 06 months│
│ 2 ║ Emily │ 0 years. 06 months│
│ 3 ║ Carl │ 2 years. 03 months│
│ 4 ║ │ │
│ 5 ║ Team avg: │ 1.42 years │
└───╨────────────┴───────────────────┘
Then, the average time span is calculated as:
=(SUMPRODUCT(INT(B1:B3))+SUMPRODUCT(MOD(B1:B3,1))/0.12)/COUNT(B1:B3)
Also, custom formatting is used in the result.
I want to create a number of random columns in Excel with the following characteristics:
Each column has 9 cells
Each cell is either 0, 1, or 2
Each column has SUM = 10
I tried creating 9 random numbers in column A and then using ROUND(B1/SUM(B$1:B$9);1)*10 for the columns, but because of ROUND (I think) the result is not completely correct: not every column sums to 10 (some sum to 8, others to 10, etc.).
For example:
Column B: 0,1,1,1,1,1,1,2,2
Column C: 0,0,1,1,1,1,2,2,2
Column D: 0,0,0,1,1,2,2,2,2
Column E: 0,0,0,0,2,2,2,2,2
and so on, numbers in any order like
Column Z: 1,1,2,0,1,1,1,1,2
The closest I can get is with this:
=IF(SUM(A$1:A1)>=10,0,IF(SUM(A$1:A1)=9,1,IF(SUM(A$1:A1)=8,2,RANDBETWEEN(1,2))))
Put it in A2 and copy down and over. It must go in row 2 or it will cause a circular reference.
It fills the column with 1 or 2 till it sums to 10, then the rest are zeros.
Edit
This is about as random as I can get, this will allow 0s randomly:
=IF(SUM(A$1:A1)>=10,0,IF(SUM(A$1:A1)=9,1,IF(SUM(A$1:A1)=8,2,IF(AND(SUM(A$1:A1)<=ROW()-2,ROW()>5),2,RANDBETWEEN(0,2)))))
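To see why both formulas always end on a total of 10, the fill-down logic can be simulated. The Python sketch below mirrors the nested IFs for sheet rows 2 to 10, assuming row 1 is empty. The function name and the allow_zeros flag are hypothetical and only select between the original and the edited formula.

```python
import random


def fill_column(allow_zeros=False, rng=random):
    """Simulate the fill-down formula for sheet rows 2..10 (9 cells)."""
    cells = []
    for row in range(2, 11):
        running = sum(cells)             # SUM(A$1:A1), with A1 empty
        if running >= 10:
            value = 0                    # target reached: pad with zeros
        elif running == 9:
            value = 1
        elif running == 8:
            value = 2
        elif allow_zeros and running <= row - 2 and row > 5:
            value = 2                    # catch-up branch of the edited formula
        elif allow_zeros:
            value = rng.randint(0, 2)    # RANDBETWEEN(0,2)
        else:
            value = rng.randint(1, 2)    # RANDBETWEEN(1,2)
        cells.append(value)
    return cells


print(fill_column())                     # original formula
print(fill_column(allow_zeros=True))     # edited formula, zeros allowed early
```

In the original formula the running sum grows by 1 or 2 per row, so it must pass through 8 or 9 and is then topped up to exactly 10. In the edited formula the catch-up branch forces 2s from row 6 onward whenever the running sum has fallen too far behind.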
There are only 5 possible combinations of 9 numbers drawn from 0, 1 and 2 (disregarding order) where the total is 10:
2,2,2,2,2,0,0,0,0
2,2,2,2,1,1,0,0,0
2,2,2,1,1,1,1,0,0
2,2,1,1,1,1,1,1,0
2,1,1,1,1,1,1,1,1
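The count of five can be confirmed by brute force. A minimal Python sketch, with a hypothetical helper name, that enumerates how many 2s, 1s and 0s fit into 9 cells with a total of 10:

```python
def combinations_summing_to(total=10, cells=9):
    """Return (twos, ones, zeros) counts for `cells` values from {0, 1, 2} that sum to `total`."""
    result = []
    for twos in range(cells + 1):
        ones = total - 2 * twos          # the 1s must make up the rest of the total
        zeros = cells - twos - ones      # whatever cells remain are 0s
        if ones >= 0 and zeros >= 0:
            result.append((twos, ones, zeros))
    return result


for twos, ones, zeros in combinations_summing_to():
    print([2] * twos + [1] * ones + [0] * zeros)
```

Once the number of 2s is fixed, the number of 1s and 0s is determined, so there is at most one combination per count of 2s; only 1 to 5 twos leave non-negative counts.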
Put those combinations in a spreadsheet:
╔════╦══════════════════════╤═════════╤═════════╤═════════╤═════════╕
║ ║ A │ B │ C │ D │ E │
╠════╬══════════════════════╪═════════╪═════════╪═════════╪═════════╡
║ 1 ║ CORRECT COMBINATIONS │
╟────╫──────────────────────┼─────────┼─────────┼─────────┼─────────┤
║ 2 ║ Group 1 │ Group 2 │ Group 3 │ Group 4 │ Group 5 │
╟────╫──────────────────────┼─────────┼─────────┼─────────┼─────────┤
║ 3 ║ 2 │ 2 │ 2 │ 2 │ 2 │
╟────╫──────────────────────┼─────────┼─────────┼─────────┼─────────┤
║ 4 ║ 2 │ 2 │ 2 │ 2 │ 1 │
╟────╫──────────────────────┼─────────┼─────────┼─────────┼─────────┤
║ 5 ║ 2 │ 2 │ 2 │ 1 │ 1 │
╟────╫──────────────────────┼─────────┼─────────┼─────────┼─────────┤
║ 6 ║ 2 │ 2 │ 1 │ 1 │ 1 │
╟────╫──────────────────────┼─────────┼─────────┼─────────┼─────────┤
║ 7 ║ 2 │ 1 │ 1 │ 1 │ 1 │
╟────╫──────────────────────┼─────────┼─────────┼─────────┼─────────┤
║ 8 ║ 0 │ 1 │ 1 │ 1 │ 1 │
╟────╫──────────────────────┼─────────┼─────────┼─────────┼─────────┤
║ 9 ║ 0 │ 0 │ 1 │ 1 │ 1 │
╟────╫──────────────────────┼─────────┼─────────┼─────────┼─────────┤
║ 10 ║ 0 │ 0 │ 0 │ 1 │ 1 │
╟────╫──────────────────────┼─────────┼─────────┼─────────┼─────────┤
║ 11 ║ 0 │ 0 │ 0 │ 0 │ 1 │
╙────╨──────────────────────┴─────────┴─────────┴─────────┴─────────┘
Use RAND() to produce 9 random numbers in a column (say cells G3:G11).
Use RANK(G3,$G$3:$G$11) to get a randomly ordered list of the numbers 1-9 in the neighbouring column (H3:H11).
Use RANDBETWEEN(1,5) to randomly choose one of the 5 allowed number combinations (say in cell I2).
Use INDEX to reference the cell in the randomly selected column (1-5) and the randomly ordered row (1-9), from within the 9x5 region of allowed values, e.g. in cell I3: =INDEX($A$3:$E$11,H3,$I$2)
You can also combine the RANK() into the INDEX function.
╔════╦═══════════════════════╤══════╤════════╤═══╤═══════════════════════╤════════╕
║ ║ G │ H │ I │ J │ K │ L │
╠════╬═══════════════════════╪══════╪════════╪═══╪═══════════════════════╪════════╡
║ 1 ║ │ │ group: │ │ │ group: │
╟────╫───────────────────────┼──────┼────────┼───┼───────────────────────┼────────┤
║ 2 ║ RANDOM number (order) │ rank │ 3 │ │ RANDOM number (order) │ 4 │
╟────╫───────────────────────┼──────┼────────┼───┼───────────────────────┼────────┤
║ 3 ║ 0.04 │ 8 │ 0 │ │ 0.92 │ 2 │
╟────╫───────────────────────┼──────┼────────┼───┼───────────────────────┼────────┤
║ 4 ║ 0.13 │ 7 │ 1 │ │ 0.79 │ 1 │
╟────╫───────────────────────┼──────┼────────┼───┼───────────────────────┼────────┤
║ 5 ║ 0.9 │ 1 │ 2 │ │ 0.2 │ 0 │
╟────╫───────────────────────┼──────┼────────┼───┼───────────────────────┼────────┤
║ 6 ║ 0.36 │ 6 │ 1 │ │ 0.31 │ 1 │
╟────╫───────────────────────┼──────┼────────┼───┼───────────────────────┼────────┤
║ 7 ║ 0.49 │ 5 │ 1 │ │ 0.98 │ 2 │
╟────╫───────────────────────┼──────┼────────┼───┼───────────────────────┼────────┤
║ 8 ║ 0.89 │ 2 │ 2 │ │ 0.65 │ 1 │
╟────╫───────────────────────┼──────┼────────┼───┼───────────────────────┼────────┤
║ 9 ║ 0 │ 9 │ 0 │ │ 0.68 │ 1 │
╟────╫───────────────────────┼──────┼────────┼───┼───────────────────────┼────────┤
║ 10 ║ 0.84 │ 3 │ 2 │ │ 0.57 │ 1 │
╟────╫───────────────────────┼──────┼────────┼───┼───────────────────────┼────────┤
║ 11 ║ 0.65 │ 4 │ 1 │ │ 0.28 │ 1 │
╟────╫───────────────────────┼──────┼────────┼───┼───────────────────────┼────────┤
║ 12 ║ │ │ │ │ │ │
╟────╫───────────────────────┼──────┼────────┼───┼───────────────────────┼────────┤
║ 13 ║ │ │ 10 │ │ │ 10 │
╙────╨───────────────────────┴──────┴────────┴───┴───────────────────────┴────────┘
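The RAND / RANK / INDEX steps above amount to "pick a template, then permute it by ranking random keys". Here is a rough Python sketch of the same idea. The names are illustrative, and, as in Excel, two equal random keys would receive the same rank, which is assumed to be too unlikely to matter.

```python
import random

# The five allowed combinations, one per "Group" column of the lookup table
TEMPLATES = [
    [2, 2, 2, 2, 2, 0, 0, 0, 0],
    [2, 2, 2, 2, 1, 1, 0, 0, 0],
    [2, 2, 2, 1, 1, 1, 1, 0, 0],
    [2, 2, 1, 1, 1, 1, 1, 1, 0],
    [2, 1, 1, 1, 1, 1, 1, 1, 1],
]


def ranks_descending(keys):
    """Mimic RANK(): 1 for the largest key, len(keys) for the smallest."""
    ordered = sorted(keys, reverse=True)
    return [ordered.index(k) + 1 for k in keys]


def random_column(rng=random):
    template = rng.choice(TEMPLATES)              # RANDBETWEEN(1,5)
    keys = [rng.random() for _ in template]       # RAND() in G3:G11
    ranks = ranks_descending(keys)                # RANK(G3,$G$3:$G$11)
    return [template[r - 1] for r in ranks]       # INDEX(template, rank)


print(random_column())
```

With the keys from the sample table (0.04, 0.13, 0.9, ...) the ranks come out as 8, 7, 1, ..., and looking those rows up in Group 3 gives 0, 1, 2, ..., matching column I.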
Here's a randomized solution for you. First, create a table of possible sets. Given your constraints, there are only 5 possible sets of solutions. I put this table in cells B2:F10, with the headers in row 1. Note that this table can go anywhere, even on a different sheet if preferred. In a final product, I would probably hide these rows. Anyway, it looks like this:
Next, because you want a random number of columns, in cell A12 I put in a header called # of Columns and in cell B12 is this formula (feel free to adjust the upper and lower bounds to what you're looking for, this is just a random number between 3 and 10): =RANDBETWEEN(3,10)
Now we can setup our randomized columns and what sets they use:
In cell B14 and copied right (to the maximum number of columns
defined in the previous formula, so in this example it goes to K
because B:K is 10 columns), use this formula:
=IF(COLUMN(A14)>$B$12,"","Column "&COLUMN(A14))
In cell B15 and copied right is this formula:
=IF(B14="","",INDEX($B$1:$F$1,,RANDBETWEEN(1,5)))
In cell B16 and copied right and down for 9 rows (so in this example it is
copied to K24) is this formula:
=IF(B$14="","",INDEX($B$2:$F$10,MATCH(LARGE(B$26:B$34,ROW(B1)),B$26:B$34,0),MATCH(B$15,$B$1:$F$1,0)))
Finished, it will look like this (note that before completing the next step of this answer, it will show #NUM! errors, explained below):
You'll notice that third formula references a range we haven't built yet, in rows 26:34. In that range, there is another table full of randomized numbers so that the Sets can get scrambled to give us randomized results. Building that table is very easy. In cell B26 and copied over and down to K34 (again, over to the maximum number of columns and down for 9 rows), is this formula:
=IF(B$14="","",RAND())
Now with the randomizers, you'll get results as shown in the second image, with randomized sets of 9 numbers that sum to 10, consisting of 0s, 1s, and 2s. At this point you can cut/paste the Sets and Randomizers tables to a different sheet if preferred, or simply hide those rows.
Because of the constraint, there are only 5 unique combinations of values to get to 10:
5 twos; 0 ones; 4 zeros
4 twos; 2 ones; 3 zeros
3 twos; 4 ones; 2 zeros
2 twos; 6 ones; 1 zero
1 two; 8 ones; 0 zeros
We pick one of the five possibilities at random, scramble the elements and stuff the results into a column.
Store the templates in Sheet1 and the output in columns A through Z in sheet Sheet2.
In Sheet1:
The code:
Sub croupier()
    ' Fill columns A:Z of Sheet2 with shuffled copies of templates from Sheet1
    Dim Itms(1 To 9) As Variant
    Dim i As Long, J As Long, k As Long
    Dim s1 As Worksheet, s2 As Worksheet
    Set s1 = Sheets("Sheet1")
    Set s2 = Sheets("Sheet2")
    For i = 1 To 26
        ' Pick one of the 5 template columns at random
        J = Application.WorksheetFunction.RandBetween(1, 5)
        For k = 1 To 9
            Itms(k) = s1.Cells(k, J).Value
        Next k
        Call Shuffle(Itms)
        For k = 1 To 9
            s2.Cells(k, i).Value = Itms(k)
        Next k
    Next i
End Sub

Sub Shuffle(InOut() As Variant)
    ' Reorder InOut by sorting it on a parallel array of random keys
    Dim i As Long, J As Long
    Dim Hi As Long, Low As Long
    Dim tempF As Double, temp As Variant
    Dim Helper() As Double
    Hi = UBound(InOut)
    Low = LBound(InOut)
    ReDim Helper(Low To Hi)
    Randomize
    For i = Low To Hi
        Helper(i) = Rnd
    Next i
    ' Gap-based sort: compare elements J apart, halving J on each pass
    J = (Hi - Low + 1) \ 2
    Do While J > 0
        For i = Low To Hi - J
            If Helper(i) > Helper(i + J) Then
                tempF = Helper(i)
                Helper(i) = Helper(i + J)
                Helper(i + J) = tempF
                temp = InOut(i)
                InOut(i) = InOut(i + J)
                InOut(i + J) = temp
            End If
        Next i
        For i = Hi - J To Low Step -1
            If Helper(i) > Helper(i + J) Then
                tempF = Helper(i)
                Helper(i) = Helper(i + J)
                Helper(i + J) = tempF
                temp = InOut(i)
                InOut(i) = InOut(i + J)
                InOut(i + J) = temp
            End If
        Next i
        J = J \ 2
    Loop
End Sub
Sample Sheet2:
Mathematica has a built-in function called FoldList. Is there a similar primitive verb in J?
(I know that J has a ^: verb, which is like Nest and FixedPoint.)
To clarify my question: J has dyadic verbs, so usually u / x1 x2 x3 becomes x1 u (x2 u x3), which works just like FoldList, in reverse order.
The exception is when the function u takes a y of a different shape from x. In FoldList there is an initial x. In J, if x3 has a different shape, one has to rely on < to pack it together. For example, one has to pack and unpack:
[list =. (;/ 3 3 4 3 3 34),(< 1 2)
+-+-+-+-+-+--+---+
|3|3|4|3|3|34|1 2|
+-+-+-+-+-+--+---+
tf =: 4 : '<((> x) , >y)'
tf/ list
+----------------+
|1 2 3 3 4 3 3 34|
+----------------+
tf/\ |. list
+---+------+--------+----------+------------+--------------+----------------+
|1 2|1 2 34|1 2 34 3|1 2 34 3 3|1 2 34 3 3 4|1 2 34 3 3 4 3|1 2 34 3 3 4 3 3|
+---+------+--------+----------+------------+--------------+----------------+
which is kind of inconvenient. Any better solutions?
u/\ comes very close (if you don't mind the right folding):
+/\ 1 2 3 4
1 3 6 10
*/\1+i.10
1 2 6 24 120 720 5040 ...
(+%)/\7#1. NB. continued fraction of phi
1 2 1.5 1.66667 1.6 1.625 1.61538
edit on your edit:
The first two elements of FoldList are x and f(x,a). In J those two have to be of the same "kind" (shape+type) if you want them on the same list. The inconvenience comes from J's data structures not from the lack of a FoldList verb. If you exclude x from the list, things are easier:
FoldListWithout_x =: 1 : 'u/ each }.<\y'
; FoldListWithout_x 1 2 3 4
┌─────┬───────┬─────────┐
│┌─┬─┐│┌─┬─┬─┐│┌─┬─┬─┬─┐│
││1│2│││1│2│3│││1│2│3│4││
│└─┴─┘│└─┴─┴─┘│└─┴─┴─┴─┘│
└─────┴───────┴─────────┘
>+ FoldListWithout_x 1 2 3 4
3 6 10
(+%) FoldListWithout_x 7#1
┌─┬───┬───────┬───┬─────┬───────┐
│2│1.5│1.66667│1.6│1.625│1.61538│
└─┴───┴───────┴───┴─────┴───────┘
The next logical step is to include a boxed x after making the folds, but that will either require more complex code or a case-by-case construction. Eg:
FoldList =: 1 :'({.y) ; u FoldListWithout_x y'
+ FoldList 1 2 3 4
┌─┬─┬─┬──┐
│1│3│6│10│
└─┴─┴─┴──┘
; FoldList 1 2 3 4
┌─┬─────┬───────┬─────────┐
│1│┌─┬─┐│┌─┬─┬─┐│┌─┬─┬─┬─┐│
│ ││1│2│││1│2│3│││1│2│3│4││
│ │└─┴─┘│└─┴─┴─┘│└─┴─┴─┴─┘│
└─┴─────┴───────┴─────────┘
vs
FoldList =: 1 :'(<{.y) ; u FoldListWithout_x y'
+ FoldList 1 2 3 4
┌───┬─┬─┬──┐
│┌─┐│3│6│10│
││1││ │ │ │
│└─┘│ │ │ │
└───┴─┴─┴──┘
; FoldList 1 2 3 4
┌───┬─────┬───────┬─────────┐
│┌─┐│┌─┬─┐│┌─┬─┬─┐│┌─┬─┬─┬─┐│
││1│││1│2│││1│2│3│││1│2│3│4││
│└─┘│└─┴─┘│└─┴─┴─┘│└─┴─┴─┴─┘│
└───┴─────┴───────┴─────────┘
I guess @Dan Bron's comment deserves an answer. It is discussed with some solutions in http://www.jsoftware.com/pipermail/programming/2006-May/002245.html
If we define an adverb (modified from the link above)
upd =: 1 : 0
:
u&.> /\ ( <"_ x),<"0 y
)
then
1 2 , upd |. 3 3 4 3 3 34
┌───┬──────┬────────┬──────────┬────────────┬──────────────┬────────────────┐
│1 2│1 2 34│1 2 34 3│1 2 34 3 3│1 2 34 3 3 4│1 2 34 3 3 4 3│1 2 34 3 3 4 3 3│
└───┴──────┴────────┴──────────┴────────────┴──────────────┴────────────────┘
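For readers who don't know Mathematica, the FoldList behaviour discussed in this thread can be written down in Python with itertools.accumulate. This is only a reference sketch of the semantics, not a J solution; fold_list is a made-up name, and the initial= keyword assumes Python 3.8 or later.

```python
from itertools import accumulate
from operator import add, mul


def fold_list(f, x, items):
    """Return [x, f(x, a), f(f(x, a), b), ...] for items a, b, ..."""
    return list(accumulate(items, f, initial=x))


# Running sums and products, the same results as +/\ and */\ above
print(list(accumulate([1, 2, 3, 4], add)))    # [1, 3, 6, 10]
print(list(accumulate(range(1, 6), mul)))     # [1, 2, 6, 24, 120]

# A seed whose shape differs from the items: a list grown one item at a time
print(fold_list(lambda acc, item: acc + [item], [1, 2], [34, 3, 3, 4, 3, 3]))
```

The last call produces the same sequence of partial results as the boxed output of 1 2 , upd |. 3 3 4 3 3 34. Python lists may hold items of different lengths, so no explicit boxing step is needed there.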