I am trying to fill a column in Excel with values that are generated randomly but the sum of all the values total to the end.
Example:
Starting value = 320
Final value = 350
Need to generate 2 more values which have random difference between them but total to 350 at the end, as in: 2nd val = 12, 3rd val = 18.
The script/formula should generate different values when run next (for another table etc.). It may generate, for the same starting and final values, 15 and 15 or 8 and 22 for the 2nd and 3rd values respectively etc.
Basically what the formula should do is: Find the difference between the starting and final values then randomly add a number to create the 2nd entry. Now the third entry should follow the same pattern but the value generated should end up totaling to the final.
The example is only for 2 values but I'm working on tables ranging from 15-30+ values.
I don't know if Excel can do it, or if there's a mathematical formula that will work here.
Thanks in advance for all the help!
You need something like this:
That gives:
The trick is to generate a list of uniform random numbers between 0 and 1 and then scale those numbers up the total that you're looking for.
Related
I have an array that looks like this
11100100110
essentially, an array of fixed size with each item being a 1 or 0 with the last item always equal to 0.
Consider each set of consecutive 1's to be a "bucket". I'd like a formula to determine the size of each bucket. So the output of this formula for the above sequence should be
312
as an array. Ideally this works in both excel and google sheets.
If you are interested this is the result of a list of stars and bars configurations where the 0's in my sequence represent bars and the 1's represent stars (the final value is a dummy 0 to make things easier to work with). I want the size of each non-empty bucket in a given configuration of stars and bars.
Thanks, in advance.
You could also use the standard method with Frequency which will work with Excel 365 and GS:
=FILTER(FREQUENCY(IF(A1:A11=1,ROW(A1:A11)),IF(A1:A11=0,ROW(A1:A11))),FREQUENCY(IF(A1:A11=1,ROW(A1:A11)),IF(A1:A11=0,ROW(A1:A11))))
try:
=INDEX((JOIN(, LEN(SPLIT(A1, 0)))))
update:
=INDEX(IFERROR(1/(1/SUBSTITUTE(FLATTEN(QUERY(TRANSPOSE(IFERROR(1/(1/
LEN(SPLIT(SUBSTITUTE(FLATTEN(QUERY(
TRANSPOSE(A1:K),, 9^9)), " ", ), 0))))),, 9^9)), " ", ))))
Assuming A2:A9 contains the data,
=ARRAYFORMULA(QUERY(FREQUENCY(IF(A2:A9,ROW(A2:A9)),IF(NOT(A2:A9),ROW(A2:A9))),"where Col1>0",))
FREQUENCY(data,classes) to get the frequency of data in classes
Make sequence of row numbers as data, if 1
Make sequence of row numbers as classes, if not 1
QUERY to get rid of zeros
Following from the example here I'm trying to add additional conditions to a sum formula. I've represented an example below:
The output that I'm looking for for example for Jan 2017 is
2017
1
UP A 1
UP B 6
UP C 6
DOWN A 1
DOWN B 8
DOWN C 7
I tried with the following formula:
=MMULT(--($B$17:$C$17="X"),MATCH(1,($A23=$C$2:$C$14)*(C$21=$A$2:$A$14)*(C$22=$B$2:$B$14)*($E$2:$E$14=$D$2:$D$14),0))
but I get a N/A value.
Does anyone know it if is possible to do it?
In your first example the number of rows in array1 and number of columns in array2 were equal, five. Here you have two columns and 13 rows. That they are unequal here is part (all) of the reason why you are having an issue.
Also your match function is returning a Boolean not an array
I have a way to do this using matrix condition and multiple criteria but had to change problem up a bit, see photo for example:
{=MMULT(--(D18:P18="x"),E$2:E$14*(--(A$2:A$14=$C$21)*--(B$2:B$14=$C$22)*--(C$2:C$14=A24)))"
https://i.stack.imgur.com/FEvgR.png
You can create a formula to fill the second matrix with X's see below
=IF(OR(INDIRECT("D"&VALUE(D20))=$A$18,INDIRECT("D"&VALUE(D20))=$B$18),"X","")
https://i.stack.imgur.com/4rS4L.png
That being said I don't think this is particularly efficient as you are treating the one of the matrixes as a all 1's so you basically just adding an extra criteria / Boolean with added complexity....that being said u asked for this specifically and I believe that I have delivered that LOL
Just add two SUMIFS together.
=SUMIFS($E$2:$E$14, $A$2:$A$14, C$21, $B$2:$B$14, C$22, $C$2:$C$14, $A23, $D$2:$D$14, IF(INDEX($B$17:$C$19, MATCH($B23, $A$17:$A$19, 0), 1)="x", $B$16))+
SUMIFS($E$2:$E$14, $A$2:$A$14, C$21, $B$2:$B$14, C$22, $C$2:$C$14, $A23, $D$2:$D$14, IF(INDEX($B$17:$C$19, MATCH($B23, $A$17:$A$19, 0), 2)="x", $C$16))
I have sets of gene probes that are upregulated when put under different chemical stresses. Each column contains all of the upregulated gene probes. I have 12 columns, how do I get a list of gene probes that appear in all 12 columns?
I've been able to find similarities between two columns using the formula
=IF(ISERROR(MATCH(A2,$C$2:$C$21473,0)),"",A2)
but cant work out how to adapt it to include 12 columns
G.Ac G.As G.At G.Ac.At G.As.Ac G.As.At G.Cd G.Cu G.Ni
G.Cd.Cu G.Cd.Ni G.Ni.Cu
GENE:JGI_V11_3346220103 GENE:JGI_V11_2653050203 GENE:JGI_V11_3299790103
GENE:JGI_V11_359040103 GENE:JGI_V11_2228010103 GENE:JGI_V11_2662750203
GENE:JGI_V11_1926920303 GENE:JGI_V11_3134270303 GENE:JGI_V11_3119540303
GENE:JGI_V11_3134270203 GENE:JGI_V11_1926920303 GENE:JGI_V11_3134270303
GENE:JGI_V11_3164760203 GENE:JGI_V11_565470303 GENE:JGI_V11_2296170203
GENE:JGI_V11_2045300203 GENE:JGI_V11_2421620203 GENE:JGI_V11_2228010303
GENE:JGI_V11_2196580303 GENE:JGI_V11_3134270203 GENE:JGI_V11_3119540203
GENE:JGI_V11_1926920103 GENE:JGI_V11_1926920103 GENE:JGI_V11_1014720202
GENE:JGI_V11_478830203 GENE:JGI_V11_3168730303 GENE:JGI_V11_3311070202
GENE:JGI_V11_3216620102 GENE:JGI_V11_2653050303 GENE:JGI_V11_3300140202
GENE:JGI_V11_2653050303 GENE:JGI_V11_1159220202 GENE:JGI_V11_2024180303
GENE:JGI_V11_1926920303 GENE:JGI_V11_2196580303 GENE:JGI_V11_1159220202
GENE:JGI_V11_3164760303 GENE:JGI_V11_2228010203 GENE:JGI_V11_2341670203
GENE:JGI_V11_1938910303 GENE:JGI_V11_3026230203 GENE:JGI_V11_2449230203
GENE:JGI_V11_3134270303 GENE:JGI_V11_2235750203 GENE:JGI_V11_1981410203
GENE:JGI_V11_3251310202 GENE:JGI_V11_977750103 GENE:JGI_V11_954070203
GENE:JGI_V11_2267320203 GENE:JGI_V11_2268000303 GENE:JGI_V11_2226270101
GENE:JGI_V11_3003640303 GENE:JGI_V11_223520203 GENE:JGI_V11_2662750103
GENE:JGI_V11_2228010103 GENE:JGI_V11_3251310202 GENE:JGI_V11_3198630203
GENE:JGI_V11_3134270303 GENE:JGI_V11_1926920203 GENE:JGI_V11_287750103
GENE:JGI_V11_465160203 GENE:JGI_V11_2268000203 GENE:JGI_V11_2473230303
GENE:JGI_V11_3192220102 GENE:JGI_V11_3026230303 GENE:JGI_V11_3039310303
GENE:JGI_V11_1926920103 GENE:JGI_V11_1159220102 GENE:JGI_V11_3052790202
GENE:JGI_V11_3075830303 GENE:JGI_V11_2196580203 GENE:JGI_V11_3134280203
GENE:JGI_V11_3142970303 GENE:JGI_V11_503720303 GENE:JGI_V11_2236410103
GENE:JGI_V11_3042230103 GENE:JGI_V11_2228010203 GENE:JGI_V11_3028210101
GENE:JGI_V11_2105710303 GENE:JGI_V11_1926920303 GENE:JGI_V11_2131620103
GENE:JGI_V11_1002840203 GENE:JGI_V11_2088480203 GENE:JGI_V11_3196120102
Heres the first 8 rows of the 12 columns. There are 21473 rows in total.
Thanks
You could use an array formula like this to count how many columns a particular gene probe occurs in
=SUM(--(MMULT(TRANSPOSE(ROW(A$2:L$10000)^0),N(A$2:L$10000=A2))>0))
This is a standard way of getting column totals for a 2D array - in this case an array of true/false values corresponding to instances of an array element being equal/unequal to A2.
It is rather a brute force approach - it needs ~120K multiplications for each row. If you copy the formula down for ~10K rows, there is a delay of ~100 seconds on my computer while Excel works out the results.
Must be entered as an array formula using CtrlShiftEnter
In this dummy data C is the only value that occurs in all 12 columns.
This question relates to the Schematiq add-in for Microsoft Excel.
Using =tbl.Lookup(table, columnsToSearch, valuesToFind, resultColumn, [defaultValue]) the values in the valuesToFind column have a consistent 3 characters to the left and then varying characters after (e.g. 908-123456 or 908-321654 - i.e. 908 is always consistent)
How can I tell the function to lookup the value based on the first 3 characters only? The expected answer should be the sum of the results of the above, i.e. 500 + 300 = 800
tbl.Lookup() works by looking for an exact match - this helps ensure it's fast but in this case it means you need an extra step to calculate a column of lookup values, something like this:
A2: =tbl.CalculateColumn(A1, "code", "x => LEFT(x, 3)", "startOfCode")
This will give you a new column that you can use for the columnsToSearch argument, however tbl.Lookup() also looks for just one match - it doesn't know how to combine values together if there is more than one matching row in the table, so I think you also need one more step to group your table by the first 3 chars of the code, like this:
A3: =tbl.Group(A2, "startOfCode", "amount")
Because tbl.Group() adds values together by default, this will give you a table with a row for each distinct value of startOfCode and the subtotal of amount for each of those values. Finally, you can do the lookup exactly as you requested, which for your input table will return 800:
A4: =tbl.Lookup(A3, "startOfCode", "908", "amount")
I am having Excel with Date time stamp at random row in B Column,
Ex: Where in between cell are Empty ( B2, B3..etc).
B1 : 15:13:48:335 2014/08/06
B27: 15:13:55:955 2014/08/06
B31: 15:14:16:005 2014/08/06 ...
I need to find the time difference between 2 consecutive entries Ex: B21-B1 and B31-B27 and so on.
If the values you've shown are actual datetimes, then they are numbers that seem to grow progressively larger as the rows increase.
To get the difference from B1 to B27,
=LARGE(B:B, 2)-LARGE(B:B, 3)
Format the result as time in any way you prefer.
For the difference from B27 to B31,
=LARGE(B:B, 1)-LARGE(B:B, 2)
When datetimes are actual datetimes and not text, the LARGE function can be used just like any other number.
If your values in column B are text, start by reverting them to proper datetimes. Use something like the following,
=DATEVALUE(RIGHT(B1, 5)&"/"&MID(B1, 14, 4))+TIMEVALUE(LEFT(B1, 8)&"."&MID(B1, 10,3))
Correct your data first; then worry about manipulating the numbers.
If the cells propreties are correct, Excel should be able do compute a difference between them without any problem.
Both of the cells containing the dates must be set with Date/Hour format, the cell containing the result of the difference can be (for instance) set to Standard. Then the difference will be a number (integer or float). For instance :
If the result is 3, it means 3 days, multiply it by 24 to have the
number of hours.
If the result is 3,6667, the integer part gives you
the number of full days, the float part gives you the number of
hours. 0.6667*24 = 16 hours.
Hope it helped