Spotfire- calculated column with row ratios based on condition - calculated-columns

I’m having trouble understanding if Spotfire allows for conditional computations between arbitrary rows containing numerical data repeated over data groups. I could not find anything to cue me onto a right solution.
Context (simplified): I have data from a sensor reporting state of a process and this data is grouped into bursts/groups representing a measurement taking several minutes each.
Within each burst the sensor is measuring a signal and if a predefined feature (signal shape) was detected the sensor outputs some calculated value, V quantifying this feature and also reports a RunTime at which this happened.
So in essence I have three columns: Burst number, a set of RTs within this burst and Values associated with these RTs.
I need to add a calculated column to do a ratio of Values for rows where RT is equal to a specific number, let’s say 1.89 and 2.76.
The high level logic would be:
If a Value exists at 1.89 Run Time and a Value exists at 2.76 Run Time then compute the ratio of these values. Repeat for every Burst.
I understand I can repeat the computation over groups using OVER operator but I’m struggling with logic within each group...
Any tips would be appreciated.
Many thanks!

The first thing you need to do here is apply an order to your dataset. I assume the sample data is complete and encompasses the cases in your real data, thus, we create a calculated column:
RowID() as [ROWID]
Once this is done, we can create a calculated column which will compute your ratio over it's respective groups. Just a note, your B4 example is incorrect compared to the other groups. That is, you have your numerator and denominator reversed.
If(([RT]=1.89) or ([RT]=2.76),[Value] / Max([Value]) OVER (Intersect([Burst],Previous([ROWID]))))
Breaking this down...
If(([RT]=1.89) or ([RT]=2.76), limits the rows to those where the RT = 1.89 or 2.76.
Next comes the evaluation if the above condition is TRUE
[Value] / Max([Value]) OVER (Intersect([Burst],Previous([ROWID])))) This takes the value for the row and divides it by the Max([Value]) over the grouping of [Burst] and AllPrevious([ROWID]). This is noted by the Intersect() function. So, the denominator will always be the previous value for the grouping. Note that Max() was a simple aggregate used, but any should do for this case since we are only expecting a single value. All Over() functions require and aggregate to limit the result set to a single row.
RESULTS

Related

How can i calculate the difference of two cells based on the values of a check-column?

I am struggling with finding a way to calculate the difference between two cells, if a criteria is matched on a check-column that I have. to further explain this, i have the dataset as below:
Example of the table
In this table, column A is where I have times in which a bus has performed a cycle, stopping at each bus stop. In column B I have distinguished between start of the cycle, half of the cycle and end of the cycle. In column C i want to calculate the full time in which a cycle is performed, in other words i want to calculate the difference between time (in column A) of the end of the cycle (when "End" in column B is met) and time of the start of the cycle (when "Start" in column B is met)
Please note that the number of stops is different from cycle to cycle as the bus does not always perform the same stops each cycle. Therefor this task becomes a bit complicated for me. And also the number of cycles is very large, that why I am trying to find an automated way to calculate the difference explained above based on the criteria that first parameter ofthe difference should be taken if "End" is met and second parameter should be taken if "Start" is met.
I have no clue on where to start with this, so until now i have gotten nowhere in trying a solution for myself.
Thank you for your time
Well maybe a Index and match calculation.
=(INDEX(Table1[Column1],MATCH(E10,Table1[Column2],0),1))-(INDEX(Table1[Column1],MATCH(E9,Table1[Column2],0),1))
Maybe SumIF?
=SUMIF(Table1[Column2],E10,Table1[Column1])-SUMIF(Table1[Column2],E9,Table1[Column1])
Regardless the result looks like this:
If the time is sequential, I like FILTER.
Something like: =IF(B2="stop", A2-MAX(FILTER(A$2:A2, B$2:B2="Start")), "")
Here you filter the previous rows in the data frame (make sure you freeze the right cells!) to only show those that also have "Start" in the row, then select the largest one (the MAX function) from the filter to subtract from the current time. If the current row is not a "Stop", we elect to leave the cell blank.

Get result of difference of two columns in a matrix in third column

Above is a matrix in PowerBI that displays the amount and target amount for various divisions. I want to calculate the different between actual amount and target amount and display it in a third column in the matrix. I tried creating a column that calculates the difference between these two columns in the data table. But the data displayed in the matrix is not what I wish for it to be. I want the matrix to display the difference between the target and actual amount based on the values in the matrix only (which are derived based on certain filters applied). I do not want to get the difference of all the values in the dataset.
I am completely new to PowerBI and any help for the same would be highly appreciated.
Simply, use calculate. The value is the result of the expression evaluated in a modified filter context. That means in your current filter/row context.
diff = calculate([Target] - [Amount])

Excel- Generating a set of numbers with normal distribution with MIN and MAX

I want to generate a single column of 6000 numbers with a normal distribution, with a mean of 30.15, standard deviation of 49.8, minium of -11.5, maximum 133.5.
I am a total newb at this so i tried to use the following formula in a cell and than just drag it down to cell 6000:
=NORMINV(RANDBETWEEN(-11.5,133.5)/100,30.15,49.8)
It returns a value but sometimes it returns #NUM! error. Thank you!
Unfortunately NORMINV expects a probability for the argument, which must be a value in the interval (0, 1). Any parameter outside that range will yield #NUM!.
What you're asking cannot be done directly with a normal distribution since that has no constraints on the minimum and maximum values.
One approach is to use a primary column to generate the normally distributed numbers, then filter out the ones you want in the adjacent column. But this will cause even the mean (let alone higher moments) to go off quite considerably due to your minimum and maximum values not being equidistant from the mean. You could get round this by recentering the distribution and adjusting afterwards.

How do I distribute a value over multiple cells evenly but under a maximum limit?

Example data with desired outcome that I need to calculate
I have 12 items of a certain current value. I have a 'soft' cap of $1,000,000 for these values. Some of the items fall above, and some below this cap level.
I have an amount of money (for this example $900,000) that I want to distribute amongst only the items that fall below the cap (in this example 6 items), with the aim of bringing the value of these items up to but not over the cap value.
If I distribute the $900,000 evenly over these 6 items (each receiving $150,000), you can see that items 2 and 9 would then be over the $1,000,000 cap. So items 2 and 9 should only receive $100,000 to raise their value to the cap, then the remaining 4 items would receive and equal share on the remaining pool of money ($700,000 / 4 = $175,000).
So I need a formula to check every item to see if it needs a distribution (i.e below the cap) and then portion/divide out the money pool as illustrated above in the desired distribution column.
Note: The pool of money to be distributed can change. Also the number of items below the cap can change. The cap value itself can change.
I am hoping to avoid VBA or Solver because the spreadsheet could be used on other people's computers.
Hopefully this makes sense. Thanks.
EDIT:
So far I have been able to get close by adding a helper column and using the following formula:
=IF(SUM($F$6:F14)=$D$23,0,E15*MIN(D15,($D$23-SUM($F$6:F14))/SUM(E15:$E$18)))
Working example when values are sorted.
This seems to work when the values are sorted in descending order, as shown in the example image above. But seems to break when the values are a bit more randomly assorted which is likely to happen (as in the original post).
Just to give you an idea of how the solver can be set up to do a capital budget model here is one, also shows the solver and its settings:

Spotfire: calculating difference between two rows in same column based on attributes in different columns

I am new to Spotfire and need help in getting the right expression for a calculated column.
My Data contains different subjects grouped in column ID. For every ID, Bodyweight was measured on different days. Days are given in column Day and stated as 1,2,3...
The last day is denoted by Last and Bodyweight measurements given in another column. Another column is present which is called Baseline. The Body weight measured is considered as baseline if the column contains a Y for that row.
I need to insert a calculated column, which will contain the difference between Body measurement measured on Day denoted Last and Body measurement marked by Y in column Baseline.
This should be done for every new ID. I am not able to figure this out. Could someone advise me on how to go about it?
Here is an example attached
So, the calculated column for Rita will give -4 (body weight at Last=56 and BodyWeight at baseline=56, so 52-56 =-4)
the sample data you provided is a little weird, particularly the [Day] column. if it's within your control, I suggest to use actual dates rather than a number/string here.
barring that, I was able to get your desired results, but it required two calculated columns: the first one will consolidate the [Day] and [Baseline] columns into a single column, and the second one contains your desired info.
column 1, which I called Day (int):
CASE
WHEN [Day]="Last" THEN 1000000
WHEN [Baseline]="Y" THEN -1000000
WHEN [Day]!="Last" THEN Integer([Day])
END
I picked a random high and low max to establish a chronological order. this will put 1000000 in place of "Last" (if you have any programs that are longer than one million days, you'll need to increase this number). the same for the [Baseline] column, but that value will be -1000000, which is presumably the lowest value you will ever see in this column. both of these are assumptions and may not work for your implementation. finally, in all other cases, the day number will be used.
column 2, which I called Diff:
Last([Weight]) OVER (Intersect([Name],LastNode([Day (int)]))) -
First([Weight]) OVER (Intersect([Name],FirstNode([Day (int)])))
the first line uses what's called an OVER expression to retrieve the first value for [Weight], ordered by [Day (int)], per [Name]. the second line gives the reverse of that, and so the difference is calculated as -4 (or whatever is appropriate).

Resources