Spotfire: calculating difference between two rows in same column based on attributes in different columns - calculated-columns

I am new to Spotfire and need help in getting the right expression for a calculated column.
My Data contains different subjects grouped in column ID. For every ID, Bodyweight was measured on different days. Days are given in column Day and stated as 1,2,3...
The last day is denoted by Last and Bodyweight measurements given in another column. Another column is present which is called Baseline. The Body weight measured is considered as baseline if the column contains a Y for that row.
I need to insert a calculated column, which will contain the difference between Body measurement measured on Day denoted Last and Body measurement marked by Y in column Baseline.
This should be done for every new ID. I am not able to figure this out. Could someone advise me on how to go about it?
Here is an example attached
So, the calculated column for Rita will give -4 (body weight at Last=56 and BodyWeight at baseline=56, so 52-56 =-4)

the sample data you provided is a little weird, particularly the [Day] column. if it's within your control, I suggest to use actual dates rather than a number/string here.
barring that, I was able to get your desired results, but it required two calculated columns: the first one will consolidate the [Day] and [Baseline] columns into a single column, and the second one contains your desired info.
column 1, which I called Day (int):
CASE
WHEN [Day]="Last" THEN 1000000
WHEN [Baseline]="Y" THEN -1000000
WHEN [Day]!="Last" THEN Integer([Day])
END
I picked a random high and low max to establish a chronological order. this will put 1000000 in place of "Last" (if you have any programs that are longer than one million days, you'll need to increase this number). the same for the [Baseline] column, but that value will be -1000000, which is presumably the lowest value you will ever see in this column. both of these are assumptions and may not work for your implementation. finally, in all other cases, the day number will be used.
column 2, which I called Diff:
Last([Weight]) OVER (Intersect([Name],LastNode([Day (int)]))) -
First([Weight]) OVER (Intersect([Name],FirstNode([Day (int)])))
the first line uses what's called an OVER expression to retrieve the first value for [Weight], ordered by [Day (int)], per [Name]. the second line gives the reverse of that, and so the difference is calculated as -4 (or whatever is appropriate).

Related

In Excel: How to built a graph with time as X, names as Y with multiples series?

I am looking for a way to make a specific graph in Excel and I can't find a solution in Excel or on the web.
I have data about an online training with people completing parts of a course at a certain time:
FullName
Course
TIME
Name-A
Part 1
23/03/2022 10:38
Name-A
Part 2
23/03/2022 12:07
Name-A
Part 3
23/03/2022 16:55
Name-B
Part 1
11/03/2022 15:14
Name-B
Part 2
22/03/2022 12:08
Name-B
Part 3
28/03/2022 16:06
Name-B
Part 4
30/03/2022 14:55
Name-B
Part 5
18/04/2022 08:13
Name-C
Part 1
11/04/2022 15:25
Name-C
Part 2
20/04/2022 13:50
I would like to have a specific graph of this data:
On the vertical axis: one row for each user' name: Name-A, Name-B and Name-C.
On the horizontal axis: continuous time (say, in days) From the minimum time in the table (or less) to the maximum (or more)
Series of plots for the data: Each part of the course (from Part 1 to Part 5 here) would be a series of dots of a specific color, placed on the right row (for a learner's name) above the corresponding time on the horizontal axis.
Do you have any idea on how it could be achieved?
All the best, R.S.
Edit: The table does not appear as in the preview so i try to add a screenshot:
Screenshot of the table
So one way to visualise this as mentioned in the comments is to create a separate series for each person and show passing each part of the course as a vertical step:
It's based very loosely on this but I've set each day in the date range as the x-coordinates and used a lookup to transform the data in H2
=RIGHT(XLOOKUP($G2+TIME(23,59,59),FILTER($C$2:$C$11,$A$2:$A$11=H$1),FILTER($B$2:$B$11,$A$2:$A$11=H$1),0,-1))+(COLUMN()-COLUMN($G$1))*10
pulled down and across to give
Explanation
The data for the graph has dates spanning the times in the raw data for its x-coordinates (column G). I generated it manually but could have used Sequence in Excel 365.
There are three columns of y-values, H to J, generating a separate series for each person. The three lines are initially spaced out by 10 units based on the column number. In the formula above, the raw data is filtered by the person's name so the headers in columns H, I or J match the names in column A in the raw data. Xlookup is used with 'next smallest' match so where the date in column G is greater or equal to the date/time in column C it will return the corresponding course from column B. Because column C actually contains date/times, I have added almost 24 hours when matching the date in column G to make sure that a match is found if the day is the same, regardless of time. In a case like Name-A, where three courses are completed in the same day, this will automatically select the last one (Part 3). Then I take the right-hand character of the course name (which is a digit in the sample data) and add it to the relative column number multiplied by 10. If there is no match, Xlookup returns zero so you just get the initial value for each series (10, 20 or 30), otherwise the result will be an increase by one unit each time a course is passed. If you couldn't assume the last character of the course name was a digit, you would need a lookup to assign a number to each course name.
The data is then plotted on a scatter graph with points joined by straight lines. I had to adjust the x-axis manually to make the range correct and the labelling clearer.
This could be done without Excel 365, probably using Aggregate to get the highest row number with a condition on the name and date.
EDIT
I could have achieved the same result much more easily using Countifs to find how many courses had been passed by a certain person by a certain date:
=COUNTIFS($A$2:$A$11,H$1,$C$2:$C$11,"<="&$G2+TIME(23,59,59))+(COLUMN()-COLUMN($G$1))*10
This wouldn't have needed Excel 365. If you needed to give different courses different weightings, you could do this with a sumproduct and a lookup, also fairly straightforward.

Excel Pivottable Subtract Two Columns

I am trying to figure out how to find net growth in a workforce in a pivotable. As of right now, I have a column that says status and consist of "Hire" and "Separation". I have these both in the pivotable as count of. For example, it could say 6 hire and 2 separation. However, when I do subtotal to see the net growth, it shows these two values added not subtracted. I need it to do subtotal of "hire" count - "separation" count.
Does anyone know how to do this? I know that inserting a calculated field will not work as you cannot calculate a field based of the value of another column
You can accomplish this a little differently, by not using subtotals. Instead, drag Status to your value field twice. The first will show you the number of hires and fires. Then, click the second Count > Value Field Settings > Show Values As. Choose Difference From in the dropdown. Base field is status and Base item is previous.
This will add a second column showing the Net Growth (or loss) of each grouping.

Excel Group rows based on time interval

I have the following excel result:
I want to group the above result in groups based on sessions i.e. if the time gap between two successive timestamps is greater than 5 minutes, it must be a new row.
For example :
I need some formula to achieve this. As I'm fairly new to Excel this is causing to be a major headache for me. Please help me, if anyone knows how to do it or at least point me in a direction.
Thanks a ton !!!
Judging by your screenshot, it appears your timestamps are actually text values. Text by default is usually left aligned where as numbers are right aligned. You seem to have a space at the end of your time stamp suggesting that it is probably left aligned and therefore text. You can test it with the following formula which will return TRUE if its text.
=ISTEXT(P2)
where P2 is one of your time stamps.
CONVERT TIMESTAMPS TO TIME
There are a variety of ways to do this. Some will depend on system settings. Take a look at the following functions as each might be useable depending on your system. The first two are a guarantee, the last two are more dependent on system settings.
DATE
TIME
DATEVALUE
TIMEVALUE
Something important to remember here is that in excel dates are integers counting the days since 1900/01/01 with that date being 1. Time is stored as a decimal and represents fraction/percentage of a day. 24:00:00 is not a valid time in excel though some functions may work with it.
So in order to convert your time stamp in P2 I used the following formula to pull out the date:
=DATE(LEFT(P2,4),MID(P2,FIND("-",P2)+1,2),MID(P2,FIND(" ",P2)-2,2))
Basically it goes into the text and strips out the individual numbers for Year, Month and Day.
To pull out the time, I could have done the same procedure but elected to demonstrate the TIMEVALUE method which is a little more robust than DATEVALUE and not a subjective to system settings as much. With the following formula I stripped out the whole time code (MINUS"UTC"):
=TIMEVALUE(TRIM(MID(P2,FIND(" ",P2)+1,FIND("UTC",P2)-FIND(" ",P2)-1)))
I also made an assumption that you are not mixing and matching UTC with other time zones which means it can be ignored. Now to get DATE and TIME all in one cell, you just need to add the two formulas together to get:
=DATE(LEFT(P2,4),MID(P2,FIND("-",P2)+1,2),MID(P2,FIND(" ",P2)-2,2))+TIMEVALUE(TRIM(MID(P2,FIND(" ",P2)+1,FIND("UTC",P2)-FIND(" ",P2)-1)))
In the example at the end, I placed that formula in Q2 and copied down
DELTA TIME
Since you want to break your groups out based on a time difference between individual entries, I used a helper column to store the time difference. In my example at the end I stored this difference in Column S. The first entry is blank as there is no time before it. I used the following formula in S3 and copied it downward.
=Q3-Q2
I applied the custom formatting of [h]:mm:ss to the cell to get it to display as shown.
FIND GROUP BREAK POINTS
In my example I am using helper column T to hold breakpoint flags. At a minimum, you will have two break points. Your first time entry and your last time entry. To make like simple I simply hard coded my first breakpoint flag in T2 as 1. Stating in T3, Three checks need to be made. If any of them are TRUE then the next flag needs to be added with a value increase by one. the three checks are:
Is this the last entry
Is the next time delta greater than 5 minutes (means end of a group)
Is this time delta greater than 5 minutes (means start of a group)
Based on those three checks I placed the following formula in T3 and copied down:
=IF(OR(S4="",S4>TIME(0,5,0),S3>TIME(0,5,0)),MAX($T$2:T2)+1,"")
Note the $ on the first part of the range for the MAX function. This will lock the start of the range while the formula gets copied down while the end of the range increases accordingly.
Also the row after the last time entry must be blank. IF it is not blank and has a set value in it, change the S4="" to S4="set value".
GENERATE TABLE
There are multiple ways to reference the flags and pull the corresponding times. a couple of formulas you can look into are:
INDEX / MATCH
LOOKUP
In this example I elected to use LOOKUP though I believe INDEX and MATCH are more appropriate and robust. For starters we want to generate a list of ODD number and EVEN numbers. These represent the start and end of the groups and correspond to the flags set in column T. One way to generate ODD and EVEN numbers as you copy down is:
=ROW(A1)*2-1 (ODD)
=ROW(A1)*2 (EVEN)
The next step is to find the generated number in Column T and then pull its corresponding timestamp in Column Q. I did this with the following formula in V2 and copied down.
=LOOKUP(ROW(A1)*2-1,T:T,Q:Q)
And in W2
=LOOKUP(ROW(A1)*2,T:T,Q:Q)

Plotting an upper envelope function in Excel

I have got data points of a sound sample from Audacity which I exported to a .txt file and imported in Excel. Is it possible to plot an upper envelope function in Excel?
(In the end I have to determine the reverberation time, so the time in which the loudness decreases with 60dB.)
For a decaying oscillation such as this or a damped pendulum, the decay envelope can be found by looking at the difference between each sampled reading. You then need to see where the is a change in gradient from +ve to -ve. (or zero on one side). To do this some logical operators are used.
Method:
Consider data starting at column A row 5 for time and Column B row 5 for first data value.
Create a column which has the value minus the previous value. [ C6=B6-B5 etc.]
Next do a column for the gradient change giving a "flag" of 1 for a positive to negative or zero inflection [D7= IF (AND(C7>0,C6<=0],1,0)
This should produce a column of data that corresponds to the peaks.
In the next columns use the flag to get the original co-ordinates to display
[E7 =IF (D7=1,A7,"")] For time and [F7 IF(D7=1,B7,"")]
Copy this data using "Value" only in the paste option to yet another column.
Filter this to exclude the null data.
Beware of aliasing in data set.
Not the most elegant of solutions (and some steps can be linked -for clarity shown separately) but it works.
Tech99m

Picking top 5 scores from a range

I run a small golf eclectic with excel. One of the things we have is a points system. I would like to get the 5 highest points scored over the season and have them ranked from 1 (being the highest points scored) to 5.
My knowledge of excel "sums" goes only a wee bit further than add and subtract.
Thanks!
If you don't want to change the order that they are presently in you can use the LARGE function. It returns the kth largest value.
Below is a great formula, if you drag it down it will automatically get the second, third and nth largest value from a table of data (in this example the data is between A1 to A10).
=LARGE(A1:A10,ROW(A1)-ROW($A$1)+1)
You can then match the values with names or corresponding data from the tables using the MATCH and INDEX functions. The example below would fetch the name for each value from the second column.
=INDEX($A$1:$B$10,MATCH(cell reference with score or value,$A$1:$B$10,2))
Play around with these formulas, they are very convenient for data m
If you have a column containing the scores, you could add a filter (Data->Filter I think) and sort descending.
Though, if you just have rows that are something like [Date][Person][Score] you'll need to go to another sheet and SUM the scores for each person then sort that... Unfortunately my Excel skills aren't up to par to pull a score for each person like that.
Given a list of numbers in A1 to A10, you can work out their 'Rank' relative to each other by using 'RANK'.
e.g.
RANK(A1,A1:A6,0)
RANK(cell, list of cells to check against, order)
For order, 0 = descending.
From there you can work out which one is first pragmatically.
If you have Excel 2007,
Check that your data is continuous, with no blank rows or columns. Click on your scores and then select 'Data - Filter'
Using the dropdown that the filter creates at the top of your scores column and select 'Number filters - Top ten'
A 'Top ten Autofilter' dialog will be displayed, reduce the show 10 to 5 and then click on OK.
For earlier versions of Excel add a RANK formula in a new column. Be careful as the scores need to be sorted, usually into descending order. If there are any ties, they will be given the same ranking number and the subsequent rank number will be incremented by the number of ties. (E.g. If there are two scores of 2, ranked as 5. The next score will be ranked as 7, not 6)
If you want to use the LARGE Function as described above, make sure you put the same range in the list for each of the LARGE functions. That is, change =LARGE(A1:A10,ROW(A1)-ROW($A$1)+1) to =LARGE(A$1:A$10,ROW(A1)-ROW($A$1)+1) or you will get some strange incorrect results

Resources