Subtract one cell from the cell above in Spotfire using multiple columns - calculated-columns

How would I create a calculated column that would find the thickness of a cell that is the difference from one cell with the cell above it, but want to do this for many individual wells that have unique identifiers in a column. Additional, the last cell of that well will not have a value because there is not cell below it. On top of that I would like to play with the option of calculating this from deepest to shallowest, or shallowest to deepest so c in a column, and organized by ranking scheme (ascending or descending).
See image attached

Use a calculated column with an OVER statement (provided you're not using an in-database data connection). It will look like this:
Sum([Depth]) OVER Intersect([Well],Next([Depth])) - [Depth]
It will return the same result regardless of the row order because the Next function has built-in alphanumeric sorting which it performs on [Depth].

Related

VBA code required to loop through different sized rows of data and return MAX value

I am currently automating a dashboard creation and I've hit a bit of a roadblock. I need some code that will go through about 7000 rows of data and return the highest value in a certain column for each specific item. The data is copied from a pivot table and so is broken down into row sections, I have attached a mock of what it looks like.
I need the highest value in Column G for each portfolio, and will need to use the portfolio code (e.g. XY12345 - They are always 7 characters) to map that value to the dashboard.
My issue is, each portfolio has a different number of rows for the values, and some have blank cells between them, and therefore I am stumped. I was hoping to use Column J to count the number of rows for each portfolio (as there are no breaks for the portfolios in this column) and then use a loop to loop through each portfolios rows of values, based off the Column J count, and then return the highest row value for each portfolio. Problem is I'm new to VBA and have been teaching myself as I go, and I've yet to use a loop.
Many thanks,
Harry
If I understand correctly, you're looking for the largest value in Column G.
I'm not sure why you think you would need VBA for this.
Get the maximum value of a column
You mentioned that you're concerned about each column not having the same number of cells but that's irrelevant. as SUM ignores blank cells, so just "go long", or - find the maximum of the entire column.
To return the largest number in Column G you could use worksheet formula :
=MAX(G:G)
The only catch is that you can't place that formula anywhere column G or else it would create a circular cell reference (trying to infinitely add a number to itself). let's pit that formula in cell F1 for now (but anywhere besides column G would do fine).
Find the location of a value
Now that you know the largest value, you can determine where it is using a lookup function such as MATCH or VLOOKUP. Like with so many things in Excel, there are several ways to accomplish the same thing. I'll go with MATCH.
Replace the formula from above (in F1) with:
=MATCH(MAX(G:G),G:G,0)
This will return the row number of the first exact match of the maximum value of Column G.
As for the third part of question: returning the code like X12345 where the value exists, will be a little tricky since your data is not organized in a logical tabular style (tabular meaning, "like a table").
Your data is organized for humans to look at, not for machines to easily read and manipulate it. (See: Office Support: Guidelines for organizing and formatting data on a worksheet)
Basically, when organizing data in rows, all relevant information should be on the same row (not a subjective number of rows behind). Also, you have the number combined with other information.
My suggestion for a quick fix:
Right-click the heading of Column C and choose Insert to insert a blank column.
In C2 enter formula: =IF(B2="",C1,LEFT(B2,7))
Copy cell C2
Select cells in column C all the way to the "end" of your data, where ever that is (not the end of the worksheet). For example maybe you would select cells B2:B1000)
Paste the copied cell into all those cells.
Now, you can again modify the formula in F1:
=INDEX(C:C,MATCH(MAX(G:G),G:G,0))
This will return the value from Column C in the same row that the maximum value of Column G is located.
This is known as an INDEX/MATCH formula.
Hopefully this works for you in the interim until you can organize your data more logically. There's lots of related information and tutorials online.

how to calculate the means of 100s of subgroups in excel

I have a spreadsheet with ~8000 records, there are ~400 unique identifiers (i.e. element 101, 102, 103....500) that I need to calculated means for. Is there a simple way to calculate means on large datasets like this?? Or will I have to do =average('select column block') for each subgroup/unique identifier?
Many Thanks
Use the following formula
=AVERAGEIF($A$1:$A$8000,"=IDNUMBER",$B$1:$B$8000)
Where
Column A is your column of ID numbers
Column B is your list that you need the mean from.
If your ID numbers are sequential, you can set up something like:
=AVERAGEIF($A$1:$A$8000,"="&100+row(A1),$B$1:B8000)
And copy that down from say C1 to C500
Alternatively you could make a list of the unique identifiers with another formula and place that unique list in C1 to C500 and then in column D use the following:
=AVERAGEIF($A$1:$A$8000,C1,$B$1:$B$8000)
If you have a header row you will need to adjust your ranges accordingly
The formula to generate a unique list of IDs is:
=INDEX($A$2:$A$8001,MATCH(0,INDEX(COUNTIF($C$1:C1,$A$2:$A$8001),0,0),0))
Use that in column C but in row 2 and copy down. So if your data starts in row 1 you will want to bump it down 1 row.
Create a pivot table with the unique identifiers in the rows and calculate the average of the values.
For data that is clustered up nicely and immediately ready to be handed off for a visual review of the averages try a creating a Subtotal:
Select your data
Go to Data > subtotal (far right on the tab)
On the menu popup in the At each change in field, select the column header name that corresponds to your unique identifier.
Select Average for Use function. Select the checkbox of the column for which you want to find the group's mean.
Select other formatting features if desired (defaults typically work best)
Click okay.
Take a sip of coffee and let the magic happen.

For dates with missing data, take average of values from closest dates

I have sporadically collected data for stream nitrogen levels. For dates where nitrogen levels are not available, I'm hoping I can use an Excel formula that will estimate the value by calculating the average of nitrogen levels from the two closest dates.Note that nitrogen is not correlated with flow (or other water quality parameters) and there is no trend so I can't use a regression equation to estimate values.
Below is an example of what the table looks like for columns A, B, and C, and rows 1 (header) to 7. Column B contains the real data (note that I inserted 'no value' for illustrative purposes). Column C is what I would like to have calculated. Because I have thousands of rows of data, manually entering each of these formulas is not an option.
Date_______Actual Value______Desired Calculation
1/1/2012_______0.15__________=B2
1/2/2012_____[no value]_______=AVERAGE(B2,B5)
1/3/2012_____[no value]_______=AVERAGE(B2,B5)
1/4/2012_______0.17__________=B5
1/5/2012_____[no value]_______=AVERAGE(B5,B7)
1/6/2012_______0.23__________=B7
=IF(B1<>"",B1,VLOOKUP(MAX($A$1:$A$4),$A$1:$B$4,2,0))
Misread the question the first time.
Test to see if B1 is equal to "" or not. If it is empty then perform the VLOOKUP searching for the max value of column A (most recent date), and then use the value for the adjacent column (2).
This will not cover the case where the most recent Date has no value in column B. I will return "" in that case.
TREND Option
Based on your comments and the use of some helper cells you could do the following. Make a small table off to the right somewhere of just your known dates and your known values from B. We can populate this table using the following formulas. Let arbitrarily put this table in columns F and G.
in F2 use:
=IFERROR(INDEX($A$2:$A$1564,AGGREGATE(15,6,(ROW($B$2:$B$1564)-ROW($A$2)+1)/($B$2:$B$1564<>""),ROWS(B$2:B2))),"")
This will pull the dates in order that they appear with values in B that are not "". Then in G2 put the following formula in which will pull the associated value from B
=IFERROR(INDEX($B$2:$B$1564,AGGREGATE(15,6,(ROW($B$2:$B$1564)-ROW($B$2)+1)/($B$2:$B$1564<>""),ROWS(B$2:B2))),"")
Both those formulas are currently set up to work with a header row and data starting in row 2.
once we have the helper table set up, you can set up a TREND function, which is a line of best fit through all your known points. You can pick the point on this line for dates with no value in B. In C2 you would add:
=IF(B2="",TREND($G$2:$G$3,$F$2:$F$3,A2),B2)
you would copy this formula down. It checks to see if B is blank. If it is then it pulls a value for B from the trend line. If it is not blank, then it will use the value in B.
AVERAGE Option
Again using the helper table in the same spot. This one gets a little uglier. In C2 place the following and copy down
=IF(B2<>"",B2,IF($A2<$F$2,$G$2,IF($A2>$F$3,$G$3,(INDEX($G$2:$G$3,IFERROR(MATCH($A2,$F$2:$F$3,1),1))+INDEX($G$2:$G$3,IFERROR(MATCH($A2,$F$2:$F$3,1)+1,2)))/2)))
The first thing it does is check for a blank value in B. If there is no blank then it uses the value in B2. If there is no value in B2 it then proceeds to check the date of the blank against the first date in your helper table. If it is less than that first value, it uses the first value of B since there is nothing to average it with. I then proceeds to do the same to see if the date exceeds the last value in the table. (this is currently hard coded). If it exceeds the last date in the table it uses the last value for B. the last option for the date is for it to be in the table. It take the average between the date prior to the date and question and the next date with a value.
Optional version where you dont have to hard code the last values of helper table, but you still need to hard code the range of the helper table to meet your needs.
=IF(B3<>"",B3,IF($A3<$F$2,$G$2,IF($A3>MAX($F$2:$F$3),VLOOKUP(MAX($F$2:$F$3),$F$2:$G$3,2,0),(INDEX($G$2:$G$3,IFERROR(MATCH($A3,$F$2:$F$3,1),1))+INDEX($G$2:$G$3,IFERROR(MATCH($A3,$F$2:$F$3,1)+1,2)))/2)))
TREND Between only two points
So I adjusted the formula a bit on this one. Its kind of a mix of the trend and previous options listed above. I also made it automatically selecting the end values of your table, so now you just have to copy your table down as far as you need. Its also works on the premise that only your data value in B are present, no numbers below your last row.
=IF($A2<$F$2,$G$2,IF($A2>=MAX(OFFSET($F$2,0,0,COUNT(B:B),1)),VLOOKUP(MAX(OFFSET($F$2,0,0,COUNT(B:B),1)),OFFSET($F$2,0,0,COUNT(B:B),2),2,0),TREND(OFFSET($F$2,IFERROR(MATCH($A2,OFFSET($F$2,0,0,COUNT(B:B),1),1),0)-1,1,2,1),OFFSET($F$2,IFERROR(MATCH($A2,OFFSET($F$2,0,0,COUNT(B:B),1),1)-1,0),0,2,1),$A2)))
So basically TREND function is a line of best fit through known points. if your know points do not all sit in a straight line it figures out what straight line passes near them minimizing the distance they are from the line. If we limit the points it using to make its line to only 2 points, then the line has to pass through those two points. Remember, the helper table is a list of all known points.
So basically what the formula doing is checking to see if the date is less than or equal to the first known point on the helper table. If it is, use the value from the first known point in the table. If its not, check to see if the date is greater than the last date in the table, if it is lookup the last known date in the table and return the corresponding table. If neither of these cases is true then the date must be in the table. That means we can get two points from the table and their corresponding values to use in the trend function. We feed the TREND function a an array of 2 known X and 2 Know Y values, and then supply it with an X value we are interested in. In this case X would be the dates, and Y would be the values in the second column.
Revised Helper table
In F2 use this formula:
=IFERROR(INDEX(OFFSET($A$2,0,0,COUNT(A:A),1),AGGREGATE(15,6,(ROW(OFFSET($A$2,0,1,COUNT(A:A),1))-ROW($A$2)+1)/(OFFSET($A$2,0,1,COUNT(A:A),1)<>""),ROWS(B$2:B2))),"")
and G2 use this formula:
=IF(F2="","",INDEX(OFFSET($A$2,0,1,COUNT(A:A),1),MATCH(F2,OFFSET($A$2,0,0,COUNT(A:A),1),0)))
and if you want to verify how far down you want to copy your helper table, you can put this in lets say I3:
=COUNT(B:B)+1
Now these update that are using the count(A:A) or B:B require that no other numbers be above or below your data in the A or B column.
UPDATED Average
I also tweaked the average formula to give this option as well.
=IF($B2<>"",$B2,IF($A2<=$F$2,$G$2,IF($A2>=MAX(OFFSET($F$2,0,0,COUNT($B:$B),1)),VLOOKUP(MAX(OFFSET($F$2,0,0,COUNT($B:$B),1)),OFFSET($F$2,0,0,COUNT($B:$B),2),2,0),AVERAGE(OFFSET($F$2,IFERROR(MATCH($A2,OFFSET($F$2,0,0,COUNT($B:$B),1),1),0)-1,1,2,1)))))
Example results (note rows 14-69 have been hidden so you can see what is happening around row 75.

List top 5 scores AND names associated with those scores

My Numbers spreadsheet has a column of summed scores (AA) from a gymnastics meet, the All-Around scores. Each score in this column is associated with a specific gymnast name in another column (Name).
What I'm trying to do is populate a smaller spreadsheet that will parse the top 5 (or more) scores AND names associated with those scores into two columns - one for names, the other for scores.
This should be something very simple, but I haven't ever played with Excel or Numbers and am really struggling to make sense of all the syntax and formulas.
I've created a link to an image of the columns in question for reference.
https://www.dropbox.com/s/5w819hz1iyhpdde/Scores.jpg?dl=0
Create a pivot table and the set the Top 10 (you can set later to any number)
More info:
http://www.contextures.com/excel-pivot-table-filters-top10.html
This can be broken down into three steps.
Let's say you need the 5th largest element.
Use Large to get the 5th largest element. LARGE(b:b, 5)
The above returns the 5th largest value in the b:b range. But you need to know which row number this is in the B column. For that we use MATCH. Use, MATCH(lookup value, range, 0) where lookup value is the one returned in the first step, range is again B:B and 0 means exact match.
We need to find the appropriate name associated with the largest value. We use INDEX, which finds the value at a particular index value. Use, INDEX(name range, row_number) where name range is say A:A and row_number is the one returned in the previous step.
To sum it up, you need:
INDEX(A:A, MATCH(LARGE(B:B,5),B:B,0)) to get the 5th largest name. Change 5 to whatever rank you want!

Lookup values in an unknown range - Excel

Is there any way to look up and display the latest value in a group that does not have a set range?
Here's a breakdown of a question. I am working with some data that I like to break into data groups that can be collapsed. On a weekly basis I un-collapse the group and insert a line at the bottom with the new information. I would like that latest bit of information to be automatically displayed on the top cells that is displayed when collapsed.
I have used look-up tables and I have a function that I used for testing purposes: =LOOKUP(2,1/(C5:C11<>""),C5:C11) that obtains the last cell within that designated ranged: C5-C11.
Now can I do something similar within grouped data values that have no defined range?
One way to do this is with an index function, assuming that there are no other rows of data on this worksheet other than the group that you're trying to find. Here's an example of a formula that could work for you. It assumes that your data starts in K2 and you want to have your result in K1.
=INDEX(OFFSET(K2,,,ROWS(K:K)-ROW(K2),1),MATCH(9.99999999999999E+307,
OFFSET(K2,,,ROWS(K:K)-ROW(K2),1)),1)
The offset creates an array for all cells below k1 without counting k1. The match function searches for the largest number possible, this formula assumes that you're looking for numbers. If you're looking for text, or a combination, you'll have to use a different formula by replacing the match portion with:
MATCH(REPT("z",255) 'for text
MAX(MATCH(9.99999999999999E+307,range),MATCH(REPT("z",255) 'for numbers and text
Source: http://www.techonthenet.com/excel/questions/last_value.php

Resources