Divide two "Calculated Values" within Spotfire Graphical Table - spotfire

I have a Spotfire question: is it possible to divide two "calculated value" columns in a "graphical table"?
I have a Count([Type]) calculated value. I then limit the data within the second calculated value to arrive at a different Count([Type]).
I would like to divide the two in a third calculated value column.
For example:
Calculated value column 1:
Count([Type]) = 100 (NOT LIMITED)
Calculated value column 2:
Count([Type]) = 50 (Limited to [Type]="Good")
Now I would like to say 50/100 = 0.5 in the third calculated value column.
If it is possible to do this all within one calculated value column, that is even better. Graphical tables do not let you have IF statements in the custom expression; the only way is to limit data. So I am struggling, and any help is appreciated.

Graphical tables do allow IF() in custom expressions. To accomplish this, you are going to have to move your logic out of Limit Data Using Expression and into your expressions directly. Your three axis expressions should be:
Count([Type])
Count(If([Type]="Good",[Type]))
Count(If([Type]="Good",[Type])) / Count([Type])
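Count() ignores empty values, so Count(If([Type]="Good",[Type])) only counts the rows where the condition holds; for every other row the If returns (Empty) and the row is skipped.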
Data Set
+----+------+
| ID | Type |
+----+------+
| 1 | Good |
| 1 | Good |
| 1 | Good |
| 1 | Good |
| 1 | Good |
| 1 | Bad |
| 1 | Bad |
| 1 | Bad |
| 1 | Bad |
| 2 | Good |
| 2 | Good |
| 2 | Good |
| 2 | Good |
| 2 | Bad |
| 2 | Bad |
| 2 | Bad |
| 2 | Bad |
+----+------+
Results
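Assuming the graphical table's row axis is grouped by [ID], the arithmetic works out as follows: ID 1 has 9 rows, of which 5 are Good, so the three expressions give 9, 5 and 5/9 ≈ 0.56; ID 2 has 8 rows, of which 4 are Good, giving 8, 4 and 4/8 = 0.5.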

Related

Spark replicating rows with values of a column from different dataset

I am trying to replicate rows inside a dataset multiple times with different values for a column in Apache Spark. Let's say I have a dataset as follows:
Dataset A
| num | group |
| 1 | 2 |
| 3 | 5 |
Another dataset has different columns:
Dataset B
| id |
| 1 |
| 4 |
I would like to replicate the rows from Dataset A with the column values of Dataset B. You could say a join without any conditional criteria needs to be done. The resulting dataset should look like this:
| id | num | group |
| 1 | 1 | 2 |
| 1 | 3 | 5 |
| 4 | 1 | 2 |
| 4 | 3 | 5 |
Can anyone suggest how the above can be achieved? As per my understanding, a join requires a condition and columns to be matched between the two datasets.
What you want to do is called a Cartesian product, and df1.crossJoin(df2) will achieve it. But be careful with it, because it is a very heavy operation.
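For reference, a minimal PySpark sketch of the same idea (crossJoin is available under the same name in the Scala/Java Dataset API; the dataframe names here are illustrative):
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Recreate the two small datasets from the question
df_a = spark.createDataFrame([(1, 2), (3, 5)], ["num", "group"])
df_b = spark.createDataFrame([(1,), (4,)], ["id"])

# Pair every row of df_b with every row of df_a -- no join condition needed
df_b.crossJoin(df_a).show()
# +---+---+-----+
# | id|num|group|
# +---+---+-----+
# |  1|  1|    2|
# |  1|  3|    5|
# |  4|  1|    2|
# |  4|  3|    5|
# +---+---+-----+
# (row order is not guaranteed)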

Athena/Presto - UNNEST MAP to columns

Assume I have a table like this:
table: qa_list
id | question_id | question | answer |
---------+--------------+------------+-------------
1 | 100 | question1 | answer |
2 | 101 | question2 | answer |
3 | 102 | question3 | answer |
4 | ...
... | ...
and a query that gives the result below (since I couldn't find a direct way to transpose the table):
table: qa_map
id | qa_map
--------+---------
1 | {question1=answer,question2=answer,question3=answer, ....}
where qa_map is the result of a map_agg over an arbitrary number of questions and answers.
Is there a way to UNNEST qa_map into an arbitrary number of columns as shown below?
id | Question_1 | Answer_1 | Question_2 | Answer_2 | Question_3 | ....
---------+-------------+-----------+-------------+-----------+-------------+
1 | question | answer | question | answer | question | ....
AWS Athena/Presto-0.172
No, there is no way to write a query that results in a different number of columns depending on the data. The columns must be known before query execution starts. The map you have is as close as you are going to get.
If you include your motivation for wanting to do this there may be other ways we can help you achieve your end goal.

How to divide two cells based on match?

In table 1, I have:
+---+---+----+
| | A | B |
+---+---+----+
| 1 | A | 30 |
| 2 | B | 20 |
| 3 | C | 15 |
+---+---+----+
In table 2, I have:
+---+---+---+----+
| | A | B | C |
+---+---+---+----+
| 1 | A | 2 | 15 |
| 2 | A | 5 | 6 |
| 3 | B | 4 | 5 |
+---+---+---+----+
I want the number in the second column of table 2 to divide the number from table 1, matched on the letter, with the result going in the third column. The values already shown in the third column above are the results needed (30/2 = 15, 30/5 = 6, 20/4 = 5). What formula must I apply in the third column of table 2?
Thanks in advance.
You can use a VLOOKUP() formula to go and get the dividend (assuming table 1 is on Sheet1 and table 2 is on Sheet2, where we are entering this formula):
=VLOOKUP(A1,Sheet1!A:B, 2, FALSE)/Sheet2!B1
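For the first row of table 2, for example, the VLOOKUP returns 30 (the table 1 value for "A"), and 30/2 gives the expected 15.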
Since you mention tables: with structured references (though it seems you are not applying those here), the equivalent would be:
=VLOOKUP([#Column1],Table1[#All],2,0)/[#Column2]

PySpark getting distinct values over a wide range of columns

I have data with a large number of custom columns whose content I poorly understand. The columns are named evar1 to evar250. What I'd like to get is a single table with all distinct values, a count of how often they occur, and the name of the column they come from.
------------------------------------------------
| columnname | value | count |
|------------|-----------------------|---------|
| evar1 | en-GB | 7654321 |
| evar1 | en-US | 1234567 |
| evar2 | www.myclient.com | 123 |
| evar2 | app.myclient.com | 456 |
| ...
The best way I can think of doing this feels terrible, as I believe I have to read the data once per column (there are actually about 400 such columns).
import pyspark.sql.functions as fn

i = 1
df_evars = None
while i <= 30:
    colname = "evar" + str(i)
    # One full pass over the data per column
    df_temp = df.groupBy(colname).agg(fn.count("*").alias("rows")) \
        .withColumn("colName", fn.lit(colname))
    if df_evars:
        df_evars = df_evars.union(df_temp)
    else:
        df_evars = df_temp
    i += 1
display(df_evars)
Am I missing a better solution?
Update
This has been marked as a duplicate but the two responses IMO only solve part of my question.
I am looking at potentially very wide tables with potentially a large number of values. I need a simple way (i.e. 3 columns that show the source column, the value and the count of that value in the source column).
The first of the responses only gives me an approximation of the number of distinct values, which is pretty useless to me.
The second response seems less relevant than the first. To clarify, source data like this:
-----------------------
| evar1 | evar2 | ... |
|-------|-------|-----|
| A | A | ... |
| B | A | ... |
| B | B | ... |
| B | B | ... |
| ...
should result in the output:
--------------------------------
| columnname | value | count |
|------------|-------|---------|
| evar1 | A | 1 |
| evar1 | B | 3 |
| evar2 | A | 2 |
| evar2 | B | 2 |
| ...
Using melt borrowed from here:
from pyspark.sql.functions import col

melt(
    df.select([col(c).cast("string") for c in df.columns]),
    id_vars=[], value_vars=df.columns
).groupBy("variable", "value").count()
Adapted from the answer by user6910411.
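The melt helper itself is not a built-in PySpark function; it comes from the linked answer. For completeness, a minimal sketch of such a helper (argument and column names are illustrative) looks roughly like this:
from pyspark.sql.functions import array, col, explode, lit, struct

def melt(df, id_vars, value_vars, var_name="variable", value_name="value"):
    # Build one struct per melted column: (column name, column value)
    vars_and_vals = array(*(
        struct(lit(c).alias(var_name), col(c).alias(value_name))
        for c in value_vars
    ))
    # Explode so each input row yields one output row per melted column
    tmp = df.withColumn("_vars_and_vals", explode(vars_and_vals))
    cols = id_vars + [
        col("_vars_and_vals")[x].alias(x) for x in [var_name, value_name]
    ]
    return tmp.select(*cols)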

Excel 2010 Calculating Production line quantities without long calculations

Program: Excel 2010
Requirements: Prefer no VBA (Macro free book)
I am creating a spreadsheet to calculate the items required for components (parts). I have a list of the products and, underneath each, the number of specific parts required. I have a calculation which tells me the total parts needed, but is there a better way?
=($C$32*C34)+($D$32*D34)+($E$32*E34)+($F$32*F34)+($G$32*G34)+($H$32*H34)+($I$32*I34)+($J$32*J34)+($K$32*K34)
| A | B | C | D | E | F |
| Making: | | 2 | 2 | 2 | |
|---------------|-------|------------|-------------|-----------------|---------|
| Item -> | Total | Small raft | Rowing boat | Sm sailing boat | Corbita |
| | | | | | |
| Planks | 20 | 4 | 6 | | |
| Logs | 8 | 4 | | | |
| Nails - Large | 16 | 8 | | | |
| Oars | | | | | |
In the above, you can see that ($C$32*C34) = 8 and ($D$32*D34) = 12, so 8 + 12 = 20 (B34, the Planks total).
Is there an easier way of doing this, or will my equation just keep getting bigger?
Thanks in advance.
As chris neilsen mentioned in his comment, you can use the SUMPRODUCT function in Excel. The formula in your cell B34 (total planks) should look like this:
=SUMPRODUCT(C32:K32,C34:K34)
This has the effect of multiplying the corresponding components in the given ranges (C32 * C34, D32 * D34, etc.) and then returning the sum of those products/multiplications.
As you add more columns, change K to the last column of the range you want to add up, in both ranges.
