Excel - consolidate with OR operation - excel

I have in a sheet a table like so:
+------+---------+
| Name | Boolean |
+------+---------+
| A | true |
| B | true |
| B | false |
| C | false |
| C | false |
| A | false |
+------+---------+
But in another sheet I need to consolidate it by having a column with unique name and another with the OR of all column Boolean matching the unique name.
Like so:
+------+---------+
| Name | Boolean |
+------+---------+
| A | true |
| B | true |
| C | false |
+------+---------+
I tried Consolidate (no OR operation) and also Index/Match combo, and I couldn't make it work with array-formulas.
I more or less succeed with an Pivot table, but the result wasn't much satisfactory.
There's a simple way to do this that I can't see?

Maybe try:
=COUNTIFS(Sheet1!A:A,A2,Sheet1!B:B,TRUE)>0

Related

Find cell address of value found in range

tl;dr In Google Sheets/Excel, how do I find the address of a cell with a specified value within a specified range where value may be in any row or column?
My best guess is
=CELL("address",LOOKUP("My search value", $search:$range))
but it doesn't work. When it finds a value at all, it returns the rightmost column every time, rather than the column of the cell it found.
I have a sheet of pretty, formatted tables that represent various concepts. Each table consists of
| Title |
+------+------+-------+------+------+-------+------+------+-------+
| Sub | Prop | Name | Sub | Prop | Name | Sub | Prop | Name |
+------+------+-------+------+------+-------+------+------+-------+
| Sub prop | value | Sub prop | value | Sub prop | value |
+------+------+-------+------+------+-------+------+------+-------+
| data | data | data | data | data | data | data | data | data |
| data | data | data | data | data | data | data | data | data |
⋮
I have 8 such tables of variable height arranged in a grid within the sheet 3 tables wide and 3 tables tall except the last column which has only 2 tables--see image. These fill the range C2:AI78.
Now I have a table off to the right consisting in AK2:AO11 of
| Table title | Table title address | ... |
+---------------+-----------------------+-----+
| Table 1 Title | | ... |
| Table 2 Title | | ... |
⋮
| Table 8 Title | | ... |
I want to fill out the Table title address column. (Would it be easier to do this manually for all of 8 values? Absolutely. Did I need to in order to write this question? Yes. But using static values is not the StackOverflow way, now, is it?)
Based on very limited Excel/Google Sheets experience, I believe I need to use CELL() and LOOKUP() for this.
=CELL("address",LOOKUP($AK4, $C$2:$AI$78))
This retrieves the wrong value. For AL4 (looking for value Death Wave), LOOKUP($AK4, $C$2:$AI$78) should retrieve cell C2 but it finds AI2 instead.
| Max Levels |
+------------------+---------------+----+--+----+
| UW | Table Address | | | |
+------------------+---------------+----+--+----+
| Death Wave | $AI$3 | 3 | | 15 |
| Poison Swamp | $AI$30 | | | |
| Smart Missiles | $AI$56 | | | |
| Black Hole | #N/A | 1 | | |
| Inner Land Mines | $AI$3 | | | |
| Chain Lightning | #N/A | | | |
| Golden Tower | $AI$3 | | | |
| Chrono Field | #N/A | 25 | | |
The error messages for the #N/A columns is
Did not find value '<Table Title>' in LOOKUP evaluation.
My expected table is
| Max Levels |
+------------------+---------------+----+--+----+
| UW | Table Address | | | |
+------------------+---------------+----+--+----+
| Death Wave | $C$2 | 3 | | 15 |
| Poison Swamp | $C$28 | | | |
| Smart Missiles | $C$54 | | | |
| Black Hole | $O$2 | 1 | | |
| Inner Land Mines | $O$28 | | | |
| Chain Lightning | $O$54 | | | |
| Golden Tower | $AA$2 | | | |
| Chrono Field | $AA$39 | 25 | | |
try:
=INDEX(ADDRESS(
VLOOKUP(A2:A3, SPLIT(FLATTEN(D2:F4&"​"&ROW(D2:F4)), "​"), 2, ),
VLOOKUP(A2:A3, SPLIT(FLATTEN(D2:F4&"​"&COLUMN(D2:F4)), "​"), 2, ), 4))
or if you want to create jump links:
=INDEX(LAMBDA(x, HYPERLINK("#gid=1273961649&range="&x, x))(ADDRESS(
VLOOKUP(A2:A3, SPLIT(FLATTEN(D2:F4&"​"&ROW(D2:F4)), "​"), 2, ),
VLOOKUP(A2:A3, SPLIT(FLATTEN(D2:F4&"​"&COLUMN(D2:F4)), "​"), 2, ), 4)))
Try this:
=QUERY(
FLATTEN(
ARRAYFORMULA(
IF(
C:AI=$AK4,
ADDRESS(ROW(C:AI), COLUMN(C:AI)),
""
)
)
), "
SELECT
Col1
WHERE
Col1<>''
"
, 0)
Basically, cast all cells in the search range to addresses if they equal the search term. Then flatten that 2D range and filter out non-nulls.

Excel count and sort rows satisfying multiple criteria across columns

I have a table that resembles something like the following:
+----+------+------+--+------+------+
| A | B | C | D| E | F |
+----+------+------+--+------+------+
| T1 | | TRUE | | | |
| T2 | TRUE | | | | |
| T3 | | TRUE | | TRUE | |
| T4 | TRUE | TRUE | | | |
| T5 | | | | | TRUE |
+----+------+------+--+------+------+
I want to rearrange rows according to the decreasing number of multiple activities each T supports i.e. more TRUE values a T has across columns, it comes at the top along with the count of TRUEs for each row.
How can I achieve that?

Excel - avoiding indirect

I have the following three columns A,B,C
+-----------------+---------------+--------------------------------------------------------+
| ExcelRangedName | Default Value | Formula Check |
+-----------------+---------------+--------------------------------------------------------+
| Name1 | | =IF(AND(ISBLANK(B2),ISBLANK(INDIRECT(A2))),"Yes","No") |
| Name2 | FALSE | … |
| Name3 | TRUE | … |
| Name4 | 0 | … |
| Name5 | Alpha | … |
| Name6 | | … |
| Name7 | | … |
| Name8 | | … |
| … | … | … |
+-----------------+---------------+--------------------------------------------------------+
The column "ExcelRangedName" has text which refers to a range name in other worksheets. I want to replace the formula to remove the indirect completely as it is slowing down my worksheet. I remember being able to do this using paste values trick but I completely forgot how. I want my formula to look (for the first row):
=IF(AND(ISBLANK(B2),ISBLANK(Name1)),"Yes","No")
Is this possible?

Filter filter criteria and then apply in countif statement in Excel

I have a table of filter criteria like this:
+----------+----------+------+------+------+
| Category | SpecName | Spec | Pass | Fail |
+----------+----------+------+------+------+
| A | S1 | 3 | | |
| A | S2 | 4 | | |
| B | S1 | 5 | | |
| C | S1 | 2 | | |
+----------+----------+------+------+------+
I have a table I want to apply the filter criteria to like this:
+----------+----+----+
| Category | S1 | S2 |
+----------+----+----+
| A | 5 | 3 |
| B | 4 | |
| A | 5 | 5 |
| C | 2 | |
| A | 2 | 6 |
+----------+----+----+
I want to fill the Pass and Fail columns in the filter criteria table with a count of items in second table with values >= the corresponding spec, like so.
+----------+----------+------+------+------+
| Category | SpecName | Spec | Pass | Fail |
+----------+----------+------+------+------+
| A | S1 | 3 | 1 | 2 |
| A | S2 | 4 | 1 | 2 |
| B | S1 | 5 | 0 | 1 |
| C | S1 | 2 | 1 | 0 |
+----------+----------+------+------+------+
Here are steps for how I might do it in a scripting language:
Filter first table to get all spec filter criteria for the Category on that row, as follows for the first row.
+----------+----------+------+
| Category | SpecName | Spec |
+----------+----------+------+
| A | S1 | 3 |
| A | S2 | 4 |
+----------+----------+------+
Copy table 2 to a variable iTable
+----------+----+----+
| Category | S1 | S2 |
+----------+----+----+
| A | 5 | 3 |
| B | 4 | |
| A | 5 | 5 |
| C | 2 | |
| A | 2 | 6 |
+----------+----+----+
For each spec name:
Find column in iTable with spec name
Filter spec name column in iTable by spec
After all filters applied, we would have:
+----------+----+----+
| Category | S1 | S2 |
+----------+----+----+
| A | 5 | 5 |
+----------+----+----+
Then just count the rows in iTable and assign to the cell in Pass column of the criteria table
Is this possible with Excel formulas?
If not, does anyone know how to do it with VBA?
Looking at an alternative layout for you spec criteria. Expand you columns to suit your need.
With each spec criteria being its own column life gets really easy. You just need to adjust your formula to match the number of criteria you have.
Based on the table at the end for layout, place the following formula in D3 and copy down as required.
=SUMPRODUCT(($G$2:$G$6=A3)*($H$2:$H$6>=B3)*($I$2:$I$6>=C3))
That will give you a count of passing all criteria. Its also a function that performs array like calcs. It could be repeated in the next column but in order to reduce dependency on array calculation and potentially speed things up depending on the amount of data to check, place the following in the top of the fail column and copy down as required:
=COUNTIF($G$2:$G$6,A3)-D3
Basically it subtracts the passes from the total count. This assumes you can only have PASS and FAIL as options.

PySpark getting distinct values over a wide range of columns

I have data with a large number of custom columns, the content of which I poorly understand. The columns are named evar1 to evar250. What I'd like to get is a single table with all distinct values, and a count how often these occur and the name of the column.
------------------------------------------------
| columnname | value | count |
|------------|-----------------------|---------|
| evar1 | en-GB | 7654321 |
| evar1 | en-US | 1234567 |
| evar2 | www.myclient.com | 123 |
| evar2 | app.myclient.com | 456 |
| ...
The best way I can think of doing this feels terrible, as I believe I have to read this data once per column (there are actually about 400 such columns.
i = 1
df_evars = None
while i <= 30:
colname = "evar" + str(i)
df_temp = df.groupBy(colname).agg(fn.count("*").alias("rows"))\
.withColumn("colName", fn.lit(colname))
if df_evars:
df_evars = df_evars.union(df_temp)
else:
df_evars = df_temp
display(df_evars)
Am I missing a better solution?
Update
This has been marked as a duplicate but the two responses IMO only solve part of my question.
I am looking at potentially very wide tables with potentially a large number of values. I need a simple way (ie. 3 columns that show the source column, the value and the count of the value in the source column.
The first of the responses only gives me an approximation of the number of distinct values. Which is pretty useless to me.
The second response seems less relevant than the first. To clarify, source data like this:
-----------------------
| evar1 | evar2 | ... |
|---------------|-----|
| A | A | ... |
| B | A | ... |
| B | B | ... |
| B | B | ... |
| ...
Should result in the output
--------------------------------
| columnname | value | count |
|------------|-------|---------|
| evar1 | A | 1 |
| evar1 | B | 3 |
| evar2 | A | 2 |
| evar2 | B | 2 |
| ...
Using melt borrowed from here:
from pyspark.sql.functions import col
melt(
df.select([col(c).cast("string") for c in df.columns]),
id_vars=[], value_vars=df.columns
).groupBy("variable", "value").count()
Adapted from the answer by user6910411.

Resources