Insert data if previous column contains data - excel

I'm trying to think of a way where I can have a Powershell script check an Excel or CSV file, read the last column and then insert a static data for each row that has data.
For example this is the original file:
| Id | Price | Description |
|-----|---------|---------------------------------|
| 33 | 878.35 | C2 Flush Casement |
| 52 | 1111.69 | DS2-L101-T102_L |
| 71 | 875.75 | C2 Flush Casement |
I would like to add the Quote number (Which I can get from the file name):
| Id | Price | Description | Quote |
|----|---------|-------------------|-------|
| 33 | 878.35 | C2 Flush Casement | Q1234 |
| 52 | 1111.69 | DS2-L101-T102_L | Q1234 |
| 71 | 875.75 | C2 Flush Casement | Q1234 |
Any pointers would be appreciated.

You can use Import-Csv to read and parse the data, then create the new column with Select-Object and write to a new CSV file with Export-Csv:
Import-Csv path\to\file.csv |Select-Object *,#{Name='Quote';Expression={if($_.Description -ne ''){'Q1234'}}} |Export-Csv path\to\output.csv -NoTypeInformation

Related

Find cell address of value found in range

tl;dr In Google Sheets/Excel, how do I find the address of a cell with a specified value within a specified range where value may be in any row or column?
My best guess is
=CELL("address",LOOKUP("My search value", $search:$range))
but it doesn't work. When it finds a value at all, it returns the rightmost column every time, rather than the column of the cell it found.
I have a sheet of pretty, formatted tables that represent various concepts. Each table consists of
| Title |
+------+------+-------+------+------+-------+------+------+-------+
| Sub | Prop | Name | Sub | Prop | Name | Sub | Prop | Name |
+------+------+-------+------+------+-------+------+------+-------+
| Sub prop | value | Sub prop | value | Sub prop | value |
+------+------+-------+------+------+-------+------+------+-------+
| data | data | data | data | data | data | data | data | data |
| data | data | data | data | data | data | data | data | data |
⋮
I have 8 such tables of variable height arranged in a grid within the sheet 3 tables wide and 3 tables tall except the last column which has only 2 tables--see image. These fill the range C2:AI78.
Now I have a table off to the right consisting in AK2:AO11 of
| Table title | Table title address | ... |
+---------------+-----------------------+-----+
| Table 1 Title | | ... |
| Table 2 Title | | ... |
⋮
| Table 8 Title | | ... |
I want to fill out the Table title address column. (Would it be easier to do this manually for all of 8 values? Absolutely. Did I need to in order to write this question? Yes. But using static values is not the StackOverflow way, now, is it?)
Based on very limited Excel/Google Sheets experience, I believe I need to use CELL() and LOOKUP() for this.
=CELL("address",LOOKUP($AK4, $C$2:$AI$78))
This retrieves the wrong value. For AL4 (looking for value Death Wave), LOOKUP($AK4, $C$2:$AI$78) should retrieve cell C2 but it finds AI2 instead.
| Max Levels |
+------------------+---------------+----+--+----+
| UW | Table Address | | | |
+------------------+---------------+----+--+----+
| Death Wave | $AI$3 | 3 | | 15 |
| Poison Swamp | $AI$30 | | | |
| Smart Missiles | $AI$56 | | | |
| Black Hole | #N/A | 1 | | |
| Inner Land Mines | $AI$3 | | | |
| Chain Lightning | #N/A | | | |
| Golden Tower | $AI$3 | | | |
| Chrono Field | #N/A | 25 | | |
The error messages for the #N/A columns is
Did not find value '<Table Title>' in LOOKUP evaluation.
My expected table is
| Max Levels |
+------------------+---------------+----+--+----+
| UW | Table Address | | | |
+------------------+---------------+----+--+----+
| Death Wave | $C$2 | 3 | | 15 |
| Poison Swamp | $C$28 | | | |
| Smart Missiles | $C$54 | | | |
| Black Hole | $O$2 | 1 | | |
| Inner Land Mines | $O$28 | | | |
| Chain Lightning | $O$54 | | | |
| Golden Tower | $AA$2 | | | |
| Chrono Field | $AA$39 | 25 | | |
try:
=INDEX(ADDRESS(
VLOOKUP(A2:A3, SPLIT(FLATTEN(D2:F4&"​"&ROW(D2:F4)), "​"), 2, ),
VLOOKUP(A2:A3, SPLIT(FLATTEN(D2:F4&"​"&COLUMN(D2:F4)), "​"), 2, ), 4))
or if you want to create jump links:
=INDEX(LAMBDA(x, HYPERLINK("#gid=1273961649&range="&x, x))(ADDRESS(
VLOOKUP(A2:A3, SPLIT(FLATTEN(D2:F4&"​"&ROW(D2:F4)), "​"), 2, ),
VLOOKUP(A2:A3, SPLIT(FLATTEN(D2:F4&"​"&COLUMN(D2:F4)), "​"), 2, ), 4)))
Try this:
=QUERY(
FLATTEN(
ARRAYFORMULA(
IF(
C:AI=$AK4,
ADDRESS(ROW(C:AI), COLUMN(C:AI)),
""
)
)
), "
SELECT
Col1
WHERE
Col1<>''
"
, 0)
Basically, cast all cells in the search range to addresses if they equal the search term. Then flatten that 2D range and filter out non-nulls.

Explode a dataframe column of csv text into columns

+-------------+-------------------+--------------------+----------------------+
|serial_number| test_date| s3_path| table_csv_data |
+-------------+-------------------+--------------------+----------------------+
| 1050D1B0|2019-05-07 15:41:11|s3://test-bucket-...|col1,col2,col3,col4...|
| 1050D1B0|2019-05-07 15:41:11|s3://test-bucket-...|col1,col2,col3,col4...|
| 1050D1BE|2019-05-08 09:26:55|s3://test-bucket-...|col1,col2,col3,col4...|
| A0123456|2019-07-25 06:54:28|s3://test-bucket-...|col1,col2,col3,col4...|
| A0123456|2019-07-22 21:07:21|s3://test-bucket-...|col1,col2,col3,col4...|
| A0123456|2019-07-22 21:07:21|s3://test-bucket-...|col1,col2,col3,col4...|
| A0123456|2019-07-25 00:19:52|s3://test-bucket-...|col1,col2,col3,col4...|
| A0123456|2019-07-24 22:24:40|s3://test-bucket-...|col1,col2,col3,col4...|
| A0123456|2019-09-12 22:15:19|s3://test-bucket-...|col1,col2,col3,col4...|
| A0123456|2019-07-22 21:27:56|s3://test-bucket-...|col1,col2,col3,col4...|
+-------------+-------------------+--------------------+----------------------+
sample table_csv_data column contains:
timestamp,partition,offset,key,value
1625218801350,97,33009,2CKXTKAT_20210701193302_6400_UCMP,458969040
1625218801349,41,33018,3FGW9S6T_20210701193210_6400_UCMP,17569160
Trying to achieve the final dataframe as below, please help
+-------------+-------------------+--------------------+-----------------+-----------+-----------------------------------+--------------+
|serial_number| test_date| timestamp| partition | offset | key | value |
+-------------+-------------------+--------------------+-----------------+-----------+-----------------------------------+--------------+
| 1050D1B0|2019-05-07 15:41:11| 1625218801350 | 97 | 33009 | 2CKXTKAT_20210701193302_6400_UCMP | 458969040 |
| 1050D1B0|2019-05-07 15:41:11| 1625218801349 | 41 | 33018 | 3FGW9S6T_20210701193210_6400_UCMP | 17569160 |
..
..
..
I cannot think of an approach, kindly help with some suggestions.
As I alternative, I converted the sting csv data into list using csv_reader as well as below, but post that I have been blocked
[[timestamp,partition,offset,key,value],
[1625218801350, 97, 33009, 2CKXTKAT_20210701193302_6400_UCMP, 458969040]
[1625218801349, 41,33018, 3FGW9S6T_20210701193210_6400_UCMP, 17569160]]
You just need to use split :
from pyspark.sql import functions as F
df = df.withColumn("table_csv_data", F.split("table_csv_data", ",")).select(
"serial_number",
"test_date",
F.col("table_csv_data").getItem(0).alias("timestamp"),
F.col("table_csv_data").getItem(1).alias("partition"),
... # Do the same for all the columns you need
)

SUMIFS on Filtered Data

I am trying to do a SUMIFS in an Excel sheet where I need to take into consideration cells that are filtered. I can accomplish this with a pivot table, but the same source data is already being using in a handful of Bubble Charts which update on filtering anyways. The only reason for this other table to the data needs to be aggregated differently to get the desired chart out. I am trying to avoid Pivots so that once the user selects their filters in the Source Table (using Slicers) the destination table and accompanying chart just update. If I use a pivot table to handle this part the user will have to duplicate work by putting their parameters into the Slicers and the PivotTable.
Row# in the real data is a unique identifier for each row. For filtering there is roughly 20 columns that could be filtered.
I have seen that SUMPRODUCT can do this, but I cannot wrap my head around how to do it with my data set.
Source Table Example
+------+---------------+-------+-------+-------+---------+----------+
| Row# | Amount | Class |Channel| Other | Columns | ToFilter |
+------+---------------+-------+-------+-------+---------+----------+
| 1 | $122,616.16 | 10 | A | Stuff | Stuff | Stuff |
| 2 | $128,587.43 | 1 | B | Stuff | Stuff | Stuff |
| 3 | $273,055.04 | 10 | C | Stuff | Stuff | Stuff |
| 25 | $144,087.59 | 4 | A | Stuff | Stuff | Stuff |
| 26 | $273,537.45 | 2 | A | Stuff | Stuff | Stuff |
| 27 | $110,177.94 | 2 | B | Stuff | Stuff | Stuff |
| 3674 | $455,133.20 | 2 | C | Stuff | Stuff | Stuff |
+------+---------------+-------+-------+-------+---------+----------+
Destination Table Example
+---------+---------------+---------------+--------+---------------+-----+---------------+
| Channel | 1 | 2 | 3 | 4 | ~ | 10 |
+---------+---------------+---------------+--------+---------------+-----+---------------+
| A | $- | $273,537.45 | $- | $144,087.59 | ~ | $122,616.16 |
| B | $128,587.43 | $110,177.94 | $- | $- | ~ | $- |
| C | $- | $455,133.20 | $- | $- | ~ | $273,055.04 |
+---------+---------------+---------------+--------+---------------+-----+---------------+
Assuming the source table is located at A1:G8, and the destination table is located at A10:K13 .
Put '=IFERROR(INDEX($C$2:$C$8,MATCH(1,INDEX((B$10=$D$2:$D$8)*($A11=$E$2:$E$8),0,1),0)),"-")` in B11 and drag till the end.
Idea : use multiple criteria index match & wrap it in iferror(). $ is used to lock the reference column/row.
Ref : https://exceljet.net/formula/index-and-match-with-multiple-criteria
Please share if it works/not. ( :

How to transpose all subfields in front of the parent field of a pivot table in Excel

I have a data of automotive spare parts with their multiple store locations in a warehouse.
all I want to do is get the locations in front of the part number, so that it is easy to know all the locations of a specific part number.
The current pivot data looks like this
I've manually transposed a few rows in the below image, but the data contains around 70K rows, Hence I'm looking for a better solution
Kindly refer to the below table
+--------------+-----+-------+-------------+
| Item name | Qty | UoM | Stock |
+--------------+-----+-------+-------------+
| '0450000115 | 324 | piece | G12B04 |
| '0450000A61 | 312 | piece | G12B05 |
| '0450000115 | 336 | piece | G12B06 |
| '0450000A61 | 228 | piece | G12B07 |
| '0450000115 | 336 | piece | G12B08 |
| '0450000115 | 192 | piece | G12B09 |
| '087902E200A | 470 | piece | G12B10 |
| '087902E200A | 760 | piece | G12B13 |
| '087902E200A | 759 | piece | G12B14 |
| '0450000115 | 336 | piece | G12B15 |
| '087902E200A | 400 | piece | G12B16 |
| '087902E200A | 10 | piece | G3B32 |
| '084B410426 | 100 | piece | G3B32 |
| '087902E200A | 300 | piece | G4B08 |
| '0450000A61 | 2 | piece | GDB01 |
| '084B410426 | 60 | piece | GR.04.C.04. |
| '087902E200A | 327 | piece | HD.03.K.05. |
+--------------+-----+-------+-------------+
You need to create a measure, using the CONCATENATEX function. For this you need to add your data to the datamodel. You can do this by checking the box add this data to the datamodel on the bottom of the create pivottable dialogbox.
Rightclick the table on the Pivottable Fields Pane and select add measure. Then create the following measure: = CONCATENATEX('table','table'[Stock],", ")
Now put [Item name] on Rows and the measure [StockText] on Values. This should be the result:

Count text occurrences in a column in Excel

I have the following list in Excel:
+-------+----------+
| am | ipiresia |
+-------+----------+
| 50470 | 29 |
| 50470 | 43 |
| 50433 | 29 |
| 6417 | 51 |
| 6417 | 52 |
| 6417 | 53 |
| 4960 | 25 |
| 4960 | 26 |
| 5567 | 89 |
| 6716 | 88 |
+-------+----------+
I want to add a column, let's say 'num' and count the occurrences of column 'am' in a row adding one when a new occurrence happens as follows:
+-------+----------+-----+
| am | ipiresia | num |
+-------+----------+-----+
| 50470 | 29 | 1 |
| 50470 | 43 | 2 |
| 50433 | 29 | 1 |
| 6417 | 51 | 1 |
| 6417 | 52 | 2 |
| 6417 | 53 | 3 |
| 4960 | 25 | 1 |
| 4960 | 26 | 2 |
| 5567 | 89 | 1 |
| 6716 | 88 | 1 |
+-------+----------+-----+
Is it possible to get this automatically with a formula in Excel?
yes,
my example:
(assume you start your table containing 3 columns at Excels origin at A1 without header lines)
Then fill C1 with value "1"
and then start in C2 with entering a formula
simple like this:
=if($A2=$A1;$C1+1;1)
then you drag C2 down at the cells downright located autofill position as far as you want. Most times also double click works to let Excel autofill the columns down to the end of you prefilled table.
If you need assistance for AutoFill press F1 in Excel an the help with tell you in detail.
Assuming the sample table starts at A1 (with headers) the following formula will provide the expected results even if the list is not sorted.
=COUNTIF($A$1:$A2,A2)
Enter the formula at cell C2 then paste it down to the last cell of the data (or use AutoFill)

Resources