How to create a pivot view in MariaDB from JSON keys?

I have a table like this:
+--------+----------+----------------------+
| hostid | hostName | attr                 |
+--------+----------+----------------------+
| 1      | host1    | {"cpu": 1, "ram": 2} |
| 2      | host2    | {"cpu": 2, "ram": 1} |
| 3      | host3    | {"cpu": 1, "ram": 5} |
| 4      | host4    | {"cpu": 3, "ram": 7} |
+--------+----------+----------------------+
I would like to create a view that shows a table like this, where the columns are taken from the keys of the JSON in the "attr" column:
+--------+----------+-----+-----+
| hostid | hostName | cpu | ram |
+--------+----------+-----+-----+
| 1      | host1    | 1   | 2   |
| 2      | host2    | 2   | 1   |
| 3      | host3    | 1   | 5   |
| 4      | host4    | 3   | 7   |
+--------+----------+-----+-----+
How can I create a view like this? The difficulty is discovering the JSON keys automatically.
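One possible direction, offered as a sketch rather than a tested solution: a view has a fixed column list, so it cannot discover keys by itself, but the keys can be collected by a script and the view definition generated from them. The Python snippet below builds such a statement using MariaDB's JSON_VALUE function. The table name `hosts`, the view name `hosts_pivot`, and the helper `build_pivot_view_sql` are hypothetical (the question names none of them), and the sketch assumes every key is a plain identifier.

```python
import json
import re

# Hypothetical names: the question does not give the table or view name.
TABLE = "hosts"
VIEW = "hosts_pivot"


def build_pivot_view_sql(attr_values, table=TABLE, view=VIEW):
    """Return a CREATE VIEW statement with one column per JSON key found."""
    keys = []
    for raw in attr_values:
        for key in json.loads(raw):
            if key not in keys:
                keys.append(key)  # keep first-seen order

    for key in keys:
        # Assumption: keys are plain identifiers, so they are safe to embed
        # in a JSON path and to use as a column alias.
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", key):
            raise ValueError(f"unsupported key: {key!r}")

    columns = ["hostid", "hostName"]
    columns += [f"JSON_VALUE(attr, '$.{key}') AS `{key}`" for key in keys]
    return (
        f"CREATE OR REPLACE VIEW `{view}` AS "
        f"SELECT {', '.join(columns)} FROM `{table}`"
    )


# The attr values would normally come from a SELECT on the table.
sample = ['{"cpu": 1, "ram": 2}', '{"cpu": 2, "ram": 1}']
sql = build_pivot_view_sql(sample)
print(sql)
```

The generated statement would have to be re-created whenever a new key appears in "attr", since the view's columns are fixed at creation time.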

Related

Unique count of values in column per month

Excel-Table:
     | A          | B          | C          | D | E       | F       | G       |
-----|------------|------------|------------|---|---------|---------|---------|
1    | month&year | date       | customer   |   | 2020-01 | 2020-03 | 2020-04 |
-----|------------|------------|------------|---|---------|---------|---------|
2    | 2020-01    | 2020-01-10 | Customer A |   | 3       | 2       | 4       |
3    | 2020-01    | 2020-01-14 | Customer A |   |         |         |         |
4    | 2020-01    | 2020-01-17 | Customer B |   |         |         |         |
5    | 2020-01    | 2020-01-19 | Customer B |   |         |         |         |
6    | 2020-01    | 2020-01-23 | Customer C |   |         |         |         |
7    | 2020-01    | 2020-01-23 | Customer B |   |         |         |         |
-----|------------|------------|------------|---|---------|---------|---------|
8    | 2020-03    | 2020-03-18 | Customer E |   |         |         |         |
9    | 2020-03    | 2020-03-19 | Customer A |   |         |         |         |
-----|------------|------------|------------|---|---------|---------|---------|
10   | 2020-04    | 2020-04-04 | Customer B |   |         |         |         |
11   | 2020-04    | 2020-04-07 | Customer C |   |         |         |         |
12   | 2020-04    | 2020-04-07 | Customer A |   |         |         |         |
13   | 2020-04    | 2020-04-07 | Customer E |   |         |         |         |
14   | 2020-04    | 2020-04-08 | Customer A |   |         |         |         |
15   | 2020-04    | 2020-04-12 | Customer A |   |         |         |         |
16   | 2020-04    | 2020-04-15 | Customer B |   |         |         |         |
17   |            |            |            |   |         |         |         |
In my Excel file I want to calculate the unique count of customers per month, as you can see in cells E2:G2.
I already inserted column A as a helper column, which extracts only the month and the year from the date in column B.
Therefore, the date formatting is the same as in the timeline in cells E1:G1.
I guess the formula to get the unique count per month is somehow related to =COUNTIFS($A:$A,E$1) but I have no clue how to modify this formula to get the expected values.
Do you have any idea?
Here's one approach which would work for Office 365 and if you have access to UNIQUE:
=COUNTA(UNIQUE(IF($A$2:$A$16=G$1,$C$2:$C$16,""),,FALSE))-1
For older versions, the following will work when entered as an array formula with CTRL+SHIFT+ENTER:
=SUM(--(FREQUENCY(IFERROR(MATCH($A$2:$A$16&$C$2:$C$16,E$1&$C$2:$C$16,0),"a"),MATCH($A$2:$A$16&$C$2:$C$16,E$1&$C$2:$C$16,0))>0))
You can do it without any helper column:
=SUM(--(UNIQUE(FILTER($C$2:$C$16,TEXT($B$2:$B$16,"yyyy-mm")=E$1))<>""))
For older versions of Excel, use the formula below with your helper column (entered in E2 and filled right):
=SUMPRODUCT(--($A$2:$A$16=E$1)*(1/COUNTIFS($A$2:$A$16,$A$2:$A$16,$C$2:$C$16,$C$2:$C$16)))
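As a cross-check on the expected values in E2:G2, the same distinct count can be sketched outside Excel. The Python snippet below is only an illustration using the sample rows from the question; the function name `unique_customers_per_month` is hypothetical and not part of any answer above.

```python
from collections import defaultdict

# (date, customer) pairs copied from columns B and C of the sample sheet.
rows = [
    ("2020-01-10", "Customer A"), ("2020-01-14", "Customer A"),
    ("2020-01-17", "Customer B"), ("2020-01-19", "Customer B"),
    ("2020-01-23", "Customer C"), ("2020-01-23", "Customer B"),
    ("2020-03-18", "Customer E"), ("2020-03-19", "Customer A"),
    ("2020-04-04", "Customer B"), ("2020-04-07", "Customer C"),
    ("2020-04-07", "Customer A"), ("2020-04-07", "Customer E"),
    ("2020-04-08", "Customer A"), ("2020-04-12", "Customer A"),
    ("2020-04-15", "Customer B"),
]


def unique_customers_per_month(rows):
    """Count distinct customers per yyyy-mm, like the helper column A does."""
    seen = defaultdict(set)
    for date, customer in rows:
        month = date[:7]  # "2020-01-10" -> "2020-01"
        seen[month].add(customer)
    return {month: len(customers) for month, customers in sorted(seen.items())}


counts = unique_customers_per_month(rows)
print(counts)  # {'2020-01': 3, '2020-03': 2, '2020-04': 4}
```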

Pandas: How to merge cells in the dataframe from a specific column using pandas?

I want to remove the duplicated names from the cells and merge them. This dataframe is generated after concatenating multiple dataframes.
My dataframe is as follows:
| | Customer ID | Category | VALUE |
| -:|:----------- |:------------- | -------:|
| 0 | GETO90 | Baby Sets | 1090.0 |
| 1 | GETO90 | Girls Dresses | 5357.0 |
| 2 | GETO90 | Girls Jumpers | 2823.0 |
| 3 | SETO90 | Girls Top | 3398.0 |
| 4 | SETO90 | Shorts | 7590.0 |
| 5 | SETO90 | Shorts | 7590.0 |
| 6 | RETO90 | Pants | 6590.0 |
| 7 | RETO90 | Pants | 6590.0 |
| 8 | RETO90 | Jeans | 8590.0 |
| 9 | YETO90 | Jeans | 9590.0 |
| 10| YETO90 | Jeans | 2590.0 |
I want to merge the cells in the first column; the expected dataframe is shown below:
| | Customer ID | Category | VALUE |
| -:|:----------- |:------------- | -------:|
| 0 | GETO90 | Baby Sets | 1090.0 |
| 1 | | Girls Dresses | 5357.0 |
| 2 | | Girls Jumpers | 2823.0 |
| 3 | SETO90 | Girls Top | 3398.0 |
| 4 | | Shorts | 7590.0 |
| 5 | | Shorts | 7590.0 |
| 6 | RETO90 | Pants | 6590.0 |
| 7 | | Pants | 6590.0 |
| 8 | | Jeans | 8590.0 |
| 9 | YETO90 | Jeans | 9590.0 |
| 10| | Jeans | 2590.0 |
Use duplicated with loc:
df.loc[df.duplicated('Customer ID'), 'Customer ID'] = ''
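A minimal runnable sketch of that one-liner on the sample data follows. The second variant, which blanks an ID only when it repeats the row directly above it, is an addition for illustration; it is an assumption about what might be wanted if the same ID could reappear in a later, non-adjacent block.

```python
import pandas as pd

ids = ["GETO90"] * 3 + ["SETO90"] * 3 + ["RETO90"] * 3 + ["YETO90"] * 2
df = pd.DataFrame({
    "Customer ID": ids,
    "Category": ["Baby Sets", "Girls Dresses", "Girls Jumpers", "Girls Top",
                 "Shorts", "Shorts", "Pants", "Pants", "Jeans", "Jeans", "Jeans"],
    "VALUE": [1090.0, 5357.0, 2823.0, 3398.0, 7590.0, 7590.0,
              6590.0, 6590.0, 8590.0, 9590.0, 2590.0],
})

# Approach from the answer: blank every ID that already appeared earlier.
merged = df.copy()
merged.loc[merged.duplicated("Customer ID"), "Customer ID"] = ""

# Variant (assumption): blank an ID only if it equals the previous row's ID.
adjacent = df.copy()
repeats_previous = adjacent["Customer ID"].eq(adjacent["Customer ID"].shift())
adjacent.loc[repeats_previous, "Customer ID"] = ""

print(merged)
```

On this sample both variants give the same result, because each customer's rows are contiguous.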

SIGN() formula returns unexpected results

In continuation of my previous question: Sumproduct with multiple criteria on one range
Jeeped provided me with a very helpful formula to achieve a sumproduct() which takes multiple criteria. My current case is, however, a bit broader:
Take these example tables:
First column is the ID number, second column a respondent group(A,B). Column headers are question types (X,Y,Z).
Table Q1
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 1 | A | 2 | 2 | 1 | | 1 |
| 2 | A | 1 | 1 | | | 2 |
| 3 | A | 1 | 1 | | | 1 |
| 4 | A | 2 | 1 | | | 1 |
| 5 | A | 1 | 2 | 1 | | 1 |
| 6 | A | 1 | 1 | | | 1 |
| 7 | A | | | | | |
| 8 | A | | | | | |
| 9 | A | 1 | 1 | | | 1 |
| 10 | A | 2 | 2 | 2 | | 2 |
| 11 | A | | | | | |
| 12 | A | 1 | 2 | 1 | | 2 |
| 13 | B | | | | | |
| 14 | B | 1 | 1 | | | 1 |
| 15 | B | 2 | 2 | 1 | | 1 |
Table Q2
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 1 | A | 1 | 2 | 1 | | 1 |
| 2 | A | 1 | 1 | | | 1 |
| 3 | A | 1 | 1 | | | 1 |
| 4 | A | 1 | 1 | | | 1 |
| 5 | A | 1 | 1 | | | 1 |
| 6 | A | 1 | 1 | | | 1 |
| 7 | A | | | | | |
| 8 | A | | | | | |
| 9 | A | 1 | 1 | | | 1 |
| 10 | A | 1 | 1 | | | 1 |
| 11 | A | | | | | |
| 12 | A | 1 | 2 | 1 | | 1 |
| 13 | B | | | | | |
| 14 | B | 1 | 1 | | | 1 |
| 15 | B | 1 | 2 | 1 | | 1 |
Now I want to know the number of times a respondent answered 1 (yes) on Q2 for each question type (X,Y,Z). The catch is that if someone answered 1 (yes) on Q1, it should "override" the answer on Q2, as we assume that when someone answers yes on Q1 (implementation of a measure), their answer on Q2 (knowledge of said measure) has to be yes as well.
The second catch is that for the first two occurrences of Y there can only be yes in one of both columns, so in fact there can only be two yes answers for question type Y for each respondent.
I used the following formula (on sheet 3): =SUMPRODUCT(SIGN(('Q1'!$C$2:$G$16=1)+('Q2'!$C$2:$G$16=1))*('Q2'!$B$2:$B$16=Blad3!$D5)*('Q2'!$C$1:$G$1=Blad3!E$4)) to obtain the following results.
| | X | Y | Z |
|---|---|----|---|
| A | 9 | 19 | 0 |
| B | 2 | 4 | 0 |
For X these results are correct, as there are 9 1's in table Q2.
For Y, the results for B are correct. For A, however, they are not: only 9 respondents answered, and at most 2 yes answers each would give a maximum of 18, yet the result is 19.
It turns out there is nothing wrong with the formula, just that it isn't suited for the way this data is organised. If you look at row 5:
Q1
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 5 | A | 1 | 2 | 1 | | 1 |
Q2
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 5 | A | 1 | 1 | | | 1 |
If we condense that to a 1 wherever either table has a 1 in a Y column, we get this row:
| | | X | Y | Y | Z | Y |
|----|---|---|---|---|---|---|
| 5 | A | | 1 | 1 | | 1 |
When I ask for the sumproduct() for this combined table the result will be 3.
To prevent this, I added a helper column (between the two Y columns and the Z column) to my tables, with the following formula: IF(OR(D1=1,E1=1),1,""). I then removed the headers from the two original Y columns, and re-running the formula produced the correct results.
New table Q1 looks like this then:
| | | X | | | Y | Z | Y |
|----|---|---|---|---|---|---|---|
| 1 | A | 2 | 2 | 1 | 1 | | 1 |
| 2 | A | 1 | 1 | | 1 | | 2 |
| 3 | A | 1 | 1 | | 1 | | 1 |
| 4 | A | 2 | 1 | | 1 | | 1 |
| 5 | A | 1 | 2 | 1 | 1 | | 1 |
| 6 | A | 1 | 1 | | 1 | | 1 |
| 7 | A | | | | | | |
| 8 | A | | | | | | |
| 9 | A | 1 | 1 | | 1 | | 1 |
| 10 | A | 2 | 2 | 2 | | | 2 |
| 11 | A | | | | | | |
| 12 | A | 1 | 2 | 1 | 1 | | 2 |
| 13 | B | | | | | | |
| 14 | B | 1 | 1 | | 1 | | 1 |
| 15 | B | 2 | 2 | 1 | 1 | | 1 |
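The overcount and the effect of the helper column can be reproduced outside Excel. The Python sketch below is an illustration only: it re-implements the SIGN/SUMPRODUCT logic on the sample data from tables Q1 and Q2, and the function names `count_yes` and `count_yes_merged_y` are hypothetical.

```python
N = None  # blank cell
HEADERS = ["X", "Y", "Y", "Z", "Y"]   # columns C..G
GROUPS = ["A"] * 12 + ["B"] * 3       # column B, rows 1..15
BLANK = [N, N, N, N, N]

Q1 = [
    [2, 2, 1, N, 1], [1, 1, N, N, 2], [1, 1, N, N, 1], [2, 1, N, N, 1],
    [1, 2, 1, N, 1], [1, 1, N, N, 1], BLANK, BLANK,
    [1, 1, N, N, 1], [2, 2, 2, N, 2], BLANK, [1, 2, 1, N, 2],
    BLANK, [1, 1, N, N, 1], [2, 2, 1, N, 1],
]
Q2 = [
    [1, 2, 1, N, 1], [1, 1, N, N, 1], [1, 1, N, N, 1], [1, 1, N, N, 1],
    [1, 1, N, N, 1], [1, 1, N, N, 1], BLANK, BLANK,
    [1, 1, N, N, 1], [1, 1, N, N, 1], BLANK, [1, 2, 1, N, 1],
    BLANK, [1, 1, N, N, 1], [1, 2, 1, N, 1],
]


def count_yes(group, qtype):
    """Original formula: one point per cell where Q1 or Q2 holds a 1."""
    total = 0
    for g, r1, r2 in zip(GROUPS, Q1, Q2):
        if g != group:
            continue
        for col, header in enumerate(HEADERS):
            if header == qtype and (r1[col] == 1 or r2[col] == 1):
                total += 1
    return total


def count_yes_merged_y(group):
    """Helper-column version: the first two Y columns count at most once."""
    total = 0
    for g, r1, r2 in zip(GROUPS, Q1, Q2):
        if g != group:
            continue
        first_pair = any(r[c] == 1 for r in (r1, r2) for c in (1, 2))
        last_y = r1[4] == 1 or r2[4] == 1
        total += int(first_pair) + int(last_y)
    return total


print(count_yes("A", "Y"), count_yes_merged_y("A"))  # 19 18
```

Respondent 5 accounts for the difference: the yes sits in the first Y column in Q2 and in the second Y column in Q1, so the cell-by-cell version counts both.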

Creating a Product Tree from Excel BOM

I have a standard Bill of Materials in Excel. The hierarchy is defined by the product level in column 1; columns 2 and 3 are the part number and product name, respectively.
Example:
+---+--------+--------------------+
| 1 | 2      | 3                  |
+---+--------+--------------------+
| 0 | 111111 | TOP LEVEL ASSEMBLY |
| 1 | 123456 | ABC                |
| 2 | 454444 | DEF                |
| 2 | 533433 | GFG                |
| 3 | 342333 | DFD                |
| 3 | 234232 | FFD                |
| 4 | 234343 | DSD                |
| 3 | 322222 | DDS                |
| 1 | 343433 | DFD                |
+---+--------+--------------------+
If this was structured, it would look like this:
0 111111 TOP LEVEL ASSEMBLY
    1 123456 ABC
        2 454444 DEF
        2 533433 GFG
            3 342333 DFD
            3 234232 FFD
                4 234343 DSD
            3 322222 DDS
    1 343433 DFD
I am looking to have a macro create an actual family tree structure that would show the dependency of these items in Visio (with boxes and logical connections). So in this case, it would look something like this (except in block/arrow format).
**111111 TOP LEVEL ASSEMBLY**
123456 ABC 1 343433 DFD
454444 DEF 533433 GFG
342333 DFD 234232 FFD 322222 DDS
234343 DSD
Any help will be appreciated!
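Whatever tool ends up drawing the boxes, the core step is turning the level column into parent/child pairs. The sketch below shows that step in Python rather than as a Visio macro. It is an illustration under the assumption that rows appear in BOM order (each item follows its parent), and the function name `parent_child_edges` is hypothetical.

```python
# (level, part number, name) rows in the order they appear in the sheet.
bom = [
    (0, "111111", "TOP LEVEL ASSEMBLY"),
    (1, "123456", "ABC"),
    (2, "454444", "DEF"),
    (2, "533433", "GFG"),
    (3, "342333", "DFD"),
    (3, "234232", "FFD"),
    (4, "234343", "DSD"),
    (3, "322222", "DDS"),
    (1, "343433", "DFD"),
]


def parent_child_edges(rows):
    """Return (parent part, child part) pairs derived from the level column."""
    edges = []
    stack = []  # (level, part) of the current chain from the top assembly down
    for level, part, _name in rows:
        # Drop items at the same or a deeper level: they cannot be the parent.
        while stack and stack[-1][0] >= level:
            stack.pop()
        if stack:
            edges.append((stack[-1][1], part))
        stack.append((level, part))
    return edges


edges = parent_child_edges(bom)
for parent, child in edges:
    print(parent, "->", child)
```

Each pair corresponds to one connector in the diagram, and each distinct part number to one box.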

Expand a data set using two columns

In Excel, I have two columns of data that I wish to combine.
Current set of data:
+---------+---------+
| column1 | column2 |
+---------+---------+
| a | 1 |
| b | 2 |
| c | 3 |
| d | 4 |
| | 5 |
| | 6 |
| | 7 |
+---------+---------+
For each value in column1, I need to assign all of the values in column2 so it looks like this:
+---------+---------+
| column1 | column2 |
+---------+---------+
| a | 1 |
| a | 2 |
| a | 3 |
| a | 4 |
| a | 5 |
| a | 6 |
| a | 7 |
+---------+---------+
| b | 1 |
| b | 2 |
| b | 3 |
| b | 4 |
| b | 5 |
| b | 6 |
| b | 7 |
+---------+---------+
| c | 1 |
| c | 2 |
| c | 3 |
| c | 4 |
| c | 5 |
| c | 6 |
| c | 7 |
+---------+---------+
| d | 1 |
| d | 2 |
| d | 3 |
| d | 4 |
| d | 5 |
| d | 6 |
| d | 7 |
+---------+---------+
How can I do this?
Do I need to find a macro/VB solution?
Since this seems unlikely to receive any other answer:
in A1: a
in B1: =MOD(ROW()-1,7)+1
in A2: =IF(MOD(ROW()-1,7)>0,CHAR(CODE(A1)),CHAR(CODE(A1)+1))
Copy both formulae down to suit.
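The formulas above build a cross join: every value of column1 paired with every value of column2. For comparison, the same expansion is sketched below in Python, alongside the row arithmetic that the MOD formula relies on. This is an illustration only, not a macro/VB solution, and the variable names are hypothetical.

```python
from itertools import product

column1 = ["a", "b", "c", "d"]
column2 = [1, 2, 3, 4, 5, 6, 7]

# Cross join: each column1 value paired with every column2 value, in order.
expanded = list(product(column1, column2))

# Same result via the row arithmetic used by the worksheet formulas:
# B = MOD(ROW()-1, 7) + 1, and column A advances once every 7 rows.
by_row = [
    (column1[(row - 1) // len(column2)], (row - 1) % len(column2) + 1)
    for row in range(1, len(column1) * len(column2) + 1)
]

print(expanded[:8])
```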
