How use grep for that complicated expressions? - linux

+----+-------+-----+
| ID | STORE | QTY |
+----+-------+-----+
| | | |
| 9 | 101 | 18 |
| | | |
| 8 | 154 | 19 |
| | | |
| 7 | 111 | 13 |
| | | |
| 9 | 154 | 18 |
| | | |
| 8 | 101 | 19 |
| | | |
| 7 | 101 | 13 |
| | | |
| 9 | 111 | 18 |
| | | |
| 8 | 111 | 19 |
| | | |
| 7 | 154 | 14 |
+----+-------+-----+
Suppose that I have 3 stores, and I'd like to take STORE for every id which qty is the same for every store.
e.g id 9 is in 3 stores, in every store has 18 qty,
but id 7 is in stores but in only two store has equal qty (in store 111 and 101 - in 154 - id has 14 qty); how can I get that result using grep?
Do you think that is impossible to get that one in one expressions? I thought about regex but I don't know in which way I get Qty and compare to another row. In my file it looks like:

Extract the first and last columns by cut, count the number of uniq combinations, and output only those whose count is 3 (i.e. the value is the same for all three stores):
$ cut -d\| -f2,4 | sort | uniq -c | grep '^ *3 '
3 8 | 19
3 9 | 18

Related

How does efficient_apriori package code itemsets?

Doing association rules mining using efficient_apriori package in python.Found a very useful answer on how to convert the output to a dataframe
However, struggling with the Itemset output and hoping someone can help me parse that correctly. Guessing LHS is an index value, but struggling with the decimal values in RHS. Does anyone know how the encoding is done? I have tried the same with SKU descriptions, and get the same output.
Input dataframe looks like this:
| SKU | Count | Percent |
|----------------------------------------------------------------------|-------|-------------|
| "('000000009100000749',)" | 110 | 0.029633621 |
| "('000000009100000749', '000000009100000776')" | 1 | 0.000269397 |
| "('000000009100000749', '000000009100000776', '000000009100002260')" | 1 | 0.000269397 |
| "('000000009100000749', '000000009100000777', '000000009100002260')" | 1 | 0.000269397 |
| "('000000009100000749', '000000009100000777', '000000009100002530')" | 1 | 0.000269397 |
Output looks like this:
| | lhs | rhs | count_full | count_lhs | count_rhs | num_transactions | confidence | support |
|---|-----------------------------|-----------------------------|------------|-----------|-----------|------------------|------------|-------------|
| 0 | "(1,)" | "(0.00026939655172413793,)" | 168 | 168 | 168 | 297 | 1 | 0.565656566 |
| 1 | "(0.00026939655172413793,)" | "(1,)" | 168 | 168 | 168 | 297 | 1 | 0.565656566 |
| 2 | "(2,)" | "(0.0005387931034482759,)" | 36 | 36 | 36 | 297 | 1 | 0.121212121 |
| 3 | "(0.0005387931034482759,)" | "(2,)" | 36 | 36 | 36 | 297 | 1 | 0.121212121 |
| 4 | "(3,)" | "(0.0008081896551724138,)" | 21 | 21 | 21 | 297 | 1 | 0.070707071 |
Could someone help me understand what's outputting in the LHS and RHS columns, and how to join that back to the 'SKU'? Ideally would like the output to have the 'SKU' instead of whatever is showing up.
I have looked at the documentation and it is quite sparse.

Create descending list inlcuding duplicates based on filter criteria

Excel-File
| A | B | C | D | E | F |
---|--------------|-------------------|--------|-----------------|------------|------------|-
1 | Sales | Product | | Product | Sales | |
2 | 20 | Product_A | | Product_D | 100 | Product_D |
3 | 10 | Product_A | | Product_D | 90 | |
4 | 50 | Product_A | | Product_D | 50 | |
5 | 80 | Product_B | | Product_D | 50 | |
6 | 40 | Product_C | | | | |
7 | 30 | Product_C | | | | |
8 | 100 | Product_D | | | | |
9 | 90 | Product_D | | | | |
10 | 50 | Product_D | | | | |
11 | 50 | Product_D | | | | |
12 | | | | | | |
In Column B I have list of different products with their corresponding sales in Column A.
Products can appear mutliple times in the list.
Sales numbers can be equal for multiple products.
I want to use the value in Cell F2 as Filter-Criteria to create a descending list of the products in Column D and Column E sorted by the sales in Column A.
Therefore, I tried to add the FILTER function to the formula from this question:
=INDEX(SORT(FILTER(A2:B11,A2:A11=F2,""),2,-1),SEQUENCE(COUNT(A2:A11)),{2,1})
However, with this formula I get error #VALUE.
How do I need to modify the formula to make it work?
Simply add COUNTIF() inside the SEQUENCE():
=INDEX(SORT(FILTER(A2:B11,B2:B11=F2,""),2,-1),SEQUENCE(COUNTIF(B2:B11,F2)),{2,1})
Current view on OP side:
Due to unknown reason only Column D gets filled.

Show text as value Power Pivot using DAX formula

Is there a way by using a DAX measure to create the column which contain text values instead of the numeric sum/count that it will automatically give?
In the example below the first name will appear as a value (in the first table) instead of their name as in the second.
Data table:
+----+------------+------------+---------------+-------+-------+
| id | first_name | last_name | currency | Sales | Stock |
+----+------------+------------+---------------+-------+-------+
| 1 | Giovanna | Christon | Peso | 10 | 12 |
| 2 | Roderich | MacMorland | Peso | 8 | 10 |
| 3 | Bond | Arkcoll | Yuan Renminbi | 4 | 6 |
| 1 | Giovanna | Christon | Peso | 11 | 13 |
| 2 | Roderich | MacMorland | Peso | 9 | 11 |
| 3 | Bond | Arkcoll | Yuan Renminbi | 5 | 7 |
| 1 | Giovanna | Christon | Peso | 15 | 17 |
| 2 | Roderich | MacMorland | Peso | 10 | 12 |
| 3 | Bond | Arkcoll | Yuan Renminbi | 6 | 8 |
| 1 | Giovanna | Christon | Peso | 17 | 19 |
| 2 | Roderich | MacMorland | Peso | 11 | 13 |
| 3 | Bond | Arkcoll | Yuan Renminbi | 7 | 9 |
+----+------------+------------+---------------+-------+-------+
No DAX needed. You should put the first_name field on Rows and not on Values. Select Tabular View for the Report Layout. Like this:
After some search I found 4 ways.
measure 1 (will return blank if values differ):
=IF(COUNTROWS(VALUES(Table1[first_name])) > 1, BLANK(), VALUES(Table1[first_name]))
measure 2 (will return blank if values differ):
=CALCULATE(
VALUES(Table1[first_name]),
FILTER(Table1,
COUNTROWS(VALUES(Table1[first_name]))=1))
measure 3 (will show every single text value), thanks # Rory:
=CONCATENATEX(Table1,[first_name]," ")
For very large dataset this concatenate seems to work better:
=CALCULATE(CONCATENATEX(VALUES(Table1[first_name]),Table1[first_name]," "))
Results:

How to add space between rows and sum up automatically in Excel

let's say that I have a table like the below:
| | Value 1 | Value 2 | Value 3 | |
|---|---------|---------|---------|---|
| A | 22 | 12 | 3 | |
| A | 5 | 6 | 12 | |
| A | 19 | 9 | 13 | |
| A | 22 | 43 | 31 | |
| B | 7 | 12 | 23 | |
| B | 5 | 5 | 8 | |
| B | 35 | 78 | 9 | |
| B | 45 | 1 | 8 | |
| C | 34 | 56 | 0 | |
| C | 22 | 1 | 14 | |
| C | 13 | 46 | 45 | |
and that I'd need to transform it into the below:
| | Value 1 | Value 2 | Value 3 | |
|---|---------|---------|---------|---|
| A | 22 | 12 | 3 | |
| A | 5 | 6 | 12 | |
| A | 19 | 9 | 13 | |
| A | 22 | 43 | 31 | |
| | 68 | 70 | 59 | |
| | | | | |
| B | 7 | 12 | 23 | |
| B | 5 | 5 | 8 | |
| B | 35 | 78 | 9 | |
| B | 45 | 1 | 8 | |
| | 92 | 96 | 48 | |
| | | | | |
| C | 34 | 56 | 0 | |
| C | 22 | 1 | 14 | |
| C | 13 | 46 | 45 | |
| | 69 | 103 | 59 | |
How could I obtain the desired effect automatically?
There would be n empty rows after each group and the sums of each column within the group.
You can use the Subtotal feature of Excel. Subtotal is in the "Data" tab of the ribbon. To automatically add the totals between groupings. I don't think it adds the blank row. If you absolutely need the blank row, then I can generate some VBA that will work.

excel I need formula in column name "FEBRUARY"

I have a set of data as below.
SHEET 1
+------+-------+
| JANUARY |
+------+-------+
+----+----------+------+-------+
| ID | NAME |COUNT | PRICE |
+----+----------+------+-------+
| 1 | ALFRED | 11 | 150 |
| 2 | ARIS | 22 | 120 |
| 3 | JOHN | 33 | 170 |
| 4 | CHRIS | 22 | 190 |
| 5 | JOE | 55 | 120 |
| 6 | ACE | 11 | 200 |
+----+----------+------+-------+
SHEET2
+----+----------+------+-------+
| ID | NAME |COUNT | PRICE |
+----+----------+------+-------+
| 1 | CHRIS | 13 | 123 |
| 2 | ACE | 26 | 165 |
| 3 | JOE | 39 | 178 |
| 4 | ALFRED | 21 | 198 |
| 5 | JOHN | 58 | 112 |
| 6 | ARIS | 11 | 200 |
+----+----------+------+-------+
The RESULT should look like this in sheet1 :
+------+-------++------+-------+
| JANUARY | FEBRUARY |
+------+-------++------+-------+
+----+----------+------+-------++-------+-------+
| ID | NAME |COUNT | PRICE || COUNT | PRICE |
+----+----------+------+-------++-------+-------+
| 1 | ALFRED | 11 | 150 || 21 | 198 |
| 2 | ARIS | 22 | 120 || 11 | 200 |
| 3 | JOHN | 33 | 170 || 58 | 112 |
| 4 | CHRIS | 22 | 190 || 13 | 123 |
| 5 | JOE | 55 | 120 || 39 | 178 |
| 6 | ACE | 11 | 200 || 26 | 165 |
+----+----------+------+-------++-------+-------+
I need formula in column name "FEBRUARY". this formula will find its match in sheet 2
Assuming the first Count value should go in cell E3 of Sheet1, the following formula would be the usual way of doing it:-
=INDEX(Sheet2!C:C,MATCH($B3,Sheet2!$B:$B,0))
Then the Price (in F3) would be given by
=INDEX(Sheet2!D:D,MATCH($B3,Sheet2!$B:$B,0))
I think this query will work fine for your requirement
SELECT `Sheet1$`.ID,`Sheet1$`.NAME, `Sheet1$`.COUNT AS 'Jan-COUNT',`Sheet1$`.PRICE AS 'Jan-PRICE', `Sheet2$`.COUNT AS 'Feb-COUNT',`Sheet2$`.PRICE AS 'Feb-PRICE'
FROM `C:\Users\Nagendra\Desktop\aaaaa.xlsx`.`Sheet1$` `Sheet1$`, `C:\Users\Nagendra\Desktop\aaaaa.xlsx`.`Sheet2$` `Sheet2$`
WHERE (`Sheet1$`.NAME=`Sheet2$`.NAME)
Provide Actual path insted of
C:\Users\Nagendra\Desktop\aaaaa.xlsx
First you need to know about how to make connection. So refer http://smallbusiness.chron.com/use-sql-statements-ms-excel-41193.html

Resources