Product attribute combinations generation in Excel

I have a table with 13,931 rows and 2 columns: the first column holds SKUs, the second holds Options (Size, Colour, etc.):
SKU Option
0001 Size:S
0001 Size:M
0001 Size:L
0001 Size:XL
0001 Colour:Red
0001 Colour:Blue
0002 Size:S
0002 Size:M
0002 Size:L
0002 Colour:Navy
0002 Leg:G
and so on. What I need to do is generate every combination of these options within each SKU (there are 7 option types in total; some SKUs have 2 types, some have 4, some have none), so it'll look like this:
SKU Option
0001 Size:S;Colour:Red
0001 Size:M;Colour:Red
0001 Size:L;Colour:Red
0001 Size:XL;Colour:Red
0001 Size:S;Colour:Blue
0001 Size:M;Colour:Blue
0001 Size:L;Colour:Blue
0001 Size:XL;Colour:Blue
0002 Size:S;Colour:Navy;Leg:G
0002 Size:M;Colour:Navy;Leg:G
0002 Size:L;Colour:Navy;Leg:G
I can separate the Option column into 2 columns, one for option types and one for option values, if that makes it easier.
My question is: is this doable using macros? It is going to be a huge pain if I have to do this manually.
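The grouping logic the question asks for can be sketched outside Excel first. The following is a minimal Python sketch, not a macro: it assumes the rows are available as (SKU, "Type:Value") pairs, and the function name `combine_options` is made up for illustration. It groups options by SKU and option type, then takes the Cartesian product of the groups, with the first option type varying fastest as in the desired output.

```python
from itertools import product

def combine_options(rows):
    """rows: iterable of (sku, "Type:Value") pairs.
    Returns a list of (sku, "Type:Value;Type:Value;...") pairs."""
    by_sku = {}  # sku -> {option type -> [options]}; dicts keep insertion order
    for sku, option in rows:
        option_type = option.split(":", 1)[0]
        by_sku.setdefault(sku, {}).setdefault(option_type, []).append(option)

    combined = []
    for sku, types in by_sku.items():
        groups = list(types.values())
        # product() varies its last argument fastest, so the groups are
        # reversed here and each combination is reversed back when joined.
        for combo in product(*reversed(groups)):
            combined.append((sku, ";".join(reversed(combo))))
    return combined

rows = [
    ("0001", "Size:S"), ("0001", "Size:M"), ("0001", "Size:L"), ("0001", "Size:XL"),
    ("0001", "Colour:Red"), ("0001", "Colour:Blue"),
    ("0002", "Size:S"), ("0002", "Size:M"), ("0002", "Size:L"),
    ("0002", "Colour:Navy"), ("0002", "Leg:G"),
]
for sku, options in combine_options(rows):
    print(sku, options)
```

SKUs that have no option rows at all never appear in the input, so they are not in the output either; they would need to be handled separately.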

Pandas groupby timestamp and increase count

I am a beginner in pandas and I would like some help with a problem I have.
I have a csv file structured as follows:
#timestamp. message. name. ID
2021-07-10 14:01:00 user 0001 has logged out. User Log Off. 0001
2021-07-10 14:01:10 user 0002 has logged out. User Log Off. 0002
2021-07-10 14:01:15 user 0003 has logged out. User Log Off. 0003
2021-07-10 14:08:20 user 0001 has logged out. User Log Off. 0001
What I would like to do is go through all the rows and check whether they are duplicates; if an event is duplicated within a time span of 10 min (based on the timestamp), I want to add a column with the number of times the event occurred.
For example, this is what I would like to have as output:
#timestamp. message. name. ID. count
2021-07-10 14:01:00 user 0001 has logged out. User Log Off. 0001. 2
2021-07-10 14:01:10 user 0002 has logged out. User Log Off. 0002. 1
2021-07-10 14:01:15 user 0003 has logged out. User Log Off. 0003. 1
Basically, group the duplicate events into a single row with the number of events counted in that time span.
Is this something achievable with pandas?
Thank you so much for any help
Here's an outline that you can follow:
import pandas as pd
# 0. parse the timestamps and sort by them if not already sorted
df['#timestamp.'] = pd.to_datetime(df['#timestamp.'])
df = df.sort_values('#timestamp.')
keys = ['message.', 'name.', 'ID']
# 1. compute the time differences within each group and compare to the threshold
df['timediff'] = df.groupby(keys)['#timestamp.'].diff() > pd.Timedelta('10min')
# 2. find the blocks with cumsum (a new block starts after every gap over 10 min)
df['block'] = df.groupby(keys)['timediff'].cumsum()
# 3. group by the blocks; counting 'timediff' gives the number of rows per block
out = (df.groupby(['block'] + keys)
         .agg({'#timestamp.': 'first', 'timediff': 'count'})
      )
Note this will group 00:00:00, 00:09:00, and 00:18:00 together, because each event is less than 10 minutes after the previous one.
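The chaining effect in that note is easier to see when the same diff-and-cumsum logic is traced by hand. Below is a small standard-library sketch (no pandas) on made-up timestamps for a single event; the function name `count_blocks` and the sample data are assumptions for illustration only.

```python
from datetime import datetime, timedelta

def count_blocks(timestamps, threshold=timedelta(minutes=10)):
    """Return [first_timestamp, count] pairs. A new block starts only when
    the gap to the PREVIOUS event exceeds the threshold."""
    blocks = []
    previous = None
    for ts in sorted(timestamps):
        if previous is None or ts - previous > threshold:
            blocks.append([ts, 1])   # gap too large: open a new block
        else:
            blocks[-1][1] += 1       # close enough to the previous event
        previous = ts
    return blocks

# Hypothetical events at 00:00, 00:09, 00:18 and 00:40 on the same day.
events = [datetime(2021, 7, 10, 0, minute) for minute in (0, 9, 18, 40)]
for first, count in count_blocks(events):
    print(first, count)
```

The first three events end up in one block even though 00:18 is 18 minutes after 00:00, since every gap is measured against the previous event rather than the first event of the block.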

Insert a 0 ("zero") in front of numbers for specific number sequences

I have this data
Name | Code | Price
XXX 102 1000
YYY 4321 1150
ZZZ 202 1150
AAA 123 1000
I can now concatenate and add a 0 in front of Code, which gives
0102
04321
0202
0123
Now here lies the problem: I don't want that 0 in front of 4321. I want a 0 only in front of 3-digit numbers, not numbers with more than 3 digits.
Right-click on the column, go to Format Cells --> Custom, type 0000 in the Type box and click OK.
This is the simplest and easiest solution.
Assuming the '102' data is located at B2, just type:
=IF(LEN(B2)<=3,"0"&B2,B2)
and that will do. Alternatively, using the CONCATENATE() function you can do it like this:
=IF(LEN(B2)<=3,CONCATENATE("0",B2),B2)
Assuming you have the codes in column B:
=IF(LEN(B2)=3,CONCATENATE("0",B2),B2)
If you want to write a formula then this would be better:
=REPT(0,4-LEN(A1))&A1
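The answers above follow two slightly different rules: prepend a single zero to short codes, or pad every code to four characters. A short Python sketch of both rules shows where they agree and differ. It assumes the codes are handled as text, and the function names are made up for illustration.

```python
def pad_if_short(code):
    """Same rule as the IF(LEN(...)<=3, ...) formulas: prepend one zero
    to codes of up to three characters, leave longer codes alone."""
    return "0" + code if len(code) <= 3 else code

def pad_to_four(code):
    """Same idea as the 0000 custom format: left-pad with zeros
    until the code is four characters long."""
    return code.zfill(4)

codes = ["102", "4321", "202", "123"]
print([pad_if_short(c) for c in codes])
print([pad_to_four(c) for c in codes])
```

For the codes in the question both rules give the same result; they only differ for codes shorter than three digits, where the first adds one zero and the second pads all the way to four characters.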

Automatically restart formula given that parameter is met?

I am in need of some assistance. I have a list of multiple IDs in column A; column B contains the number of items linked to each ID. I want to generate a list of every page pertaining to each ID. So in column C (for example) I want "a - Page 0001" all the way to "a - Page 1000", given that a has 1000 pages, and once it reaches 1000 I want it to restart with b, as follows:
Column A Column B
a 1000
b 2000
c 1500
d 1200
e 700
a - Page 0001
a - Page 0002
a - Page 0003
a - Page 0004
…
a - Page 1000
b - Page 0001
b - Page 0002
b - Page 0003
b - Page 0004
…
b - Page 0001
…
b - Page 2000
c - Page 0001
I have tried using the following formula:
=IF(ROW(C1)< B1+1,CONCATENATE($A$1," - Page ",TEXT(ROW(C1),"0000"),""))
The problem is that once it reaches 1000 I get errors (#VALUE!). Firstly, I believe I have to anchor the reference as $A$1, otherwise when I drag the formula down it will just refer to the cell to the left and I'll get a - Page 0001, b - Page 0002, etc. Secondly, I am using the ROW function to generate the page numbers, but I don't understand how I can force it to restart from 1 once it reaches the maximum (i.e. 1000 for a).
This formula will generate your list of individual pages:
=IFERROR(INDEX($A$1:$A$5,IFERROR(MATCH(ROW(C1)-1,$C$1:$C$5,1)+1,1))&" - Page "&RIGHT("0000"&ROW(C1)-IFERROR(INDEX($C$1:$C$5,MATCH(ROW(C1)-1,$C$1:$C$5,1)),0),4),"")
The key to making it work is column C, which holds a helper formula: a running total of the number of pages. In C1 use:
=SUM($B$1:$B1)
Note the missing $ in the last address; it's important that it not be there. Copy that down for the length of your table.
Note the hidden rows
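The running-total lookup can be traced in Python to see why it restarts the page numbers. This is only a sketch of the same idea, not a translation of the exact formula: `accumulate` plays the role of the helper column, `bisect_right` plays the role of MATCH with match type 1, and the name `page_label` is made up for illustration.

```python
from bisect import bisect_right
from itertools import accumulate

ids = ["a", "b", "c", "d", "e"]
pages = [1000, 2000, 1500, 1200, 700]
totals = list(accumulate(pages))  # helper column: 1000, 3000, 4500, 5700, 6400

def page_label(row):
    """Label for output row `row` (1-based), or "" past the last page."""
    # Number of running totals <= row - 1, i.e. how many IDs are fully done.
    done = bisect_right(totals, row - 1)
    if done >= len(ids):
        return ""
    offset = totals[done - 1] if done > 0 else 0  # pages used by earlier IDs
    return f"{ids[done]} - Page {row - offset:04d}"

print(page_label(1000))  # last page of a
print(page_label(1001))  # first page of b
```

Subtracting the previous running total from the row number is what makes the page counter start again at 1 for each ID.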

Hamming Distance in Block Parity

I'm sitting here and I'm not able to solve a problem related to the Hamming distance.
I have a block like this:
100
000
111
When I use the vertical redundancy check (VRC) now, I get:
100-1
000-0
111-1
||| |
011-0
because I add a 1 to each row and column if it has an odd number of 1's, or I add a 0 to each row and column if it has an odd number of 0's.
Now I should have 6 words, right?
w1: 1001
w2: 0000
w3: 1111
w4: 1010
w5: 0011
w6: 0011
Are the bits created through VRC words too? So are 1000 and 0110 words too?
If I compare the words and check the Hamming distance, I get a minimal Hamming distance of 2 (0 should not count, because if the Hamming distance is 0, the words are the same). E.g. compare w1 and w2.
In our lecture, the professor said that the Hamming distance for this example is 3. How can it be 3?!
Where is my mistake? :-(
I hope someone can help me.
Have a great Sunday!
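For reference, the pairwise comparison described in the question can be reproduced with a few lines of Python. This sketch only recomputes the row and column words with even parity and compares them with each other; it does not settle whether the rows and columns of a single block are the right "words" to compare. The helper names are made up for illustration.

```python
from itertools import combinations

def with_even_parity(bits):
    """Append a parity bit so that the word has an even number of 1s."""
    return bits + str(bits.count("1") % 2)

def hamming(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

block = ["100", "000", "111"]
row_words = [with_even_parity(row) for row in block]
col_words = [with_even_parity("".join(col)) for col in zip(*block)]
words = row_words + col_words

distances = [hamming(a, b) for a, b in combinations(words, 2)]
min_nonzero = min(d for d in distances if d > 0)
print(words, min_nonzero)
```

Under these assumptions the smallest non-zero distance between the six words is 2, which matches the value the question arrives at.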

Why is gray code called reflected code?

I understand that each Gray code differs from its preceding code by one bit, but I don't exactly understand why it's called reflected. I came across this website https://www.pc-control.co.uk/gray_code.htm, where it says "The gray code is sometimes referred to as reflected binary, because the first eight values compare with those of the last 8 values, but in reverse order", but the first 8 Gray codes are not comparable to the last 8 Gray codes in reverse order, as can be seen from the Gray code table on their website. To add to my confusion, the Gray code table differs from the one in my textbook: e.g. the Gray code for 9 is 1000 in my textbook, while on the website it is 1101.
Consider the sequence on the linked page:
0000
0001
0011
0010
0110
0111
0101
0100
1100
1101
1111
1110
1010
1011
1001
1000
Remove the most significant bit and you obtain a nice reflected sequence:
x000
x001
x011
x010
x110
x111
x101
x100
-------- mirror
x100
x101
x111
x110
x010
x011
x001
x000
Please note that the same kind of reflection can be found for Gray sequences of any width.
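The mirror structure also gives a simple way to build the sequence. The sketch below is one common construction, written in Python for illustration with a made-up function name: start from a single empty code, then repeatedly prefix the current list with 0 and its mirror image with 1.

```python
def reflected_gray(width):
    """Build the reflected Gray sequence of the given bit width."""
    codes = [""]
    for _ in range(width):
        # Keep the list as is with a leading 0, then append its mirror
        # image with a leading 1.
        codes = ["0" + c for c in codes] + ["1" + c for c in reversed(codes)]
    return codes

sequence = reflected_gray(4)
first_half = [code[1:] for code in sequence[:8]]
second_half = [code[1:] for code in sequence[8:]]
print(sequence)
print(second_half == first_half[::-1])  # lower bits of the two halves mirror each other
```

For width 4 this produces the same 16 codes as the sequence listed above, and the same construction works for any width.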
