How to set values for all rows from group for certain criteria in Tableau - calculated-columns

I need to create a calculated field, "IsProductPresent" (see the last column). I have included it in the data here so that we can check whether our results are right.
This is basically retail store data: a group of transactions is called a Basket (represented by BasketId), and each item in the basket is one row (BasketItemNbr). The goal is to find whether a certain product is present in the basket. If it is, "IsProductPresent" should be 1; otherwise 0.
What are the criteria for deciding the product?
The criteria are BSKReqID = 308 and ProductBarcode = '0049000000443' or '0049000000450'.
So if even one transaction in a basket satisfies the above criteria, IsProductPresent should be 1 for all transactions of that basket; otherwise 0.
Please share an email ID so that I can send sample data.

You can use LOD calculations for this. One example:
{ FIXED [BasketId] : MAX(IIF([BSKReqID] = 308 AND ([ProductBarcode] = '0049000000443' OR [ProductBarcode] = '0049000000450'), 1, 0)) }
I'll break that down.
IIF([BSKReqID] = 308 AND ([ProductBarcode] = '0049000000443' OR [ProductBarcode] = '0049000000450'), 1, 0)
That's an inline if statement. It says "If the BSKReqID is 308 and the barcode is one of these barcodes, then return 1. Otherwise, return 0."
Then we aggregate that with MAX(). Since the inline if statement can only return a 1 or a 0, MAX() will necessarily return one of those values - 1 if the item is present, 0 if it's not.
{ FIXED [BasketId] : ... }
That says the aggregation is computed per BasketId, regardless of the other dimensions in the view. That way, BasketItemNbr will not be included in the MAX() aggregation, and we will calculate that 0 or 1 for the entire basket rather than for each individual basket item.
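To make the two levels of the calculation concrete, here is a minimal Python sketch of the same idea outside Tableau. The sample rows and the names `row_flag`, `basket_flag` and `is_product_present` are made up for illustration; only the criteria (BSKReqID 308 and the two barcodes) come from the question.

```python
from collections import defaultdict

# Hypothetical sample rows: (BasketId, BasketItemNbr, BSKReqID, ProductBarcode)
rows = [
    (1, 1, 308, "0049000000443"),
    (1, 2, 308, "0012000000001"),
    (2, 1, 308, "0012000000001"),
    (2, 2, 999, "0049000000450"),
]
TARGET_BARCODES = {"0049000000443", "0049000000450"}


def row_flag(req_id, barcode):
    """Row-level IIF: 1 when the row meets the criteria, else 0."""
    return 1 if req_id == 308 and barcode in TARGET_BARCODES else 0


# MAX of the row-level flag per BasketId (the FIXED part of the calculation)
basket_flag = defaultdict(int)
for basket_id, _item_nbr, req_id, barcode in rows:
    basket_flag[basket_id] = max(basket_flag[basket_id], row_flag(req_id, barcode))

# Every row receives the value computed for its basket
is_product_present = [basket_flag[row[0]] for row in rows]
print(is_product_present)  # [1, 1, 0, 0]
```

Basket 2 stays at 0 because no single row satisfies both conditions at once: one row has the right BSKReqID and the other has the right barcode.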

How do I group rows based on a fixed sum of values in Excel?

I am trying to find a variation of the Excel formula that was already provided here:
How do I create groups based on the sum of values?
It is the same requirement, but the grouping criteria needs to be an exact value.
Here's the sample data:
Column A | Column B
Item A | 1
Item B | 2
Item C | 3
Item D | 4
Item E | 5
Item F | 1
Item G | 2
Item H | 3
Item I | 4
Item J | 5
I need to group the rows if their Column B sum = 5.
Expected result:
Group 1 = Item A, Item D (1 + 4) = 5
Group 2 = Item B, Item C (2 + 3) = 5
Group 3 = Item E = 5
Group 4 = Item F, Item I (1 + 4) = 5
Group 5 = Item G, Item H (2 + 3) = 5
Group 6 = Item J = 5
If a row's Column B value exceeds 5, or has no matching row that brings the sum to 5, it will have no Group value.
Groupings are interchangeable, i.e. Group 1 = Item A, Item I is also acceptable since 1 + 4 = 5.
I assume this can be achieved using Excel formulas but I am struggling to find which formula(s) can be used. Any help is appreciated!
I believe I understood your question after we exchanged some comments. Still, I would recommend updating your question: it is an interesting problem, but the question was difficult to follow.
Before looking for an Excel solution, I modeled the problem as a state machine with transitions from one state to another. I considered the following states, which represent the position of the item in the group. A group is defined as consecutive items whose values sum to 5.
EMPTY: Just the initial situation
START: Start of the group
MIDDLE: A middle element of the group
END: The end of the group
START-END: A group with a single element
NA: Not applicable group
I follow the same idea as in "How do I create groups based on the sum of values?", but with slightly different helper columns:
Total (Column D): for this case the following formula is used: IF(SUM(C3,D2)>5,C3,SUM(C3,D2))
Status or item position within the group (Column G): this is where the corresponding status for each element is calculated.
Checks for Valid Groups (Column H): evaluates whether a group is valid. When there is no match to 5, the group is not valid. The result is shown on the row that represents the beginning of the group (START or START-END states): TRUE means a valid group, FALSE means an invalid group, and NA mirrors an NA value in the Status column. An empty cell marks any element of the group other than the first.
Group # (Column I): identifies the group the row (item) belongs to. Notice that group numbering starts at 1, and that I also cover the case where a group cannot be formed (NA).
Here is the formula in G3:
=LET(total, D3, prevS, G2, QTY, C3,
IF(C3="", "",
IF(OR(AND(total=5, QTY<5, prevS="START"), AND(total=5, prevS="MIDDLE")), "END",
IF(OR(AND(total>5, total=QTY, OR(prevS="START", prevS="MIDDLE")),AND(total>5, OR(prevS="", prevS="END", prevS="NA", prevS="START-END"))), "NA",
IF(OR(AND(total<5, total=QTY, OR(prevS="START", prevS="MIDDLE")),AND(total<5, OR(prevS="", prevS="END", prevS="NA", prevS="START-END"))), "START",
IF(AND(total<5, OR(prevS="START", prevS="MIDDLE")), "MIDDLE",
IF(OR(AND(total=5, total= QTY, OR(prevS="START", prevS="MIDDLE")),AND(total=5, OR(prevS="", prevS="END", prevS="NA", prevS="START-END"))), "START-END", "UNDEFINED")
)
)
)
)
)
)
Notes:
The LET Excel function is used to make the formula more readable.
The IF blocks should be ordered from the most specific case of total and QTY values to the most generic ones. For cases with the same total condition, make sure the second condition on prevS is not repeated.
An UNDEFINED case was added as a last resort to catch any transition that was not covered. If it ever appears, the formula has to be reviewed; so far, all cases in the sample data are covered.
Columns K-Q are just for documentation purposes, to identify all possible transitions. Columns K-M list all possible transitions organized by previous status. Columns O-Q list all possible transitions ordered by current status, so it is easier to formulate each portion of the IF blocks.
The formula can probably be simplified. It is more complex than the solution provided for the similar question, but this question has more specific conditions. Some transitions may not be relevant to the final result, but it is preferable to consider all positions in the group to make sure all transitions are covered.
The following state machine diagram shows all possible transitions:
Notes:
As you can see, the solution also handles the cases where a group cannot be created or is not valid (NA values). The solution assumes that the item column has only positive values. The question does not state any restriction, but the values in the example are all positive. To handle zero values, this solution needs to be adjusted.
The Checks for Valid Groups column is calculated as follows:
= IF(G3="", "",
IF(G3="START-END", TRUE,
IF(G3="NA", "NA",
IF(G3="START",
LET(endRow, IFNA(MATCH("START", LEFT(G4:$G$1000,5),0), MATCH("", LEFT(G4:$G$1000,5),0))+ ROW()-1,
value, VLOOKUP("END", G4:INDIRECT( "G" & endRow),1,0),
IF(ISNA(value), FALSE, TRUE)
), ""
)
)
)
)
It identifies the start and the end of the candidate group, then looks for an END status inside it; if the lookup returns #N/A, the group is not valid. If the end of the candidate group is not found (the first MATCH returns #N/A), it searches until a blank row instead.
The Group # column is calculated as follows:
=IF(C3="","", LET(value, MAX($I$2:I2), IF(G3="NA", "NA",
IF(H3=TRUE, value + 1, IF(H3=FALSE, "NA",
IF(I2="NA", "NA", value))))))
This way only valid transitions are considered, i.e. the following status sequences, which start from START but do not end in END, are not considered valid groups (NA): START->NA, START->MIDDLE[one or more]...->NA, and NA.
I added more examples to the original sample file provided. More can be added to further test all possible scenarios, but I guess you get the idea of this approach. As you stated, "I assume this can be achieved using Excel formulas": yes, it is possible, but for more complex conditions I would suggest implementing a state machine algorithm in VBA. Even though it is possible with Excel functions, you have to deal with several nested IF blocks and helper columns for something that can be achieved with a simple for-loop in VBA.
Here is a link to the online Excel file I used.
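As a rough illustration of the for-loop alternative, here is a minimal sketch written in Python rather than VBA. It is not a translation of the formulas above: the function name `assign_groups` and the test values are hypothetical, and it keeps the same assumptions as the answer (a group is a run of consecutive items, all values are positive, and the target sum is 5).

```python
def assign_groups(values, target=5):
    """Label runs of consecutive values that sum exactly to target.

    Returns one entry per value: a 1-based group number, or None when the
    value is not part of a valid group (the "NA" case).
    Assumes every value is positive.
    """
    groups = [None] * len(values)
    pending = []      # indices of the candidate group being built
    running = 0       # running total of the candidate group
    next_group = 1
    for i, value in enumerate(values):
        if running + value > target:
            # The candidate overshoots: drop it and restart from this item
            pending, running = [], 0
        if value > target:
            continue  # a single item above the target can never join a group
        pending.append(i)
        running += value
        if running == target:
            for j in pending:
                groups[j] = next_group
            next_group += 1
            pending, running = [], 0
    return groups


print(assign_groups([2, 3, 5, 1, 1, 4, 6, 1, 4]))
# [1, 1, 2, None, None, None, None, 3, 3]
```

The `pending` list plays the role of the START/MIDDLE statuses: its items only receive a group number once the running total reaches the target, which corresponds to reaching END or START-END.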

NetSuite Saved Search To Find Subsidiary NOT Set For Customer

We use multiple Subsidiaries; obviously every customer has at least one. Most have multiple, and I'm trying to get a list of all of the customers that don't have a particular subsidiary (call it 'XYZ').
The most obvious approach is to use:
Subsidiary : Name does not contain 'XYZ'
or, as a formula(numeric):
case when {msesubsidiary.namenohierarchy} != 'XYZ' then 1 end
That doesn't work because every customer has at least one subsidiary that isn't XYZ, so all customers satisfy the condition and get returned.
I've got a feeling the solution will involve counting the number of {msesubsidiary.namenohierarchy}s for each customer which = 'XYZ' and returning only the ones where that number is 0, but that's not an area I'm very knowledgeable on.
I don't have access to a OneWorld system, but I've done the same thing looking for items that don't have a preferred bin in a given location, and it works wherever you want to show any record where a sublist doesn't contain a desired value. And you're right in your thinking:
Make the Customer your first "Results" column, and set the summary type to "Group".
Set your "Standard" filters as required, e.g. excluding inactive, only certain sales reps, etc.
Create a "Summary" filter :
Type = Sum
Field = Formula (Numeric)
Formula = case when {msesubsidiary.namenohierarchy} = 'XYZ' then 1 else 0 end
Condition = EQUALS 0
This creates a search in which, for each customer, every row of the subsidiary sublist is checked: a row matching XYZ contributes 1 and any other row contributes 0. The Sum summary filter adds these up per customer, and the condition (EQUALS 0) then keeps only customers where NONE of the subsidiaries is XYZ.
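The grouping logic can be sketched in a few lines of Python. This only illustrates the "sum a 0/1 flag per group, keep the groups that total 0" idea; it is not NetSuite code, and the customer and subsidiary names are made up.

```python
from collections import defaultdict

# Hypothetical (customer, subsidiary) rows, one per subsidiary sublist line
rows = [
    ("Acme", "ABC"), ("Acme", "XYZ"),
    ("Globex", "ABC"), ("Globex", "DEF"),
    ("Initech", "XYZ"),
]

matches = defaultdict(int)
for customer, subsidiary in rows:
    # Same role as: case when ... = 'XYZ' then 1 else 0 end, summed per customer
    matches[customer] += 1 if subsidiary == "XYZ" else 0

# Summary condition "EQUALS 0": customers with no XYZ row at all
without_xyz = sorted(c for c, total in matches.items() if total == 0)
print(without_xyz)  # ['Globex']
```

The `else 0` matters: it makes sure a customer with no XYZ rows still gets a total of 0 instead of no value at all.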

Iterate in column for specific value and insert 1 if found or 0 if not found in new column python

I have a DataFrame as shown in the attached image. My columns of interest are fgr and fgr1. As you can see, they both contain values corresponding to years.
I want to iterate over the two columns and, for each value present, set 1 in the column named after that value and 0 otherwise.
For example, in fgr the first value is 2028. So, the first row in column 2028 will have the value 1 and all other columns will have the value 0.
I tried using lookup but I did not succeed. So, any pointers will be really helpful.
This will do the job. You can use for loops as well, but I think this approach will be faster.
df["Matched"] = df["fgr"].isin(df["fgr1"])*1
Basically, you check whether the values from one column are in another column, which gives True or False. You then multiply by 1 to get 1 and 0 instead of True and False.
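Here is a small runnable illustration of that line on made-up data (the years below are hypothetical, not taken from the question's file). Note that isin tests membership against the whole fgr1 column, not only the value in the same row.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"fgr": [2028, 2030, 2031], "fgr1": [2030, 2029, 2031]})

# True where the fgr value appears anywhere in fgr1; * 1 turns booleans into 0/1
df["Matched"] = df["fgr"].isin(df["fgr1"]) * 1
print(df["Matched"].tolist())  # [0, 1, 1]
```

The second row gets 1 even though its own fgr1 value is 2029, because 2030 occurs elsewhere in fgr1.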
From this answer
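Not the most efficient, but it should work for your case (it can be time-consuming on a large dataset):
import numpy as np

# Long format: one row per (original row, year column)
s = df.reset_index().melt(['index','fgr','fgr1'])
# 1 where the year column name equals the first four characters of fgr / fgr1
s['value'] = s.variable.eq(s.fgr.str[:4]).astype(int)
s['value2'] = s.variable.eq(s.fgr1.str[:4]).astype(int)
# 1 if either column matched
s['final'] = np.where(s['value']+s['value2'] > 0,1,0)
# Back to wide format, one column per year
yourdf = s.pivot_table(index=['index','fgr','fgr1'],columns = 'variable',values='final',aggfunc='first').reset_index(level=[1,2])
yourdf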

Get top(N) or single value using Bql

How is it possible to use a PXSelect statement so that it retrieves the Top(N) or the first value for a particular DAC?
Let's say that I have a table with a sequence number and I want to obtain the record with the largest sequence number. How can I do that?
Of course, for performance reasons, I would like SQL to return just 1 record.
You can use SelectWindowed in place of Select on your PXSelect to get the top N records. In the example below it will get the Top 1. If you change the totalRows value of 1 to 5 it would get the top 5 (except you would have to loop or get the PXResultSet to use all 5 records retrieved.)
Top 1 Example:
DiscountSequence firstRow = PXSelect<DiscountSequence,
Where<DiscountSequence.discountID, Equal<Required<DiscountSequence.discountID>>>
>.SelectWindowed(this, 0, 1, someDiscountID);
Top 5 Example:
foreach (DiscountSequence row in PXSelect<DiscountSequence,
Where<DiscountSequence.discountID, Equal<Required<DiscountSequence.discountID>>>
>.SelectWindowed(this, 0, 5, someDiscountID))
{
//5 rows returned
}

Count number of occurences of a string and relabel

I have a n x 1 cell that contains something like this:
chair
chair
chair
chair
table
table
table
table
bike
bike
bike
bike
pen
pen
pen
pen
chair
chair
chair
chair
table
table
etc.
I would like to rename these elements so they will reflect the number of occurrences up to that point. The output should look like this:
chair_1
chair_2
chair_3
chair_4
table_1
table_2
table_3
table_4
bike_1
bike_2
bike_3
bike_4
pen_1
pen_2
pen_3
pen_4
chair_5
chair_6
chair_7
chair_8
table_5
table_6
etc.
Please note that the underscore (_) is necessary. Could anyone help? Thank you.
Interesting problem! This is the procedure that I would try:
Use unique - the third output parameter in particular to assign each string in your cell array to a unique ID.
Initialize an empty array, then create a for loop that goes through each unique string - given by the first output of unique - and creates a numerical sequence from 1 up to as many times as we have encountered this string. Place this numerical sequence in the corresponding positions where we have found each string.
Use strcat to attach each element in the array created in Step #2 to each cell array element in your problem.
Step #1
Assuming that your cell array is defined as a bunch of strings stored in A, we would call unique this way:
[names, ~, ids] = unique(A, 'stable');
The 'stable' is important as the IDs that get assigned to each unique string are done without re-ordering the elements in alphabetical order, which is important to get the job done. names will store the unique names found in your array A while ids would contain unique IDs for each string that is encountered. For your example, this is what names and ids would be:
names =
'chair'
'table'
'bike'
'pen'
ids =
1
1
1
1
2
2
2
2
3
3
3
3
4
4
4
4
1
1
1
1
2
2
names is actually not needed in this algorithm. However, I have shown it here so you can see how unique works. Also, ids is very useful because it assigns a unique ID for each string that is encountered. As such, chair gets assigned the ID 1, followed by table getting assigned the ID of 2, etc. These IDs will be important because we will use these IDs to find the exact locations of where each unique string is located so that we can assign those linear numerical ranges that you desire. These locations will get stored in an array computed in the next step.
Step #2
Let's pre-allocate this array for efficiency. Let's call it loc. Then, your code would look something like this:
loc = zeros(numel(A), 1);
for idx = 1 : numel(names)
id = find(ids == idx);
loc(id) = 1 : numel(id);
end
As such, for each unique name we find, we look for every location in the ids array that matches this particular name found. find will help us find those locations in ids that match a particular name. Once we find these locations, we simply assign an increasing linear sequence from 1 up to as many names as we have found to these locations in loc. The output of loc in your example would be:
loc =
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
5
6
7
8
5
6
Notice that this corresponds with the numerical sequence (the right most part of each string) of your desired output.
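For readers who want to trace Steps #1 and #2 outside MATLAB, here is a minimal Python sketch of the same procedure on a shortened, made-up list. The helper `stable_ids` is a hypothetical stand-in for unique(A, 'stable') and its third output; it is not a library function.

```python
def stable_ids(items):
    """Return (names, ids): unique items in first-seen order and a 1-based ID per item."""
    first_seen = {}
    ids = []
    for item in items:
        if item not in first_seen:
            first_seen[item] = len(first_seen) + 1
        ids.append(first_seen[item])
    return list(first_seen), ids


A = ["chair", "chair", "table", "bike", "chair", "table"]
names, ids = stable_ids(A)

loc = [0] * len(A)
for idx in range(1, len(names) + 1):
    # Like find(ids == idx): every position holding this label
    positions = [p for p, label in enumerate(ids) if label == idx]
    # Like loc(id) = 1 : numel(id): number those positions 1, 2, 3, ...
    for count, p in enumerate(positions, start=1):
        loc[p] = count

print(ids)  # [1, 1, 2, 3, 1, 2]
print(loc)  # [1, 2, 1, 1, 3, 2]
```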
Step #3
Now all we have to do is piece loc together with each string in our cell array. We would thus do it like so:
out = strcat(A, '_', num2str(loc));
What this does is that it takes each element in A, concatenates a _ character and then attaches the corresponding numbers to the end of each element in A. Because we want to output strings, you need to convert the numbers stored in loc into strings. To do this, you must use num2str to convert each number in loc into their corresponding string equivalents. Once you find these, you would concatenate each number in loc with each element in A (with the _ character of course). The output is stored in out, and we thus get:
out =
'chair_1'
'chair_2'
'chair_3'
'chair_4'
'table_1'
'table_2'
'table_3'
'table_4'
'bike_1'
'bike_2'
'bike_3'
'bike_4'
'pen_1'
'pen_2'
'pen_3'
'pen_4'
'chair_5'
'chair_6'
'chair_7'
'chair_8'
'table_5'
'table_6'
For your copying and pasting pleasure, this is the full code. Be advised that I've nulled out the first output of unique as we don't need it for your desired output:
[~, ~, ids] = unique(A, 'stable');
loc = zeros(numel(A), 1);
for idx = 1 : max(ids) % names is not returned here, so loop up to the largest ID
id = find(ids == idx);
loc(id) = 1 : numel(id);
end
out = strcat(A, '_', num2str(loc));
If you want an alternative to unique, you can work with a hash table, which in MATLAB means using the containers.Map object. You can then store the number of occurrences of each individual label and create the new labels on the go, as in the code below.
data={'table','table','chair','bike','bike','bike'};
map=containers.Map(data,zeros(numel(data),1)); % labels=keys, counts=values (zeroed)
new_data=data; % initialize matrix that will have outputs
for ii=1:numel(data)
map(data{ii}) = map(data{ii})+1; % increment counts of current labels
new_data{ii} = sprintf('%s_%d',data{ii},map(data{ii})); % format outputs
end
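The same running-count idea, sketched in Python for comparison, with collections.Counter standing in for the hash table. The data is the same short list used in the MATLAB snippet.

```python
from collections import Counter

data = ["table", "table", "chair", "bike", "bike", "bike"]

counts = Counter()   # label -> number of occurrences seen so far
new_data = []
for label in data:
    counts[label] += 1                           # increment the count of the current label
    new_data.append(f"{label}_{counts[label]}")  # build the relabelled string

print(new_data)
# ['table_1', 'table_2', 'chair_1', 'bike_1', 'bike_2', 'bike_3']
```

Because the counts live in a lookup table and not in a single counter, a label that reappears later continues from its previous count.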
This is similar to rayryeng's answer but replaces the for loop by bsxfun. After the strings have been reduced to unique labels (line 1 of code below), bsxfun is applied to create a matrix of pairwise comparisons between all (possibly repeated) labels. Keeping only the lower "half" of that matrix and summing along rows gives how many times each label has previously appeared (line 2). Finally, this is appended to each original string (line 3).
Let your cell array of strings be denoted as c.
[~, ~, labels] = unique(c); %// transform each string into a unique label
s = sum(tril(bsxfun(@eq, labels, labels.')), 2); %'// accumulated occurrence number
result = strcat(c, '_', num2str(s)); %// build result
Alternatively, the second line could be replaced by the more memory-efficient
n = numel(labels);
M = cumsum(full(sparse(1:n, labels, 1)));
s = M((1:n).' + (labels-1)*n);
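To see what the lower-triangular sum computes, here is a tiny pure-Python sketch on made-up labels. It is the quadratic pairwise version, written for clarity and not for speed.

```python
labels = [1, 1, 2, 1, 2]  # hypothetical output of unique's third return value

# For each position i, count positions j <= i that carry the same label.
# This equals the row sums of the lower triangle of the pairwise-equality matrix.
s = [sum(1 for j in range(i + 1) if labels[j] == labels[i])
     for i in range(len(labels))]

print(s)  # [1, 2, 1, 3, 2]
```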
I'll give you some pseudocode. Try it yourself, and post the code if it doesn't work.
Initialize a counter to 1
Iterate over the cell
If this is not the first element, check whether the string is the same as the previous value
Yes - increment counter
No - reset counter to 1
end
sprintf the string value + counter into a new array
Hope this helps!
