IBM ODM: How to ensure there are no duplicate rules in a decision table

We need to manage a large number of rules in an ODM decision table. Is there a way to enforce validation within ODM that prevents rule authors from duplicating rules? ODM does have overlap/gap validation, but it is limited to one condition column; it does not look across multiple columns for overlap.
Example:
I can have 4 condition columns resulting in a specific charge. I want to make sure authors cannot use the same data in these 4 columns and assign a different charge.
A B C D ==> $1
A B C D ==> $2

Here is an example, with more data to show the issues:
It is standard practice to use merged rows and overlap validation when possible. Not only does this eliminate the possibility of duplicate rules, but it usually makes it much easier to grasp what is going on in the table. Here is the exact same decision table and rules, with the cells merged:
Of course, structuring a decision table so that cells can be merged is not always possible. One common issue is wanting to have the same condition yield multiple actions; often the row will just be duplicated with the same conditions and different actions. If you can't merge, programming standards can dictate that rows are sorted by column from left to right, which makes it more obvious when a duplicate row is added by mistake.
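If the table can be exported to plain rows (for example via a spreadsheet), the duplicate check can also be run outside ODM. The following is a minimal sketch of that idea, not an ODM feature: the function name, the column names `c1` to `c4` and `charge`, and the sample rows are all hypothetical. It groups rows by their condition values and reports any combination that maps to more than one action.

```python
from collections import defaultdict

def find_conflicting_rows(rows, condition_columns, action_column):
    """Return {condition tuple: sorted distinct actions} for every
    condition combination that is assigned more than one action."""
    actions_by_conditions = defaultdict(set)
    for row in rows:
        key = tuple(row[c] for c in condition_columns)
        actions_by_conditions[key].add(row[action_column])
    return {k: sorted(v) for k, v in actions_by_conditions.items() if len(v) > 1}

# Hypothetical export of a decision table: four condition columns, one action.
rows = [
    {"c1": "A", "c2": "B", "c3": "C", "c4": "D", "charge": "$1"},
    {"c1": "A", "c2": "B", "c3": "C", "c4": "D", "charge": "$2"},
    {"c1": "A", "c2": "B", "c3": "C", "c4": "E", "charge": "$1"},
]

conflicts = find_conflicting_rows(rows, ["c1", "c2", "c3", "c4"], "charge")
print(conflicts)
```

Note that this flags every repeated condition row with a different action, so tables that intentionally repeat conditions to yield multiple actions would need those rows excluded first.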

Related

How to identify all columns that have different values in a Spark self-join

I have a Databricks Delta table of financial transactions that is essentially a running log of all changes that ever took place on each record. Each record is uniquely identified by 3 keys, so each record can have multiple instances in this table, each representing a historical entry of a change (across one or more columns of that record). If I want to find cases where a specific column value changed, I can easily do something like this:
-- Self-join: both sides are table1, matched on the three keys
SELECT t1.Key1, t1.Key2, t1.Key3, t1.Col12 AS "Before", t2.Col12 AS "After"
FROM table1 t1 INNER JOIN table1 t2 ON t1.Key1 = t2.Key1 AND t1.Key2 = t2.Key2
AND t1.Key3 = t2.Key3 WHERE t1.Col12 != t2.Col12
However, these tables have a large number of columns. What I'm trying to achieve is a way to identify any columns that changed in a self-join like this. Essentially a list of all columns that changed. I don't care about the actual value that changed. Just a list of column names that changed across all records. Doesn't even have to be per row. But the 3 keys will always be excluded, since they uniquely define a record.
Essentially I'm trying to find any columns that are susceptible to change, so that I can focus on them specifically for some other purpose.
Any suggestions would be really appreciated.
Databricks has change data feed (CDF / CDC) functionality that can simplify this type of use case. https://docs.databricks.com/delta/delta-change-data-feed.html
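Independently of CDF, the underlying logic can be illustrated in plain Python. This is a sketch of a different technique from the one linked above, under the assumption that a column counts as "changed" when any record (identified by the three keys) has more than one distinct value for it. The function name, `Col11` and `Col13`, and the sample data are hypothetical; on a real table the same idea could presumably be expressed as a group-by on the keys with a distinct count per column.

```python
from collections import defaultdict

def columns_that_change(records, key_columns):
    """Return the sorted names of non-key columns that take more than one
    distinct value within at least one key group."""
    # key tuple -> column name -> set of values seen for that record
    seen = defaultdict(lambda: defaultdict(set))
    for rec in records:
        key = tuple(rec[k] for k in key_columns)
        for col, val in rec.items():
            if col not in key_columns:
                seen[key][col].add(val)
    changed = set()
    for columns in seen.values():
        for col, values in columns.items():
            if len(values) > 1:
                changed.add(col)
    return sorted(changed)

# Hypothetical change log: Col13 never changes for any record.
history = [
    {"Key1": 1, "Key2": "a", "Key3": "x", "Col11": "open", "Col12": 100, "Col13": "EUR"},
    {"Key1": 1, "Key2": "a", "Key3": "x", "Col11": "open", "Col12": 120, "Col13": "EUR"},
    {"Key1": 2, "Key2": "b", "Key3": "y", "Col11": "open", "Col12": 50, "Col13": "USD"},
    {"Key1": 2, "Key2": "b", "Key3": "y", "Col11": "closed", "Col12": 50, "Col13": "USD"},
]

changed_columns = columns_that_change(history, ["Key1", "Key2", "Key3"])
print(changed_columns)
```

Because it counts distinct values per record rather than comparing pairs of rows, this avoids the self-join entirely and needs no per-column query.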

Classification and Grouping of Sorted Data

I have a Dataset H3:J12 where components are classified based on Type. I have summed the count for similar components and sorted based on Count with Unique and Sort formulae and the result is L3:M7.
In my actual case, there are several thousands of such components which are sorted as in L:L and now I would like to add the Type column next to the Component with sorted Count as shown in P3:R12. Is it possible to extract them directly from L3:M7 or directly from H3:J12, as I will not be able to do them manually.
Screenshots/here refer:
Mechanical approaches include pivots, VB, etc. However, you could consider a more dynamic approach that doesn't require constant updating / refreshing (of code, pivots, etc.) whenever underlying data is appended/amended.
YES: you can retrieve directly from source data as follows:
=SORT(UNIQUE(C3:D20))
=SUMIFS(E3:E30,C3:C30,H3:H17,D3:D30,I3:I17)
Notes:
Could make this AVERAGEIF(S), COUNTIF(S), percentile, etc.
Another significant advantage over alternative methods (VB, pivots, etc.), besides the dynamic nature, is flexibility regarding measures; pivots are substantially more restrictive than direct Excel calculation.
The disadvantage is the inability to automatically chart data using the pivot chart functionality that accompanies pivot tables.
See the opening line of the linked sheet for a conditional formatting sample used to recreate the 'look / feel' you might otherwise lose with this approach.
It seemed as though you were almost there! I'm not sure what relevance other tables have (ignored this in light of question).
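For anyone who wants to cross-check the spreadsheet result, the two formulas amount to "list the unique (Type, Component) pairs in sorted order, then sum the counts for each pair". A minimal Python sketch of that logic follows; the function name and the sample rows are hypothetical and not taken from the workbook.

```python
from collections import defaultdict

def summarize(rows):
    """Sum counts per (type, component) pair and return the pairs in
    ascending order, mirroring SORT(UNIQUE(...)) followed by SUMIFS."""
    totals = defaultdict(int)
    for type_, component, count in rows:
        totals[(type_, component)] += count
    return [(t, c, totals[(t, c)]) for t, c in sorted(totals)]

# Hypothetical source data: (Type, Component, Count)
source = [
    ("Fastener", "Bolt", 4),
    ("Fastener", "Nut", 2),
    ("Seal", "O-ring", 5),
    ("Fastener", "Bolt", 3),
    ("Seal", "O-ring", 1),
]

summary = summarize(source)
for line in summary:
    print(line)
```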

Excel: Create lookup based on list of criteria

I have a little bit of an Excel problem and would be happy about any suggestions.
Long version: I have a dataset with raw data representing journal entries. The structure of this dataset can be seen here:
What I want to achieve is to assign each row (each journal entry) to a cost category (marketing, personnel, IT, depreciation, …) based on the values in the account number, type, and cost center columns, and, in a second step, break down the categories once more, e.g. for labour costs, distinguish between direct and indirect labour costs.
The way my company does this right now is using an Excel sheet with several macros where the criteria are hardcoded in the VBA code to loop through the whole list, check if a row matches the criteria for a certain cost category, and if it does, copy the row to a new sheet (having one new sheet for each category), then using a second macro to break down the categories, assigning values to the “description”-column which is empty initially based on another set of criteria. Then, pivot tables are used on each of the new sub-datasets to calculate sums for each sub-category. These sums are finally used as input data for a management report (as seen in the image above) which is the ultimate goal of this whole ordeal.
Not only does this seem overly complicated to me (running the macros and manually adjusting the input ranges for the pivot tables takes forever), but the criteria for allocating the costs also change quite often, and opening the VBA editor to change the code is not really user-friendly. The initial idea was to include some helper columns (one for each cost category) holding an indicator variable that is one if the entry falls in the respective category and zero otherwise, and then use these columns for further calculations (e.g. SUMIFS and such).
The problem is that a) combinations of account number and type are not unique, so that one account number can go along with various types, and one type can go along with various account numbers, so the criteria can be something like C6 = 544300 OR 544700 AND D6<>110246, etc. And b) criteria can change, meaning sometimes a new account number or type is added that also has to be assigned to an already existing category such as labor costs, which would make it necessary to include that criterion in all the formulas for that particular cost category. So, is it possible to somehow create a criteria table for each category that serves as input for some sort of IF/SUMIF or lookup function?
Short version: I have a data set (can range from 5000 to up to 100000 rows, 8 columns) where I want to perform a lookup, but based on various criteria. And, in addition to that, it would be nice if the criteria could somehow be drawn from a separate list so that they can be modified fairly easily without having to change the formula itself. Is there a way to do so? Or do you think using the advanced filter might be the most suitable option?

compare two tables then sort them in excel

I have two tables with the same data but in different rows. I want to sort them side by side, so that each row sits next to its duplicate in the other table.
attached photo
In a new worksheet, copy the code data from one table and append to that a copy of the code data from the other. Apply Remove Duplicates to that column and sort ascending.
Now use that sheet to look up (with VLOOKUP) Description, Uom and Unit Price from one of your tables into three separate columns (say 2, 3, 4), and look up the same fields from the other table into a further three columns (say 5, 6, 7).
Wrap both formulae in IFERROR(....,"") to reduce noise.
I take it any numbering will be applied independently in a new sheet (ie No. is not required to be copied to there).
Incidentally you have a lot of unconventional hyphens (eg L-80 is never normally written other than as L80), m for OCTG as a unit of measure leads to many problems and with competent staff a structured catalogue could be advisable for a high value of stock and long-term storage.
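The same steps (combine the codes, remove duplicates, sort, then look up each table's fields with blanks for misses) can be sketched in Python as a sanity check. This is only an illustration under assumptions: the key column is assumed to be called `Code`, and the function name and sample rows are hypothetical.

```python
def align_by_code(table_a, table_b, fields):
    """Return one row per distinct code: the code, then the fields from
    table_a, then the fields from table_b ('' where a code is missing)."""
    def index(table):
        by_code = {}
        for row in table:
            # Keep the first match per code, as an exact-match VLOOKUP would.
            by_code.setdefault(row["Code"], row)
        return by_code

    a, b = index(table_a), index(table_b)
    aligned = []
    for code in sorted(set(a) | set(b)):  # de-duplicated, ascending
        row_a, row_b = a.get(code, {}), b.get(code, {})
        aligned.append(
            [code]
            + [row_a.get(f, "") for f in fields]
            + [row_b.get(f, "") for f in fields]
        )
    return aligned

# Hypothetical data; "Code" as the key column name is an assumption.
fields = ["Description", "Uom", "Unit Price"]
table_a = [
    {"Code": "B2", "Description": "Valve", "Uom": "ea", "Unit Price": 25.0},
    {"Code": "A1", "Description": "Pipe", "Uom": "m", "Unit Price": 10.0},
]
table_b = [
    {"Code": "A1", "Description": "Pipe", "Uom": "m", "Unit Price": 10.0},
    {"Code": "C3", "Description": "Flange", "Uom": "ea", "Unit Price": 8.5},
]

aligned = align_by_code(table_a, table_b, fields)
for line in aligned:
    print(line)
```

The empty strings play the role of the IFERROR(....,"") wrapper: codes present in only one table show blanks on the other side.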

PivotTable Nested Columns

I am interested in grouping columns together by a common factor. Below is a basic example:
I'm interested in nesting the vehicle brands under the categories Luxury and Frugal. Thus making them collapsible like the rows can be. This is a basic example whereas the data I'm planning to work with this has thousands of rows of data similar to a layout as these rows and possibly 50-100 columns that could be collapsible into 5-10. Any guidance would be appreciated.
As has been suggested, adding Luxury/Frugal as a column in your source data would achieve what you require. However, that would require substantial rearrangement and make your source data occupy a much larger range, and the result would still not be ideal: adding just the one column allows filtering by column in the PivotTable, but each filter view would still show all columns (though mostly blank).
So completely "flattening" the source data may be a better choice, if you are going to rearrange the source data at all:
The fields arranged like so:

Resources