Cognos BI question here - I have two data tables – one contains the Last Name and SS # of customers, and another table has “Extended Info” about those customers. Element ID is the data element being stored, Ext Cis Value has the data value, and SS Number ties it back to a customer.
I want to build a single list which shows all customers, along with the corresponding values for each of the three data elements in the ExtendedInfo table. In this case they are #13 (Email Address), #15 (Prospect Type) and #16 (Prospect Source).
Here is the data I have today:
ProspectData table:
| Last Name | SS # |
|-----------------------|-----------|
| ABC Construction, LLC | S10000104 |
| XYZ Construction, LLC | S10000106 |
ExtendedInfo table:
| Element Id | Ext Cis Value | SS Number |
|------------|---------------|-----------|
| 13 | HAS#EMAIL.COM | S10000104 |
| 13 | NO#EMAIL.COM | S10000106 |
| 15 | HOT PROSPECT | S10000104 |
| 15 | WARM PROSPECT | S10000106 |
| 16 | External | S10000106 |
| 16 | Internal | S10000104 |
I've been able to JOIN these two tables together to create a result like this, but only by applying a filter to ExtendedInfo to return a single field. Example as shown:
| SS # | Last Name | Email Address |
|-----------|-----------------------|---------------|
| S10000104 | ABC Construction, LLC | HAS#EMAIL.COM |
| S10000106 | XYZ Construction, LLC | NO#EMAIL.COM |
I am trying to set up a single query which will contain five columns: SS Number, Last Name, Email Address (#13 on Element ID), Prospect Type (#15) and Prospect Source (#16). I envision it looking like this:
| SS # | Last Name | Email Address | Prospect Type | Prospect Source |
|-----------|-----------------------|---------------|---------------|-----------------|
| S10000104 | ABC Construction, LLC | HAS#EMAIL.COM | HOT PROSPECT | Internal |
| S10000106 | XYZ Construction, LLC | NO#EMAIL.COM | WARM PROSPECT | External |
So far, the closest I’ve come to this is adding a new query on the ExtendedInfo table which has a filter applied for Element ID, then using JOIN to join the result of that query and the ProspectData table. However, I don’t know how (or if it’s practical) to create 3 individual queries on ExtendedInfo (Email, Prospect Type, Prospect Source) and join them all to ProspectData.
This seems like a simple task, but I’m not sure how to do it. Any suggestions? Thanks in advance for your help.
You don't have to join the tables three times. In fact, you only have to join once. You can construct your custom columns at the model/report layer.
Join ProspectData and ExtendedInfo on SS Number with a standard inner join.
The result will look like this:
| Element Id | Ext Cis Value | SS Number | SS Number | Last Name |
|------------|---------------|-----------|-----------|-----------------------|
| 13 | HAS#EMAIL.COM | S10000104 | S10000104 | ABC Construction, LLC |
| 13 | NO#EMAIL.COM | S10000106 | S10000106 | XYZ Construction, LLC |
| 15 | HOT PROSPECT | S10000104 | S10000104 | ABC Construction, LLC |
| 15 | WARM PROSPECT | S10000106 | S10000106 | XYZ Construction, LLC |
| 16 | External | S10000106 | S10000106 | XYZ Construction, LLC |
| 16 | Internal | S10000104 | S10000104 | ABC Construction, LLC |
Now, at the model layer (if doing this in Framework Manager) or in the resulting query (if doing this in a report), add three new data items, Email Address, Prospect Type and Prospect Source, with the following expressions:
Email Address
CASE
WHEN position('#',[Ext Cis Value]) > 0 THEN [Ext Cis Value]
ELSE null
END
Prospect Type
CASE
WHEN position('PROSPECT',[Ext Cis Value]) > 0 THEN [Ext Cis Value]
ELSE null
END
Prospect Source
CASE
WHEN position('ternal',[Ext Cis Value]) > 0 THEN [Ext Cis Value]
ELSE null
END
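Set the Aggregate Function attribute for the three new data items to 'Maximum'. Each new data item is null on every joined row except the one that carries its element, so taking the maximum should cause your result to roll up to a single row per customer, with values in each of the three new data items.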
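The join-once-then-roll-up idea can be sketched outside Cognos. The following Python is only an illustration under stated assumptions: the function name `pivot_extended_info` and the `ELEMENT_COLUMNS` mapping are hypothetical, and, unlike the expressions above, it picks each column by Element Id rather than by the text of Ext Cis Value. It also keeps customers that have no extended rows, which a strict inner join would drop.

```python
# Sample rows copied from the question.
prospects = [
    ("ABC Construction, LLC", "S10000104"),
    ("XYZ Construction, LLC", "S10000106"),
]
extended_info = [
    (13, "HAS#EMAIL.COM", "S10000104"),
    (13, "NO#EMAIL.COM", "S10000106"),
    (15, "HOT PROSPECT", "S10000104"),
    (15, "WARM PROSPECT", "S10000106"),
    (16, "External", "S10000106"),
    (16, "Internal", "S10000104"),
]

# Element Id -> output column name (mapping taken from the question).
ELEMENT_COLUMNS = {13: "Email Address", 15: "Prospect Type", 16: "Prospect Source"}


def pivot_extended_info(prospects, extended_info):
    """Match rows on SS Number, then roll up to one row per customer."""
    rows = {
        ss: {"SS #": ss, "Last Name": name,
             **{column: None for column in ELEMENT_COLUMNS.values()}}
        for name, ss in prospects
    }
    for element_id, value, ss in extended_info:
        column = ELEMENT_COLUMNS.get(element_id)
        if column is None or ss not in rows:
            continue  # skip elements we do not pivot and unknown customers
        current = rows[ss][column]
        # Counterpart of the 'Maximum' aggregate: a null loses to any value.
        rows[ss][column] = value if current is None else max(current, value)
    return list(rows.values())


for row in pivot_extended_info(prospects, extended_info):
    print(row)
```

Keying on Element Id avoids false matches when, say, an email address happens to contain the text "ternal"; whether that is preferable in a given model is a judgment call.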
I'm working on a project where I receive a list in Excel of employee names, dates and IDs. I need to compare this list to a Power BI report that I've made to bring back any IDs that are locked.
For example:
I receive
| Employee Name | Date | ID |
| ------------- | --------- | -- |
| John Doe | 4/22/21 | 1 |
| Jane Doe | 4/23/21 | 2 |
The Power BI Report looks like this:
| Employee Name | Date | ID | LOCK? |
| ------------- | -------------- | -- | -------- |
| John Doe | 4/22/21 | 1 | LOCK |
| Jane Doe | 4/23/21 | 2 | UNLOCKED |
Is there a way to compare my list in Excel with my Power BI report on a large scale? I've tried Power Query in Excel, but the data is too large.
I ended up using a pbiviz file (Filter By List).
I've got a Microsoft Access database with several tables. I've thrown 2 of those into an Excel file to simplify my work, but either an Access or Excel solution can be used for this. Below are examples of the data that needs to be manipulated, but in those records there's a lot of other columns and information.
I've got Table 1 (Input Table):
| Bank | Reference |
|-----------------|-----------|
| Chase Bank LLC | |
| JPMorgan Chase | |
| Chase | |
| Bank of America | |
| Bank of America | |
| Wells Fargo | |
The Reference column is empty. I want to fill it based on the reference table, which contains the IDs that would go into the Reference column.
Table 2 (Reference Table):
| Bank | ID |
|-----------------|-----------|
| Chase Bank | 1 |
| Bank of America | 2 |
| Wells Fargo | 3 |
So the solution would fill the "Reference" column like this:
| Bank | Reference |
|-----------------|-----------|
| Chase Bank LLC | 1 |
| JPMorgan Chase | 1 |
| Chase | 1 |
| Bank of America | 2 |
| Bank of America | 2 |
| Wells Fargo | 3 |
Since this is taken from a database's table, these aren't really ordered records. The purpose of this is to create a relationship in an already-existing database that didn't have those relationships set up.
A join between the two text fields in an Update query will write the ID for those records that match exactly.
There is no built-in option for the non-matching records; you can only apply some creative designs. For instance, "Chase Bank LLC" does match "Chase Bank" on the first 10 characters. So for the unmatched records you could set up a temp table with a new field that is Left(fieldname,10), join on this new field to get the ID into the temp table, and then run a second Update query, this time joining on the full name, to move the ID into the original table.
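The two passes described above (exact match first, then a match on the first 10 characters) can be sketched in Python to see which rows each pass would fill. This is a rough illustration, not Access SQL; the function name `fill_references` and the constant `PREFIX_LENGTH` are hypothetical.

```python
PREFIX_LENGTH = 10  # mirrors Left(fieldname, 10) in the answer

# Reference table and input names copied from the question.
reference = {"Chase Bank": 1, "Bank of America": 2, "Wells Fargo": 3}
input_banks = [
    "Chase Bank LLC",
    "JPMorgan Chase",
    "Chase",
    "Bank of America",
    "Bank of America",
    "Wells Fargo",
]


def fill_references(banks, reference, prefix_length=PREFIX_LENGTH):
    """Return (bank, id_or_None) pairs: exact match first, then prefix match."""
    # If two reference names shared a prefix, the later one would win here.
    by_prefix = {name[:prefix_length]: ref_id for name, ref_id in reference.items()}
    filled = []
    for bank in banks:
        ref_id = reference.get(bank)  # pass 1: join on the full name
        if ref_id is None:
            ref_id = by_prefix.get(bank[:prefix_length])  # pass 2: join on the prefix
        filled.append((bank, ref_id))
    return filled


for bank, ref_id in fill_references(input_banks, reference):
    print(bank, ref_id)
```

On the sample data, "JPMorgan Chase" and "Chase" stay unmatched after both passes, so names like these would still need a manual mapping or another rule.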
I have multiple monthly datasets with 50 variables each. I need to append these datasets to create one single dataset. However, I also want to add the month's name to the corresponding records while appending such that I can see a new column in the final dataset which can be used to identify records belonging to a month.
Example:
Data 1: Monthly_file_201807
ID | customerCategory | Amount |
1 | home | 654.00 |
2 | corporate | 9684.65 |
Data 2: Monthly_file_201808
ID | customerCategory | Amount |
84 | SME | 985.29 |
25 | Govt | 844.88 |
On Appending, I want something like this:
ID | customerCategory | Amount | Month |
1 | home | 654.00 | 201807 |
2 | corporate | 9684.65 | 201807 |
84 | SME | 985.29 | 201808 |
25 | Govt | 844.88 | 201808 |
Currently, I'm appending using the following code:
List dsList = [
Data1Path,
Data2Path
].collect() {app.data.open(source:it)}
//concatenate all records into a single larger dataset
Dataset ds=app.data.create()
dsList.each(){
ds.prepareToAdd(it)
ds.addAll(it)
}
ds.save()
app.data.copy(in: ds, out: FinalAppendedDataPath)
I have used the standard append code, but I'm unable to add that additional column with a fixed month value. I don't want to loop through the data to create the additional "month" column, as my data is very large and I have multiple files.
I've created Pivot Tables before using VBA, but my professor recently gave us a bonus task that, although not required, is driving me nuts.
Use a VBA Macro to write Region, District, and Store Name to your first report to create a new report
1) My first report looks like this:
Location | Sum of ActNetSales | Sum of PlanNetSales
----------|--------------------|---------------------
1 | $76,170 | $65,172
100 | $163,691 | $140,057
101 | $34,724 | $29,710
104 | $70,501 | $60,322
106 | $113,826 | $97,391
2) Below is the data source for the above report.
Division | Year | Week | Location | SchedDept | PlanNetSales | ActNetSales | AreaCategory
----------|------|------|----------|-----------|--------------|-------------|--------------
5 | 2018 | 10 | 520 | 541 | 1943.2 | 2271.115 | Non-Comm
5 | 2018 | 10 | 520 | 608 | 4378.4 | 5117.255 | Non-Comm
5 | 2018 | 10 | 520 | 1059 | 1044.8 | 1221.11 | Comm
5 | 2018 | 10 | 520 | 1126 | 6308 | 7372.475 | Non-Comm
3) My professor wants me to add the following information to the above table: Region, District and Store Name. However, these 3 fields are from a different data source than the above report. Below is the data source for the 3 fields I've listed.
Division | Location | LocationName | Region | RegionName | District | DistrictName
----------|----------|--------------|--------|------------|----------|--------------
5 | 1 | Location 1 | 3 | Region 3 | 18 | District 18
5 | 4 | Location 4 | 5 | Region 5 | 32 | District 32
5 | 5 | Location 5 | 3 | Region 3 | 19 | District 19
5 | 6 | Location 6 | 5 | Region 5 | 28 | District 28
I've produced what he's asking for by joining the 2 tables (I built a unique key by concatenating the foreign keys, Location and Division, and used a basic INDEX/MATCH) and creating a Pivot Table from the result, but I want to try my best to solve the bonus! Unfortunately, I don't have Power Query, so I had to do it this way. I've searched for the above and can't find any good resources. Is there anything you can suggest, or can you point me in the right direction? Thank you!
Is it cheating to modify your table under (2) to add the columns region, district, and storename using VLOOKUP on the third table? The second table would then have raw data, and extra columns of constructed data, effectively joining it to the third table using the Excel VLOOKUP trick rather than an actual SQL table join.
Then you can just use the expanded, joined table as your one Pivot Table source.
Cheating is legal in love, war, and IT solutions.
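The lookup-then-pivot idea from this answer can be sketched in Python as a rough illustration rather than a VBA solution. The lookup rows come from the question's third table; the sales amounts for Locations 1 and 5 are made up for illustration, and the function names `enrich` and `pivot_actual_by_store` are hypothetical.

```python
from collections import defaultdict

# Lookup rows copied from the question's third table:
# (Division, Location) -> (LocationName, RegionName, DistrictName)
locations = {
    (5, 1): ("Location 1", "Region 3", "District 18"),
    (5, 4): ("Location 4", "Region 5", "District 32"),
    (5, 5): ("Location 5", "Region 3", "District 19"),
    (5, 6): ("Location 6", "Region 5", "District 28"),
}

# Sales rows as (Division, Location, PlanNetSales, ActNetSales).
# The first three rows are hypothetical; the last is from the question.
sales = [
    (5, 1, 100.0, 120.0),
    (5, 1, 50.0, 40.0),
    (5, 5, 200.0, 210.0),
    (5, 520, 1943.2, 2271.115),  # no lookup row for Location 520 in the sample
]


def enrich(sales, locations):
    """Append store, region and district names to each row (the VLOOKUP step)."""
    missing = (None, None, None)  # plays the role of #N/A
    return [row + locations.get((row[0], row[1]), missing) for row in sales]


def pivot_actual_by_store(enriched):
    """Sum ActNetSales per (RegionName, DistrictName, LocationName)."""
    totals = defaultdict(float)
    for _div, _loc, _plan, actual, store, region, district in enriched:
        totals[(region, district, store)] += actual
    return dict(totals)


enriched = enrich(sales, locations)
for key, total in pivot_actual_by_store(enriched).items():
    print(key, total)
```

Looking up on the (Division, Location) pair plays the same role as the concatenated key described in the question.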
Following on from my earlier questions: how can I pivot data using Informatica PowerCenter Designer when I have a variable number of addresses in my data? I would like to pivot, e.g., four addresses from my data. This is the structure of the source data file:
+---------+--------------+-----------------+
| ADDR_ID | NAME | ADDRESS |
+---------+--------------+-----------------+
| 1 | John Smith | JohnsAddress1 |
| 1 | John Smith | JohnsAddress2 |
| 1 | John Smith | JohnsAddress3 |
| 2 | Adrian Smith | AdriansAddress1 |
| 2 | Adrian Smith | AdriansAddress2 |
| 3 | Ivar Smith | IvarAddress1 |
+---------+--------------+-----------------+
And this should be the resulting table:
+---------+--------------+-----------------+-----------------+---------------+----------+
| ADDR_ID | NAME | ADDRESS1 | ADDRESS2 | ADDRESS3 | ADDRESS4 |
+---------+--------------+-----------------+-----------------+---------------+----------+
| 1 | John Smith | JohnsAddress1 | JohnsAddress2 | JohnsAddress3 | NULL |
| 2 | Adrian Smith | AdriansAddress1 | AdriansAddress2 | NULL | NULL |
| 3 | Ivar Smith | IvarAddress1 | NULL | NULL | NULL |
+---------+--------------+-----------------+-----------------+---------------+----------+
I guess I can use
SOURCE --> SOURCE_QUALIFIER --> SORTER --> AGGREGATOR --> EXPRESSION --> TARGET TABLE
But what kind of port should I use in AGGREGATOR and EXPRESSION transforms?
You should use something along the lines of this:
Source->Expression->Aggregator->Target
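In the expression, add a variable port, using one of the following:
v_COUNT expr: IIF(ISNULL(v_COUNT) OR v_COUNT=3, 1, v_COUNT + 1)
OR
v_COUNT expr: IIF(ADDR_ID=v_PREVIOUS_ADDR_ID, v_COUNT + 1, 1)
The first form only works when every ADDR_ID has exactly three rows. The second form handles a variable number of rows, but it needs the input sorted by ADDR_ID and a second variable port, v_PREVIOUS_ADDR_ID (expr: ADDR_ID), placed below v_COUNT so that it still holds the previous row's value when v_COUNT is evaluated.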
And 3 output ports:
o_addr1 expr: DECODE(TRUE, v_COUNT=1, ADDR_IN, NULL)
o_addr2 expr: DECODE(TRUE, v_COUNT=2, ADDR_IN, NULL)
o_addr3 expr: DECODE(TRUE, v_COUNT=3, ADDR_IN, NULL)
Then, in the aggregator, group by ADDR_ID and always take the maximum of each output port,
e.g.
agg_addr1: expr: MAX(O_ADDR1)
agg_addr2: expr: MAX(O_ADDR2)
agg_addr3: expr: MAX(O_ADDR3)
If you need more denormalized ports, add additional ports and set the initial state
of the v_count variable accordingly.
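The counter-plus-maximum logic can be traced in Python to check how the ports interact. This is a sketch under assumptions: it follows the second counter form (reset when ADDR_ID changes), expects rows sorted by ADDR_ID, and widens the output to the four address columns the question asks for. The function name `pivot_addresses` is hypothetical.

```python
MAX_ADDRESSES = 4  # the question asks for four address columns

# Source rows copied from the question, already sorted by ADDR_ID.
rows = [
    (1, "John Smith", "JohnsAddress1"),
    (1, "John Smith", "JohnsAddress2"),
    (1, "John Smith", "JohnsAddress3"),
    (2, "Adrian Smith", "AdriansAddress1"),
    (2, "Adrian Smith", "AdriansAddress2"),
    (3, "Ivar Smith", "IvarAddress1"),
]


def pivot_addresses(rows, max_addresses=MAX_ADDRESSES):
    """Pivot sorted (ADDR_ID, NAME, ADDRESS) rows into one tuple per ADDR_ID."""
    result = {}
    previous_id = None
    count = 0
    for addr_id, name, address in rows:
        # Variable-port logic: restart the counter whenever ADDR_ID changes.
        count = count + 1 if addr_id == previous_id else 1
        previous_id = addr_id
        record = result.setdefault(addr_id, [addr_id, name] + [None] * max_addresses)
        if count <= max_addresses:
            # Slot ADDRESS<count>; addresses beyond the limit are dropped.
            record[1 + count] = address
    return [tuple(record) for record in result.values()]


for record in pivot_addresses(rows):
    print(record)
```

Because each address lands in exactly one slot per group, the aggregator's MAX simply picks the single non-null value for that slot.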
Try this:
SOURCE --> SOURCE_QUALIFIER --> RANK --> AGGREGATOR -->TARGET
In the RANK transformation, group by ADDR_ID and select ADDRESS as the rank port. In the Properties tab, set Number of Ranks to 4.
In AGGREGATOR transformation group by on ADDR_ID and use the following output port expressions (RANKINDEX will be generated by RANK transformation):
ADDRESS1 = MAX(ADDRESS,RANKINDEX=1)
ADDRESS2 = MAX(ADDRESS,RANKINDEX=2)
ADDRESS3 = MAX(ADDRESS,RANKINDEX=3)
ADDRESS4 = MAX(ADDRESS,RANKINDEX=4)
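The rank-then-filtered-maximum approach can also be sketched in Python. This illustration assumes the addresses are ranked in ascending order within each ADDR_ID, which is a choice made here to reproduce the expected table and not something the answer states; the function names `rank_within_group` and `filtered_max` are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Source rows copied from the question.
rows = [
    (1, "John Smith", "JohnsAddress1"),
    (1, "John Smith", "JohnsAddress2"),
    (1, "John Smith", "JohnsAddress3"),
    (2, "Adrian Smith", "AdriansAddress1"),
    (2, "Adrian Smith", "AdriansAddress2"),
    (3, "Ivar Smith", "IvarAddress1"),
]


def rank_within_group(rows, number_of_ranks=4):
    """Yield (addr_id, name, address, rank_index), ranked ascending per ADDR_ID."""
    ordered = sorted(rows, key=itemgetter(0, 2))
    for _addr_id, group in groupby(ordered, key=itemgetter(0)):
        for rank_index, (addr_id, name, address) in enumerate(group, start=1):
            if rank_index <= number_of_ranks:
                yield addr_id, name, address, rank_index


def filtered_max(ranked, addr_id, wanted_rank):
    """Counterpart of MAX(ADDRESS, RANKINDEX=wanted_rank) for one group."""
    values = [a for i, _name, a, r in ranked if i == addr_id and r == wanted_rank]
    return max(values) if values else None


ranked = list(rank_within_group(rows))
pivoted = [
    (addr_id, name) + tuple(filtered_max(ranked, addr_id, r) for r in range(1, 5))
    for addr_id, name in sorted({(i, n) for i, n, _a, _r in ranked})
]
for record in pivoted:
    print(record)
```

If the rank order were descending instead, the same addresses would appear but in the opposite column order.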