How to pivot row data using Informatica?

How can I pivot row data using Informatica PowerCenter Designer? Say, I have a source file called address.txt:
+---------+--------------+-----------------+
| ADDR_ID | NAME | ADDRESS |
+---------+--------------+-----------------+
| 1 | John Smith | JohnsAddress1 |
| 1 | John Smith | JohnsAddress2 |
| 2 | Adrian Smith | AdriansAddress1 |
| 2 | Adrian Smith | AdriansAddress2 |
+---------+--------------+-----------------+
I would like to pivot the data so that each person's addresses appear as columns:
+---------+--------------+-----------------+-----------------+
| ADDR_ID | NAME | ADDRESS1 | ADDRESS2 |
+---------+--------------+-----------------+-----------------+
| 1 | John Smith | JohnsAddress1 | JohnsAddress2 |
| 2 | Adrian Smith | AdriansAddress1 | AdriansAddress2 |
+---------+--------------+-----------------+-----------------+
How can I do this in Informatica?

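If every person has exactly two addresses, you can use the FIRST and LAST functions in an Aggregator transformation: group by ADDR_ID and NAME, then output FIRST(ADDRESS) as ADDRESS1 and LAST(ADDRESS) as ADDRESS2.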
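The grouping logic behind FIRST and LAST can be illustrated outside Informatica. The following is a minimal pandas sketch, not an Informatica mapping: it assumes the rows of address.txt are already loaded into a DataFrame (here called rows, a hypothetical name) and that each person has exactly two addresses in file order.

```python
import pandas as pd

# Sample rows from address.txt in the question.
rows = pd.DataFrame({
    "ADDR_ID": [1, 1, 2, 2],
    "NAME": ["John Smith", "John Smith", "Adrian Smith", "Adrian Smith"],
    "ADDRESS": ["JohnsAddress1", "JohnsAddress2",
                "AdriansAddress1", "AdriansAddress2"],
})

# Group by the key columns and keep the first and last ADDRESS of each group,
# which mirrors FIRST(ADDRESS) and LAST(ADDRESS) in an Aggregator.
pivoted = (
    rows.groupby(["ADDR_ID", "NAME"], sort=False)
        .agg(ADDRESS1=("ADDRESS", "first"), ADDRESS2=("ADDRESS", "last"))
        .reset_index()
)
print(pivoted)
```

As with the Aggregator approach, this only works when there are at most two addresses per person; a third address in the middle of a group would be dropped.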

Related

Comparing Power BI/Excel Reports

I'm working on a project where I receive a list in Excel of employee names, dates, and IDs. I need to compare this list to a Power BI report that I've made to bring back any IDs that are locked.
For example:
I receive
| Employee Name | Date | ID |
| ------------- | --------- | -- |
| John Doe | 4/22/21 | 1 |
| Jane Doe | 4/23/21 | 2 |
The Power BI Report looks like this:
| Employee Name | Date | ID | LOCK? |
| ------------- | -------------- | -- | -------- |
| John Doe | 4/22/21 | 1 | LOCK |
| Jane Doe | 4/23/21 | 2 | UNLOCKED |
Is there a way to compare my list in Excel with my Power BI report on a large scale? I've tried Power Query in Excel, but the data is too large.
I ended up using a pbiviz file (Filter By List).

Excel, autofill column of an Excel table based on same column

I'm trying to autocomplete a column:
+-------+-----------------------+
| Name | Currently Employed? |
+-------+-----------------------+
| John | |
| Tom | |
| John | |
| John. | Yes |
+-------+-----------------------+
If I update the Currently Employed? column in row 4 to Yes, I want to find all other instances of that name and update them with the same value:
+-------+-----------------------+
| Name | Currently Employed? |
+-------+-----------------------+
| John | -> Yes |
| Tom | |
| John | -> Yes |
| John. | Yes |
+-------+-----------------------+
Any changes should be reflected across all instances of a name, regardless of order. For example, if:
+-------+-----------------------+
| Name | Currently Employed? |
+-------+-----------------------+
| John | -> No |
| Tom | |
| John | Yes |
| John. | Yes |
+-------+-----------------------+
Then:
+-------+-----------------------+
| Name | Currently Employed? |
+-------+-----------------------+
| John | Yes |
| Tom | |
| John | -> No |
| John. | -> No |
+-------+-----------------------+
Is this possible?

Combine multiple rows into single row in Pandas Dataframe

I have got a child table here. Here is the sample data.
+----+------+----------+----------------+--------+---------+
| ID | Name | City | Email | Phone | Country |
+----+------+----------+----------------+--------+---------+
| 1 | Ted | Chicago | abc#gmail.com | 132321 | USA |
| 1 | Josh | Richmond | abc#gmail.com | 435324 | USA |
| 2 | John | Seattle | 123#gmail.com | 322421 | USA |
| 2 | John | Berkley | 4723#gmail.com | 322421 | USA |
| 2 | Mike | Seattle | 4723#gmail.com | 322421 | USA |
+----+------+----------+----------------+--------+---------+
Rows with the same ID need to be combined into a single row. Only the unique values in each column are required.
+----+---------------+----------------------+----------------------------------+-------------------+---------+
| ID | Name | City | Email | Phone | Country |
+----+---------------+----------------------+----------------------------------+-------------------+---------+
| 1 | 'Ted','Josh' | 'Chicago','Richmond' | 'abc#gmail.com' | '132321','435324' | 'USA' |
| 2 | 'John','Mike' | 'Seattle','Berkley' | '123#gmail.com','4723#gmail.com' | '322421' | 'USA' |
+----+---------------+----------------------+----------------------------------+-------------------+---------+
If ordering is important, use GroupBy.agg with a lambda function and remove duplicates with a dictionary:
df1 = df.groupby('ID').agg(lambda x: ','.join(dict.fromkeys(x.astype(str)).keys())).reset_index()
# Another alternative, but slow on large data:
# df1 = df.groupby('ID').agg(lambda x: ','.join(x.astype(str).unique())).reset_index()
print(df1)
   ID       Name              City                         Email  \
0   1   Ted,Josh  Chicago,Richmond                 abc#gmail.com
1   2  John,Mike   Seattle,Berkley  123#gmail.com,4723#gmail.com
           Phone  Country
0  132321,435324      USA
1         322421      USA
If ordering is not important, use a similar solution that removes duplicates with sets:
df2 = df.groupby('ID').agg(lambda x: ','.join(set(x.astype(str)))).reset_index()
print (df2)
   ID       Name              City                         Email  \
0   1   Josh,Ted  Richmond,Chicago                 abc#gmail.com
1   2  John,Mike   Berkley,Seattle  4723#gmail.com,123#gmail.com
           Phone  Country
0  435324,132321      USA
1         322421      USA
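For reference, here is a self-contained sketch of the order-preserving approach run against the sample data from the question. The helper name join_unique is hypothetical, and the single quotes around each value are an assumption made to match the quoted format in the desired output.

```python
import pandas as pd

# Sample child table from the question.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 2],
    "Name": ["Ted", "Josh", "John", "John", "Mike"],
    "City": ["Chicago", "Richmond", "Seattle", "Berkley", "Seattle"],
    "Email": ["abc#gmail.com", "abc#gmail.com", "123#gmail.com",
              "4723#gmail.com", "4723#gmail.com"],
    "Phone": [132321, 435324, 322421, 322421, 322421],
    "Country": ["USA"] * 5,
})


def join_unique(values):
    """Join the unique values of one column, keeping first-seen order."""
    # dict.fromkeys drops duplicates while preserving insertion order.
    unique = dict.fromkeys(values.astype(str))
    return ",".join(f"'{v}'" for v in unique)


df1 = df.groupby("ID").agg(join_unique).reset_index()
print(df1)
```

Drop the f-string quoting if plain comma-separated values are enough, which is what the snippets above produce.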

PowerPivot Grouped Average DAX

I'm trying to model some outbound calling data in PowerPivot. We have reps across multiple locations, and in general we break down our outbound calling into two periods of the day (before and after 12pm).
We can export from our phone system a list of every call made in a day. Let's say an example is as follows:
+------------+-------------+-------+-----------+-------------+
| Date | Call Length | Agent | Workgroup | Call Period |
+------------+-------------+-------+-----------+-------------+
| 01.01.2016 | 00:05:26 | Sam | Sydney | 1 |
| 01.01.2016 | 00:15:05 | Sam | Sydney | 1 |
| 01.01.2016 | 00:55:22 | John | Sydney | 2 |
| 01.01.2016 | 00:45:11 | Sam | Sydney | 2 |
| 01.01.2016 | 00:04:52 | John | Sydney | 1 |
| 01.01.2016 | 00:01:52 | Timmy | London | 1 |
| 01.01.2016 | 00:02:21 | Timmy | London | 2 |
| 01.01.2016 | 00:05:21 | Karen | London | 1 |
| 02.01.2016 | 00:15:21 | Sam | Sydney | 1 |
| 02.01.2016 | 00:42:44 | Sam | Sydney | 2 |
| 02.01.2016 | 01:52:22 | John | Sydney | 1 |
| 02.01.2016 | 00:53:24 | John | Sydney | 1 |
| 02.01.2016 | 00:05:53 | Kerry | Sydney | 2 |
| 02.01.2016 | 00:43:43 | Sam | Sydney | 2 |
| 02.01.2016 | 01:08:00 | John | Sydney | 2 |
| 02.01.2016 | 00:13:52 | Timmy | London | 2 |
| 02.01.2016 | 00:25:44 | Timmy | London | 1 |
| 02.01.2016 | 02:58:31 | Karen | London | 1 |
| 02.01.2016 | 00:08:37 | Timmy | London | 2 |
| 02.01.2016 | 00:12:28 | Karen | London | 2 |
+------------+-------------+-------+-----------+-------------+
What I'm trying to calculate is the average daily time spent on the phone per workgroup, i.e. on average, how long each agent is on the phone at each location.
I'm guessing the arithmetic is as follows:
Measure 1: Total talk time for each agent (i.e. the sum of all talk time for the day)
Measure 2: Average agent total talk time per workgroup (i.e. the sum of the above grouped by workgroup, divided by the number of agents in that workgroup)
The output might look something like this (but doesn't have to be):
+------------+-----------+-----------------------+-----------------+-----------------------------+
| Date | Workgroup | Total Number of Calls | Total Talk Time | Average Talk Time per Agent |
+------------+-----------+-----------------------+-----------------+-----------------------------+
| 01.01.2016 | Sydney | 11 | 03:02:42 | 1:34:53 |
| | London | 4 | 02:24:51 | 01:13:41 |
| 02.01.2016 | Sydney | 5 | 01:52:05 | 00:56:51 |
| | London | 52 | 10:11:23 | 03:51:11 |
+------------+-----------+-----------------------+-----------------+-----------------------------+
Apologies if I'm unclear in what I'm asking.
Slicing your data in a pivot table will do the calculations.
You only need the following measures:
DurationOfCall := SUM(MyTable[CallLength])
NrOfCalls := COUNTROWS(MyTable)
AvgDuration := DIVIDE([DurationOfCall], [NrOfCalls])
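Note that AvgDuration above divides by the number of calls, so it is an average per call. The per-agent figure described in the question (total talk time per agent, then averaged over the agents in each workgroup) takes two grouping steps. The following is a pandas sketch of that arithmetic rather than a DAX measure, using only the 01.01.2016 rows of the sample; the names calls, per_agent, summary, and AvgTalkTimePerAgent are assumptions for illustration.

```python
import pandas as pd

# The 01.01.2016 rows from the sample export.
calls = pd.DataFrame({
    "Date": ["01.01.2016"] * 8,
    "CallLength": ["00:05:26", "00:15:05", "00:55:22", "00:45:11",
                   "00:04:52", "00:01:52", "00:02:21", "00:05:21"],
    "Agent": ["Sam", "Sam", "John", "Sam", "John", "Timmy", "Timmy", "Karen"],
    "Workgroup": ["Sydney"] * 5 + ["London"] * 3,
})
calls["CallLength"] = pd.to_timedelta(calls["CallLength"])

# Measure 1: total talk time for each agent on each day.
per_agent = calls.groupby(
    ["Date", "Workgroup", "Agent"], as_index=False
)["CallLength"].sum()

# Measure 2: total talk time per workgroup divided by its number of agents.
summary = per_agent.groupby(["Date", "Workgroup"]).agg(
    TotalTalkTime=("CallLength", "sum"),
    Agents=("Agent", "nunique"),
)
summary["AvgTalkTimePerAgent"] = summary["TotalTalkTime"] / summary["Agents"]
print(summary)
```

For these rows, Sydney has two agents with 2:05:56 of talk time in total, so the average per agent is 1:02:58.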

Create columns from column values in Excel

I have data in Excel:
+-----------------------------+--------------------+----------+
| Name | Category | Number |
+-----------------------------+--------------------+----------+
| Alex | Portret | 3 |
| Alex | Other | 2 |
| Serge | Animals | 1 |
| Serge | Portret | 4 |
+-----------------------------+--------------------+----------+
And I want to transform it to:
+-----------+-----------+-------+---------+
| Name | Portret | Other | Animals |
+-----------+-----------+-------+---------+
| Alex | 3 | 2 | 0 |
| Serge | 4 | 0 | 1 |
+-----------+-----------+-------+---------+
How can I do this in MS Excel?
You can use a pivot table for that: put Name in the rows, Category in the columns, and Number in the values.
Take a look at http://office.microsoft.com/en-gb/excel-help/pivottable-reports-101-HA001034632.aspx
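If the data ever needs to be reshaped outside Excel, the same pivot can be sketched in pandas. This is an illustration of the reshaping only, not an Excel feature; the DataFrame names data and wide are assumptions, and missing Name/Category combinations are filled with 0 to match the desired output.

```python
import pandas as pd

# Sample data from the question.
data = pd.DataFrame({
    "Name": ["Alex", "Alex", "Serge", "Serge"],
    "Category": ["Portret", "Other", "Animals", "Portret"],
    "Number": [3, 2, 1, 4],
})

# One row per Name, one column per Category, summing Number per cell.
wide = data.pivot_table(
    index="Name",
    columns="Category",
    values="Number",
    aggfunc="sum",
    fill_value=0,
)
print(wide)
```

The resulting columns come out in alphabetical order (Animals, Other, Portret) rather than the order shown in the question.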
