How to migrate comma-separated values into separate rows of a DB using Talend - data-migration

In my old DB, one of the tables has two columns that hold comma-separated values, like this:
oid  columnA  columnB
1    21,22    hi,hello
I want to migrate these values into my new table as:
NEW_OID  old_oid  ID  COMMENTS
1        1        21  hi
2        1        22  hello
I tried the tNormalize component, but it only normalizes one comma-delimited column at a time: I can split either columnA or columnB, but not both. With it I get this output:
NEW_OID  old_oid  ID  COMMENTS
1        1        21  hi,hello
2        1        22  hi,hello
Could someone please guide me on how to do this?

One possible way is to split the flow into two pipelines, as below:
tInput --> tNormalize (normalize on the ID column) --> tMap (add a row-number column here) --> tHash1/tFile1
tInput --> tNormalize (normalize on the comments column) --> tMap (add a row-number column here) --> tHash2/tFile2
tHash1/tFile1 --> tMap (lookup: tHash2/tFile2), join on new oid, old oid and row number --> output
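To make the intended result concrete outside Talend, here is a minimal Python sketch of the same idea: split both columns and pair the n-th ID with the n-th comment by position. The function name `split_rows` is hypothetical, and the sketch assumes both lists in a row have the same number of items (`zip` silently drops any extras).

```python
from itertools import count


def split_rows(old_rows):
    """Expand each (oid, ids_csv, comments_csv) row into one row per list position."""
    new_oid = count(1)  # running key for the new table
    out = []
    for oid, ids_csv, comments_csv in old_rows:
        ids = ids_csv.split(",")
        comments = comments_csv.split(",")
        # zip pairs items by position, which plays the role of the row-number join
        for id_value, comment in zip(ids, comments):
            out.append((next(new_oid), oid, int(id_value), comment))
    return out


rows = split_rows([(1, "21,22", "hi,hello")])
for row in rows:
    print(row)
```

The positional pairing is what the row-number column achieves in the two-pipeline job: each pipeline numbers its items, and the join lines them up again.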

Related

Counting multiple values in a range to group by category and making a pivot table with slicer

I've been Googling this for days and I can't seem to wrap my head around it, and I'm not familiar enough with it to figure it out. I have a table of data. I have a list of categories with multiple codes for each category. Each row in my table has 100 columns that can have category codes in them. They can be blank or have different codes from the same category, but no duplicate codes. Here is a small example:
       val1  val2  val3  val4  val5  val6
user1  3     5     3     6     4     7
user2  6     5     8     2     4     5
user3  7     7     5     3     7     0
user4  1     4     7     3     9     2
I am trying to make a pivot table to count the number of times codes are present for each category. Initially, I created additional columns in the data table, one for each category, that used COUNTIFs to look in all the columns per row and add up the categories. The additional columns look like this:
       cat1  cat2  cat3
user1  3     5     3
user2  6     5     8
user3  7     7     5
user4  1     4     7
So for example, if you count up all the codes belonging to cat1 for user1 (columns val1 - val100), it would be 3. The problem with this is that when I make my pivot table the columns are labeled "sum of" followed by the category name, but more importantly, I can't make a slicer by category. I can make a slicer for one category, and it lets me filter by the number of times the value appears in a row (0, 1, 2, 3, etc).
I made another table with the codes in one column (unique) and the categories in another (not unique), but I just can't figure out how to get my pivot table working. I've been reading about adding a measure and using a DAX formula but I don't know if that's the right approach and I'm not familiar with them either. I need a pivot table because I eventually will turn it into a graph with slicers. Can anyone point me in the right direction?
It seems your only option is to do the COUNTIF at the end of the table,
or to reconstruct your source table
and then pivot that.
Lastly, you can pivot that based on your second table.
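One reading of "reconstruct your source table" is a long (unpivoted) layout with one row per user and code, joined to the code-to-category table so that category becomes a single field you can slice on. A minimal Python sketch of that reshaping follows; the codes, the mapping and the function names are all hypothetical, since the question does not give the real code list.

```python
from collections import Counter

# Hypothetical wide table: one row per user, blanks stored as None.
wide = {
    "user1": ["A1", "B2", None, "A3"],
    "user2": ["B1", None, None, "C1"],
}

# Hypothetical second table: unique code -> category.
code_to_category = {"A1": "cat1", "A3": "cat1", "B1": "cat2", "B2": "cat2", "C1": "cat3"}


def unpivot(wide_rows):
    """Turn the wide table into (user, code) pairs, skipping blank cells."""
    return [(user, code) for user, codes in wide_rows.items() for code in codes if code is not None]


def count_by_category(long_rows, mapping):
    """Count codes per (user, category), like a pivot on the long table."""
    return Counter((user, mapping[code]) for user, code in long_rows)


long_rows = unpivot(wide)
counts = count_by_category(long_rows, code_to_category)
for key, n in sorted(counts.items()):
    print(key, n)
```

With the data in this shape, category is an ordinary column, which is what a slicer needs.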
Your second Option is to

How to find latest category change for product using excel function

I have the scenario below, with a million rows, where I have to fill in data based on another table. Can someone please help with this?
Table 1:
Date       Category  Product  Change?
1/1/2010   Cat 1     Prod 1
12/2/2019  Cat 2     Prod 1
1/1/2020   Cat 3     Prod 3
3/2/1989   Cat 4     Prod 4
5/2/1990   Cat 5     Prod 4
2/2/2020   Cat 6     Prod 2
Table 2:
Product  Category  Expected Category
Prod 1             Cat 2
Prod 2             Cat 6
Prod 3             Cat 3
Prod 4             Cat 5
Problem 1:
I have to fill in the latest category in Table 2 based on the latest date available in Table 1. The expected category (the answer) is given in the last column of Table 2.
Problem 2:
True/False is to be filled in the Table 1 "Change?" column based on whether there is any category change for the product.
There's no need to sort the table.
You mentioned the data is in Tables, so I use structured references. You could certainly switch to regular addressing if that is an issue.
If you have O365 with the FILTER and SORT functions, you can use:
=INDEX(SORT(FILTER(Table1,Table1[Product]=[@Product]),1,-1),1,2)
FILTER returns an array containing only the rows for the designated Product.
SORT then sorts that array by its first column (Date) in descending order, so that the newest row comes first.
INDEX returns the contents of the first row, second column, which is the relevant Category.
If you do not have those functions, you can use:
=LOOKUP(2,1/((AGGREGATE(14,6,1/(Table1[Product]=[@Product])*Table1[Date],1)=Table1[Date])*([@Product]=Table1[Product])),Table1[Category])
This will work if, for each product, the rows are in ascending date order, as shown in your table. The formula returns the category from the last matching row, so you will get the latest category.
=LOOKUP(2,1/($C$2:$C$7=F5),$B$2:$B$7)
For the True/False in the Table 1 "Change?" column, based on whether there is any category change for the product: this can be done with COUNTIFS. If there is a change, the product will appear more than once.
So D7 = IF(COUNTIFS($C$2:$C$7,C7)>1,"True","False")
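For checking the formulas against expected results, both problems can be sketched in a few lines of Python. This is only an illustration under assumptions: the dates are read as month/day/year (the question does not say which format applies), the function names are made up, and the change flag here tests for more than one distinct category per product, which is slightly stricter than the COUNTIFS version that only counts occurrences.

```python
from datetime import datetime

# Table 1 from the question: (date, category, product).
table1 = [
    ("1/1/2010", "Cat 1", "Prod 1"),
    ("12/2/2019", "Cat 2", "Prod 1"),
    ("1/1/2020", "Cat 3", "Prod 3"),
    ("3/2/1989", "Cat 4", "Prod 4"),
    ("5/2/1990", "Cat 5", "Prod 4"),
    ("2/2/2020", "Cat 6", "Prod 2"),
]


def latest_category(rows, date_format="%m/%d/%Y"):
    """Problem 1: category on the newest date per product; row order does not matter."""
    latest = {}
    for date_text, category, product in rows:
        date = datetime.strptime(date_text, date_format)  # assumed date format
        if product not in latest or date > latest[product][0]:
            latest[product] = (date, category)
    return {product: category for product, (_, category) in latest.items()}


def has_change(rows):
    """Problem 2: True for each row whose product has more than one distinct category."""
    categories = {}
    for _, category, product in rows:
        categories.setdefault(product, set()).add(category)
    return [len(categories[product]) > 1 for _, _, product in rows]


print(latest_category(table1))
print(has_change(table1))
```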

Aggregating records with two main IDs in [VBA macro]

I want to make a macro in Excel that summarizes data from rows that match a composite ID generated from 2 ID columns. In my excel sheet, each row has 2 main ID columns: ID_1 is the main key, and ID_2 is a secondary key from which I only care about the first 2 letters (Which I have gotten using LEFT). I want to group rows with the same ID_1 and first 2 letters of ID_2 and report the SUM of the value, count, and sum columns.
In the example picture below, I want to turn the data in columns A:J into the data in columns M:V
So, in this example, we have 6 records with ID_1 = 1015 and 3 different ID_2 prefixes (AB, AZ, AE). I want to sum them into one row each (1015 AB; 1015 AZ; 1015 AE) using the values from each record. For instance, there are 3 records for 1015 AB with VALUE 2, 3 and 4, so in the result I want just one row: 1015 AB, 9 (sum of value), 4 (sum of count), 17 (sum of value * count). Note that this 17 doesn't come from 9 * 4; it is =SUM(I4:I6). The rows may also be spread out, as in the 1200 FF example below. I am still trying to sort by both IDs at once, but I can't get past it.
Add a helper column in D that combines ID_1 and the first 2 characters of ID_2: =A4 & LEFT(C4,2). Copy that down, then go to L4 and type in:
=INDEX($D$4:$D$25,MATCH(0,COUNTIF(L$3:L3,$D$4:$D$25),0))
and press Ctrl + Shift + Enter to enter it as an array formula. Copy down to get a list of unique combinations, and then split these values into the separate columns.
Finally to pull in the numbers, put this in Q4:
=SUMIFS(E$4:E$25,$A$4:$A$25,M4,$C$4:$C$25,O4 & "*")
and then copy down and across.
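The grouping logic behind the helper column and SUMIFS can also be written out as a short Python sketch, which may help when porting it to a VBA macro. The records below are hypothetical (the original picture is not available), as is the function name; each record is (ID_1, ID_2, value, count, total), and the three numeric columns are summed independently per composite key.

```python
from collections import defaultdict

# Hypothetical records: (ID_1, ID_2, value, count, total). Rows of one group may be spread out.
records = [
    (1015, "AB123", 2, 1, 2),
    (1015, "AB456", 3, 2, 6),
    (1015, "AZ001", 5, 1, 5),
    (1200, "FF010", 1, 1, 1),
    (1015, "AB789", 4, 1, 4),
    (1200, "FF020", 2, 3, 6),
]


def aggregate(rows):
    """Sum value, count and total per (ID_1, first two letters of ID_2)."""
    totals = defaultdict(lambda: [0, 0, 0])
    for id_1, id_2, value, count, total in rows:
        sums = totals[(id_1, id_2[:2])]  # composite key, like =A4 & LEFT(C4,2)
        sums[0] += value
        sums[1] += count
        sums[2] += total
    return dict(totals)


summary = aggregate(records)
for key, sums in summary.items():
    print(key, sums)
```

Because the total column is summed row by row, the result is not the product of the summed value and summed count, which matches the point made in the question.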

How to create a count table of two variables in excel pivot

I am struggling with creating a count table of two variables at the same time. Ultimately, I would like to create a bar graph of the table.
Assuming I have two items for a sample of firms and I just want a summary table of the answer count.
Firm Item1 Item2
1 1 1
2 2 1
3 1 2
4 1 2
Based on this answer, I can easily create the summary table for one item at a time, telling me that "1" appears three times in Item1 and two times in Item2. But I can't easily create a pivot table showing this jointly.
I'm not sure I understood correctly, but is it COUNTIF() you need?
In D1 type:
=COUNTIF(B:B, "1")
That should give you the result 3.
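To show what the joint count table would contain, here is a small Python sketch over the sample data: one row per answer value, with the count for Item1 and Item2 side by side. The function name `count_table` is made up for illustration.

```python
from collections import Counter

# Sample data from the question: (firm, item1, item2).
rows = [(1, 1, 1), (2, 2, 1), (3, 1, 2), (4, 1, 2)]


def count_table(data):
    """Return {answer: (count in Item1, count in Item2)} for every answer seen."""
    item1 = Counter(row[1] for row in data)
    item2 = Counter(row[2] for row in data)
    answers = sorted(set(item1) | set(item2))
    # A Counter returns 0 for answers that never occur in one of the items
    return {answer: (item1[answer], item2[answer]) for answer in answers}


table = count_table(rows)
print("Answer  Item1  Item2")
for answer, (n1, n2) in table.items():
    print(f"{answer:<7} {n1:<6} {n2}")
```

In Excel terms, this is one COUNTIF per item column and answer value, laid out as a grid that a bar chart can use directly.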

Excel: filter table rows by specified column value

I have a table with the first column as the primary key. Ex:
id  value1  value2
1   10      5
2   2       3
3   12      5
..
I also have a second list of ids I want to select, which can contain repeated ids. Ex:
selectId
1
2
2
2
5
10
..
How can I "merge" the two tables (something like INNER JOIN) to obtain:
id  value1  value2
1   10      5
2   2       3
2   2       3
2   2       3
5   99      99
10  22      22
..
I tried using 'Microsoft Query' from Data > External Data to join the two tables. The problem is that it seems it cannot handle tables with more than 256 columns.
Thanks
UPDATE:
Thanks, VLOOKUP works as intended.
However one problem is that if the row was found but that corresponding column was blank, this function returns 0 (where I expected it to return an empty cell), and since zero is a valid value, I have no way to differentiate between the two (blank and zero)?
Any help is appreciated..
If this is Excel, like the title says, just use VLOOKUP.
Not very relational, but that's the Excel way.
Using the VLOOKUP function would get you the data in the layout you require.
If you are using Tables in Excel 2007, the formula would look like this based on the example below.
in cell B8
=VLOOKUP([selectId],Table1,2,FALSE)
in cell C8
=VLOOKUP([selectId],Table1,3,FALSE)
Lookup screenshot http://img208.imageshack.us/img208/1/lookupz.png
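For comparison, the lookup the question describes (an inner join that keeps repeated ids) can be sketched in Python as below. The data and the function name are hypothetical; blank cells are stored as None so they stay distinguishable from a real 0, which is the distinction the update to the question asks about.

```python
# Hypothetical table keyed by id; None marks a blank cell.
table = [
    (1, 10, 5),
    (2, 2, 3),
    (3, 12, None),
]

select_ids = [1, 2, 2, 3, 7]


def select_rows(rows, ids):
    """Return one table row per selected id, in order, skipping ids not in the table."""
    by_id = {row[0]: row for row in rows}  # primary key -> row
    return [by_id[i] for i in ids if i in by_id]


result = select_rows(table, select_ids)
for row in result:
    print(row)
```

Here id 7 is dropped because it has no match, and the blank in the row for id 3 comes through as None rather than 0.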
It is not clear where you store your data, but it looks like you have the problem described on the Microsoft site:
http://support.microsoft.com/kb/272729
