Identify unique record from duplicates in Talend

http://postimg.org/image/89yglfakx/
Refer to the image at the link above for reference.
I have an Excel file that is updated daily, so the data is different every time.
I am loading the data from the Excel sheet into a table using Talend. The table has a primary key, Company_ID.
The problem is that the Excel sheet contains a few duplicate Company_ID values.
It will pick up more duplicates in the future, because the Excel file is updated daily and the duplicated Company_ID values will change.
For Company_ID 1, I want to choose the one record that doesn't have nulls in the rest of the columns.
For Company_ID 3, the columns contain null values, which is OK since it is the only record for that Company_ID.
How do I choose, in Talend, the single row that has the maximum number of column values present, as in the case of Company_ID 1?
I tried tUniqRow, but it keeps the first record from each set of duplicates, so it doesn't work if the first record for a duplicated Company_ID has null values.

Related

How to create a number field with prefix (ex: 00001) that increments whenever a document is added in SharePoint Online?

I created a calculated column and concatenated the ID with a prefix, but it didn't work as expected.
The reason is that whenever I upload a file, the formula in the calculated field that combines the ID with the prefix is executed before SharePoint creates the ID.
So the ID is evaluated as empty and only the prefix is shown.
What I used:
=CONCATENATE(REPT("0",MAX(0,5-LEN(ID))),ID)
What I would like to create is a number field with a prefix (ex: 00001) that increments whenever a document is added in SharePoint Online.
In other words, it should work like the ID column in a list, but zero-padded to five digits.
Is there any solution for this problem?
Thank you 😁.

With a calculated column formula, the ID column and the calculated column are populated at the same time when a new item is created, so the formula cannot read the value of the ID column and the calculated column is displayed as 0.
But you can solve this problem by creating a flow.
Please refer to the screenshot below:
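Whatever mechanism ends up writing the value, the padding that the formula in the question is meant to compute is simple. The following is a minimal sketch of that logic only, not of the flow itself; the function name padded_id is hypothetical, and it assumes the ID is a non-negative integer that is already available when the value is computed.

```python
def padded_id(item_id: int, width: int = 5) -> str:
    """Left-pad the ID with zeros up to the given width.

    Mirrors REPT("0", MAX(0, 5 - LEN(ID))) followed by ID: IDs that are
    already longer than the width are returned unchanged.
    """
    return str(item_id).zfill(width)


first = padded_id(1)
longer = padded_id(123456)
```

Here first is "00001" and longer stays "123456", which is the same behaviour the MAX(0, ...) guard gives in the formula.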

Updating a table "Y" in file "B" with new added rows from table "X" in file "A"

I am trying to create an "instant cloud flow" on Power Automate.
Context:
Excel file "A" contains a table "X" which gets updated regularly with new rows at the bottom containing new information.
Excel file "B" contains a table "Y" with the same characteristics (number of columns, headers, etc.), except for the number of rows, since table "X" is regularly updated with new rows of data.
Both files are stored on OneDrive and will possibly move to SharePoint file storage, so they will be in the cloud, not stored locally on any device.
Challenge/need:
I need table "Y", within file "B", to mirror the changes happening on table "X" from file "A". Specifically the new rows of data being added to table "X":
Internet/world > New rows of data at the bottom of table "X" of file "A" > These same new rows also get copied to the bottom of table "Y" of file "B". Basically both tables, "X" and "Y", need to stay exactly the same, with a maximum delay of 3 minutes.
Solution tried:
I tried a flow which gets triggered every minute. In this flow, I tried creating an array containing the new rows of data added to table "X". Then, using the Apply to each control with the values from this new array, I tried the action Add a row into a table, followed by Update a row, for each item inside this array, so that table "Y" stays updated as per table "X". This part works: rows are added and updated on table "Y".
My problem:
The Condition that compares the data from the 2 tables decides that all rows from table "X" are new data, even though some are already present in table "Y". This is a problem because too many rows are added to table "Y" and the tables go out of sync due to the difference in the number of rows/body length. In my understanding, this happens because List rows present in a table generates a property called ItemInternalId for each row.
This ItemInternalId gets a different id value for rows that were already copied previously, and because of this, the condition identifies all rows on table "X" as new data to be added to table "Y".
Questions:
Could someone confirm that this ItemInternalId is the problem here? I am in doubt because I tried removing it by creating another array with the Select action and then proceeding with just the columns/headers I need, thereby excluding ItemInternalId. The problem is that the "header" (which I need) is excluded, leaving only the value, and the condition still identifies all rows on "X" as new data anyway...
Maybe I am doing it wrong and there is another simpler or better way to get an array with the new items from table "X"? Here is the condition that I use to try to feed a new array with the new rows from table "X":
Thank you

I found a workaround. I will not accept this as the right answer because it is just a workaround, not a definitive solution to the problem.
Basically, file "A" needs to have a table "X" with just 1 blank row. The Power Automate flow will "add new rows" with the information to this table.
Then, in file "B", table "Y" needs to be created with a certain number of rows, depending on how much data comes in per day (for example 100). Then create a Power Automate flow that "updates the table"; this will copy the information from table "X" to table "Y".
Please be aware that you will need a Key column on both tables so that Power Automate knows which rows to update. You can just use basic numerical order for each row in the Key column.
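The role of the Key column can be sketched outside Power Automate. The idea, shown below under stated assumptions, is to decide whether a row is new by comparing only the Key value rather than the whole row object, so that a per-row property such as ItemInternalId cannot make identical rows look different. The function names, the Value column and the sample data are hypothetical; this is plain Python, not a Power Automate expression.

```python
from typing import Any

Row = dict[str, Any]


def strip_internal_id(row: Row) -> Row:
    """Return a copy of the row without the ItemInternalId property."""
    return {name: value for name, value in row.items() if name != "ItemInternalId"}


def new_rows(table_x: list[Row], table_y: list[Row], key: str = "Key") -> list[Row]:
    """Rows of X whose key does not yet appear in Y, with ItemInternalId removed."""
    known_keys = {row[key] for row in table_y}
    return [strip_internal_id(row) for row in table_x if row[key] not in known_keys]


# Hypothetical data: the same logical row carries different ItemInternalId values.
table_x = [
    {"ItemInternalId": "a1", "Key": 1, "Value": "first"},
    {"ItemInternalId": "a2", "Key": 2, "Value": "second"},
]
table_y = [
    {"ItemInternalId": "zz", "Key": 1, "Value": "first"},
]
to_add = new_rows(table_x, table_y)
```

Only the row with Key 2 ends up in to_add, even though the ItemInternalId of the Key 1 row differs between the two tables.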

Excel Pivot Table Count of Sub-Rows

I have a Pivot Table structure as follows:
ROWS:
+-State
+---Customer
+-----Brand
Columns:
+-Cost
I would like to have another column that contains the number of Customers in each State. The issue is that my data contains every order that the customers have placed, so when I try to get the count of Customers, it counts every instance of each customer in the column. Another issue is that my data is 40,000 rows, so I want to avoid having to edit the raw data.
I can easily do this with brute force, but I was wondering if there is any way to do this with standard pivot tables and no add-ons. The pivot table already does a nice job of consolidating the unique values for customers; now I just need a count of those unique values.
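What is being asked for is a distinct count: each customer should be counted once per state, no matter how many order rows they have. A minimal sketch of that computation follows, with hypothetical sample orders; it only illustrates the expected numbers and is not a pivot table feature.

```python
from collections import defaultdict


def customers_per_state(orders: list[dict[str, str]]) -> dict[str, int]:
    """Count distinct customers per state from one-row-per-order data."""
    seen: dict[str, set[str]] = defaultdict(set)
    for order in orders:
        # A set ignores repeated orders from the same customer.
        seen[order["State"]].add(order["Customer"])
    return {state: len(customers) for state, customers in seen.items()}


# Hypothetical orders: customer "A" appears twice in the same state.
orders = [
    {"State": "TX", "Customer": "A", "Brand": "X"},
    {"State": "TX", "Customer": "A", "Brand": "Y"},
    {"State": "TX", "Customer": "B", "Brand": "X"},
    {"State": "CA", "Customer": "C", "Brand": "X"},
]
counts = customers_per_state(orders)
```

A plain count of the Customer column would give 3 for TX here, while the distinct count is 2.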

MVC 5 related to views settings for grid acc. to view indexing

I have a first list that holds all the data, approx. 50 columns including data from foreign key tables, loaded through a model.
I have a second list that holds only column names, one per row, approx. 10 column names; there can be more depending on a condition.
Now I want to bind a table using the column names from the 2nd list, with the data coming from the 1st list, since those columns are available in the 1st list. So I want to show only those columns using MVC. Is it possible?
Note: I am using this for view settings, which differ for each view we are creating. So please tell me how I can map the rows of the 2nd list to the columns of the 1st list using an MVC query.
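The mapping being asked about amounts to projecting each data row onto a configurable, ordered list of column names. The sketch below shows that idea in Python rather than in MVC/C#, purely as an illustration; the function name, the column names and the sample rows are hypothetical.

```python
from typing import Any


def project(rows: list[dict[str, Any]], columns: list[str]) -> list[list[Any]]:
    """Build table rows that contain only the requested columns, in order.

    A column name missing from a row yields None instead of raising.
    """
    return [[row.get(column) for column in columns] for row in rows]


# Hypothetical data: the first list has many columns, the second names a few.
all_data = [
    {"Id": 1, "Name": "Ann", "City": "Pune", "Phone": "123"},
    {"Id": 2, "Name": "Bob", "City": "Goa", "Phone": "456"},
]
view_columns = ["Name", "City"]
table = project(all_data, view_columns)
```

The list of column names doubles as the header row of the grid, and changing it changes the view without touching the data list.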

Pivot table using 2 data ranges

Is it possible to have a single pivot table that uses 2 worksheets as its data?
For example, first data table will be made up of the following columns:
ID/Details/Category
The second data table will be made up of the following columns:
ID/Customer name
The reason why the 2 tables are not combined is that there may be many customer names for the same ID.
I want a pivot table that will show me the following things:
1) Be able to sort by ID and see for each ID the details linked to that ID sorted by category
2) Be able to sort by customer name and see the details linked to that customer sorted by category.
Thank you for your help.

Press Alt+D, then P, to open the PivotTable and PivotChart Wizard. Select "Multiple consolidation ranges", choose your two ranges of data, and you've got it.
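To make the second requirement in the question concrete, the relationship between the two tables can be sketched as a join on ID: each customer name is linked to the details that share its ID, sorted by category. This is an illustration of the expected result only, not of what the wizard does internally; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Any

Row = dict[str, Any]


def details_by_customer(details: list[Row], customers: list[Row]) -> dict[str, list[Row]]:
    """Link each customer name to the detail rows sharing its ID, sorted by Category."""
    by_id: dict[Any, list[Row]] = defaultdict(list)
    for detail in details:
        by_id[detail["ID"]].append(detail)

    result: dict[str, list[Row]] = defaultdict(list)
    for customer in customers:
        result[customer["Customer name"]].extend(by_id[customer["ID"]])
    for rows in result.values():
        rows.sort(key=lambda row: row["Category"])
    return dict(result)


# Hypothetical data: ID 1 has two customer names attached to it.
details = [
    {"ID": 1, "Details": "Invoice", "Category": "B"},
    {"ID": 1, "Details": "Quote", "Category": "A"},
    {"ID": 2, "Details": "Refund", "Category": "A"},
]
customers = [
    {"ID": 1, "Customer name": "Ann"},
    {"ID": 1, "Customer name": "Bob"},
    {"ID": 2, "Customer name": "Cy"},
]
linked = details_by_customer(details, customers)
```

Both Ann and Bob end up with the two detail rows of ID 1, which is why the two tables cannot simply be merged into one flat table without repeating rows.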
