Apache POI add new column to pivot table - apache-poi

I have a pivot table with the column labels "Total Sales" and "Individual Sales". I want to add a new column to the pivot table that calculates the percentage of sales (Individual Sales / Total Sales * 100), but I am having a problem adding this calculated column.
I think the problem is with the following line of code in the addFormulaToCache method:
ctCacheField.setFormula("'Individual Sales'/'Total Sales' * 100")
Note: "Individual Sales" and "Total Sales" in the line above are the column names I have in the pivot table.
Here is my complete code. When I run it, it generates the Excel file, but when I open the file I get an error and I don't see the pivot table.
    AreaReference a = new AreaReference("Data!A1:C6");
    /* Define the starting Cell Reference for the Pivot Table */
    CellReference b = new CellReference("A1");
    /* Create the Pivot Table */
    XSSFPivotTable pivotTable = sheet1.createPivotTable(a, b);
    pivotTable.addRowLabel(0);
    // 1. Add Formula to cache
    addFormulaToCache(pivotTable);
    // 2. Add PivotField for Formula column
    addPivotFieldForNewColumn(pivotTable);
    // 3. Add all column labels before our function..
    pivotTable.addColumnLabel(DataConsolidateFunction.COUNT, 2, "Total Sales");
    pivotTable.addColumnLabel(DataConsolidateFunction.SUM, 1, "Individual Sales");
    addFormulaColumn(pivotTable);
    /* Write output to file */
    FileOutputStream output_file = new FileOutputStream(new File("C:\\POI_XLS_Pivot_Example.xlsx")); // create XLSX file
    new_workbook.write(output_file); // write excel document to output stream
    output_file.close(); // close the file
}
private static void addFormulaToCache(XSSFPivotTable pivotTable) {
    CTCacheFields ctCacheFields = pivotTable.getPivotCacheDefinition().getCTPivotCacheDefinition().getCacheFields();
    CTCacheField ctCacheField = ctCacheFields.addNewCacheField();
    ctCacheField.setName("Field1"); // Any field name
    ctCacheField.setFormula("'Individual Sales'/'Total Sales' * 100"); // This is where the problem could be
    ctCacheField.setDatabaseField(false);
    ctCacheField.setNumFmtId(0);
    ctCacheFields.setCount(ctCacheFields.sizeOfCacheFieldArray()); //!!! update count of fields directly
}
private static void addPivotFieldForNewColumn(XSSFPivotTable pivotTable) {
    CTPivotField pivotField = pivotTable.getCTPivotTableDefinition().getPivotFields().addNewPivotField();
    pivotField.setDataField(true);
    pivotField.setDragToCol(false);
    pivotField.setDragToPage(false);
    pivotField.setDragToRow(false);
    pivotField.setShowAll(false);
    pivotField.setDefaultSubtotal(false);
}
private static void addFormulaColumn(XSSFPivotTable pivotTable) {
    CTDataFields dataFields;
    if (pivotTable.getCTPivotTableDefinition().getDataFields() != null) {
        dataFields = pivotTable.getCTPivotTableDefinition().getDataFields();
    } else {
        // can be null if we have not added any column labels yet
        dataFields = pivotTable.getCTPivotTableDefinition().addNewDataFields();
    }
    CTDataField dataField = dataFields.addNewDataField();
    dataField.setName("%age Group Difference");
    // set index of cached field with formula - it is the last one!!!
    dataField.setFld(pivotTable.getPivotCacheDefinition().getCTPivotCacheDefinition().getCacheFields().getCount() - 1);
    dataField.setBaseItem(0);
    dataField.setBaseField(0);
}

This question is a follow-on from another question: XSSF (POI) - Adding “formula” column to pivot table
The missing piece is that "Individual Sales" and "Total Sales" are made-up display names in your pivot table, not field names in the source data. Assuming your source is:
Fruit | Price
Apple | 1.5
Orange | 1
Your formula to get Price * 2 would naturally be "'Price' * 2", regardless of the name shown in the pivot table. If you truly wish to reference the made-up name, your formula becomes more complicated. For a pivot table in cell A1, a reference to "Individual Sales" would be:
GETPIVOTDATA("Individual Sales",$A$1,<Filter Name>,<Filter Value>)
But this defeats the purpose of a pivot table.
As an additional note, all three pieces of valerii ryzhuk's answer can be put into a single function if you prefer. They don't need to be split up except to explain the answer.

Related

Excel Pivot Table with "Measure"

I want to use the PivotTable feature of Excel to solve the issue below.
I have two tables as follows:
Table= A_Master
Table= A_Child
Table A_Master is joined to table A_Child on Student Name through a pivot table relationship.
The final table has to look like the one below:
I don't know how to create the measure "FeesRemaining" so that it calculates ActualFees - FeesPaid.
If you want the actual fee minus the sum of everything the student paid in the other table, this is one way to do it. There may be a better way.
= CALCULATE(
    SUM(Master[Actual Fees]),
    FILTER(Master, 'Master'[Student Name] = VALUES('Master'[Student Name]))
) - CALCULATE(
    SUM(Child[Fees Paid]),
    FILTER(Child, 'Child'[StudentName] = VALUES('Master'[Student Name]))
)
The measure above gets a sum of all the fees that are owed by the student in that row of the master table, then subtracts the sum of everything that was paid by that student in the child table.
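To sanity-check what the measure should return, the same arithmetic can be sketched outside DAX. The Java below is only an illustration under stated assumptions: the class, the method names, and the sample figures are hypothetical and do not come from the original workbook. It sums the payments per student and subtracts that total from the student's actual fee, treating a student with no payments as having paid 0.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Excel or DAX): mirrors the measure's arithmetic.
final class FeesRemainingSketch {

    // One row of the child table: a single payment made by one student.
    static final class Payment {
        final String studentName;
        final double feesPaid;

        Payment(String studentName, double feesPaid) {
            this.studentName = studentName;
            this.feesPaid = feesPaid;
        }
    }

    // actualFees: one entry per student (master table); payments: many rows per student (child table).
    static Map<String, Double> feesRemaining(Map<String, Double> actualFees, List<Payment> payments) {
        // Sum the payments per student, like SUM(Child[Fees Paid]) filtered to one student.
        Map<String, Double> paid = new HashMap<>();
        for (Payment p : payments) {
            paid.merge(p.studentName, p.feesPaid, Double::sum);
        }
        // Subtract the paid total from the actual fee; no payments counts as 0 paid.
        Map<String, Double> remaining = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : actualFees.entrySet()) {
            remaining.put(e.getKey(), e.getValue() - paid.getOrDefault(e.getKey(), 0.0));
        }
        return remaining;
    }

    public static void main(String[] args) {
        Map<String, Double> fees = new LinkedHashMap<>();
        fees.put("Ann", 1000.0); // made-up sample data
        fees.put("Bob", 800.0);
        List<Payment> payments = Arrays.asList(
            new Payment("Ann", 300.0),
            new Payment("Ann", 200.0));
        System.out.println(feesRemaining(fees, payments)); // prints {Ann=500.0, Bob=800.0}
    }
}
```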

Combine Specific columns from several tables using excel office script to one table

I am looking for a way to get specific columns by name from several tables. My data comes in several sheets with different numbers of columns, up to 38, so I cannot use getColumnById. I only need 7 columns from this.
First I convert all sheet ranges to tables, then I get all the tables.
What I want is to get specific columns by name and merge them all into one table on a new sheet.
I followed the example from the docs, but I am stuck at getting the column name for each table.
I know my header values, shown in the example below.
function main(workbook: ExcelScript.Workbook) {
    let sheets = workbook.getWorksheets();
    for (let sheet of sheets) {
        sheet.getTables()[0].convertToRange();
        sheet.addTable(sheet.getRange('A1').getUsedRange().getAddress(), true);
    }
    workbook.getWorksheet('Combined')?.delete();
    const newSheet = workbook.addWorksheet('Combined');
    const tables = workbook.getTables();
    const headerValues = [['Column1', 'Column6', 'Column8', 'Column9', 'Column11', 'Column16', 'Column18', 'Column19']];
    const targetRange = newSheet.getRange('A1').getResizedRange(headerValues.length - 1, headerValues[0].length - 1);
    targetRange.setValues(headerValues);
    const combinedTable = newSheet.addTable(targetRange.getAddress(), true);
    for (let table of tables) {
        let dataValues = table.getColumnByName(/* this is where I am stuck */).getRangeBetweenHeaderAndTotal().getTexts();
        let rowCount = table.getRowCount();
        // If the table is not empty, add its rows to the combined table.
        if (rowCount > 0) {
            combinedTable.addRows(-1, dataValues);
        }
    }
}
Thanks for your help.
George
A few things:
In most circumstances for this scenario, I'd recommend iterating through a specific set of table objects. Unfortunately, that's difficult to do here. Every time you unlink a table and recreate it, Excel may give the new table a different name, which makes it difficult to work with. You can get around this in your code by capturing the table name before you unlink the table, unlinking it, recreating it, and setting the new table's name to the one you captured. If you go that route, you can reliably work with the table names.
Because table names in this scenario can be a bit tricky, I'm going to use the sheet names so that I can work with the sheets that contain the underlying tables. This will allow us to use and get data from the tables regardless of what they're named in the sheets.
Please see my code below:
function main(workbook: ExcelScript.Workbook) {
    //JSON object called SheetAndColumnNames. On the left hand side is the sheet name.
    //On the right hand side is an array with the column names for the table in the sheet.
    //NOTE: replace the sheet and column names with your own values
    let columnNames: string[] = ["ColA", "ColB", "ColC"]
    const sheetAndColumnNames = {
        "Sheet1": columnNames,
        "Sheet2": columnNames
    }
    //JSON object called columnNamesAndCellValues. On the left hand side is the column name.
    //On the right hand side is an array that will hold the values for the column in the table.
    //NOTE: replace these column names with your own values
    const columnNamesAndCellValues = {
        "ColA": [],
        "ColB": [],
        "ColC": []
    }
    //Iterate through the sheetAndColumnNames object
    for (let sheetName in sheetAndColumnNames) {
        //Use sheet name from JSON object to get sheet
        let sheet: ExcelScript.Worksheet = workbook.getWorksheet(sheetName)
        //get table from the previously assigned sheet
        let table: ExcelScript.Table = sheet.getTables()[0]
        //get array of column names to be iterated on the sheet
        let tableColumnNames: string[] = sheetAndColumnNames[sheetName]
        //Iterate the array of table column names
        tableColumnNames.forEach(columnName => {
            //get the dataBodyRange of the tableColumn
            let tableColumn: ExcelScript.Range = table.getColumn(columnName).getRangeBetweenHeaderAndTotal()
            //iterate through all of the values in the table column and add them to the columnNamesAndCellValues array for that column name
            tableColumn.getValues().forEach(value => {
                columnNamesAndCellValues[columnName].push(value)
            })
        })
    }
    //Delete previous worksheet named Combined
    workbook.getWorksheet("Combined")?.delete()
    //Add new worksheet named Combined and assign to combinedSheet variable
    let combinedSheet: ExcelScript.Worksheet = workbook.addWorksheet("Combined")
    //Activate the combined sheet
    combinedSheet.activate()
    //get the header range for the table
    let headerRange: ExcelScript.Range = combinedSheet.getRangeByIndexes(0, 0, 1, columnNames.length)
    //set the header range to the column headers
    headerRange.setValues([columnNames])
    //iterate through the arrays returned by the columnNamesAndCellValues object to write to the Combined sheet
    columnNames.forEach((column, index) => {
        combinedSheet.getRangeByIndexes(1, index, columnNamesAndCellValues[column].length, 1).setValues(columnNamesAndCellValues[column])
    })
    //Get the address for the current region of the data written from the tableColumnData array to the sheet
    let combinedTableAddress: string = combinedSheet.getRange("A1").getSurroundingRegion().getAddress()
    //Add the table to the sheet using the address and setting the hasHeaders boolean value to true
    combinedSheet.addTable(combinedTableAddress, true)
}

Excel Power Pivot aggregating data through a many-to-1 then 1-to-many relationship

I have two large tables in Power Pivot and I am trying to reconcile stockpile build grades to crushed stockpile grades. Please see the example. I can create a pivot table that contains the crushed grades, but I am unable to find the right way to bring the stockpile grades through for the reconciliation highlighted in green in the attached example.
Thanks for any help or direction on where to look.
In Power Query, create your lookup tables.
1) unique crushers, ID
2) Dates, ID
Here is a function to create a dates table, if you need one. After you invoke the function to get the column of dates, add another column for the ID.
/*--------------------------------------------------------------------------------------------------------------------
PQ Create a Dates Table, returning a single column of dates.
Inputs:
Start Date | Enter the year as yyyy, month as mm, day as dd
End Date | Enter the year as yyyy, month as mm, day as dd
Increments | One row will be returned per increment.
Author: Jenn Ratten
Edits:
07/16/18 | Modified query copied from the internet.
10/01/19 | Converted to a function.
--------------------------------------------------------------------------------------------------------------------*/
let
    fDatesTable = (StartYear as number, StartMonth as number, StartDay as number, EndYear as number, EndMonth as number, EndDay as number, IncrementDays as number, IncrementHours as number, IncrementMin as number, IncrementSec as number) as table =>
        let
            StartDate = #date(StartYear,StartMonth,StartDay),
            EndDate = #date(EndYear,EndMonth,EndDay),
            Increments = #duration(IncrementDays,IncrementHours,IncrementMin,IncrementSec),
            // List.Dates takes a count, not an end date: with a one-day increment this
            // yields StartDate through the day before EndDate.
            DatesTable = Table.FromColumns({List.Dates(StartDate, Number.From(EndDate) - Number.From(StartDate), Increments)}, type table[Date]),
            ChangeType = Table.TransformColumnTypes(DatesTable,{{"Date", type date}})
        in
            ChangeType
in
    fDatesTable
Load all of the tables to the data model.
Go to Power Pivot, diagram view, and create your relationships.
Lookup Crusher to data tables 1 and 2
Lookup Date to data tables 1 and 2
Go to Data View on data tables 1 and 2 and add 2 new columns for the lookup IDs. You can specify the column header and the formula at the same time by clicking in the first cell and using this syntax, then either pressing Enter or clicking the check mark in the formula bar.
Dates Lookup ID:=RELATED(lookup_dates[ID])
Crusher Lookup ID:=RELATED(lookup_crusher[ID])
Optional, but a good practice....
Right-click the new fields you just created and select "hide from client tools". Also hide the date and crusher fields on both data tables, and the ID field on both lookup tables. When you are creating pivots to summarize data from more than one table, the text fields that you place on your pivot table should be the fields that are shared (aka the lookup tables). This helps to minimize pivots in which the grand totals don't match the sum that you actually see on the table. If you hide the fields, it reminds you of that. There are exceptions of course, but this is a good rule of thumb.
Now create measures to sum the tons and any other math calculations you'd like. With the measures, start simple and let the pivot do the slicing. Put the measures in the values section of the pivot table.
Sum of Source Tons:=sum(Table1[Tons])
Sum of Destination Tons:=sum(Table2[Tons])

How to calculate the daily warehouse stock in DAX?

I have a table in SSAS tabular mode that shows how individual pieces of products moved through different sections of a production line:
Product_ID, section_ID, Category_id (product category), time_in (when a product entered the section), time_out (when the product exited the section)
This is what the input table looks like:
I would like to write a measure in DAX that can show me the stock of each section and product category day-by-day as shown below by counting the number of distinct product ids which were in a particular section on that day.
I'm using SQL Server 2017 Analysis Services in Tabular Mode and Excel Pivot Table for representation.
Create a new table that has all of the dates that you want to use for your columns. Here's one possibility:
Dates = CALENDAR(MIN(ProductInOut[time_in]), MAX(ProductInOut[time_out]))
Now create a measure that counts rows in your input table satisfying a condition.
ProductCount =
VAR DateColumn = MAX(Dates[Date])
RETURN COUNTROWS(FILTER(ProductInOut,
ProductInOut[time_in] <= DateColumn &&
ProductInOut[time_out] >= DateColumn)) + 0
Now you should be able to set up a pivot table with Category_id on the rows and Dates[Date] on the columns and ProductCount as the values.
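The interval test inside the measure can be checked on a small sample outside DAX. The Java below is a minimal sketch, not SSAS code: the class, the Movement type, and the sample rows are hypothetical, dates are simplified to whole days, and the category column is left out for brevity. It counts distinct product IDs, as the question asks, whereas COUNTROWS in the measure counts rows; the two differ only if the same product has more than one matching row. In the pivot table the section comes from the filter context, so here it is passed in explicitly.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: names are illustrative, not part of SSAS or DAX.
final class DailyStockSketch {

    // One row of the input table, reduced to the columns the count needs.
    static final class Movement {
        final int productId;
        final int sectionId;
        final LocalDate timeIn;
        final LocalDate timeOut;

        Movement(int productId, int sectionId, LocalDate timeIn, LocalDate timeOut) {
            this.productId = productId;
            this.sectionId = sectionId;
            this.timeIn = timeIn;
            this.timeOut = timeOut;
        }
    }

    // Number of distinct products that were in the given section on the given day.
    static int stockOn(List<Movement> rows, int sectionId, LocalDate day) {
        Set<Integer> products = new HashSet<>();
        for (Movement m : rows) {
            // Same test as the measure: time_in <= day && time_out >= day.
            boolean present = !m.timeIn.isAfter(day) && !m.timeOut.isBefore(day);
            if (m.sectionId == sectionId && present) {
                products.add(m.productId);
            }
        }
        return products.size();
    }

    public static void main(String[] args) {
        List<Movement> rows = Arrays.asList( // made-up sample data
            new Movement(1, 10, LocalDate.of(2020, 1, 1), LocalDate.of(2020, 1, 3)),
            new Movement(2, 10, LocalDate.of(2020, 1, 2), LocalDate.of(2020, 1, 2)));
        System.out.println(stockOn(rows, 10, LocalDate.of(2020, 1, 2))); // prints 2
    }
}
```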

Eliminate duplicates and Insert Unique records having max no. of column values present through Talend

I have an Excel file that is updated daily, i.e. the data is different every time.
I am pulling the data from the Excel sheet into a table using Talend. I have a primary key, Company_ID, defined in the table.
The problem I am facing is that the Excel sheet has a few duplicate Company_ID values. It will also pick up more duplicates in the future, as the Excel file is updated daily.
For Company_ID 1, I want to choose the first record that has no nulls in the rest of its columns. For Company_ID 3 there is a null value in one column, which is OK since it is the only record for that Company_ID.
How do I choose, in Talend, the unique row that has the maximum number of column values present, e.g. for Company_ID 1?
tUniqRow is usually the easiest way to handle duplicates.
If you are worried that the first row coming to tUniqRow may not be the first row that you want there, you can sort your rows, so they enter tUniqRow in your preferred order:
(used components: tFileInputExcel, tJavaRow, tSortRow, tUniqRow, tFilterColumns)
In your particular case, the tJavaRow could look like this:
// Code generated according to input schema and output schema
output_row.company_id = input_row.company_id;
output_row.name = input_row.name;
output_row.et_cetera = input_row.et_cetera;
// End of pre-generated code
int i = 0;
if (input_row.company_id == null) { i++; }
if (input_row.name == null) { i++; }
if (input_row.et_cetera == null) { i++; }
output_row.priority = i;
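The effect of the whole chain (compute a null-count priority, sort, then keep the first row per key) can be sketched in plain Java. This is not Talend-generated code: the class and method names are hypothetical, and rows are assumed to be string arrays with the company ID in column 0. For each company ID it keeps the row with the fewest null columns, and on a tie it keeps the row seen first.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the sort-then-unique idea; not a Talend component.
final class DedupSketch {

    // Same idea as the priority column above: the number of null columns in a row.
    static int nullCount(String[] row) {
        int n = 0;
        for (String value : row) {
            if (value == null) {
                n++;
            }
        }
        return n;
    }

    // Keeps, per company ID (column 0), the row with the fewest nulls; ties keep the first seen.
    static List<String[]> dedupe(List<String[]> rows) {
        Map<String, String[]> best = new LinkedHashMap<>();
        for (String[] row : rows) {
            String[] current = best.get(row[0]);
            if (current == null || nullCount(row) < nullCount(current)) {
                best.put(row[0], row);
            }
        }
        return new ArrayList<>(best.values());
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>(); // made-up sample data
        rows.add(new String[] {"1", "Acme", null});
        rows.add(new String[] {"1", "Acme", "NY"});
        rows.add(new String[] {"3", "Foo", null});
        for (String[] row : dedupe(rows)) {
            System.out.println(String.join(", ", row[0], row[1], String.valueOf(row[2])));
        }
    }
}
```

In this sample the second row for company 1 wins because it has no nulls, and the single row for company 3 is kept even though one column is null.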

Resources