Apache POI - is there a way to create pivot table on SXSSFSheet? - apache-poi

I tried writing a large dataset to Excel and was able to do that with SXSSFWorkbook instead of XSSFWorkbook.
Now I am trying to create a pivot table using the already-written large dataset as the base data. Unfortunately, SXSSFSheet does not have createPivotTable; only XSSFSheet has that method.
Is there any way I can use SXSSFSheet to create pivot tables?

In my case, I use SXSSFWorkbook to create a large .xlsx file with pivot tables on a few sheets.
I create those sheets with the code below. Make sure the source sheet is also an XSSFSheet, or it will cause an error. (Sheets not related to the pivot can be either Sheet or SXSSFSheet.)
I hope this helps resolve your question.
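// workbook is the SXSSFWorkbook; source must be an XSSFSheet holding the data
XSSFSheet sheet = workbook.getXSSFWorkbook().createSheet("Pivot sheet");
// On POI 4.0 and later, AreaReference also takes a SpreadsheetVersion as a second argument
AreaReference ar = new AreaReference("A1:AI" + (source.getLastRowNum() + 1));
CellReference cr = new CellReference("A1");
XSSFPivotTable pivotTable = sheet.createPivotTable(ar, cr, source);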

Related

Reference a sheet in my workbook as my power query data source

After a long search, I found the M code below, which references data in a sheet and uses it as the source for my query. The data is in Sheet 1 of the same workbook that contains my queries, and it is a simple XLS report exported from SAP. I don't use a table because people who use the sheet may paste the SAP export into it without formatting it as a table.
Please let me know whether it is reliable and won't cause errors.
Also, how can I change the first line below so that it uses the first sheet in the current workbook as the source instead of a workbook from a folder path? I don't need the folder path.
let
    Source = Excel.Workbook(File.Contents("C:\...\Downloads\Test.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="SAP",Kind="Sheet"]}[Data],
    fTrimTable = (tbl as table, header as text) =>
        let
            t = Table.Buffer(tbl),
            columns = List.Buffer(Table.ColumnNames(t)),
            rowsToCheck = 100,
            Column = List.Select(columns, each List.PositionOf(List.FirstN(Table.Column(t, _), rowsToCheck), header) > 0){0},
            Row = List.PositionOf(Table.Column(t, Column), header),
            ScrollRows = Table.RemoveFirstN(t, Row),
            ScrollColumns = Table.SelectColumns(ScrollRows, List.RemoveFirstN(columns, List.PositionOf(columns, Column))),
            #"Promoted Headers" = Table.PromoteHeaders(ScrollColumns, [PromoteAllScalars=true])
        in
            #"Promoted Headers",
    Trimmed = fTrimTable(Sheet1_Sheet, "Header100")
in
    Trimmed
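For readers less familiar with M, the logic of fTrimTable can be sketched in plain Python. This is only an illustration under assumptions, not a replacement for the query: the sheet is modelled as a list of row lists, and the names trim_table and sample are hypothetical. One difference worth noting: the M code tests List.PositionOf(...) > 0, and since List.PositionOf returns -1 when nothing is found and 0 for the first row, a header sitting in the very first row is skipped. The sketch accepts row 0 as well.

```python
def trim_table(rows, header, rows_to_check=100):
    """Find `header` in the first rows, then drop everything above and to the left of it.

    Returns the remaining data rows as dicts keyed by the promoted header row.
    """
    width = max((len(r) for r in rows), default=0)
    # Scan column by column, looking only at the first `rows_to_check` rows.
    for col in range(width):
        for row_idx, row in enumerate(rows[:rows_to_check]):
            if col < len(row) and row[col] == header:
                names = list(row[col:])        # header row, from the matched column on
                body = rows[row_idx + 1:]      # everything below the header row
                return [dict(zip(names, r[col:])) for r in body]
    raise ValueError(f"header {header!r} not found in the first {rows_to_check} rows")


# Hypothetical sheet: junk rows on top and an empty column on the left.
sample = [
    ["SAP export", None, None],
    [None, None, None],
    [None, "Header100", "Amount"],
    [None, "x", 1],
    [None, "y", 2],
]
trimmed = trim_table(sample, "Header100")
```

If the header text is not found, the sketch raises an error instead of returning a partial table, which is roughly what the {0} lookup in the M code does when the selected list is empty.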

Linking a slicer to multiple pivot tables using office scripts

I'm trying to add a slicer that connects to multiple pivot tables in Excel using Office Scripts. It seems that Office Scripts can only connect one slicer to one pivot table. The action recorder does not seem to record the connection step in the pivot table slicer settings.
let newSlicer = workbook.addSlicer(newPivotTable, newPivotTable.getHierarchy("Overdue").getFields()[0], selectedSheet);
The script above does not seem to accept more than one pivot table. Does anyone have a solution? Much appreciated.
I don't think linking multiple PivotTables to a slicer is currently supported, but there may be a workaround. You can run the Office Scripts code below. You will have to update the variables with the names of your own PivotTable, the field for the slicer, and so on:
function main(workbook: ExcelScript.Workbook) {
  let sh: ExcelScript.Worksheet = workbook.getActiveWorksheet();
  let slicer1: ExcelScript.Slicer = getOrAddSlicer("PivotTable1", "Col1", workbook);
  let slicer2: ExcelScript.Slicer = getOrAddSlicer("PivotTable1", "Col2", workbook);
}

function getOrAddSlicer(ptName: string, ptRowHierarchyName: string, workbook: ExcelScript.Workbook): ExcelScript.Slicer {
  let sh: ExcelScript.Worksheet = workbook.getActiveWorksheet();
  let pt: ExcelScript.PivotTable = sh.getPivotTable(ptName);
  let pf: ExcelScript.PivotField = pt.getRowHierarchy(ptRowHierarchyName).getPivotField(ptRowHierarchyName);
  let slicer: ExcelScript.Slicer = workbook.getSlicer(ptRowHierarchyName);
  if (slicer === undefined) {
    slicer = workbook.addSlicer(pt, pf, sh);
  }
  return slicer;
}
The getOrAddSlicer function adds a slicer to the active worksheet, or returns the existing slicer linked to that field if it was added previously. After you've added all the slicers, copy and paste the PivotTable that the slicers are linked to. After the copy, both PivotTables should be linked to all of the slicers.

How to write to an existing excel file with openpyxl, while preserving pivot tables

I have an Excel file with multiple sheets. One sheet contains two pivot tables, a normal table based on data from the pivots, and some graphs based on the pivots as well.
I am updating the sheets without pivots using the code below. The content for these sheets is generated as dataframes, which are written straight to the sheets.
Method 1
import openpyxl as xl
import pandas as pd

book = xl.load_workbook(fn)
writer = pd.ExcelWriter(fn, engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
DF.to_excel(writer, 'ABC', header=None, startrow=book.active.max_row)
writer.save()
But when the file is written, the pivot table is converted to plain text. The solution I found to preserve the pivot table is to read and write the workbook using the method below.
Method 2
from openpyxl import load_workbook

workbook = load_workbook(filename=updating_file)
sheet = workbook["Pivot"]
# any pivot will do, as they share the same cache
pivot = sheet._pivots[0]
pivot.cache.refreshOnLoad = True
workbook.save(filename=updating_file)
This adds an extra row labelled 'Value' to the pivot table, which ruins the values of the tables based on the pivot.
According to a source I found, pd.ExcelWriter does not preserve pivot tables, yet the only example I found for updating an existing Excel file from a dataframe requires the pandas ExcelWriter.
Some help would be highly appreciated, as I am unable to find a method that fulfils both requirements.
The only option I can see so far is to write the data parts with pandas, then drop the existing Pivot sheet and copy the sheet from the original file. But then I still have to find a way to clear the table based on the pivot and rewrite it with openpyxl using Method 2. (We can't copy sheets between workbooks.)
Stick with your Method 1: if you convert the df to a pivot table in pandas, and then export to excel, it will work.
An example:
import pandas as pd
import numpy as np

# create dataframe
df = pd.DataFrame({'A': ['John', 'Boby', 'Mina', 'Peter', 'Nicky'],
                   'B': ['Masters', 'Graduate', 'Graduate', 'Masters', 'Graduate'],
                   'C': [27, 23, 21, 23, 24]})
table = pd.pivot_table(df, values='A', index=['B', 'C'],
                       columns=['B'], aggfunc=np.sum)
table.to_excel("filename.xlsx")
Outputs
I found a way to iterate over the dataframe row by row. If I were appending rows to the end of an existing table, this would have been much easier. Since I have to insert rows in the middle, I used the approach below to insert blank rows and then write the cell values.
current_sheet.insert_rows(idx=11, amount=len(backend_report_df))
sheet_row_idx = 11
is_valid_row = False
for row in dataframe_to_rows(backend_report_df, index=True, header=True):
    is_valid_row = False
    for col_idx in range(0, len(row)):
        if col_idx == 0 and row[col_idx] is None:
            logger.info("Header row/blank row")
            break
        else:
            is_valid_row = True
            if col_idx != 0:
                current_sheet.cell(row=sheet_row_idx, column=col_idx).value = row[col_idx]
    if is_valid_row:
        sheet_row_idx = sheet_row_idx + 1
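The filtering in the loop above can be looked at in isolation, without a workbook. The sketch below is a simplified illustration, assuming that the rows arrive in the shape the loop expects: a header row and a blank row that both start with None, followed by data rows that start with the index value. The function names data_rows and place_rows and the sample values are hypothetical.

```python
def data_rows(rows):
    """Yield the values of each real data row, skipping header and blank rows."""
    for row in rows:
        if not row or row[0] is None:
            continue              # header row or blank row: first element is None
        yield list(row[1:])       # drop the index value, keep the cell values


def place_rows(rows, start_row):
    """Return (sheet_row, column, value) triples, with columns starting at 1."""
    cells = []
    sheet_row = start_row
    for values in data_rows(rows):
        for col, value in enumerate(values, start=1):
            cells.append((sheet_row, col, value))
        sheet_row += 1            # advance only after a row was actually written
    return cells


# Hypothetical input: header row, blank row, then two data rows.
rows = [[None, "a", "b"], [None], [0, 1, 2], [1, 3, 4]]
cells = place_rows(rows, start_row=11)
```

Each triple could then be written with current_sheet.cell(row=..., column=...).value, as in the loop above. Splitting the filter from the writing step makes it easier to check which rows are skipped before touching the sheet.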

Creating Conditional Column in excel or power bi

I have a problem that can be solved in either Excel or Power BI. From the image provided, I want to create a new column so that Program/Course contains only the Program values, and the values containing Kurs go into their own column. Is there a way to do that using either Excel or Power BI?
I would use the Query Editor to rename the current Program/Course column, then create two Conditional Columns to split the data the way you want, for example:
Program/Course = if [Program/Course - Original] <> "Kurs" then [Program/Course - Original] else null
Course = if [Program/Course - Original] = "Kurs" then [Program/Course - Original] else null
Finally I would remove the renamed column.
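The same split can be sketched outside Power Query, which may help when checking the expected result. This is a minimal illustration under assumptions: the function name split_program_course and the sample values are hypothetical, and it uses a "contains" test because the question speaks of values containing Kurs, whereas the conditional columns above compare for equality with "Kurs".

```python
def split_program_course(values, course_marker="Kurs"):
    """Split one column into two: rows containing the marker go to 'Course'."""
    rows = []
    for value in values:
        is_course = value is not None and course_marker in value
        rows.append({
            "Program/Course": None if is_course else value,
            "Course": value if is_course else None,
        })
    return rows


# Hypothetical sample values, for illustration only.
result = split_program_course(["Program A", "Kurs 101", None])
```

If exact equality is what you need, replace the substring test with value == course_marker.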

Empty box created when inserting a table using XMLCursor to XWPFDocument

When I insert a table into an XWPFDocument using an XmlCursor, the table is inserted at the correct position, but an extra box is added under the first column. The box is joined to the table, so it looks like an additional table cell. When I insert the table without using an XmlCursor, the table is correct, and the box appears at the position of the XmlCursor. Is there any way to delete the box?
XWPFDocument part1Document = new XWPFDocument(part1Package);
XmlCursor xmlCursor = part1Document.getDocument().getBody().getPArray(26).newCursor();
//create first row
XWPFTable tableOne = part1Document.createTable();
XWPFTableRow tableOneRowOne = tableOne.getRow(0);
tableOneRowOne.getCell(0).setText("Hello");
tableOneRowOne.addNewTableCell().setText("World");
XWPFTableRow tableOneRowTwo = tableOne.createRow();
tableOneRowTwo.getCell(0).setText("This is");
tableOneRowTwo.getCell(1).setText("a table");
tableOne.getRow(1).getCell(0).setText("only text");
XmlCursor c2 = tableOne.getCTTbl().newCursor();
c2.moveXml(xmlCursor);
c2.dispose();
XWPFTable tables = part1Document.insertNewTbl(xmlCursor);
xmlCursor.dispose();
The empty box is appearing at the position of the 26th paragraph. Any help would be great. Thanks.
You are actually creating two different tables. The first one is created with createTable(), and then you add the rows and cells you need. After that, at the cursor position, you create a second table with insertNewTbl(xmlCursor). That method creates a new table with one row and one cell, hence the empty box.
I can't say for certain why the first table ends up in the correct position; probably it is merged with the second table, which goes to your cursor.
I suggest you create the second table from the start:
XWPFTable tableOne = part1Document.insertNewTbl(xmlCursor);
XWPFTableRow tableOneRowOne = tableOne.getRow(0);
tableOneRowOne.getCell(0).setText("Hello");
tableOneRowOne.addNewTableCell().setText("World");
