Efficient way to implement Excel import in Grails

This code should probably go on Code Review, but I won't get a quick response there (there are only 2 Groovy questions there).
I have the following code for importing data from Excel into my Grails application. I never tested it with more than 1,000 rows, so my app froze when my client tried to upload 13k rows. I checked stacktrace.log (the app is in production) but found no exception. The system admin thinks the JVM ran out of memory, and we have since increased the heap size. Still, I want to ask whether there is a better way to implement this. I am using Apache POI and creating domain objects as I read each row from the Excel file. After that, I pass the list of objects to the controller, which validates them and saves them to the database. Should I tell my client to limit the number of items imported at a time? Is there a better way to write this code?
def importData(file, user){
def rows = []
def keywords = Keyword.list()
int inventoryCount = Inventory.findAllByUser(user).size()
def inventory = new Inventory(name:"Inventory ${inventoryCount +1}", user:user)
Workbook workbook = WorkbookFactory.create(file)
Sheet sheet = workbook.getSheetAt(0)
int rowStart = 1;
int rowEnd = sheet.getLastRowNum() + 1 ;
for (int rowNum = rowStart; rowNum < rowEnd; rowNum++) {
Row r = sheet.getRow(rowNum);
if(r != null && r?.getCell(0, Row.RETURN_BLANK_AS_NULL)!=null ){
def rowData =[:]
int lastColumn = 8;
for (int cn = 0; cn < lastColumn; cn++) {
Cell c = r.getCell(cn, Row.RETURN_BLANK_AS_NULL);
if (c == null) {
return new ExcelFormatException("Empty cell not allowed",rowNum+1, cn+1)
} else {
def field = properties[cn+1]
if(field.type==c.getCellType()){
if(c.getCellType()==text){
rowData<<[(field.name):c.getStringCellValue().toString()]
}else if(c.getCellType()==numeric){
if(field.name.equalsIgnoreCase("price") ){
rowData<<[(field.name):c.getNumericCellValue().toDouble()]
}else{
rowData<<[(field.name):c.getNumericCellValue().toInteger()]
}
}
}else{
return new ExcelFormatException("Invalid value found", rowNum+1, cn+1)
}
}
}
def item = new InventoryItem(rowData)
String keyword = retrieveKeyword(item.description, keywords)
String criticality = keyword?"Critical":"Not known"
int proposedMin = getProposedMin(item.usagePerYear)
int proposedMax = getProposedMax(proposedMin, item.price, item.usagePerYear, item?.currentMin)
String inventoryLevel = getInventoryLevel(item.usagePerYear, item.quantity, proposedMin, item.currentMin)
item.proposedMin = proposedMin
item.proposedMax = proposedMax
item.inventoryLevel = inventoryLevel
item.keyword = keyword
item.criticality = criticality
inventory.addToItems(item)
}
}
return inventory
}
Functions used in above code:
def retrieveKeyword(desc, keywords){
def keyword
for (key in keywords){
if(desc.toLowerCase().contains(key.name.toLowerCase())){
keyword = key.name
break
}
}
return keyword
}
int getProposedMin(int usage){
(int) ((((usage/12)/30) *7) + 1)
}
int getProposedMax(int pmin, double price, int usage, int cmin){
int c = price == 0? 1: ((Math.sqrt((24 * (usage/12)*5)/(0.15*price))) + (pmin - 1))
if(cmin >= c){
return pmin
}
return c
}
String getInventoryLevel(int usage, int qty, int proposedMin, int currentMin){
if(qty != 0){
double c = usage/qty
if(usage==0)
return "Excess"
if(c<0.75){
return "Inactive"
}else if(proposedMin<currentMin){
return "Excess"
}else if(c>=0.75){
return "Active"
}
}else if(usage==0 && qty == 0){
return "Not used"
}else if(usage>3 && qty ==0){
return "Insufficient"
}else if(proposedMin > currentMin){
return "Insufficient"
}
}
Controller action:
def importData(){
if(request.post){
def file = request.getFile("excelFile")
//validate file
def file_types = ["application/vnd.ms-excel","application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"]
if(!file_types.contains(file.getContentType())){
render view:"importData", model:[error:"Invalid File type"]
return
}
def inv = excelService.importData(file.getInputStream(),User.get(principal.id))
if(inv){
if(inv instanceof ExcelFormatException){
def err = (ExcelFormatException) inv
render view:"importData", model:[error:err.message +". Error occurred at: Row: "+err.row+" Col: "+err.col]
return
}else{
render view:"viewData", model:[inventory:inv]
return
}
}
}
}

Hibernate and GORM require some tuning when dealing with bulk imports. Two suggestions:
Follow the techniques found here: http://naleid.com/blog/2009/10/01/batch-import-performance-with-grails-and-mysql (written with MySQL in mind, but the concepts are pertinent to any RDBMS)
Don't use a collection to map the relationship between Inventory and InventoryItem. Remove the items collection from Inventory and instead add an Inventory field to your InventoryItem class. Burt Beckwith covers this in great detail here: http://burtbeckwith.com/blog/?p=1029
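To make the first suggestion more concrete, here is a minimal, language-neutral sketch in Python of the chunked flush-and-clear pattern, as I understand the linked post to recommend it. `FakeSession`, `import_rows`, and the batch size of 100 are hypothetical stand-ins for illustration only; in Grails the equivalent calls would go against the Hibernate session.

```python
class FakeSession:
    """Hypothetical stand-in for an ORM session that caches pending objects."""

    def __init__(self):
        self.pending = []      # objects held in memory until flushed
        self.saved = 0         # count of objects written out
        self.max_pending = 0   # high-water mark of in-memory objects

    def save(self, obj):
        self.pending.append(obj)
        self.max_pending = max(self.max_pending, len(self.pending))

    def flush_and_clear(self):
        self.saved += len(self.pending)  # pretend to write to the database
        self.pending.clear()             # release the cached objects


def import_rows(rows, session, batch_size=100):
    """Save rows one by one, flushing and clearing every batch_size rows."""
    for count, row in enumerate(rows, start=1):
        session.save(row)
        if count % batch_size == 0:
            session.flush_and_clear()
    session.flush_and_clear()  # write the final partial batch
    return session.saved


session = FakeSession()
total = import_rows(range(13_000), session)
print(total, session.max_pending)
```

The point of the sketch is that the number of objects held in memory stays bounded by the batch size rather than growing with the size of the uploaded file.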

Using a plugin would be a better option.
I use this plugin - http://grails.org/plugin/excel-import

Related

Set a JMeter variable with a Groovy collection (JSR223 PostProcessor)

I'm trying to set a JMeter variable to the value of a List that I have in a JSR223 PostProcessor (Groovy). For that, I'm using the method vars.putObject, but when I try to use this variable in a ForEach Controller the loop doesn't execute.
My PostProcessor has the following flow:
Get a list of strings that were generated by a Regular Expression Extractor
Create a List with the valid values for the test (filter some values)
Add the result to a JMeter variable with vars.putObject
import org.apache.jmeter.services.FileServer
int requestAssetsCount = vars.get("CatalogAssetIds_matchNr").toInteger()
int maxAssetsNumbers = vars.get("NumberAssets").toInteger()
List<String> validAssets = new ArrayList<String>()
def assetsBlackListCsv = FileServer.getFileServer().getBaseDir() + "\\assets-blacklist.csv"
File assetsBlackListFile = new File(assetsBlackListCsv)
List<String> assetsBlackList = new ArrayList<String>()
log.info("Loading assets black list. File: ${assetsBlackListCsv}")
if (assetsBlackListFile.exists()) {
assetsBlackListFile.eachLine { line ->
assetsBlackList.add(line)
}
}
else {
log.info("Black list file doesn't exist. File: ${assetsBlackListCsv}")
}
log.info("Verifying valid assets")
for (def i = 1; i < requestAssetsCount; i++) {
def assetId = vars.get("CatalogAssetIds_${i}_g1")
if (!assetsBlackList.contains(assetId)) {
validAssets.add(assetId)
}
else {
log.info("Found a blacklisted asset. Skipping it. Asset ID: ${assetId}")
}
if (validAssets.size() >= maxAssetsNumbers) {
break
}
}
I've tried (like regular extractor):
log.info("Storing valid assets list")
vars.putObject("ValidCatalogAssetIds_matchNr",validAssets.size())
for(def i = 0; i < validAssets.size(); i++) {
vars.putObject("ValidAssetIds_${i+1}_g",1)
vars.putObject("ValidAssetIds_${i+1}_g0","\"id\":\"${validAssets[i]}\"")
vars.putObject("ValidAssetIds_${i+1}_g1",validAssets[i])
}
I've tried (set list value):
log.info("Storing valid assets list")
vars.putObject("ValidAssetIds",validAssets)
Concatenate the strings with "+ (i+1) +" instead:
vars.putObject("ValidCatalogAssetIds_"+ (i+1) + "_g",1)
vars.putObject("ValidAssetIds_"+ (i+1) + "_g0","\"id\":\"${validAssets[i]}\"")
vars.putObject("ValidAssetIds_"+ (i+1) + "_g1",validAssets[i])
Don't use the ${} syntax in JSR223 scripts: JMeter resolves it before the script is executed, so the values will not be what you expect.
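For reference, here is a small Python simulation of the variable naming involved. It assumes the ForEach Controller's default behaviour of reading `prefix_1`, `prefix_2`, and so on until a name is missing; the `variables` dict is a hypothetical stand-in for `vars`, and the asset IDs are made up.

```python
def store_list(variables, prefix, values):
    """Store values as prefix_1, prefix_2, ... using plain concatenation."""
    for i, value in enumerate(values):
        variables[prefix + "_" + str(i + 1)] = str(value)  # values kept as strings
    variables[prefix + "_matchNr"] = str(len(values))


def for_each(variables, prefix):
    """Yield prefix_1, prefix_2, ... and stop at the first missing name."""
    index = 1
    while prefix + "_" + str(index) in variables:
        yield variables[prefix + "_" + str(index)]
        index += 1


variables = {}
store_list(variables, "ValidAssetIds", ["a1", "b2", "c3"])
visited = list(for_each(variables, "ValidAssetIds"))
print(visited)
```

Under that assumption, names that only exist in the form `ValidAssetIds_1_g1` would not be found by this lookup, which may be one reason a loop never starts.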

Create advanced filter

I am trying to create an advanced filter in Excel from C# to copy unique data from one sheet to another. I can do it in Excel itself, and also through Interop like this:
Excel.Range rang = sheet2.get_Range("A2");
Excel.Range oRng = sheet.get_Range("I2", "I" + (lst.Count + 1));
oRng.AdvancedFilter(Excel.XlFilterAction.xlFilterCopy, CriteriaRange: Type.Missing,
CopyToRange: rang, Unique: true);
It works fine, but I'm building my whole application with EPPlus, and it would be better if I could do the same without Interop.
So, is it possible?
Since Advanced Filter is an Excel function, you need the full Excel DOM to access it. EPPlus doesn't have that; it just generates the XML to feed to Excel, which will then apply its "interpretation", so to speak.
But since you have the power of .NET at your disposal, you can use LINQ fairly easily to do the same thing by querying the cell store and using .Distinct() to get the unique list. The only wrinkle is that you have to create your own IEqualityComparer. This will do it:
[TestMethod]
public void Distinct_Filter_Test()
{
//Throw in some data
var datatable = new DataTable("tblData");
datatable.Columns.AddRange(new[]
{
new DataColumn("Col1", typeof (int)), new DataColumn("Col2", typeof (string))
});
var rnd = new Random();
for (var i = 0; i < 10; i++)
{
var row = datatable.NewRow();
row[0] = rnd.Next(1, 3) ;row[1] = i%2 == 0 ? "even": "odd";
datatable.Rows.Add(row);
}
//Create a test file
var fi = new FileInfo(@"c:\temp\Distinct_Filter.xlsx");
if (fi.Exists)
fi.Delete();
using (var pck = new ExcelPackage(fi))
{
//Load the random data
var workbook = pck.Workbook;
var worksheet = workbook.Worksheets.Add("data");
worksheet.Cells.LoadFromDataTable(datatable, true);
//Cells only contains references to cells with actual data
var rows = worksheet.Cells
.GroupBy(cell => cell.Start.Row)
.Skip(1) //Exclude header
.Select(cg => cg.Select(c => c.Value).ToArray())
.Distinct(new ArrayComparer())
.ToArray();
//Copy the data to the new sheet
var worksheet2 = workbook.Worksheets.Add("Distinct");
worksheet2.Cells.LoadFromArrays(rows);
pck.Save();
}
}
public class ArrayComparer: IEqualityComparer<object[]>
{
public bool Equals(object[] x, object[] y)
{
return !x.Where((o, i) => !o.Equals(y[i])).Any();
}
public int GetHashCode(object[] obj)
{
//Based on Jon Skeet Stack Overflow Post
unchecked
{
return obj.Aggregate((int) 2166136261, (acc, next) => acc*16777619 ^ next.GetHashCode());
}
}
}
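Stripped of the EPPlus details, the idea behind the comparer can be sketched in Python: treat each row as a tuple of cell values and keep only the first occurrence of each. The sample rows are invented for illustration, and this is a sketch of the technique rather than a translation of the code above.

```python
def distinct_rows(rows):
    """Return rows with duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # tuples compare and hash element by element
        if key not in seen:
            seen.add(key)
            unique.append(list(row))
    return unique


rows = [[1, "even"], [2, "odd"], [1, "even"], [2, "even"], [2, "odd"]]
result = distinct_rows(rows)
print(result)
```

Element-wise equality plus a hash built from every element is what the custom comparer has to supply in C#, because arrays there compare by reference by default.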

Return a set of objects from a class

I have a method that adds a new item to an EF table, then queries back the table to return a subset of the table. It needs to return to the caller a set of "rows", each of which is a set of columns. I'm not sure how to do this. I have some code, but I think it's wrong. I don't want to return ONE row, I want to return zero or more rows. I'm not sure what DataType to use... [qryCurrentTSApproval is an EF object, referring to a small view in SS. tblTimesheetEventlog is also an EF object, referring to the underlying table]
Ideas?
private qryCurrentTSApproval LogApprovalEvents(int TSID, int EventType)
{
using (CPASEntities ctx = new CPASEntities())
{
tblTimesheetEventLog el = new tblTimesheetEventLog();
el.TSID = TSID;
el.TSEventType = EventType;
el.TSEUserName = (string)Session["strShortUserName"];
el.TSEventDateTime = DateTime.Now;
ctx.tblTimesheetEventLogs.AddObject(el);
ctx.AcceptAllChanges();
var e = (from x in ctx.qryCurrentTSApprovals
where x.TSID == TSID
select x);
return (qryCurrentTSApproval)e;
}
}
Change your method return type to a collection of qryCurrentTSApproval
private List<qryCurrentTSApproval> LogApprovalEvents(int TSID, int EventType)
{
using (CPASEntities ctx = new CPASEntities())
{
// some other existing code here
var itemList = (from x in ctx.qryCurrentTSApprovals
where x.TSID == TSID
select x).ToList();
return itemList;
}
}

Watin: Iterating through text boxes in a telerik gridview

I am currently developing a testing framework for a web data entry application that uses the Telerik ASP.NET framework, and I have run into a blocker. If I step through my code in debug mode, the test finds the text box that I am looking for, enters some test data, and then saves that data to the database. The problem is that when I let the test run on its own, it fails, saying that it couldn't find the column that was declared. Here is my code:
/*Method to enter test data into cell*/
private TableCell EditFieldCell(string columnHeader)
{
var columnIndex = ColumnIndex(columnHeader);
if (columnIndex == -1)
throw new InvalidOperationException(String.Format("Column {0} not found.", columnHeader));
return NewRecordRow.TableCells[columnIndex];
}
/*Method to return column index of column searching for*/
public int ColumnIndex(string columnHeader)
{
var rgTable = GridTable;
var rgCount = 0;
var rgIndex = -1;
foreach (var rgRow in rgTable.TableRows)
{
foreach (var rgElement in rgRow.Elements)
{
if (rgElement.Text != null)
{
if (rgElement.Text.Equals(columnHeader))
{
rgIndex = rgCount;
break;
}
}
rgCount++;
}
return rgIndex;
}
My thinking is that something with my nested for loops is presenting the problem because the rgIndex value that is returned when I let the program run is -1 which tells me that the code in the for loops isn't being run.
TIA,
Bill Youngman
Here is code that gets the table column index. You need to pass in the Table (verify that the table exists while debugging):
public int GetColumnIndex(Table table, string headerName)
{
ElementCollection headerElements = table.TableRows[0].Elements; //First row contains the header
int counter = 0;
foreach (var header in headerElements)
{
if (header.ClassName != null && header.ClassName.Contains(headerName)) //In this case i use class name of the header you can use the text
{
return counter;
}
counter++;
}
// If not found
return -1;
}

ReportViewer Nested SubReport

I have 4 reports Report A, Report B, Report C and Report D with datasources dsA, dsB, dsC and dsD respectively.
Report A is the main report; it contains subreport B, which in turn contains subreport C, and so on.
Report A fills the data source dsB in the SubreportProcessing event with a parameter from Report A.
I need an event that is fired for every row in Report B, so that I can pass a parameter from Report B to fill Report C, and a parameter from Report C to fill Report D.
Code in the SubreportProcessing event handler:
SearchValue = new SqlParameter[2];
SqlConnection thisConnection = new SqlConnection(thisConnectionString);
DataSet thisDataSet = new DataSet();
SearchValue[0] = new SqlParameter("@TPlanId", e.Parameters[1].Values[0]);
SearchValue[1] = new SqlParameter("@ProblemId", e.Parameters[0].Values[0]);
thisDataSet = SqlHelper.ExecuteDataset(thisConnection, "Proc_TP_Goal", SearchValue);
/* Associate thisDataSet (now loaded with the stored procedure result) with the ReportViewer datasource */
ReportDataSource datasource = new ReportDataSource("Goal_Proc_TP_Goal", thisDataSet.Tables[0]);
e.DataSources.Add(datasource);
I was not able to figure out the event handlers for the third and fourth levels. Any suggestions or examples would be greatly appreciated.
Thanks
This is what I do: the sub-subreports have a parameter that carries the row of the subreport. I hope that makes sense; if not, let me know and I will post the source code.
if ("RendicionDetalleCodigosReporte".Equals(e.ReportPath))
{
if (data != null)
{
RendicionDetalleData detalle = new RendicionDetalleData();
detalle.row = 0;
int row = Convert.ToInt32(e.Parameters[0].Values[0]);
foreach (var det in data.Detalles)
{
if (det.row.Equals(row))
{
detalle = det;
break;
}
}
if (detalle.row == 0)
{
e.DataSources.Add(new ReportDataSource("RendicionDetalleCodigo", new List<RendicionDetalleCodigosData>()));
}
else
{
e.DataSources.Add(new ReportDataSource("RendicionDetalleCodigo", detalle.Codigos));
}
}
else
{
e.DataSources.Add(new ReportDataSource("RendicionDetalleCodigo", new List<RendicionDetalleCodigosData>()));
}
}
else
{
if (data != null)
{
e.DataSources.Add(new ReportDataSource("RendicionDetalle", data.Detalles));
}
else
{
e.DataSources.Add(new ReportDataSource("RendicionDetalle", new List<RendicionDetalleData>()));
}
}
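The handler above boils down to a lookup keyed on the report path and a row parameter. Here is a simplified Python sketch of that dispatch; `data_sources_for`, the shape of the `details` list, and the sample codes are hypothetical stand-ins for the ReportViewer objects, not part of any real API.

```python
def data_sources_for(report_path, params, details):
    """Pick the data for one subreport call.

    details is a list of dicts like {"row": 1, "codes": [...]} (hypothetical shape).
    """
    if report_path == "RendicionDetalleCodigosReporte":
        # Sub-subreport: the first parameter carries the parent row number.
        row = int(params[0])
        for detail in details or []:
            if detail["row"] == row:
                return {"RendicionDetalleCodigo": detail["codes"]}
        return {"RendicionDetalleCodigo": []}  # no match: empty data set
    # First-level subreport: hand over every detail row.
    return {"RendicionDetalle": details or []}


details = [{"row": 1, "codes": ["A", "B"]}, {"row": 2, "codes": ["C"]}]
print(data_sources_for("RendicionDetalleCodigosReporte", ["2"], details))
print(data_sources_for("RendicionDetalleCodigosReporte", ["9"], details))
```

Because the same handler is called for every subreport instance at every nesting level, branching on the report path and passing the parent row as a parameter is enough to serve deeper levels without a separate event per level.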
