Node.js xlsx get number of non-empty rows - node.js

I use xlsx in a Node.js application to get the total number of rows. Here is the code:
const range = xlsx.utils.decode_range(sheet['!ref']);
const totalRows = (range.e.r - range.s.r) + 1;
The problem is that it also counts formatted cells with empty text. I only want the number of rows with non-empty text. How can I do this with xlsx, or is there another library that can count rows containing non-empty text?

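I know the thread is old, but if anyone is still looking for an answer, you can use the code below to ignore formatted blank cells:
// blankrows: false skips rows in which every cell is empty
var arr = xlsx.utils.sheet_to_row_object_array(sheet, {blankrows: false, defval: ''});
// arr holds the data rows only; the +1 adds back the header row that supplies the keys
const totalRows = arr.length + 1;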
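If you also want whitespace-only cells to count as empty, you can do the check yourself. The following is only a sketch: it assumes the rows have already been pulled out of the sheet as an array of arrays (for example via sheet_to_json with header: 1), and the helper names isBlankCell and countNonEmptyRows are made up for illustration, not part of xlsx.

```javascript
// Sketch only: helper names are hypothetical, not part of the xlsx library.
// `rows` is an array of arrays, one inner array per worksheet row.
function isBlankCell(value) {
  // null/undefined and whitespace-only strings count as blank;
  // numbers (including 0) and booleans count as content.
  return value === null || value === undefined || String(value).trim() === "";
}

function countNonEmptyRows(rows) {
  // A row counts if at least one of its cells is not blank.
  return rows.filter((row) => row.some((cell) => !isBlankCell(cell))).length;
}

// Made-up sample data: the third row holds only blank strings, the fourth is empty.
const sampleRows = [
  ["name", "qty"],
  ["apple", 0],
  ["", "   "],
  [],
  [null, "pear"],
];

console.log(countNonEmptyRows(sampleRows)); // 3
```

Note that the header row is counted here like any other row, so no +1 is needed.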

Related

Total Rows and Columns Count - Excel Office JS

I've been using getRangeByIndexes, as it seems to be the best way to address ranges by number in Office JS. The problem is that when I need an entire row or column, I've resorted to hard-coding the sheet limits as shown below. From what I've read, the number of rows and columns hasn't changed in a long time, but in VBA I used Rows.Count so that the code stayed dynamic and picked up whatever limit the running version of Excel had.
Is there anything similar in Office JS?
const Excel_Worksheet_Lengths_Obj = {
"total_rows": 1048576,
"total_cols": 16384,
}
var ws = context.workbook.worksheets.getActiveWorksheet()
var Method_Headers_Rng = ws.getRangeByIndexes(0,0,1,Excel_Worksheet_Lengths_Obj.total_cols)
This will do it for you ...
const sheet = context.workbook.worksheets.getActiveWorksheet();
let rangeRows = sheet.getRange("A:A").load(["rowCount"]);
let rangeColumns = sheet.getRange("1:1").load(["columnCount"]);
await context.sync();
console.log("Row Count = " + rangeRows.rowCount);
console.log("Column Count = " + rangeColumns.columnCount);
Everything relating to the number of rows or columns appears to be done at the range level, not the worksheet level.
Passing in the first column to get the number of rows, and the first row to get the number of columns, does the trick. Not ideal, but such is life.

Vba to break up text within a cell? Text to columns not working

How can I break up text within a cell with VBA? I exported emails to an Excel file using VBA, and the information in one of the cells is formatted as shown below:
Name * xxxxxx
Country of residence * xxxxxx Email * xxxxx#gmail.com mailto:xxxxxxx#gmail.com
Mobile phone number * 0xxxxxx
Do you want to become a member of Assoc? Yes Check all that apply *
Members
Education
Ethical Conduct
Events
Regulation
I tried the solution below and it’s not working.
From article: If you need to build a formula to remove these line breaks all you need to know is that this ‘character’ is character 10 in Excel. You can create this character in an Excel cell with the formula =CHAR(10).
So to remove it we can use the SUBSTITUTE formula and replace CHAR(10) with nothing ( shown as “”).
https://www.auditexcel.co.za/blog/removing-line-breaks-from-cells-the-alt-enters/#:~:text=Building%20a%20formula%20to%20remove%20the%20ALT%20ENTER%20line%20breaks,-If%20you%20need&text=You%20can%20create%20this%20character,cell%20with%20no%20line%20breaks.
My understanding is that you dump an email into one Excel cell and want to separate a series of strings (country, email, etc.) that are delimited by line breaks?
I suggest using the Split function to separate the strings into an array, then looping through that array to put each item in the desired cell. Mind you, this only works if the items are in the same order every time; if the order can change, you will need to add a data verification step, e.g. If InStr([Range], "#") > 0 Then it's an email...
Split([string to split], [delimiter])
https://learn.microsoft.com/en-us/office/vba/language/reference/user-interface-help/split-function
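Dim strEmail As String    'Email dump
Dim arrEmail() As String  'Array for looping
Dim ItemsInArray As Long  'Used to hold the array's upper bound
Dim i As Long             'Counter
Dim outCol As Long        'Column the next item is written to
'The cell positions below are examples only; substitute your own row and column
strEmail = ActiveSheet.Cells(1, 1).Value  'Cell your email dumps to, as Cells(row, column)
arrEmail = Split(strEmail, Chr(10))       'Populate array; Chr(10) is the line feed
ItemsInArray = UBound(arrEmail)           'Upper bound of the zero-based array (item count minus 1)
outCol = 2                                'First output column (example)
For i = 0 To ItemsInArray
    ActiveSheet.Cells(1, outCol).Value = arrEmail(i)
    outCol = outCol + 1
Next i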
When i = 0, it's a country code
When i = 1, it's an email
When i = 2, it's a phone #
etc.
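For anyone who wants to see the split-then-verify idea outside VBA, here is a rough JavaScript sketch of the same logic. The function names, the classification rules, and the sample text are all invented for illustration; the "#" check mirrors the masked addresses in the question, and the digit-run check for phone numbers is just a guess at a reasonable rule.

```javascript
// Sketch only: names, rules, and sample data are hypothetical.
// Split a multi-line cell dump into trimmed, non-empty lines.
function splitCellText(text) {
  return text
    .split(/\r?\n/) // line feed is character 10; tolerate CR LF as well
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Verification step: decide what a line is from its content, not its position.
function classifyLine(line) {
  if (/[#@]/.test(line)) return "email"; // the question's data uses '#' in place of '@'
  if (/\d{5,}/.test(line)) return "phone"; // a run of five or more digits
  return "other";
}

// Made-up cell content with an empty line in the middle.
const sampleCell =
  "Name * Jane Doe\nEmail * jane#example.com\n\nMobile phone number * 0123456789";

const parts = splitCellText(sampleCell).map((line) => ({
  kind: classifyLine(line),
  line,
}));

console.log(parts);
```

Classifying by content rather than by index means the result does not depend on the items arriving in the same order every time.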

Efficient way to read data from Excel using OpenXML SDK

I am trying to import data from an Excel file that has 8 sheets, one of which contains 56,000 rows and 44 columns of data. To access data from the first 2 rows using OpenXML, I am using the following code.
using (SpreadsheetDocument spreadSheetDocument = SpreadsheetDocument.Open(importFileCopyPath, false))
{
string relationshipId = spreadSheetDocument.WorkbookPart.Workbook.Descendants<Sheet>().Where(p => p.Name.Value == "Sheet1").Select(q => q.Id.Value).FirstOrDefault();
var worksheetPart = (WorksheetPart)spreadSheetDocument.WorkbookPart.GetPartById(relationshipId);
Worksheet workSheet = worksheetPart.Worksheet;
var sheetData = workSheet.GetFirstChild<SheetData>();
var rows = sheetData.Descendants<Row>();
}
Then I traverse the rows collection obtained on line 5 inside the using block (sheetData.Descendants<Row>()). Unfortunately, for a file with 56,000 rows this takes 2 minutes to extract data from the first 2 rows. This is because line 3 inside the block (worksheetPart.Worksheet) loads the entire sheet with all 56,000 rows into memory.
Is there a way to avoid loading the entire sheet and directly access the contents of the first 2 rows?

Read from a specific row onwards from Excel File

I have an Excel file with around 7000 rows to read. The file contains a table of contents followed by the actual content data in detail.
I would like to skip all the table-of-contents rows and start reading from the actual content data. Otherwise, if I need to read the data for "CPU_INFO", the loop finds the search string twice: once in the table of contents and once in the actual content.
So I would like to know whether there is a way to specify a start row index from which to begin reading the Excel file, thus skipping the whole table-of-contents section.
As taken from the Apache POI documentation on iterating over rows and cells:
In some cases, when iterating, you need full control over how missing or blank rows or cells are treated, and you need to ensure you visit every cell and not just those defined in the file. (The CellIterator will only return the cells defined in the file, which is largely those with values or stylings, but it depends on Excel).
In cases such as these, you should fetch the first and last column information for a row, then call getCell(int, MissingCellPolicy) to fetch the cell. Use a MissingCellPolicy to control how blank or null cells are handled.
If we take the example code from that documentation and tweak it for your requirement to start on row 7000, and assuming you don't want to go past 15k rows, we get:
// Decide which rows to process (row numbers are 0-based)
int rowStart = Math.max(7000, sheet.getFirstRowNum());
int rowEnd = Math.min(15000, sheet.getLastRowNum());
for (int rowNum = rowStart; rowNum <= rowEnd; rowNum++) {
    Row r = sheet.getRow(rowNum);
    if (r == null) {
        // This whole row is empty; skip it
        continue;
    }
    int lastColumn = Math.max(r.getLastCellNum(), MY_MINIMUM_COLUMN_COUNT);
    for (int cn = 0; cn < lastColumn; cn++) {
        Cell c = r.getCell(cn, Row.MissingCellPolicy.RETURN_BLANK_AS_NULL);
        if (c == null) {
            // The spreadsheet is empty in this cell
        } else {
            // Do something useful with the cell's contents
        }
    }
}

Matlab number of rows in excel file

Is there a MATLAB command to get the number of written rows in an Excel file?
First I fill the first row, and then I want to add more rows to the Excel file.
so this is my excel file:
I tried:
e = actxserver ('Excel.Application');
filename = fullfile(pwd,'example2.xlsx');
ewb = e.Workbooks.Open(filename);
esh = ewb.ActiveSheet;
sheetObj = e.Worksheets.get('Item', 'Sheet1');
num_rows = sheetObj.Range('A1').End('xlDown').Row
But num_rows = 1048576, instead of 1.
please help, thank you!
If the file is empty, or contains data in only one row, then .End('xlDown').Row will move to the very bottom of the sheet (1048576 is the number of rows in an Excel 2007+ sheet).
Test whether cell A2 is empty first, and return 0 if it is.
Or use End('xlUp') from the bottom of the sheet:
num_rows = sheetObj.Cells(sheetObj.Rows.Count, 1).End('xlUp').Row
Note: I'm not sure of the Matlab syntax, so this may need some adjusting
You can use MATLAB's xlsread function to read in the spreadsheet. It returns the numeric data, the text data, and the raw cell contents:
[numbers strings misc] = xlsread('myfile.xlsx');
If you do a size check on strings or misc, this should give you the row and column counts:
[rows columns] = size(strings);
Testing this, I got rows = 1, columns = 10 (assuming nothing else was beyond 'A' in the spreadsheet).
