Reading Excel File with GemBox - Columns.Count = 0

I'm using GemBox to read Excel files. I'm copying the fields to a DataTable, so I have to add the columns to the DataTable first.
Therefore I'm using this code:
For i As Integer = 0 To objWorksheet.Columns.Count - 1
objDataTable.Columns.Add(i, GetType(ExcelCell))
Next
But objWorksheet.Columns.Count is 0 even if there is data in 4 columns.
Any ideas?

Cells are allocated internally by row, not by column. ExcelColumn objects are created only if a column has a non-standard width or style, or if it is accessed directly. So while ExcelRowCollection.Count reports the number of rows occupied with data, ExcelColumnCollection.Count does not tell you which column is the last one occupied with data.
If you want to read all data in a sheet, use the ExcelRow.AllocatedCells property.
If you want to find the last column occupied with data, use the CalculateMaxUsedColumns method.
Version 3.5 added the ExcelWorksheet.CreateDataTable(ColumnTypeResolution) method. It automatically generates DataTable columns of the appropriate type from the Excel file's columns and imports the cells that contain data into DataTable rows.
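To illustrate why the column collection can be empty while the rows hold data, here is a toy Python model of row-based cell storage. It is only a sketch of the idea described above, not GemBox's implementation; the class ToySheet and its methods are hypothetical names made up for this example.

```python
class ToySheet:
    """Toy model only: cells live in per-row dicts, as the answer describes."""

    def __init__(self):
        self.rows = {}      # row index -> {column index: value}
        self.columns = {}   # column index -> settings; filled only on direct access

    def set_cell(self, row, col, value):
        # Writing a cell allocates it in its row; no column object is created.
        self.rows.setdefault(row, {})[col] = value

    def column(self, col):
        # Accessing a column directly is what creates its entry.
        return self.columns.setdefault(col, {})

    def max_used_columns(self):
        # Scan the allocated cells of every row for the right-most column
        # (column indexes are zero-based, so the count is index + 1).
        return max((max(cells) + 1 for cells in self.rows.values() if cells),
                   default=0)
```

In this model, filling four columns through set_cell leaves len(sheet.columns) at 0, while max_used_columns() still finds all four by scanning the rows.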

Related

Replace values in columns based on matching IDs between sheets

I usually do my data cleaning in Python, but the issue with Python (pandas) is that when you read a table and write it back to Excel, it doesn't retain any of the Excel formatting.
In this case I was given a large table where many of the cells are color coded and/or commented. I need to retain all the coloring, comments, font styles, etc. I don't know how else to do that except by working in Excel.
The issue:
In one sheet I have a large table (400 rows x 45 columns). It is structured like below
Sheet 1:
|ID|C|D|E|F|
|:--|:--|:--|:--|:--|
|EDMU025|1|2|3|4|
|EDMU026|5|6|7|8|
|EDMU027|9|2|3|4|
|EDMU028|5|6|7|8|
In another sheet I have a series of small tables which look like this
Sheet 2:
|ID|Date|C|D|E|F|
|:--|:--|:--|:--|:--|:--|
|EDMU025|9/14/22|100|210|300|450|
|EDMU025|9/14/22|100|200|340|400|
|||||||
|Value to be replaced||100|200|300|400|
|||||||
|EDMU028|9/14/22|700|810|900|550|
|EDMU028|9/14/22|700|800|940|500|
|||||||
|Value to be replaced||700|800|900|500|
For each ID in Sheet 2 I need to find the ID in Sheet 1 and replace the values in sheet 1 columns C-F with the Values to be replaced.
The output would be:
|ID|C|D|E|F|
|:--|:--|:--|:--|:--|
|EDMU025|100|200|300|400|
|EDMU026|5|6|7|8|
|EDMU027|9|2|3|4|
|EDMU028|700|800|900|500|
What is the most efficient way to do that for the entire table, while keeping the original values that don't need to be replaced intact?
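Try a nested XLOOKUP(), for example:
=XLOOKUP(A2,Sheet2!$B$1:$B$15,Sheet2!$D$1:$G$15,XLOOKUP(A2,$A$2:$A$15,$B$2:$E$15,"",0),0,-1)
The outer XLOOKUP looks for an exact match (match mode 0) in Sheet2, searching from last to first (search mode -1). If the ID is not found there, the inner XLOOKUP returns the existing values for that ID instead. Adjust the ranges to your actual layout.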
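If you end up scripting this instead, the same lookup can be sketched in Python on plain lists of cell values. This assumes the layout shown in the question (each block of ID rows in Sheet 2 is followed by a row labeled "Value to be replaced", with the new values starting in the third column) and that the header rows have already been dropped. The function names are hypothetical.

```python
def collect_replacements(sheet2_rows, label="Value to be replaced"):
    """Map each ID in Sheet 2 to the values on the label row that follows it."""
    replacements = {}
    current_id = None
    for row in sheet2_rows:
        first = row[0] if row else None
        if first == label:
            if current_id is not None:
                # Skip the label and the empty Date cell; keep columns C-F.
                replacements[current_id] = list(row[2:])
        elif first:
            # Any other non-empty first cell is treated as an ID (assumption).
            current_id = first
    return replacements


def apply_replacements(sheet1_rows, replacements):
    """Overwrite the value columns of matching Sheet 1 rows in place."""
    for row in sheet1_rows:
        new_values = replacements.get(row[0])
        if new_values is not None:
            row[1:1 + len(new_values)] = new_values
    return sheet1_rows
```

Rows whose ID has no entry in Sheet 2 are left untouched. Writing the changed values into the existing cells with a library such as openpyxl, rather than rebuilding the sheet from a DataFrame, is one way to try to keep the styling; test on a copy to check that colors and comments survive.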

How do I read an Excel file using the pandas read_excel method, up to a specific column?

I have an Excel file with 100 columns, but only the columns up to a specific one contain data. (It is a monthly file: each month, data is added in the next column, whose header is the month name.)
How do I select the column that was most recently populated?
This is the code I used to get the last column, but it returns the 100th column, which, for obvious reasons, is empty:
from openpyxl import load_workbook

workbook = load_workbook(filename=Input_directory + '\\' + File_name)
worksheet = workbook.active
worksheet.max_column  # returns 100 here, although the last columns hold no data
Is there any way to get the index of the last column that actually contains data?
Thanks
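One possible approach is to scan the rows yourself and keep track of the right-most cell that holds a value. The sketch below works on any iterable of row tuples, so it can be tried without a workbook. The function name last_populated_column and the skip_rows parameter are made up for this example, and it assumes that the header row is filled for all 100 columns and therefore has to be skipped.

```python
from itertools import islice


def last_populated_column(rows, skip_rows=1):
    """Return the 1-based index of the right-most column holding data, or 0."""
    last = 0
    for row in islice(rows, skip_rows, None):
        # Only look to the right of the best column found so far.
        for index in range(len(row), last, -1):
            value = row[index - 1]
            if value is not None and str(value).strip() != "":
                last = index
                break
    return last
```

With openpyxl, the rows can come from worksheet.iter_rows(values_only=True), for example last_populated_column(worksheet.iter_rows(values_only=True)). Pass skip_rows=0 if the sheet has no header row.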

Excel manipulates invisible cells in Filtered Range

I have a large table that contains a lot of data. Most of the time, when I apply a filter to it, I can edit the filtered data with no problems. However, sometimes (perhaps every 200th time) when I select a filtered range and paste some text into the selection, it seems to have worked, but when I unfilter the table, the range that was edited is the range as if it had not been filtered at all.
Example:
My data is in A1:A10.
The filtered range consists of the cells A1 and A10.
When I select the filtered range and paste some text, occasionally the whole A1:A10 range is changed.
Has anyone faced this issue?
The consequences are disastrous.
How can I avoid it in the future?
Thanks!
OK, I figured it out.
When the data is filtered and I select cells to change, Excel defines the row range for the operation as "top row in the selection to bottom row in the selection".
The problem is that sometimes the row indexes are not consecutive (a common situation when the data is not logically ordered in the first place), and Excel then treats every row between the visible selected cells as part of the range too.
It happened to me only occasionally because my data is more or less ordered.
Example: a small table of numbers
**nums**
1
2
3
4
5
3
6
If I filter the nums table to show only 3s,
it shows me this:
**nums**
3
3
When I select these two cells by dragging from one 3 to the other, paste the number 0, and unfilter the table, the result is
**nums**
1
2
0
0
0
0
6
because the cells in between the visible cells were in the selected range too.
To prevent it, the solution is as Lior suggested:
Find & Select --> Go To Special... --> Visible cells only.
After you select the column you want to edit, use Select visible cells only.
The link below shows an example for copying; you can use the same approach for pasting.
http://office.microsoft.com/en-001/excel-help/copy-visible-cells-only-HA010244897.aspx
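The difference between the two kinds of selection can be sketched with a toy Python model. It only mirrors the behavior described above, not how Excel is implemented; the function paste and its parameters are made up for this illustration.

```python
def paste(values, visible, first, last, new_value, visible_only=False):
    """Return a copy of values with new_value written into rows first..last.

    visible[i] says whether row i passes the filter. Without visible_only,
    every row in the span is overwritten, hidden or not.
    """
    result = list(values)
    for i in range(first, last + 1):
        if visible[i] or not visible_only:
            result[i] = new_value
    return result


nums = [1, 2, 3, 4, 5, 3, 6]
visible = [n == 3 for n in nums]  # filter: show only the 3s
```

Calling paste(nums, visible, 2, 5, 0) reproduces the damaged column from the example, while paste(nums, visible, 2, 5, 0, visible_only=True) changes only the two visible rows.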

Can I adjust the widths of Excel columns without setting them each individually?

I'm using cfspreadsheet to generate an Excel spreadsheet using ColdFusion. I insert a header row, and then use spreadsheetAddRows to dump a query into the sheet. The problem is that the columns are often not wide enough. I know I can use SpreadsheetSetColumnWidth to adjust each column individually, but is there any way that I can just apply an auto-width to the entire sheet? I don't know the max width of each column, and I don't want to apply it to each column individually. Excel has an auto-width feature for columns — is there any way to trigger it from the ColdFusion code? (Or even better: Can I add on to the auto-width — set each column to the max width + 2 or something?)
Last I checked, there was no documented CF function for this. However, you can use POI's autoSizeColumn(columnIndex) method to auto-size each column. Just note that POI uses zero-based sheet and column indexes.
<cfscript>
// create a workbook and add a long value
wb = SpreadSheetNew();
spreadSheetSetCellValue(wb, repeatString("x", 200), 1, 1);
// get the first sheet
sheet = wb.getWorkBook().getSheetAt( javacast("int", 0) );
// resize first column ie "A"
sheet.autoSizeColumn( javacast("int", 0) );
spreadSheetWrite( wb, "c:/test.xls", true );
</cfscript>

Skip a few rows when parsing Excel using OleDb

I chose OleDb to read data from Excel. One of my problems with parsing the file is this:
I want to skip a few rows at the top of the file (let's call them a header). They contain merged cells and other things I need to ignore. I found this syntax:
'SELECT * FROM [Sheet1$a4:c]',
where "a4" is the left corner of the header row and "c" is the rightmost column that contains data. However, this doesn't work for me because I don't know the exact number of columns with data that I need to parse. Is there another way to accomplish this?
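You can load all rows into a DataTable and then skip the ones you don't need:
// Needs System.Linq and System.Data (plus a System.Data.DataSetExtensions reference on .NET Framework).
// Change Skip(1) to the number of rows you want to drop.
IEnumerable<DataRow> newRows = dt.AsEnumerable().Skip(1);
DataTable dt2 = newRows.CopyToDataTable();
dt2 now contains all rows but the first. Note that CopyToDataTable throws an InvalidOperationException if no rows are left.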
