SAP BODS cannot fetch texts longer than 255 characters from Excel

We are facing an issue when uploading long texts (longer than 255 characters) from an Excel file using Data Services in SAP BODS.
The Data Services ODBC driver truncates all texts in this column to 255 characters, even though the field length is defined as varchar(2500) in the Excel file format in Data Services and the column contains longer texts in later rows.
- I tried setting the parameter TypeGuessRows = 0, but it did not work.
- I also tried keeping a long record on the first row of the source Excel file, but that did not work either.
Does anyone know how to load the full-length data using SAP BODS?

This is a known issue, described in SAP note 1675110. It is the default (faulty) behavior of SAP DS, which determines field width from the first 100 rows of the Excel workbook. Subsequent rows, even longer ones, will not be treated as longer than 255 characters.
SOLUTION: move the longer rows into the top 100, or add a fake first row whose values are as long as the longest value in each column of your workbook.
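If editing the workbook by hand is impractical, the fake first row can be generated with a script. A minimal sketch follows, assuming openpyxl is available and the source is an .xlsx file (the file names are placeholders):

# Minimal sketch of the fake-first-row workaround (assumes openpyxl and an
# .xlsx source; file names are placeholders).
from openpyxl import load_workbook

wb = load_workbook("source.xlsx")
ws = wb.active

# Insert a dummy first row whose cells are as wide as the widest value in
# each column, so the driver's 100-row scan sees the true column width.
ws.insert_rows(1)
for col in ws.iter_cols(min_row=2):
    max_len = max((len(str(c.value)) for c in col if c.value is not None), default=0)
    ws.cell(row=1, column=col[0].column, value="x" * max_len)

wb.save("source_padded.xlsx")

Remember to filter the dummy row out again after the load.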


Generate a multicolumn table using docxtpl

I have a series of data (in a 2-dimensional list 'CombinedTable') that I need to use to populate a table in an MS Word template. The table has 7 columns, so I attempted the following using the docxtpl module:
context = {
    'tpl_modules1': CombinedTable[0],
    'tpl_modules2': CombinedTable[2],
    'tpl_modules3': CombinedTable[4],
    'tpl_modules4': CombinedTable[6],
    'tpl_modules5': CombinedTable[8],
    'tpl_modules6': CombinedTable[10],
    'tpl_modules7': CombinedTable[12],
}
tpl.render(context)
tpl.save(FilePath + FileName)
Not the most elegant solution, I know, but I am just trying to get this working. Unfortunately, using this code with my template results in the tpl_modules7 data being written into all columns, rather than just the 7th.
Does anyone have advice on how to resolve this? I attempted to create a for loop through the columns as well as the rows, but was unsuccessful in writing anything to the doc (it was saved as a blank, empty doc).
The CombinedTable variable is a list of 12 lists (one for each column in the template, although only 7 contain data). Each of these 12 lists contains the cell data for its column, and its length is equal to the number of rows to be written to the table in that column. This means that the number of rows written to varies for each column.
EDIT: Looking more closely at the docs, it states that I cannot use %tr multiple times in the same row. I assume I will then have to loop through %tc and %tr (which I tried and couldn't get working). Any advice on how to implement this? Especially on the side of the Word document. Thanks!
I was able to resolve this satisfactorily for my requirements; however, my solution may not suit everyone. I simply set up 7 different tables in the document, in place of the one 7-column table, and adjusted margins/borders to suit the dimensions I required for the tables. Each of the 7 tables had identical docxtpl syntax to the image in my question, with the small buffer columns between them being replaced by columns in the Word document.
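For reference, here is a minimal sketch of this approach using docxtpl's documented row-loop tags; the file names and stand-in data are assumptions:

# Minimal sketch of the per-table approach (assumes docxtpl; file names and
# stand-in data are placeholders). In the Word template, each one-column
# table contains three rows:
#   row 1: {%tr for m in tpl_modules1 %}
#   row 2: {{ m }}
#   row 3: {%tr endfor %}
# so each table grows to the length of its own list independently.
from docxtpl import DocxTemplate

CombinedTable = [[f"cell {r}" for r in range(3)] for _ in range(13)]  # stand-in data

tpl = DocxTemplate("template.docx")
context = {f"tpl_modules{i + 1}": CombinedTable[2 * i] for i in range(7)}
tpl.render(context)
tpl.save("output.docx")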

How to convert CSV text to numbers and manipulate almost non-machine-readable data in Power BI?

I have a sales datasheet in CSV that I input into Power BI. The monetary values on the sheet appear as decimal numbers (e.g. 123.0000), but Power BI reads them as text. When I try to convert this to a fixed decimal number ($), it kicks back an error. How do I convert this safely to ($)? There are also multiple columns with these values in them. How would I convert all of them in the easiest way, given that there are other columns with just normal numbers between these monetary columns? (1 x SOH column and then 1 x Net column; this repeats.)
On top of this, the datasheet is laid out in such a way that it is difficult to manipulate the data into a form that is easy for Power BI to read. The header rows begin with the SKU code and description, but then move over to each individual (retail) store by location, broken up into SOH and Net, per store per column. I've been racking my brain on this for ages and can't seem to find a simple way around it. Any tips would be great.
For the conversion to ($), I went into the CSV sheet, altered the format of the numbers and saved it as a .xml, but the issue with this is that I would have to repeat this tedious step every time I need to pull data, which is often.
For the layout of the original spreadsheet, I tried unpivoting the data in Power BI, which does work. However, it is still sectioned off by Net and SOH, which means I have to add a slicer just to see Net or SOH on its own, instead of having them as separate entries.
I expect the output to firstly give me fixed decimal numbers, but all I get is an error when trying to convert the numbers to $.
With the unpivoting, I can manipulate the data by store, which is great and helps, but I have to create a separate sheet which has the store IDs on it so that I can "filter" them when I want to switch between them (again, a slicer is necessary). I expect to be able to look at each store individually as well as overall, and then also look at Net and SOH individually, by store and as a whole. From there I can input my cost sheet and calculate the GP.
I have attached a picture of the data. I can drop a sample sheet somewhere as well if necessary. I just need to know where.
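As an illustration of the reshaping involved, here is a hypothetical sketch in pandas (not Power BI; the column names and sample values are assumptions about the layout described above), unpivoting per-store SOH/Net columns into one row per SKU, store, and measure:

# Hypothetical sketch (pandas, not Power BI): unpivot per-store SOH/Net
# columns into long form. Column names and values are made up.
import pandas as pd

df = pd.DataFrame({
    "SKU": ["A1", "A2"],
    "Store1 SOH": [5, 3], "Store1 Net": [123.0, 45.5],
    "Store2 SOH": [7, 2], "Store2 Net": [99.0, 10.0],
})

long = df.melt(id_vars="SKU", var_name="store_measure", value_name="value")
long[["Store", "Measure"]] = long["store_measure"].str.split(" ", expand=True)
long = long.drop(columns="store_measure")
print(long)  # one row per SKU/store/measure; Net and SOH can be filtered directly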
I figured it out. All you have to do is change the regional settings, not on your laptop specifically but rather within Power BI itself.
Go to File > Options and settings > Options > Regional Settings and change the locale to English (United Kingdom) (that's the region that worked for me and fixed everything automatically).

Reading an Excel sheet using ADO/ODBC in Delphi 7

I'm trying to read an Excel sheet from an XLS or XLSX file in memory using Delphi 7. When possible I use automation to read the cells one by one, but when Excel is not installed, I revert to using the ADO/ODBC Jet driver.
I connect using either
Provider=Microsoft.Jet.OLEDB.4.0; Data Source=file.xls;Extended Properties="Excel 8.0;Persist Security Info=False;IMEX=1;HDR=No";
or
Provider=Microsoft.ACE.OLEDB.12.0; Data Source=file.xlsx;Extended Properties="Excel 12.0;Persist Security Info=False;IMEX=1;HDR=No";
My problem then is that when I use the following query:
SELECT * FROM [SheetName$]
the returned results do not contain the empty rows or empty columns, so if the sheet contains such rows or columns, the following cells are shifted and do not end up in their correct positions. I need the sheet to be loaded "as is", i.e. I need to know exactly what cell position each value comes from.
I tried to read the cells one by one by issuing one query of the form
SELECT F1 FROM `SheetName$A1:A1`
but now the driver returns an error saying "There is data outside the selected region". By the way, I had to use backticks to enclose the name, because using brackets like this [SheetName$A1:A1] produced a syntax error message.
Is there a way to tell the driver to select the sheet as-is, without skipping blanks? Or maybe a way to know from which cell position each value is returned?
For internal policy reasons (I know they are bad, but I do not decide these), it is not possible to use a third party library; I really need this to work with standard Delphi 7 components.
I assume that if your data is, say, in the range B2:D10, you want to include column A as an empty column? Is that correct? If that's the case, then your data set, when you read the whole sheet (SELECT * FROM [SheetName$]), would also have to return 1 million rows by 16K columns!
Can you not execute a query like SELECT * FROM [SheetName$B2:D10] and use the ADO GetRows function to get an array, which will give you the size of the data? Then you can index into the array to get the data you want.
OK, the correct answer is: use a third party library no matter what your boss says. Do not even try ODBC/ADO to load arbitrary Excel files; you will hit a wall sooner or later.
It may work for Excel files that contain a single data table, but not when you want to cherry-pick data in a sheet primarily made for human consumption (i.e. where a single column contains some cells with introductory text, some with numerical data, some with comments, etc.).
- Using IMEX=1 ignores empty lines and empty columns.
- Using IMEX=0 sometimes no longer ignores empty lines, but then some of the first non-empty cells are considered field names instead of data, even though HDR=No. It would not work anyway, since values in a column are of mixed types.
- Explicitly looping across cells and issuing SELECT * FROM [SheetName$A1:A1] for each one works until you reach an empty cell, at which point you get access violations:
Access violation at address 1B30B3E3 in module 'msexcl40.dll'. Read of address 00000000
I'm too old to want to try and guess the appropriate values to use so that it works until someone comes along with yet another mix of data in a column. Sorry for having wasted everybody's time.

SQL - Linked Server with Excel imports values as NULL

I have been successfully using a linked server in SQL Server Management Studio to import an Excel file which has four columns.
The Excel document looks like this (no TOOL value means a blank cell; this applies to rows 6-199):
TDS   HOLDER   TOOL
1     3        1187
2     4        09812
3     5        9082
4     2        ----
5     76       ----
6     9
7     1
...   ...
200   18       CT-2989
201   98       CT-9871
When I import it as is, it grabs the cells with numbers at the top and the cells that contain ----, but when it gets to the blank cells it prints NULL for the rest of the data, which is incorrect.
When I alter my Excel document so that the 'CT' values are at the top, it will grab all of the proper CT and TL values in column 3.
The problem is with the SQL Server Import and Export Wizard. It uses the data in the top few rows of the spreadsheet to decide on the data type of each column. When your TOOL column has numbers at the top, the wizard decides the data type of the column is float. When the column had "CT-2989" at the top, it chose a char type. Once it has chosen the float type, it will ignore CT-2989 because it isn't convertible to a floating point number. The simplest solution to the problem is to arrange your Excel spreadsheet with a dummy row at the very top which gives the wizard the proper type for each column. For example, make the first data cell in the TOOL column "abcdefg", assuming the rest of the data in that column consists of up to 7 alphanumeric characters. Once your data has been imported into SSMS, delete the dummy row.
If you go to the "Review Data Type Mapping" page of the wizard, it will show that the TOOL column has been detected as containing float data when the numeric data is at the top of the spreadsheet. Note that even if the destination type for the TOOL column is nvarchar, the wizard makes its own decisions regarding the source type.
There are other solutions using openrowset() and SSIS, but this one is quick and simple.
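If rearranging the spreadsheet by hand becomes tedious, the dummy row can also be injected with a short script before each import. A minimal sketch, assuming openpyxl and an .xlsx copy of the workbook (the file name and column index are placeholders):

# Minimal sketch: inject a dummy text row so the wizard types the TOOL
# column as character data (assumes openpyxl and an .xlsx file; the file
# name and column index are placeholders).
from openpyxl import load_workbook

wb = load_workbook("tools.xlsx")
ws = wb.active

ws.insert_rows(2)                          # keep the header in row 1
ws.cell(row=2, column=3, value="abcdefg")  # dummy value in the TOOL column

wb.save("tools_with_dummy.xlsx")
# After importing, delete the "abcdefg" row from the SQL table.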
Here the problem is with OLEDB, which is unable to handle mixed data (numbers + text), so there is no real solution, only a few hacks, some of which are already mentioned above. I just want to add a few more:
1) In the Excel sheet, keep the data consistent and maintain distinct columns for each data type, e.g. text, whole numbers, fractional numbers, etc.
2) Before importing, break the sheet down into multiple sheets based on data type, so that OLEDB won't get confused.
3) Import the Excel sheet into MS Access so that all the data gets a data type, then import it into SQL; this also handles NULLs very wisely.
Save the worksheet as .CSV and import it as a flat file (from Tasks); when reviewing the data, uncheck the data type indicator.

How to Differentiate Data from a Column/Header in an Excel File

I hope someone can help me come up with an algorithm.
I'm still very new to Apache POI, and I was assigned to come up with an algorithm for reading a template (Excel) and separating the headers/column names from the data itself.
The following must be taken into account:
There can be multiple headers/column names in just one sheet of an Excel file.
Headers can be horizontal AND/OR vertical in nature. This means that there could be a mixture of vertical and horizontal headers in one sheet.
Headers don't necessarily have to be on the very first row of the file. There could be introductions or banner images there.
The system must allow ANY kind of Excel format, so there is no control over the formatting of the cells, the naming convention, etc.
Some headers are alphanumeric in nature, which means they also contain numbers.
Some cells are merged to make room for a specific header.
Any ideas and suggestions are very much welcome. Just let me know if you have further clarifications.
(I know nothing about Apache POI, but I know something about Excel Interop.)
If the sheets to be detected are yours, I'd recommend NAMING those header cells. (To name a cell in Excel, there's a field at the top left of the screen, where normally the cell coordinates appear, like "A1" or "B2". Type a name in that place, and you will be able to identify that cell in code by its name; 'Worksheet.Range("Name")' is how you get those cells via code.)
To manage names, go to "Insert - Names" or "Formulas - Name Manager", depending on your version of Excel.
(Personally, I never work with sheets via code without naming the headers; I then use "Offset" to get the data cells corresponding to those headers. This allows me to freely edit the sheet later without breaking the code.)
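As a sketch of how such named cells can then be picked up in code (openpyxl is used here as a stand-in, since I can't speak for Apache POI; the file name and defined name are assumptions):

# Sketch of reading a named header cell (openpyxl used as a stand-in for
# Apache POI; the file name and defined name are assumptions).
from openpyxl import load_workbook

wb = load_workbook("report.xlsx")
defn = wb.defined_names["SalesHeader"]  # name created via Excel's Name Box

for sheet_title, coord in defn.destinations:
    cell = wb[sheet_title][coord]
    print(sheet_title, coord, cell.value)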
If the sheets aren't yours, then you'll need to find out the extents of the data (last row and last column).
Then check for the first row that has all of its columns filled, none of them blank. That's a probable horizontal header.
Likewise, check for the first column that has all of its rows filled. That's a probable vertical header.
You could also search for completely blank rows and/or columns to find headers that come AFTER some data, in the case of sheets containing multiple horizontal or vertical headers.
You could use some formatting properties (Range.Interior or Range.Font, for example) of those cells to identify whether they are headers (usually headers have a different format: color, borders and so on).
If you're sure there's no numeric header, i.e. all headers contain text, check the type of data in the cells. If all are strings, the header probability increases.
Even so, this is a tricky thing to do. If sheets don't follow some pattern, once in a while one of them will deceive your code and bring back false results. I'd recommend, if allowed, adding a human verification step to confirm the results after the process is done.
The solution to this problem involves taking away two of these freedoms. With those constraints applied, this becomes a tractable problem. Most such freedoms come from overcautious thinking.
The freedoms are given as quotes below:
Headers can be horizontal AND/OR vertical in nature. This means that there could be a mixture of vertical and horizontal headers in one sheet.
Typically, vertical headers are not used in Excel files where there is a need to programmatically detect headers, as the primary, most common, and sometimes only reason for such detection is to upload/transform the tabular data.
Funny things happen when vertical headers are introduced:
They become labels of forms. This implies that such forms are used for data entry rather than storage. The data from such forms is stored with horizontal/columnar headers and row-wise/vertical records of data, thus obviating the need for upload/transformation of the data entry sheet.
Excel is designed to have only horizontal headers; vertical headers cease to have AutoFilter support.
Even when vertical headers are present, a top horizontal header row can still be introduced to mark the headers themselves as descriptions/categories.
Staying true to the core need for autodetection of headers, we can state that once our requirement says that headers can be placed only in a horizontal alignment, the solution becomes slightly more tractable, but not fully so.
Some cells are merged to make room for a specific header.
Merging cells is poison and anathema to the entire reason for transformation/upload of data. This is a pill I have steadfastly refused to take in my entire career of Excel and SQL jugglery. You may kindly merge all that you want to for all I care; however, thee shall not pass into my beloved SQL Server.
For the aforementioned reasons of prejudice and ill-will towards all mergers and mergees alike, I'd respectfully suggest that you too take this course.
Solution
Staying true to the above requirements after taking away the two freedoms, the pseudo-algorithm (solution) is to:
Take a sample of, say, c × r Excel cells, for example 200 × 201 rows and columns.
Find the count of non-empty cells, i.e. cells whose contents have non-zero length, using an inbuilt formula like COUNTA. The count of such non-empty cells in each row is maintained in a data structure.
The type of data, i.e. Number, Date, or String, should also be maintained in this data structure, which should be capable of expressing the following:
Row# 22 contains
30 non-empty cells of which
28 are alphanumeric,
1 is a Date and
1 is a Number.
The first row that contains the maximum number of such non-empty cells, with the maximum number of strings, is very likely the header row.
Converting all of the above to a specific algorithm in any given language should be a deliciously occupying task for any young developer in their prime.
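To that end, here is a hypothetical sketch of the heuristic (openpyxl stands in for Apache POI, and the file name and sample size are assumptions):

# Hypothetical sketch of the heuristic above (openpyxl stands in for Apache
# POI; file name and sample size are assumptions). Each sampled row is
# scored by its count of non-empty cells and, as a tiebreaker, by how many
# of them are strings; the first row with the best score wins.
from openpyxl import load_workbook

wb = load_workbook("data.xlsx", read_only=True)
ws = wb.active

best_row, best_score = None, (-1, -1)
for idx, row in enumerate(ws.iter_rows(min_row=1, max_row=200, max_col=201), start=1):
    values = [c.value for c in row if c.value is not None]
    n_strings = sum(isinstance(v, str) for v in values)
    score = (len(values), n_strings)
    if score > best_score:
        best_row, best_score = idx, score

print(f"Probable header row: {best_row}")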
