Stopping NULL Rows and Columns to get imported from excel to access - excel

I am importing some table from excel to access.Sometime some blank columns also are imported as field13 or field-x .
Whats the reason for that.
Also sometimes some blank rows are imported also.Is there a way to stop it?
Below is the set of data I am trying for import. But sometimes I see an extra column after ArtSRv and and some extra empty rows after 2 row getting imported. So I want to know the reason for that.

During the import process you are given the options to not import columns (there is a checkbox). Also you can choose a primary key which can help with the blank records.
The blank records are mostly the way the data is structured in excel (or not structured as the case may be.
Excel keeps a property of the active area, and I believe Access uses this property to determine the rows/columns to import.

Related

Check For Matches And Import Data Into Specific Cells From An External Source

We are trying to track some online marketing metrics and I'm having some trouble. I have 2 tables in different tabs (one imports data from several external data sources, ultimately we want this to be a series of google sheets) and one is the working table.
I have rows on the imported data with month and other attributes defining the data and in the working data these are columns. The working data has a lot of other cells too that are not there with calculations, etc.
What I need to do is to check on the working sheet which month (for example) we are in, then go to the working data and scan all the data for matches with that month. Then I want to consolidate each of the data parameters into the working sheet. Ideally I wouldn't even have to import all the external data into a tab on the working spreadsheet, if I could find a way to work where it would check the external documents for the matches that would be great. The structure of the data in the external documents is the same as displayed here fore the imported data.
Note that in this case it is month but it could be anything random so DATE functions wouldn't work.
So I want to pickup the data from the external source above, and insert in the relevant places. But while the months will not change, other data can change the order in which is imported so we need to check that the headers from picture 2 match the row labels from the imported data.
I hope that makes sense. I would really appreciate any help. Was up until 4 AM trying to figure this out and I would hate to go back to my boss saying he's gonna need to get someone else to do this as I can't. :/
Thank you.
So I resolved this with a INDEX(array, MATCH(),MATCH()) function. First I selected the answer array from the cells with the info I wanted then used the MATCH Function to match the row and the columns I wanted in the matrix.
This created another problem where no answer existed as it threw an error so I had to envelop the whole expression in an IFERROR function.
The final solution was like this:
=IFERROR(INDEX(Table_Query_from_Excel_Files,MATCH(!H1:I1&A1,INDEX(Table_Query_from_Excel_Files[Month]&Table_Query_from_Excel_Files[User location],,),0),MATCH(!A1,Table_Query_from_Excel_Files[#Headers],0)), 0)

Excel 2010 check cells in a row contain data before the Selection Change fires

I have an Excel 2010 data table which is driven by a query from MSSQL. The underlying query changes depending upon what options the user selects in the Excel workbook. I'm okay with changing the query and pulling the data.
After the data has been selected multiple users will be able to edit and append data to the Excel table and these changes will post back to the SQL database table. Due to the database table structure some of these cells within a given row are mandatory before any data can be inserted into MSSQL and/or potentially updated.
So what I'm trying to achieve is checks on whether certain columns in a row are blank after a cell is edited (I can do this via Worksheet Change) and also before they move off that row so I can bring up a message if all mandatory columns haven't been entered. I can't see any events that fire before Selection Change. My only thoughts on a workaround is to have a global variable row marker that is updated on Selection Change, i.e. it will store the previous row number. I can't use Excel's standard data validation functionality looking at blank cells because although this is fine for a currently correctly populated row that is being edited, inserting new rows or appending directly to the bottom of the table will constantly error as all those mandatory columns will, of course, be blank. Currently I am using conditional formatting to at least highlight columns/cells that require input although this doesn't force users to actually do it. Data cannot be stored within MSSQL until these columns contain data so if they don't fill them in and refresh the table for whatever reason, whatever they have entered will be lost. Obviously this is bad, m'kay. I am concerned about both the Worksheet Change and Selection Change events constantly firing and how that will affect workbook performance.
Any suggestions would be appreciated. Maybe I'm going about this all wrong so any ideas to make this more efficient would also be well received. The user base do not want to see UserForms or MS Access even though it would make this activity very easy. They are too used to the look and feel of Excel sheets.
your best way is to copy the table into 2d array or some other data structure in memory such as dictionary or collection. and than manage each change in memory. this one is very efficient but requires a lot of code. with excel the only problem you have is the key the rest is vlookup and true false questions. vlookup will find the original value and then you have current data + previous data + the logic... is the new data ok?

Update excel from access shuffles rows

We have an access 2003 database and an excel 2003. We connected the excel to import the data from the access, so far so good.
At the end of the imported data right to the next 3 fields of the excel we want to add some comments, text.
Up to here all work fine, but when we update the data to excel the rows shuffle and the comments at the right columns remain at the same row. So the data rows shuffle and the comment rows remain at the same position.
So if row 3 has at the right the comment 3, when we update the worksheet to retrieve the new data from access it replaces row 3 with data from another row but the comment remains at row 3.
Is there a way for the excel to keep the imported data to the same position and not shuffling it?
We tried all the options at excel "What to do when new lines of data are available" but all do the same, it shuffles the data when we update, even if access has no new data.
Thank you in advance.
I think you will find that when you imported data from Access, Excel created/named a range called 'database' or 'data' or something like that. You need to expand that definition to include the added columns.

SSIS Data Flow Task Excel Source

I have a data flow task set up in SSIS.
The source is from an Excel source not an SQL DB.
The problem i seem to get is that, the package is importing empty rows.
My data has data in 555200 rows, but however when importing the SSIS package imports over 900,000 rows. The extra rows are imported even though the other empty.
When i then download this table into excel there are empty rows in between the data.
Is there anyway i can avoid this?
Thanks
Gerard
The best thing to do. If you can, is export the data to a flat file, csv or tab, and then read it in. The problem is even though those rows are blank they are not really empty. So when you hop across that ODBC-Excel bridge you are getting those rows as blanks.
You could possibly adjust the way the spreadsheet is generated to eliminate this problem or manually the delete the rows. The problem with these solutions is that they are not scalable or maintainable over the long term. You will also be stuck with that rickety ODBC bridge. The best long term solution is to avoid using the ODBC-Excel bridge entirely. By dumping the data to a flat file you have total control over how to read, validate, and interpret the data. You will not be at the mercy of a translation layer that is to this day riddled with bugs and is at the best of times "quirky"
You can also add in a Conditional Split component in your Data flow task, between the source task and the destination task. In here, check if somecolumn is null or empty - something that is consistent - meaning for every valid row, it has some data, and for every invalid row it's empty or null.
Then discard the output for that condition, sending the rest of the rows to the destination. You should then only get the amount of rows with valid data from Excel.

Skipping rows when importing Excel into SQL using SSIS 2008

I need to import sheets which look like the following:
March Orders
***Empty Row
Week Order # Date Cust #
3.1 271356 3/3/10 010572
3.1 280353 3/5/10 022114
3.1 290822 3/5/10 010275
3.1 291436 3/2/10 010155
3.1 291627 3/5/10 011840
The column headers are actually row 3. I can use an Excel Sourch to import them, but I don't know how to specify that the information starts at row 3.
I Googled the problem, but came up empty.
have a look:
the links have more details, but I've included some text from the pages (just in case the links go dead)
http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/97144bb2-9bb9-4cb8-b069-45c29690dfeb
Q:
While we are loading the text file to SQL Server via SSIS, we have the
provision to skip any number of leading rows from the source and load
the data to SQL server. Is there any provision to do the same for
Excel file.
The source Excel file for me has some description in the leading 5
rows, I want to skip it and start the data load from the row 6. Please
provide your thoughts on this.
A:
Easiest would be to give each row a number (a bit like an identity in
SQL Server) and then use a conditional split to filter out everything
where the number <=5
http://social.msdn.microsoft.com/Forums/en/sqlintegrationservices/thread/947fa27e-e31f-4108-a889-18acebce9217
Q:
Is it possible during import data from Excel to DB table skip first 6 rows for example?
Also Excel data divided by sections with headers. Is it possible for example to skip every 12th row?
A:
YES YOU CAN. Actually, you can do this very easily if you know the number columns that will be imported from your Excel file. In
your Data Flow task, you will need to set the "OpenRowset" Custom
Property of your Excel Connection (right-click your Excel connection >
Properties; in the Properties window, look for OpenRowset under Custom
Properties). To ignore the first 5 rows in Sheet1, and import columns
A-M, you would enter the following value for OpenRowset: Sheet1$A6:M
(notice, I did not specify a row number for column M. You can enter a
row number if you like, but in my case the number of rows can vary
from one iteration to the next)
AGAIN, YES YOU CAN. You can import the data using a conditional split. You'd configure the conditional split to look for something in
each row that uniquely identifies it as a header row; skip the rows
that match this 'header logic'. Another option would be to import all
the rows and then remove the header rows using a SQL script in the
database...like a cursor that deletes every 12th row. Or you could
add an identity field with seed/increment of 1/1 and then delete all
rows with row numbers that divide perfectly by 12. Something like
that...
http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/847c4b9e-b2d7-4cdf-a193-e4ce14986ee2
Q:
I have an SSIS package that imports from an Excel file with data
beginning in the 7th row.
Unlike the same operation with a csv file ('Header Rows to Skip' in
Connection Manager Editor), I can't seem to find a way to ignore the
first 6 rows of an Excel file connection.
I'm guessing the answer might be in one of the Data Flow
Transformation objects, but I'm not very familiar with them.
A:
Question Sign in to vote 1 Sign in to vote rbhro, actually there were
2 fields in the upper 5 rows that had some data that I think prevented
the importer from ignoring those rows completely.
Anyway, I did find a solution to my problem.
In my Excel source object, I used 'SQL Command' as the 'Data Access
Mode' (it's drop down when you double-click the Excel Source object).
From there I was able to build a query ('Build Query' button) that
only grabbed records I needed. Something like this: SELECT F4,
F5, F6 FROM [Spreadsheet$] WHERE (F4 IS NOT NULL) AND (F4
<> 'TheHeaderFieldName')
Note: I initially tried an ISNUMERIC instead of 'IS NOT NULL', but
that wasn't supported for some reason.
In my particular case, I was only interested in rows where F4 wasn't
NULL (and fortunately F4 didn't containing any junk in the first 5
rows). I could skip the whole header row (row 6) with the 2nd WHERE
clause.
So that cleaned up my data source perfectly. All I needed to do now
was add a Data Conversion object in between the source and destination
(everything needed to be converted from unicode in the spreadsheet),
and it worked.
My first suggestion is not to accept a file in that format. Excel files to be imported should always start with column header rows. Send it back to whoever provides it to you and tell them to fix their format. This works most of the time.
We provide guidance to our customers and vendors about how files must be formatted before we can process them and it is up to them to meet the guidlines as much as possible. People often aren't aware that files like that create a problem in processing (next month it might have six lines before the data starts) and they need to be educated that Excel files must start with the column headers, have no blank lines in the middle of the data and no repeating the headers multiple times and most important of all, they must have the same columns with the same column titles in the same order every time. If they can't provide that then you probably don't have something that will work for automated import as you will get the file in a differnt format everytime depending on the mood of the person who maintains the Excel spreadsheet. Incidentally, we push really hard to never receive any data from Excel (only works some of the time, but if they have the data in a database, they can usually accomodate). They also must know that any changes they make to the spreadsheet format will result in a change to the import package and that they willl be charged for those development changes (assuming that these are outside clients and not internal ones). These changes must be communicated in advance and developer time scheduled, a file with the wrong format will fail and be returned to them to fix if not.
If that doesn't work, may I suggest that you open the file, delete the first two rows and save a text file in a data flow. Then write a data flow that will process the text file. SSIS did a lousy job of supporting Excel and anything you can do to get the file in a different format will make life easier in the long run.
My first suggestion is not to accept a file in that format. Excel files to be imported should always start with column header rows. Send it back to whoever provides it to you and tell them to fix their format. This works most of the time.
Not entirely correct.
SSIS forces you to use the format and quite often it does not work correctly with excel
If you can't change he format consider using our Advanced ETL Processor.
You can skip rows or fields and you can validate the data the way you want.
http://www.dbsoftlab.com/etl-tools/advanced-etl-processor/overview.html
Sky is the limit
You can just use the OpenRowset property you can find in the Excel Source properties.
Take a look here for details:
SSIS: Read and Export Excel data from nth Row
Regards.

Resources