Excel: multiple data connections to the same table

I have a worksheet DATA with a table populated from a JSON file through Microsoft Query.
There are several different JSON files, so I need to create a separate connection for each of them.
I also have a cell on another worksheet where I would like to indicate a parameter (for example Yesterday, Today, Tomorrow).
According to the selected parameter, the table in the DATA worksheet should be populated from the related data connection (yesterday.json, today.json, tomorrow.json).
Is it possible to do this? If yes, what would be the procedure?
I have an idea that it might be possible to do this by changing the filename inside the query.
For example, this is my query:
let
    FilePath = Excel.CurrentWorkbook(){[Name="FilePath"]}[Content]{0}[Column1],
    FullPathToFile1 = FilePath & "\json\today.json",
    Source = Json.Document(File.Contents(FullPathToFile1)),
So I am thinking there might be some way to "inject" the filename into the above query based on the value of some cell.
I will appreciate any help, links, etc.
Thanks!
UPDATE:
I have created a named cell jsonPath and put the file name in it.
Then I modified the above query as follows, but it gives me an error:
    FilePath = Excel.CurrentWorkbook(){[Name="FilePath"]}[Content]{0}[Column1],
    FullPathToFile1 = FilePath & "\json\" & [jsonPath],
    Source = Json.Document(File.Contents(FullPathToFile1)),

I got it working by modifying my query as follows:
    FilePath = Excel.CurrentWorkbook(){[Name="FilePath"]}[Content]{0}[Column1],
    FileName = Excel.CurrentWorkbook(){[Name="jsonPath"]}[Content]{0}[Column1],
    FullPathToFile1 = FilePath & "\json\" & FileName,
    Source = Json.Document(File.Contents(FullPathToFile1)),
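If you also want the table to refresh automatically whenever the parameter cell changes, a small VBA handler on the sheet holding the parameter can trigger it. This is only a sketch: the named range jsonPath and the connection name "Query - DATA" are assumptions you would adjust to your workbook.

Private Sub Worksheet_Change(ByVal Target As Range)
    ' Refresh the query output whenever the named parameter cell changes.
    ' "jsonPath" and the connection name "Query - DATA" are assumptions.
    If Not Intersect(Target, Me.Range("jsonPath")) Is Nothing Then
        ThisWorkbook.Connections("Query - DATA").Refresh
    End If
End Sub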

Related

Is there an equivalent to SCHEMA.INI for reading Excel Workbooks

I am currently working on a project that will import data from multiple different sources in a variety of formats and structures - e.g., CSV, fixed-length, other-delimited (tab, pipe, etc.) plain-text, and Excel worksheets/workbooks. For this, I'm attempting to build "generic" readers for these files which will throw the files' contents into a DataTable/DataSet I can use in other methods.
The plain-text files are pretty simple, as I've created a large SCHEMA.INI file which contains field definitions for each of the files the system will handle. That SCHEMA.INI resides in a "processing folder" where the files are temporarily stored until their data has been integrated with other systems. A defined text file's data can be easily extracted using this method:
Private Function TextFileToDataTable(ByVal TextFile As IO.FileInfo) As DataTable
    Dim TextFileData As New DataTable("TextFileData")
    Using TapeFileConnect As New OleDb.OleDbConnection("Provider=Microsoft.Jet.OleDb.4.0;Data Source='" + TextFile.DirectoryName + "';Extended Properties='Text';")
        Using TapeAdapter As New OleDb.OleDbDataAdapter(String.Format("SELECT * FROM {0};", TextFile.Name), TapeFileConnect)
            Try
                TapeAdapter.Fill(TextFileData)
            Catch ex As Exception
                TextFileData = Nothing
            End Try
        End Using
    End Using
    Return TextFileData
End Function
This works well because a plain-text file isn't terribly complex in its data structure. A single file generally (at least for my requirements) contains, at most, one single table's worth of data - unless, of course, it's some sort of complex XML or JSON structure file, which can/should be handled completely differently anyway - so there's no need to go iterating through different elements beyond this.
NOTE: The code above is dependent on the SCHEMA.INI file being present in the same directory as the plain-text file being read and there being a section within that SCHEMA.INI defined with the same name as that plain-text file.
EXAMPLE:
[EXAMPLE_TEXT_FILE.TXT]
CharacterSet=ANSI
Format=FixedLength
ColNameHeader=FALSE
DateTimeFormat="YYYYMMDD"
COL1=CUSTOMER_NUMBER TEXT WIDTH 20
COL2=CUSTOMER_FIRSTNAME TEXT WIDTH 30
COL3=CUSTOMER_LASTNAME TEXT WIDTH 40
COL4=CUSTOMER_ADDR1 TEXT WIDTH 40
COL5=CUSTOMER_ADDR2 TEXT WIDTH 40
COL6=CUSTOMER_ADDR3 TEXT WIDTH 40
...
Excel workbooks, however, can be a bit trickier. Several of the workbooks I have to process contain multiple worksheets worth of data that I want to consolidate into a single DataSet with a DataTable for each worksheet. The basic functionality is, again, fairly straightforward and I've come up with the following method to read any and all sheets into a DataSet:
Private Function ExcelFileToDataSet(ByVal ExcelFile As IO.FileInfo, ByVal HasHeaderRow As Boolean) As DataSet
    Dim ExcelFileData As New DataSet("ExcelFileData")
    Dim ExcelConnectionString As String = String.Empty
    Dim UseHeaders As String = "NO"
    Select Case ExcelFile.Extension.ToUpper.Trim
        Case ".XLS"
            ExcelConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties='Excel 8.0;HDR={1}'"
        Case ".XLSX"
            ExcelConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties='Excel 12.0 Xml;HDR={1}'"
    End Select
    If HasHeaderRow Then
        UseHeaders = "YES"
    End If
    ExcelConnectionString = String.Format(ExcelConnectionString, ExcelFile.FullName, UseHeaders)
    Try
        Using ExcelConnection As New OleDb.OleDbConnection(ExcelConnectionString)
            Dim ExcelSchema As New DataTable
            ExcelConnection.Open()
            ExcelSchema = ExcelConnection.GetOleDbSchemaTable(OleDb.OleDbSchemaGuid.Tables, Nothing)
            For Each ExcelSheet As DataRow In ExcelSchema.Rows
                Dim SheetTable As New DataTable
                Using ExcelAdapter As New OleDb.OleDbDataAdapter
                    Dim SheetName As String = ExcelSheet("TABLE_NAME").ToString
                    Dim ExcelCommand As New OleDb.OleDbCommand
                    SheetTable.TableName = SheetName.Substring(0, SheetName.Length - 1)
                    ExcelCommand.Connection = ExcelConnection
                    ExcelCommand.CommandText = String.Format("SELECT * FROM [{0}]", SheetName)
                    ExcelAdapter.SelectCommand = ExcelCommand
                    ExcelAdapter.Fill(SheetTable)
                End Using
                ExcelFileData.Tables.Add(SheetTable)
            Next ExcelSheet
        End Using
    Catch ex As Exception
        ExcelFileData = Nothing
    End Try
    Return ExcelFileData
End Function
The above code will work in a majority of the cases I deal with, but my "difficulty" is that there may be some worksheets that have header rows and some that don't within the same workbook. Also, for those worksheets that do not have a header row, I'd like to be able to define the field names and data types similar to how I can with the plain-text SCHEMA.INI. The only thing I have going for me in these cases is that the "client" provides me with a data map to help me identify what data elements are in each field.
What I'd like to know is if there is a way, similar to the text file's SCHEMA.INI, to define the structure of an Excel workbook and the worksheet(s) it contains - including column data types, to keep the OleDb driver from "misinterpreting" a column's data - ahead of time. I imagine this could be any sort of structured file such as INI, XML, or whatever, but it would need to be capable of identifying whether or not a particular sheet contains a header row or, in lieu of such a row, the (expected) column definitions. Does any such "standard definition" file exist for Excel workbooks?
One thing to note: As you may have noticed in the code for the ExcelFileToDataSet() method, I may be dealing with the older .XLS (97-03) format or the .XLSX (07+) format, so I can't necessarily rely on the workbook being Open XML compliant. I suppose I could try breaking the methods out to one for each extension, but I'd rather find something that I can use regardless of which file format the Excel file is using.
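Purely to illustrate the sort of definition I have in mind (a hypothetical format of my own, borrowing the SCHEMA.INI conventions - not something the OleDb driver actually understands), one section per workbook/worksheet pair along these lines would cover it:

[CLIENT_DATA.XLSX|CUSTOMERS]
ColNameHeader=TRUE

[CLIENT_DATA.XLSX|ORDERS]
ColNameHeader=FALSE
COL1=ORDER_NUMBER TEXT WIDTH 20
COL2=ORDER_DATE DATETIME
COL3=ORDER_TOTAL CURRENCY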

How to import Excel to DataGrid, then filter by DB values

My question is about importing Excel into a DataGridView, but there is an extra case.
I also have an OleDb database with store codes and store names.
After the import, I want the grid to show only the store codes that exist in that database.
My code is here:
Dim conn As OleDbConnection
Dim dtr As OleDbDataReader
Dim dta As OleDbDataAdapter
Dim cmd As OleDbCommand
Dim dts As DataSet
Dim excel As String
Dim OpenFileDialog1 As New OpenFileDialog
OpenFileDialog1.FileName = ""
OpenFileDialog1.InitialDirectory = My.Computer.FileSystem.SpecialDirectories.Desktop
OpenFileDialog1.Filter = "All Files (*.*)|*.*|Excel files (*.xlsx)|*.xlsx|CSV Files (*.csv)|*.csv|XLS Files (*.xls)|*.xls"
If (OpenFileDialog1.ShowDialog(Me) = System.Windows.Forms.DialogResult.OK) Then
    DataGridView1.Columns.Clear()
    Dim fi As New FileInfo(OpenFileDialog1.FileName)
    Dim FileName As String = OpenFileDialog1.FileName
    excel = fi.FullName
    conn = New OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excel + ";Extended Properties=Excel 12.0;")
    dta = New OleDbDataAdapter("Select * From [Sheet1$]", conn)
    dts = New DataSet
    dta.Fill(dts, "[Sheet1$]")
    DataGridView1.DataSource = dts
    DataGridView1.DataMember = "[Sheet1$]"
    conn.Close()
End If
Firstly, sorry for my terrible English :)
Images as follows:
Main Form
Store List Form
I want only the stores in the store list to be displayed in the DataGridView.
It's not exactly clear what your current display looks like, what the problem is, and what your desired display should look like. But you have asked about selecting only one part of the data you are importing, which is presumably found in only one column of the imported Excel data.
When the DataTable is created, it has the columns and rows from the Excel worksheet. The columns will take their names from the first row, and the rows will be the records from the succeeding rows in the worksheet. You can access both the header data and the row data easily. The code below is very rough, but it shows how to get at the data in the DataTable which you have already successfully imported with the code shown above.
Dim columns = datatable.Columns
Dim rows = datatable.Rows
Dim columns1 = columns(0)
Dim rows1 = rows(0)
Dim element1 = rows1(0)
Columns will have all the headers, so you can locate the column with the store codes or store names. Then the rows will have the data for each store. So above, rows1 is the first row of data and element1 is the data in that row from columns1, and so on. The (0) is the index into the respective collections.
You will, of course, have to write code to extract the data you want and if necessary eliminate duplicates, but the data is all there already.
Hopefully getting the data into a list and then sorting, filtering and selecting the data should be relatively straightforward, but if not, add a comment. That's kind of a different problem. You asked about getting only the store codes.
Added: Based on your additional images and explanation, you are looking to perform an SQL INNER JOIN operation. From the w3schools.com page on SQL INNER JOIN, "The INNER JOIN keyword selects all rows from both tables as long as there is a match between the columns." This is something you will have to study and learn, but it should provide what you need in this case. You will need to define and construct both tables and then perform the JOIN.
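For example, here is a rough sketch of that same filtering done in code rather than SQL, assuming the imported worksheet and the store list are already loaded into DataTables (dtImported and dtStores here) and that the shared column is named "StoreCode" - those names are assumptions, so adjust them to your data:

' Requires: Imports System.Linq and Imports System.Data
' (reference System.Data.DataSetExtensions for AsEnumerable/CopyToDataTable).
' dtStores and dtImported are assumed to hold the database store list and the Excel import.
Dim dbCodes As New HashSet(Of String)(
    dtStores.AsEnumerable().Select(Function(r) r.Field(Of String)("StoreCode")))
' Keep only the imported rows whose store code also exists in the database table.
Dim matched = dtImported.AsEnumerable().
    Where(Function(r) dbCodes.Contains(r.Field(Of String)("StoreCode")))
If matched.Any() Then
    DataGridView1.DataSource = matched.CopyToDataTable()
End If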
And, by the way, you could also follow the link provided in the first comment by T.S., and if that solves your problem, it's a far simpler solution.

Get Current Workbook Path - Excel Power Query

Trying to add a custom column and populate it with the current workbook's path.
I have tried Excel.Workbook.name and Excel.CurrentWorkbook() and other objects, but it seems those are limited to pulling data.
In VBA this is simply the Workbook object's Path property, but with Power Query it's another story. The references and libraries on the Microsoft site are limited for Power Query:
https://msdn.microsoft.com/en-us/library/mt779182.aspx
Instead of using VBA, you can use the following method which merely involves using an Excel formula:
Define the following formula in Excel and name this cell "FilePath":
=LEFT(CELL("filename",$A$1),FIND("[",CELL("filename",$A$1),1)-1)
Add the following function in PowerQuery. This will return the current directory:
() =>
let
    CurrentDir = Excel.CurrentWorkbook(){[Name="FilePath"]}[Content]{0}[Column1]
in
    CurrentDir
Now you can import your CSV (or other) file from the current directory (this assumes the function query above was saved under the name currentdir):
let
    Source = Csv.Document(File.Contents(currentdir() & "filename.csv"),[Delimiter=";", Columns=15, Encoding=65001, QuoteStyle=QuoteStyle.None])
in
    Source
Credits: https://techcommunity.microsoft.com/t5/excel/power-query-source-from-relative-paths/m-p/206150
There is no direct way to do this in Power Query. If you can fill the value into a cell you can get that value through Excel.CurrentWorkbook.
You can use VBA and have a cell filled in when the file is opened during the Workbook_Open event:
Private Sub Workbook_Open()
    Dim root As String
    root = ActiveWorkbook.Path
    Range("root").Value = root
    'root is the named range used in Power Query.
End Sub
You can then get this variable from the named range ("root") into Power Query by doing something along these lines:
let
    Source = Excel.CurrentWorkbook(){[Name="root"]}[Content][Column1]{0}
in
    Source

PowerQuery - Folder Query import new files

If I have created a Power Query function that imports XML from a folder, how, in the same Excel file, do I reuse the query when there are new files in the folder, so that only data from those new files is included and appended to the current table?
If you start a Power Query using From File / From Folder and browse to your folder, you will see each file represented as a row in a table, with columns such as Date modified. You can filter that list using Date/Time filters on Date modified or by something more complicated if necessary (post your specific requirements and I'll try to steer you in the right direction).
Once you have filtered the query to just the "new files", you can pass the [Content] column into your Function.
Finally Append a new query based on the saved Excel Table output from your pre-existing query together with the "new files" query above to get your combined output. The new query would be set to Load To / Only Create Connection.
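As a rough sketch of what that filtered query could look like in M - the folder path, the cutoff date used as the "new files" filter, and the function name ParseXmlFile are all assumptions to adapt to your workbook:

let
    Source = Folder.Files("C:\Data\Incoming"),
    // Keep only files modified after a cutoff - replace with whatever "new" means for you.
    NewFiles = Table.SelectRows(Source, each [Date modified] > #datetime(2016, 1, 1, 0, 0, 0)),
    // ParseXmlFile stands in for your existing function that turns one file's [Content] into a table.
    Parsed = Table.AddColumn(NewFiles, "Data", each ParseXmlFile([Content])),
    Combined = Table.Combine(Parsed[Data])
in
    Combined

You would then Append this query to the table produced by your existing query, as described above.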
You can watch a folder for file changes with a simple VBA script that uses WMI to poll the directory contents every n seconds.
Something similar to this:
Sub WatchDirectory(folderPath As String, every As Integer)
    ' Poll the folder every "every" seconds for file events via WMI.
    ' Note: backslashes in folderPath must be doubled for the WQL string (e.g. "C:\\Temp").
    Dim wmisvc As Object, events As Object, evt As Object
    Dim query As String
    Set wmisvc = GetObject("winmgmts:\\.\root\cimv2")
    query = "SELECT * FROM __InstanceOperationEvent " _
        & "WITHIN " & every _
        & " WHERE TargetInstance ISA 'CIM_DirectoryContainsFile' AND " _
        & "TargetInstance.GroupComponent='Win32_Directory.Name=" _
        & Chr(34) & folderPath & Chr(34) & "'"
    Set events = wmisvc.ExecNotificationQuery(query)
    Do While True
        Set evt = events.NextEvent()
        If evt.Path_.Class = "__InstanceCreationEvent" Then
            '....
        End If
    Loop
End Sub
For more info on wmi see https://sites.google.com/site/beyondexcel/project-updates/exposingsystemsecretswithvbaandwmiapi
For more details on file watching with WMI, see https://blogs.technet.microsoft.com/heyscriptingguy/2005/04/04/how-can-i-monitor-for-different-types-of-events-with-just-one-script/

Unexpected null members when reading XLSX with Excel Data Reader (.Net)

I'm reading an XLSX (Microsoft Excel XML file) using the Excel Data Reader from http://exceldatareader.codeplex.com/ and am having a problem with missing data. Data which is in the source Excel spreadsheet is missing from the data set returned by the library.
Here's a bit more detail of what I'm doing:
1. Created a simple test spreadsheet in Excel with one sheet, a header row, and two data rows. Saved and closed Excel.
2. Opened the file and passed the stream into the CreateOpenXmlReader() method, getting back an IExcelDataReader.
3. Called the AsDataSet() method on the IExcelDataReader, getting back a DataSet.
4. Got the ItemArray from row 1 of table 0.
5. Looped through the ItemArray and discovered there is data missing (i.e. there are System.DBNull members where I expected System.String members).
Here's a bit more analysis...
I debugged the code and looked inside the ExcelDataReader object model. Found a non-public string array called "SST" which appears to contain the data from the spreadsheet as a single linear (one-dimensional) array.
On closer inspection, I found that the data I was looking for was also missing from this array. In this raw data, the member does not exist at all.
My guess is that for some reason the parser is not picking up the data from the OOXML and concluding that the cell is empty. Looking at the OOXML itself, the data seems to be split across the sharedStrings.xml and sheet1.xml files, so perhaps the parser is having a tough time putting all this together.
Saving the file in binary format (Excel 97-2003) and reading that in solves the problem, so on the surface that seems to confirm my suspicion that the issue is with reading the OOXML format.
Suggestions?
As a stopgap I'm converting all files to binary format, but that seems like a kludge. Is there some way to get my OOXML-formatted Excel files to read in properly with Excel Data Reader?
To retrieve an Excel spreadsheet (.xlsx) and load it into a DataSet, you don't need to mess with XML readers or a separate library like Excel Data Reader. The code for reading an entire spreadsheet into a DataSet is pretty simple when using the normal OleDb functions in .NET:
Sub readInMyExcelFile()
    ' Requires: Imports System.Data and Imports System.Data.OleDb
    Dim xlsFile As String = "myexcelfile"
    Dim conStr As String = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & xlsFile & ";Extended Properties=""Excel 12.0 Xml;HDR=YES"""
    Dim dtSheets As New DataTable
    Dim tmp As String
    Dim sqlText As String
    Using cn As New OleDbConnection(conStr)
        cn.Open()
        dtSheets = cn.GetSchema("Tables")
    End Using
    'Dataset for the spreadsheet
    Dim ds As New DataSet
    'Loop through the names of all the worksheets in the file.
    For Each rw As DataRow In dtSheets.Rows
        tmp = rw("TABLE_NAME").ToString
        tmp = "[" & tmp & "]"
        Dim dt As New DataTable
        Using cn As New OleDbConnection(conStr)
            cn.Open()
            'Retrieve all the records from the worksheet.
            sqlText = "SELECT * FROM " & tmp
            Dim adp As New OleDbDataAdapter(sqlText, cn)
            'Fill the data table with all the data.
            adp.Fill(dt)
        End Using
        ds.Tables.Add(dt)
    Next
End Sub
It seems there is a bug in Excel Data Reader (this is the first time I have heard about it). Do you have to use it? If not, EPPlus would be a better choice.
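For illustration, here is a rough sketch of reading the first worksheet of an .xlsx into a DataTable with EPPlus. It assumes the OfficeOpenXml package is referenced and that the sheet has a header row; exact API details can differ between EPPlus versions.

Imports System.Data
Imports System.Linq
Imports OfficeOpenXml

Module EpplusSketch
    Function XlsxToDataTable(ByVal path As String) As DataTable
        Dim dt As New DataTable()
        Using pck As New ExcelPackage(New IO.FileInfo(path))
            Dim ws = pck.Workbook.Worksheets.First()
            'Use the first row as column headers.
            For col As Integer = 1 To ws.Dimension.End.Column
                dt.Columns.Add(ws.Cells(1, col).Text)
            Next
            'Copy the remaining rows into the DataTable.
            For row As Integer = 2 To ws.Dimension.End.Row
                Dim dr As DataRow = dt.NewRow()
                For col As Integer = 1 To ws.Dimension.End.Column
                    dr(col - 1) = ws.Cells(row, col).Value
                Next
                dt.Rows.Add(dr)
            Next
        End Using
        Return dt
    End Function
End Module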
Excel Data Reader from CodePlex is used for reading data from the Excel file directly in a web application, without any sort of caching on the server; the above code only stands when we can store the Excel file somewhere. I have faced similar problems with Excel Data Reader where some of the data is missing. Most importantly, I couldn't find any specific trend; all I could see was that if all the rows have values then there is no problem. The best chance is to convert XLSX to XLS.
