We have an existing table in BigQuery that is updated by a scheduler: it checks an FTP server and uploads any newly added data into the table.
The issue is that a few days of data were dropped from the FTP server, and now I need to upload that data into the table manually.
Ideally, I don't want to create another table, upload the data into it, and then union the two tables; I was looking for a solution that would insert the sheets into the main table right away.
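In case it helps to sketch the operation being asked for: if the sheet data is exposed to BigQuery as an external table, a plain INSERT ... SELECT appends the missing days straight into the main table. A minimal sketch, with placeholder names throughout (my_dataset.main_table, my_dataset.missing_days_sheet, and the column names are all illustrative):

-- Append the dropped days from the sheet-backed external table;
-- the NOT IN guard avoids re-inserting days that are already loaded.
INSERT INTO my_dataset.main_table (event_date, col_a, col_b)
SELECT event_date, col_a, col_b
FROM my_dataset.missing_days_sheet
WHERE event_date NOT IN (SELECT event_date FROM my_dataset.main_table);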
I want to run a pre-SQL script in the Data Flow sink to delete existing records for a particular year.
The year value comes from the source Excel file.
Say I have a file with 2021 data and have loaded that data into the DB. When I rerun the pipeline for the same Excel file, I want to delete the 2021 records and insert fresh ones. The table may contain data for multiple years, so every time a new file arrives for a particular year, I want to delete the respective records and reload the new data.
I can read the year value from a source file column and keep it as a derived column. How can I write a pre-SQL script to delete, something like:
DELETE FROM <target table> WHERE year = <sourcefile.year>
How can I do this in a Data Flow?
Please help!
You can use a temporary table in sink 1 and delete the records in sink 2.
Please follow the demonstration below:
This is my SQL table with some sample data.
Create two SQL sinks for it from the same source using a new branch.
In the first sink, provide a dataset with the edit table name option enabled, pointing at the temporary table.
In this sink, check the Recreate table option.
In sink 2, use your SQL table with the below pre-SQL script:
DELETE FROM exceltable WHERE year IN (SELECT year FROM dbo.temp1);
DROP TABLE dbo.temp1;
In the Data Flow settings, specify the write order of the sinks: the temporary table sink should execute first.
This is my result in the SQL table after the records are deleted.
I have an Excel file with a drop-down list from which I can choose the data to display. I want to use SSIS to load the data into my SQL Server database.
There are around 55 items in the list to choose from, so I don't want to manually copy and paste the data from each selection into a new sheet or file.
I did some research but didn't find an answer on this topic. I'm hoping a script task can do it?
Any ideas will help.
I have 4 sheets with similar fields. I intend to merge these sheets to create a master file that has all the information in one sheet. However, I need Tableau to connect to the final merged file so I can build dashboards from it. This works locally, as I have an Access program that appends the tables together and creates a new table that Tableau connects to.
The main issue is that I am trying to move this process from running locally to running online, meaning I need a database that can:
1- Drop the contents of the tables, pick up the sheets from a specified folder, and import them into the specified tables.
2- Append the new tables into the master tables.
All of this should be done automatically at a scheduled time.
I tried using SQL Server (with SQL Agent scheduling the import/append jobs) for this requirement, but I need to know if something else is out there that can serve this purpose more efficiently.
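For context, the SQL Server route sketched as a single SQL Agent job step would look roughly like this (the folder, file, and table names are placeholders, and it assumes each sheet is saved as a CSV in the import folder):

-- 1- Clear the staging table and re-import the sheet from the folder
TRUNCATE TABLE dbo.Sheet1_Staging;
BULK INSERT dbo.Sheet1_Staging
FROM 'C:\ImportFolder\Sheet1.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');
-- 2- Append the freshly imported rows into the master table
INSERT INTO dbo.MasterTable
SELECT * FROM dbo.Sheet1_Staging;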
Thank you
As long as the sheets have exactly the same fields, you should be able to use Tableau's Union feature. It allows you to do a wildcard search for sheets within a folder structure; any time the data is refreshed in Tableau, it will reach back out to the folder and update/union whatever is currently there.
I built a data-entry UserForm to populate a worksheet that serves as the raw database. The raw data requires further manipulation and analysis before it can be reported, so I set up a database connection using Get External Data > From Microsoft Query > Excel Files, pointed it at the file I was already working in, selected the fields I wanted, and applied basic aggregate functions to the ones I wanted summarized. This creates an Excel table, in which I then use formulas to complete the analysis. It works great for me: I can add entries to the database, refresh the summary table, and the new entries are added and the formulas populate automatically.
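For anyone recreating this, the SQL that Microsoft Query builds behind such a connection looks roughly like the following (the sheet name Raw$ and the field names are placeholders):

SELECT `Raw$`.Category, Sum(`Raw$`.Amount) AS SumOfAmount
FROM `C:\Users\MyName\Desktop\Folder 1\Results.xlsm`.`Raw$` `Raw$`
GROUP BY `Raw$`.Category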
The problem is that no one else can refresh the table because it's looking locally for the file. The connection string is:
DSN=Excel Files;DBQ=C:\Users\MyName\Desktop\Folder 1\Results.xlsm;DefaultDir=C:\Users\MyName\Desktop\Folder 1;DriverId=1046;MaxBufferSize=2048;PageTimeout=5;
I have a very basic understanding of database connections, but I need this file to be as automated as possible at my colleagues' request. Can I fix the connection string so that the file is "flexible" and can be refreshed on any computer? Is this the best solution? If not, what else can I do that does not involve downloading additional plugins or third-party add-ins?
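For reference, making the connection portable usually means storing the workbook on a path every machine can resolve and pointing DBQ and DefaultDir there, e.g. a UNC share (the server and share names below are placeholders):

DSN=Excel Files;DBQ=\\fileserver\TeamShare\Folder 1\Results.xlsm;DefaultDir=\\fileserver\TeamShare\Folder 1;DriverId=1046;MaxBufferSize=2048;PageTimeout=5;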
If what you need is a file containing the raw data (a database) AND one or more Excel files connected to it that pick up the data from the database and work with it, then you need to split the two things. You can implement the database as an Access file located in a shared directory with an appropriate table, and reproduce the UserForm in that file so data entry happens there. Then you connect one or more Excel files to it (using connection Mode = Share Deny None, so you can update the data and work with it from the Excel files at the same time); the data will be imported into those files as tables, and there you do all the processing you need.
If one file is enough for you (you don't need the raw data in a separate database and you don't need to use the file from different locations simultaneously), and the whole problem is that the file does not work when opened from a location other than the one specified in the connection string... well, in that case (which seems to be your case) I don't see why you would use a connection to the same file at all.
If what you need is simply a table to work with, just create it by selecting the range with the data you have already entered (see "Create a table - quick start guide"), and then, when you add data through the form, instead of writing it to a "normal" row, add it to a new row of the table with something like Worksheets("name").ListObjects("table_name").ListRows.Add and write the data into that new table row.
If I have a table in Excel, populated via an external data connection, how can I refresh the data in such a way as to insert new rows for new data, but keep the old rows as well?
For example, this is my table:
Unfortunately the database that I'm working with only holds onto the current month's data, so if I refresh, I'll only get February 2011's data back. The end result I want is:
Are there any built-in Excel options that I'm missing (similar to "External Data Properties"->"Insert entire rows for new data, clear unused cells") or should I go the programmatic route and save the old data in a temp table, etc?
Since Excel external data is based on a query of the external source, Refresh will update the table to whatever is currently in that source. I think you will need to code a routine that appends the externally linked data to another sheet.