Query database columns using Excel/csv data

I have a case where I need to read an Excel/csv/text file containing two columns (say colA and colB) of values (around 1000 rows). I need to query the database using values in colA. The query will return an XMLType into which the respective colB value needs to be inserted. I have the XML query and the insert working but I am stuck on what approach I should take to read the data, query and update it on the fly.
I have tried using external tables but realized that I don't have access to the server root to host the data file. I have also considered creating a temporary table, loading the data into it with SQL*Loader or something similar, and running the query/update within the tables. But that would involve some formal overhead to get approved. I would appreciate suggestions on the approach. Examples would be greatly helpful.
e.g.
text or Excel file:
ColA,ColB
abc,123
def,456
ghi,789
XMLTypeVal e.g.
<node1><node2><node3><colA></colA><colB></colB></node3></node2></node1>
UPDATE TableA SET XMLTypeVal =
INSERTCHILDXML(XMLTypeVal,
'/node1/node2/node3', 'colBval',
XMLType('<colBval>123</colBval>'))
WHERE EXTRACTVALUE(TableA.XMLTypeVal, '/node1/node2/node3/ColA') = 'colAval';
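If a staging table is acceptable despite the overhead, the whole read-query-update cycle can collapse into one statement. A minimal sketch, assuming a hypothetical staging table STAGING_VALS(ColA, ColB) loaded from the file via SQL*Loader or a similar tool, with unique ColA values:

UPDATE TableA t
SET t.XMLTypeVal =
    (SELECT INSERTCHILDXML(t.XMLTypeVal,
                           '/node1/node2/node3', 'colBval',
                           XMLType('<colBval>' || s.ColB || '</colBval>'))
     FROM staging_vals s
     WHERE EXTRACTVALUE(t.XMLTypeVal, '/node1/node2/node3/ColA') = s.ColA)
-- the EXISTS guard leaves rows with no matching ColA untouched
WHERE EXISTS
    (SELECT 1
     FROM staging_vals s
     WHERE EXTRACTVALUE(t.XMLTypeVal, '/node1/node2/node3/ColA') = s.ColA);

This touches all ~1000 matching rows in a single pass instead of updating them one by one.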

Related

How can I manually update some table values in an Excel data model table imported from csv with Power Query

I am using Excel power query to import csv files containing transactions from a directory. That way adding a new file to the directory automatically makes it available when refreshing the query/data model. I load the table from the csv files into the data model. I do some cleaning and data transformation in the query.
However, there are some things that I can't do in the query that loads the raw data.
There may be missing data that I need to enter manually (a column missing some values)
I may need to split a transaction/row into multiple transactions/rows to categorize the parts correctly
It seems like there should be a way to do this that allows me to make my changes and not have them overwritten when I refresh the query to import new transactions.
Currently I am experimenting with creating a column with a unique id for the transaction table as part of the query, then creating an aux table in Excel related to the raw transactions by unique id. I make my changes in the aux table. Finally, I create a new table that merges the raw transactions with the aux table to produce the working transaction table. This does work for missing data or incorrect values, but it still doesn't allow me to split a row into multiple rows.
I would welcome any suggestions or references.
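For reference, the merge step described above might look roughly like this in Power Query M. This is a sketch only: the query names RawTransactions and AuxCorrections, the key column TxnID, and the columns Category / FixedCategory are all hypothetical:

let
    // Left-join the manual corrections onto the raw transactions by unique id
    Merged = Table.NestedJoin(RawTransactions, {"TxnID"}, AuxCorrections, {"TxnID"}, "Aux", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "Aux", {"FixedCategory"}),
    // Prefer the manually entered value when present, otherwise keep the raw one
    Result = Table.AddColumn(Expanded, "FinalCategory",
        each if [FixedCategory] <> null then [FixedCategory] else [Category])
in
    Result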

Excel Data Queries - Ignore missing table / assign specific table number for every query

I am having a bit of trouble creating an automated report based on an HTML file. The file contains tables with data structured from the web page, and I just create tables from the tables recognized by Excel. So far it does what I need, but sometimes one or more tables from the HTML file are missing, causing the tables to shuffle: if table 0 is missing, table 1 takes its place and breaks the entire sheet because the wrong table sits in the place of table 0.
What I wanted to know is whether there's a way to assign every query to a specific table number, so that Table 0 gets its value from the specified query, not from the first one that comes in the list of queries. The code so far is this for Power Query Editor:
let
    Source = Web.Page(File.Contents("D:\AUTO.html")),
    Data0 = Source{0}[Data]
in
    Data0
I use this code because the columns or rows will not always be the same; sometimes one can be missing, and if I use the original code that is generated when getting the data from the page, it gives errors and does not load the table when a column/row is missing.
Any help is appreciated.
MissingField.Ignore
When you use functions like Table.SelectColumns, Table.RenameColumns, or Table.ReorderColumns, you can pass the MissingField.Ignore option so that a missing field does not raise an error and stop your query.
e.g.:
= Table.SelectColumns(#"blah",{"column1", "column2", "column3"}, MissingField.Ignore)
documentation:
https://learn.microsoft.com/en-us/powerquery-m/missingfield-error
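The answer above handles missing columns. For the table-shuffling half of the question, one possible approach (not part of this answer, and assuming the HTML tables carry stable Id attributes, with "table0" as a hypothetical id) is to select each table by attribute rather than by position, with a fallback for when it is absent:

let
    Source = Web.Page(File.Contents("D:\AUTO.html")),
    // Pick the table by its Id attribute instead of its position in the list,
    // and fall back to an empty table when it is missing from the file
    Data = try Source{[Id = "table0"]}[Data] otherwise #table({}, {})
in
    Data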

How to quickly migrate from one table into another one with different table structure in the same/different cassandra?

I have one table with more than 10,000,000 records in Cassandra, but for some reason I want to build another Cassandra table with the same fields and several additional fields, and then migrate the previous data into it. At the moment the two tables are in the same Cassandra cluster.
I want to ask how to finish this task in the shortest time.
And if my new table is in a different Cassandra cluster, how would I do it?
Any advice will be appreciated!
If you just need to add blank fields to a table, then the best thing to do is use the alter table command to add the fields to the existing table. Then no copying of the data would be needed and the new fields would show up as null in the existing rows until you set them to something.
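In CQL that is one statement per new column; the keyspace, table, and column names below are placeholders:

ALTER TABLE my_keyspace.my_table ADD extra_field text;
ALTER TABLE my_keyspace.my_table ADD another_field int;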
If you want to change the structure of the data in the new table, or write it to a different cluster, then you'd probably need to write an application to read each row of the old table, transform the data as needed, and then write each row to the new location.
You could also do this by exporting the data to a csv file, write a program to restructure the csv file as needed, then import the csv file into the new location.
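In cqlsh that round trip can be done with the COPY command (table and file names below are placeholders; COPY copes with millions of rows, though it is not the fastest option at very large scale):

COPY my_keyspace.old_table TO 'old_table.csv';
-- restructure the file with your program, then load it into the new table:
COPY my_keyspace.new_table FROM 'restructured.csv';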
Another possible method would be to use Apache Spark. You'd read the existing table into an RDD, transform and filter the data into a new RDD, then save the transformed RDD to the new table. That would only work within the same cluster and would be fairly complex to set up.
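A sketch of that Spark route, using the DataFrame API of the spark-cassandra-connector rather than raw RDDs; the connector version, host, keyspace, table, and column names are all assumptions:

from pyspark.sql import SparkSession, functions as F

# Assumes the connector is on the classpath, e.g. launched with
#   spark-submit --packages com.datastax.spark:spark-cassandra-connector_2.12:3.4.1
spark = (SparkSession.builder
         .config("spark.cassandra.connection.host", "127.0.0.1")  # placeholder host
         .getOrCreate())

# Read the old table, then add one of the new fields with a default value
old = (spark.read.format("org.apache.spark.sql.cassandra")
       .options(keyspace="my_ks", table="old_table")
       .load())
new = old.withColumn("extra_field", F.lit(None).cast("string"))

# Append the transformed rows into the new table
(new.write.format("org.apache.spark.sql.cassandra")
 .options(keyspace="my_ks", table="new_table")
 .mode("append")
 .save())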

How can I insert data from Excel file to Oracle using INSERT INTO SQL statement?

I have created types using Oracle objects and created a table
CREATE OR REPLACE TYPE OttawaAddress_Ty AS OBJECT
(StrtNum NUMBER(9),
Street VARCHAR2(20),
City VARCHAR2(15),
Province CHAR(2),
PostalCode CHAR(7));
/
CREATE OR REPLACE TYPE OttawaOfficesInfo_Ty AS OBJECT
(Name VARCHAR(35),
OfficeID VARCHAR2(2),
Phone VARCHAR2(15),
Fax CHAR(15),
Email CHAR(30));
/
CREATE TABLE OttawaOffices
(OfficeAddress OttawaAddress_Ty,
OfficeInfo OttawaOfficesInfo_Ty,
Longitude_DMS NUMBER (10,7),
Latitude_DMS NUMBER (10,7),
SDO_GEOMETRY MDSYS.SDO_GEOMETRY);
I have an Excel file which holds the data and I need to import to this Oracle table using INSERT INTO SQL statements. How can I do this? As you can notice, I have a column called SDO_GEOMETRY which will hold the Decimal Degrees of the records. These decimal degrees are saved in two separate columns in my Excel file.
I am not sure if I can programmatically insert the values from Excel, or whether I need to go through every record and create
INSERT INTO ... VALUES ... statements. And if so, how do I add values when I have created types?
Oracle has a really neat feature called External Tables. These look like regular tables from inside the database, so we can execute SELECT statements against them. The trick is that the table's data comes from OS files (hence "external"). We just define the table to have the same structure as the spreadsheet's columns. It doesn't work with Excel binary format but it does work for CSV files (so Save As ... first).
The advantage of external tables is that manipulating data is easy in SQL - it's what it does best - and we don't need to load anything into a staging table. Something like
insert into OttawaOffices
select ottawaaddress_ty(ext.strtnum, ext.street, ext.city, ext.province, ext.postalcode)
     , ottawaofficesinfo_ty(ext.name, ext.officeid, ext.phone, ext.fax, ext.email)
     , ext.longitude
     , ext.latitude
     -- 2001 = two-dimensional point; use the SRID that matches your coordinate system
     , mdsys.sdo_geometry(2001, 4326, mdsys.sdo_point_type(ext.longitude, ext.latitude, null), null, null)
from your_external_table ext
/
The limitation of external tables is the need to get the source file onto the database server and create a database directory object. Some places are funny about this.
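For completeness, a minimal sketch of what the external table itself might look like, assuming the spreadsheet layout above, a CSV with one header row, and a directory object pointing at the folder that holds a file called offices.csv (all names and the path are hypothetical):

create or replace directory data_dir as '/path/on/db/server';

create table your_external_table (
  strtnum    number(9),
  street     varchar2(20),
  city       varchar2(15),
  province   char(2),
  postalcode char(7),
  name       varchar2(35),
  officeid   varchar2(2),
  phone      varchar2(15),
  fax        char(15),
  email      char(30),
  longitude  number(10,7),
  latitude   number(10,7)
)
organization external (
  type oracle_loader
  default directory data_dir
  access parameters (
    records delimited by newline
    skip 1
    fields terminated by ','
  )
  location ('offices.csv')
);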
Anyway, find out more in the Oracle documentation.
I'm not going to pass judgement on declaring table columns as user-defined types. It's usually considered bad practice, but maybe it works in your use case.

Create a Volatile table in teradata

I have a SharePoint list which I have linked to in MS Access.
The information in this table needs to be compared to information in our data warehouse, based on keys both sets of data have.
I want to be able to create a query which will upload the ishare data into our data warehouse under my login, run the comparison, and then export the details to Excel somewhere. MS Access seems to be the way to go here.
I have managed to link the ishare list (with difficulties due to the attachment fields) and then create a local table based on it.
I have managed to create the temp table in my Volatile space.
How do I append the newly created table that I created from the list into my temporary space?
I am using Access 2010 and SharePoint 2007.
Thank you for your time
If you can avoid using Access I'd recommend it since it is an extra step for what you are trying to do. You can easily manipulate or mesh data within the Teradata session and export results.
You can run the following types of queries using the standard Teradata SQL Assistant:
CREATE VOLATILE TABLE NewTable (
column1 DEC(18,0),
column2 DEC(18,0)
)
PRIMARY INDEX (column1)
ON COMMIT PRESERVE ROWS;
Change your assistant to Import Mode (File -> Import Data), then run:
INSERT INTO NewTable VALUES (?,?);
Browse for your file; this example expects a comma-delimited file with two numeric columns, column one being the index.
You can now query this table or join it to any other information in the database you are connected to.
When you are finished you can drop with:
DROP TABLE NewTable
You can export results using File->Export Data as well.
If this is something you plan on running frequently, there are many ways to easily do these types of imports and exports. The Python module pandas has simple functionality for reading a query directly into DataFrame objects and dropping those objects into Excel, through the pandas.io.sql.read_frame() and .to_excel() functions.
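A rough sketch of that pandas round trip, using the current pandas.read_sql API rather than the older read_frame, and a hypothetical ODBC DSN for the Teradata connection:

import pandas as pd
import pyodbc  # assumption: any DB-API connection to Teradata works here

# Connect through a pre-configured ODBC data source (DSN name is made up)
conn = pyodbc.connect("DSN=my_teradata_dsn")
# Read the query result straight into a DataFrame ...
df = pd.read_sql("SELECT column1, column2 FROM NewTable", conn)
# ... and drop it into an Excel workbook
df.to_excel("results.xlsx", index=False)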
