I am trying to load an .xlsx file into an Oracle database table, and I am getting an error from my code. I normally use this code for .csv files but need to use it for .xlsx. I have edited my field names, table names, etc.
Is this possible?
Drop TABLE Temp_Info;
CREATE TABLE Temp_Info
(
Unique_Id varchar2(255) ,
Name varchar2(255),
Alt_Name varchar2(255)
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY SEPA_FILES
ACCESS PARAMETERS
(
records delimited by newline
skip 1
fields terminated by ','
missing field values are null
(
Unique ID -(filled automatically),Name,Alt Name
)
)
LOCATION ('Data_File.xlsx')
)
REJECT LIMIT UNLIMITED;
Select * From Temp_Info a;
Error Message: 9:16:55 ORA-29913: error in executing ODCIEXTTABLEOPEN callout
9:16:55 ORA-29400: data cartridge error
9:16:55 KUP-00554: error encountered while parsing access parameters
9:16:55 KUP-01005: syntax error: found "identifier": expecting one of: "comma, char, date, defaultif, decimal, double, float, integer, (, nullif, oracle_date, oracle_number, position, raw, recnum, ), unsigned, varrawc, varchar, varraw, varcharc, zoned"
9:16:55 KUP-01008: the bad identifier was: ID
9:16:55 KUP-01007: at line 6 column 30
9:16:55 ORA-06512: at "SYS.ORACLE_LOADER", line 14
9:16:55 ORA-06
Option 1 (pure PL/SQL)
An .xlsx document is a zipped set of XML documents. You can change the extension from .xlsx to .zip, unzip it, and find out what is inside.
Here is a description of how to deal with an .xlsx document in an Oracle environment.
This solution works, but the implementation is very painful.
Option 2 (PL/SQL + Apache POI)
Create the implementation in Java and use it in the database.
Option 3 (convert .xlsx to .csv)
Convert the file to .csv and keep your existing external table approach (see the sketch below).
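A minimal sketch of option 3, assuming the workbook is exported to Data_File.csv (keeping the header row) in the same SEPA_FILES directory with the three columns in the same order; the field list mirrors the table columns, so no identifiers with spaces are needed:
DROP TABLE Temp_Info;
CREATE TABLE Temp_Info
(
  Unique_Id varchar2(255),
  Name      varchar2(255),
  Alt_Name  varchar2(255)
)
ORGANIZATION EXTERNAL
(
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY SEPA_FILES
  ACCESS PARAMETERS
  (
    records delimited by newline
    skip 1
    fields terminated by ','
    missing field values are null
    (
      Unique_Id,
      Name,
      Alt_Name
    )
  )
  LOCATION ('Data_File.csv')
)
REJECT LIMIT UNLIMITED;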
You might want to take a look at ExcelTable SQL interface (Disclaimer : I'm the author).
It provides access to .xlsx (or .xlsm) files as an external table.
Here's an example based on your external table definition:
SELECT t.*
FROM TABLE(
ExcelTable.getRows(
ExcelTable.getFile('SEPA_FILES','Data_File.xlsx')
, 'Sheet_name_goes_here'
, ' "UNIQUE_ID" for ordinality
, "NAME" varchar2(255)
, "ALT_NAME" varchar2(255) '
, 'A2'
)
) t ;
(I assumed UNIQUE_ID is some kind of autogenerated sequence)
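And if the end goal is to land those rows in a regular (non-external) Temp_Info table, the same ExcelTable query can feed an INSERT directly; a sketch, assuming the table has the three columns from the question and the sheet name is filled in:
INSERT INTO Temp_Info (Unique_Id, Name, Alt_Name)
SELECT t.unique_id, t.name, t.alt_name
FROM TABLE(
       ExcelTable.getRows(
         ExcelTable.getFile('SEPA_FILES','Data_File.xlsx')
       , 'Sheet_name_goes_here'
       , ' "UNIQUE_ID" for ordinality
         , "NAME" varchar2(255)
         , "ALT_NAME" varchar2(255) '
       , 'A2'
       )
     ) t;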
Related
There is a large .xls file that needs to be imported into the database. I work in PL/SQL Developer 12. I first created an empty table in the database, for which I used two query variants.
First, like this:
CREATE TABLE vz_117390_in_garant (
COLUMN1 VARCHAR2(38), /*--1 Number of the contract*/
... ... ...
COLUMN5 DATE, /*--5 Date of conclusion of the contract*/
... ... ...
COLUMN12 TIMESTAMP, /*-12 Installation date*/
... ... ...
COLUMN28 NUMBER(38, 4), /*-28 Maximum power*/
... ... ...
COLUMN38 VARCHAR2(128)) /*-38 Note*/
;
When that didn't work, then like this:
CREATE TABLE vz_117390_in_garant (
COLUMN1 VARCHAR2(38), /*--1 Number of the contract*/
... ... ...
COLUMN5 VARCHAR2(10), /*--5 Date of conclusion of the contract*/
... ... ...
COLUMN12 VARCHAR2(19), /*-12 Installation date*/
... ... ...
COLUMN28 VARCHAR2(38), /*-28 Maximum power*/
... ... ...
COLUMN38 VARCHAR2(128)) /*-38 Note*/
;
I tried to do the import in two ways:
By copying directly from the Excel file into the table in edit mode (bearing in mind that, for correct copying from an xls sheet, you need to add a leftmost column with the row numbers to it);
Using the ODBC Importer tool in PL/SQL Developer.
With either method, I get a message of this type:
I tried to deal with it like this:
Set the NLS_DATE_FORMAT parameter:
ALTER SESSION SET NLS_DATE_FORMAT = 'DD.MM.YYYY';
Replaced the cell format in the source file from "General" to "Text";
Used the second table definition above instead of the first, in order to avoid the DATE data type.
However, nothing changed. I would be grateful for any suggestions on how to solve this.
I use the UDF.Javascript function to process the message, and after converting to a JSON object I see the UDF.Javascript alias name getting added to the JSON:
{"Device":{"deviceId":"DJT3COE4","productFilter":"pcmSensor","SignalDetails":[{"Devicevalue":"72.04","DisplayName":"Valve Open Status","Description":"Machine Valve Open State Information","DataType":"BOOLEAN","Precision":"undefined","DefaultUoM":"undefined"},{"Devicevalue":"2.7","DisplayName":"Temperature","Description":"Temperature Sensor Reading","DataType":"TEMPERATURE","Precision":"2","DefaultUoM":"DEG_CELSIUS"},{"Devicevalue":"2.99","DisplayName":"Location","Description":"Location","DataType":"LOCATION","Precision":"undefined","DefaultUoM":"LAT_LONG"},{"Devicevalue":"15","DisplayName":"Valve Control","Description":"On / Off control","DataType":"BOOLEAN","Precision":"undefined","DefaultUoM":"undefined"}]}}
I want to remove the alias name ({"Device":) from the JSON.
Maybe you could use WITH ... AS ... in your SQL; please see the example below:
WITH
c AS
(
    SELECT
        udf.processArray(input) AS processarray
    FROM input
)
SELECT
    c.processarray.item, c.processarray.name
INTO
    output
FROM
    c
My columns are very few; you need to define all of your columns, which is a little bit tedious, but it does work. Please have a try.
I have a BUNCH of fixed-width text files that contain multiple transaction types, with only 3 that I care about (121, 122, 124).
Sample File:
D103421612100188300000300000000012N000002000001000032021420170012260214201700122600000000059500000300001025798
D103421612200188300000300000000011000000000010000012053700028200004017000000010240000010000011NNYNY000001000003N0000000000 00
D1034216124001883000003000000000110000000000300000100000000000CS00000100000001200000033NN0 00000001200
So what I need to do is read these files line by line and look for the lines that have 121, 122, or 124 at startIndex = 9 with length = 3.
Each line needs to be parsed based on a data dictionary I have and the output needs to be grouped by transaction type into three different files.
I have a process that works but it's very inefficient, basically reading each line 3 times. The code I have is something like this:
@t121 =
    EXTRACT
        col1 string,
        col2 string,
        col3 string // etc...
    FROM "inputFile"
    USING new MyCustomExtractor(
        new SQL.MAP<string, string> {
            {"col1","2"},
            {"col2","6"},
            {"col3","3"} // etc...
        }
    );

OUTPUT @t121
TO "121.csv"
USING Outputters.Csv();
And I have the same code for 122 and 124. My custom extractor takes the SQL MAP and returns the parsed line and skips all lines that don't contain the transaction type I'm looking for.
This approach also means I'm running through all the lines in a file 3 times. Obviously this isn't as efficient as it could be.
What I'm looking for is a high level concept of the most efficient way to read a line, determine if it is a transaction I care about, then output to the correct file.
Thanks in advance.
How about pulling out the transaction type early using the Substring method of the String datatype? Then you can do some work with it, filtering etc. A simple example:
// Test data
@input = SELECT *
FROM (
VALUES
( "D103421612100188300000300000000012N000002000001000032021420170012260214201700122600000000059500000300001025798" ),
( "D103421612200188300000300000000011000000000010000012053700028200004017000000010240000010000011NNYNY000001000003N0000000000 00" ),
( "D1034216124001883000003000000000110000000000300000100000000000CS00000100000001200000033NN0 00000001200" ),
( "D1034216999 0000000000000000000000000000000000000000000000000000000000000000000000000000000 00000000000" )
) AS x ( rawData );
// Pull out the transaction type
@working =
SELECT rawData.Substring(8,3) AS transactionType,
rawData
FROM @input;
// !!TODO do some other work here
@output =
SELECT *
FROM @working
WHERE transactionType IN ("121", "122", "124"); //NB Note the case-sensitive IN clause
OUTPUT @output TO "/output/output.csv"
USING Outputters.Csv();
As of today, there is no specific U-SQL function that can define the output location of a tuple on the fly.
wBob presented an approach to a potential workaround. I'd extend the solution the following way to address your need:
Read the entire file, adding a new column that helps you identify the transaction type.
Create 3 rowsets (one for each file) using a WHERE statement with the specific transaction type (121, 122, 124) on the column created in the previous step.
Output each rowset created in the previous step to its own file (see the sketch below).
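A minimal sketch of those three steps; the input/output paths, the single-column raw schema, and the built-in Text extractor are assumptions for illustration (your MyCustomExtractor and data dictionary would plug in where the parsing happens):
@input =
    EXTRACT rawData string
    FROM "/input/transactions.txt"
    USING Extractors.Text(delimiter : '\t', quoting : false); // no tabs in the data, so each line lands in one column

// Step 1: read the file once and derive the transaction type
@working =
    SELECT rawData.Substring(8, 3) AS transactionType,
           rawData
    FROM @input;

// Step 2: one rowset per transaction type
@t121 = SELECT rawData FROM @working WHERE transactionType == "121";
@t122 = SELECT rawData FROM @working WHERE transactionType == "122";
@t124 = SELECT rawData FROM @working WHERE transactionType == "124";

// Step 3: one output file per rowset
OUTPUT @t121 TO "/output/121.csv" USING Outputters.Csv();
OUTPUT @t122 TO "/output/122.csv" USING Outputters.Csv();
OUTPUT @t124 TO "/output/124.csv" USING Outputters.Csv();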
If you have more feedback or needs, feel free to create an item (and voting for others) on our UserVoice site: https://feedback.azure.com/forums/327234-data-lake. Thanks!
I have a CSV file with some integer columns where empty values are currently saved as "" (an empty string).
I want to COPY them into a table as NULL values.
With Java code, I have tried these:
String sql = "COPY " + tableName + " FROM STDIN (FORMAT csv,DELIMITER ',', HEADER true)";
String sql = "COPY " + tableName + " FROM STDIN (FORMAT csv,DELIMITER ',', NULL '' HEADER true)";
I get: PSQLException: ERROR: invalid input syntax for type numeric: ""
String sql = "COPY " + tableName + " FROM STDIN (FORMAT csv,DELIMITER ',', NULL '\"\"' HEADER true)";
I get: PSQLException: ERROR: CSV quote character must not appear in the NULL specification
Has anyone done this before?
I assume you are aware that numeric data types have no concept of an "empty string" (''). A value is either a number or NULL (or 'NaN' for numeric, but not for integer et al.).
It looks like you exported from a string data type like text and had some actual empty strings in there, which are now represented as "" (" being the default QUOTE character in CSV format).
NULL would be represented by nothing, not even quotes. The manual:
NULL
Specifies the string that represents a null value. The default is \N
(backslash-N) in text format, and an unquoted empty string in CSV format.
You cannot define "" to generally represent NULL since that already represents an empty string. Would be ambiguous.
To fix, I see two options:
Edit the CSV file / stream before feeding it to COPY and replace "" with nothing. This might be tricky if you have actual empty strings in there as well, or "" escaping a literal " inside strings.
(What I would do.) Import to an auxiliary temporary table with identical structure except for the integer column converted to text. Then INSERT (or UPSERT?) to the target table from there, converting the integer value properly on the fly:
-- empty temp table with identical structure
CREATE TEMP TABLE tbl_tmp AS TABLE tbl LIMIT 0;
-- ... except for the int / text column
ALTER TABLE tbl_tmp ALTER col_int TYPE text;
COPY tbl_tmp ...;
INSERT INTO tbl -- identical number and names of columns guaranteed
SELECT col1, col2, NULLIF(col_int, '')::int -- list all columns in order here
FROM tbl_tmp;
Temporary tables are dropped at the end of the session automatically. If you run this multiple times in the same session, either just truncate the existing temp table or drop it after each transaction.
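For example, a trivial cleanup between runs in the same session:
TRUNCATE tbl_tmp;
-- or drop it and recreate it on the next run
DROP TABLE tbl_tmp;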
Related:
How to update selected rows with values from a CSV file in Postgres?
Rails Migrations: tried to change the type of column from string to integer
postgresql thread safety for temporary tables
Since Postgres 9.4 you have the ability to use FORCE_NULL. This causes a quoted empty string to be converted into NULL. It is very handy, especially with CSV files (in fact, this option is only allowed when using CSV format).
The syntax is as follows:
COPY table FROM '/path/to/file.csv'
WITH (FORMAT CSV, DELIMITER ';', FORCE_NULL (columnname));
Further details are explained in the documentation: https://www.postgresql.org/docs/current/sql-copy.html
If you want to replace all blank and empty values with NULL, you just have to add EMPTYASNULL and BLANKSASNULL to the COPY command (note that these are Amazon Redshift COPY options).
Syntax:
copy Table_name (columns_list)
from 's3://{bucket}/{s3_bucket_directory_name + manifest_filename}'
iam_role '{REDSHIFT_COPY_COMMAND_ROLE}' emptyasnull blanksasnull
manifest DELIMITER ',' IGNOREHEADER 1 compupdate off csv gzip;
Note: this applies to all records that contain empty/blank values.
I need to import certain information from an Excel file into an Access DB and in order to do this, I am using DAO.
The user gets the excel source file from a system, he does not need to directly interact with it. This source file has 10 columns and I would need to retrieve only certain records from it.
I am using this to retrieve all the records:
Set destinationFile = CurrentDb
Set dbtmp = OpenDatabase(sourceFile, False, True, "Excel 8.0;")
DoEvents
Set rs = dbtmp.OpenRecordset("SELECT * FROM [EEX_Avail_Cap_ALL_DEU_D1_S_Y1$A1:J65536]")
My problem comes when I want to retrieve only certain records using a WHERE clause. The name of the field where I want to apply the clause is 'Date (UCT)' (remember that the user gets this source file from another system), and I cannot get the WHERE clause to work on it. If I apply the WHERE clause to another field, whose name does not have parentheses or spaces, then it works. Example:
Set rs = dbtmp.OpenRecordset("SELECT * FROM [EEX_Avail_Cap_ALL_DEU_D1_S_Y1$A1:J65536] WHERE Other = 12925")
The previous instruction will retrieve only the number of records where the field Other has the value 12925.
Could anyone please tell me how can I achieve the same result but with a field name that has spaces and parenthesis i.e. 'Date (UCT)' ?
Thank you very much.
Octavio
Try enclosing the field name in square brackets:
SELECT * FROM [EEX_Avail_Cap_ALL_DEU_D1_S_Y1$A1:J65536] WHERE [Date (UCT)] = 12925
or if it's a date we are looking for:
SELECT * FROM [EEX_Avail_Cap_ALL_DEU_D1_S_Y1$A1:J65536] WHERE [Date (UCT)] = #02/14/13#;
To use a date literal you must enclose it in # characters and write the date in MM/DD/YY format, regardless of any regional settings on your machine.