I am trying to load a CSV blob into SSIS with a Flat File destination.
You can see in my source editor that the columns are appearing (so it is reading my CSV blob file).
When I run SSIS, it returns me these errors:
[Azure Blob Source] Error: The remote server returned an error: (400) Bad Request.
Warning: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED. The Execution method succeeded, but the number of errors raised (2) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors.
What am I doing wrong here?
I have to do some data engineering by reading manifest.cdm.json files from a data lake,
adding a pipeline run ID column, and pushing the data to a SQL database.
I have one JSON list file which has the parameters required to read each CDM JSON file in the source of a dataflow.
Previous approach: I used a ForEach activity and passed parameters to a dataflow with a single activity, then captured errors. But using a dataflow inside ForEach costs too much.
Current approach: I manually created a dataflow with all the CDM files. But here I am not able to capture errors: if any source gets an error, the whole dataflow activity fails, and if I select "skip error" on the dataflow activity I do not get any error at all.
So what should be the approach to capture errors with the current setup?
You can capture the error using the Set Variable activity in Azure Data Factory.
Use the expression below to capture the error message in a Set Variable activity:
@activity('Data Flow1').error.message
Later you can store the error message in blob storage for future reference using a Copy activity. In the example below we save the error message to a .csv file using a DelimitedText dataset.
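For reference, a Set Variable activity wired to the failure path of the data flow might look like this in pipeline JSON (the activity and variable names here are illustrative, not taken from the original pipeline):

```json
{
    "name": "Capture Data Flow error",
    "type": "SetVariable",
    "dependsOn": [
        { "activity": "Data Flow1", "dependencyConditions": [ "Failed" ] }
    ],
    "typeProperties": {
        "variableName": "ErrorMessage",
        "value": {
            "value": "@activity('Data Flow1').error.message",
            "type": "Expression"
        }
    }
}
```

The "Failed" dependency condition is what lets the pipeline continue into the error-capture branch instead of failing outright.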
I am running a Python script that reads a lot of parquet files from an S3 bucket and inserts the dataframes into Redshift. However, the errors "S3 Query Exception file has an invalid version number" and "InvalidRange The requested range is not satisfiable" happen frequently; I say "frequently" because they do not occur on every run, but they do on most executions.
For each insertion I commit the changes, and when the read process ends I close the cursor and the connection. This started when I updated my script: the old version read all the files and then inserted into Redshift, while the new one reads one file at a time and inserts its data. What may be causing this problem?
I have an Excel file that has data; the first row holds the column names. I am trying, through SSIS, to import these data into a database. In the edit mappings I set the destination columns to NVARCHAR(MAX). All the columns are imported successfully into SQL Server, but one column produces the error below. I have tried editing the "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel" registry key from 8 to 64, but I am still getting the same error. I have also followed the instructions from this post: Text was truncated or one or more characters had no match in the target code page When importing from Excel file
with no luck.
The error that I am getting:
"Executing (Error)
Messages
Error 0xc020901c: Data Flow Task 1: There was an error with output column "MyColumn,M" (63) on output "Excel Source Output" (9). The column status returned was: "Text was truncated or one or more characters had no match in the target code page.".
(SQL Server Import and Export Wizard)
Error 0xc020902a: Data Flow Task 1: The "output column "MyColumn,M" (63)" failed because truncation occurred, and the truncation row disposition on "output column "MyColumn,M" (63)" specifies failure on truncation. A truncation error occurred on the specified object of the specified component.
(SQL Server Import and Export Wizard)
Error 0xc0047038: Data Flow Task 1: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component "Source - Page1$" (1) returned error code 0xC020902A. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.
(SQL Server Import and Export Wizard)
"
I'm following this tutorial http://azure.microsoft.com/en-us/documentation/articles/hdinsight-use-hive/ but have become stuck when changing the source of the query to use a file.
It all works happily when using New-AzureHDInsightHiveJobDefinition -Query $queryString but when I try New-AzureHDInsightHiveJobDefinition -File "/example.hql" with example.hql stored in the "root" of the blob container I get ExitCode 40000 and the following in standarderror:
Logging initialized using configuration in file:/C:/apps/dist/hive-0.11.0.1.3.7.1-01293/conf/hive-log4j.properties
FAILED: ParseException line 1:0 character 'Ã?' not supported here
line 1:1 character '»' not supported here
line 1:2 character '¿' not supported here
Even when I deliberately misspell the hql filename, the above error is still generated along with the expected file-not-found error, so it is not the content of the hql that is causing the error.
I have not been able to find hive-log4j.properties in the blob store to see whether it is corrupt. I have torn down the HDInsight cluster, deleted the associated blob store, and started again, but ended up with the same result.
Would really appreciate some help!
I am able to induce a similar error by putting a UTF-8 or Unicode encoded .hql file into blob storage and attempting to run it. Try saving your example.hql file as 'ANSI' in Notepad (Open, then Save As; the encoding option is at the bottom of the dialog), then copy it to blob storage and try again.
If the file is not found by Start-AzureHDInsightJob, that cmdlet errors out and does not return a new AzureHDInsightJob object. If you had a previous instance of the result saved, the subsequent Wait-AzureHDInsightJob and Get-AzureHDInsightJobOutput would be referring to the previous run, giving the illusion of the same error in the file-not-found case. That error should definitely indicate a problem reading a UTF-8 or Unicode file when one is not expected.
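The characters Hive rejects at line 1 are the three bytes of a UTF-8 byte-order mark (EF BB BF) displayed through a single-byte code page. As an alternative to re-saving the file in Notepad, a small helper (hypothetical, not part of the tutorial) can strip a leading BOM before you upload the script:

```python
# Strip a leading UTF-8 BOM (bytes EF BB BF) from a file in place,
# so Hive's parser never sees the stray bytes at line 1.
BOM = b"\xef\xbb\xbf"

def strip_bom(path):
    """Remove a leading UTF-8 BOM from the file; return True if one was found."""
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(BOM):
        with open(path, "wb") as f:
            f.write(data[len(BOM):])
        return True
    return False
```

Run this against example.hql before copying it to the blob container; if it returns False the file had no BOM and the problem lies elsewhere.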
I am facing a problem in converting a large Oracle table to a SAS dataset. I did this earlier and the method worked. However, this time, it is giving me the following error messages.
SAS code:
option compress = yes;
libname sasdata ".";
libname myora oracle user=scott password=tiger path=XYZDATA ;
data sasdata.expt_tabl;
set myora.expt_tabl;
run;
Log file:
You are running SAS 9. Some SAS 8 files will be automatically converted
by the V9 engine; others are incompatible. Please see
http://support.sas.com/rnd/migration/planning/platform/64bit.html
PROC MIGRATE will preserve current SAS file attributes and is
recommended for converting all your SAS libraries from any
SAS 8 release to SAS 9. For details and examples, please see
http://support.sas.com/rnd/migration/index.html
This message is contained in the SAS news file, and is presented upon
initialization. Edit the file "news" in the "misc/base" directory to
display site-specific news and information in the program log.
The command line option "-nonews" will prevent this display.
NOTE: SAS initialization used:
real time 1.63 seconds
cpu time 0.03 seconds
1 option compress = yes;
2 libname sasdata ".";
NOTE: Libref SASDATA was successfully assigned as follows:
Engine: V9
Physical Name: /******/dibyendu
3 libname myora oracle user=scott password=XXXXXXXXXX path=XYZDATA ;
NOTE: Libref MYORA was successfully assigned as follows:
Engine: ORACLE
Physical Name: XYZDATA
4 data sasdata.expt_tabl;
5 set myora.expt_tabl;
6 run;
NOTE: There were 6422133 observations read from the data set MYORA.EXPT_TABL.DATA.
ERROR: Expecting page 1, got page -1 instead.
ERROR: Page validation error while reading SASDATA.EXPT_TABL.DATA.
ERROR: Expecting page 1, got page -1 instead.
ERROR: Page validation error while reading SASDATA.EXPT_TABL.DATA.
ERROR: File SASDATA.EXPT_TABL.DATA is damaged. I/O processing did not complete.
NOTE: The data set SASDATA.EXPT_TABL.DATA has 6422133 observations and 49 variables.
ERROR: Expecting page 1, got page -1 instead.
ERROR: Page validation error while reading SASDATA.EXPT_TABL.DATA
ERROR: Expecting page 1, got page -1 instead.
ERROR: Page validation error while reading SASDATA.EXPT_TABL.DATA.
ERROR: Expecting page 1, got page -1 instead.
2 The SAS System 21:40 Monday, April 1, 2013
ERROR: Page validation error while reading SASDATA.EXPT_TABL.DATA.
ERROR: Expecting page 1, got page -1 instead.
ERROR: Page validation error while reading SASDATA.EXPT_TABL.DATA.
NOTE: Compressing data set SASDATA.EXPT_TABL.DATA decreased size by 78.88 percent.
Compressed is 37681 pages; un-compressed would require 178393 pages.
ERROR: File SASDATA.EXPT_TABL.DATA is damaged. I/O processing did not complete.
NOTE: SAS set option OBS=0 and will continue to check statements. This might cause NOTE: No observations in data set.
NOTE: DATA statement used (Total process time):
real time 8:55.98
cpu time 1:39.33
7
ERROR: Errors printed on pages 1,2.
NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
NOTE: The SAS System used:
real time 8:58.67
cpu time 1:39.40
This is running on a RH Linux Server.
Any suggestion will be appreciated.
Thanks and regards,
This sounds like a space issue on your server. How large is the file system in your default directory (from your libname sasdata '.'; statement)? Use the data set option obs=1 on your Oracle table reference to create a new SAS dataset with one row and inspect the variables.
data sasdata.dummy_test;
set myora.expt_tabl(obs=1);
run;
Perhaps there are extremely large VARCHAR or BLOB columns that are consuming too much space. Remember that SAS does not have a VARCHAR type.
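To rule the space angle in or out first, the free space on the file system backing the libname "." directory can be checked directly on the Linux server (a standard df invocation, nothing SAS-specific; the log shows the compressed dataset alone needed 37681 pages):

```shell
# Show free space on the file system backing the current directory,
# where the sasdata libref writes the output dataset.
df -h .
```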
Though I am not totally sure, I believe the main issue was that I was initially trying to create/write the dataset in a directory that was restricted in some (?) sense. This was indirectly causing trouble, since the dataset created was defective. When I created it elsewhere, it was okay.
Thanks and regards,
Dibyendu