I have an Excel file with a date field, but the first row in the file is blank and a few other rows have the date in the format MM/dd/yyyy HH:mm:ss.
The data has to be loaded into a PostgreSQL table where the field is of data type timestamp (yyyy-MM-dd HH:mm:ss).
The Excel file cannot be modified, as it is downloaded from the cloud and the data is loaded straight into the table.
I tried using tConvertType, but it cannot accept null or "" values in a timestamp, and I get a null tMap error at runtime in Talend. Even if I convert from String to Date in order to pass null through the tMap, it changes the date format and throws an error. How can this be handled?
The Talend job structure is: tFileInputExcel -> tMap (date field: MM/dd/yyyy HH:mm:ss) -> tConvertType (date field: yyyy-MM-dd HH:mm:ss) -> tMap (yyyy-MM-dd HH:mm:ss) -> PostgreSQL table
Here is the Excel screenshot:
First of all, I do not quite understand why you want to use the tConvertType component. After defining a proper schema, Talend turns your data into a Java Date object, and from that moment on the format no longer matters; you don't have to convert it again to load it into the Postgres table. At the very least, it should not cause a NullPointerException.
Consider the following steps:
Sample input file
I've prepared a file with a date value, a space, and an empty string; the solution I'm describing also works with nulls.
Configure tFileInputExcel component
You have to allow null values by checking the Nullable check box for the date column. You should also check the trim option.
Examine output
After connecting the input component to tLogRow, the null/empty/space values are handled properly.
I hope this will be helpful.
You can capture the date format and null handling in a variable within the tMap component,
that is:
var: TalendDate.formatDate("yyyy-MM-dd HH:mm:ss", row1.columnname)
so the data flow would be:
tFileInputExcel -> tMap -> PostgreSQL table
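If the incoming column is a String, a null-safe variation of that variable could look like the sketch below (this is only a sketch: it assumes the input column row1.columnname is a String and the target column is a Date, so TalendDate.parseDate is used instead of formatDate):
// return null for empty cells, otherwise parse the Excel text into a java.util.Date
row1.columnname == null || row1.columnname.trim().isEmpty()
    ? null
    : TalendDate.parseDate("MM/dd/yyyy HH:mm:ss", row1.columnname)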
I am facing a very simple problem: I am not able to change the data type of an additional column in the Copy activity of an ADF pipeline from String to Datetime.
I am trying to change the source data type for the additional column in the mapping using JSON, but it still doesn't work with the PolyBase command.
When I run my pipeline it gives the same error.
Is it not possible to change the data type of an additional column? By default it takes String only.
Dynamic columns return strings.
Try to put the value [e.g. utcnow()] in the dynamic content of the query and cast it to the required target data type.
Otherwise you can use a Data Flow derived column:
https://learn.microsoft.com/en-us/azure/data-factory/data-flow-derived-column
Since your source is a query, you can choose to bring the current date in the source SQL query itself, in the desired format, rather than adding it as an additional column.
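A rough T-SQL sketch of that idea (table and column names here are placeholders):
-- add the load timestamp in the source query itself, already typed as datetime2
SELECT t.*,
       CAST(GETUTCDATE() AS datetime2) AS load_timestamp
FROM dbo.SourceTable AS t;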
Thanks
Try to use formatDateTime as shown below and define the desired Date format:
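A minimal sketch of such an expression, assuming the value being formatted is the current UTC time:
@formatDateTime(utcnow(), 'yyyy-dd-MM')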
Here, since the format given is 'yyyy-dd-MM', a date such as 25 January 2022 comes out as 2022-25-01.
Note: the output here will be of string type only, as in the Copy activity we cannot cast the data type to date.
We can either create the current date in the source SQL query or use the approach above, so that the data is loaded into the sink in the expected format.
I'm trying to convert a varchar value into a date.
The varchar value is 99/01/30 and the result should be 1999-01-30.
Would somebody help me with this?
I'm trying this:
select convert(date, convert(varchar, '99/01/30', 11), 23) as date
but it gives me an error: Conversion failed when converting date and/or time from character string.
This
select convert(date, '99/01/30', 2) as date
will get you what you want. See the documentation for CAST and CONVERT for more information about what the style value 2 means.
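A quick illustrative sketch (my assumption: SQL Server treats /, - and . interchangeably as date separators here, so the Japan style 11, yy/mm/dd, parses the same value as the ANSI style 2, yy.mm.dd):
select convert(date, '99/01/30', 2)  as ansi_style,  -- 1999-01-30
       convert(date, '99/01/30', 11) as japan_style  -- 1999-01-30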
I have a column ABC where the timestamp is in the format dd/MM/yyyy HH:mm:SS (11/04/2020 1:17:40). I want to create another column ABC_NEW with the same data as the old column, but in a different timestamp format, 'yyyy-MM-dd HH:mm:SS'. I tried doing this in an Azure Data Factory derived column using
toTimestamp(column_name, 'yyyy-MM-dd HH:mm:SS'), but it did not work; the result comes out as NULL. Can anyone help?
It's a two-step process. You first need to tell ADF what each field in your timestamp column represents; then you can use string conversions to manipulate that timestamp into the output string you want:
toString(toTimestamp('11/04/2020 1:17:40','MM/dd/yyyy HH:mm:ss'),'yyyy-MM-dd HH:mm:SS')
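Applied to the actual column rather than a literal, the derived-column expression would look roughly like this (assuming the source column is named ABC and its values follow the MM/dd/yyyy pattern used above):
toString(toTimestamp(ABC, 'MM/dd/yyyy HH:mm:ss'), 'yyyy-MM-dd HH:mm:ss')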
Data Factory doesn't support the date format 'dd/MM/yyyy', so we cannot convert it to 'yyyy-MM-dd' directly.
I use a Derived Column to generate a new column ABC_NEW from the original column DateTime and enter the expression below:
toTimestamp(concat(split(substring(DateTime,1, 10), '/')[3], '-',split(substring(DateTime,1, 10), '/')[2],'-',split(substring(DateTime,1, 10), '/')[1],substring(DateTime,11, length(DateTime))))
The result shows the ABC_NEW column populated with the converted timestamps.
This is a trick that was a blocker for me; try this:
Go to sink
Mapping
Click on output format
Select the date format or time format you prefer for storing the data in the sink.
I have a timestamp in ISO-8601 format and want to specify it as either a timestamp or datetime when creating a table in Athena. Any clues on how to do this?
Thanks!
When you create a table in Athena you can set a column as date or timestamp only in the Unix format, as follows:
DATE, in the UNIX format, such as YYYY-MM-DD.
TIMESTAMP. Instant in time and date in the UNIX format, such as yyyy-mm-dd hh:mm:ss[.f...]. For example, TIMESTAMP '2008-09-15 03:04:05.324'. This format uses the session time zone.
If the format is different, define the column as a string and use a date function when you query the data:
from_iso8601_date(string) → date
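For example, a query over a hypothetical table with an ISO-8601 string column event_ts could look like this:
-- event_ts is stored as a string; the functions parse it at query time
SELECT from_iso8601_timestamp(event_ts) AS event_timestamp,
       from_iso8601_date(event_ts)      AS event_date
FROM my_table;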
You can convert the data to make it easier and cheaper to query for specific use cases by using a CTAS (CREATE TABLE AS SELECT) query that generates a new copy of the data in a simpler and more efficient (compressed and columnar) Parquet format.
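A rough CTAS sketch (all names are placeholders; the timestamp-with-time-zone result is cast to a plain timestamp before being written):
CREATE TABLE my_table_parquet
WITH (format = 'PARQUET') AS
SELECT CAST(from_iso8601_timestamp(event_ts) AS timestamp) AS event_timestamp,
       other_col
FROM my_table;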
I use the logstash-input-jdbc plugin to sync my data from MySQL to Elasticsearch. However, when I looked at the data in Elasticsearch, I found that the format of all date-type fields had changed from "yyyy-MM-dd" to "yyyy-MM-dd'T'HH:mm:ss.SSSZ". I have nearly 200 fields of type date, so I want to know how to configure Logstash so that it outputs "yyyy-MM-dd" instead of "yyyy-MM-dd'T'HH:mm:ss.SSSZ".
Elasticsearch stores dates as UTC timestamps:
Internally, dates are converted to UTC (if the time-zone is specified) and stored as a long number representing milliseconds-since-the-epoch.
Queries on dates are internally converted to range queries on this long representation, and the result of aggregations and stored fields is converted back to a string depending on the date format that is associated with the field.
So if you want to retain the yyyy-MM-dd format, you'll have to store it as a keyword (which you then won't be able to do range queries on).
You can change Kibana's display to only show the yyyy-MM-dd format, but note that it will convert the date to the timezone of the viewer which may result in a different day than you entered in the input field.
If you want to ingest the date as a string, you'll need to create a mapping for the index in question to prevent the default date processing, for example:
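A minimal sketch of such a mapping (index and field names are assumptions); storing the value as a keyword keeps the original yyyy-MM-dd text verbatim:
PUT my-index
{
  "mappings": {
    "properties": {
      "my_date_field": { "type": "keyword" }
    }
  }
}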