Copy activity fails while adding additional columns in parquet file - azure

I have been trying to add an additional column to a Parquet file through a Copy activity pipeline that copies from a CSV file to Parquet,
but it fails with this error: the column name is invalid. Column name cannot contain these character:[,;{}()\n\t=]
I am adding only filename as the column name under Additional columns at the source, with $$filepath as the value.

I reproduced the above and got the same error.
The error occurs when the name of the additional column contains any of the special characters (,;{}()\n\t=). Parquet files do not allow these characters in column names.
When the column name avoids these characters, the copy produces the desired result.
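As a quick way to check a candidate column name before putting it in the pipeline, here is a minimal Python sketch. It assumes the character list is exactly the one quoted in the error message; the helper names `is_valid_column_name` and `sanitize_column_name` are hypothetical and are not part of Azure Data Factory.

```python
import re

# Characters quoted in the error message: , ; { } ( ) newline tab =
# (assumption: the list in the error message is complete)
INVALID_PARQUET_CHARS = re.compile(r"[,;{}()\n\t=]")


def is_valid_column_name(name: str) -> bool:
    """Return True if the name contains none of the listed characters."""
    return INVALID_PARQUET_CHARS.search(name) is None


def sanitize_column_name(name: str, replacement: str = "_") -> str:
    """Replace every listed character with the replacement string."""
    return INVALID_PARQUET_CHARS.sub(replacement, name)


print(is_valid_column_name("filename"))    # True
print(is_valid_column_name("file(name)"))  # False
print(sanitize_column_name("file(name)"))  # file_name_
```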


How can I add dynamic content to "First Row As Header" condition in Azure Data Factory Dataset?

In a dataset, I can see that I can add dynamic content in the First Row as Header box:
My question is: can I use dynamic content so that, if a column header is empty in the CSV, I can add a custom name, and if all the column names are present, it takes the first row as is?
I am asking because I have some files with one or two empty column names.
Thanks!
No. The dynamic content for First Row As Header must return a boolean value, so you cannot use it to replace an empty column name with a custom name.
As a workaround, you can use a data flow.
Below is my test sample:
My data in csv file:
fieldA,,fieldB,,fieldC
1,2,3,4,5
Source dataset settings:
ADF auto-generates a column name, such as _c1, when a column name is empty.
Then you can use a DerivedColumn transformation:
Finally, you can use a Select transformation or the sink mapping to delete the columns generated by ADF.
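The renaming idea behind this workaround can be illustrated outside ADF with a small Python sketch on the same sample data. This is not ADF data flow code; the function name `fill_empty_headers` and the `custom_` prefix are hypothetical choices made for the illustration.

```python
import csv
import io


def fill_empty_headers(header, prefix="custom_"):
    """Replace empty header names with prefix + zero-based column position."""
    return [name if name.strip() else f"{prefix}{i}" for i, name in enumerate(header)]


# Same sample as in the answer: two of the five header cells are empty.
sample = "fieldA,,fieldB,,fieldC\n1,2,3,4,5\n"
rows = list(csv.reader(io.StringIO(sample)))

header = fill_empty_headers(rows[0])
print(header)  # ['fieldA', 'custom_1', 'fieldB', 'custom_3', 'fieldC']
```

A header with no empty names passes through unchanged, which matches the "take the first row as is" case from the question.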

cassandra skip columns on copy data from csv file

I want to copy some columns from a csv file to a cassandra table. There are 300 columns in the csv file and I only need the first ten columns. There is no header in the csv file.
I tried
copy table from 'file.csv' with header=false and skipcols=[range(11,300)]
but it didn't work.
Can you specify the column names after the table name, i.e.
COPY TABLE xx(c1,c2..C30) FROM ..
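If naming the columns is not enough on its own, another option is to trim the file before loading it. The following is a minimal Python sketch that keeps only the first n fields of every row; the function name `keep_first_columns` and the output file name are hypothetical, and the sketch assumes a plain comma-separated file without a header, as described in the question.

```python
import csv
import io


def keep_first_columns(src, dst, n=10):
    """Copy rows from src to dst, keeping only the first n fields of each row."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow(row[:n])


# Small in-memory demonstration with 4 columns trimmed to 2.
source = io.StringIO("a,b,c,d\n1,2,3,4\n")
trimmed = io.StringIO()
keep_first_columns(source, trimmed, n=2)
print(trimmed.getvalue())

# With real files (hypothetical output name):
# with open("file.csv", newline="") as src, open("file_first10.csv", "w", newline="") as dst:
#     keep_first_columns(src, dst, n=10)
```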

How to merge two cells in CSV file

How can I merge two cells in a CSV file, as in Excel?
I want to merge two cells in a CSV file as follows:
Header Id Name mobileNo
Sub-Header id first Name last Name countryCode MobNumber
This is not possible.
A comma-separated values (CSV) file stores tabular data (numbers and
text) in plain text.
It is just data, with no attached formatting or knowledge of how the
cells should be merged when the data is imported.
You could use a script, for example in Python, to merge them; see this link:
https://dzone.com/articles/merging-cells
You can just keep adding commas after the text in one cell so that it spans the following cells. It worked for me, at least.
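The comma approach can be sketched in Python with the standard csv module, using the header from the question. Nothing is actually merged here: each group label is simply followed by empty fields so that it sits above the first column of its group. The helper name `spanned_header` and the column widths are assumptions made for the illustration.

```python
import csv
import io


def spanned_header(groups):
    """Build a header row from (label, width) pairs.

    Each label is followed by width - 1 empty fields, so it lines up with
    the first sub-column of its group.
    """
    row = []
    for label, width in groups:
        row.append(label)
        row.extend([""] * (width - 1))
    return row


buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
# Assumed layout: Name covers two sub-columns, mobileNo covers two sub-columns.
writer.writerow(spanned_header([("Id", 1), ("Name", 2), ("mobileNo", 2)]))
writer.writerow(["id", "first Name", "last Name", "countryCode", "MobNumber"])
print(buf.getvalue())
# Id,Name,,mobileNo,
# id,first Name,last Name,countryCode,MobNumber
```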

Copying absolute values of columns

Hi, I've got an Excel file (file A) with multiple columns, created using xlsxwriter.
However, the values in these columns are produced by formulas.
Now I'd like to copy these column values to another Excel file I've created using xlsxwriter / xlrd. Is there any way to copy JUST the absolute values? My attempts often copy the whole formula from file A, and hence I get no values. Thanks!

Exact match for all fields in table with excel in Talend

I have a table and an Excel file, and I have to compare the values of all the fields between them. If a field in a row does not match the Excel file, the output should highlight which column mismatches the table field. I am getting all the columns in the rejects instead of only the column with the differing value; here, only row 2, column 4 differs from the table. Here are my Excel file and the table structure. I have also attached the expected output.
Excel Input :
Table Input :
Job Design :
TMAP Design :
Expected Output :
Your mapping seems correct. Debug your work with the following steps:
Make sure that the data types are the same for both inputs.
Visualize each input using a tLogRow and check whether there is any whitespace around the fields.
There is a check box in the advanced settings of your input components to trim the data and get rid of whitespace.
For the output, Talend cannot highlight or color cells. Think about an alternative: for example, you can write the indices of the unmatched cells to a new column.
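The idea of recording the unmatched columns in an extra column can be illustrated with a short Python sketch. This is not Talend code; the function name `mismatched_columns` and the sample rows are hypothetical. It compares values as trimmed strings, which mirrors the data type and whitespace checks listed above.

```python
def mismatched_columns(table_row, excel_row):
    """Return the names of columns whose values differ between the two rows.

    Values are converted to str and stripped first, so 2 and " 2 " compare
    as equal.
    """
    return [
        col
        for col in table_row
        if str(table_row[col]).strip() != str(excel_row.get(col, "")).strip()
    ]


# Hypothetical sample rows: only the fourth column differs.
table_row = {"id": 2, "name": "Bob", "city": "Pune", "amount": 100}
excel_row = {"id": "2", "name": " Bob ", "city": "Pune", "amount": "150"}

print(mismatched_columns(table_row, excel_row))  # ['amount']
```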
THIS IS RESOLVED. I put the data into another table, ran a CASE query, and joined that table with the first table.
