I have the file location in my database and i would loop on each file location to open the file and extract data in Talend to insert in my database .
I get the file location with a request and i get something like that :
So, i would loop on each row, and put the FILE_PATH result in a component Excel.
How can I do this?
Thank you !
You have to use tFlowToIterate : it will create a loop based on incoming data (filepath here)
tDBInput>tFlowToIterate>tFileInputExcel
In tFileInputExcel, just put the global variable coming from your tFlowToIterate as the filepath.
Your case will be similar to this talend example with tFlowToIterate : https://help.talend.com/r/LffDtDyh9DacekHL_U0CGw/798DoNGV_8WjRj6C42_HJQ
Related
I am getting every data single excel file in my data lake. My container name is 'odoo' in the data lake. Excel files get stored in the folder called 'odoo' and below is the name of the file
report_2022-01-20.xlsx
I am using dataflow and I wanted to take everyday file using a wildcard path. Below is the dynamic expression I am trying to give but no success
/odoo/#concat('report_', string(formatDateTime(utcNow(), 'yyyy-MM-dd')), '.xlsx')
Can anyone advise me how to write the correct expression? I am a newbie to the adf.
Your expression looks fine. In your dataset, browse the container/folder and add the expression in the file path to get the file dynamically.
Source file:
#concat('report_', string(formatDateTime(utcNow(), 'yyyy-MM-dd')), '.xlsx')
Dataflow:
Can you try: #concat('/odoo/report_', formatDateTime(utcNow(), 'yyyy-MM-dd'), '.xlsx')
The string() might be causing the issue
I'm trying to solve following scenario in Azure Data Factory:
I have a large number of folders in Azure Blob Storage. Each folder contains varying number of files in parquet format. Folder name contains the date when data contained in the folder was generated, something like this: DATE=2021-01-01. I need to filter the files and save them into another container in delimited format and each file should have the date indicated in source folder name in it's file name.
So when my input looks something like this...
DATE=2021-01-01/
data-file-001.parquet
data-file-002.parquet
data-file-003.parquet
DATE=2021-01-02/
data-file-001.parquet
data-file-002.parquet
...my output should look something like this:
output-data/
data_2021-01-01_1.csv
data_2021-01-01_2.csv
data_2021-01-01_3.csv
data_2021-01-02_1.csv
data_2021-01-02_2.csv
Reading files from subfolders and filtering them and saving them is easy. Problems start when I'm trying to set output dataset file name dynamically. I can get the folder names using Get Metadata activity and then I can use ForEach activity to set them into variables. However, I can't figure out how to use this variable in filtering data flow sinks dataset.
Update:
My Get Metadata1 activity, set the container input as:
Set the container input as follows:
My debug info is as follows:
I think I've found the solution. I'm using csv files for example.
My input looks something like this
container:input
2021-01-01/
data-file-001.csv
data-file-002.csv
data-file-003.csv
2021-01-02/
data-file-001.csv
data-file-002.csv
My debug result is as follows:
Using Get Metadata1 activity to get the folder list and then using ForEach1 activity to iterate this list.
Inside the ForEach1 activity, we now using data flow to move data.
Set the source dataset to the container and declare a parameter FolderName.
Then add dynamic content #dataset().FolderName to the source dataser.
Back to the ForEach1 activity, we can add dynamic content #item().name to parameter FolderName.
Key in File_Name to the tab. It will store the file name as a column eg. /2021-01-01/data-file-001.csv.
Then we can process this column to get the file name we want via DerivedColumn1.
Addd expression concat('data_',substring(File_Name,2,10),'_',split(File_Name,'-')[5]).
In the Settings of sink, we can select Name file as column data and File_Name.
That's all.
I have a list of dataframes (df_cleaned) created from multiple csv files chosen by the user.
My objective is to save each dataframe within the df_cleaned list as a separate csv file locally.
I have the following code done which saves the file with its original title. But I see that it overwrites and manages to save a copy of only the last dataframe.
How can I fix it? According to my very basic knowledge perhaps I could use a break-continue statement in the loop? But I do not know how to implement it correctly.
for i in range(len(df_cleaned)):
outputFile = df_cleaned[i].to_csv(r'C:\...\Data Docs\TrainData\{}.csv'.format(name))
print('Saving of files as csv is complete.')
You can create a different name for each file, as an example in the following I attach the index to name:
for i in range(len(df_cleaned)):
outputFile = df_cleaned[i].to_csv(r'C:\...\Data Docs\TrainData\{0}_{1}.csv'.format(name,i))
print('Saving of files as csv is complete.')
this will create a list of files named <name>_N.csv with N = 0, ..., len(df_cleaned)-1.
A very easy way of solving. Just figured out the answer myself. Posting to help someone else.
fileNames is a list I created at the start of the code to save the
names of the files chosen by the user.
for i in range(len(df_cleaned)):
outputFile = df_cleaned[i].to_csv(r'C:\...\TrainData\{}.csv'.format(fileNames[i]))
print('Saving of files as csv is complete.')
Saves a separate copy for each file in the defined directory.
Is it possible to set the filename of a excel to a name from database upon downloading ?
Your question is not precise, but I guess, probably you want to set the name of an Excell's file possible to download from your page to the name saved in the database.
In CodeIgniter it's possible to use Download Helper.
First select from the DataBase the name you want to show for users and save it to a variable. For example:
$name.
Load Download Helper:
$this->load->helper('download');
Use function force_download() to serve your Excel file for user with your name from DB:
$data = file_get_contents($excell_file_path); // Read the file's contents
force_download($name, $data);
I have qvw file with sql query
Data:
LOAD source, color, date;
select source, color, date
as Mytable;
STORE Data into [..\QV_Data\Data.qvd] (qvd);
Then I export data to excel and save.
I need something to do that automatically instead of me
I need to run query every day and automatically send data to excel but keep old data in excel and append new value.
Can qlikview to do that?
For that you need to create some crazy macro that runs after a reload task in on open-trigger. If you schedule a windows task that execute a bat file with path to qlikview.exe with the filepath as parameters and -r flag for reload(?) you can probably accomplish this... there are a lot of code of similar projects to be found on google.
I suggest adding this to the loadscript instead.
STORE Table into [..\QV_Data\Data.csv] (txt);
and then open that file in excel.
If you need to append data you could concatenate new data onto the previous data.. something like:
Data:
load * from Data.csv;
//add latest data
concatenate(Data)
LOAD source, color, date from ...
STORE Data into [..\QV_Data\Data.csv] (txt);
I assume you have the desktop version so you don't have access to the Qlikview Management Console (if you do, this is obviously the best way).
So, without the Console, you should create a txt file with this command: "C:\Program Files\QlikView\Qv.exe" /r "\\thePathToYourFile\test.qvw". Save this file with .cmd file extension. After that you can schedule this command file with the windows task scheduler.