Kettle (Spoon): get the filename for Excel output from a field in the input Excel

I'm trying to process an Excel file. I need to generate one Excel file for each row, using one of the fields in the row as the filename.
The Excel output step doesn't have the option "Accept filename from field" and I can't figure out how to achieve this.
thanks

You need to copy the rows into memory and then loop over them to generate multiple Excel files. Break your solution into two parts. First, read all the rows from the Excel Input step and pass them to a "Copy rows to result" step. In the next transformation, read that result back and use the filename field as a parameter for the output file.
Please check the two links:
SO Similar Question: Pentaho : How to split single Excel file to multiple excel sheet output
Blog : https://anotherreeshu.wordpress.com/2014/12/23/using-copy-rows-to-result-in-pentaho-data-integration/
Hope this helps :)

The issue is that the step is mostly designed to output all rows to a single file, not to create one file per row.
This isn't the most elegant solution, but I think it will work. From your transformation you can call a sub-transformation (Mapping) and pass it a variable containing the filename. The sub-transformation then does just one thing: write the file. Make sense?
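Outside of Kettle, the work the sub-transformation performs boils down to this loop: for each row, derive an output path from one of the row's fields and write that row to its own file. A minimal plain-Java sketch (hypothetical field layout, not Pentaho API code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class PerRowFiles {
    // Builds the output path for one row: the first field holds the filename.
    static Path outputPathFor(String[] row) {
        return Path.of(row[0] + ".csv");
    }

    public static void main(String[] args) throws IOException {
        // Rows as they would arrive from the Excel input; first field = target filename.
        List<String[]> rows = List.of(
                new String[]{"report_north", "North", "1200"},
                new String[]{"report_south", "South", "980"});

        for (String[] row : rows) {
            // One file per row, named from the row's filename field.
            Files.writeString(outputPathFor(row),
                    String.join(",", row) + System.lineSeparator());
        }
    }
}
```

In Kettle terms, the outer loop is the parent job iterating over "Copy rows to result", and the body of the loop is the sub-transformation that writes one file.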

Related

How to combine two files in Alteryx

I am learning Alteryx and have run into my first issue. I have an Excel file that I am using as one source. The file has two sheets with the same data, but the second sheet does not have headers.
I wanted to see if there is a way to combine the two sheets into one within Alteryx, using column position instead of headers, since the second sheet doesn't have them. Any help is very much appreciated.
Yes, both their Join (https://help.alteryx.com/20213/designer/join-tool) and Union (https://help.alteryx.com/20213/designer/union-tool) tools have a "Record Position" option which is exactly what you're requesting. See the links for details.
You have to input the file twice, once for each sheet.
For the second sheet, make sure to select the option that the first row contains data.
Then you can use the Union tool --> Auto Config by Position --> Set a Specific Order (checked). See the image links below.
First Row Contains Data
Union Tool Configuration
Sheet 1 Example Input
Sheet 2 Example Input
Output

Combining CSVs in Power Query returning 1 row of data

I am trying to set up a query that combines data from CSVs into a table as new files are added to a specific folder, where each row contains the data from a separate file. In tests with CSVs that I created in Excel, this was very simple: after expanding the Content column, I would see an individual row of data for each file.
In practice, however, with CSVs produced by a proprietary Android app, expanding the Content column yields one single row, with the data from all files placed end to end.
Does this have something to do with there not being an "end of line" character in the CSVs the app is producing? If so, is there an easy way to remedy this without changing the app? If not, is there something simple and direct I can ask the developer to change that would prevent this behavior?
Thanks for any insight!
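For context on why the terminator matters: a line-based CSV reader can only see one record per line break, so records written end to end collapse into a single row. A minimal Java illustration (hypothetical data, not Power Query itself):

```java
public class LineTerminatorDemo {
    public static void main(String[] args) {
        // Two records written properly, each ending with a newline:
        String good = "id,temp\n1,20.5\n2,21.1\n";
        // The same records written end to end with no terminators:
        String bad = "id,temp1,20.52,21.1";

        System.out.println(good.split("\n").length); // 3 lines: header + 2 rows
        System.out.println(bad.split("\n").length);  // 1 line: everything fused
    }
}
```

If the app really omits the terminators, the simplest request to the developer is to end each record with CRLF (`\r\n`), which is what the CSV convention expects.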

Making an excel readable file?

Is there any way to use toString (or a separate method) to write to a file so that the file can actually be opened in Excel, with proper cells and columns? Or is this not a well-known practice?
You should try using a comma-separated values (CSV) file.
These files can be opened in Excel with columns and rows.
Example:
Title,Author,FirstPublished,ISBN,
The Communist Manifesto,Marx K.,1887,9780140447576,
The Black Swan,Taleb N.,2008,9780812979183,
When Money Dies,Fergusson A.,1975,9781906964443,
Liar's Poker,Lewis M.,1989,9780340839966,
Paradox,Al-Khalili J.,2012,9780552778060,
Cosmos,Sagan C.,1981,9780349107035,
An Unquiet Mind,Jamison K.,1995,9780330528078,
Principia Mathematica,Russell B.,1913,9781178292992,
Elements,Euclid,-300,9781420934762,
The Principia,Newton I.,1687,9781607962403,
Relativity,Einstein A.,1920,9781891396304,
The fields on the first line are the columns, and the fields below are the rows (one on each line).
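In Java, this usually means giving each object a method that emits one comma-separated line, then writing those lines to a .csv file. A minimal sketch (the Book record and field names are hypothetical):

```java
import java.io.IOException;
import java.io.PrintWriter;

public class CsvExport {
    record Book(String title, String author, int year, String isbn) {
        // One CSV row per object: fields joined by commas.
        String toCsvRow() {
            return title + "," + author + "," + year + "," + isbn;
        }
    }

    public static void main(String[] args) throws IOException {
        Book[] books = {
                new Book("The Black Swan", "Taleb N.", 2008, "9780812979183"),
                new Book("Cosmos", "Sagan C.", 1981, "9780349107035")};

        try (PrintWriter out = new PrintWriter("books.csv")) {
            out.println("Title,Author,FirstPublished,ISBN"); // header row = column names
            for (Book b : books) {
                out.println(b.toCsvRow()); // each line becomes one Excel row
            }
        }
    }
}
```

Note this naive join breaks if a field itself contains a comma or a quote; for real data, quote such fields or use a CSV library.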

Editing big excel file using SAX + SXSSF

I am trying to edit an existing Excel file using SXSSF. The file may contain a million records. I have to validate each row, and if a record is invalid, append an error message in the last column of that row. The validation pass runs first; during it I note down the row numbers that are invalid. Once all the validation is over, I take a copy of the file and write the error details at the end of each failed row.
Since SXSSF is write-only, I get a null value when I try to fetch an invalid row by its row number. Please suggest a better way to resolve this.
I came across some suggestions to use SAX + SXSSF to do it in the following threads.
poi read existing excel and edit with large data
How do i read and edit huge excel files using POI?
I know how to read Excel using SAX, but I don't know how to combine that with SXSSF to edit the file. It would be great if someone could share sample code.
Thanks in advance.
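Since SXSSF can only write forward, the usual pattern in the linked threads is two passes: pass 1 streams the existing file with the SAX/event API and records the invalid row numbers; pass 2 streams it again, copying every row into a new SXSSF workbook and appending the error message to the flagged rows. The POI wiring aside, the control flow is just this (a plain-Java sketch with in-memory rows standing in for the streamed sheet; the validation rule is hypothetical):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TwoPassValidate {
    // Pass 1: stream the rows once, noting which row numbers fail validation.
    static Map<Integer, String> findInvalidRows(List<String[]> rows) {
        Map<Integer, String> errors = new HashMap<>();
        for (int i = 0; i < rows.size(); i++) {
            if (rows.get(i).length < 2 || rows.get(i)[1].isEmpty()) {
                errors.put(i, "missing value in column 2");
            }
        }
        return errors;
    }

    // Pass 2: stream again, copying each row out and appending the error column.
    static List<String[]> rewriteWithErrors(List<String[]> rows, Map<Integer, String> errors) {
        List<String[]> out = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            String[] row = rows.get(i);
            if (errors.containsKey(i)) {
                String[] widened = java.util.Arrays.copyOf(row, row.length + 1);
                widened[row.length] = errors.get(i); // error message in the last column
                out.add(widened);
            } else {
                out.add(row);
            }
        }
        return out;
    }
}
```

In the real implementation, pass 1 would be a SAX read of the sheet XML and pass 2 would write each copied row through an SXSSF workbook, so the full million-row sheet is never held in memory; only the map of invalid row numbers is.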

SQL script: display leading 0 in Excel output file

I have a SQL script "example.sql":

SPOOL &1
SELECT '<TR>'||'<TD align="left">'||column_name||'</TD>'||'</TR>' FROM table1;
SPOOL OFF

which dumps its output to a file. This is how I run it from my C shell script "getdata.csh":

sqlplus $ORA_UID/$ORA_PSWD @${SQL}example.sql ${DATA}${ext}

Once I extract the data, I create an Excel file by combining three files. First header.html:

<html>
<head>
<title>
Title
</title>
</head>
<body>
<table>
<tr>
<th>Column Name</th>
<tr>

then the ext file that holds the query results, and finally trailer.html:

</tr>
</table>
</body>
</html>

I save this combined file as .xls and send it through email as an attachment. My problem is that column_name has data that starts with 0, but when I open the Excel file the leading 0 is gone, and I want to keep it. What can I add to make sure that the emailed Excel attachment still has the leading 0 when it is opened on the other side? Any help would be appreciated.
Using Oracle:
Say your column is called num (NUMBER itself is a reserved word, so avoid it as a column name):
select '0' || to_char(num) as num
from mytable
Use the Excel object model, or a macro, to go into the Excel file, grab the column, and change the formatting.
In your case:
Range("A1").NumberFormat = "@"
(the "@" format treats the cell as text, which is what preserves leading zeros).
If you're generating the excel file on the fly, you could prepend those numbers with an apostrophe, ie '
This causes Excel to treat the number like a string. The only downside is it might cause some side effects if the sheet has any equations that use those numbers.
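If the file is generated in code, the prepending is a one-liner; a trivial sketch (Java, hypothetical value):

```java
public class LeadingZero {
    // Prefix with an apostrophe so Excel treats the value as text and keeps the 0.
    static String preserveLeadingZero(String value) {
        return "'" + value;
    }

    public static void main(String[] args) {
        System.out.println(preserveLeadingZero("0123456")); // '0123456
    }
}
```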
I have dealt with this issue in the past, and the problem is strictly a "feature" of Excel formatting. Unfortunately, I don't have the resources to completely test an answer, but here are two things you can try.
1. Add a step inside your C shell script to wrap each value as ="value":
awk '{$1= "=\"" $1 "\""; print $0}' inFile > outFile
The downside is that you're now telling Excel to treat these values as strings. If you're doing any fancy calculations on these values you may have different problems.
2. As this is really an Excel formatting problem and, in my recollection, you can't retrieve the leading zero once the file has been opened and processed, I seem to remember a trick of pre-formatting a blank worksheet, saving it as a template, and then loading the file into the template. That was tricky too, so don't expect it to just work. You might have to consult Excel users on the best tactics if #1 above doesn't work.
You might also want to tell people what version of Excel you are using, if you go to them for help.
I hope this helps.
P.S. as you appear to be a new user, if you get an answer that helps you please remember to mark it as accepted, and/or give it a + (or -) as a useful answer
