How to extract (import) data from a mainframe dataset into an Excel table

I want to build a little application that calculates the critical batch of a batch flow.
As input I need to use a mainframe dataset. If possible, the tool should be dynamic; that is, I should be able to choose which fields apply at the time.
I've searched the internet about that but found nothing that suited what I wanted to do.
Is there a way to do that?

I have a dataset in a mainframe library and I want to ftp that file to Excel.
Convert the file to CSV on the mainframe (for example, via a REXX exec, a z/OS UNIX shell script, or a Lua4z program), and then import that CSV file into Excel via FTP.
You do not need to transfer the CSV file to your PC's file system and then, as a separate step, open it in Excel.
Instead, you define the FTP (or HTTP) URL for the CSV as a data source in Excel. One advantage of this technique is that you can refresh the data from that URL
without having to reapply formatting in Excel.
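For the conversion step itself, here is a minimal REXX sketch. The dataset name, output path, and field positions are hypothetical; adjust them to your record layout. Note that the resulting z/OS UNIX file is still EBCDIC, so the iconv step described later still applies.

/* REXX - write selected fields of a dataset to a z/OS UNIX CSV file */
"ALLOC F(IN) DA('ME.BATCH.DATA') SHR REUSE"
"ALLOC F(OUT) PATH('/u/me/data.csv')",
  "PATHOPTS(OWRONLY,OCREAT,OTRUNC) PATHMODE(SIRUSR,SIWUSR)",
  "FILEDATA(TEXT)"
"EXECIO * DISKR IN (STEM line. FINIS"
do i = 1 to line.0
  /* example fields: an 8-byte job name and a 6-byte elapsed time */
  jobname = strip(substr(line.i, 1, 8))
  elapsed = strip(substr(line.i, 10, 6))
  out.i = jobname','elapsed
end
out.0 = line.0
"EXECIO" out.0 "DISKW OUT (STEM out. FINIS"
"FREE F(IN OUT)"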
There are various tutorials on the web for setting up the Excel side of this.
In brief:
Create a new blank workbook (I'm using Excel 2010).
Select the first cell in the empty worksheet (this step is unnecessary - the cell is already selected - if you've only just created the workbook).
On the Data tab, click From Text.
In the File name text box of the Import Text File dialog, enter the FTP URL of the CSV file. For example:
ftp://zos1//u/me/data.csv
(This assumes that your mainframe is configured to allow FTP using this path.)
The two consecutive slash (/) characters following the host name (zos1) indicate that the path refers to a z/OS UNIX file (/u/me/data.csv).
The CSV file must be in a z/OS UNIX path. The FTP client does not accept MVS-style (dsname) paths such as 'me.csv(data)' (even when URL-encoded; that is, with the single quotes escaped as %27); by contrast, cURL accepts such paths just fine.
The CSV file on the mainframe must be ASCII encoded, not EBCDIC. (Here, I'm using the term ASCII imprecisely: the precise character encoding you want depends on your PC's settings. You probably want Windows-1252.) This is because the FTP client sets the default transfer type to binary.
Enter your user name and password (your z/OS TSO user ID and password).
Wait for the data to load.
Format the cells. For example, set the format of any columns containing date/time values.
On the Data tab, click Connections, select the connection (that Excel created when you specified a URL for the file name), and clear the check box Prompt for file name on refresh.
To refresh the data, replacing the current data with the results of a new FTP request: on the Data tab, click Refresh All. The data is replaced; the cell formatting remains intact.
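If you prefer to script the import rather than click through the dialogs, here is a rough VBA equivalent of the steps above. The URL is the example one from this answer; Excel prompts for the user ID and password just as in the manual steps.

Sub ImportCsvFromFtp()
    ' Scripted equivalent of Data > From Text with an FTP URL.
    Dim qt As QueryTable
    Set qt = ActiveSheet.QueryTables.Add( _
        Connection:="TEXT;ftp://zos1//u/me/data.csv", _
        Destination:=ActiveSheet.Range("A1"))
    With qt
        .TextFileParseType = xlDelimited
        .TextFileCommaDelimiter = True
        .RefreshStyle = xlOverwriteCells  ' refresh replaces data, keeps formatting
        .Refresh BackgroundQuery:=False
    End With
End Sub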
Converting an EBCDIC-encoded CSV file to ASCII
(Strictly speaking, I mean ISO-8859, not ASCII.)
Suppose you have JCL that generates a CSV file encoded in EBCDIC. You want to make that CSV file available to Excel via FTP as an ASCII-encoded z/OS UNIX (zFS) file.
Replace your existing DD statement for the output CSV file with the following DD statement:
//OUTCSV DD PATH='/u/me/data-ebcdic.csv',
// PATHOPTS=(OWRONLY,OCREAT,OTRUNC),
// PATHDISP=(KEEP,DELETE),
// PATHMODE=(SIRUSR,SIWUSR,SIRGRP),
// FILEDATA=TEXT
Replace the ddname OUTCSV with your ddname, and the zFS file path /u/me/data-ebcdic.csv with the path that you want to use.
Thanks to the FILEDATA=TEXT parameter, the resulting CSV file will have an X'15' byte at the end of each line.
Append the following step to your JCL:
//ICONV EXEC PGM=IKJEFT01
//SYSTSIN DD *
BPXBATCH sh iconv -f IBM-037 -t iso8859-1 +
/u/me/data-ebcdic.csv +
> /u/me/data-ascii.csv
/*
//SYSPRINT DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
In case you're wondering why I'm calling iconv as a shell command via BPXBATCH, it's because the following:
//ICONV EXEC PGM=EDCICONV
// PARM=('FROMCODE(IBM-037),TOCODE(iso8859-1)')
didn't quite work: it left the X'15' bytes as is, whereas running iconv as a shell command correctly converted them to X'0A'. (z/OS 2.2.)

You've got some good information in the comments; the consensus appears to be that conversion to CSV (or TSV, to avoid commas embedded in your data) is the easiest route. Here is a bit more information, copied from another answer...
I would strongly suggest you get the files into a text format before transferring them to another box with a different code page. Trying to deal with mixed text (which must have its code page translated) and binary (which must not have its code page translated, but which likely must be converted from big-endian to little-endian) is harder than doing the conversion up front.
The conversion can likely be done via the SORT utility on the mainframe; mainframe SORT utilities tend to have extensive data manipulation functions. There are other mechanisms you could use (other utilities, custom code written in the language of your choice, purchased packages), but this is what we tend to do in these circumstances.
Once you have your flat files converted such that all data is text, you can transfer them via FTP, SFTP, or FTPS.
...and thanks for coming back and adding more information. Hopefully the people here have provided enough information to help you solve your problem.
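To illustrate the SORT approach from the quoted answer, here is a minimal DFSORT sketch. The dataset names and field layout (an 8-byte character key, a 4-byte packed-decimal amount, and a 2-byte binary count) are invented for the example; adjust them to your record format.

//CONVERT EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DISP=SHR,DSN=ME.BINARY.DATA
//SORTOUT  DD DSN=ME.TEXT.CSV,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(5,5)),
//            DCB=(RECFM=FB,LRECL=80)
//SYSIN    DD *
  OPTION COPY
* Build a comma-separated record: character key, then the packed
* and binary fields converted to displayable digits.
  OUTREC BUILD=(1,8,C',',9,4,PD,EDIT=(IIIIIIT),C',',13,2,BI,EDIT=(IIIIT))
/*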

XML would be another possible text-oriented solution. It would take more effort to create, but you could design your spreadsheet in Excel, save it as an XML document, and then write a program to generate the XML text using the data from your mainframe dataset. While this would be more difficult to implement than a simple CSV or TSV file, it has the advantage of supporting the spreadsheet formulas and attributes that a CSV file cannot express. Another advantage: you can attach the XML document to an SMTP email note and deliver the document in "spreadsheet format" to your client.
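For a feel of what that XML looks like, here is a trimmed-down fragment in the Excel 2003 XML Spreadsheet format; the sheet name, values, and formula are placeholders, and saving a designed spreadsheet from Excel as an XML document shows the full structure to reproduce.

<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="BatchData">
  <Table>
   <Row>
    <Cell><Data ss:Type="String">JOB001</Data></Cell>
    <Cell><Data ss:Type="Number">42</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="String">Doubled</Data></Cell>
    <!-- formulas use R1C1-style references in this format -->
    <Cell ss:Formula="=R[-1]C*2"><Data ss:Type="Number">84</Data></Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>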

Related

Batch file creation: Convert xls to csv using only batch script

I have done quite a bit of searching before posting this question, so let me outline what I am trying to do.
1.) I do not want to use applications that I have to download from a website, or custom commands (please, no "start Xls2Csv.exe, here's a link to a website where you can download the program"). I do not want to download a program to do this.
2.) I want to keep it in the batch file if possible - I have tried the vbc/vbs/vb files; that is not what I am looking for.
3.) I found this, and it is close to what I need, but if I can stay within a batch file that would be best: Can a Batch File Tell a program to save a file as? (If so, how?)
Background
I have a bunch of test records stored in Excel sheets within folders. Each test record has an autoformatted name, so the only real difference between any of the filenames is a serial number; otherwise each filename is formatted in exactly the same way.
I have written a batch file to search for and find the files I need, but I am stuck on obtaining a tiny bit of information in a .xls file.
What I am trying to do: I have Excel files (.xls), and there is a word in a cell on one of many sheets that I would like to copy into a text file. However, I am unable to use findstr to search the Excel file, because the command searches the file as if you had opened it in Notepad, and the data I need is not present in that form.
I am not concerned about data loss as long as I can get this tiny bit of information into a text file.
Otherwise, the best solution I have found is to convert the XLS to a CSV. I have done it manually, by opening the file and saving as type .csv, and that worked.
What hasn't worked is:
example1.xls >> example2.csv
ren example1.xls example3.csv - this saves the file with a .csv extension, but it still opens with the same formatting as the .xls file in both Excel and Notepad.
I was hoping that there was a command to recreate the manual process of opening the file and saving it as CSV.
If there are any other suggested solutions - maybe a command where I can search for a string within an Excel file? That would be the simplest option.

(dd command linux) last byte goes to next line

Hi friends, I need some help.
We have a tool that converts binary files to text files and then stores them in Hadoop (HDFS).
In production, the ingestion tool uses FTP to download files from the mainframe in binary format (EBCDIC), but we don't have access to download files from the mainframe in the development environment.
In order to test the file conversion, we manually create text files, and we are trying to convert them using the dd command (Linux) with these parameters:
dd if=asciifile.txt of=ebcdicfile conv=ebcdic
After pass through our conversion tool, the expected result is:
000000000000000 DATA
000000000000000 DATA
000000000000000 DATA
000000000000000 DATA
However, it's returning the following result:
000000000000000 DAT
A000000000000000 DA
TA000000000000000 D
ATA000000000000000
I have tried the cbs, obs, and ibs parameters, setting the record length (the number of bytes in each line), without success.
Can anyone help me?
A few things to consider:
How exactly is the data transferred via FTP? Your "in binary format (EBCDIC)" doesn't quite make sense as stated: FTP either transfers in binary mode, in which case nothing is changed or converted during the transfer, or it transfers in text mode (also known as ASCII mode), in which case data is converted from a specific EBCDIC code page to a specific non-EBCDIC code page. You need to know which mode was used and, if text mode, which two code pages.
From the man pages for dd, it is unclear which EBCDIC and ASCII code pages are used for the conversion. I'm just guessing here: the EBCDIC code page might be CP-037, and the ASCII one might be CP-437. If these don't match the code pages used in the FTP transfer, the resulting test data is incorrect.
I understand you don't have access to production data in the development environment. However, you should still be able to get test data from the development mainframe using FTP from there. If not, how will you be doing end-to-end testing?
The EBCDIC conversion is eating your line endings:
https://www.ibm.com/docs/en/zos/2.2.0?topic=server-different-end-line-characters-in-text-files
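For what it's worth, here is a sketch of a dd invocation that sidesteps the line-ending problem by producing fixed-length records; the record length of 80 is an assumption, so check what your conversion tool expects.

# With cbs= given, conv=ebcdic also blocks the input: each
# newline-terminated line is converted to EBCDIC and padded with
# blanks to a fixed 80-byte record, with no newline byte at all.
dd if=asciifile.txt of=ebcdicfile cbs=80 conv=ebcdic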

IBM Mainframe copy/paste

Disclaimer: I'm new to using Rumba to access IBM Mainframe.
I have currently set up a library for personal use, and I have some code that I want to store in a member of this library. How can I copy/paste from a .txt file on my desktop into this member? As of right now, I can successfully copy/paste only one line at a time from documents outside of Rumba.
There are various ways. The best one will depend upon the size of the file/amount of data to be transferred.
If it's only a few lines, block copy and paste should work, but you might have to play with Rumba's 'paste' edit settings such as how to handle new lines, etc.
Bigger files can be transferred with the TSO file transfer program IND£FILE (maybe IND$FILE on your system), which essentially copies a file to the screen; Rumba then 'scrapes' the screen for data to put into a file (this is for a mainframe-to-PC transfer; going the other way, the operation is reversed). This can be surprisingly quick.
Lastly there's FTP - either from the command line or via a program such as WinSCP.
Edit:
Based on your comment that the files are about 300 lines long, I'd look into using Rumba's file-transfer option using the ind$file utility. Once you have the files on one system, speak to your mainframe tech support team about the best way to get them to the other systems.
If you need help uploading the files, then the tech support team should be your first point of call.
What mainframe editor are you running? TSO/ISPF?
I copy and paste from ".txt" files into ISPF all the time with no problem.
Select the text you want to copy (in the ".txt" file)
Press CTRL-C
Open the mainframe file using ISPF Edit (option 2).
Enter the line command "Inn" at the line where you want the copy to start.
(This inserts "nn" empty lines to receive the copied data. Personally, I usually use nn=20.)
Position your cursor at the first character of the first empty line.
Press CTRL-V

Pentaho - CSV Input not understanding special character [Windows to Linux]

I have a transformation in Pentaho Data Integration where the first thing I do is use the "CSV Input" step to map my flat file.
I've never had a problem with it on Windows, but now I'm moving the server that Spoon runs on to a Linux server, and I'm having problems with special characters.
The first thing I noticed was that my tables were being updated because the system was interpreting the names as strings different from the ones in my database.
Checking into the problem, I also noticed that if I go to "CSV Input" -> Preview, it shows a preview of my data with the problem described above: the special characters are not displayed.
Where it should read:
Diretoria de Suporte à Decisão e Aplicação
I used a command to check my file's charset/encoding, and it showed:
$ file -bi foo.csv
text/plain; charset=iso-8859-1
If I open foo.csv in vi, it displays the special characters correctly.
Any idea on what could be the problem or what should I try?
I don't have any data files with this encoding, so you'll have to do some experimenting, but there are some steps designed to deal with these issues.
First, the CSV Input step has a field that allows you to select the encoding of the source file. The Text File Input step has both a "Format" (meaning line terminator) and "Encoding" selector under the "Content" tab.
In Transforms, you have the Change file encoding step under the Utility tab. This step is designed to copy many files while changing their encoding; that's why it's in a transform.
In Jobs, there's the Convert file between Windows and Unix step under the File Management tab, but this appears to only deal with line terminators.
Either way, it appears that if the CSV/Text file input steps don't suit your needs, you'll have to copy the file to a new encoding before reading it in. It will probably be easiest to try handling it with the file input steps first.
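If you do end up converting the file first, a one-liner along these lines should work, given that file reported iso-8859-1 (the output filename is just an example):

iconv -f ISO-8859-1 -t UTF-8 foo.csv > foo-utf8.csv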

How to determine file encoding type with Excel VBA

I have built an Excel/VBA tool to validate CSV files to ensure the data they contain is valid. The CSV can originate from anywhere (from a full-blown Unix system, or from a desktop user saving data out of Excel). The Excel tool is sent out to businesses so they can validate their CSV files in their own environment, without the risk of their data leaving their systems. Thus, the solution needs to be native VBA, with no links to external libraries.
So, using VBA, I need to be able to automatically detect UTF-8 (with or without a BOM) or ANSI file encodings and warn the user if these are not the encodings used for the CSV.
I think this would involve reading a few bytes from the start of the file and determining the encoding based on the existence of a byte order mark.
Could you help me get started on the right track?
Assuming you have the freedom to ask the user to choose the correct file type, you could make them responsible for what they choose as a file ;)
That means you can create a form where users choose the filename and the encoding type, much as in a file-open wizard.
Otherwise, I suggest you use the FileSystemObject. It returns a TextStream, which can be used to help determine the encoding. I doubt VBA supports other types of encoding; please correct me if it does :) I'd be happy to hear it.
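Since the question specifically mentions reading the first few bytes, here is a minimal native-VBA sketch of that idea. The function name is mine, and note that a file with no BOM could still be UTF-8 without a BOM, which would need a content scan to distinguish from ANSI.

' Detect a byte order mark at the start of a file using only native VBA.
Function DetectBom(ByVal filePath As String) As String
    Dim fileNum As Integer
    Dim bytes(0 To 2) As Byte

    fileNum = FreeFile
    Open filePath For Binary Access Read As #fileNum
    If LOF(fileNum) >= 3 Then
        Get #fileNum, 1, bytes   ' read the first three bytes
    End If
    Close #fileNum

    If bytes(0) = &HEF And bytes(1) = &HBB And bytes(2) = &HBF Then
        DetectBom = "UTF-8 (with BOM)"
    ElseIf bytes(0) = &HFF And bytes(1) = &HFE Then
        DetectBom = "UTF-16 LE"
    ElseIf bytes(0) = &HFE And bytes(1) = &HFF Then
        DetectBom = "UTF-16 BE"
    Else
        ' No BOM: could be ANSI or UTF-8 without a BOM; telling those
        ' apart requires scanning the content for valid UTF-8 sequences.
        DetectBom = "ANSI or UTF-8 without BOM"
    End If
End Function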
