Compare two files and write it to "match" and "nomatch" files - mainframe

I have two input files, each with a record length of 5200 bytes. A seven-byte key is used to compare the two files. If there is a match, the record needs to be written to a "match" file, but when writing to the match file I need a few fields from infile1 and all other fields from infile2.
If there is no match, the record is written to a "nomatch" file.
Is it possible to do this in SORT? I know it can easily be done with a COBOL program, but I want to know how to do it in SORT/ICETOOL/Easytrieve Plus (EZTPA00).

Since 12,200 people have looked at this question and not got an answer:
DFSORT and SyncSort are the predominant Mainframe sorting products. Their control cards have many similarities, and some differences.
JOINKEYS FILE=F1,FIELDS=(key1startpos,7,A)
JOINKEYS FILE=F2,FIELDS=(key2startpos,7,A)
JOIN UNPAIRED,F1,F2
REFORMAT FIELDS=(F1:1,5200,F2:1,5200)
SORT FIELDS=COPY
A "JOINKEYS" is made of three Tasks. Sub-Task 1 is the first JOINKEYS. Sub-Task 2 is the second JOINKEYS. The Main Task follows and is where the joined data is processed. In the example above it is a simple COPY operation. The joined data will simply be written to SORTOUT.
The JOIN statement defines that as well as matched records, UNPAIRED F1 and F2 records are to be presented to the Main Task.
The REFORMAT statement defines the record which will be presented to the Main Task. A more efficient example, imagining that three fields are required from F2, is:
REFORMAT FIELDS=(F1:1,5200,F2:1,10,30,1,5100,100)
Each of the fields on F2 is defined with a start position and a length.
The record which is then processed by the Main task is 5311 bytes long, and the fields from F2 can be referenced by 5201,10,5211,1,5212,100 with the F1 record being 1,5200.
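The position arithmetic above can be checked with a small Python sketch. This is only a model of how the REFORMAT record is laid out, not mainframe code; the field names are made up for illustration.

```python
def reformat_layout(fields):
    """Return ([(name, start, length), ...], total_length) for a REFORMAT record.

    `fields` is a list of (name, length) pairs in REFORMAT order.
    Start positions are 1-based, as in sort control cards.
    """
    layout = []
    position = 1
    for name, length in fields:
        layout.append((name, position, length))
        position += length
    return layout, position - 1


layout, record_length = reformat_layout([
    ("F1 record", 5200),   # F1:1,5200
    ("F2 field A", 10),    # F2:1,10     (hypothetical field name)
    ("F2 field B", 1),     # F2:30,1     (hypothetical field name)
    ("F2 field C", 100),   # F2:5100,100 (hypothetical field name)
])
for name, start, length in layout:
    print(f"{name}: {start},{length}")
print("REFORMAT record length:", record_length)
```

The printed start,length pairs are the ones to use in the Main Task when referring to the joined record.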
A better way to achieve the same thing is to reduce the size of F2 with JNF2CNTL.
//JNF2CNTL DD *
INREC BUILD=(20,7,1,10,30,1,5100,100)
Some installations of SyncSort do not support JNF2CNTL, and even where supported (from Syncsort MFX for z/OS release 1.4.1.0 onwards), it is not documented by SyncSort. For users of 1.3.2 or 1.4.0 an update is available from SyncSort to provide JNFnCNTL support.
It should be noted that JOINKEYS by default SORTs the data, with option EQUALS. If the data for a JOINKEYS file is already in sequence, SORTED should be specified. For DFSORT NOSEQCHK can also be specified if sequence-checking is not required.
JOINKEYS FILE=F1,FIELDS=(key1startpos,7,A),SORTED,NOSEQCHK
Although the request is strange, as the source file won't be able to be determined, all unmatched records are to go to a separate output file.
With DFSORT, there is a matching-marker, specified with ? in the REFORMAT:
REFORMAT FIELDS=(F1:1,5200,F2:1,10,30,1,5100,100,?)
This increases the length of the REFORMAT record by one byte. The ? can be specified anywhere on the REFORMAT record, and need not be specified. The ? is resolved by DFSORT to: B, data sourced from Both files; 1, unmatched record from F1; 2, unmatched record from F2.
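To make the match marker concrete, here is a rough Python model of what JOIN UNPAIRED,F1,F2 with a trailing ? hands to the Main Task. It is a sketch under assumptions: keys are unique on each file (real JOINKEYS pairs every duplicate with every duplicate), records are fixed-length strings, and the function and parameter names are invented for the illustration.

```python
def join_unpaired(f1, f2, key1, key2, len1, len2, fill=" "):
    """Model of JOIN UNPAIRED,F1,F2 with REFORMAT FIELDS=(F1:...,F2:...,?).

    f1, f2     : lists of fixed-length record strings
    key1, key2 : functions that extract the join key from a record
    len1, len2 : record lengths, used to fill the missing half
    Each result is the F1 part + the F2 part + a one-byte marker (B, 1 or 2).
    """
    f1 = sorted(f1, key=key1)          # each JOINKEYS sorts its file by default
    f2 = sorted(f2, key=key2)
    joined = []
    i = j = 0
    while i < len(f1) or j < len(f2):
        k1 = key1(f1[i]) if i < len(f1) else None
        k2 = key2(f2[j]) if j < len(f2) else None
        if k2 is None or (k1 is not None and k1 < k2):
            joined.append(f1[i] + fill * len2 + "1")    # unmatched F1 record
            i += 1
        elif k1 is None or k2 < k1:
            joined.append(fill * len1 + f2[j] + "2")    # unmatched F2 record
            j += 1
        else:
            joined.append(f1[i] + f2[j] + "B")          # key present on both
            i += 1
            j += 1
    return joined


old = ["001-one ", "003-thr "]
new = ["001-uno ", "002-dos "]
for record in join_unpaired(old, new, lambda r: r[:3], lambda r: r[:3], 8, 8):
    print(repr(record))
```

The last byte of each printed record plays the role of the ? position in the REFORMAT record.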
SyncSort does not have the match marker. The absence or presence of data on the REFORMAT record has to be determined by values. Pick a byte on both input records which cannot contain a particular value (for instance, within a number, decide on a non-numeric value). Then specify that value as the FILL character on the REFORMAT.
REFORMAT FIELDS=(F1:1,5200,F2:1,10,30,1,5100,100),FILL=C'$'
If position 1 on F1 cannot naturally have "$" and position 20 on F2 cannot either, then those two positions can be used to establish the result of the match. The entire record can be tested if necessary, but sucks up more CPU time.
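The FILL technique can be modelled in a few lines of Python. This is a sketch, not sort code: the probe positions are whatever bytes you have decided can never naturally hold the fill character, and the short 8-byte records in the example stand in for the 5200-byte ones.

```python
FILL = "$"


def classify(reformat_record, f1_probe, f2_probe):
    """Classify a REFORMAT record built with FILL=C'$'.

    f1_probe and f2_probe are 1-based positions, within the REFORMAT
    record, of a byte that can never be '$' in genuine F1 / F2 data.
    """
    f1_missing = reformat_record[f1_probe - 1] == FILL
    f2_missing = reformat_record[f2_probe - 1] == FILL
    if f1_missing:
        return "F2 only"      # the F1 half is all fill characters
    if f2_missing:
        return "F1 only"      # the F2 half is all fill characters
    return "matched"


# Two 8-byte records joined: F1 occupies 1-8, F2 occupies 9-16.
# Probe position 1 of F1 and position 2 of F2 (8 + 2 = 10 in the joined record).
print(classify("001-one 001-uno ", 1, 10))
print(classify("003-thr " + FILL * 8, 1, 10))
print(classify(FILL * 8 + "002-dos ", 1, 10))
```

With the 5200-byte records of the question, the same idea gives probe positions 1 (F1) and 5200 + 20 = 5220 (position 20 on F2). Note that when the F1 probe holds the fill character, the real data is in the F2 half of the record.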
The apparent requirement is for all unmatched records, from either F1 or F2, to be written to one file. This will require a REFORMAT statement which includes both records in their entirety:
DFSORT, output unmatched records:
REFORMAT FIELDS=(F1:1,5200,F2:1,5200,?)
OUTFIL FNAMES=NOMATCH,INCLUDE=(10401,1,SS,EQ,C'1,2'),
IFTHEN=(WHEN=(10401,1,CH,EQ,C'1'),
BUILD=(1,5200)),
IFTHEN=(WHEN=NONE,
BUILD=(5201,5200))
SyncSort, output unmatched records:
REFORMAT FIELDS=(F1:1,5200,F2:1,5200),FILL=C'$'
OUTFIL FNAMES=NOMATCH,INCLUDE=(1,1,CH,EQ,C'$',
OR,5220,1,CH,EQ,C'$'),
IFTHEN=(WHEN=(1,1,CH,EQ,C'$'),
BUILD=(5201,5200)),
IFTHEN=(WHEN=NONE,
BUILD=(1,5200))
The coding for SyncSort will also work with DFSORT.
To get the matched records written is easy.
OUTFIL FNAMES=MATCH,SAVE
SAVE ensures that all records not written by another OUTFIL will be written here.
There is some reformatting required, to mainly output data from F1, but to select some fields from F2. This will work for either DFSORT or SyncSort:
OUTFIL FNAMES=MATCH,SAVE,
BUILD=(1,50,10300,100,51,212,5201,10,263,8,5230,1,271,4929)
The whole thing, with arbitrary starts and lengths is:
DFSORT
JOINKEYS FILE=F1,FIELDS=(1,7,A)
JOINKEYS FILE=F2,FIELDS=(20,7,A)
JOIN UNPAIRED,F1,F2
REFORMAT FIELDS=(F1:1,5200,F2:1,5200,?)
SORT FIELDS=COPY
OUTFIL FNAMES=NOMATCH,INCLUDE=(10401,1,SS,EQ,C'1,2'),
IFTHEN=(WHEN=(10401,1,CH,EQ,C'1'),
BUILD=(1,5200)),
IFTHEN=(WHEN=NONE,
BUILD=(5201,5200))
OUTFIL FNAMES=MATCH,SAVE,
BUILD=(1,50,10300,100,51,212,5201,10,263,8,5230,1,271,4929)
SyncSort
JOINKEYS FILE=F1,FIELDS=(1,7,A)
JOINKEYS FILE=F2,FIELDS=(20,7,A)
JOIN UNPAIRED,F1,F2
REFORMAT FIELDS=(F1:1,5200,F2:1,5200),FILL=C'$'
SORT FIELDS=COPY
OUTFIL FNAMES=NOMATCH,INCLUDE=(1,1,CH,EQ,C'$',
OR,5220,1,CH,EQ,C'$'),
IFTHEN=(WHEN=(1,1,CH,EQ,C'$'),
BUILD=(5201,5200)),
IFTHEN=(WHEN=NONE,
BUILD=(1,5200))
OUTFIL FNAMES=MATCH,SAVE,
BUILD=(1,50,10300,100,51,212,5201,10,263,8,5230,1,271,4929)

I last used JCL about two years ago, so I cannot write the code for you, but here is the idea:
Have two steps.
The first step uses ICETOOL to write the matching records to the matched file.
The second step writes the mismatched records to a file, using SORT/ICETOOL or plain file operations.
Again, I apologize for a solution without code, but I have been out of touch for more than two years.

Although this question was posted a long time ago, I want to answer it as it might help others. This can be done easily with JOINKEYS in a SINGLE step. Here is the pseudo code:
Code JOINKEYS with JOIN UNPAIRED,F1,F2 (with no JOIN statement only paired records are returned) and get both records via the REFORMAT statement. Where there is NO match on one of the files, have the missing part filled with a special character, say '$'.
Test for '$' with IFTHEN. If it is present, the record has no paired record and is written to the unpaired file; the rest go to the paired file.
Please do get back in case of any questions.

In Easytrieve it's really easy; below is an example of how you could code it:
//STEP01 EXEC PGM=EZTPA00
//FILEA DD DSN=FILEA,DISP=SHR
//FILEB DD DSN=FILEB,DISP=SHR
//FILEC DD DSN=FILEC.DIF,
// DISP=(NEW,CATLG,DELETE),
// SPACE=(CYL,(100,50),RLSE),
// UNIT=PRMDA,
// DCB=(RECFM=FB,LRECL=5200,BLKSIZE=0)
//SYSOUT DD SYSOUT=*
//SRTMSG DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
FILE FILEA
FA-KEY 1 7 A
FA-REC1 8 10 A
FA-REC2 18 5 A
FILE FILEB
FB-KEY 1 7 A
FB-REC1 8 10 A
FB-REC2 18 5 A
FILE FILEC
FILE FILED
FD-KEY 1 7 A
FD-REC1 8 10 A
FD-REC2 18 5 A
JOB INPUT (FILEA KEY FA-KEY FILEB KEY FB-KEY)
IF MATCHED
FD-KEY = FB-KEY
FD-REC1 = FA-REC1
FD-REC2 = FB-REC2
PUT FILED
ELSE
IF FILEA
PUT FILEC FROM FILEA
ELSE
PUT FILEC FROM FILEB
END-IF
END-IF
/*

//STEP01 EXEC SORT90MB
//SORTJNF1 DD DSN=INPUTFILE1,
// DISP=SHR
//SORTJNF2 DD DSN=INPUTFILE2,
// DISP=SHR
//SORTOUT DD DSN=MISMATCH_OUTPUT_FILE,
// DISP=(,CATLG,DELETE),
// UNIT=TAPE,
// DCB=(RECFM=FB,BLKSIZE=0),
// DSORG=PS
//SYSOUT DD SYSOUT=*
//SYSIN DD *
JOINKEYS FILE=F1,FIELDS=(1,79,A)
JOINKEYS FILE=F2,FIELDS=(1,79,A)
JOIN UNPAIRED,F1,ONLY
SORT FIELDS=COPY
/*

Related

Multiplication division using DFSORT utility in Mainframe

There are two files FILE1.DATA and FILE2.DATA
The task is to calculate a percentage, (number of records in FILE1 / number of records in FILE2) * 100, using DFSORT on the Mainframe, and to set a Return Code if it crosses a threshold (90%).
//********Extracting Unique records data*****************
//SORTT000 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=SAMPLE.DATA1,DISP=SHR
//SORTOUT DD DSN=FILE1.DATA,
// SPACE=(2790,(5376,1075),RLSE),
// UNIT=TSTSF,
// DCB=(RECFM=FB,LRECL=05,BLKSIZE=0),
// DISP=(NEW,CATLG,DELETE)
//SYSIN DD *
SORT FIELDS=(10,5,CH,A)
OUTREC FIELDS=(1:10,5)
SUM FIELDS=NONE
/*
//************Getting count of records*****************
//STEP001 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//SYSOUT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//IN1 DD DISP=SHR,DSN=FILE1.DATA
//IN2 DD DISP=SHR,DSN=FILE2.DATA
//OUT1 DD DSN=FILE1.DATA.COUNT,
// SPACE=(2790,(5376,1075),RLSE),
// UNIT=TSTSF,
// DCB=(RECFM=FB,LRECL=06,BLKSIZE=0),
// DISP=(NEW,CATLG,DELETE)
//OUT2 DD DSN=FILE2.DATA.COUNT,
// SPACE=(2790,(5376,1075),RLSE),
// UNIT=TSTSF,
// DCB=(RECFM=FB,LRECL=06,BLKSIZE=0),
// DISP=(NEW,CATLG,DELETE)
//TOOLIN DD *
COUNT FROM(IN1) WRITE(OUT1) DIGITS(6)
COUNT FROM(IN2) WRITE(OUT2) DIGITS(6)
/*
//*******Calculating percentage and if above 90% setting RC 04*****
//STEP002 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=FILE2.DATA.COUNT,DISP=SHR
// DD DSN=FILE1.DATA.COUNT,DISP=SHR
//SORTOUT DD DSN=FILE.DATA.COUNT.OUT,
// SPACE=(2790,(5376,1075),RLSE),
// UNIT=TSTSF,
// DCB=(RECFM=FB,LRECL=80,BLKSIZE=0),
// DISP=(NEW,CATLG,DELETE)
//SETRC DD SYSOUT=*
//SYSIN DD *
INREC IFTHEN=(WHEN=INIT,BUILD=(1,6,X,6X'00',SEQNUM,1,ZD,80:X)),
IFTHEN=(WHEN=(14,1,ZD,EQ,2),OVERLAY=(8:1,6))
SORT FIELDS=(7,1,CH,A),EQUALS
SUM FIELDS=(8,4,BI,12,2,BI)
OUTREC OVERLAY=(15:X,1,6,ZD,DIV,+2,M11,LENGTH=6,X,
(8,6,ZD,MUL,+100),DIV,1,6,ZD,MUL,+100,EDIT=(TTT.TT))
OUTFIL FNAMES=SETRC,NULLOFL=RC4,INCLUDE=(23,6,CH,GT,C'090.00')
OUTFIL BUILD=(05:C'TOTAL NUMBER RECORDS IN FILE2 : ',1,6,/,
05:C'TOTAL NUMBER RECORDS IN FILE1 : ',8,6,/,
05:C'PERCENTAGE : ',23,6,/,
80:X)
//*
The problem I am facing is that the datasets FILE1.DATA.COUNT and FILE2.DATA.COUNT are being created with a record length of 15 despite LRECL 6 being specified. (Note: this was the question that existed when the first answer was written and does not relate now to the above code.)
Can we merge both steps into one?
What does this, (15:X,1,6,ZD,DIV,+2,M11,LENGTH=6,X, (8,6,ZD,MUL,+100),DIV,1,6,ZD,MUL,+100,EDIT=(TTT.TT)), mean specifically?
The answer to your first question is simply that you did not tell
ICETOOL's COUNT operator how long you wanted the output data to be, so
it came up with its own figure.
This is from the DFSORT Application Programming Guide:
WRITE(countdd) Specifies the ddname of the count data set to be
produced by ICETOOL for this operation. A countdd DD statement must be
present. ICETOOL sets the attributes of the count data set as follows:
- RECFM is set to FB.
- LRECL is set to one of the following:
  - If WIDTH(n) is specified, LRECL is set to n. Use WIDTH(n) if your count
    record length and LRECL must be set to a particular value (for
    example, 80), or if you want to ensure that the count record length
    does not exceed a specific maximum (for example, 20 bytes).
  - If WIDTH(n) is not specified, LRECL is set to the calculated required
    record length. If your LRECL does not need to be set to a particular
    value, you can let ICETOOL determine and set the appropriate LRECL
    value by not specifying WIDTH(n).
And:
DIGITS(d)
Specifies d digits for the count in the output record, overriding the
default of 15 digits. d can be 1 to 15. The count is written as d
decimal digits with leading zeros. DIGITS can only be specified if
WRITE(countdd) is specified.
If you know that your count requires less than 15 digits, you can use
a lower number of digits (d) instead by specifying DIGITS(d). For
example, if DIGITS(10) is specified, 10 digits are used instead of 15.
If you use DIGITS(d) and the count overflows the number of digits
used, ICETOOL terminates the operation. You can prevent the overflow
by specifying an appropriately higher d value for DIGITS(d). For
example, if DIGITS(5) results in overflow, you can use DIGITS(6)
instead.
And:
WIDTH(n)
Specifies the record length and LRECL you want ICETOOL to use for the
count data set. n can be from 1 to 32760. WIDTH can only be specified
if WRITE(countdd) is specified. ICETOOL always calculates the record
length required to write the count record and uses it as follows:
- If WIDTH(n) is specified and the calculated record length is less
  than or equal to n, ICETOOL sets the record length and LRECL to n.
  ICETOOL pads the count record on the right with blanks to the record
  length.
- If WIDTH(n) is specified and the calculated record length is greater
  than n, ICETOOL issues an error message and terminates the operation.
- If WIDTH(n) is not specified, ICETOOL sets the record length and
  LRECL to the calculated record length.
Use WIDTH(n) if your count record length and LRECL must be set to a
particular value (for example, 80), or if you want to ensure that the
count record length does not exceed a specific maximum (for example,
20 bytes). Otherwise, you can let ICETOOL calculate and set the
appropriate record length and LRECL by not specifying WIDTH(n).
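The rules quoted above boil down to a small decision, sketched here in Python. It is a simplification: it assumes the count record contains only the d-digit count, with none of the other COUNT operands that would lengthen the record, and the function name is made up.

```python
def count_record_lrecl(digits=15, width=None):
    """LRECL that the quoted rules give for a COUNT ... WRITE(countdd) data set.

    Assumes the record holds only the count itself (DIGITS(d), default 15).
    """
    if not 1 <= digits <= 15:
        raise ValueError("DIGITS(d): d can be 1 to 15")
    calculated = digits                      # length needed for the count record
    if width is None:
        return calculated                    # no WIDTH(n): use the calculated length
    if not 1 <= width <= 32760:
        raise ValueError("WIDTH(n): n can be from 1 to 32760")
    if calculated > width:
        raise ValueError("calculated record length is greater than WIDTH(n)")
    return width                             # record is padded with blanks to n


print(count_record_lrecl())                    # neither DIGITS nor WIDTH specified
print(count_record_lrecl(digits=6))            # DIGITS(6)
print(count_record_lrecl(digits=6, width=80))  # DIGITS(6) WIDTH(80)
```

The first call corresponds to the situation described in the question: with no DIGITS or WIDTH, the count record comes out 15 bytes long whatever LRECL is coded on the DD statement.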
For your second question, yes it can be done in one step, and greatly simplified.
The thing is, it can be further simplified by doing something else. Exactly what else depends on your actual task, which we don't know, we only know of the solution you have chosen for your task.
For instance, you want to know when one file is within 10% of the size of the other. One way, if on-the-dot accuracy is not required, is to talk to the technical staff who manage your storage. Tell them what you want to do, and they probably already have something you can use to do it with (when discussing this, bear in mind that these are technically data sets, not files).
Alternatively, something has already previously read or written those files. If the last program to do so does not already produce counts of what it has read/written (to my mind, standard good practice, with the program reconciling as well) then amend the programs to do so now. There. Magic. You have your counts.
Arrange for those counts to be in a data set of their own (preferably with record-types, headers/trailers, more standard good practice).
One step to take the larger (expectation) of the two counts, "work out" what 90% would be (doesn't need anything but a simple subtraction, with the right data) and generate a SYMNAMES format file (fixed-length 80-byte records) with a SORT-symbol for a constant with that value.
Second step which uses INCLUDE/OMIT with the symbol in comparison to the second record-count, using NULLOUT or NULLOFL.
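The "simple subtraction" can be pictured like this in Python. It is only an illustration of the arithmetic behind the two steps; which way round the comparison goes (RC 4 when the second count is at or below 90%) is an assumption, since the actual requirement is not known, and the function names are invented.

```python
def ninety_percent(expected_count):
    """90% of a count, worked out as the count minus one tenth of itself."""
    return expected_count - expected_count // 10


def step_return_code(expected_count, actual_count):
    """Model of INCLUDE against the threshold plus NULLOUT/NULLOFL=RC4.

    A record is written only when actual_count is above the threshold;
    an empty output data set is what produces the RC of 4.
    """
    threshold = ninety_percent(expected_count)
    records_written = 1 if actual_count > threshold else 0
    return 0 if records_written else 4


print(ninety_percent(10000))
print(step_return_code(10000, 9500))
print(step_return_code(10000, 8000))
```

Only the two counts are read, so the cost does not grow with the size of the data sets being compared.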
The advantage of the above types of solution is that they basically use very few resources. On the Mainframe, the client pays for resources. Your client may not be so happy at the end of the year to find that they've paid for reading and "counting" 7.3m records just so that you can set an RC.
OK, perhaps 7.3m is not so large, but, when you have your "solution", the next person along is going to do it with 100,000 records, the next with 1,000,000 records. All to set an RC. Any one run of which (even with the 10,000-record example) will outweigh the costs of a "Mainframe" solution running every day for the next 15+ years.
For your third question:
OUTREC OVERLAY=(15:X,1,6,ZD,DIV,+2,M11,LENGTH=6,X,
(8,6,ZD,MUL,+100),DIV,1,6,ZD,MUL,+100,EDIT=(TTT.TT))
OUTREC is processed after SORT/MERGE and SUM (if present) otherwise after INREC. Note, the physical order in which these are specified in the JCL does not affect the order they are processed in.
OVERLAY says "update the information in the current record with these data-manipulations" (BUILD always creates a new copy of the current record).
15: is "column 15" (position 15) on the record.
X inserts a blank.
1,6,ZD means "the information, at this moment, at start-position one for a length of six, which is a zoned-decimal format".
DIV is divide.
+2 is a numeric constant.
1,6,ZD,DIV,+2 means "take the six-digit number starting at position one, and divide it by two, giving a 'result'", which will be placed at the next available position (16 in your case).
M11 is a built-in edit-mask. For details of what that mask is, look it up in the manual, as you will discover other useful pre-defined masks at the time. Use that to format the result.
LENGTH=6 limits the result to six digits.
So far, the number in the first six positions will be divided by two, treated (by the mask) as an unsigned zoned-decimal of six digits, starting from position 16.
The remaining elements of the statement are similar. Brackets affect the "precedence" of numeric operators in a normal way (consult the manual to be familiar with the precedence rules).
EDIT=(TTT.TT) is a user-defined edit mask, in this case inserting a decimal point, truncating the otherwise existing left-most digit, and having significant leading zeros when necessary.
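As a worked example of the arithmetic, here is a Python sketch of the two results the OVERLAY produces. It assumes the FILE2 count is in positions 1 to 6 and the FILE1 count in positions 8 to 13 (as the OUTFIL labels in the question suggest), that MUL and DIV are applied left to right after the bracketed term, and that division discards the remainder. Check the manual before relying on any of that.

```python
def half_of_file2(file2_count):
    """1,6,ZD,DIV,+2,M11,LENGTH=6: six digits with leading zeros."""
    return f"{file2_count // 2:06d}"


def percentage_field(file1_count, file2_count):
    """(8,6,ZD,MUL,+100),DIV,1,6,ZD,MUL,+100,EDIT=(TTT.TT)."""
    value = (file1_count * 100) // file2_count * 100
    digits = f"{value:05d}"[-5:]          # the mask has room for five digits
    return f"{digits[:3]}.{digits[3:]}"   # decimal point inserted by the mask


print(half_of_file2(50))
print(percentage_field(45, 50))
print(percentage_field(46, 51))
print(percentage_field(45, 50) > "090.00")   # the INCLUDE test used for SETRC
```

Because the division happens before the final multiplication by 100, the two decimal places are always zero under these assumptions: 46 out of 51 gives the same result as 45 out of 50.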

Compare two files and include both match and non match records

I need to merge two files into one.
Suppose I have two input files, FILE1 and FILE2. I need to write the non-matching records from FILE1 and FILE2 to FILE3, and I want to write the matching records to FILE3 as well. If there is a match on the key in FILE1 and FILE2, the record to be written must be picked from FILE1 or FILE2 on the basis of some condition.
The key position is the same in both input files.
Can anybody please help me write the SORT control cards? How can I do this in a single step in SyncSort or DFSORT?
Try using join keys
SORT FIELDS=COPY
JOINKEYS FILES=F1,FIELDS=(1,5,A)
JOINKEYS FILES=F2,FIELDS=(1,5,A)
JOIN UNPAIRED,F1,F2
REFORMAT FIELDS=(F1:1,6,F2:1,80)
In the REFORMAT fields you can list the fields you want; for example, if you want the matching records to be picked from file 2, list those fields after F2:
I got my solution using following sort card:
JOINKEYS F1=IN1,FIELDS=(1,7,A,13,7,A)
JOINKEYS F2=IN2,FIELDS=(1,7,A,13,7,A)
JOIN UNPAIRED,F1,F2
REFORMAT FIELDS=(F1:1,239,F2:1,239,?)
OPTION COPY
OUTFIL FNAMES=OUT1,INCLUDE=(479,1,SS,EQ,C'1,2'),
IFTHEN=(WHEN=(479,1,CH,EQ,C'1'),
BUILD=(1,239,479,1)),
IFTHEN=(WHEN=NONE,
BUILD=(240,239,479,1))
OUTFIL FNAMES=OUT2,INCLUDE=(479,1,SS,EQ,C'B'),
IFTHEN=(WHEN=(111,1,FS,EQ,NUM,AND,175,1,FS,EQ,NUM),
BUILD=(1,239)),
IFTHEN=(WHEN=(350,1,FS,EQ,NUM,AND,414,1,FS,EQ,NUM),
BUILD=(240,239)),
IFTHEN=(WHEN=NONE,
BUILD=(1,239))

How to compare two totals from different datasets

We have three different files. The first looks like this:
File A
000001000
000002000
000003000
000004000
File B (After Summing of all Records in file A)
000010000
File C
Total : - 10000
I have to compare the values in File C and File B and, if the values match, set the desired Return Code (RC).
The starting position of the word "Total" is five.
I also found a solution to the problem using JOINKEYS.
The code is given below.
//STEP1 EXEC PGM=SORT,PARM='NULLOUT=RC4'
//SORTJNF1 DD DSN=FILEB,DISP=SHR
//SORTJNF2 DD DSN=FILEC,DISP=SHR
//SORTOUT DD SYSOUT=*
//SYSOUT DD SYSOUT=*
//SYSIN DD *
JOINKEYS FILE=F1,FIELDS=(1,9,A)
JOINKEYS FILE=F2,FIELDS=(5,9,A)
REFORMAT FIELDS=(F1:1,80)
OPTION COPY
/*
When the match is successful, the step returns RC=0.
When the match is not successful, the step returns RC=4, as supplied by the PARM.
Basically, it depends on whether a record reaches the output file (SORTOUT). If the match is successful, the sort utility copies the record from FILEB into the output file (in the spool). If the match is unsuccessful, the output file is empty and, with the help of PARM='NULLOUT=RC4', the step returns RC=4.
Why you wouldn't use JOINKEYS?
A professional would know not to use JOINKEYS for this task or would recognise specific advice against using it.
For everyone else, here's why not.
JOINKEYS consists of three "Tasks": the Main Task and two Sub-Tasks (one for each input dataset). This means JOINKEYS can use up to three times the memory of a plain SORT/MERGE/COPY. This means a JOINKEYS step can be more difficult to "select" and can keep other JOBs from being selected. It also means a JOINKEYS step is going to be slower than an equivalent solution that does not use JOINKEYS, where one is possible (this can depend on the exact solution, but is true for anything trivial).
More memory, more CPU, more elapsed time, more impact on other JOBs.
Only use JOINKEYS where necessary. Necessary does not mean "so I can type/copy-paste less code", it means when a good solution requires the use of JOINKEYS.
If a solution does not require JOINKEYS, then don't use JOINKEYS. Someone pays for the extra resources and impacts. A professional avoids such costs.
Of course, such advice is free, and you get what you pay for at times. However, I am the DFSORT Moderator at www.ibmmainframes.com, taking over that task from Frank Yaeger of IBM, the inventor of the modern DFSORT.
Plus in the sample code given by https://stackoverflow.com/users/5433120/sharad-singhal, the JOINKEYS won't even work with the data that they themselves showed in the question, as the field-types are different (one left-zero filled, the other leading-zero truncated, right-space-padded).
You can of course normalise the second key, using a JNF2CNTL DD and some code (the code needed is included here).
But. Even. Though. It. Can. Be. Made. To. Work. JOINKEYS. Is. A. Bad. Solution. For. This. Task.
I hope this is clear for any future searchers.
I have to set the desired Return Code RC
This is impossible, except by coincidence. The only Condition Codes/Return Codes (CC/RC) available to you from DFSORT are zero, four and 16.
Perhaps that provides one of the CC/RC which you desired. If not, change your desire. Or do it the odd way around by using an IDCAMS step to "convert" the CC/RC you get from DFSORT to the CC/RC you so desperately want.
The only way you can get, with correct control cards, other than a zero RC is by having an empty output file (either SORTOUT or an OUTFIL dataset).
You want to SUM. SUM requires SORT or MERGE. You can also use OUTFIL reporting functions, REMOVECC and TOTAL to get sums, but by that time you don't have an opportunity to test that two totals are equal, or not.
Matching datasets is a good task for JOINKEYS.
You can also code the summation yourself.
Resource-wise and since you are probably a beginner, the best solution will be the MERGE.
You need two DD's in your JCL, SORTIN01 and SORTIN02, in place of SORTIN which you would usually have for a SORT step. You will also need at least one output DD, which can be SORTOUT, but best as another, and this should be set to DUMMY or DSN=NULLFILE.
You need to arrange that all the value records get a "key" which is equal, so that they can be SUMmed, and your Total record should get a different value.
//MATCHTOT EXEC PGM=SORT
//SYMNAMES DD *
* IN- is the input records with values (and the summed value later)
* EXT- are temporary extensions made to the record for processing
* TOT- is the total record
* IND- is an indicator determining whether value or total record
IN-RECORD,*,80,CH
EXT-IN-IND,*,1,CH
EXT-TOT-TOTAL-VALUE,*,9,ZD
POSITION,IN-RECORD
IN-VALUE,=,9,ZD
IN-SUM-VALUE,=,=,=
POSITION,IN-RECORD
SKIP,5
TOT-TOTAL-NAME,*,5,CH
SKIP,5
TOT-TOTAL-VALUE,*,9,CH
* Constants
TOTAL-TEXT,C'Total'
IND-TOT-TOTAL,C'0'
IND-IN-VALUE,C'5'
CLOBBER-FIRST-PART,C'000000000'
//SYMNOUT DD SYSOUT=*
//CHECK DD DUMMY
//SYSOUT DD SYSOUT=*
//SORTOUT DD SYSOUT=*
//FILEB DD SYSOUT=*
//SYSIN DD *
INREC IFTHEN=(WHEN=(TOT-TOTAL-NAME,
EQ,
TOTAL-TEXT),
OVERLAY=(EXT-IN-IND:
IND-TOT-TOTAL,
IN-VALUE:
CLOBBER-FIRST-PART)),
IFTHEN=(WHEN=NONE,
OVERLAY=(EXT-IN-IND:
IND-IN-VALUE))
MERGE FIELDS=(EXT-IN-IND,A)
SUM FIELDS=(IN-VALUE)
OUTREC IFTHEN=(WHEN=GROUP,
BEGIN=(EXT-IN-IND,
EQ,
IND-TOT-TOTAL),
PUSH=(EXT-TOT-TOTAL-VALUE:
TOT-TOTAL-VALUE)),
IFTHEN=(WHEN=INIT,
OVERLAY=(EXT-TOT-TOTAL-VALUE:
EXT-TOT-TOTAL-VALUE,UFF,
TO=ZD,
LENGTH=9))
OUTFIL FNAMES=CHECK,
INCLUDE=(EXT-IN-IND,
EQ,
IND-IN-VALUE,
AND,
IN-SUM-VALUE,
EQ,
EXT-TOT-TOTAL-VALUE),
NULLOFL=RC4
OUTFIL FNAMES=FILEB,
INCLUDE=(EXT-IN-IND,CH,EQ,IND-IN-VALUE),
BUILD=(IN-RECORD)
//SORTIN01 DD *
000001000
000002000
000003000
000004000
//SORTIN02 DD *
Total : - 20000
This uses DFSORT symbols/SYMNAMES. These are defined on the SYMNAMES DD (best as a PDSE (or PDS) member, with fixed-length 80-byte records).
The SYMNOUT DD statement lists the source symbols, and the normalised symbols which will be used by DFSORT to translate the control cards.
In INREC, the key for the merge is established and in the case of the Total record, the input positions (no longer required) that match the value on the value records are set to zero. Any data from an input file which is not needed for output can be happily destroyed (if it is not written, it is not written, so its content is irrelevant and can be amended if convenient to do so).
The MERGE then uses the key just established. This is just a trick to allow SUM to be used. All the value records (key five) will be SUMmed to an actual value, the total record will be left as it is (since its key is unique).
In OUTREC, the total from the Total record will be PUSHed to the next record (the SUMmed values) using WHEN=GROUP. The Unsigned Free Format data of the Total record will be converted to a Zoned Decimal value.
There are then two OUTFIL statements.
The first looks at the SUMmed record and compares the value to the value PUSHed from the Total record. If they are equal, a record is written to the OUTFIL (which is DUMMY in the JCL). The RC4 will not be set. If the values are different, no record will be written and the RC4 will be set.
The second OUTFIL is to create your FILEB, which is not itself needed as a file. If you don't need FILEB for anything else, you can remove this OUTFIL.
I have left the SORTOUT DD in the JCL so you can see what happens to the records. You can remove this once you are happy you know what is going on.
The only other RC you can get is 16. Avoid this as 16 is also produced for Control Card or run-time errors.
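For anyone who finds the control cards hard to follow, the logic of the step can be modelled in Python. This is a sketch of what the MERGE, SUM and the first OUTFIL achieve between them, not of how DFSORT does it. The UFF handling is approximated as "keep the decimal digits, ignore everything else", and the function names are invented.

```python
import re


def uff_to_int(text):
    """Rough stand-in for UFF: use only the decimal digits in the field."""
    digits = re.sub(r"\D", "", text)
    return int(digits) if digits else 0


def check_total(value_records, total_record):
    """Return 0 when the summed values equal the Total record, else 4.

    value_records : records whose first nine bytes are a zoned-decimal value
    total_record  : the 'Total' record, holding the expected total as text
    """
    summed = sum(int(record[:9]) for record in value_records)   # SUM FIELDS=
    total = uff_to_int(total_record)                            # UFF to ZD
    record_written_to_check = summed == total                   # OUTFIL INCLUDE=
    return 0 if record_written_to_check else 4                  # NULLOFL=RC4


values = ["000001000", "000002000", "000003000", "000004000"]
print(check_total(values, "    Total : - 10000"))
print(check_total(values, "    Total : - 20000"))
```

The second call uses the same Total record as the sample SORTIN02 data above, which does not agree with the sum of the four values.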

Compare files with DFSORT , insert a string 'ADD' in front of new records

I have two work files, old (F1) and new (F2), both of the same length. I should compare the files record by record and, for any new record in the new (F2) work file, insert 'ADD' in front of it in the first three positions. For the rest of the records (matched) those positions should be spaces.
As of now I am able to copy the records which are in F2 but not in F1 using the below code:
//SYSIN DD *
JOINKEYS FILE=F1,FIELDS=(1,79,A)
JOINKEYS FILE=F2,FIELDS=(1,79,A)
JOIN UNPAIRED,F2,ONLY
SORT FIELDS=COPY
/*
But I need all the records from F2, with the string 'NEW' in front of the new records. Can this be done in a single step?
This seems to be what you want. I am not convinced you will want it shortly.
All records on each file will be sorted based on, what I'm assuming is, their entire length. This means for the first run your output will be in a different sequence from the input.
A "change" will look exactly the same as a NEW, if a change is possible.
If you're OK with both of those, you should have said so in your question.
//SYSIN DD *
JOINKEYS FILE=F1,FIELDS=(1,79,A)
JOINKEYS FILE=F2,FIELDS=(1,79,A)
JOIN UNPAIRED,F2
REFORMAT FIELDS=(?,F2:1,2,1,79)
INREC IFTHEN=(WHEN=(1,1,CH,EQ,C'2'),
OVERLAY=(1:C'NEW')),
IFTHEN=(WHEN=NONE,
OVERLAY=(1:3X))
SORT FIELDS=COPY
The UNPAIRED,F2 will get you all matches, plus those from F2 (your NEW input) which don't match.
The REFORMAT statement puts the join match-marker (the ?) in the first position, then puts two bytes of anything, then the entire data. The REFORMAT record will be 82 bytes.
In INREC, the field sourced from the match-marker is tested so that NEW can be overlaid at the start of the record if required. Else, three blanks will be overlaid.
There is a clearer way to express the same output:
//SYSIN DD *
JOINKEYS FILE=F1,FIELDS=(1,79,A)
JOINKEYS FILE=F2,FIELDS=(1,79,A)
JOIN UNPAIRED,F2
REFORMAT FIELDS=(?,F2:1,79)
INREC IFTHEN=(WHEN=(1,1,CH,EQ,C'2'),
BUILD=(C'NEW',2,79)),
IFTHEN=(WHEN=NONE,
BUILD=(3X,2,79))
SORT FIELDS=COPY
This time BUILD is used, not OVERLAY.
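In case the intent of the cards is not obvious, this Python sketch produces the same kind of output. It assumes no duplicate records on either file and models the JOINKEYS sort with sorted(); the function name is made up.

```python
def flag_new(old_records, new_records):
    """Prefix each F2 record with 'NEW' if it is not on F1, else three blanks."""
    old = set(old_records)
    flagged = []
    for record in sorted(new_records):      # JOINKEYS sorts F2 on the whole record
        prefix = "   " if record in old else "NEW"
        flagged.append(prefix + record)
    return flagged


old_file = ["APPLE ", "BANANA"]
new_file = ["CHERRY", "APPLE "]
for line in flag_new(old_file, new_file):
    print(repr(line))
```

Note that the output comes back in key sequence, not in the original sequence of the new file, which is the first caveat mentioned above.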
With better (any?) knowledge of your data, better solutions may be available.
Even with information provided on another site but not here, there is not enough.
//SYSIN DD *
JOINKEYS FILE=F1,FIELDS=(1,79,A)
JOINKEYS FILE=F2,FIELDS=(1,79,A)
JOIN UNPAIRED,F2,ONLY
SORT FIELDS=COPY
/*
We can guess that your file is 79, FB, but it needn't be.
To only get the mismatches and put NEW at the front with your existing code, is easy:
//SYSIN DD *
JOINKEYS FILE=F1,FIELDS=(1,79,A)
JOINKEYS FILE=F2,FIELDS=(1,79,A)
JOIN UNPAIRED,F2,ONLY
INREC BUILD=(C'NEW',1,79)
SORT FIELDS=COPY
Your output is of course now 82 bytes per record.
However, this presumes that you have (and can never have) no duplicates on your 79-byte key, and that your entire record is 79 bytes.
It also assumes that you do not care about the order of the output file.
Each JOINKEYS is SORTing its file, and then the data is presented to the matching process.
Bear in mind that with a change, and this method of verifying (sort the whole record, compare the whole record), you will find it difficult to not output two records for a record which is logically the same. One will look like a Delete (from F1) and the other a NEW (on F2) whereas those taken together are just a Change.
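The point about changes is easy to demonstrate with a Python sketch. The first function compares whole records, as the JOINKEYS on 1,79 does; the second shows what a comparison on a real key would report instead. The 4-byte key and the sample data are purely hypothetical, since nothing is known about the actual record layout.

```python
def compare_whole_records(old_records, new_records):
    """Whole-record comparison: returns (looks_deleted, looks_new)."""
    old, new = set(old_records), set(new_records)
    return sorted(old - new), sorted(new - old)


def compare_by_key(old_records, new_records, key_length):
    """Keyed comparison: returns (deleted_keys, new_keys, changed_keys)."""
    old = {record[:key_length]: record for record in old_records}
    new = {record[:key_length]: record for record in new_records}
    deleted = sorted(key for key in old if key not in new)
    added = sorted(key for key in new if key not in old)
    changed = sorted(key for key in old if key in new and old[key] != new[key])
    return deleted, added, changed


old_file = ["0001 SMITH LONDON", "0002 JONES LEEDS "]
new_file = ["0001 SMITH PARIS ", "0002 JONES LEEDS "]
print(compare_whole_records(old_file, new_file))
print(compare_by_key(old_file, new_file, 4))
```

With the whole record as the key, the single change to record 0001 shows up as one apparent Delete and one apparent NEW.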

Sync sort, Unpaired records of File1 have spaces for no records in F2 file. Can we replace those specific column's spaces by ZEROS?

SORT:
JOINKEYS FILES=F1,FIELDS=(5,4,A,10,20,A)
JOINKEYS FILES=F2,FIELDS=(1,4,A,6,20,A)
REFORMAT FIELDS=(F1:10,20,9,1,5,4,30,1,31,10,F2:27,10)
JOIN UNPAIRED,F1
INREC BUILD=(1,36,C',',37,10,C',',27,10,SFF,SUB,37,10,SFF,
EDIT=(TTTTTT))
OUTPUT IS: *2nd row 4th column is spaces as unpaired from 2nd file, needs to be 0s automatically.
22680372 ,5102, 1, 1,000000
22222222 ,5105, 2, ,000002
OUTPUT should be: *2nd row 4th column is 0 or 0000s as unpaired from 2nd file, needs to be 0s automatically.
22680372 ,5102, 1, 1,000000
22222222 ,5105, 2, 0,000002
You need a condition, which means IFTHEN. You can't have IFTHEN and BUILD on the same INREC, but you can have multiple IFTHENs and BUILD can be part of an IFTHEN.
IFTHEN=(WHEN=INIT indicates something which should be done for every record (unconditional).
IFTHEN=(WHEN=(logical-expression will only be actioned if the condition is true.
Every BUILD statement makes a complete new intermediate record (intermediate between input and output). OVERLAY only affects the data at the position specified (assuming no extension of the record).
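Your condition will be that the last byte of the F2 field is a space. That is byte 46 of the REFORMAT record, and byte 47 once the WHEN=INIT BUILD has inserted the first comma. You have already used SFF (did you try the other suggestions, especially FS?), so there is no need to make the value zero before the BUILD.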
JOINKEYS FILES=F1,FIELDS=(5,4,A,10,20,A)
JOINKEYS FILES=F2,FIELDS=(1,4,A,6,20,A)
REFORMAT FIELDS=(F1:10,20,9,1,5,4,30,1,31,10,F2:27,10)
JOIN UNPAIRED,F1
INREC IFTHEN=(WHEN=INIT,
BUILD=(1,36,
C',',
37,10,
C',',
27,10,SFF,
SUB,
37,10,SFF,
EDIT=(TTTTTT))),
IFTHEN=(WHEN=(47,1,CH,EQ,C' '),
OVERLAY=(47:C'0'))
I don't format the statements like that just for fun, but to make them easier to understand and maintain.
OK, that solution was a little clunky. You can replace the INREC with this, which shows, for this type of data, an alternative to the EDIT:
INREC IFTHEN=(WHEN=INIT,
BUILD=(1,36,
C',',
37,10,FS,TO=FS,LENGTH=10,
C',',
27,10,FS,
SUB,
37,10,FS,
TO=FS,LENGTH=8))
This is much more natural, as the space gets turned into a zero with leading blanks with no conditions at all, and using references only to that field in its position on the REFORMAT record.
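A Python sketch of the effect, for illustration only. It assumes a blank FS field is treated as zero (which is what the explanation above relies on) and that TO=FS output is right-justified with leading blanks and a minus sign only for negative values; the function names are invented.

```python
def fs_to_int(field):
    """Read a field as FS: leading blanks, optional sign, digits. All blank is 0."""
    text = field.strip()
    return int(text) if text else 0


def to_fs(value, length):
    """Write a number as TO=FS,LENGTH=n: right-justified, leading blanks."""
    return f"{value:>{length}d}"


f1_value = "         2"      # 27,10 on the REFORMAT record (from F1)
f2_value = " " * 10          # 37,10 on the REFORMAT record (no F2 record)

print(repr(to_fs(fs_to_int(f2_value), 10)))
print(repr(to_fs(fs_to_int(f1_value) - fs_to_int(f2_value), 8)))
```

The blank field from the unpaired record becomes a zero without any IFTHEN condition being needed.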

Resources