I want to write a sort JCL for the following requirement: sorting a file with variable-length records.
Input file:
Mark aaaaaaa
Amy bbbbbb
Paula ccccccccccc
First, sort on the length of the field before the space, in ascending order. That is, sort on the length of the first column/word (Mark, Amy, etc.).
Second, sort on the field after the space in descending order, but any record whose field contains a vowel should always come first, followed by the rest of the data.
To expand on the second part: the fields after the space (aaaaa, bbbbb and ccccc) need to be sorted in descending alphabetical order, but we also need to check whether the field contains a vowel. If it does, that record always goes to the top, so the expected output will be like this.
For the input file above, the output file will be:
Mark aaaaaaaa
Paula cccccc
Amy bbbbbb
Here the first record, which contains a vowel (aaaa), is at the top, and the rest of the data is sorted in descending order. I want to achieve this.
What you are asking is not at all a simple thing :-)
Whilst DFSORT has much intrinsic functionality, finding the length of a sequence of non-space characters is not available.
So you have to roll-your-own.
Although the task is also possible with fixed-length records (different technique) it is easier with variable-length records.
Because the fields are variable-length as well, you'll need PARSE to separate the fields. For variable-length or variably-located fields, PARSE is usually the answer.
PARSE creates fixed-length parsed fields, so you have to know the maximum lengths of your text. In this example 30 is chosen for each.
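If it helps to picture what the two parsed fields hold, here is a minimal Python sketch of the idea (not DFSORT itself). The function name `parse_two_fields` is hypothetical, and it assumes the first field ends at the first blank and that each field is padded or truncated to 30 characters, as chosen above.

```python
FIXLEN = 30  # same maximum length as chosen in the answer


def parse_two_fields(data: str, fixlen: int = FIXLEN) -> tuple[str, str]:
    """Split at the first blank, then pad (or truncate) each part to fixlen."""
    first, _, rest = data.partition(" ")
    return first.ljust(fixlen)[:fixlen], rest.ljust(fixlen)[:fixlen]


first, second = parse_two_fields("MARK AAAAAAA")
print(repr(first))   # 'MARK' followed by 26 blanks
print(repr(second))  # 'AAAAAAA' followed by 23 blanks
```

Both results are always exactly 30 characters wide, which is why the maximum length of the text has to be known in advance.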
The solution will develop piece by piece, because you will need to be secure in your understanding of it. The pieces are presented as "stand alone" code which you can run and see what happens:
OPTION COPY
INREC IFTHEN=(WHEN=INIT,
PARSE=(%01=(ENDBEFR=C' ',
FIXLEN=30),
%02=(FIXLEN=30))),
IFTHEN=(WHEN=INIT,
BUILD=(1,4,%01,%02))
If you run that, you will get this output:
MARK AAAAAAA
AMY BBBBBB
PAULA CCCCCCCCCCC
INREC runs before a SORT, so to make any changes to the data before a SORT, you use INREC. OUTREC runs after SORT, and OUTFIL after OUTREC.
For now, the BUILD is just to show that the PARSEd fields contain the output you want (don't worry about the case, if you used mixed-case it will be like that).
WHEN=INIT means "do this for each record, before the following IFTHEN statements (if any)". You can use multiple WHEN=INIT, and you have to use multiple IFTHEN of some type to transform data in multiple stages.
The 1,4 in the BUILD is for the Record Descriptor Word (RDW), which each variable-length record has. It is always necessary when creating a variable-length current record in SORT, but we'll use it for another purpose here as well.
The next stage is to "extend" the records, because we need two fields to SORT on. For a variable-length record, you extend "at the front". In general:
BUILD=(1,4,extensionstuff,5)
This makes a new version of the current record, with first the RDW from the old current record, then "does some stuff" to create the extension, then copies from position 5 (the first data-byte on a variable-length record) to the end of the record.
Although the RDW is "copied", the value of the RDW at the time is irrelevant, as it will be recalculated for the BUILD. There just has to be an RDW to start with: you can't put anything there except an actual RDW.
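To make the "RDW is recalculated" point concrete, here is a small Python sketch. It assumes the usual unspanned RDW layout (a two-byte big-endian length that includes the RDW itself, followed by two zero bytes) and uses ASCII bytes rather than EBCDIC purely for illustration. The helper names `make_vb_record` and `extend_at_front` are hypothetical.

```python
import struct


def make_vb_record(data: bytes) -> bytes:
    """Prefix data with a 4-byte RDW: 2-byte length (RDW included), then 2 zero bytes."""
    return struct.pack(">HH", len(data) + 4, 0) + data


def extend_at_front(record: bytes, extension: bytes) -> bytes:
    """Model BUILD=(1,4,extension,5): keep the data from position 5, rebuild the RDW."""
    return make_vb_record(extension + record[4:])


old = make_vb_record(b"MARK AAAAAAA")
new = extend_at_front(old, b" " * 5)
print(struct.unpack(">H", old[:2])[0])  # 16
print(struct.unpack(">H", new[:2])[0])  # 21
```

The length in the new RDW reflects the extended record, whatever value the old RDW held.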
Another component that will be needed is to extend the records for the SORT key. We need the length of the first field, and we need a "flag" for whether or not to "sort early" for the second field containing a vowel. For the length it will be convenient to have a two-byte binary value. For now, we are just reserving bytes for the things:
OPTION COPY
INREC BUILD=(1,4,2X,2X,X,5)
Each 2X is two blanks and the X is one blank, so a total of five blanks. It could have been written as 5X, and in the final code it is best written that way, but for now this is clearer. Run that and you will see your records prefixed by five blanks.
There are two tasks: finding the length of the first field, and finding whether the second field contains a vowel.
The key to the first task is to replace blanks from the PARSEd field with "nothing". This will cause the record to be shortened by one for each blank replaced. Saving the length of the original current record, and calculating with the length of the current record and the fixed-length (30) reveals the length of the data.
The key to the second task applies a similar technique. This time, change the second PARSEd field such that a, e, i, o, u are replaced by "nothing". Then if the length is the same as the original, there were no vowels.
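The arithmetic behind both tricks can be checked with a short Python sketch. This is only a model of the idea, not DFSORT: `PREFIX_LEN` is an assumed nine bytes in front of the field (four for the RDW, five for the extension), and the function names are hypothetical.

```python
FIXLEN = 30      # length of each PARSEd field, as in the answer
PREFIX_LEN = 9   # assumed: 4-byte RDW plus 5 extension bytes before the field


def data_length(field: str) -> int:
    """Count the non-blank characters in a FIXLEN-wide field via record lengths."""
    record = "." * PREFIX_LEN + field.ljust(FIXLEN)
    saved_len = len(record)  # the record length saved before the replace
    # Replacing blanks with "nothing" shortens the record by one per blank.
    record = record[:PREFIX_LEN] + record[PREFIX_LEN:].replace(" ", "")
    return len(record) - saved_len + FIXLEN


def has_vowel(field: str) -> bool:
    """True if removing A, E, I, O, U changes the length of the field."""
    stripped = field
    for vowel in "AEIOU":
        stripped = stripped.replace(vowel, "")
    return len(stripped) != len(field)


print(data_length("PAULA"))      # 5
print(has_vowel("CCCCCCCCCCC"))  # False
print(has_vowel("JJJJJJJJJJO"))  # True
```

For PAULA the record shrinks from 39 to 14 bytes, and 14 - 39 + 30 gives 5.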
The FINDREP will look something like this:
IFTHEN=(WHEN=INIT,
FINDREP=(IN=C' ',
OUT=C'',
STARTPOS=n1,
ENDPOS=n2)),
You'll need a variant for the vowels:
IFTHEN=(WHEN=INIT,
FINDREP=(IN=(C'A',C'E',C'I',C'O',C'U'),
OUT=C'',
STARTPOS=n1,
ENDPOS=n2)),
To run:
OPTION COPY
INREC IFTHEN=(WHEN=INIT,
PARSE=(%01=(ENDBEFR=C' ',
FIXLEN=30),
%02=(FIXLEN=30))),
IFTHEN=(WHEN=INIT,
BUILD=(1,4,2X,X,%02)),
IFTHEN=(WHEN=INIT,
OVERLAY=(5:1,2)),
IFTHEN=(WHEN=INIT,
FINDREP=(IN=(C'A',
C'E',
C'I',
C'O',
C'U'),
OUT=C'',
STARTPOS=8,
ENDPOS=38)),
IFTHEN=(WHEN=(1,4,BI,EQ,5,2,BI),
OVERLAY=(7:C'N'))
If you run that, you will see the flag (third data-position) is now space (for a vowel present) or "N". Don't worry that all the "A"s have disappeared, they are still tucked away in %02.
OVERLAY can make changes to the current record without creating a new, replacement record (which is what BUILD does). You'll see OVERLAY used below to get the new record-length after a new current record has been created (the BUILD would get the original record-length from the RDW).
A similar process applies to the other task.
I've included some additional test-data and made further assumptions about your SORT order. Here's the full, annotated code (the comments can remain, as they do not affect the processing):
* PARSE CURRENT INPUT TO GET TWO FIELDS, HELD SEPARATELY FROM THE RECORD.
*
INREC IFTHEN=(WHEN=INIT,
PARSE=(%01=(ENDBEFR=C' ',
FIXLEN=30),
%02=(FIXLEN=30))),
* MAKE A NEW CURRENT RECORD, RDW FROM EXISTING RECORD, THREE EXTENSIONS, AND
* A COPY OF THE FIRST PARSED FIELD.
*
IFTHEN=(WHEN=INIT,
BUILD=(1,4,
2X,
2X,
X,
%01)),
* STORE THE LENGTH OF THE NEW CURRENT RECORD ON THE CURRENT RECORD.
*
IFTHEN=(WHEN=INIT,
OVERLAY=(5:
1,2)),
* REPLACE BLANKS WITH "NOTHING" WITHIN THE COPY OF THE PARSED FIELD. THIS WILL
* AUTOMATICALLY ADJUST THE RDW ON THE CURRENT RECORD.
*
IFTHEN=(WHEN=INIT,
FINDREP=(IN=C' ',
OUT=C'',
STARTPOS=10,
ENDPOS=40)),
* CALCULATE THE LENGTH OF THE NON-BLANKS IN THE FIELD, BY SUBTRACTING PREVIOUS
* STORED RECORD-LENGTH FROM CURRENT RECORD-LENGTH (FIRST TWO BYTES, BINARY, OF
* RDW) AND ADDING 30 (LENGTH OF PARSED FIELD).
*
IFTHEN=(WHEN=INIT,
OVERLAY=(5:
1,2,BI,
SUB,
5,2,BI,
ADD,
+30,
TO=BI,
LENGTH=2)),
* MAKE A NEW CURRENT RECORD, COPYING RDW AND THE VALUE CALCULATED ABOVE, BLANKS
* (COULD BE COPIED) AND THEN THE SECOND PARSED FIELD.
*
IFTHEN=(WHEN=INIT,
BUILD=(1,4,
5,2,
2X,
X,
%02)),
* AGAIN SAVE THE LENGTH OF THE NEW CURRENT RECORD.
*
IFTHEN=(WHEN=INIT,
OVERLAY=(7:
1,2)),
* CHANGE ALL VOWELS TO "NOTHING". THIS WILL AUTOMATICALLY ADJUST THE RDW. FOR
* MIXED-CASE JUST EXTEND THE IN TO INCLUDE LOWER-CASE VOWELS AS WELL.
*
IFTHEN=(WHEN=INIT,
FINDREP=(IN=(C'A',
C'E',
C'I',
C'O',
C'U'),
OUT=C'',
STARTPOS=10,
ENDPOS=40)),
* CALCULATE NUMBER OF VOWELS.
*
IFTHEN=(WHEN=INIT,
OVERLAY=(7:
7,2,BI,
SUB,
1,2,BI,
TO=BI,
LENGTH=2)),
* MAKE A NEW CURRENT RECORD TO BE SORTED, WITH BOTH PARSED FIELDS.
*
IFTHEN=(WHEN=INIT,
BUILD=(1,4,
5,2,
7,2,
9,1,
%01,
%02)),
* SET THE FLAG TO "OUTSORT" THOSE RECORDS WITH A VOWEL IN THE SECOND FIELD.
*
IFTHEN=(WHEN=(7,2,BI,EQ,0),
OVERLAY=(9:
C'N'))
* SORT ON "OUTSORT FLAG", LENGTH OF NAME (DESCENDING), NAME, 2ND FIELD.
SORT FIELDS=(9,1,CH,A,
5,2,CH,D,
10,30,CH,A,
40,30,CH,A)
* FIELDS NEEDED TO BE IN FIXED POSITION FOR SORT, AND EXTENSION FIELDS NO
* LONGER NEEDED. ALSO REMOVE BLANKS FROM THE TWO FIELDS, KEEPING A SEPARATOR
* BETWEEN THEM. THIS COULD INSTEAD BE DONE ON THE OUTFIL.
*
OUTREC BUILD=(1,4,
10,60,
SQZ=(SHIFT=LEFT,
MID=C' '))
* CURRENTLY THE VARIABLE-LENGTH RECORDS ARE ALL THE SAME LENGTH (64 BYTES) SO
* REMOVE TRAILING BLANKS.
*
OUTFIL VLTRIM=C' '
Extensive test-data:
MARK AAAAAAA
AMY BBBBBB
PAULA CCCCCCCCCCC
PAULA BDDDDDDDDDD
IK JJJJJJJJJJO
You can also see how the code works by "removing a line at a time" from the end of the code, so you can see how the transformation reaches that point, or by running the code increasing a line at a time from the start of the code.
It is important that you, and your colleagues, understand the code.
There are some opportunities for some rationalisation. If you can work those out, it means you understand the code. Probably.
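As a cross-check on what to expect from the test-data, here is a minimal Python model of the ordering. It assumes the sort order used in the code above (vowel flag first, then name length descending, then name, then second field), upper-case data, and one blank between the fields. `sort_key` is a hypothetical helper, and Python string comparison stands in for the EBCDIC collating sequence.

```python
def sort_key(record: str) -> tuple[str, int, str, str]:
    """Build the same keys as the SORT: flag, name length (descending), name, 2nd field."""
    name, _, second = record.partition(" ")
    # Blank flag for "has a vowel" sorts ahead of "N", as in the DFSORT code.
    flag = " " if any(ch in "AEIOU" for ch in second) else "N"
    return (flag, -len(name), name, second)


records = [
    "MARK AAAAAAA",
    "AMY BBBBBB",
    "PAULA CCCCCCCCCCC",
    "PAULA BDDDDDDDDDD",
    "IK JJJJJJJJJJO",
]

result = sorted(records, key=sort_key)
for line in result:
    print(line)
# MARK AAAAAAA
# IK JJJJJJJJJJO
# PAULA BDDDDDDDDDD
# PAULA CCCCCCCCCCC
# AMY BBBBBB
```

If your required order differs (for example, name length ascending), only the key tuple changes, which corresponds to changing A and D on the SORT statement.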
Alright, so I have an idea, but I am not sure if there is a way to accomplish this. Starting with this equation:
=IF(OR(ARRAYFORMULA(SUM(COUNTIF(B7:O7,{"I","A","X","R","K","E","AL","FFSL","ADM*"})))=10),"80 Hours","Error")
I would like to embed an AND statement within the same IF statement, if that is at all possible. For instance, the equation above checks all the possible 8-hour shifts. If there are 10 of them, then the employee is scheduled to work 80 hours. I next need to check for a combination of 4 ten-hour shifts and 5 eight-hour shifts. I then need to continue checking other possible combinations that would get the employee to 80 hours.
I know the equation below does not work, but I am trying to do something similar to it.
=IF(OR(ARRAYFORMULA(SUM(COUNTIF(B7:O7,{"I","A","X","R","K","E","AL","FFSL","ADM*"})))=10,(ARRAYFORMULA(SUM(COUNTIF(B7:O7,{"R-10","I-10","X-10","A-10"})))=4,AND(ARRAYFORMULA(SUM(COUNTIF(B7:O7,{"I","A","X","R","K","E","AL","FFSL","ADM*"})))=5),"80 Hours","Error")
Essentially I am trying to embed an AND statement within the original OR statement. Not sure if that is even allowed. I am saying something like this:
IF 1 OR (2 AND 3) OR (3 AND 4), etc...
Shouldn't it be like this?
=IF(OR(ARRAYFORMULA(SUM(COUNTIF(B7:O7,{"I","A","X","R","K","E","AL","FFSL","ADM*"})))=10,
AND(ARRAYFORMULA(SUM(COUNTIF(B7:O7,{"R-10","I-10","X-10","A-10"})))=4, ARRAYFORMULA(SUM(COUNTIF(B7:O7,{"I","A","X","R","K","E","AL","FFSL","ADM*"})))=5)) ,"80 Hours","Error")
In Excel, AND and OR are functions, not operators. So IF 1 OR (2 AND 3) OR (3 AND 4) translates to
IF( OR(1, AND(2, 3), AND(3,4)) , <true_statement>, <false_statement>)
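The same nesting, written out in Python, may make the shape easier to see. This is only a rough sketch: `count_matching` is a hypothetical stand-in for SUM(COUNTIF(range, {...})), it assumes COUNTIF's case-insensitive matching with `*` as a wildcard, and it covers just the two combinations mentioned in the question.

```python
from fnmatch import fnmatchcase

EIGHT_HOUR = ["I", "A", "X", "R", "K", "E", "AL", "FFSL", "ADM*"]
TEN_HOUR = ["R-10", "I-10", "X-10", "A-10"]


def count_matching(cells: list[str], patterns: list[str]) -> int:
    """Count cell/pattern matches, ignoring case, with * as a wildcard."""
    return sum(
        1
        for cell in cells
        for pattern in patterns
        if fnmatchcase(cell.upper(), pattern.upper())
    )


def schedule_status(cells: list[str]) -> str:
    """IF(OR(eights = 10, AND(tens = 4, eights = 5)), "80 Hours", "Error")."""
    eights = count_matching(cells, EIGHT_HOUR)
    tens = count_matching(cells, TEN_HOUR)
    return "80 Hours" if (eights == 10 or (tens == 4 and eights == 5)) else "Error"


print(schedule_status(["I"] * 10 + [""] * 4))               # 80 Hours
print(schedule_status(["R-10"] * 4 + ["A"] * 5 + [""] * 5))  # 80 Hours
print(schedule_status(["I"] * 9 + [""] * 5))                # Error
```

Each further combination becomes one more `or (... and ...)` term, just as it becomes one more AND(...) argument inside the OR.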
I’m trying to automate some processes for task management, but I’m having no success. I can’t use macros or similar, just formulas, and I’m not adept at spreadsheet hacking.
Anyways, here’s my workbook, with its **sheets**:
**Form**
TASK LI DE X
Test 1 3
Test2 2
**LI**
WEEK TASK COMPLETED
1 Test
2 Test
2 Test *
4 Test2 *
**DE**
WEEK TASK COMPLETED
1 Test *
What I’ve been trying to do is:
On Form, check which column, from LI or DE, is > 0.
For each one > 0, check for the existence of TASK on its respective sheet (LI or DE).
If it is there, check if it has an *.
If it has an *, take the WEEK number of that row, compare it to the WEEK from the other sheet, take the greater number, and load it into the X column of the TASK on Form. The order here doesn’t really matter. I just need the WEEK from the one with an *.
For this example, in order for X to change, TASK must be with an * in the sheets where it is. For instance, if, on Form, Test has numbers in LI and DE, and Test has an * in LI sheet, but not in DE sheet, X must remain empty. But if both have it with *, X must be loaded with the greater WEEK between LI and DE.
If I were to do it with macros, I would simply check each column with a loop, but with formulas I suppose nested IFs would suffice.
I’ve tried with VLOOKUP, but it only takes the first item in the array, and though the order doesn’t matter, it is generally (I think I will make this a policy) the last value.
If anything is unclear, just let me know! I hope I made my issue clear.
Thank you very much in advance!
I think you can do it with a formula, but as you will have to loop, you will need SUMPRODUCT or an array formula.
Here is a formula you can try (validate with Ctrl+Shift+Enter):
=MAX((LI!$C$2:$C$5="*")*(LI!$A$2:$A$5)*(LI!$B$2:$B$5=Form!A2),(DE!$C$2:$C$5="*")*(DE!$A$2:$A$5)*(DE!$B$2:$B$5=Form!A2))
Some explanation:
The MAX formula will find the greatest value between the two array expressions, one for each worksheet
The array formula works like a multiple loop test:
(LI!$C$2:$C$5="*") checks if there is a star in the third column
(LI!$A$2:$A$5) will return the week number
(LI!$B$2:$B$5=Form!A2) will check if the tasks are the same
I hope I understood well what you intended to do :)
[EDIT] Another try thanks to your comment (both tasks must be completed for a value to appear)
=IF(AND(MAX((LI!$C$2:$C$5="*")*(LI!$A$2:$A$5)*(LI!$B$2:$B$5=Form!A2))>0,MAX((DE!$C$2:$C$5="*")*(DE!$A$2:$A$5)*(DE!$B$2:$B$5=Form!A2))>0),MAX((LI!$C$2:$C$5="*")*(LI!$A$2:$A$5)*(LI!$B$2:$B$5=Form!A2),(DE!$C$2:$C$5="*")*(DE!$A$2:$A$5)*(DE!$B$2:$B$5=Form!A2)),"")
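To check the intended logic outside the spreadsheet, here is a minimal Python sketch. The rows are the sample data from the question as I read them (week, task, completed mark), and the function names are hypothetical. Multiplying the three conditions in the array formula amounts to keeping the week only on rows where the task matches and the mark is a star, and 0 everywhere else.

```python
LI = [(1, "Test", ""), (2, "Test", ""), (2, "Test", "*"), (4, "Test2", "*")]
DE = [(1, "Test", "*")]


def starred_max_week(rows: list[tuple[int, str, str]], task: str) -> int:
    """Greatest WEEK among starred rows for the task on one sheet, 0 if none."""
    return max((week for week, name, done in rows if name == task and done == "*"), default=0)


def latest_week(task: str) -> int:
    """First formula: the greater starred week across both sheets."""
    return max(starred_max_week(LI, task), starred_max_week(DE, task))


def latest_week_if_both(task: str):
    """Second idea: only return a week when both sheets have a starred row."""
    li_week = starred_max_week(LI, task)
    de_week = starred_max_week(DE, task)
    return max(li_week, de_week) if li_week > 0 and de_week > 0 else ""


print(latest_week("Test"))                 # 2
print(latest_week("Test2"))                # 4
print(repr(latest_week_if_both("Test2")))  # ''
```

Note that the "both sheets" rule leaves Test2 empty because it has no row on DE; whether that is wanted depends on how tasks that only appear on one sheet should be treated.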