Finding index of a substring in COBOL - string

I'm looking for the positions in a string where a specified substring occurs.
E.g., looking for the substring "green" in the string "green eggs and ham" should return 1, but in "green eggs and green ham" it should return 1 and 16.
How should I do this?
Edit 1: Changed the wording so position starts at 1, not 0.
Edit 2: I can find the first instance as WS-POINTER in the following snippet:
MOVE 1 TO WS-POINTER
UNSTRING WS-STRING(1:WS-STRING-LEN)
DELIMITED BY LT-MY-DELIMITER
INTO WS-STRING-GARBAGE
WITH POINTER WS-POINTER
END-UNSTRING

AFAIK COBOL does not have a statement to find the position of a string within a string, so that needs to be done manually. However, COBOL does have a statement that counts the occurrences of a string within a string:
INSPECT string TALLYING counter FOR ALL search-string
Here is an example program that works in OpenCOBOL (see OpenCobol.org):
IDENTIFICATION DIVISION.
PROGRAM-ID. OCCURRENCES.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
DATA DIVISION.
FILE SECTION.
WORKING-STORAGE SECTION.
01 TEST-STRING-1 PIC X(30)
VALUE 'green eggs and ham'.
01 TEST-STRING-2 PIC X(30)
VALUE 'green eggs and green ham'.
01 TEST-STRING PIC X(30).
01 SEARCH-STRING PIC X(05)
VALUE 'green'.
01 MATCH-COUNT PIC 9.
01 SEARCH-INDEX PIC 99.
01 MATCH-POSITIONS.
05 MATCH-POS PIC 99 OCCURS 9 TIMES.
PROCEDURE DIVISION.
MAIN.
MOVE TEST-STRING-1 TO TEST-STRING
PERFORM FIND-MATCHES
MOVE TEST-STRING-2 TO TEST-STRING
PERFORM FIND-MATCHES
STOP RUN
.
FIND-MATCHES.
MOVE ZERO TO MATCH-COUNT
INSPECT TEST-STRING TALLYING MATCH-COUNT
FOR ALL SEARCH-STRING.
DISPLAY 'FOUND ' MATCH-COUNT ' OCCURRENCE(S) OF '
SEARCH-STRING ' IN:'
DISPLAY TEST-STRING
DISPLAY 'MATCHES FOUND AT POSITIONS: ' WITH NO ADVANCING
*> 26 = 30 - 5 + 1: last position where a 5-character match still fits
PERFORM VARYING SEARCH-INDEX FROM 1 BY 1
UNTIL SEARCH-INDEX > 26
IF TEST-STRING (SEARCH-INDEX:5) = SEARCH-STRING
DISPLAY SEARCH-INDEX ' ' WITH NO ADVANCING
END-IF
END-PERFORM
DISPLAY ' '
DISPLAY ' '
.
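The position scan in FIND-MATCHES can also be modelled outside COBOL. The following is a minimal Python sketch of the same idea, not COBOL and not part of the original program; the function name find_positions is hypothetical. It reports 1-based positions and stops at the last position where the pattern still fits in the field.

```python
from typing import List


def find_positions(text: str, pattern: str) -> List[int]:
    """Return the 1-based positions at which pattern starts in text."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    positions = []
    # Last 0-based start where the whole pattern still fits in the field.
    last_start = len(text) - len(pattern)
    for start in range(last_start + 1):
        if text[start:start + len(pattern)] == pattern:
            positions.append(start + 1)  # COBOL positions start at 1
    return positions


# Fields padded to 30 characters, like PIC X(30) in the COBOL program.
print(find_positions("green eggs and ham".ljust(30), "green"))
print(find_positions("green eggs and green ham".ljust(30), "green"))
```

Like the reference-modification loop, this sketch also reports overlapping matches, whereas INSPECT ... TALLYING ... FOR ALL counts only non-overlapping ones.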

On IBM i, you could call QCLSCAN:
77 QCLSCAN-SRCHLEN PIC S9(3) COMP-3.
77 QCLSCAN-STARTPOS PIC S9(3) COMP-3.
77 QCLSCAN-PATLEN PIC S9(3) COMP-3.
77 QCLSCAN-XLATE PIC X(01) VALUE "0".
77 QCLSCAN-TRIM PIC X(01) VALUE "0".
77 QCLSCAN-WILDCARD PIC X(01) VALUE LOW-VALUES.
77 QCLSCAN-FOUNDPOS PIC S9(3) COMP-3.
...
...
MOVE LENGTH OF WRK-ACCT-NBR TO QCLSCAN-SRCHLEN
MOVE 1 TO QCLSCAN-STARTPOS
MOVE 9 TO QCLSCAN-PATLEN
MOVE "0" TO QCLSCAN-XLATE
MOVE "0" TO QCLSCAN-TRIM
MOVE "?" TO QCLSCAN-WILDCARD
CALL "QCLSCAN" USING WRK-ACCT-NBR
QCLSCAN-SRCHLEN
QCLSCAN-STARTPOS
EMPLOYEE-SSN-9X
QCLSCAN-PATLEN
QCLSCAN-XLATE
QCLSCAN-TRIM
QCLSCAN-WILDCARD
QCLSCAN-FOUNDPOS
IF QCLSCAN-FOUNDPOS > ZERO
* Found data in position QCLSCAN-FOUNDPOS
CONTINUE
ELSE
* Found no match
CONTINUE
END-IF

MOVE 1 TO WS-POINTER
UNSTRING WS-STRING(1:WS-STRING-LEN)
DELIMITED BY LT-MY-DELIMITER
INTO WS-STRING-GARBAGE
WITH POINTER WS-POINTER
END-UNSTRING
You ask about how to use the above for subsequent strings.
UNSTRING can give you the positions you want in two ways: either with multiple receiving fields and COUNT IN, or with repeated executions of UNSTRING, each starting from the POINTER value left by the previous one.
Either way you need to account for the length of the delimiter, and you will end up with non-intuitive code that has to be puzzled out each time someone picks up the program.
Instead, it is a simple task with "substring" processing, using either OCCURS DEPENDING ON or reference modification (the method in the accepted answer).
You must make sure you don't go beyond the end of the field: stop the search once the current position plus the length of the search string, minus 1, exceeds the length of the string being searched.

Related

COBOL: remove certain characters from string

I would like to remove certain characters from a string in COBOL.
For example, '****This is*a test** string.' will become 'This isa test string.', '"Second one"' will become 'Second one'.
While INSPECT ... REPLACING cannot change the position of characters within a data item, INSPECT ... CONVERTING may be used to prepare the data item for subsequent operations.
In the following, the procedure strip-string first converts all characters to be removed to a single common character, in this case LOW-VALUES. This fragments the string so that the common character may be used to delimit the fragments. The PERFORM loops over the fragmented string. The UNSTRING statement moves one fragment to the output and provides a COUNT of the number of characters moved. The ADD advances the output starting position so that the fragments are placed one after another.
Code:
data division.
working-storage section.
1 binary.
2 p pic 9(4).
2 o pic 9(4).
2 o-count pic 9(4).
1 i-string pic x(40).
88 test-1 value '****This is*a test** string.'.
88 test-2 value '"Second one"'.
1 o-string pic x(40).
1 r-chars pic x(2) value '*"'. *> characters to be removed
procedure division.
begin.
set test-1 to true
perform test-prep
set test-2 to true
perform test-prep
stop run
.
test-prep.
display i-string
perform strip-string
display o-string
display space
.
strip-string.
inspect i-string converting r-chars to low-values
move 1 to p o
perform until p > function length (i-string)
unstring i-string
delimited all low-values
into o-string (o:)
count in o-count
with pointer p
add o-count to o
end-perform
.
Output:
****This is*a test** string.
This isa test string.
"Second one"
Second one
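The fragment-and-reassemble idea behind strip-string can be sketched in a few lines of Python. This is only a model of the technique, not COBOL: the name strip_chars is hypothetical, and the fixed-width padding of the COBOL fields is ignored.

```python
def strip_chars(text: str, unwanted: str) -> str:
    """Remove every character listed in unwanted from text."""
    marker = "\x00"  # plays the role of LOW-VALUES
    # Step 1 (INSPECT ... CONVERTING): map each unwanted character to the marker.
    table = str.maketrans(unwanted, marker * len(unwanted))
    fragmented = text.translate(table)
    # Step 2 (UNSTRING ... DELIMITED ALL): split on runs of the marker and
    # place the fragments one after another.
    return "".join(part for part in fragmented.split(marker) if part)


print(strip_chars('****This is*a test** string.', '*"'))
print(strip_chars('"Second one"', '*"'))
```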
Try the following code snippet.
IDENTIFICATION DIVISION.
PROGRAM-ID. HELLO-WORLD.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-STR PIC X(20) VALUE '****This is*a test**'.
01 WS-CNT PIC 99 VALUE 0.
01 WS-I PIC 99 VALUE 0.
01 WS-J PIC 99 VALUE 1.
01 WS-CHAR.
05 WS-LETTER OCCURS 1 TO 20 TIMES DEPENDING ON WS-CNT PIC X.
PROCEDURE DIVISION.
PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > FUNCTION LENGTH(WS-STR)
IF WS-STR(WS-I:1) = '*' THEN
CONTINUE
ELSE
*> Grow the table first so WS-LETTER(WS-J) is within the current size
ADD 1 TO WS-CNT
MOVE WS-STR(WS-I:1) TO WS-LETTER(WS-J)
ADD 1 TO WS-J
END-IF
END-PERFORM
DISPLAY WS-CHAR
STOP RUN.
Output:
This isa test
Note: I used Tutorialspoint's COBOL Coding Ground to run the above snippet. COBOL code doesn't need to be indented there.

Reversing a string without using reverse function in COBOL

My task is to reverse a string in COBOL without using the REVERSE function.
So far I've got this:
MOVE 20 TO LOO.
MOVE 1 TO LOP.
MOVE 20 TO LOU.
MOVE EINA01 OF FORMAT1 TO WORTTXT1.
PERFORM 20 TIMES
MOVE WORTTXT1 (LOP:1) TO B (20:LOO)
SUBTRACT 1 FROM LOO
ADD 1 TO LOP
MOVE B TO WORTTXT2 (20:LOU)
SUBTRACT 1 FROM LOU
END-PERFORM.
MOVE WORTTXT2 TO AUSA01 OF FORMAT1.
AUSA01 is the output, EINA01 the input.
The problem I have right now is this: if I enter "Hello" in the input field, all I get is "00000000000h". Only the first letter is reversed, but the result is supposed to look like " Hello".
As you've mentioned that the program has to reverse the spaces as well, I suggest modifying the PERFORM loop as shown below.
PERFORM 20 TIMES
MOVE WORTTXT1(LOP:1) TO B(LOO:1)
SUBTRACT 1 FROM LOO
ADD 1 TO LOP
END-PERFORM.
Full program:
IDENTIFICATION DIVISION.
PROGRAM-ID. HELLO-WORLD.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 EINA01 PIC X(20) VALUE 'Srinivasan '.
01 WORTTXT1 PIC X(20) VALUE SPACES.
01 WORTTXT2 PIC X(20) VALUE SPACES.
01 AUSA01 PIC X(20) VALUE SPACES.
01 B PIC X(20) VALUE SPACES.
01 LOO PIC 9(2) VALUE 0.
01 LOP PIC 9(2) VALUE 0.
PROCEDURE DIVISION.
MOVE 20 TO LOO.
MOVE 1 TO LOP.
MOVE EINA01 TO WORTTXT1.
PERFORM 20 TIMES
MOVE WORTTXT1(LOP:1) TO B(LOO:1)
SUBTRACT 1 FROM LOO
ADD 1 TO LOP
END-PERFORM.
MOVE B TO AUSA01.
DISPLAY AUSA01.
STOP RUN.
Note: I'm not using the data items LOU and WORTTXT2, as I felt that they are not required.
Output:
nasavinirS
You can also use INSPECT to reverse a string. Code something like this to reverse an 8-character string SRC and get the result in TGT:
MOVE x'0807060504030201' TO TGT
INSPECT TGT CONVERTING x'0102030405060708' TO SRC
Note that you can use this trick to reorder a field in any desired way.
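To see why the INSPECT ... CONVERTING trick works, here is a small Python model of it (not COBOL; the function name reorder_via_converting is hypothetical). The target starts out holding position numbers as byte values, and the conversion replaces each byte k with the k-th byte of the source, which is what CONVERTING x'0102...' TO SRC does.

```python
def reorder_via_converting(src: bytes, order) -> bytes:
    """Rebuild src in the order given by 1-based positions in order."""
    n = len(src)
    if not 1 <= n <= 255:
        raise ValueError("source must be 1 to 255 bytes long")
    if any(not 1 <= k <= n for k in order):
        raise ValueError("positions must be between 1 and len(src)")
    template = bytes(order)                # e.g. x'0807060504030201'
    from_set = bytes(range(1, n + 1))      # x'0102030405060708'
    # Byte value k maps to the k-th byte of src, all in one pass.
    table = bytes.maketrans(from_set, src)
    return template.translate(table)


# Reversal is the special case where the positions count down.
print(reorder_via_converting(b"ABCDEFGH", range(8, 0, -1)))
# Any other permutation works the same way, here swapping adjacent pairs.
print(reorder_via_converting(b"ABCD", [2, 1, 4, 3]))
```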

COBOL: How to count all characters after trimming all the spaces before and after Input

STRING FUNCTION TRIMR(EINA01 OF FORMAT1)
DELIMITED BY SIZE
INTO WORTTXT1
END-STRING.
MOVE FUNCTION REVERSE (WORTTXT1) TO WORTTXT2.
STRING FUNCTION TRIMR(WORTTXT2)
DELIMITED BY SIZE
INTO WORTTXT3
END-STRING.
INSPECT WORTTXT3 TALLYING LOO FOR CHARACTERS
BEFORE INITIAL SPACES.
MOVE EINN01 OF FORMAT1 TO X.
MOVE EINN02 OF FORMAT1 TO Y.
MOVE EINA01 OF FORMAT1 (X:Y)
TO AUSA01 OF FORMAT1.
Our problem is that if we exceed the length of the variable EINA01, which is 50, the program crashes.
Our idea was to trim all the spaces from the left and right and count all the characters of the given input.
The problem we face is that we have no way to count all the characters, since we would usually do it with INSPECT ... TALLYING ... FOR CHARACTERS BEFORE INITIAL SPACES.
But for an input like "Hello World", that only counts everything up to the first space after "Hello".
If you want to get the length of a string, there are a couple of different methods to do this:
METHOD 1
a simple loop:
01 WS-INPUT-STRING PIC X(100) VALUE "12345678901234567890".
01 WS-OUTPUT-STRING PIC X(50).
01 WS-POS PIC 9(4) COMP.
*> Test WS-POS first so position 0 is never referenced
PERFORM VARYING WS-POS
FROM 100 BY -1
UNTIL WS-POS < 1 OR
WS-INPUT-STRING(WS-POS:1)
NOT EQUAL SPACE
END-PERFORM
IF WS-POS > 0 AND WS-POS <= 50
MOVE WS-INPUT-STRING(1:WS-POS) TO WS-OUTPUT-STRING
END-IF
METHOD 2
inspect tallying
01 WS-INPUT-STRING PIC X(100) VALUE "12345678901234567890".
01 WS-OUTPUT-STRING PIC X(50).
01 WS-BLANK-COUNT PIC 9(4) COMP.
01 WS-IN-MAX PIC 9(4) COMP VALUE 100.
*> TALLYING adds to the counter, so reset it first
MOVE ZERO TO WS-BLANK-COUNT
INSPECT FUNCTION REVERSE (WS-INPUT-STRING)
TALLYING WS-BLANK-COUNT FOR LEADING SPACES
IF WS-BLANK-COUNT < WS-IN-MAX AND
(WS-IN-MAX - WS-BLANK-COUNT) <= 50
MOVE WS-INPUT-STRING(1:WS-IN-MAX - WS-BLANK-COUNT)
TO WS-OUTPUT-STRING
END-IF
Both of these are viable options. I prefer the loop myself.
Also remember that leading spaces are typically significant; I wouldn't recommend trimming them unless you are 100% sure they are not required.
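The backward scan of METHOD 1 can be modelled in Python to check the edge cases. This is a sketch of the logic only, not COBOL, and the name trimmed_length is hypothetical. Embedded spaces count towards the length, and an all-blank field gives 0, which must not be used as a reference-modification length.

```python
def trimmed_length(field: str) -> int:
    """Length of field without its trailing spaces (0 if all blank)."""
    pos = len(field)
    # Walk back from the end while the current character is a space.
    while pos > 0 and field[pos - 1] == " ":
        pos -= 1
    return pos


print(trimmed_length("Hello World".ljust(50)))
print(trimmed_length(" " * 50))
```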

Concatenate strings in COBOL without leading or trailing spaces

I'm having issues with this task; our task is to concatenate 2 strings without spaces in between those two, e.g.:
Input Alphanum. . 1: This string is
Input Alphanum. . 2: concatenated.
Input Alphanum. . 3:
Alphanum. output 1: This string isconcatenated.
Alphanum. output 2:
I can get these 2 strings together into the output, but they won't attach to each other without the spaces in between.
Assuming you don't know the length of the two strings, you would need to do something like this:
01 WS-INPUT-STRINGS.
05 WS-STRING1 PIC X(100) VALUE "THIS STRING IS".
05 WS-STRING2 PIC X(100) VALUE "CONCATENATED".
05 WS-STRING3 PIC X(100) VALUE SPACES.
01 WS-OUTPUT-STRINGS.
05 WS-CONCATENATED-OUTPUT PIC X(300) VALUE SPACES.
01 WS-COUNTERS.
05 WS-LEN-IN PIC 9(4) COMP VALUE 1.
05 WS-POS PIC 9(4) COMP VALUE 1.
IF WS-STRING1 NOT EQUAL SPACES AND LOW-VALUES
PERFORM VARYING WS-LEN-IN
FROM 100 BY -1
UNTIL WS-STRING1(WS-LEN-IN:1)
NOT EQUAL SPACES AND LOW-VALUES OR
WS-LEN-IN = 1
END-PERFORM
MOVE WS-STRING1(1:WS-LEN-IN)
TO WS-CONCATENATED-OUTPUT(WS-POS:WS-LEN-IN)
ADD WS-LEN-IN TO WS-POS
END-IF
IF WS-STRING2 NOT EQUAL SPACES AND LOW-VALUES
PERFORM VARYING WS-LEN-IN
FROM 100 BY -1
UNTIL WS-STRING2(WS-LEN-IN:1)
NOT EQUAL SPACES AND LOW-VALUES OR
WS-LEN-IN = 1
END-PERFORM
MOVE WS-STRING2(1:WS-LEN-IN)
TO WS-CONCATENATED-OUTPUT(WS-POS:WS-LEN-IN)
ADD WS-LEN-IN TO WS-POS
END-IF
IF WS-STRING3 NOT EQUAL SPACES AND LOW-VALUES
PERFORM VARYING WS-LEN-IN
FROM 100 BY -1
UNTIL WS-STRING3(WS-LEN-IN:1)
NOT EQUAL SPACES AND LOW-VALUES OR
WS-LEN-IN = 1
END-PERFORM
MOVE WS-STRING3(1:WS-LEN-IN)
TO WS-CONCATENATED-OUTPUT(WS-POS:WS-LEN-IN)
ADD WS-LEN-IN TO WS-POS
END-IF
*> WS-POS is one past the last character written
IF WS-POS > 1
DISPLAY WS-CONCATENATED-OUTPUT(1:WS-POS - 1)
END-IF
You could put this into a paragraph and perform it over and over, but I did it this way to illustrate exactly what is going on. When you define a picture clause in COBOL, the item always has that length, so if I just tried to string the 3 variables together, there would be lots of extra space between them, because each item is 100 characters long regardless of what I put in it. I use these loops to calculate the length of each variable. First I check that there is something in the variable, then I loop backwards until I find a character.
You did not say whether you needed to trim leading spaces as well, so I assumed trailing spaces only. You could also use INSPECT TALLYING to get the count rather than writing the loops.
Use the UNSTRING command with the TALLYING IN and WITH POINTER options to keep track of where you want to put the next string.
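The bookkeeping in the answers above (measure each field without its trailing spaces, copy it at the current output position, advance the position) can be sketched in Python. This is a model of the approach, not COBOL; the name concat_trimmed and the 0-based position are assumptions of the sketch.

```python
def concat_trimmed(fields, out_size=300):
    """Pack fields into a fixed-width buffer without their trailing spaces.

    Returns the padded buffer and the number of characters used.
    """
    out = [" "] * out_size   # like PIC X(300) VALUE SPACES
    pos = 0                  # next free 0-based position in the buffer
    for field in fields:
        length = len(field.rstrip(" "))
        if length == 0:
            continue         # nothing but spaces: skip the field
        if pos + length > out_size:
            raise ValueError("output buffer too small")
        out[pos:pos + length] = field[:length]
        pos += length
    return "".join(out), pos


buffer, used = concat_trimmed(
    ["THIS STRING IS".ljust(100), "CONCATENATED".ljust(100), " " * 100]
)
print(buffer[:used])
```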

Vim incorrectly removes zeros when decrementing

Vim removes zeros from in front of some digits when decrementing:
If I take a text file with the following:
a02
a03
a04
a05
a06
a07
a08
a09
a10
a11
And use ctrl+V to highlight the second and third columns, and then hit ctrl+X to decrement, I am left with:
a01
a02
a03
a04
a05
a06
a7
a8
a9
a10
I am running Vim version 7.4.1689 and I loaded it without my .vimrc via
$ vim -u NONE
This is happening because Vim will automatically recognize and convert octal values.
From the help (:h variables):
Conversion from a Number to a String is by making the ASCII representation of
the Number.
Examples:
Number 123 --> String "123"
Number 0 --> String "0"
Number -1 --> String "-1"
Conversion from a String to a Number is done by converting the first digits to
a number. Hexadecimal "0xf9", Octal "017", and Binary "0b10" numbers are
recognized. If the String doesn't start with digits, the result is zero.
Examples:
String "456" --> Number 456
String "6bar" --> Number 6
String "foo" --> Number 0
String "0xf1" --> Number 241
String "0100" --> Number 64
String "0b101" --> Number 5
String "-8" --> Number -8
String "+8" --> Number 0
Your values 02 through 07 are being recognized as valid octal values and preserved as such, decremented to octal 01 through 06.
When you reach 08 it is not a valid octal value. It is treated as the string 08, converted to decimal value 8, and decremented to 7. This happens again with 09, which ends up being 8.
The 10 and 11 values are decremented as decimal as well. Because 10 was decimal, not octal, you don't get a leading 0 in the resulting 9 value.
I'm not aware of a way to do what you want with the decrement command.
EDIT: After finding this answer, I tested this expression and it does what you are trying to do in this specific case:
:%s/\v[0-9]+/\=printf("%02d", substitute(submatch(0), '^0\+', '', 0)-1)/
I'm not sure whether this solves your general use case, because it's quite different from the original operation using a selection. But for the file you provided, it achieves the result you were after.
Dissecting this a bit to explain it:
First we start by calling the global sub command %s and passing the \v flag to turn on "very magic" mode. This may or may not change the behavior depending on your settings, but this is a public example, so it is included here to ensure that mode is active.
:%s/\v
Then, we find all the contiguous sequences of digits. This will find 02, 03, and so on from your example.
[0-9]+
Then in the replacement portion we have this command, which does the real work:
\=printf("%02d", substitute(submatch(0), '^0\+', '', 0)-1)
The substitute() function determines what the new value is. submatch(0) means to use the entire match. Using a pattern of ^0\+ and a replacement of '' (the empty string) says to strip the leading zeros from any number which has them. The 0 at the end isn't too important; it just says there are no flags to the substitute() function.
The result of the substitute() call is a string, which Vim converts to a number when it is used in arithmetic. Say 02 has been stripped down to 2. Using the -1 at the end, we subtract 1 from that result (decrement).
Finally, this result is passed to the printf function. Using a format string %02d says to print the values as decimal, in 2-digit wide format, padding with leading zeroes.
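The three steps of that expression (strip leading zeros, subtract 1, print zero-padded to two digits) can be checked against the sample file with a short Python model. This is not Vim script, and the name decrement_padded is hypothetical; it only mirrors the logic of the substitution.

```python
import re


def decrement_padded(text: str) -> str:
    """Decrement every run of digits in text, zero-padded to 2 digits."""
    def replace(match):
        digits = match.group(0).lstrip("0")   # like substitute(..., '^0\+', '', 0)
        value = int(digits) if digits else 0  # an empty string counts as 0
        return "%02d" % (value - 1)           # like printf("%02d", ...)
    return re.sub(r"[0-9]+", replace, text)


lines = ["a02", "a03", "a04", "a05", "a06", "a07", "a08", "a09", "a10", "a11"]
print([decrement_padded(line) for line in lines])
```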
If you want Vim to treat all numbers as decimals, you may want to add the following line to your .vimrc:
set nrformats=
