How to 'manipulate' strings in BASIC V2? - basic

I would like to reach the following:
I ask for a number from the user, and then output a string like the following:
-STR$
--STR$
---STR$
----STR$
-----STR$
I tried to do this:
10 INPUT NUM%
20 FOR X=1 TO NUM%: PRINT NUM%*"-" + "TEXT" : NEXT
The code above got me an error: ?TYPE MISMATCH ERROR IN 20
However, I didn't yet figure out how to manipulate the string's beginning to multiply the '-' marks on each loop run

Maybe this:
10 INPUT NUM%
20 FOR I = 1 TO NUM%
30 FOR J = 1 TO I: PRINT "-"; : NEXT
40 PRINT " TEXT"
50 NEXT
There is no multiplication of strings or characters in BASIC V2, as far as I remember from the (good) old times.

I believe some other BASIC dialects (Microsoft's GW-BASIC, for example) have a STRING$() function, although Commodore BASIC V2 itself does not, so this only helps if your interpreter provides it. It takes two parameters: the number of times to repeat the character and the character itself. So...
10 INPUT NUM%
20 FOR X=1 TO NUM%: PRINT STRING$(X, "-") + "TEXT" : NEXT

An alternative:
100 INPUT NM%
110 BR$="----------"
120 PRINT LEFT$(BR$,NM%);
130 PRINT "TEXT"
This eliminates the need for an expensive FOR loop, and should be okay as long as NM% is not greater than the length of BR$.
One other thing to point out is that your variable names are effectively capped at two characters, e.g.:
The length of variable names are optional, but max. 80 chars (logical input line of BASIC). The BASIC interpreter used only the first 2 chars for controlling the using variables. The variables A$ and AA$ are different, but not AB$ and ABC$.
(Source: https://www.c64-wiki.com/wiki/Variable). For that reason I used NM% instead of NUM%; it will prevent issues later.
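The two-character rule quoted above can be mimicked with a short Python sketch. It is only an analogy, not how the interpreter is implemented: effective_name is a hypothetical helper, and it assumes that the type suffix ($ or %) stays significant in addition to the first two characters.

```python
def effective_name(name):
    """Return the part of a variable name that BASIC V2 would distinguish.

    Assumption: the first two characters plus an optional type suffix
    ($ for strings, % for integers) are significant.
    """
    suffix = name[-1] if name[-1] in "$%" else ""
    stem = name[:len(name) - len(suffix)]
    return stem[:2] + suffix


for a, b in [("A$", "AA$"), ("AB$", "ABC$"), ("NUM%", "NUMBER%")]:
    same = effective_name(a) == effective_name(b)
    print(a, b, "same variable" if same else "different variables")
```

Under that assumption, NUM% and NUMBER% would both end up as NU%, which is the kind of collision a two-character name such as NM% avoids.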

Related

Was trying to get it to return the index value of multiple instances of the same character rather than just the first, can't figure out what happened

So, was just trying to see if I could figure out a way to get it to print the index value of not only the first "o" character, but also the second, so the output would clearly have to be 4, 8.
string = "Python for Beginners"
x = "o"
for x in string:
    print (string.index (x))
My reasoning being that for every character equal to "o" in the string, it would give me its specific index count. Instead, it gave me this as an output
0
1
2
3
4
5
6
7
4
9
6
11
12
13
14
5
5
12
9
19
So other than not doing what I thought it would, I've spent most of the past hour trying to figure out the logic that got me that output but alas, being a noob, I couldn't figure it out for the life of me.
I'd be interested in a solution on how to get it to count multiple instances of the same character, but even more interested in understanding exactly what happened to give me that output. Just can't figure it out.
Cheers all
OK, so with a bit of digging around I found the re.finditer function, which first requires us to import re (regular expressions). I actually have no idea what that means, but I figured out enough to use it. Credit to this thread: Finding multiple occurrences of a string within a string in Python
Then just played around a bit but finally got it down to this
import re
string = "Python for Beginners"
search = input ("What are you looking for? ")
msg = f" '{search}' found at index: "
for x in re.finditer (search , string):
    print (msg, x.start())
if string.find (search) == -1:
    print ("Error: sequence not present in string. Note: search is case sensitive")
This is all just for the purposes of educating myself practically a little bit as I go through the learning. Cheers all. Still open to suggestions for "better ways" of achieving the same output.
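As for what happened in the original loop: for x in string rebinds x to each character of the string in turn (so the earlier x = "o" is overwritten), and string.index(x) always returns the position of the first occurrence of that character. That is why every repeated character, including the second "o", prints the index of its first appearance. A minimal sketch of one alternative that avoids regular expressions follows; find_all is a name made up for this example.

```python
def find_all(text, char):
    """Return every index at which char occurs in text."""
    return [i for i, c in enumerate(text) if c == char]


print(find_all("Python for Beginners", "o"))  # [4, 8]
```

enumerate yields each position together with its character, so the comparison is made against the character actually being visited rather than against the first match.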

why is this trim text trailing not working?

IDENTIFICATION DIVISION.
PROGRAM-ID. KATA.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-INPUT PIC A(200).
01 WS-OUT PIC A(200).
01 I PIC 9(08).
01 J PIC 9(08).
01 INP-LEN PIC 9(08).
PROCEDURE DIVISION.
    DISPLAY "INPUT YOUR STRING"
    ACCEPT WS-INPUT
    DISPLAY "REVERSING ......."
    MOVE FUNCTION LENGTH(FUNCTION TRIM(WS-INPUT TRAILING)) TO INP-LEN
    DISPLAY "Just for reference : Your string is "INP-LEN " long"
    MOVE 1 to I.
    PERFORM VARYING J from INP-LEN by -1 UNTIL J =0
        MOVE WS-INPUT(I:1) to WS-OUT(J:1)
        MOVE FUNCTION TRIM(WS-OUT TRAILING) TO WS-OUT
        ADD 1 to I
    END-PERFORM
    MOVE FUNCTION TRIM(WS-OUT TRAILING) TO WS-OUT.
    DISPLAY WS-OUT
    DISPLAY FUNCTION LENGTH(WS-OUT)
    STOP RUN.
Run the program with the input ctrl test.
You will see that the length of WS-INPUT is reported as:
Just for reference : Your string is 00000009 long
But if you do that for output it will say length of string is 200
Also the reversed string I get is :
tset lrtc
That is 200 characters long, not the length I set.
Can someone explain where I went wrong and what I can do to fix it?
(Note: I initially tried FUNCTION REVERSE with a simple
MOVE FUNCTION REVERSE(WS-INPUT) TO WS-OUT
and the same problem was there as well.)
FUNCTION LENGTH (source) returns the length of source; in your case that's WS-OUT, which is PIC A(200), so the answer 200 is correct.
FUNCTION TRIM (source TRAILING), like every function, creates a temporary internal item, in this case a copy of source with the trailing SPACES removed.
Because you MOVE this temporary item of length 9 to a field of length 200, it gets right-padded with spaces.
Only DYNAMIC LENGTH items get a dynamic size by MOVE; all other items always keep their size. [keeping "ODO" out for simplicity...]
You possibly want a nested function call: TRIM + REVERSE / LENGTH:
DISPLAY FUNCTION LENGTH ( FUNCTION TRIM (WS-OUT) )
DISPLAY "-" FUNCTION REVERSE ( FUNCTION TRIM (WS-INPUT TRAILING) ) "-"
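The padding behaviour described above can be imitated in Python. This is only an analogy of a fixed-length field, not COBOL semantics in full; move_to_field and FIELD_WIDTH are names invented for the sketch.

```python
FIELD_WIDTH = 200  # mirrors PIC A(200)


def move_to_field(value, width=FIELD_WIDTH):
    """Mimic MOVE into a fixed-length field: truncate or pad with spaces."""
    return value[:width].ljust(width)


ws_input = move_to_field("ctrl test")
trimmed = ws_input.rstrip()            # like FUNCTION TRIM(... TRAILING)
ws_out = move_to_field(trimmed[::-1])  # reversed, then stored in a 200-char field

print(len(trimmed))          # 9
print(len(ws_out))           # 200
print(len(ws_out.rstrip()))  # 9
```

The trimmed value only has length 9 while it is a temporary; as soon as it is stored back into the fixed-width field, the length is 200 again.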

Python ord() and chr()

I have:
txt = input('What is your sentence? ')
list = [0]*128
for x in txt:
    list[ord(x)] += 1
for x in list:
    if x >= 1:
        print(chr(list.index(x)) * x)
As per my understanding this should just output every letter in a sentence like:
))
111
3333
etc.
For the string "aB)a2a2a2)" the output is correct:
))
222
B
aaaa
For the string "aB)a2a2a2" the output is wrong:
)
222
)
aaaa
I feel like all my bases are covered but I'm not sure what's wrong with this code.
When you do list.index(x), you're searching the list for the first index at which that value appears. That's not actually what you want, though: you want the specific index of the value you just read, even if the same value also occurs somewhere earlier in the list.
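A tiny illustration of that pitfall, using a made-up list of counts (the values are hypothetical and chosen so that one value repeats):

```python
counts = [0, 1, 3, 1]  # hypothetical counts; the value 1 appears twice

# list.index(value) scans from the left and stops at the first match
first_match = [counts.index(value) for value in counts]
print(first_match)  # [0, 1, 2, 1]
```

The last element sits at position 3, but index reports 1 because an equal value already appears there. The same thing happens in the question when ")" and "B" both have a count of 1.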
The best way to get indexes alongside values from a sequence is with enumerate:
for i, x in enumerate(list):
    if x >= 1:
        print(chr(i) * x)
That should get you the output you want, but there are several other things that would make your code easier to read and understand. First of all, using list as a variable name is a very bad idea, as that will shadow the builtin list type's name in your namespace. That makes it very confusing for anyone reading your code, and you even confuse yourself if you want to use the normal list for some purpose and don't remember you've already used it for a variable of your own.
The other issue is also about variable names, but it's a bit more subtle. Your two loops both use a loop variable named x, but the meaning of the value is different each time. The first loop is over the characters in the input string, while the latter loop is over the counts of each character. Using meaningful variables would make things a lot clearer.
Here's a combination of all my suggested fixes together:
text = input('What is your sentence? ')
counts = [0]*128
for character in text:
    counts[ord(character)] += 1
for index, count in enumerate(counts):
    if count >= 1:
        print(chr(index) * count)

Hyphen with strings in PROC FORMAT

I am working with ICD-9 codes and am creating somewhat of a mapping between codes and an integer:
proc format library = &formatlib;
    invalue category other = 0
        '410'-'410.99', '425.4'-'425.99' = 1
I have searched and searched, but haven't been able to find an explanation of how that range actually works when it comes to formatting.
Take the first range, for example. I assume SAS interprets '410'-'410.99' as "take every value in the inclusive range [410, 410.99] and convert it to a 1". Please correct me if I'm wrong in that assumption. Does SAS treat these seeming strings as floating-point decimals, then? I think that must be the case if these are to be numerical ranges for formatting all codes within the range.
I'm coming to SAS from the worlds of R and Python, and thus the way quote characters are used in SAS is sometimes unclear (like when using %let foo = bar... no quotes are used).
When SAS compares string values with normal comparison operators, what it does is compare the byte representation of each character in the string, one at a time, until it reaches a difference.
So what you're going to see here is when a string is input, it will be compared to the 'start' string and, if greater than start, then compared to the 'end' string, and if less than end, evaluated to a 1; if it's not for each pair listed, then evaluated to a zero.
Importantly, this means that some nonsensical results could occur - see the last row of the following test, for example.
proc format;
    invalue category other = 0
        '410'-'410.99', '425.4'-'425.99' = 1
    ;
quit;
data test;
    input #1 testval $6.;
    category=input(testval,category.);
    datalines;
425.23
425.45
425.40
410#
410.00
410.AA
410.7A
;;;;
run;
410.7A is compared to 410 and found greater, as '4'='4', '1'='1', '0'='0', '.' > ' ', so greater. Then 410.7A is compared to 410.99 and found less, as '4'='4', '1'='1', '0'='0', '.'='.', '7' < '9', so less. The A is irrelevant to the comparison. But on the row above it (410.AA) you see it's not in the range, since A is ASCII 41 hex and that is not less than '9' (ASCII 39 hex).
Note that all SAS strings are filled to their full length by spaces. This can be important in string comparisons, because space is the lowest-valued printable character (if you consider space printable). Thus any character you're likely to compare to space will be higher, so for example the fourth row (410#) is a 1 because # is between space and . in the ASCII table! But change that to / and it fails. Similarly, change it to byte(13) (through code) and it fails, because it is then less than space (so 410^M, with ^M representing byte(13), is less than the start, 410). In informats and formats, SAS will treat the format/informat start/end as being whatever length it needs to, so if you're reading a string of length 6, it will treat them as length 6 and fill the rest with spaces.
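The comparison described above can be imitated in Python as a rough sketch. This is an analogy, not SAS: it assumes ASCII input, pads every value to a width of 6 with spaces, and relies on Python comparing strings character by character. in_range and category are names made up for the example.

```python
def in_range(value, start, end, width=6):
    """Compare space-padded strings character by character (inclusive range)."""
    v, s, e = (text.ljust(width) for text in (value, start, end))
    return s <= v <= e


def category(value):
    """Return 1 if value falls in either range from the informat, else 0."""
    ranges = [("410", "410.99"), ("425.4", "425.99")]
    return 1 if any(in_range(value, s, e) for s, e in ranges) else 0


tests = ["425.23", "425.45", "425.40", "410#", "410.00", "410.AA", "410.7A", "410/"]
for test in tests:
    print(repr(test), category(test))
```

Under these assumptions 410# and 410.7A come out as 1 while 410.AA and 410/ come out as 0, which matches the reasoning given for the test rows.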

Fortran read of data with * to signify similar data

My data looks like this
-3442.77 -16749.64 893.08 -3442.77 -16749.64 1487.35 -3231.45 -16622.36 902.29
.....
159*2539.87 10*0.00 162*2539.87 10*0.00
which means I start with either 7 or 8 reals per line and then (towards the end) have 159 values of 2539.87 followed by 10 values of 0 followed by 162 of 2539.87 etc. This seems to be a space-saving method as previous versions of this file format were regular 6 reals per line.
I am already reading the data into a string because of not knowing whether there are 7 or 8 numbers per line. I can therefore easily spot lines that contain *. But what then? I suppose I have to identify the location of each * and then identify the integer number before and real value after before assigning to an array. Am I missing anything?
Read the line. Split it into tokens delimited by whitespace. Replace the * with a space in any token that has one. Then read one or two values from the token, depending on whether there was an asterisk or not. Sample code follows:
REAL, DIMENSION(big) :: data
CHARACTER(LEN=40) :: token
INTEGER :: iptr, count, idx
REAL :: val
iptr = 1
DO WHILE (there_are_tokens_left)
    ... ! Get the next token into "token"
    idx = INDEX(token, "*")
    IF (idx == 0) THEN
        READ(token, *) val
        count = 1
    ELSE
        ! Replace "*" with space and read two values from the string
        token(idx:idx) = " "
        READ(token, *) count, val
    END IF
    data(iptr:iptr+count-1) = val ! Add "val" "count" times to the list of values
    iptr = iptr + count
END DO
Here I have arbitrarily set the length of the token to be 40 characters. Adjust it according to what you expect to find in your input files.
BTW, for the sake of completeness, this method of compressing something by replacing repeating values with value/repetition-count pairs is called run-length encoding (RLE).
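The same token-by-token expansion can be sketched in Python, which may help when checking the Fortran logic against a sample line. expand_line is a name chosen for this illustration, and the sketch assumes tokens are separated by whitespace and contain at most one asterisk.

```python
def expand_line(line):
    """Expand 'count*value' tokens into repeated values; plain tokens pass through."""
    values = []
    for token in line.split():
        if "*" in token:
            count_text, value_text = token.split("*", 1)
            values.extend([float(value_text)] * int(count_text))
        else:
            values.append(float(token))
    return values


expanded = expand_line("159*2539.87 10*0.00 162*2539.87 10*0.00")
print(len(expanded))  # 341
```

Here 341 is 159 + 10 + 162 + 10, the number of values encoded by the last line of the sample data.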
Your input data may have been written in a form suitable for list directed input (where the format specification in the READ statement is simply *). List directed input supports the r*c form that you see, where r is a repeat count and c is the constant to be repeated.
If the total number of input items is known in advance (perhaps it is fixed for that program, perhaps it is defined by earlier entries in the file) then reading the file is as simple as:
REAL :: data(size_of_data)
READ (unit, *) data
For example, for the last line shown in your example on its own, size_of_data would need to be 341, from 159+10+162+10.
With list directed input the data can span across multiple records (multiple lines) - you don't need to know how many items are on each line in advance - just how many appear in the next "block" of data.
List directed input has a few other "features" like this, which is why it is generally not a good idea to use it to parse "arbitrary" input that hasn't been written with it in mind. Use an explicit format specification instead (which may require creating the format specification on the fly to match the width of the input field if that is not known ahead of time).
If you don't know (or cannot calculate) the number of items in advance of the READ statement then you will need to do the parsing of the line yourself.
