How to base64 encode in ascii - rpgle

I need to do some base64 encoding in ASCII mode from an RPGLE program. Below is a stripped-down version of my attempt. It uses the apr_base64_encode_binary procedure in the QSYSDIR/QAXIS10HT service program to perform the encoding. The field it tries to encode (myPlainData) has a value of 'Hello'. This field has a CCSID of 819 (ASCII), and I need the encoded result to be in ASCII as well. But apr_base64_encode_binary keeps returning the encoded result in EBCDIC. Is there a way to get the result in ASCII?
* play variables
D myPlainData s 200 ccsid(819)
D myPlainDataLen...
D s 10I 0
D myBase64Data s 65535A ccsid(819)
D myBase64DataLen...
D s 10I 0
* ibm base 64 encoder
* note: apr_base64_* functions can be found in the QSYSDIR/QAXIS10HT service program
D apr_base64_encode_binary...
D pr 10i 0 extproc('apr_base64_encode_binary')
D piBase64Data...
D 65535A options(*varsize) ccsid(819)
D piPlainData...
D 65535A options(*varsize) const
D piPlainDataLen...
D 10i 0 value
/free
myPlainData = 'Hello'; // myPlainData is a ccsid(819) field (ascii field)
myPlainDataLen = %len(%trimr(myPlainData));
//encode the data
myBase64DataLen = apr_base64_encode_binary(myBase64Data
:myPlainData
:myPlainDataLen);
*inlr = *on;
/end-free

The second parameter of your prototype doesn't have the CCSID keyword, so it defaults to the job CCSID. When you pass the CCSID(819) field for the second parameter, the compiler converts it to the job CCSID.
The reason your workaround is working is that the compiler now thinks that the second parameter is already in the job CCSID, so it doesn't have to convert it.
I think your first program will work correctly if you add CCSID(819) to the second parameter.
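To picture what that implicit conversion does to the encoded result, here is a small Python sketch (not RPGLE). It assumes the job CCSID is 37, which Python exposes as the cp037 codec; your job CCSID may well be different. The variable names are made up for the illustration.

```python
import base64

plain = "Hello"

# CCSID 819 is ISO 8859-1, which Python calls latin-1.
ascii_bytes = plain.encode("latin-1")

# Assumption: the job CCSID is 37 (EBCDIC), i.e. Python's cp037 codec.
ebcdic_bytes = plain.encode("cp037")

# Base64 works on raw bytes, so the two byte strings encode differently.
encoded_from_ascii = base64.b64encode(ascii_bytes)    # b'SGVsbG8='
encoded_from_ebcdic = base64.b64encode(ebcdic_bytes)

print(encoded_from_ascii)
print(encoded_from_ebcdic)
```

If the input is silently converted to EBCDIC before the call, the encoder sees the second set of bytes, which would explain a result that does not decode to ASCII 'Hello'.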

It works if I change my code to the version below. The new code declares a temporary myPlainData2 field, points its basing pointer at the myPlainData field, and passes this temporary field to the encoder.
* play variables
D myPlainData s 200 ccsid(819)
D myPlainDataLen...
D s 10I 0
D myBase64Data s 65535A
D myBase64DataLen...
D s 10I 0
D myPlainData2 s 200 based(myPlainData2_p)
* ibm base 64 encoder
* note: apr_base64_* functions can be found in the QSYSDIR/QAXIS10HT service program
D apr_base64_encode_binary...
D pr 10i 0 extproc('apr_base64_encode_binary')
D piBase64Data...
D 65535A options(*varsize)
D piPlainData...
D 65535a options(*varsize)
D piPlainDataLen...
D 10i 0 value
/free
myPlainData = 'Hello'; // myPlainData is a ccsid(819) field (ascii field)
myPlainDataLen = %len(%trimr(myPlainData));
myPlainData2_p = %addr(myPlainData);
//encode the data
myBase64DataLen = apr_base64_encode_binary(myBase64Data
:myPlainData2
:myPlainDataLen);
*inlr = *on;
/end-free

Ok, finally found something...
From the Apache docs
apr_base64_encode - Encode a text string using base64encoding. On
EBCDIC machines, the input is first converted to ASCII.
apr_base64_encode_binary - Encode an text string using base64encoding.
This is the same as apr_base64_encode() except on EBCDIC machines,
where the conversion of the input to ASCII is left out.
So I agree with Barbara's answer that you should include CCSID(819) on both of the procedure's text parameters.

Related

How to count strings in specified field within each line of one or more csv files

Writing a Python program (ver. 3) to count strings in a specified field within each line of one or more csv files.
Where the csv file contains:
Field1, Field2, Field3, Field4
A, B, C, D
A, E, F, G
Z, E, C, D
Z, W, C, Q
the script is executed, for example:
$ ./script.py 1,2,3,4 file.csv
And the result is:
A 10
C 7
D 2
E 2
Z 2
B 1
Q 1
F 1
G 1
W 1
ERROR
the script is executed, for example:
$ ./script.py 1,2,3,4 file.csv file.csv file.csv
Where the error occurs:
for rowitem in reader:
    for pos in field:
        pos = rowitem[pos] ##<---LINE generating error--->##
        if pos not in fieldcnt:
            fieldcnt[pos] = 1
        else:
            fieldcnt[pos] += 1
TypeError: list indices must be integers or slices, not str
Thank you!
Judging from the output, I'd say that the fields in the csv file do not influence the count of the string. If string uniqueness is case-insensitive, remember to use yourstring.lower() so that matches differing only in case are counted as one. Also keep in mind that if your text is large, the number of unique strings could be very large as well, so some sort of sorting must be in place to make sense of it! (Otherwise it might be a long list of random counts, a large portion of them just 1s.)
An easy way to get a count of unique strings is to use the collections module.
file = open('yourfile.txt', encoding="utf8")
a = file.read()
# if you have some words you'd like to exclude
stopwords = set(line.strip() for line in open('stopwords.txt'))
stopwords = stopwords.union(set(['<media','omitted>','it\'s','two','said']))
# make an empty key-value dict to contain matched words and their counts
wordcount = {}
for word in a.lower().split(): # use the delimiter you want (a comma I think?)
    # replace punctuation so it isn't counted as part of a word
    word = word.replace(".","")
    word = word.replace(",","")
    word = word.replace("\"","")
    word = word.replace("!","")
    if word not in stopwords:
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1
That should do it. The wordcount dict should contain each word and its frequency. After that, just sort it using collections and print it out.
import collections

word_counter = collections.Counter(wordcount)
for word, count in word_counter.most_common(20):
    print(word, ": ", count)
I hope this solves your problem. Lemme know if you face problems.
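As for the TypeError in the question: command-line arguments arrive as strings, so the field positions have to be converted to integers before they can index a row. Below is a minimal sketch of that idea. It assumes the positions are 1-based, that every file starts with a header row to skip, and that values may carry a space after the comma; count_fields is a made-up helper name, not something from the original script.

```python
import csv
import io
from collections import Counter


def count_fields(field_spec, files):
    """Count values found at the given 1-based field positions.

    field_spec is a string such as "1,2,3,4"; files is an iterable of
    open text files (or any iterable of lines).
    """
    # Convert "1,2,3,4" to zero-based integer indexes [0, 1, 2, 3].
    positions = [int(p) - 1 for p in field_spec.split(",")]
    counts = Counter()
    for handle in files:
        reader = csv.reader(handle, skipinitialspace=True)
        next(reader, None)  # assumption: first row is a header
        for row in reader:
            for pos in positions:
                if 0 <= pos < len(row):
                    counts[row[pos].strip()] += 1
    return counts


sample = "Field1, Field2, Field3, Field4\nA, B, C, D\nA, E, F, G\nZ, E, C, D\nZ, W, C, Q\n"
counts = count_fields("1,2,3,4", [io.StringIO(sample)])
for value, count in counts.most_common():
    print(value, count)
```

In a real script the arguments would come from sys.argv, for example the field list from sys.argv[1] and the file names from sys.argv[2:].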

Issue with ASCii in Python3

I am trying to convert a varchar string to its ASCII codes. Then I'm trying to make it so any code that's not 3 digits has a 0 in front of it. Then I'm trying to add a 1 to the very beginning of the string and turn the result into one large number that I can apply math to.
I've tried a lot of different coding techniques. The closest I've gotten is below:
s = 'Ak'
for c in s:
    mgk = (''.join(str(ord(c)) for c in s))
    num = [mgk]
var = 1
num.insert(0, var)
mgc = lambda num: int(''.join(str(i) for i in num))
num = mgc(num)
print(num)
With this code I get the output: 165107
It's almost doing exactly what I need to do but it's taking out the 0 from the ord(A) which is 65. I want it to be 165. everything else seems to be working great. I'm using '%03d'% to insert the 0.
How I want it to work is:
Get the ord() value of each character in a string of numbers and letters.
If the ord() value is less than 100 (ex: A = 65), add a 0 to make it a 3 digit number.
Take the ord() values and combine them into 1 number. The 0 needs to stay in front of 65. Then add a one to the front of the list, so basically the output will look like:
1065107
I want to make sure I can take that number and apply math to it.
I have this code too:
s = 'Ak'
for c in s:
    s = ord(c)
    s = '%03d'%s
    mgk = (''.join(str(s)))
    s = [mgk]
    var = 1
    s.insert(0, var)
    mgc = lambda s: int(''.join(str(i) for i in s))
    s = mgc(s)
    print(s)
but then it counts each letter as its own element and it will not combine them and I only want the one in front of the very first number.
When the number is converted to an integer, it
Is this what you want? I am kinda confused:
a = 'Ak'
result = '1' + ''.join(str(f'{ord(char):03d}') for char in a)
print(result) # 1065107
# to make it a number just do:
my_int = int(result)
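To show why the leading 1 matters once the string becomes an integer, here is a small round-trip sketch. The names encode_text and decode_number are made up for illustration, and it assumes every character has a code below 1000 so that three digits are always enough.

```python
def encode_text(text):
    # Zero-pad each character code to three digits, then prefix a guard "1"
    # so that int() cannot drop the leading zero of the first code.
    return int("1" + "".join(f"{ord(ch):03d}" for ch in text))


def decode_number(number):
    # Drop the guard digit, then read the rest back three digits at a time.
    digits = str(number)[1:]
    return "".join(chr(int(digits[i:i + 3])) for i in range(0, len(digits), 3))


number = encode_text("Ak")
print(number)                 # 1065107
print(number + 1)             # ordinary integer math works
print(decode_number(number))  # Ak
```

Without the guard digit, int('065107') would give 65107 and the padding of the first character would be lost.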

How to convert hex encoded bytes to String in Python3?

I read some value from Windows Registry (SAM) with Python3. As far as I can tell it looks like hex encoded bytes:
>>> b = b'A\x00d\x00m\x00i\x00n\x00i\x00s\x00t\x00r\x00a\x00t\x00o\x00r\x00'
>>> print(b)
A d m i n i s t r a t o r
Now how would I convert that to a String (should be "Administrator")? Using "print" just gives me "A d m i n i s t r a t o r". How to do the conversion correctly without using dirty tricks?
b = b'A\x00d\x00m\x00i\x00n\x00i\x00s\x00t\x00r\x00a\x00t\x00o\x00r\x00'
b = b.replace(b'\x00', b'')
print(b)
# b'Administrator'
I probably should have used utf-16 decoding:
>>> b = b'A\x00d\x00m\x00i\x00n\x00i\x00s\x00t\x00r\x00a\x00t\x00o\x00r\x00'
>>> print(b.decode('utf-16'))
Administrator
SORRY!
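A slightly more explicit variant, as a sketch: the bytes shown have the low byte first and no byte order mark, so naming the little-endian codec avoids depending on how plain 'utf-16' guesses the byte order. The rstrip call is an assumption on my part that a value read from the registry might carry a trailing NUL terminator; it is harmless if there is none.

```python
raw = b'A\x00d\x00m\x00i\x00n\x00i\x00s\x00t\x00r\x00a\x00t\x00o\x00r\x00'

# Decode as UTF-16 little-endian: each character is two bytes, low byte first.
text = raw.decode('utf-16-le')

# Assumption: strip a possible trailing NUL terminator.
text = text.rstrip('\x00')

print(text)  # Administrator
```

Unlike replacing b'\x00' with nothing, this also handles characters outside ASCII, whose high byte is not zero.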

String formatting in Python

How do I format the following numbers, which are in a vector?
For instance, the numbers I have are:
23.02567
0.025679
and I would like to format to this:
0.230256700+E02
0.025679000+E00
First, note that this is not the proper way to format numbers in scientific- or engineering-notation. Those numbers should always have exactly one digit in front of the decimal point, unless the exponent is required to be a multiple of 3 (i.e. a power of 1000, corresponding to one of the SI prefixes). If, however, you have to use this format, you could write your own format string for that.
>>> x, e = 23.02567, 2
>>> "%f%sE%02d" % (x/10**e, "+" if e >= 0 else "-", abs(e))
'0.230257+E02'
>>> x, e = 0.025679, -1
>>> "%f%sE%02d" % (x/10**e, "+" if e >= 0 else "-", abs(e))
'0.256790-E01'
This is assuming that the exponent, e, is given. If the exponent does not matter, you could also use the proper %E format and just replace E+ with +E:
>>> ("%E" % x).replace("E+", "+E").replace("E-", "-E")
'2.567900-E02'
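If the exponent is not given, it can be derived from the number. The sketch below is only a guess at the rule behind the two examples in the question: it takes the exponent to be the number of digits before the decimal point (never below zero) and prints nine decimals. The function name format_leading_zero is made up, and values that round up to 1.000000000 are not handled.

```python
import math


def format_leading_zero(x, decimals=9):
    # Assumed rule: exponent = number of integer digits, but never negative.
    if x == 0:
        exponent = 0
    else:
        exponent = max(0, math.floor(math.log10(abs(x))) + 1)
    mantissa = x / 10 ** exponent
    # The exponent is never negative under this rule, so the sign is always "+".
    return f"{mantissa:.{decimals}f}+E{exponent:02d}"


for value in [23.02567, 0.025679]:
    print(format_leading_zero(value))
```

For the two numbers in the question this reproduces the requested strings; whether the same rule is wanted for other magnitudes is something only the asker can confirm.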

How to compute word scores in Scrabble using MATLAB

I have a homework program I have run into a problem with. We basically have to take a word (such as MATLAB) and have the function give us the correct score value for it using the rules of Scrabble. There are other things involved such as double word and double point values, but what I'm struggling with is converting to ASCII. I need to get my string into ASCII form and then sum up those values. We only know the bare basics of strings and our teacher is pretty useless. I've tried converting the string into numbers, but that's not exactly working out. Any suggestions?
function[score] = scrabble(word, letterPoints)
doubleword = '#';
doubleletter = '!';
doublew = [findstr(word, doubleword)]
trouble = [findstr(word, doubleletter)]
word = char(word)
gameplay = word;
ASCII = double(gameplay)
score = lower(sum(ASCII));
Building on Francis's post, what I would recommend you do is create a lookup array. You can certainly convert each character into its ASCII equivalent, but then what I would do is have an array where the input is the ASCII code of the character you want (with a bit of modification), and the output will be the point value of the character. Once you find this, you can sum over the points to get your final point score.
I'm going to leave out double points, double letters, blank tiles and that whole gamut of fun stuff in Scrabble for now in order to get what you want working. By consulting Wikipedia, this is the point distribution for each letter encountered in Scrabble.
1 point: A, E, I, O, N, R, T, L, S, U
2 points: D, G
3 points: B, C, M, P
4 points: F, H, V, W, Y
5 points: K
8 points: J, X
10 points: Q, Z
What we're going to do is convert your word into lower case to ensure consistency. Now, if you take a look at the letter a, this corresponds to ASCII code 97. You can verify that by using the double function we talked about earlier:
>> double('a')
97
As there are 26 letters in the alphabet, this means that going from a to z should go from 97 to 122. Because MATLAB starts indexing arrays at 1, what we can do is subtract 96 from each of our characters so that we'll be able to figure out the numerical position of these characters from 1 to 26.
Let's start by building our lookup table. First, I'm going to define a whole bunch of strings. Each string denotes the letters that are associated with each point in Scrabble:
string1point = 'aeionrtlsu';
string2point = 'dg';
string3point = 'bcmp';
string4point = 'fhvwy';
string5point = 'k';
string8point = 'jx';
string10point = 'qz';
Now, we can use each of the strings, convert to double, subtract by 96 then assign each of the corresponding locations to the points for each letter. Let's create our lookup table like so:
lookup = zeros(1,26);
lookup(double(string1point) - 96) = 1;
lookup(double(string2point) - 96) = 2;
lookup(double(string3point) - 96) = 3;
lookup(double(string4point) - 96) = 4;
lookup(double(string5point) - 96) = 5;
lookup(double(string8point) - 96) = 8;
lookup(double(string10point) - 96) = 10;
I first create an array of length 26 through the zeros function. I then figure out where each letter goes and assign to each letter their point values.
Now, the last thing you need to do is take a string, take the lower case to be sure, then convert each character into its ASCII equivalent, subtract by 96, then sum up the values. If we are given... say... MATLAB:
stringToConvert = 'MATLAB';
stringToConvert = lower(stringToConvert);
ASCII = double(stringToConvert) - 96;
value = sum(lookup(ASCII));
Lo and behold... we get:
value =
10
The last line of the above code is crucial. Basically, ASCII will contain a bunch of indexing locations where each number corresponds to the numerical position of where the letter occurs in the alphabet. We use these positions to look up what point / score each letter gives us, and we sum over all of these values.
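For comparison, the same lookup-table idea can be sketched in Python. This is a transliteration for illustration, not MATLAB: Python lists are indexed from 0, so the offset is the code of 'a' (97) rather than 96. The names POINT_GROUPS and scrabble_score are made up, and skipping non-letter characters such as '#' and '!' is my assumption about how markers should be treated at this stage.

```python
# Letters grouped by their point value, as listed above.
POINT_GROUPS = {
    1: "aeionrtlsu",
    2: "dg",
    3: "bcmp",
    4: "fhvwy",
    5: "k",
    8: "jx",
    10: "qz",
}

# Build a 26-entry table: index 0 is 'a', index 25 is 'z'.
lookup = [0] * 26
for points, letters in POINT_GROUPS.items():
    for letter in letters:
        lookup[ord(letter) - ord("a")] = points


def scrabble_score(word):
    # Lower-case the word, ignore anything that is not a-z, and sum the points.
    return sum(
        lookup[ord(ch) - ord("a")]
        for ch in word.lower()
        if "a" <= ch <= "z"
    )


print(scrabble_score("MATLAB"))  # 10
```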
Part #2
The next part where double point values and double words come to play can be found in my other StackOverflow post here:
Calculate Scrabble word scores for double letters and double words MATLAB
Convert from string to ASCII:
>> myString = 'hello, world';
>> ASCII = double(myString)
ASCII =
104 101 108 108 111 44 32 119 111 114 108 100
Sum up the values:
>> total = sum(ASCII)
total =
1160
The MATLAB help for char() says (emphasis added):
S = char(X) converts array X of nonnegative integer codes into a character array. Valid codes range from 0 to 65535, where codes 0 through 127 correspond to 7-bit ASCII characters. The characters that MATLAB® can process (other than 7-bit ASCII characters) depend upon your current locale setting. To convert characters into a numeric array, use the double function.
ASCII chart here.

Resources