Lua: string of specific length

local data = "here is a string"
local no = 12
foo = string.format("%50s %05d",data,no)
print(foo:len(),string.format("%q",foo))
defines foo as a string of specific length:
"                                  here is a string 00012"
However, is there an easy way to get
"here is a string                                   00012"
I know that I can pad the string data with spaces:
while data:len() < 50 do data = data.." " end

Add a minus sign to the format specifier (%-50s) to left-align the text:
foo = string.format("%-50s %05d","here is a string", 12)
print(foo:len(), foo)
Output:
56      here is a string                                   00012
Allowed flags:
- : left align result inside field
+ : always prefix with a sign, using + if field positive
0 : left-fill with zeroes rather than spaces
(space) : If positive, put a space where the + would have been
# : Changes the behaviour of various formats, as follows:
    For octal conversion (o), prefixes the number with 0, if necessary.
    For hex conversion (x), prefixes the number with 0x.
    For hex conversion (X), prefixes the number with 0X.
    For e, E and f formats, always show the decimal point.
    For g and G formats, always show the decimal point, and do not truncate trailing zeroes.
The option to 'always show the decimal point' would only apply if you had the precision set to 0.
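For comparison, Python's printf-style % operator accepts the same flag characters, so the left-alignment fix can be checked there too. This is a minimal sketch in Python rather than Lua, and the variable names right and left are only illustrative:

```python
data = "here is a string"
no = 12

# Default: the text is right-aligned inside the 50-character field.
right = "%50s %05d" % (data, no)

# '-' flag: the text is left-aligned and padded on the right.
left = "%-50s %05d" % (data, no)

print(len(left), repr(left))

# '+' forces a sign; '#' adds a base prefix for hex and octal.
print("%+d" % 5, "%#x" % 255, "%#o" % 8)
```

Both strings are 56 characters long (50 + 1 + 5). Note that Python 3 writes the octal prefix as 0o, not the bare 0 that C's printf uses.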

Related

Differentiating widths of a string in Presto (zero-width characters)

I'd like to compare string lengths in bytes of a field, because the field could have some zero-width characters in it.
The Binary functions seem most relevant, but it seems they all expect length-one input and I can't figure out any way to split the string out into individual rows.
e.g.
select from_hex('abc')
gives error
invalid input length 3
And I can't "split on nothing" to convert this to 3 rows (a, b, c separately):
select split('abc', '')
The delimiter may not be the empty string
In R, I'm accustomed to the nchar function where I can specify type = 'bytes' (count by bytes), 'chars' (matches Presto's length from what I've seen), or 'width' (zero-width characters don't count -- the visible width of the string), and where strsplit('abc', NULL) or strsplit('abc', '') give list(c('a', 'b', 'c')).
I'm not sure I can copy-paste a string with a zero-width character here so here's R code to create one:
rawToChar(as.raw(c(0x68, 0x65, 0x6c, 0x6c, 0x6F, 0xe2, 0x80, 0x8d)))
with Presto length output:
select length('hello‍')
In R, I can get three different widths of this string:
sapply(c('width', 'chars', 'bytes'), nchar, x = 'hello‍')
# width chars bytes
# 5 6 8
Is there any way to replicate this in Presto?
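To make the three measures concrete outside R, here is a rough Python sketch (not Presto). The helper string_widths is hypothetical, and its width count is only an approximation: it skips characters in the Unicode categories Cf, Mn and Me, and it ignores double-width East Asian characters, which R's type = 'width' does account for.

```python
import unicodedata

# Categories treated as zero-width in this sketch (an assumption, not a full rule):
# Cf = format characters (e.g. zero-width joiner), Mn/Me = combining marks.
ZERO_WIDTH_CATEGORIES = ("Cf", "Mn", "Me")

def string_widths(s):
    """Return byte, character and approximate display-width counts for s."""
    n_bytes = len(s.encode("utf-8"))   # UTF-8 encoded length
    n_chars = len(s)                   # number of code points
    n_width = sum(
        1 for ch in s if unicodedata.category(ch) not in ZERO_WIDTH_CATEGORIES
    )
    return {"width": n_width, "chars": n_chars, "bytes": n_bytes}

# 'hello' followed by U+200D (zero-width joiner), as in the R example.
print(string_widths("hello\u200d"))
```

For this string the sketch gives the same 5 / 6 / 8 split as the R output above.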

Padding with zeros in the middle of a string?

Padding a number with leading zeros has been answered here. But in my case I have a string character followed by digits. I want to add leading zeros after the string character, but before the digits, keeping the total length to 4. For example:
A1 -> A001
A12 -> A012
A123 -> A123
I have the following code that gets me what I want, but is there a shorter way to do this without using re to split my string into text and numbers first?
import re
mystr = 'A4'
elements = re.match(r"([a-z]+)([0-9]+)", mystr, re.I)
first, second = elements.groups()
print(first + '{:0>3}'.format(second))
output = A004
You could use the following to avoid using re:
def pad_center(s):
    zeros = '0' * (4 - len(s))
    first, *the_rest = list(s)
    return first + zeros + ''.join(the_rest)

print(pad_center('A1'))
print(pad_center('A12'))
print(pad_center('A123'))
Or, if you want to use format() you could try this:
def pad_center(s):
    zeros = '0' * (4 - len(s))
    first, *the_rest = list(s)
    return '{}{}{}'.format(first, zeros, ''.join(the_rest))
However, I am not aware of any way to add padding to the center of a string with the format string syntax without prior processing.
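A shorter variant, still with a small amount of slicing, is to zero-pad only the part after the first character. This is a sketch that assumes exactly one leading letter; pad_middle and pad_middle_fmt are illustrative names, not from the question.

```python
def pad_middle(s, total=4):
    """Keep the first character and zero-pad the rest to `total` characters overall."""
    return s[0] + s[1:].zfill(total - 1)

def pad_middle_fmt(s):
    """Same idea using the format mini-language: right-align the digits, fill with '0'."""
    return '{}{:0>3}'.format(s[0], s[1:])

for code in ('A1', 'A12', 'A123'):
    print(code, '->', pad_middle(code), pad_middle_fmt(code))
```

If the prefix can be longer than one letter, the split point has to be found first, which brings back something like the re approach from the question.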

How to compute word scores in Scrabble using MATLAB

I have a homework program I have run into a problem with. We basically have to take a word (such as MATLAB) and have the function give us the correct score value for it using the rules of Scrabble. There are other things involved such as double word and double point values, but what I'm struggling with is converting to ASCII. I need to get my string into ASCII form and then sum up those values. We only know the bare basics of strings and our teacher is pretty useless. I've tried converting the string into numbers, but that's not exactly working out. Any suggestions?
function[score] = scrabble(word, letterPoints)
doubleword = '#';
doubleletter = '!';
doublew = [findstr(word, doubleword)]
trouble = [findstr(word, doubleletter)]
word = char(word)
gameplay = word;
ASCII = double(gameplay)
score = lower(sum(ASCII));
Building on Francis's post, what I would recommend you do is create a lookup array. You can certainly convert each character into its ASCII equivalent, but then what I would do is have an array where the input is the ASCII code of the character you want (with a bit of modification), and the output will be the point value of the character. Once you find this, you can sum over the points to get your final point score.
I'm going to leave out double points, double letters, blank tiles and that whole gamut of fun stuff in Scrabble for now in order to get what you want working. By consulting Wikipedia, this is the point distribution for each letter encountered in Scrabble.
1 point: A, E, I, O, N, R, T, L, S, U
2 points: D, G
3 points: B, C, M, P
4 points: F, H, V, W, Y
5 points: K
8 points: J, X
10 points: Q, Z
What we're going to do is convert your word into lower case to ensure consistency. Now, if you take a look at the letter a, this corresponds to ASCII code 97. You can verify that by using the double function we talked about earlier:
>> double('a')
97
As there are 26 letters in the alphabet, this means that going from a to z should go from 97 to 122. Because MATLAB starts indexing arrays at 1, what we can do is subtract each of our characters by 96 so that we'll be able to figure out the numerical position of these characters from 1 to 26.
Let's start by building our lookup table. First, I'm going to define a whole bunch of strings. Each string denotes the letters that are associated with each point in Scrabble:
string1point = 'aeionrtlsu';
string2point = 'dg';
string3point = 'bcmp';
string4point = 'fhvwy';
string5point = 'k';
string8point = 'jx';
string10point = 'qz';
Now, we can use each of the strings, convert to double, subtract by 96 then assign each of the corresponding locations to the points for each letter. Let's create our lookup table like so:
lookup = zeros(1,26);
lookup(double(string1point) - 96) = 1;
lookup(double(string2point) - 96) = 2;
lookup(double(string3point) - 96) = 3;
lookup(double(string4point) - 96) = 4;
lookup(double(string5point) - 96) = 5;
lookup(double(string8point) - 96) = 8;
lookup(double(string10point) - 96) = 10;
I first create an array of length 26 through the zeros function. I then figure out where each letter goes and assign to each letter their point values.
Now, the last thing you need to do is take a string, take the lower case to be sure, then convert each character into its ASCII equivalent, subtract by 96, then sum up the values. If we are given... say... MATLAB:
stringToConvert = 'MATLAB';
stringToConvert = lower(stringToConvert);
ASCII = double(stringToConvert) - 96;
value = sum(lookup(ASCII));
Lo and behold... we get:
value =
10
The last line of the above code is crucial. Basically, ASCII will contain a bunch of indexing locations where each number corresponds to the numerical position of where the letter occurs in the alphabet. We use these positions to look up what point / score each letter gives us, and we sum over all of these values.
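For readers who want to check the lookup-table idea outside MATLAB, the same steps can be sketched in Python. This is an illustrative translation, not part of the original answer. It assumes the word contains only the letters a to z, and because Python lists are 0-indexed it subtracts ord('a') (97) instead of 96.

```python
# Letters grouped by point value, as listed above.
POINT_GROUPS = {
    1: "aeionrtlsu",
    2: "dg",
    3: "bcmp",
    4: "fhvwy",
    5: "k",
    8: "jx",
    10: "qz",
}

# lookup[0] holds the points for 'a', lookup[25] the points for 'z'.
lookup = [0] * 26
for points, letters in POINT_GROUPS.items():
    for ch in letters:
        lookup[ord(ch) - ord("a")] = points

def word_score(word):
    """Sum the letter points of a word (letters a-z only, any case)."""
    return sum(lookup[ord(ch) - ord("a")] for ch in word.lower())

print(word_score("MATLAB"))
```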
Part #2
The next part, where double point values and double words come into play, can be found in my other Stack Overflow post here:
Calculate Scrabble word scores for double letters and double words MATLAB
Convert from string to ASCII:
>> myString = 'hello, world';
>> ASCII = double(myString)
ASCII =
104 101 108 108 111 44 32 119 111 114 108 100
Sum up the values:
>> total = sum(ASCII)
total =
1160
The MATLAB help for char() says (emphasis added):
S = char(X) converts array X of nonnegative integer codes into a character array. Valid codes range from 0 to 65535, where codes 0 through 127 correspond to 7-bit ASCII characters. The characters that MATLAB® can process (other than 7-bit ASCII characters) depend upon your current locale setting. To convert characters into a numeric array, use the double function.
ASCII chart here.
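The same conversion and sum can be cross-checked in Python, where ord plays the role of MATLAB's double for a single character. A minimal sketch, with variable names chosen to mirror the MATLAB session:

```python
my_string = "hello, world"

# Character codes, one per character (equivalent to double(myString) in MATLAB).
codes = [ord(ch) for ch in my_string]
print(codes)

# Sum of the codes.
total = sum(codes)
print(total)
```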

How to read numeric data from a string in Fortran

I have a character string array in Fortran as ' results: CI- Energies --- th= 89 ph=120'. How do I extract the characters '120' from the string and store into a real variable?
The string is written in the file 'input.DAT'. I have written the Fortran code as:
implicit real*8(a-h,o-z)
character(39) line
open(1,file='input.DAT',status='old')
read(1,'(A)') line,phi
write(*,'(A)') line
write(*,*)phi
end
Upon execution it shows:
At line 5 of file string.f (unit = 1, file = 'input.dat')
Fortran runtime error: End of file
I have given 39 as the length of the character variable, as there are 39 characters, including spaces, in the string up to '120'.
Assuming that the real number you want to read appears after the last equal sign in the string, you can use the SCAN intrinsic function to find that location and then READ the number from the rest of the string, as shown in the following program.
program xreadnum
implicit none
integer :: ipos
integer, parameter :: nlen = 100
character (len=nlen) :: str
real :: xx
str = "results: CI- Energies --- th= 89 ph=120"
ipos = scan(str, "=", back=.true.)
print*, "reading real variable from '" // trim(str(1+ipos:)) // "'"
read (str(1+ipos:),*) xx
print*, "xx = ", xx
end program xreadnum
! gfortran output:
! reading real variable from '120'
! xx = 120.000000
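The same find-the-last-separator idea can be sketched in Python for comparison: rfind stands in for SCAN with back=.true., and float stands in for the list-directed READ. The function name number_after_last_equals is hypothetical.

```python
def number_after_last_equals(line):
    """Parse the number that follows the last '=' in line."""
    ipos = line.rfind("=")          # index of the last '=', or -1 if absent
    if ipos == -1:
        raise ValueError("no '=' found in line")
    return float(line[ipos + 1:])   # float() tolerates surrounding blanks

line = "results: CI- Energies --- th= 89 ph=120"
print(number_after_last_equals(line))
```

Like the Fortran version, this assumes that nothing but the number (and blanks) follows the last equal sign.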
To convert string s into a real type variable r:
READ(s, "(Fw.d)") r
Here w is the total field width and d is the number of digits after the decimal point. If there is no decimal point in the input string, values of w and d might affect the result, e.g.
s = '120'
READ(s, "(F3.0)") r ! r <-- 120.0
READ(s, "(F3.1)") r ! r <-- 12.0
The answer to the other part of the question (how to extract the substring that holds the number to convert) depends strongly on the format of the input strings. For example, if all the strings consist of fixed-width fields, you can skip the irrelevant part of the string:
s = 'a=120'
READ(s(3:), "(F3.0)") r
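The effect of d on input without a decimal point can be modelled in a few lines of Python. This is a simplified sketch of the Fw.d input rule only: the helper read_f is hypothetical, takes the field text that has already been cut to width w, and ignores exponent forms and Fortran's blank-handling modes.

```python
def read_f(field, d):
    """Interpret a numeric field the way Fw.d input does (simplified model)."""
    text = field.strip()
    if "." in text:
        # An explicit decimal point in the input overrides d.
        return float(text)
    # No decimal point: the rightmost d digits become the fractional part.
    return int(text) / 10 ** d

print(read_f("120", 0))   # like READ(s, "(F3.0)")
print(read_f("120", 1))   # like READ(s, "(F3.1)")
```

The two calls reproduce the 120.0 and 12.0 results shown for F3.0 and F3.1 above.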

How does a compiler convert an integer to a string and vice versa?

Many languages have functions for converting string to integer and vice versa. So what happens there? What algorithm is being executed during conversion?
I'm not asking about a specific language because I think it should be similar in all of them.
To convert a string to an integer, take each character in turn and if it's in the range '0' through '9', convert it to its decimal equivalent. Usually that's simply subtracting the character value of '0'. Now multiply any previous result by 10 and add the new value. Repeat until there are no digits left. If there was a leading '-' minus sign, negate the result.
To convert an integer to a string, start by negating the number if it is negative. Divide the integer by 10 and save the remainder. Convert the remainder to a character by adding the character value of '0'. Push this to the beginning of the string; now repeat with the quotient that you obtained from the division. Repeat until the quotient is zero. Put out a leading '-' minus sign if the number started out negative.
Here are concrete implementations in Python, which in my opinion is the language closest to pseudo-code.
def string_to_int(s):
    i = 0
    sign = 1
    if s[0] == '-':
        sign = -1
        s = s[1:]
    for c in s:
        if not ('0' <= c <= '9'):
            raise ValueError
        i = 10 * i + ord(c) - ord('0')
    return sign * i

def int_to_string(i):
    s = ''
    sign = ''
    if i < 0:
        sign = '-'
        i = -i
    while True:
        remainder = i % 10
        i = i // 10  # integer division; a plain / yields a float in Python 3
        s = chr(ord('0') + remainder) + s
        if i == 0:
            break
    return sign + s
I wouldn't call it an algorithm per se, but depending on the language it will involve the conversion of characters into their integral equivalent. Many languages will either stop on the first character that cannot be represented as an integer (e.g. the letter a), will blindly convert all characters into their ASCII value (e.g. the letter a becomes 97), or will ignore characters that cannot be represented as integers and only convert the ones that can - or return 0 / empty. You have to get more specific on the framework/language to provide more information.
String to integer:
Many (most) languages represent strings, on some level or another, as an array (or list) of characters, which are themselves small integers. Map the ones corresponding to digit characters to their numeric value. For example, '0' in ASCII is represented by 48. So you map 48 to 0, 49 to 1, and so on up to 57, which maps to 9.
Starting from the left, you multiply your current total by 10, add the next character's value, and move on. (You can make a larger or smaller map, change the number you multiply by at each step, and convert strings of any base you like.)
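The remark about larger maps and other bases can be sketched as follows. This is an illustrative generalisation, not code from the answers: the DIGITS map and the base parameter are assumptions, covering bases 2 through 36 with '0'-'9' followed by 'a'-'z'.

```python
import string

# Character-to-value map: position in this string is the digit's value.
DIGITS = string.digits + string.ascii_lowercase

def string_to_int(s, base=10):
    """Convert a string to an integer in the given base (2..36)."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("unsupported base")
    sign = 1
    if s.startswith("-"):
        sign = -1
        s = s[1:]
    if not s:
        raise ValueError("no digits")
    total = 0
    for c in s.lower():
        value = DIGITS.find(c)
        if value == -1 or value >= base:
            raise ValueError("invalid digit: " + c)
        total = total * base + value   # shift the previous digits, add the new one
    return sign * total

print(string_to_int("123"), string_to_int("-ff", 16), string_to_int("101", 2))
```

Only the map and the multiplier change between bases; the loop itself is the same as in the base-10 case.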
Integer to string is a longer process, because it involves converting the binary value to base 10. Since most integers have a limited number of bits (usually 32 or 64), the resulting string has a known maximum length: 20 characters are enough for a 64-bit value (20 digits unsigned, or 19 digits plus a sign). So you can set up your own decimal adder and, for each set bit, add that bit's value (2^place) to the running total.
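One possible reading of the adder idea, sketched in Python under stated assumptions: the value is an unsigned integer of at most 64 bits, decimal digits are kept in a fixed 20-slot list (least significant first), and the names add_into and unsigned_to_string are hypothetical. Repeated division by 10, as in the other answer, is the more common approach; this only illustrates the bit-by-bit variant.

```python
MAX_DIGITS = 20  # 2**64 - 1 has 20 decimal digits

def add_into(acc, addend):
    """Add the decimal digit list addend into acc in place, with carry."""
    carry = 0
    for i in range(MAX_DIGITS):
        carry, acc[i] = divmod(acc[i] + addend[i] + carry, 10)

def unsigned_to_string(n, bits=64):
    """Build the decimal string of n by summing 2**place for every set bit."""
    if bits > 64 or not 0 <= n < 2 ** bits:
        raise ValueError("n must fit in an unsigned integer of at most 64 bits")
    digits = [0] * MAX_DIGITS               # running decimal total
    power = [1] + [0] * (MAX_DIGITS - 1)    # decimal digits of 2**place
    for place in range(bits):
        if (n >> place) & 1:
            add_into(digits, power)
        add_into(power, list(power))        # double the power for the next bit
    text = "".join(chr(ord("0") + d) for d in reversed(digits)).lstrip("0")
    return text or "0"

print(unsigned_to_string(12345))
print(unsigned_to_string(2 ** 64 - 1))
```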
