In some programming languages, such as C, the end of a string is marked by a separate null terminator character.
How do I determine whether the current character is the end of the string?
Currently I use several string function calls, but I guess it could be done more simply.
*the string's end
IF ISBLANK(SUBSTR(str, pos, 1)) == .T. AND CHR(32) != SUBSTR(str, pos, 1)
RETURN .T.
ENDIF
There's no need to worry about C-style string termination in VFP.
Assuming you don't care what the last character is then from your example:
return (pos = len(str))
If you want to ignore spaces:
return (pos = len(alltrim(str)))
VFP strings are not ASCIIZ strings as in C. A VFP string can contain any ASCII character, including character 0, chr(0), which is the string termination character in C-style languages.
Normally the end of the string in VFP is the same as its length. But, although it is not clear from your question, sometimes you get a string from a C source (e.g. a Win32 call) where multiple string values are separated by chr(0) values. You can easily parse such a string into multiple strings with a function like alines(), e.g.:
? ALines(laLines, "hello"+Chr(0)+"there",1+4,Chr(0)) && prints 2
display memory like laLines && shows two string values
You could also use string functions such as at(), occurs(), asc() ... to locate or count that character.
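For readers coming from other languages, the same idea (a length-counted string that happens to contain chr(0) separators) can be sketched in Python. This is only an illustration, not VFP code: the buffer contents and the helper name split_nul_separated are made up for the example, and dropping empty entries is a choice made for the sketch.

```python
def split_nul_separated(buffer):
    """Split a buffer of NUL-separated values, dropping empty entries."""
    # Empty entries appear when the buffer ends with one or more NULs.
    return [part for part in buffer.split("\0") if part]


# Hypothetical buffer, shaped like what some C APIs hand back.
buffer = "hello\0there\0\0"

# The NUL characters are ordinary characters and count toward the length.
print(len(buffer))                  # 13
print(split_nul_separated(buffer))  # ['hello', 'there']

# Locating and counting the separator, as at() and occurs() would.
print(buffer.find("\0"), buffer.count("\0"))  # 5 3
```

The point is the same as in VFP: the end of the string is defined by its length, and chr(0) is just data.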
Related
I have a string that includes all the characters which should be
deleted in a given string. With a nested loop I can iterate through
both strings. But is there a shorter way?
local ignore = "'`'"
function ignoreLetters( c )
local new = ""
for cOrig in string.gmatch(c,".") do
local addChar = 1
for cIgnore in string.gmatch(ignore,".") do
if cOrig == cIgnore then
addChar = 0
break -- no other char possible
end
end
if addChar>0 then new = new..cOrig end
end
return new
end
print(ignoreLetters("'s-Hertogenbosch"))
print(ignoreLetters("'s-Hertogen`bosch"))
The string ignore can also be a table if it makes the code shorter.
You can use string.gsub to replace any occurrence of a given pattern in a string with another string. To delete the unwanted characters, simply replace them with an empty string.
local ignore = "'`'"
function ignoreLetters( c )
return (c:gsub("["..ignore.."]+", ""))
end
print(ignoreLetters("'s-Hertogenbosch"))
print(ignoreLetters("'s-Hertogen`bosch"))
Just be aware that if you want to ignore magic characters (such as %, ], ^ or -), you'll have to escape them in your pattern.
This should give you a starting point; perfecting it is left to you.
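The escaping caveat is not specific to Lua. As a rough comparison, here is a Python sketch of the same deletion done two ways: with a translation table, which needs no escaping, and with a regular-expression character class, where the characters are escaped first. The function names and the IGNORE constant are hypothetical, and the regex variant assumes the ignore string is not empty.

```python
import re

IGNORE = "'`"  # hypothetical set of characters to delete


def ignore_letters_translate(text, ignore=IGNORE):
    # A deletion table treats every character literally, so no escaping is needed.
    return text.translate(str.maketrans("", "", ignore))


def ignore_letters_regex(text, ignore=IGNORE):
    # re.escape protects characters such as ] ^ - \ inside the [...] class.
    # Assumes ignore is non-empty; "[]" is not a valid pattern.
    return re.sub("[" + re.escape(ignore) + "]", "", text)


print(ignore_letters_translate("'s-Hertogenbosch"))   # s-Hertogenbosch
print(ignore_letters_regex("'s-Hertogen`bosch"))      # s-Hertogenbosch
print(ignore_letters_regex("a]b-c^d", ignore="]-^"))  # abcd
```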
How can I convert a character code to a string character in Lua?
E.g.
d = 48
-- this is what I want
str_d = "0"
You are looking for string.char:
string.char (···)
Receives zero or more integers. Returns a string with length equal to the number of arguments, in which each character has the internal numerical code equal to its corresponding argument.
Note that numerical codes are not necessarily portable across platforms.
For your example:
local d = 48
local str_d = string.char(d) -- str_d == "0"
For ASCII characters, you can use string.char.
For UTF-8 strings, you can use utf8.char (introduced in Lua 5.3) to get a character from its code point.
print(utf8.char(48)) -- 0
print(utf8.char(29790)) -- 瑞
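For comparison only, the same conversion in Python is a single built-in call. This sketch is not Lua; it just shows the code-point-to-character direction and its inverse, using the same two values as above.

```python
d = 48
str_d = chr(d)      # code point -> one-character string
print(str_d)        # 0
print(chr(29790))   # 瑞
print(ord(str_d))   # 48, the inverse direction
```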
How can I create a Classic ASP function that removes all the stray characters that come from pasting Excel columns and keeps only letters (A..Z), digits (0..9) and the semicolon (;)?
I need the ; to know where to split the variables, but in some cases the pasted text contains a special tabular character from Excel, and I don't know what the whitespace around the ; actually is. So I need to remove everything except letters, digits and ; .
123456 ; newVendor
987654321 ; vendor2
I found the function below for Oracle, but how can I make it accept the semicolon (;)?
auxTexto:=trim(TRANSLATE(regexp_replace(upper(texto),'[[:punct:]]','') , '.ÁÀÃÂÄÉÈÊËÍÌÎÏÓÒÕÔÖÚÙÛÜÇ_ ','.AAAAAEEEEIIIIOOOOOUUUUC_'));
return auxTexto;
Untested:
' Assume the string you want to operate on is called inString.
' We will create a string called outString that contains only letters, numbers, and semicolon
dim acceptableCharacters
acceptableCharacters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789;"
dim outString, i
outString = ""
for i = 1 to Len(inString)
if InStr(acceptableCharacters, Mid(inString, i, 1)) > 0 then
outString = outString & Mid(inString, i, 1)
end if
next
This isn't horribly efficient and there are lots of ways to test if the value at position i in inString is in the set of acceptable letters, but this is one way of doing it. Also, it doesn't take into account that there are letters other than A-Z and a-z, such as é and ά, depending on the language and character set we're talking about. But it gives you an idea of how to do something like this in VB/ASP.
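The same whitelist idea, followed by the split on ; that the question is after, can be sketched in Python. This is an illustration rather than ASP code: the names ACCEPTABLE and keep_acceptable and the sample row are invented for the example, and like the loop above it only accepts unaccented A-Z, a-z and 0-9.

```python
import string

# Letters, digits and the semicolon; everything else is dropped.
ACCEPTABLE = set(string.ascii_letters + string.digits + ";")


def keep_acceptable(text):
    """Return text with every character outside ACCEPTABLE removed."""
    return "".join(ch for ch in text if ch in ACCEPTABLE)


# Hypothetical pasted row containing a tab and spaces around the separator.
row = "123456 \t; newVendor"
cleaned = keep_acceptable(row)
fields = cleaned.split(";")

print(cleaned)  # 123456;newVendor
print(fields)   # ['123456', 'newVendor']
```

Using a set makes each membership test constant-time, which addresses the efficiency remark above without changing the approach.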
I have a somewhat esoteric problem. My program wants to decode morse code.
The point is, I will need to handle any character. Any random characters that adhere to my system and can correspond to a letter should be accepted. That is, the letter "Q" is represented by "- - . -", but my program will accept any string of characters (separated by the appropriate newchar signal) as Q, for example "dj ir j kw" (long long short long).
There is a danger of falling out of sync, so I will need to implement a "new character" signal. I chose this to be "xxxx", i.e. 4 letters. For the blank space symbol, I chose "xxxxxx", 6 chars.
Long story short, how can I split the string to be decoded into readable characters based on the length of the delimiter (4 continuous symbols), since I can't know in advance which letters will make up the newchar delimiter?
The question is not very clearly worded.
For instance, here you show a space as the delimiter between parts of the symbol Q:
for example "dj ir j kw" (long long short long)
Later you say:
For white, blank space symbol, I chose "xxxxxx", 6 chars.
Is that the symbol for whitespace, or the delimiter you use within a symbol (such as Q, above)? Your post doesn't say.
In this case, as always, an example is worth a thousand words. You should have shown a few examples of possible input and shown how you'd like them parsed.
If what you meant was that "dj ir j kw jfkl abpzoq jfkl dj ir j kw" should be decoded as "Q Q", and you just want to know how to match tokens by their length, then the question is easy. There's a million ways you could do that.
In Lua, I'd do it in two passes. First, convert the message into a string containing only the length of each chunk of consecutive characters:
message = 'dj ir j kw jfkl abpzoq jfkl dj ir j kw'
message = message:gsub('(%S+)%s*', function(s) return #s end)
print(message) --> 22124642212
Then split on the number 4 to get each group
for group in message:gmatch('[^4]+') do
print(group)
end
Which gives you:
2212
6
2212
So you could convert something like this:
function translate(message)
local lengthToLetter = {
['2212'] = 'Q',
[ '6'] = ' ',
}
local translation = {}
message = message:gsub('(%S+)%s*', function(s) return #s end)
for group in message:gmatch('[^4]+') do
table.insert(translation, lengthToLetter[group] or '?')
end
return table.concat(translation)
end
print(translate(message))
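One limitation of the digit-string trick is that a chunk of ten or more characters would turn into two digits and blur the group boundaries. A possible variant, sketched here in Python rather than Lua, keeps the lengths as a list of integers and groups them into tuples. The names LENGTHS_TO_LETTER and new_char_len are hypothetical, and the table only contains the two entries from the example above.

```python
# Keys are tuples of chunk lengths; only the entries from the example are filled in.
LENGTHS_TO_LETTER = {
    (2, 2, 1, 2): "Q",  # long long short long
    (6,): " ",          # the 6-character blank symbol
}


def translate(message, new_char_len=4):
    """Decode a message by chunk length, splitting groups on new_char_len chunks."""
    groups, current = [], []
    for chunk in message.split():
        if len(chunk) == new_char_len:
            # A delimiter chunk closes the current group, if there is one.
            if current:
                groups.append(tuple(current))
            current = []
        else:
            current.append(len(chunk))
    if current:
        groups.append(tuple(current))
    # Unknown groups decode to '?', as in the Lua version.
    return "".join(LENGTHS_TO_LETTER.get(group, "?") for group in groups)


print(translate("dj ir j kw jfkl abpzoq jfkl dj ir j kw"))  # Q Q
```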
This will split a string by any len continuous occurrences of char, which may be a character or pattern character class (such as %s), or of any character (i.e. .) if char is not passed.
It does this by using backreferences in the pattern passed to string.find, e.g. (.)%1%1%1 to match any character repeated four times.
The rest is just a bog-standard string splitter; the only real Lua peculiarity here is the choice of pattern.
-- split str, using (char * len) as the delimiter
-- leave char blank to split on len repetitions of any character
local function splitter(str, len, char)
-- build pattern to match len continuous occurrences of char
-- "(x)%1%1%1%1" would match "xxxxx" etc.
local delim = "("..(char or ".")..")" .. string.rep("%1", len-1)
local pos, out = 1, {}
-- loop through the string, find the pattern,
-- and string.sub the rest of the string into a table
while true do
local m1, m2 = string.find(str, delim, pos)
-- no sign of the delimiter; add the rest of the string and bail
if not m1 then
out[#out+1] = string.sub(str, pos)
break
end
out[#out+1] = string.sub(str, pos, m1-1)
pos = m2+1
-- string ends with the delimiter; bail
if m2 == #str then
break
end
end
return out
end
-- and the result?
-- unpack is a global in Lua 5.1; in 5.2 and later it is table.unpack
print((table.unpack or unpack)(splitter("dfdsfsdfXXXXXsfsdfXXXXXsfsdfsdfsdf", 5)))
-- dfdsfsdf, sfsdf, sfsdfsdfsdf
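The backreference technique carries over to other regex engines. As a rough sketch in Python (not a port of the Lua function), the pattern (.)\1{n-1} matches any character repeated n times. The function name split_on_run and its parameters are hypothetical. Unlike the Lua version, it treats char as a literal character rather than a pattern class, and it keeps a trailing empty string when the input ends with the delimiter.

```python
import re


def split_on_run(text, run_length, char=None):
    """Split text on run_length consecutive repetitions of one character."""
    # With no char given, "." matches any character (DOTALL includes newlines).
    unit = re.escape(char) if char is not None else "."
    # Group 1 captures one character; \1{n-1} requires n-1 more copies of it.
    pattern = re.compile("(" + unit + r")\1{%d}" % (run_length - 1), re.DOTALL)

    parts, pos = [], 0
    for match in pattern.finditer(text):
        parts.append(text[pos:match.start()])
        pos = match.end()
    parts.append(text[pos:])  # whatever follows the last delimiter
    return parts


print(split_on_run("dfdsfsdfXXXXXsfsdfXXXXXsfsdfsdfsdf", 5))
# ['dfdsfsdf', 'sfsdf', 'sfsdfsdfsdf']
```

finditer is used instead of re.split because re.split would also return the captured character from each delimiter.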
I have a question about Fortran 77 that I've not been able to find a solution to.
I'm trying to store an array of strings defined as follows:
character matname(255)*255
Which is an array of 255 strings of length 255.
Later I read the list of names from a file and I set the content of the array like this:
matname(matcount) = mname
EDIT: Actually the value of mname is hardcoded as mname = 'AIR', of type character*255; it is a parameter of a function matadd() which executes the previous line. This is only for testing; in the future it will be read from a file.
Later on I want to print it with:
write(*,*) matname(matidx)
But it seems to print all 255 characters: it prints the string I assigned and a lot of garbage.
So my question is: how can I know the length of the string stored?
Should I have another array with all the lengths?
And how can I know the length of the string read?
Thanks.
You can use this function to get the length (without the trailing blanks):
      integer function strlen(st)
      integer i
      character st*(*)
c     scan backwards for the last non-blank character;
c     an all-blank string falls through the loop and gives 0
      do 10 i = len(st), 1, -1
         if (st(i:i) .ne. ' ') goto 20
   10 continue
      i = 0
   20 strlen = i
      return
      end
Adapted from: http://www.ibiblio.org/pub/languages/fortran/ch2-13.html
PS: When you write matname(matidx), you get the whole 255-character string, so that is your string plus blanks or garbage.
The function Timotei posted will give you the length of the string as long as everything after the part you are interested in consists only of spaces. If you are assigning the values in the program, that should be true, because Fortran pads a character assignment with blanks when the value is shorter than the variable.
However, if you are reading in from a file you might pick up other control characters at the end of the lines (particularly carriage return and/or line feed characters, \r and/or \n depending on your OS). You should also toss those out in the function to get the correct string length. Otherwise you could get some funny print statements as those characters are printed as well.
Here is my version of the function that checks for alternate white space characters at the end besides spaces.
      function strlen(st)
      integer i,strlen
      character st*(*)
      character c
c     char(13) = carriage return, char(10) = line feed, char(9) = tab
      do 10 i = len(st), 1, -1
         c = st(i:i)
         if (c.ne.' ' .and. c.ne.char(13) .and.
     +       c.ne.char(10) .and. c.ne.char(9)) goto 20
   10 continue
      i = 0
   20 strlen = i
      return
      end
If there are other characters in the "garbage" section this still won't work completely.
Assuming that it does work for your data, however, you can then change your write statement to look like this:
write(*,*) matname(matidx)(1:strlen(matname(matidx)))
and it will print out just the actual string.
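To make the fixed-width behaviour concrete outside Fortran, here is a small Python sketch that simulates a 255-character field padded with blanks and computes the trimmed length the same way. The helper name trimmed_length and the junk parameter are hypothetical, and the set of characters treated as trailing junk is an assumption that mirrors the function above.

```python
def trimmed_length(field, junk=" \t\r\n"):
    """Length of field once trailing blanks and line-end characters are ignored."""
    return len(field.rstrip(junk))


# Simulate assigning 'AIR' to a character*255 variable: padded with blanks.
field = "AIR".ljust(255)

print(len(field))                                  # 255
print(trimmed_length(field))                       # 3
print(":" + field[:trimmed_length(field)] + ":")   # :AIR:
```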
As to whether or not you should use another array to hold the lengths of the strings, that is up to you. The strlen() function is O(n), whereas looking up the length in a table is O(1). If you find yourself computing the lengths of these static strings often, it may improve performance to compute each length once when the string is read in, store the lengths in an array and look them up when you need them. However, if you don't notice the slowdown, I wouldn't worry about it.
Depending on the compiler that you are using, you may be able to use the trim() intrinsic function (Fortran 90 and later) to remove the trailing spaces from a string (it does not touch leading ones), then process it as you normally would, i.e.
character(len=25) :: my_string
my_string = 'AIR'
write (*,*) ':', trim(my_string), ':'
should print :AIR:.
Edit:
Better yet, it looks like there is a len_trim() function that returns the length of a string after it has been trimmed.
Intel and Compaq Visual Fortran have the intrinsic function LEN_TRIM(STRING), which returns the length without trailing blanks.
If you want to get rid of leading blanks, use "Adjust Left", i.e. ADJUSTL(STRING), which shifts the text to the left and pads the right with blanks.
In these FORTRANs I also note a useful feature: if you pass a string in to a function or subroutine as an argument, and inside the subroutine it is declared as CHARACTER*(*), then
using the LEN(STRING) function in the subroutine returns the actual length of the string passed in, and not the length of the string as declared in the calling program.
Example:
CHARACTER*1000 STRING
.
.
CALL SUBNAM(STRING(1:72))
SUBROUTINE SUBNAM(STRING)
CHARACTER*(*) STRING
LEN(STRING) will be 72, not 1000