How to detect if a character from a string is upper or lower case? - string

I'm expanding a class of mine for storing generic size strings to allow more flexible values for user input. For example, my prior version of this class was strict and allowed only the format of 2x3 or 9x12. But now I'm making it so it can support values such as 2 x 3 or 9 X 12 and automatically maintain the original user's formatting if the values get changed.
The real question I'm trying to figure out is just how to detect whether one character from a string is upper or lower case, because I have to handle case sensitivity. If the delimiter is 'x' (lowercase) and the user inputs 'X' (uppercase) inside the value, and case sensitivity is turned off, I need to be able to find the opposite case as well.
I mean, the Pos() function is case sensitive...

Delphi 7 has UpperCase() and LowerCase() functions for strings. There's also UpCase() for characters.
If I want to search for a substring within another string case insensitively, I do this:
if Pos('needle', LowerCase(hayStack)) > 0 then
You simply use lower case string literals (or constants) and apply the lowercase function on the string before the search. If you'll be doing a lot of searches, it makes sense to convert just once into a temp variable.
Here's your case:
a := '2 x 3'; // Lowercase x
b := '9 X 12'; // Upper case X
x := Pos('x', LowerCase(a)); // x = 3
x := Pos('x', LowerCase(b)); // x = 3
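The same idea can be wrapped in a small helper that honours a case-sensitivity switch. This is only a sketch: FindDelimiter and its parameters are hypothetical names, not part of the RTL. Because the original string is never modified, the character at the returned position still shows the user's own spelling.

```pascal
program FindDelimDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

// Hypothetical helper: returns the 1-based position of Delim in Value,
// or 0 if it is not found. Value itself is left untouched, so the caller
// can still read the user's original formatting at that position.
function FindDelimiter(const Delim, Value: string;
  CaseSensitive: Boolean): Integer;
begin
  if CaseSensitive then
    Result := Pos(Delim, Value)
  else
    Result := Pos(LowerCase(Delim), LowerCase(Value));
end;

var
  S: string;
  P: Integer;
begin
  S := '9 X 12';
  P := FindDelimiter('x', S, False);
  if P > 0 then
    Writeln('Delimiter "', S[P], '" found at position ', P)
  else
    Writeln('No delimiter found');
end.
```

With the input above this reports the delimiter "X" at position 3, so the upper-case spelling can be written back unchanged.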
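To see whether a character is upper or lower case, simply compare it against its UpCase version. Keep in mind that characters without a case, such as digits or spaces, also equal their UpCase version, so this test is only meaningful for letters: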
a := 'A';
b := 'b';
upper := a = UpCase(a); // True
upper := b = UpCase(b); // False
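A slightly more defensive variant first checks that the character is a letter at all. The helper names below are made up for illustration, and the sketch assumes plain ASCII letters only (UpCase does not convert accented characters).

```pascal
program CaseCheckDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper: True only for the ASCII letters A..Z and a..z.
function IsAsciiLetter(C: Char): Boolean;
begin
  case C of
    'A'..'Z', 'a'..'z': Result := True;
  else
    Result := False;
  end;
end;

// A letter is upper case if converting it to upper case changes nothing.
function IsUpperLetter(C: Char): Boolean;
begin
  Result := IsAsciiLetter(C) and (C = UpCase(C));
end;

// A letter is lower case if converting it to upper case changes it.
function IsLowerLetter(C: Char): Boolean;
begin
  Result := IsAsciiLetter(C) and (C <> UpCase(C));
end;

var
  S: string;
  I: Integer;
begin
  S := '9 X 12';
  // Digits and spaces are skipped because they are not letters.
  for I := 1 to Length(S) do
    if IsUpperLetter(S[I]) then
      Writeln('"', S[I], '" is an upper-case letter')
    else if IsLowerLetter(S[I]) then
      Writeln('"', S[I], '" is a lower-case letter');
end.
```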

Try using these functions (which are part of the Character unit):
Character.TCharacter.IsUpper
Character.TCharacter.IsLower
UPDATE
For ANSI versions of Delphi you can use the GetStringTypeEx function to fill a table with the type information of each ANSI character, and then test each element against the $0001 (upper case) or $0002 (lower case) flags.
uses
Windows,
SysUtils;
Var
LAnsiChars: array [AnsiChar] of Word;
procedure FillCharList;
var
lpSrcStr: AnsiChar;
lpCharType: Word;
begin
for lpSrcStr := Low(AnsiChar) to High(AnsiChar) do
begin
lpCharType := 0;
GetStringTypeExA(LOCALE_USER_DEFAULT, CT_CTYPE1, @lpSrcStr, SizeOf(lpSrcStr), lpCharType);
LAnsiChars[lpSrcStr] := lpCharType;
end;
end;
function CharIsLower(const C: AnsiChar): Boolean;
const
C1_LOWER = $0002;
begin
Result := (LAnsiChars[C] and C1_LOWER) <> 0;
end;
function CharIsUpper(const C: AnsiChar): Boolean;
const
C1_UPPER = $0001;
begin
Result := (LAnsiChars[C] and C1_UPPER) <> 0;
end;
begin
try
FillCharList;
Writeln(CharIsUpper('a'));
Writeln(CharIsUpper('A'));
Writeln(CharIsLower('a'));
Writeln(CharIsLower('A'));
except
on E:Exception do
Writeln(E.Classname, ': ', E.Message);
end;
Readln;
end.

if myChar in ['A'..'Z'] then
begin
// uppercase
end
else
if myChar in ['a'..'z'] then
begin
// lowercase
end
else
begin
// not an alpha char
end;
Or, from Delphi 2009 on:
if charInSet(myChar,['A'..'Z']) then
begin
// uppercase
end
else
if charInSet(myChar,['a'..'z']) then
begin
// lowercase
end
else
begin
// not an alpha char
end;

The JCL has routines for this in the JclStrings unit, e.g. CharIsUpper and CharIsLower. Should work in Delphi 7.

AnsiPos() is case-sensitive as well, so it will not find the opposite case by itself. You can, however, force upper or lower case, irrespective of what the user enters, using UpperCase() and LowerCase().
Just throwing this out there since you may find it far more simple than the other (very good) answers.

Related

How do I compare 2 strings in Pascal?

So I have 2 strings and I want to be able to say if the 2 strings are the same or not. The only problem is I'm filling the strings one character at a time using a while loop, so if I use length/ord it doesn't work properly. I guess if you see the code I'm working with you will have an easier task helping me out, so I'll just paste it here.
var
cad1, cad2: string;
car: char;
icad1, icad2: integer;
begin
car := 'o';
icad1 := 1;
icad2 := 1;
write('Write the cad1: ');
while (car<>'.') do begin
car := readkey;
cad1 := car;
write(car);
inc(icad1);
end;
car := 'o';
writeln;
write('Write thecad2: ');
while (car <> '.') do begin
car := readkey;
cad2 := car;
write(car);
inc(icad2);
end;
writeln;
end.
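You just have to call:
CompareText(cad1, cad2)
It returns 0 if the two strings are the same. Note that CompareText ignores case; use CompareStr or the = operator for a case-sensitive comparison.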
http://www.freepascal.org/docs-html/rtl/sysutils/comparetext.html
There are several problems in your code. For example, the line cad1:=car; assigns the character to the string. That means the resulting string contains only one character, equal to car, and all previous input is lost.
The simplest way to input the strings and compare them is the following:
write('Write the cad1: ');
readln(cad1);
write('Write the cad2: ');
readln(cad2);
write(cad1=cad2);
readln;
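If the strings really have to be built one character at a time, the fix is to append each character instead of overwriting the string. The sketch below replaces the ReadKey loop with a hypothetical helper, CollectUntilDot, that takes the key presses as a string so it can run without a keyboard; inside a real ReadKey loop the important line is the same, cad1 := cad1 + car.

```pascal
program CompareDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

// Hypothetical stand-in for the ReadKey loop: appends the characters of
// Keys until a '.' is seen, the way each key press should be appended.
function CollectUntilDot(const Keys: string): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(Keys) do
  begin
    if Keys[I] = '.' then
      Break;
    Result := Result + Keys[I]; // append, do not overwrite
  end;
end;

var
  cad1, cad2: string;
begin
  cad1 := CollectUntilDot('Hello.');
  cad2 := CollectUntilDot('hello.');
  // The = operator is case-sensitive, CompareText is not.
  Writeln('Exact match:            ', cad1 = cad2);
  Writeln('Case-insensitive match: ', CompareText(cad1, cad2) = 0);
end.
```

For these two inputs the exact comparison is false and the case-insensitive one is true.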

Strange behaviour when simply adding strings in Lazarus - FreePascal

The program has several "encryption" algorithms. This one should blockwise reverse the input. "He|ll|o " becomes "o |ll|He" (block length of 2).
I add two strings, in this case appending the result string to the current "block" string and making that the result. When I add the result first and then the block, it works fine and gives me back the original string. But when I try to reverse the order, it just gives me the last "block".
Several other functions that are used for "rotation" are above.
//amount of blocks
function amBl(i1:integer;i2:integer):integer;
begin
if (i1 mod i2) <> 0 then result := (i1 div i2) else result := (i1 div i2) - 1;
end;
//calculation of block length
function calcBl(keyStr:string):integer;
var i:integer;
begin
result := 0;
for i := 1 to Length(keyStr) do
begin
result := (result + ord(keyStr[i])) mod 5;
result := result + 2;
end;
end;
//desperate try to add strings
function append(s1,s2:string):string;
begin
insert(s2,s1,Length(s1)+1);
result := s1;
end;
function rotation(inStr,keyStr:string):string;
var //array of chars -> string
block,temp:string;
//position in block variable
posB:integer;
//block length and block count variable
bl, bc:integer;
//null character as placeholder
n : ansiChar;
begin
//calculating block length 2..6
bl := calcBl(keyStr);
setLength(block,bl);
result := '';
temp := '';
{n := #00;}
for bc := 0 to amBl(Length(inStr),bl) do
begin
//filling block with chars starting from back of virtual block (in inStr)
for posB := 1 to bl do
begin
block[posB] := inStr[bc * bl + posB];
{if inStr[bc * bl + posB] = ' ' then block[posB] := n;}
end;
//adding the block in front of the existing result string
temp := result;
result := block + temp;
//result := append(block,temp);
//result := concat(block,temp);
end;
end;
(full code http://pastebin.com/6Uarerhk)
After all the loops "result" has the right value, but in the last step (between "result := block + temp" and the "end;" of the function) "block" completely replaces the content of "result" with itself; the previous result is no longer added at the end.
And as you can see, I even used a temp variable to try to work around that. It doesn't change anything, though.
I am 99.99% certain that your problem is due to a subtle bug in your code. However, your deliberate efforts to hide the relevant code mean that we're really shooting in the dark. You haven't even been clear about where you're seeing the shortened Result: in a GUI control, the debugger, or Writeln output.
The irony is that you have all the information at your fingertips to provide a small concise demonstration of your problem - including sample input and expected output.
So without the relevant information, I can only guess; I do think I have a good hunch though.
Try the following code and see if you have a similar experience with S3:
S1 := 'a'#0;
S2 := 'bc';
S3 := S1 + S2;
The reason for my hunch is that #0 is a valid character in a string: but whenever that string needs to be processed as PChar, #0 will be interpreted as a string terminator. This could very well cause the "strange behaviour" you're seeing.
So it's quite probable that you have at least one of the following two bugs in your code:
1. You are always processing one character too many, with the extra character being #0.
2. When your input string has an odd number of characters, your algorithm (which relies on pairs of characters) adds an extra character with value #0.
Edit
With the additional source code, my hunch is confirmed:
Suppose you have a 5 character string, and key that produces block length 2.
Your inner loop (for posB := 1 to bl do) will read beyond the length of inStr on the last iteration of the outer loop.
So if the next character in memory happens to be #0, you will be doing exactly as described above.
Additional problem. You have the following code:
//calculating block length 2..6
bl := calcBl(keyStr);
Your assumption in the comment is wrong. From the implementation of calcBl, if keyStr is empty, your result will be 0.
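Both problems can be avoided by never indexing past the end of the input and by rejecting a zero block length. The following is a minimal sketch, not the asker's full algorithm: ReverseBlocks is a hypothetical name, and the block length is passed in directly instead of being derived from a key.

```pascal
program BlockReverseDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

// Sketch of a block-wise reversal that never reads past the end of InStr.
function ReverseBlocks(const InStr: string; BlockLen: Integer): string;
var
  Start: Integer;
begin
  Result := '';
  if BlockLen < 1 then
    Exit; // guards against a zero block length, e.g. from an empty key
  Start := 1;
  while Start <= Length(InStr) do
  begin
    // Copy clamps the count to the characters that actually exist,
    // so a short final block is simply shorter and no #0 sneaks in.
    Result := Copy(InStr, Start, BlockLen) + Result;
    Inc(Start, BlockLen);
  end;
end;

begin
  Writeln('[', ReverseBlocks('Hello ', 2), ']'); // prints [o llHe]
  Writeln('[', ReverseBlocks('Hello', 2), ']');  // prints [ollHe]
end.
```

Note that when the last block is shorter than the others, applying the same function a second time does not restore the original text, because the short block is then at the front; a matching decryption step would have to account for that.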

How do I count characters in a string, excluding certain types?

I need to determine the total number of characters in a textbox and display the value in a label, but all whitespace needs to be excluded.
Here is the code:
var
sLength : string;
i : integer;
begin
sLength := edtTheText.Text;
slength:= ' ';
i := length(sLength);
//display the length of the string
lblLength.Caption := 'The string is ' + IntToStr(i) + ' characters long';
You can count the non-white space characters like this:
uses
Character;
function NonWhiteSpaceCharacterCount(const str: string): Integer;
var
c: Char;
begin
Result := 0;
for c in str do
if not Character.IsWhiteSpace(c) then
inc(Result);
end;
This uses Character.IsWhiteSpace to determine whether or not a character is whitespace. IsWhiteSpace returns True if and only if the character is classified as being whitespace, according to the Unicode specification. So, tab characters count as whitespace.
If you are using an Ansi version of Delphi you can also use a Lookup Table with something like
NotBlanks: Array[0..255] Of Boolean
A Boolean in the array is set if the matching character is not a blank. Then, in the loop, you simply increment your counter:
Count := 0;
For i := 1 To Length(MyStringToParse) Do
Inc(Count, Byte(NotBlanks[Ord(MyStringToParse[i])]));
In the same fashion you can use a set:
For i := 1 To Length(MyStringToParse) Do
If Not (MyStringToParse[i] In [#1,#2{define the blanks in this enum}]) Then
Inc(Count);
Actually you have many ways to solve this.
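For completeness, here is one way the lookup table could be filled and used. It is a sketch under assumptions: the table is indexed by AnsiChar rather than 0..255, the names InitNotBlanks and CountNonBlank are invented for the example, and the choice of which characters count as blanks (space, tab, CR, LF) is arbitrary.

```pascal
program CountNonBlanks;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

var
  // True for every character that should be counted.
  NotBlanks: array[AnsiChar] of Boolean;

procedure InitNotBlanks;
var
  C: AnsiChar;
begin
  for C := Low(AnsiChar) to High(AnsiChar) do
    NotBlanks[C] := True;
  // Which characters count as blanks is an assumption; adjust as needed.
  NotBlanks[' '] := False;
  NotBlanks[#9] := False;
  NotBlanks[#10] := False;
  NotBlanks[#13] := False;
end;

function CountNonBlank(const S: AnsiString): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if NotBlanks[S[I]] then
      Inc(Result);
end;

begin
  InitNotBlanks;
  Writeln(CountNonBlank('a b'#9'c d')); // prints 4
end.
```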

Check if a string contains a word but only in specific position?

How can I check if a string contains a substring, but only in a specific position?
Example string:
What is your favorite color? my [favorite] color is blue
If I wanted to check if the string contained a specific word I usually do this:
var
S: string;
begin
S := 'What is your favorite color? my [favorite] color is blue';
if (Pos('favorite', S) > 0) then
begin
//
end;
end;
What I need is to determine if the word favorite exists in the string, ignoring though if it appears inside the [ ] symbols, which the above code sample clearly does not do.
So if we put the code into a boolean function, some sample results would look like this:
TRUE: What is your favorite color? my [my favorite] color is blue
TRUE: What is your favorite color? my [blah blah] color is blue
FALSE: What is your blah blah color? my [some favorite] color is blue
The first two samples above are true because the word favorite is found outside of the [ ] symbols, whether it is inside them or not.
The 3rd sample is false because even though there is the word favorite, it only appears inside the [ ] symbols - we should only check if it exists outside of the symbols.
So I need a function to determine whether or not a word (favorite in this example) appears in a string, but ignoring the fact if the word is surrounded inside [ ] symbols.
I like Sertac's idea about deleting strings enclosed by brackets and searching for a string after that. Here is a code sample extended by a search for whole words and case sensitivity:
function ContainsWord(const AText, AWord: string; AWholeWord: Boolean = True;
ACaseSensitive: Boolean = False): Boolean;
var
S: string;
BracketEnd: Integer;
BracketStart: Integer;
SearchOptions: TStringSearchOptions;
begin
S := AText;
BracketEnd := Pos(']', S);
BracketStart := Pos('[', S);
while (BracketStart > 0) and (BracketEnd > 0) do
begin
Delete(S, BracketStart, BracketEnd - BracketStart + 1);
BracketEnd := Pos(']', S);
BracketStart := Pos('[', S);
end;
SearchOptions := [soDown];
if AWholeWord then
Include(SearchOptions, soWholeWord);
if ACaseSensitive then
Include(SearchOptions, soMatchCase);
Result := Assigned(SearchBuf(PChar(S), StrLen(PChar(S)), 0, 0, AWord,
SearchOptions));
end;
Here is an optimized version of the function, which iterates over the string with a character pointer instead of modifying a copy of it. Unlike the previous version, it also handles a string with a missing closing bracket, for instance My [favorite color is. Such a string evaluates to True, because an unclosed bracket is treated as ordinary text.
The principle is to go through the whole string character by character. When you find an opening bracket, check whether it has a matching closing bracket. If it does, check whether the substring from the stored position up to the opening bracket contains the searched word. If it does, exit the function. If not, move the stored position to the closing bracket. If the opening bracket has no closing bracket, search for the word from the stored position to the end of the whole string and exit the function.
function ContainsWord(const AText, AWord: string; AWholeWord: Boolean = True;
ACaseSensitive: Boolean = False): Boolean;
var
CurrChr: PChar;
TokenChr: PChar;
TokenLen: Integer;
SubstrChr: PChar;
SubstrLen: Integer;
SearchOptions: TStringSearchOptions;
begin
Result := False;
if (Length(AText) = 0) or (Length(AWord) = 0) then
Exit;
SearchOptions := [soDown];
if AWholeWord then
Include(SearchOptions, soWholeWord);
if ACaseSensitive then
Include(SearchOptions, soMatchCase);
CurrChr := PChar(AText);
SubstrChr := CurrChr;
SubstrLen := 0;
while CurrChr^ <> #0 do
begin
if CurrChr^ = '[' then
begin
TokenChr := CurrChr;
TokenLen := 0;
while (TokenChr^ <> #0) and (TokenChr^ <> ']') do
begin
Inc(TokenChr);
Inc(TokenLen);
end;
if TokenChr^ = #0 then
SubstrLen := SubstrLen + TokenLen;
Result := Assigned(SearchBuf(SubstrChr, SubstrLen, 0, 0, AWord,
SearchOptions));
if Result or (TokenChr^ = #0) then
Exit;
CurrChr := TokenChr;
SubstrChr := CurrChr;
SubstrLen := 0;
end
else
begin
Inc(CurrChr);
Inc(SubstrLen);
end;
end;
Result := Assigned(SearchBuf(SubstrChr, SubstrLen, 0, 0, AWord,
SearchOptions));
end;
In regular expressions, there is a thing called look-around you could use. In your case you can solve it with negative lookbehind: you want "favorite" unless it's preceded with an opening bracket. It could look like this:
(?<!\[[^\[\]]*)favorite
Step by step: (?<! is the negative lookbehind prefix; inside it we look for \[ followed by zero or more characters that are neither opening nor closing brackets, [^\[\]]*; the lookbehind is closed with ), and favorite comes right after. Note that this is a variable-length lookbehind, which not every regex engine supports (.NET does, PCRE does not).
I think you can reword your problem as "find an occurrence of the provided string that is not surrounded by square brackets." If that describes your issue, then you can use a simple regular expression like [^\[]favorite[^\]]. Be aware that this pattern only inspects the single character immediately before and after the word.

How to find a position of a substring within a string with fuzzy match

I have come across a problem of matching a string in an OCR recognized text and find the position of it considering there can be arbitrary tolerance of wrong, missing or extra characters. The result should be a best match position, possibly (not necessarily) with length of matching substring.
For example:
String: 9912, 1.What is your name?
Substring: 1. What is your name?
Tolerance: 1
Result: match on character 7
String: Where is our caat if any?
Substring: your cat
Tolerance: 2
Result: match on character 10
String: Tolerance is t0o h1gh.
Substring: Tolerance is too high;
Tolerance: 1
Result: no match
I have tried to adapt the Levenshtein algorithm, but it doesn't work properly for substrings and doesn't return a position.
Algorithm in Delphi would be preferred, yet any implementation or pseudo logic would do.
Here's a recursive implementation that works, but might not be fast enough. The worst case is when no match can be found and all but the last character of "What" is matched at every index in "Where". In that case the algorithm makes Length(What)-1 + Tolerance comparisons for each character in Where, plus one recursive call per Tolerance. Since both Tolerance and the length of What are constants, I'd say the algorithm is O(n). Its performance will degrade linearly with the length of both "What" and "Where".
function BrouteFindFirst(What, Where:string; Tolerance:Integer; out AtIndex, OfLength:Integer):Boolean;
var i:Integer;
aLen:Integer;
WhatLen, WhereLen:Integer;
function BrouteCompare(wherePos, whatPos, Tolerance:Integer; out Len:Integer):Boolean;
var aLen:Integer;
aRecursiveLen:Integer;
begin
// Skip perfect match characters
aLen := 0;
while (whatPos <= WhatLen) and (wherePos <= WhereLen) and (What[whatPos] = Where[wherePos]) do
begin
Inc(aLen);
Inc(wherePos);
Inc(whatPos);
end;
// Did we find a match?
if (whatPos > WhatLen) then
begin
Result := True;
Len := aLen;
end
else if Tolerance = 0 then
Result := False // No match and no more "wild cards"
else
begin
// We'll make an recursive call to BrouteCompare, allowing for some tolerance in the string
// matching algorithm.
Dec(Tolerance); // use up one "wildcard"
Inc(whatPos); // consider the current char matched
if BrouteCompare(wherePos, whatPos, Tolerance, aRecursiveLen) then
begin
Len := aLen + aRecursiveLen;
Result := True;
end
else if BrouteCompare(wherePos + 1, whatPos, Tolerance, aRecursiveLen) then
begin
Len := aLen + aRecursiveLen;
Result := True;
end
else
Result := False; // no luck!
end;
end;
begin
WhatLen := Length(What);
WhereLen := Length(Where);
for i:=1 to Length(Where) do
begin
if BrouteCompare(i, 1, Tolerance, aLen) then
begin
AtIndex := i;
OfLength := aLen;
Result := True;
Exit;
end;
end;
// No match found!
Result := False;
end;
I've used the following code to test the function:
procedure TForm18.Button1Click(Sender: TObject);
var AtIndex, OfLength:Integer;
begin
if BrouteFindFirst(Edit2.Text, Edit1.Text, ComboBox1.ItemIndex, AtIndex, OfLength) then
Label3.Caption := 'Found #' + IntToStr(AtIndex) + ', of length ' + IntToStr(OfLength)
else
Label3.Caption := 'Not found';
end;
For case:
String: Where is our caat if any?
Substring: your cat
Tolerance: 2
Result: match on character 10
it shows a match on character 9, of length 6. For the other two examples it gives the expected result.
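As an alternative to the recursive search, the Levenshtein idea mentioned in the question can be adapted for substrings by letting a match start anywhere in the text at no cost (the first row of the edit-distance table is all zeros). The sketch below is not the code from the answer above; FuzzyPos and its variable names are invented for the example, and it reports the first end position that reaches the smallest edit distance within the tolerance. The start position is tracked alongside each distance so that both position and length can be returned.

```pascal
program FuzzyPosDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper: approximate substring search by edit distance.
function FuzzyPos(const What, Where: string; Tolerance: Integer;
  out AtIndex, OfLength: Integer): Boolean;
var
  M, N, I, J, Cost, Best: Integer;
  PrevD, CurD: array of Integer; // edit distances for previous/current text position
  PrevS, CurS: array of Integer; // number of text chars before the match start
begin
  Result := False;
  AtIndex := 0;
  OfLength := 0;
  M := Length(What);
  N := Length(Where);
  if M = 0 then
    Exit;
  SetLength(PrevD, M + 1); SetLength(CurD, M + 1);
  SetLength(PrevS, M + 1); SetLength(CurS, M + 1);
  // Before any text is read, matching I pattern chars costs I edits.
  for I := 0 to M do
  begin
    PrevD[I] := I;
    PrevS[I] := 0;
  end;
  Best := Tolerance + 1;
  for J := 1 to N do
  begin
    CurD[0] := 0; // a match may begin after any text position for free
    CurS[0] := J;
    for I := 1 to M do
    begin
      if What[I] = Where[J] then Cost := 0 else Cost := 1;
      CurD[I] := PrevD[I - 1] + Cost; // match or substitution
      CurS[I] := PrevS[I - 1];
      if PrevD[I] + 1 < CurD[I] then  // extra character in Where
      begin
        CurD[I] := PrevD[I] + 1;
        CurS[I] := PrevS[I];
      end;
      if CurD[I - 1] + 1 < CurD[I] then // character of What missing in Where
      begin
        CurD[I] := CurD[I - 1] + 1;
        CurS[I] := CurS[I - 1];
      end;
    end;
    if CurD[M] < Best then // strictly better match ending at position J
    begin
      Best := CurD[M];
      AtIndex := CurS[M] + 1;
      OfLength := J - CurS[M];
      Result := True;
    end;
    for I := 0 to M do
    begin
      PrevD[I] := CurD[I];
      PrevS[I] := CurS[I];
    end;
  end;
end;

var
  P, L: Integer;
begin
  if FuzzyPos('1. What is your name?', '9912, 1.What is your name?', 1, P, L) then
    Writeln('Match at ', P, ', length ', L)
  else
    Writeln('No match');
  if not FuzzyPos('Tolerance is too high;', 'Tolerance is t0o h1gh.', 1, P, L) then
    Writeln('No match');
end.
```

For the first example from the question this should report a match at position 7 with length 20, and for the third example no match. The running time is proportional to Length(What) * Length(Where), independent of the tolerance. When several alignments have the same distance, the reported start position depends on the tie-breaking order in the inner loop.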
Here is a complete sample of fuzzy match (approximate search), and you can use/change the algorithm as you wish!
https://github.com/alidehban/FuzzyMatch

Resources