Strange behaviour when simply adding strings in Lazarus - FreePascal - string

The program has several "encryption" algorithms. This one should blockwise reverse the input. "He|ll|o " becomes "o |ll|He" (block length of 2).
I add two strings, in this case appending the result string to the current "block" string and making that the result. When I add the result first and then the block, it works fine and gives me back the original string. But when I try to reverse the order, it just gives me the last "block".
Several other functions that are used for "rotation" are above.
//amount of blocks
function amBl(i1:integer;i2:integer):integer;
begin
if (i1 mod i2) <> 0 then result := (i1 div i2) else result := (i1 div i2) - 1;
end;
//calculation of block length
function calcBl(keyStr:string):integer;
var i:integer;
begin
result := 0;
for i := 1 to Length(keyStr) do
begin
result := (result + ord(keyStr[i])) mod 5;
result := result + 2;
end;
end;
//desperate try to add strings
function append(s1,s2:string):string;
begin
insert(s2,s1,Length(s1)+1);
result := s1;
end;
function rotation(inStr,keyStr:string):string;
var //array of chars -> string
block,temp:string;
//position in block variable
posB:integer;
//block length and block count variable
bl, bc:integer;
//null character as placeholder
n : ansiChar;
begin
//calculating block length 2..6
bl := calcBl(keyStr);
setLength(block,bl);
result := '';
temp := '';
{n := #00;}
for bc := 0 to amBl(Length(inStr),bl) do
begin
//filling block with chars starting from back of virtual block (in inStr)
for posB := 1 to bl do
begin
block[posB] := inStr[bc * bl + posB];
{if inStr[bc * bl + posB] = ' ' then block[posB] := n;}
end;
//adding the block in front of the existing result string
temp := result;
result := block + temp;
//result := append(block,temp);
//result := concat(block,temp);
end;
end;
(full code http://pastebin.com/6Uarerhk)
After all the loops "result" has the right value, but in the last step (between "result := block + temp" and the "end;" of the function) "block" completely replaces the content of "result" with itself; it no longer adds result at the end.
And as you can see, I even used a temp variable to try to work around that. It doesn't change anything, though.

I am 99.99% certain that your problem is due to a subtle bug in your code. However, your deliberate efforts to hide the relevant code mean that we're really shooting in the dark. You haven't even been clear about where you're seeing the shortened Result: in a GUI control, in the debugger, or in Writeln output.
The irony is that you have all the information at your fingertips to provide a small concise demonstration of your problem - including sample input and expected output.
So without the relevant information, I can only guess; I do think I have a good hunch though.
Try the following code and see if you have a similar experience with S3:
S1 := 'a'#0;
S2 := 'bc';
S3 := S1 + S2;
The reason for my hunch is that #0 is a valid character in a string: but whenever that string needs to be processed as PChar, #0 will be interpreted as a string terminator. This could very well cause the "strange behaviour" you're seeing.
So it's quite probable that you have at least one of the following 2 bugs in your code:
1. You are always processing 1 too many characters, with the extra character being #0.
2. When your input string has an odd number of characters, your algorithm (which relies on pairs of characters) adds an extra character with value #0.
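The truncation effect is not specific to Pascal. As a rough illustration (written in Python, with ctypes standing in for a PChar conversion; that stand-in and the helper name as_c_string are assumptions for demonstration, not the asker's code), a string with an embedded #0 keeps its full length until something reads it as a NUL-terminated C string:

```python
import ctypes


def as_c_string(data: bytes) -> bytes:
    """Return what a C-style (NUL-terminated) reader would see in data."""
    buffer = ctypes.create_string_buffer(data)  # copies data and appends a NUL
    return buffer.value  # .value stops at the first NUL byte


s1 = b"a\x00"   # counterpart of S1 := 'a'#0
s2 = b"bc"      # counterpart of S2 := 'bc'
s3 = s1 + s2    # counterpart of S3 := S1 + S2

print(len(s3))          # all four bytes are still stored
print(as_c_string(s3))  # only the part before the NUL survives
```

If the block is placed in front of the result, any #0 inside the block hides everything after it as soon as the string is displayed through a C-style API.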
Edit
With the additional source code, my hunch is confirmed:
Suppose you have a 5-character string and a key that produces block length 2.
Your inner loop (for posB := 1 to bl do) will read beyond the length of inStr on the last iteration of the outer loop.
So if the next character in memory happens to be #0, you will be doing exactly as described above.
Additional problem. You have the following code:
//calculating block length 2..6
bl := calcBl(keyStr);
Your assumption in the comment is wrong. From the implementation of calcBl, if keyStr is empty, your result will be 0.
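Both problems can be avoided by never indexing past the end of the input and by rejecting a zero block length. The following is only a sketch in Python, not a port of the asker's full program: calc_block_length mirrors calcBl from the question, while rotate_blocks and its error handling are hypothetical names and choices made for this illustration.

```python
def calc_block_length(key: str) -> int:
    """Mirror of calcBl from the question: 0 for an empty key, otherwise 2..6."""
    result = 0
    for ch in key:
        result = (result + ord(ch)) % 5 + 2
    return result


def rotate_blocks(text: str, block_len: int) -> str:
    """Reverse the order of fixed-size blocks without reading past the end."""
    if block_len < 1:
        # An empty key yields block length 0, which would never advance.
        raise ValueError("block length must be at least 1")
    # Slicing clamps at the end of the string, so the last block may be shorter.
    blocks = [text[i:i + block_len] for i in range(0, len(text), block_len)]
    return "".join(reversed(blocks))


print(rotate_blocks("Hello ", 2))  # prints "o llHe"
print(rotate_blocks("Hello", 2))   # prints "ollHe"; the short block "o" comes first
```

Because the last block is allowed to be shorter than the others, no padding character (and therefore no stray #0) ever enters the result.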

Related

Unexpected behaviour reusing a TMemoryStream in Delphi

I am trying to read two strings of varying length from a TMemoryStream, but both strings end up being the same length. So if, for example, the first string is 'abcdefghijkl' and the second one is 'wxyz', the value I get for the second string is 'wxyzefghijkl': the first four characters are my new string ('wxyz'), followed by the remaining characters of the first string that have not been replaced by 'wxyz'.
My code is:
var
L : LongInt;
S : string;
...
msRecInfo.Position := 0;
msRecInfo.Read(L, SizeOf(L)); // read size of following string ...
SubStream.Clear;
SubStream.CopyFrom(msRecInfo, L); // copy next block of data to a second TMemoryStream
if (L > 0) then S := StreamToString(SubStream); //convert the stream into a string
msRecInfo.Read(L, SizeOf(L)); // get size of following string ...
SubStream.CopyFrom(msRecInfo, L);
if (L > 0) then S := StreamToString(SubStream);
I have been battling with this for hours without success. Can anyone point out what I am doing wrong?
You are not calling SubStream.Clear() before the 2nd call to SubStream.CopyFrom(). So, the 1st call to StreamToString(SubStream) leaves SubStream.Position at the end of the stream, then the subsequent SubStream.CopyFrom() adds more data to the stream, preserving the existing data. Then the subsequent StreamToString(SubStream) reads all of the data from SubStream.
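The effect of reusing a buffer without clearing it can be reproduced outside Delphi. The sketch below uses Python's io.BytesIO as a stand-in for TMemoryStream; second_read and its two flags are hypothetical, and whether StreamToString leaves the position at the start or at the end is an assumption, so both cases are shown. In either case the old bytes survive unless the buffer is cleared first.

```python
import io


def second_read(rewind: bool, clear_first: bool) -> bytes:
    """Write a second payload into a buffer that already holds a first one."""
    sub = io.BytesIO()
    sub.write(b"abcdefghijkl")      # first string copied into the sub-stream
    if rewind:
        sub.seek(0)                 # a helper may have left the position at 0
    if clear_first:
        sub.seek(0)
        sub.truncate()              # counterpart of SubStream.Clear
    sub.write(b"wxyz")              # second string copied into the sub-stream
    return sub.getvalue()


print(second_read(rewind=True, clear_first=False))   # b'wxyzefghijkl'
print(second_read(rewind=False, clear_first=False))  # b'abcdefghijklwxyz'
print(second_read(rewind=True, clear_first=True))    # b'wxyz'
```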
Also, be aware that if L is 0 when you pass it to SubStream.CopyFrom(), it will copy the entire msRecInfo stream. This is documented behavior:
https://docwiki.embarcadero.com/Libraries/en/System.Classes.TStream.CopyFrom
If Count is 0, CopyFrom sets Source position to 0 before reading and then copies the entire contents of Source into the stream. If Count is greater than or less than 0, CopyFrom reads from the current position in Source.
So, you need to move up your L > 0 check, eg:
msRecInfo.Read(L, SizeOf(L));
if (L > 0) then
begin
SubStream.Clear;
SubStream.CopyFrom(msRecInfo, L);
S := StreamToString(SubStream);
end
else
S := '';
I would suggest wrapping this logic into a reusable function, eg:
var
L : LongInt;
S : string;
function ReadString: string;
begin
msRecInfo.Read(L, SizeOf(L)); // read size of following string ...
if (L > 0) then
begin
SubStream.Clear;
SubStream.CopyFrom(msRecInfo, L); // copy next block of data to a second TMemoryStream
Result := StreamToString(SubStream); //convert the stream into a string
end else
Result := '';
end;
begin
...
msRecInfo.Position := 0;
S := ReadString;
S := ReadString;
...
Although, if feasible, I would suggest getting rid of SubStream altogether and updating StreamToString() to take L as an input parameter, so that you can read the string from msRecInfo directly, eg:
msRecInfo.Read(L, SizeOf(L));
S := StreamToString(msRecInfo, L);
No need for a 2nd TMemoryStream if you can avoid it.
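The same idea of reading a length prefix and then exactly that many bytes from the one stream can be sketched as follows. This is Python, not Delphi, and the 4-byte little-endian length and UTF-8 payload are assumptions chosen for the illustration; the question does not say how the strings were written.

```python
import io
import struct


def write_string(stream, text: str) -> None:
    """Write a 4-byte little-endian length followed by the encoded text."""
    data = text.encode("utf-8")
    stream.write(struct.pack("<i", len(data)))
    stream.write(data)


def read_string(stream) -> str:
    """Read one length-prefixed string directly from the stream."""
    (length,) = struct.unpack("<i", stream.read(4))
    if length <= 0:
        return ""  # nothing to copy, so never ask for "0 bytes"
    return stream.read(length).decode("utf-8")


stream = io.BytesIO()
write_string(stream, "abcdefghijkl")
write_string(stream, "wxyz")
stream.seek(0)
print(read_string(stream))  # abcdefghijkl
print(read_string(stream))  # wxyz
```

Since no intermediate buffer exists, there is nothing to forget to clear between reads.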

How can I get a char of a String in delphi 7?

This is something that should be easy, but I just can't get it to work.
I come from Java, so maybe I have an error in my thinking here.
What I want to do is this: I have a string with two letters, like 't4' or 'pq'.
Now I just want to get each of the characters in the string as its own string.
So I do:
firstString := myString[0];
but I can't even get this to compile.
So I figured that they start counting from 1 and put 1 as the index.
Now I do this in a while loop, and the first time I go through it, it works fine. The second time, the results are just empty or wrong.
What am I missing here?
(I also tried Copy, but that doesn't work either!)
while i < 10 do
begin
te := 'te';
a := te[1];
b := te[2];
i := i +1;
end;
In the first iteration a is 't' and b is 'e', as I would expect. The second time a is '' and b is 't', which I don't understand!
Strings are 1-based, not zero-based. Try the following, after adding StrUtils to your Uses list (for DupeString):
var
MyString : String;
begin
MyString := '12345';
Caption := StringOfChar(MyString[1], 8) + ':' + DupeString(Copy(MyString, 3, 2), 4);
You could split it up to make it easier to follow, of course:
var
MyString,
S1,
S2,
S3: String;
begin
MyString := '12345';
S1 := StringOfChar(MyString[1], 8);
S2 := Copy(MyString, 3, 2);
S3 := DupeString(S2, 4);
Caption := S1 + ':' + S3;
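For readers coming from a zero-based language, the index arithmetic in that snippet can be traced with a small sketch. It is written in Python rather than Delphi; char_at and copy_1based are hypothetical helpers that only imitate 1-based indexing and Copy for valid arguments, not the real Delphi routines.

```python
def char_at(text: str, index: int) -> str:
    """Return the character at a 1-based index, as text[index] does in Delphi."""
    if not 1 <= index <= len(text):
        raise IndexError("1-based index out of range")
    return text[index - 1]


def copy_1based(text: str, index: int, count: int) -> str:
    """Return up to count characters starting at a 1-based index."""
    if index < 1:
        raise ValueError("index must be at least 1 in this sketch")
    return text[index - 1:index - 1 + count]


my_string = "12345"
s1 = char_at(my_string, 1) * 8       # like StringOfChar(MyString[1], 8)
s2 = copy_1based(my_string, 3, 2)    # like Copy(MyString, 3, 2)
s3 = s2 * 4                          # like DupeString(S2, 4)
print(s1 + ":" + s3)                 # 11111111:34343434
```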

Storing string references

Problem
There are multiple ways to store a string reference, so how would you do it in the example code? Currently the problem is that storing an access to a string causes the error "non-local pointer cannot point to local object". Is storing 'First and 'Last to reference a string a preferable way?
String reference storage
This record stores a reference to a string. First and Last are supposed to point into a string. Name should be able to do the same, I think, but that causes "non-local pointer cannot point to local object" when a local string is assigned to it. So the current workaround is to use First and Last.
type Segment is record
First : Positive;
Last : Positive;
Length : Natural := 0;
Name : access String;
end record;
Assigning sub string reference
The commented-out line causes "non-local pointer cannot point to local object". This is because Item is local. Source is not local, and that is the string I want substring references from.
procedure Find (Source : aliased String; Separator : Character; Last : out Natural; Item_Array : out Segment_Array) is
P : Positive := Source'First;
begin
for I in Item_Array'Range loop
declare
Item : aliased String := Separated_String_Next (Source, Separator, P);
begin
exit when Item'Length = 0;
Item_Array (I).Length := Item'Length;
Item_Array (I).First := Item'First;
Item_Array (I).Last := Item'Last;
--Item_Array (I).Name := Item'Access;
Last := I;
end;
end loop;
end;
Example
with Ada.Text_IO;
with Ada.Integer_Text_IO;
procedure Main is
use Ada.Text_IO;
use Ada.Integer_Text_IO;
function Separated_String_Next (Source : String; Separator : Character; P : in out Positive) return String is
A : Positive := P;
B : Positive;
begin
while A <= Source'Last and then Source(A) = Separator loop
A := A + 1;
end loop;
P := A;
while P <= Source'Last and then Source(P) /= Separator loop
P := P + 1;
end loop;
B := P - 1;
while P <= Source'Last and then Source(P) = Separator loop
P := P + 1;
end loop;
return Source (A .. B);
end;
type Segment is record
First : Positive;
Last : Positive;
Length : Natural := 0;
Name : access String;
end record;
type Segment_Array is array (Integer range <>) of Segment;
procedure Find (Source : String; Separator : Character; Last : out Natural; Item_Array : out Segment_Array) is
P : Positive := Source'First;
begin
for I in Item_Array'Range loop
declare
Item : aliased String := Separated_String_Next (Source, Separator, P);
begin
exit when Item'Length = 0;
Item_Array (I).Length := Item'Length;
Item_Array (I).First := Item'First;
Item_Array (I).Last := Item'Last;
--Item_Array (I).Name := Item'Access;
Last := I;
end;
end loop;
end;
Source : String := ",,Item1,,,Item2,,Item3,,,,,,";
Item_Array : Segment_Array (1 .. 100);
Last : Natural;
begin
Find (Source, ',', Last, Item_Array);
Put_Line (Source);
Put_Line ("Index First Last Name");
for I in Item_Array (Item_Array'First .. Last)'Range loop
Put (I, 5);
Put (Item_Array (I).First, 6);
Put (Item_Array (I).Last, 5);
Put (" ");
Put (Source (Item_Array (I).First .. Item_Array (I).Last));
New_Line;
end loop;
end;
Output
,,Item1,,,Item2,,Item3,,,,,,
Index First Last Name
1 3 7 Item1
2 11 15 Item2
3 18 22 Item3
The error message tells you exactly what is wrong : Item is a string declared locally, i.e. on the stack, and you are assigning its address to an access type (pointer). I hope I don't need to explain why that won't work.
The immediate answer, which isn't wrong but isn't best practice either, is to allocate space for a new string (in a storage pool or on the heap), which is done with new.
Item : access String := new String'(Separated_String_Next (Source, Separator, P));
...
Item_Array (I).Name := Item;
Note that some of the other record members, Length at least, appear to be completely redundant, since Length is merely a copy of the eponymous attribute, so they should probably be eliminated (unless there's a part of the picture I can't see).
There are better answers. Sometimes you need to use access types and handle their object lifetimes and all the ways they can go wrong. But more often their appearance is a hint that something in the design can be improved, for example:
Unbounded_String may manage your strings more simply.
You could use the length as a discriminant on the Segment record, and store the actual string (not an access) in the record itself.
Ada.Containers is a standard library of containers that abstracts over handling the storage yourself (much as the STL is used in C++).
If you DO decide you need access types, it's better to use a named access type, type Str_Access is access String; then you can create a storage pool specific to Str_Access types and release the entire pool in one operation, to simplify object lifetime management and eliminate memory leaks.
Note the above essentially "deep copies" the slices of the Source string. If there is a specific need to "shallow copy" it - i.e. refer to the specific substrings in place - AND you can guarantee its object lifetime, this answer is not what you want. If so, please clarify the intent of the question.
For a "shallow copy" the approach in the question essentially fails because Item is already a deep copy ... on the stack.
The closest approach I can see is to make the source string aliased (you MUST, as you want each Segment to refer to it) and pass its access to the Find procedure.
Then each Segment becomes a tuple of First, Last, (redundant Length) and access to the entire string (rather than a substring).
procedure Find (Source : access String; Separator : Character;
Last : out Natural; Item_Array : out Segment_Array) is
P : Positive := Source'First;
begin
for I in Item_Array'Range loop
declare
Item : String := Separated_String_Next (Source.all, Separator, P);
begin
exit when Item'Length = 0;
...
Item_Array (I).Name := Source;
Last := I;
end;
end loop;
end;
Source : aliased String := ",,Item1,,,Item2,,Item3,,,,,,";
...
Find (Source'access, ',', Last, Item_Array);
for I in Item_Array (Item_Array'First .. Last)'Range loop
...
Put (Item_Array (I).Name(Item_Array (I).First .. Item_Array (I).Last));
New_Line;
end loop;
A helper to extract a string from a Segment would probably be useful:
function get(S : Segment) return String is
begin
return S.Name(S.First .. S.Last);
end get;
...
Put (get (Item_Array (I)));
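The shape of this "shallow copy" design, with each segment holding a reference to the whole source plus First and Last, can also be sketched outside Ada. The Python below is an illustration only: Segment, find_segments and text are hypothetical names, the indices are kept 1-based and inclusive to match the output table in the question, and Python's garbage collection sidesteps the lifetime issue that Ada's accessibility rules enforce.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    source: str  # reference to the whole string, not a copy of the slice
    first: int   # 1-based index of the first character
    last: int    # 1-based index of the last character (inclusive)

    def text(self) -> str:
        """Counterpart of the get helper: materialise the slice on demand."""
        return self.source[self.first - 1:self.last]


def find_segments(source: str, separator: str) -> List[Segment]:
    """Record where each non-empty item starts and ends; copy nothing."""
    segments = []
    start = None
    for pos, ch in enumerate(source, start=1):
        if ch != separator:
            if start is None:
                start = pos  # first character of a new item
        elif start is not None:
            segments.append(Segment(source, start, pos - 1))
            start = None
    if start is not None:  # item running up to the end of the string
        segments.append(Segment(source, start, len(source)))
    return segments


for seg in find_segments(",,Item1,,,Item2,,Item3,,,,,,", ","):
    print(seg.first, seg.last, seg.text())
```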
The only rationale I can see for such a design is where the set of strings to be parsed or dissected will barely fit in memory so duplication must be avoided. Perhaps also embedded programming or some such discipline where dynamic (heap) allocation is discouraged or even illegal.
I see no solution involving address arithmetic within a string, since an array is not merely its contents - if you point within it, you lose the attributes. You can make the same criticism of the equivalent C design : you can identify the start of a substring with a pointer, but you can't just stick a null terminator at the end of the substring without breaking the original string.
Given the bigger picture ... what you need, rather than the low level details of how you want to achieve it, there are probably better solutions.

How to detect if a character from a string is upper or lower case?

I'm expanding a class of mine for storing generic size strings to allow more flexible values for user input. For example, my prior version of this class was strict and allowed only the format of 2x3 or 9x12. But now I'm making it so it can support values such as 2 x 3 or 9 X 12 and automatically maintain the original user's formatting if the values get changed.
The real question I'm trying to figure out is just how to detect whether one character from a string is upper or lower case, because I have to handle case sensitivity. If the delimiter is 'x' (lowercase) and the user inputs 'X' (uppercase) inside the value, and case sensitivity is turned off, I need to be able to find the opposite case as well.
I mean, the Pos() function is case sensitive...
Delphi 7 has UpperCase() and LowerCase() functions for strings. There's also UpCase() for characters.
If I want to search for a substring within another string case insensitively, I do this:
if Pos('needle', LowerCase(hayStack)) > 0 then
You simply use lower case string literals (or constants) and apply the lowercase function on the string before the search. If you'll be doing a lot of searches, it makes sense to convert just once into a temp variable.
Here's your case:
a := '2 x 3'; // Lowercase x
b := '9 X 12'; // Upper case X
x := Pos('x', LowerCase(a)); // x = 3
x := Pos('x', LowerCase(b)); // x = 3
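To see if a character is upper or lower, simply compare it against the UpCase version of it (note that non-letters such as digits also equal their UpCase version, so this test is only meaningful for letters):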
a := 'A';
b := 'b';
upper := a = UpCase(a); // True
upper := b = UpCase(b); // False
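The two techniques from this answer, normalising case before searching and classifying a single character, can be summarised in a short sketch. It is written in Python rather than Delphi, and pos_ignore_case and char_case are hypothetical helper names; the 1-based return value is chosen only to mirror Pos.

```python
def pos_ignore_case(needle: str, haystack: str) -> int:
    """Return the 1-based position of needle in haystack, or 0 if absent."""
    # str.find returns -1 when nothing is found, so adding 1 yields 0.
    return haystack.lower().find(needle.lower()) + 1


def char_case(ch: str) -> str:
    """Classify one character as 'upper', 'lower', or 'none' (not a cased letter)."""
    if ch.isupper():
        return "upper"
    if ch.islower():
        return "lower"
    return "none"


print(pos_ignore_case("x", "2 x 3"))   # 3
print(pos_ignore_case("x", "9 X 12"))  # 3
print(char_case("X"), char_case("x"), char_case("1"))  # upper lower none
```

Returning a third value for non-letters avoids treating a digit as upper case just because upper-casing leaves it unchanged.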
Try using these functions (which are part of the Character unit):
Character.TCharacter.IsUpper
Character.TCharacter.IsLower
IsLower
IsUpper
UPDATE
For ANSI versions of Delphi you can use the GetStringTypeEx function to fill a list with the type information of each ANSI character, and then compare each element against the $0001 (upper case) or $0002 (lower case) values.
uses
Windows,
SysUtils;
Var
LAnsiChars: array [AnsiChar] of Word;
procedure FillCharList;
var
lpSrcStr: AnsiChar;
lpCharType: Word;
begin
for lpSrcStr := Low(AnsiChar) to High(AnsiChar) do
begin
lpCharType := 0;
GetStringTypeExA(LOCALE_USER_DEFAULT, CT_CTYPE1, @lpSrcStr, SizeOf(lpSrcStr), lpCharType);
LAnsiChars[lpSrcStr] := lpCharType;
end;
end;
function CharIsLower(const C: AnsiChar): Boolean;
const
C1_LOWER = $0002;
begin
Result := (LAnsiChars[C] and C1_LOWER) <> 0;
end;
function CharIsUpper(const C: AnsiChar): Boolean;
const
C1_UPPER = $0001;
begin
Result := (LAnsiChars[C] and C1_UPPER) <> 0;
end;
begin
try
FillCharList;
Writeln(CharIsUpper('a'));
Writeln(CharIsUpper('A'));
Writeln(CharIsLower('a'));
Writeln(CharIsLower('A'));
except
on E:Exception do
Writeln(E.Classname, ': ', E.Message);
end;
Readln;
end.
if myChar in ['A'..'Z'] then
begin
// uppercase
end
else
if myChar in ['a'..'z'] then
begin
// lowercase
end
else
begin
// not an alpha char
end;
..or D2009 on..
if charInSet(myChar,['A'..'Z']) then
begin
// uppercase
end
else
if charInSet(myChar,['a'..'z']) then
begin
// lowercase
end
else
begin
// not an alpha char
end;
The JCL has routines for this in the JclStrings unit, e.g. CharIsUpper and CharIsLower. These should work in Delphi 7.
AnsiPos(), like Pos(), is case-sensitive, so force upper or lower case on both strings before searching, irrespective of what the user enters, using UpperCase() and LowerCase().
Just throwing this out there since you may find it far more simple than the other (very good) answers.

How to find a position of a substring within a string with fuzzy match

I have come across a problem of matching a string in an OCR recognized text and find the position of it considering there can be arbitrary tolerance of wrong, missing or extra characters. The result should be a best match position, possibly (not necessarily) with length of matching substring.
For example:
String: 9912, 1.What is your name?
Substring: 1. What is your name?
Tolerance: 1
Result: match on character 7
String: Where is our caat if any?
Substring: your cat
Tolerance: 2
Result: match on character 10
String: Tolerance is t0o h1gh.
Substring: Tolerance is too high;
Tolerance: 1
Result: no match
I have tried to adapt the Levenshtein algorithm, but it doesn't work properly for substrings and doesn't return a position.
An algorithm in Delphi would be preferred, yet any implementation or pseudo logic would do.
Here's a recursive implementation that works, but might not be fast enough. The worst-case scenario is when a match can't be found and all but the last char in "What" gets matched at every index in "Where". In that case the algorithm will make Length(What)-1 + Tolerance comparisons for each char in Where, plus one recursive call per Tolerance. Since both Tolerance and the length of What are constants, I'd say the algorithm is O(n). Its performance will degrade linearly with the length of both "What" and "Where".
function BrouteFindFirst(What, Where:string; Tolerance:Integer; out AtIndex, OfLength:Integer):Boolean;
var i:Integer;
aLen:Integer;
WhatLen, WhereLen:Integer;
function BrouteCompare(wherePos, whatPos, Tolerance:Integer; out Len:Integer):Boolean;
var aLen:Integer;
aRecursiveLen:Integer;
begin
// Skip perfect match characters
aLen := 0;
while (whatPos <= WhatLen) and (wherePos <= WhereLen) and (What[whatPos] = Where[wherePos]) do
begin
Inc(aLen);
Inc(wherePos);
Inc(whatPos);
end;
// Did we find a match?
if (whatPos > WhatLen) then
begin
Result := True;
Len := aLen;
end
else if Tolerance = 0 then
Result := False // No match and no more "wild cards"
else
begin
// We'll make an recursive call to BrouteCompare, allowing for some tolerance in the string
// matching algorithm.
Dec(Tolerance); // use up one "wildcard"
Inc(whatPos); // consider the current char matched
if BrouteCompare(wherePos, whatPos, Tolerance, aRecursiveLen) then
begin
Len := aLen + aRecursiveLen;
Result := True;
end
else if BrouteCompare(wherePos + 1, whatPos, Tolerance, aRecursiveLen) then
begin
Len := aLen + aRecursiveLen;
Result := True;
end
else
Result := False; // no luck!
end;
end;
begin
WhatLen := Length(What);
WhereLen := Length(Where);
for i:=1 to Length(Where) do
begin
if BrouteCompare(i, 1, Tolerance, aLen) then
begin
AtIndex := i;
OfLength := aLen;
Result := True;
Exit;
end;
end;
// No match found!
Result := False;
end;
I've used the following code to test the function:
procedure TForm18.Button1Click(Sender: TObject);
var AtIndex, OfLength:Integer;
begin
if BrouteFindFirst(Edit2.Text, Edit1.Text, ComboBox1.ItemIndex, AtIndex, OfLength) then
Label3.Caption := 'Found #' + IntToStr(AtIndex) + ', of length ' + IntToStr(OfLength)
else
Label3.Caption := 'Not found';
end;
For case:
String: Where is our caat if any?
Substring: your cat
Tolerance: 2
Result: match on character 10
it shows a match on character 9, of length 6. For the other two examples it gives the expected result.
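For comparison, here is a different technique from the recursive search above: a dynamic-programming variant of edit distance in which a match may start anywhere in the text (often attributed to Sellers). It is a sketch in Python, not a translation of BrouteFindFirst; fuzzy_find and its return tuple are hypothetical, and when several substrings are equally close it simply keeps the first one it encounters, which may differ from the position a human would pick.

```python
def fuzzy_find(pattern: str, text: str, tolerance: int):
    """Return (1-based position, length, distance) of an approximate match, or None."""
    m = len(pattern)
    cost = list(range(m + 1))   # cost[i]: edits to match pattern[:i] so far
    start = [0] * (m + 1)       # start[i]: 0-based offset where that match begins
    best = None
    for j, ch in enumerate(text, start=1):
        new_cost = [0] * (m + 1)    # an empty pattern matches anywhere for free
        new_start = [j] * (m + 1)   # ... and such a match begins right after ch
        for i in range(1, m + 1):
            substitute = cost[i - 1] + (pattern[i - 1] != ch)
            extra_in_text = cost[i] + 1
            missing_in_text = new_cost[i - 1] + 1
            c = min(substitute, extra_in_text, missing_in_text)
            new_cost[i] = c
            # Carry along the start offset of whichever alignment was cheapest.
            if c == substitute:
                new_start[i] = start[i - 1]
            elif c == extra_in_text:
                new_start[i] = start[i]
            else:
                new_start[i] = new_start[i - 1]
        cost, start = new_cost, new_start
        if cost[m] <= tolerance and (best is None or cost[m] < best[0]):
            best = (cost[m], start[m], j)
    if best is None:
        return None
    distance, begin, end = best
    return begin + 1, end - begin, distance


print(fuzzy_find("1. What is your name?", "9912, 1.What is your name?", 1))  # (7, 20, 1)
print(fuzzy_find("Tolerance is too high;", "Tolerance is t0o h1gh.", 1))     # None
```

The running time is proportional to Length(What) times Length(Where) regardless of the tolerance, at the cost of two arrays the size of the pattern.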
Here is a complete sample of fuzzy match (approximate search), and you can use/change the algorithm as you wish!
https://github.com/alidehban/FuzzyMatch
