I want to count how many times a String occurs in another String in Pascal Script like shown in the below example.
I've seen the answer to Delphi: count number of times a string occurs in another string, but there is no PosEx function in Pascal Script.
MyString := 'Hello World!, Hello World!, Hello World!, Hello World!';
If I count the number of times Hello or World occurs here, the result should be 4.
If I count the number of times , (comma) occurs here, the result should be 3.
UPDATE
The following function works, but it copies the given String to a new variable and repeatedly deletes parts of that copy, so it runs slowly.
function OccurrencesOfSubString(S, SubStr: String): Integer;
var
DSStr: String;
begin
if Pos(SubStr, S) = 0 then
Exit
else
DSStr := S;
Repeat
if Pos(SubStr, S) <> 0 then
Inc(Result);
Delete(DSStr, Pos(SubStr, DSStr), Length(Copy(DSStr, Pos(SubStr, DSStr), Length(SubStr))));
Until Pos(SubStr, DSStr) = 0;
end;
Your implementation is generally correct.
There are some optimizations to be made and useless code to be removed:
The second test for if Pos(SubStr, S) <> 0 (within repeat) is pointless. It's true always. You are testing S, which was tested at the function start already. And the DSStr is already tested in the until.
You should save Pos(SubStr, DSStr) to a variable not to call it multiple times.
Length(Copy(DSStr, Pos(SubStr, DSStr), Length(SubStr))) is actually the same as Length(SubStr).
No need to copy the S to DSStr. You can work directly with the S. It's by-value parameter, so you do not modify the variable that you use to call the function.
Replace the initial Pos(SubStr, S) = 0 check with the same check in the loop to save one Pos call.
Optimized version of your code:
function OccurrencesOfSubString(S, SubStr: String): Integer;
var
P: Integer;
begin
Result := 0;
repeat
P := Pos(SubStr, S);
if P > 0 then
begin
Inc(Result);
Delete(S, P, Length(SubStr));
end;
until P = 0;
end;
But actually with the Inno Setup StringChange function (which Delphi does not have), you do not have to code any algorithm yourself.
function OccurrencesOfSubString(S, SubStr: String): Integer;
begin
Result := StringChange(S, SubStr, '');
end;
This was inspired by @RobertFrank's answer to Delphi: count number of times a string occurs in another string.
While the use of StringChange looks inefficient (as it has significant side effects), it's actually faster, probably because it is implemented in Pascal, not in Pascal Script.
Tested with 3 million calls to:
OccurrencesOfSubString('Hello World!, Hello World!, Hello World!, Hello World!', 'Hello')
With StringChange: 11 seconds
My optimized version of your code: 49 seconds
Your original code: 99 seconds
Though for few calls, all implementations are good enough.
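For comparison, here is a minimal sketch of the scanning approach that PosEx would allow in Delphi, written in Python only for illustration (it is not Pascal Script, and count_occurrences is a hypothetical name). It moves a start offset past each match instead of deleting from the string. Note one difference from the Delete-based loop: because nothing is removed, a match can never form across the place where an earlier occurrence was.

```python
def count_occurrences(text, sub):
    """Count non-overlapping occurrences of sub in text, scanning left to right."""
    if not sub:
        return 0  # assumption: an empty search string counts as zero occurrences
    count = 0
    start = 0
    while True:
        pos = text.find(sub, start)  # -1 when there is no further match
        if pos == -1:
            return count
        count += 1
        start = pos + len(sub)  # continue after the match; nothing is copied or deleted


sample = 'Hello World!, Hello World!, Hello World!, Hello World!'
print(count_occurrences(sample, 'Hello'))  # 4
print(count_occurrences(sample, ','))      # 3
```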
I am an intermediate Delphi programmer who still has a lot to learn, so I hope my question here is not too dumb. I have a file with 1546 strings that I need to place in a StringList and do a custom sort. The strings look like this:
2:X,X,2,2,2,X<A>11
7:5,7,7,6,5,5<A>08
3:3,X,0,0,1,0<C/D>11
5:X,2,4,2,5,2<Asus2/Gb>02
3:0,3,2,0,3,0<C/D>02
4:X,0,4,4,0,0<Asus2/Gb>11
4:X,X,4,4,4,2<B>01
3:3,2,1,0,0,3<B#5>11
I need them to look like this:
2:X,X,2,2,2,X<A>11
7:5,7,7,6,5,5<A>08
5:X,2,4,2,5,2<Asus2/Gb>11
4:X,0,4,4,0,0<Asus2/Gb>02
4:X,X,4,4,4,2<B>01
3:3,2,1,0,0,3<B#5>11
3:3,X,0,0,1,0<C/D>11
3:0,3,2,0,3,0<C/D>02
They need to be sorted by the portion of the string between the <...> and the last 2 chars. Any help would be much appreciated.
OK...done, Works quite well. Sorts a list with over 1500 strings in 62ms. Constructive criticism will be appreciated.
function SortChords(List:TStringList; idx1,idx2:integer): integer;
var
s1,s2:string;
begin
s1:=List[idx1];
s1:=copy(s1,pos('<',s1)+1,pos('>',s1)-pos('<',s1)-1);
s2:=List[idx2];
s2:=copy(s2,pos('<',s2)+1,pos('>',s2)-pos('<',s2)-1);
if s1 < s2 then
result:=-1
else if s1 > s2 then
result:=1
else
result:=0;
end;
You can write your own custom sort procedure and use TStringList.CustomSort to sort in the desired order.
The following demonstrates using the custom sort. It does not produce the exact output you describe, because you're not clear how you determine the precedence of two items that have the same value between the <> (as in lines 1 and 2, or 3 and 4, of your expected output); you can add code to decide the final order where I've indicated in the code comment. The sample is a complete console application that demonstrates sorting the values you've provided. It's slightly verbose in variable declarations for clarity.
program Project1;
{$APPTYPE CONSOLE}
uses
SysUtils, Classes;
function ListSortProc(List: TStringList; Index1, Index2: Integer): Integer;
var
StartPosA, EndPosA: Integer;
StartPosB, EndPosB: Integer;
TestValA, TestValB: string;
Comp: Integer;
begin
StartPosA := Pos('<', List[Index1]) + 1;
EndPosA := Pos('>', List[Index1]);
TestValA := Copy(List[Index1], StartPosA, EndPosA - StartPosA);
StartPosB := Pos('<', List[Index2]) + 1;
EndPosB := Pos('>', List[Index2]);
TestValB := Copy(List[Index2], StartPosB, EndPosB - StartPosB);
Result := CompareStr(TestValA, TestValB);
{ To do further processing for lines with the same value, add
code here.
if Result = 0 then
// Decide on the order of the equal values with whatever
// criteria you want.
}
end;
var
SL: TStringList;
s: String;
begin
SL := TStringList.Create;
try
SL.Add('2:X,X,2,2,2,X<A>11');
SL.Add('7:5,7,7,6,5,5<A>08');
SL.Add('3:3,X,0,0,1,0<C/D>11');
SL.Add('5:X,2,4,2,5,2<Asus2/Gb>02');
SL.Add('3:0,3,2,0,3,0<C/D>02');
SL.Add('4:X,0,4,4,0,0<Asus2/Gb>11');
SL.Add('4:X,X,4,4,4,2<B>01');
SL.Add('3:3,2,1,0,0,3<B#5>11');
SL.CustomSort(ListSortProc);
for s in SL do
WriteLn(s);
ReadLn;
finally
SL.Free;
end;
end.
The code above produces this output:
7:5,7,7,6,5,5<A>08
2:X,X,2,2,2,X<A>11
4:X,0,4,4,0,0<Asus2/Gb>11
5:X,2,4,2,5,2<Asus2/Gb>02
4:X,X,4,4,4,2<B>01
3:3,2,1,0,0,3<B#5>11
3:0,3,2,0,3,0<C/D>02
3:3,X,0,0,1,0<C/D>11
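The tie-breaking step left open in the code comment can be sketched as a two-part sort key. This is a Python illustration, not Delphi, and it assumes that ties are broken by the last two characters in ascending order; the question does not actually state the intended direction. chord_key is a hypothetical name.

```python
def chord_key(line):
    """Return (text between '<' and '>', last two characters) as the sort key."""
    start = line.index('<') + 1
    end = line.index('>', start)
    # Assumption: equal chord names are ordered by the trailing two characters.
    return (line[start:end], line[-2:])


lines = [
    '2:X,X,2,2,2,X<A>11',
    '7:5,7,7,6,5,5<A>08',
    '3:3,X,0,0,1,0<C/D>11',
    '5:X,2,4,2,5,2<Asus2/Gb>02',
    '3:0,3,2,0,3,0<C/D>02',
    '4:X,0,4,4,0,0<Asus2/Gb>11',
    '4:X,X,4,4,4,2<B>01',
    '3:3,2,1,0,0,3<B#5>11',
]

for line in sorted(lines, key=chord_key):
    print(line)
```

In the Delphi comparer, the same idea would mean comparing the trailing characters only when CompareStr returns 0.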
Problem
There are multiple ways to store a reference to a string, so how would you do it in the example code? Currently the problem is that storing an access to a string causes the error "non-local pointer cannot point to local object". Is storing 'First and 'Last to reference a string a preferable way?
String reference storage
This record stores a reference to a string. First and Last are supposed to point into a string. Name should be able to do the same, I think, but that causes "non-local pointer cannot point to local object" when a local string is assigned to it. So the current workaround is to use First and Last.
type Segment is record
First : Positive;
Last : Positive;
Length : Natural := 0;
Name : access String;
end record;
Assigning sub string reference
The commented line is causing non-local pointer cannot point to local object. This is because Item is local. Source is not local and that is the string I want sub string references from.
procedure Find (Source : aliased String; Separator : Character; Last : out Natural; Item_Array : out Segment_Array) is
P : Positive := Source'First;
begin
for I in Item_Array'Range loop
declare
Item : aliased String := Separated_String_Next (Source, Separator, P);
begin
exit when Item'Length = 0;
Item_Array (I).Length := Item'Length;
Item_Array (I).First := Item'First;
Item_Array (I).Last := Item'Last;
--Item_Array (I).Name := Item'Access;
Last := I;
end;
end loop;
end;
Example
with Ada.Text_IO;
with Ada.Integer_Text_IO;
procedure Main is
use Ada.Text_IO;
use Ada.Integer_Text_IO;
function Separated_String_Next (Source : String; Separator : Character; P : in out Positive) return String is
A : Positive := P;
B : Positive;
begin
while A <= Source'Last and then Source(A) = Separator loop
A := A + 1;
end loop;
P := A;
while P <= Source'Last and then Source(P) /= Separator loop
P := P + 1;
end loop;
B := P - 1;
while P <= Source'Last and then Source(P) = Separator loop
P := P + 1;
end loop;
return Source (A .. B);
end;
type Segment is record
First : Positive;
Last : Positive;
Length : Natural := 0;
Name : access String;
end record;
type Segment_Array is array (Integer range <>) of Segment;
procedure Find (Source : String; Separator : Character; Last : out Natural; Item_Array : out Segment_Array) is
P : Positive := Source'First;
begin
for I in Item_Array'Range loop
declare
Item : aliased String := Separated_String_Next (Source, Separator, P);
begin
exit when Item'Length = 0;
Item_Array (I).Length := Item'Length;
Item_Array (I).First := Item'First;
Item_Array (I).Last := Item'Last;
--Item_Array (I).Name := Item'Access;
Last := I;
end;
end loop;
end;
Source : String := ",,Item1,,,Item2,,Item3,,,,,,";
Item_Array : Segment_Array (1 .. 100);
Last : Natural;
begin
Find (Source, ',', Last, Item_Array);
Put_Line (Source);
Put_Line ("Index First Last Name");
for I in Item_Array (Item_Array'First .. Last)'Range loop
Put (I, 5);
Put (Item_Array (I).First, 6);
Put (Item_Array (I).Last, 5);
Put (" ");
Put (Source (Item_Array (I).First .. Item_Array (I).Last));
New_Line;
end loop;
end;
Output
,,Item1,,,Item2,,Item3,,,,,,
Index First Last Name
1 3 7 Item1
2 11 15 Item2
3 18 22 Item3
The error message tells you exactly what is wrong : Item is a string declared locally, i.e. on the stack, and you are assigning its address to an access type (pointer). I hope I don't need to explain why that won't work.
The immediate answer - which isn't wrong but isn't best practice either, is to allocate space for a new string - in a storage pool or on the heap - which is done with new.
Item : access String := new String'(Separated_String_Next (Source, Separator, P));
...
Item_Array (I).Name := Item;
Note that some of the other record members (Length, at least) appear to be completely redundant, since they merely copy the eponymous attributes, so they should probably be eliminated (unless there's a part of the picture I can't see).
There are better answers. Sometimes you need to use access types, and handle their object lifetimes and all the ways they can go wrong. But more often their appearance is a hint that something in the design can be improved, for example:
Unbounded_String may manage your strings more simply.
You could use the length as a discriminant on the Segment record, and store the actual string (not an access) in the record itself.
Ada.Containers is a standard library of containers that abstracts over handling the storage yourself (much as the STL is used in C++).
If you DO decide you need access types, it's better to use a named access type, e.g. type Str_Access is access String; then you can create a storage pool specific to Str_Access types, and release the entire pool in one operation, to simplify object lifetime management and eliminate memory leaks.
Note the above essentially "deep copies" the slices of the Source string. If there is a specific need to "shallow copy" it - i.e. refer to the specific substrings in place - AND you can guarantee its object lifetime, this answer is not what you want. If so, please clarify the intent of the question.
For a "shallow copy" the approach in the question essentially fails because Item is already a deep copy ... on the stack.
The closest approach I can see is to make the source string aliased (which you MUST do, as you want each Segment to refer to it) and pass its access to the Find procedure.
Then each Segment becomes a tuple of First, Last, (redundant Length) and access to the entire string (rather than a substring).
procedure Find (Source : access String; Separator : Character;
Last : out Natural; Item_Array : out Segment_Array) is
P : Positive := Source'First;
begin
for I in Item_Array'Range loop
declare
Item : String := Separated_String_Next (Source.all, Separator, P);
begin
exit when Item'Length = 0;
...
Item_Array (I).Name := Source;
Last := I;
end;
end loop;
end;
Source : aliased String := ",,Item1,,,Item2,,Item3,,,,,,";
...
Find (Source'access, ',', Last, Item_Array);
for I in Item_Array (Item_Array'First .. Last)'Range loop
...
Put (Item_Array (I).Name(Item_Array (I).First .. Item_Array (I).Last));
New_Line;
end loop;
A helper to extract a string from a Segment would probably be useful:
function get(S : Segment) return String is
begin
return S.Name(S.First .. S.Last);
end get;
...
Put (get(Item_Array (I)));
The only rationale I can see for such a design is where the set of strings to be parsed or dissected will barely fit in memory so duplication must be avoided. Perhaps also embedded programming or some such discipline where dynamic (heap) allocation is discouraged or even illegal.
I see no solution involving address arithmetic within a string, since an array is not merely its contents - if you point within it, you lose the attributes. You can make the same criticism of the equivalent C design : you can identify the start of a substring with a pointer, but you can't just stick a null terminator at the end of the substring without breaking the original string.
Given the bigger picture ... what you need, rather than the low level details of how you want to achieve it, there are probably better solutions.
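The "shallow" design described above (each Segment holds First, Last and a reference to the whole source, with a helper to extract the text) can be sketched as follows. This is Python rather than Ada, purely to illustrate the data layout; Segment, find and text are hypothetical names, and the indices here are 0-based, unlike the 1-based Ada output shown earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    source: str  # reference to the whole source string, not a copy of the slice
    first: int   # 0-based index of the first character of the item
    last: int    # 0-based index of the last character of the item (inclusive)

    def text(self):
        """Materialise the substring only when it is actually needed."""
        return self.source[self.first:self.last + 1]


def find(source, separator):
    """Return one Segment per non-empty item between separators."""
    segments = []
    start = None  # index where the current item began, or None between items
    for i, ch in enumerate(source):
        if ch != separator:
            if start is None:
                start = i
        elif start is not None:
            segments.append(Segment(source, start, i - 1))
            start = None
    if start is not None:  # item running up to the end of the string
        segments.append(Segment(source, start, len(source) - 1))
    return segments


for seg in find(',,Item1,,,Item2,,Item3,,,,,,', ','):
    print(seg.first + 1, seg.last + 1, seg.text())  # 1-based, as in the Ada output
```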
The program has several "encryption" algorithms. This one should blockwise reverse the input. "He|ll|o " becomes "o |ll|He" (block length of 2).
I add two strings, in this case appending the result string to the current "block" string and making that the result. When I add the result first and then the block, it works fine and gives me back the original string. But when I try to reverse the order, it just gives me the last "block".
Several other functions that are used for "rotation" are above.
//amount of blocks
function amBl(i1:integer;i2:integer):integer;
begin
if (i1 mod i2) <> 0 then result := (i1 div i2) else result := (i1 div i2) - 1;
end;
//calculation of block length
function calcBl(keyStr:string):integer;
var i:integer;
begin
result := 0;
for i := 1 to Length(keyStr) do
begin
result := (result + ord(keyStr[i])) mod 5;
result := result + 2;
end;
end;
//desperate try to add strings
function append(s1,s2:string):string;
begin
insert(s2,s1,Length(s1)+1);
result := s1;
end;
function rotation(inStr,keyStr:string):string;
var //array of chars -> string
block,temp:string;
//position in block variable
posB:integer;
//block length and block count variable
bl, bc:integer;
//null character as placeholder
n : ansiChar;
begin
//calculating block length 2..6
bl := calcBl(keyStr);
setLength(block,bl);
result := '';
temp := '';
{n := #00;}
for bc := 0 to amBl(Length(inStr),bl) do
begin
//filling block with chars starting from back of virtual block (in inStr)
for posB := 1 to bl do
begin
block[posB] := inStr[bc * bl + posB];
{if inStr[bc * bl + posB] = ' ' then block[posB] := n;}
end;
//adding the block in front of the existing result string
temp := result;
result := block + temp;
//result := append(block,temp);
//result := concat(block,temp);
end;
end;
(full code http://pastebin.com/6Uarerhk)
After all the loops "result" has the right value, but in the last step (between "result := block + temp" and the "end;" of the function) "block" replaces the content of "result" with itself completely, it doesn't add result at the end anymore.
And as you can see, I even used a temp variable to try to work around that. It doesn't change anything though.
I am 99.99% certain that your problem is due to a subtle bug in your code. However, your deliberate efforts to hide the relevant code mean that we're really shooting in the dark. You haven't even been clear about where you're seeing the shortened Result: in a GUI control, in the debugger, or via Writeln?
The irony is that you have all the information at your fingertips to provide a small concise demonstration of your problem - including sample input and expected output.
So without the relevant information, I can only guess; I do think I have a good hunch though.
Try the following code and see if you have a similar experience with S3:
S1 := 'a'#0;
S2 := 'bc';
S3 := S1 + S2;
The reason for my hunch is that #0 is a valid character in a string: but whenever that string needs to be processed as PChar, #0 will be interpreted as a string terminator. This could very well cause the "strange behaviour" you're seeing.
So it's quite probable that you have at least one of the following 2 bugs in your code:
You are always processing 1 too many characters; with the extra character being #0.
When your input string has an odd number of characters: your algorithm (which relies on pairs of characters) adds an extra character with value #0.
Edit
With the additional source code, my hunch is confirmed:
Suppose you have a 5 character string, and key that produces block length 2.
Your inner loop (for posB := 1 to bl do) will read beyond the length of inStr on the last iteration of the outer loop.
So if the next character in memory happens to be #0, you will be doing exactly as described above.
Additional problem. You have the following code:
//calculating block length 2..6
bl := calcBl(keyStr);
Your assumption in the comment is wrong. From the implementation of calcBl, if keyStr is empty, your result will be 0.
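To illustrate the out-of-range read, here is a minimal sketch of a blockwise reversal whose last block is clamped to the end of the input, so no character beyond the string is ever copied. It is written in Python for brevity rather than Delphi; reverse_blocks is a hypothetical name, and rejecting a block length below 1 is an assumption about how the empty-key case should be handled.

```python
def reverse_blocks(text, block_len):
    """Reverse the order of fixed-size blocks; the last block may be shorter."""
    if block_len < 1:
        # Assumption: a zero block length (e.g. from an empty key) is an error.
        raise ValueError('block_len must be at least 1')
    # Slicing stops at the end of the string, so a short final block is kept
    # as-is instead of being padded with whatever follows in memory.
    blocks = [text[i:i + block_len] for i in range(0, len(text), block_len)]
    return ''.join(reversed(blocks))


print(repr(reverse_blocks('Hello ', 2)))  # 'o llHe'
print(repr(reverse_blocks('Hello', 2)))   # 'ollHe'
```

In the Delphi code, the equivalent fix would be to limit the inner loop to the characters that actually remain in inStr for the last block.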
I have defined
subtype String10 is String(1..10);
and I am attempting to get keyboard input into it without having to manually enter whitespace before hitting enter. I tried get_line(), but for some reason it wouldn't actually wait for input before outputting the next put() command, and I also think it will just leave whatever was in the string before and not fill it with whitespace.
I know about and have used Bounded_String and Unbounded_String, but I am wondering if there is a way to make this work.
I've tried making a function for it:
--getString10--
procedure getString10(s : string10) is
c : character;
k : integer;
begin
for i in integer range 1..10 loop
get(c);
if Ada.Text_IO.End_Of_Line = false then
s(i) := c;
else
k := i;
exit;
end if;
end loop;
for i in integer range k..10 loop
s(i) := ' ';
end loop;
end getString10;
but, here, I know the s(i) doesn't work, and I don't think the
"if Ada.Text_IO.End_Of_Line = false then"
does what I'm hoping it will do either. It's kinda just a placeholder while I look for the actual way to do it.
I've been searching for a couple of hours now, but Ada documentation isn't as available or clear as that of other languages. I've found a lot about getting strings, but not what I'm looking for.
Just pre-initialize the string with spaces before calling Get_Line.
Here's a little program I just threw together:
with Ada.Text_IO; use Ada.Text_IO;
procedure Foo is
S: String(1 .. 10) := (others => ' ');
Last: Integer;
begin
Put("Enter S: ");
Get_Line(S, Last);
Put_Line("S = """ & S & """");
Put_Line("Last = " & Integer'Image(Last));
end Foo;
and the output I get when I run it:
Enter S: hello
S = "hello     "
Last =  5
Another possibility, rather than pre-initializing the string, is to set the remainder to spaces after the Get_Line call:
with Ada.Text_IO; use Ada.Text_IO;
procedure Foo is
S: String(1 .. 10);
Last: Integer;
begin
Put("Enter S: ");
Get_Line(S, Last);
S(Last+1 .. S'Last) := (others => ' ');
Put_Line("S = """ & S & """");
Put_Line("Last = " & Integer'Image(Last));
end Foo;
For very large arrays, the latter approach might be more efficient because it doesn't assign the initial portion of the string twice, but in practice the difference is unlikely to be significant.
As an alternative, use either function Get_Line, which returns a fixed-length String that "has a lower bound of 1 and an upper bound of the number of characters read." The example Line_By_Line uses the variation that reads from a file. If need be, you can then use procedure Move to copy the Source string to the Target string; the procedure automatically pads with space by default.
Addendum: For example, this Line_Test pads with * and silently truncates long lines on the right.
with Ada.Integer_Text_IO;
with Ada.Strings.Fixed;
with Ada.Text_IO;
procedure Line_Test is
Line_Count : Natural := 0;
Buffer: String(1 .. 10);
begin
while not Ada.Text_IO.End_Of_File loop
declare
Line : String := Ada.Text_IO.Get_Line;
begin
Line_Count := Line_Count + 1;
Ada.Integer_Text_IO.Put(Line_Count, 0);
Ada.Text_IO.Put_Line(": " & Line);
Ada.Strings.Fixed.Move(
Source => Line,
Target => Buffer,
Drop => Ada.Strings.Right,
Justify => Ada.Strings.Left,
Pad => '*');
Ada.Integer_Text_IO.Put(Line_Count, 0);
Ada.Text_IO.Put_Line(": " & Buffer);
end;
end loop;
end Line_Test;
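The pad-or-truncate behaviour that Move provides in Line_Test (left-justified, dropping excess characters on the right) can be sketched in a few lines. This is a Python illustration of the semantics only, not Ada; fit is a hypothetical name.

```python
def fit(line, width=10, pad=' '):
    """Left-justify line in a fixed-width field, padding or truncating on the right."""
    if len(pad) != 1:
        raise ValueError('pad must be a single character')
    # Truncate first (drop on the right), then pad up to the fixed width.
    return line[:width].ljust(width, pad)


print(repr(fit('hello')))                      # 'hello     '
print(repr(fit('hi', pad='*')))                # 'hi********'
print(repr(fit('a very long line', pad='*')))  # 'a very lon'
```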
I'm expanding a class of mine for storing generic size strings to allow more flexible values for user input. For example, my prior version of this class was strict and allowed only the format of 2x3 or 9x12. But now I'm making it so it can support values such as 2 x 3 or 9 X 12 and automatically maintain the original user's formatting if the values get changed.
The real question I'm trying to figure out is just how to detect whether one character from a string is upper or lower case, because I have to handle case sensitivity. If the delimiter is 'x' (lowercase) and the user inputs 'X' (uppercase) inside the value, and case sensitivity is turned off, I need to be able to find the opposite case as well.
I mean, the Pos() function is case sensitive...
Delphi 7 has UpperCase() and LowerCase() functions for strings. There's also UpCase() for characters.
If I want to search for a substring within another string case insensitively, I do this:
if Pos('needle', LowerCase(hayStack)) > 0 then
You simply use lower case string literals (or constants) and apply the lowercase function on the string before the search. If you'll be doing a lot of searches, it makes sense to convert just once into a temp variable.
Here's your case:
a := '2 x 3'; // Lowercase x
b := '9 X 12'; // Upper case X
x := Pos('x', LowerCase(a)); // x = 3
x := Pos('x', LowerCase(b)); // x = 3
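The same idea with a switch for case sensitivity can be sketched as below. This is Python rather than Delphi, find_delimiter is a hypothetical name, and it assumes plain ASCII input so that lowercasing does not change the string length. It returns a 1-based position and 0 when nothing is found, mirroring Pos.

```python
def find_delimiter(value, delimiter='x', case_sensitive=False):
    """Return the 1-based position of delimiter in value, or 0 if it is absent."""
    if not case_sensitive:
        # Normalise both sides once; the caller's original string is untouched,
        # so the user's formatting can still be restored afterwards.
        value = value.lower()
        delimiter = delimiter.lower()
    return value.find(delimiter) + 1  # str.find gives -1 when missing, so this yields 0


print(find_delimiter('2 x 3'))                        # 3
print(find_delimiter('9 X 12'))                       # 3
print(find_delimiter('9 X 12', case_sensitive=True))  # 0
```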
To see if a character is upper or lower, simply compare it against the UpCase version of it:
a := 'A';
b := 'b';
upper := a = UpCase(a); // True
upper := b = UpCase(b); // False
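One caveat with comparing a character against its upper-case version: characters that have no case at all, such as digits or spaces, are equal to their own upper-case form and would be reported as "upper". A small Python sketch of the distinction (the function names are hypothetical, and Python's isupper/islower stand in for a proper letter test):

```python
def naive_is_upper(ch):
    """The comparison trick: True for 'A', but also for '3' and ' '."""
    return ch == ch.upper()


def char_case(ch):
    """Classify a single character as 'upper', 'lower' or 'none' (not cased)."""
    if len(ch) != 1:
        raise ValueError('expected a single character')
    if ch.isupper():
        return 'upper'
    if ch.islower():
        return 'lower'
    return 'none'


print(naive_is_upper('3'))  # True, although '3' is not a letter
print(char_case('A'), char_case('b'), char_case('3'))  # upper lower none
```

For the delimiter use case, this means a non-letter delimiter should be checked for separately before deciding whether case matters.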
Try using these functions (which are part of the Character unit):
Character.TCharacter.IsUpper
Character.TCharacter.IsLower
IsLower
IsUpper
UPDATE
For ANSI versions of Delphi, you can use the GetStringTypeEx function to fill a list with the type information of each ANSI character, and then compare each element against the $0001 (upper case) or $0002 (lower case) values.
uses
Windows,
SysUtils;
Var
LAnsiChars: array [AnsiChar] of Word;
procedure FillCharList;
var
lpSrcStr: AnsiChar;
lpCharType: Word;
begin
for lpSrcStr := Low(AnsiChar) to High(AnsiChar) do
begin
lpCharType := 0;
GetStringTypeExA(LOCALE_USER_DEFAULT, CT_CTYPE1, @lpSrcStr, SizeOf(lpSrcStr), lpCharType);
LAnsiChars[lpSrcStr] := lpCharType;
end;
end;
function CharIsLower(const C: AnsiChar): Boolean;
const
C1_LOWER = $0002;
begin
Result := (LAnsiChars[C] and C1_LOWER) <> 0;
end;
function CharIsUpper(const C: AnsiChar): Boolean;
const
C1_UPPER = $0001;
begin
Result := (LAnsiChars[C] and C1_UPPER) <> 0;
end;
begin
try
FillCharList;
Writeln(CharIsUpper('a'));
Writeln(CharIsUpper('A'));
Writeln(CharIsLower('a'));
Writeln(CharIsLower('A'));
except
on E:Exception do
Writeln(E.Classname, ': ', E.Message);
end;
Readln;
end.
if myChar in ['A'..'Z'] then
begin
// uppercase
end
else
if myChar in ['a'..'z'] then
begin
// lowercase
end
else
begin
// not an alpha char
end;
...or, from Delphi 2009 on:
if charInSet(myChar,['A'..'Z']) then
begin
// uppercase
end
else
if charInSet(myChar,['a'..'z']) then
begin
// lowercase
end
else
begin
// not an alpha char
end;
The JCL has routines for this in the JclStrings unit, e.g. CharIsUpper and CharIsLower. They should work in Delphi 7.
AnsiPos() is also case-sensitive (it only adds multi-byte character support), so it does not help here by itself. You can, however, force upper or lower case, irrespective of what the user enters, using UpperCase() and LowerCase().
Just throwing this out there since you may find it far more simple than the other (very good) answers.