Extract multiple substring - string

I have a string 123HEREWEGOMAMAMIAGOWILLYOULETMEGOMAMAMIATOLOVEYOUSOMAMAMIASOIHATEYOUMAMAMIAHEY How can i get all available MAMAMIA+ 2 string length ? I need : MAMAMIAGO, MAMAMIASO, etc.
My problem, on last substring found give me wrong output. It render the begining of string with length of substring given : 123HEREWE instead of MAMAMIAGO.
I am using this function:
function Occurrences(const Substring, Text: string): Integer;
var
offset: Integer;
begin
Result := 0;
offset := PosEx(Substring, Text, 1);
while offset <> 0 do
begin
inc(Result);
offset := PosEx(Substring, Text, offset + Length(Substring));
memo1.Lines.Add(Copy(Text, offset, Length(Substring)+2));
end;
end;
my procedure
procedure TFH129.btn1Click(Sender: TObject);
var
s: string;
i: Integer;
begin
s := '123HEREWEGOHEREWEGOMAMAMIAGOWILLYOULETMEGOMAMAMIATOLOVEYOUSOMAMAMIASOIHATEYOUMAMAMIAHEY00';
i := Occurrences('MAMAMIA', s);
end;

You are adding the matching text in the wrong place, without checking if the added match exceeds the string length.
When entering the while loop, first thing to do is to check if the added match does not exceed the search string length. Then you can add the match before starting a new search.
procedure ExtractSubStringsPlus2(const SubString, SearchString: string; list: TStringList);
// Extract found sub strings together with the next two characters into a list
// Use StrUtils.PosEx if Delphi version is older than XE3
var
offset,len: integer;
begin
list.Clear;
len := Length(SubString);
offset := Pos(Substring, SearchString, 1);
while offset <> 0 do
begin
// Test if added length is past length of search string
if (offset + len + 1 > Length(SearchString))
then break;
// Copy found match
list.Add(Copy(SearchString,offset,len+2));
// Continue search
offset := Pos(Substring, SearchString, offset + len);
end;
end;

Your problem is as described below in the added comment:
function Occurrences(const Substring, Text: string): Integer;
var
offset: Integer;
begin
Result := 0;
offset := PosEx(Substring, Text, 1);
while offset <> 0 do
begin
inc(Result);
offset := PosEx(Substring, Text, offset + Length(Substring));
// If the string isn't found any more, offset becomes 0, but
// you still add a new string to the memo...
memo1.Lines.Add(Copy(Text, offset, Length(Substring)+2));
end;
end;
Use this instead:
FUNCTION Occurrences(CONST SubString,Text : STRING): Cardinal;
CONST
HellFreezesOver = FALSE;
VAR
OFS : Cardinal;
BEGIN
Result:=0; OFS:=0;
REPEAT
OFS:=PosEx(SubString,Text,OFS+1);
// Here I terminate the loop, in case the string isn't found any more.
IF OFS=0 THEN BREAK;
INC(Result);
Memo1.Lines.Add(COPY(Text,OFS,LENGTH(SubString)+2))
UNTIL HellFreezesOver
END;

One very simple way using TstringList :
Var aList: TstringList;
i: integer;
begin
aList := TstringList.create;
try
aList.lineBreak := 'MAMAMIA';
aList.text := 'x' + '123HEREWEGOMAMAMIAGOWILLYOULETMEGOMAMAMIATOLOVEYOUSOMAMAMIASOIHATEYOUMAMAMIAHEY';
For I := 1 to aList.count-1 do
if length(aList[i]) >=2 then memo1.Lines.Add(aList.lineBreak + Copy(aList[i], 1, 2));
finally
aList.free;
end;
end;

Related

Delphi: Remove chars from string

I have a string containing letters, numbers and other chars.
I want to remove from that string all numbers, dots and commas
Before: 'Axis moving to new position - X-Pos: 5.4mm / Y-Pos: 3.5mm'
After: 'Axis moving to new position - X-Pos mm / Y-Pos mm'
Unfortunately string.replace() only replaces one character. So I need several lines.
How can I avoid writing every replacement line by line?
sString := sString.Replace('0', '');
sString := sString.Replace('1', '');
sString := sString.Replace('2', '');
sString := sString.Replace('3', '');
sString := sString.Replace('3', '');
...
sString := sString.Replace(':', '');
sString := sString.Replace('.', '');
Although the OP's own solution is fine, it is somewhat inefficient.
Just for completeness, here's a slightly more optimized version:
function RemoveCharsFromString(const AString, AChars: string): string;
begin
SetLength(Result, AString.Length);
var ActualLength := 0;
for var i := 1 to AString.Length do
begin
if SomePredicate(AString[i]) then
begin
Inc(ActualLength);
Result[ActualLength] := AString[i];
end;
end;
SetLength(Result, ActualLength);
end;
The algorithm is independent of the particular predicate. In this case, the predicate is Pos(AString[i], AChars) = 0.
There are multiple ways of how you can approach this. Here are three solution.
Solution 1
You can go and simply loop though the source string checking each character to se if it is one of the characters that needs to be removed.
//Simple function that loops through all characters of the source strings removing them one by one
//It is manipulating the same string all the time
function Removechars1(sourceString: string; sCharsToBeRemoved: string):string;
var I: Integer;
begin
//Assign value of the source string to the result so we can work with result strin from now on
result := SourceString;
//Loop throught the whole result sring starting at end searching for characters to be removed
//We start at the end because when we will be removing characters from the string its length
//will be decreasing.
for I := Length(result) downto 1 do
begin
//Use Pos function to see if specific character in the result string can also be found
//in the sCharsToBeRemoved and therefore needs to be removed
if Pos(result[i], sCharsToBeRemoved) <> 0 then
begin
//If so we delete the specific character
Delete(result,I,1);
end;
end;
end;
Solution 2
Second solution is similar to the first one but it relies on adding characters non removable characters to the result. It is slightly slower than the first solution
//Slightly faster function that also loops through the whole sourceString character by character
//and adds such characters to result string if they are not present in sCharsToBeRemoved string
function RemoveChars2(sourceString: string; sCharsToBeRemoved: string):string;
var I: Integer;
begin
//Prepare enpty string for our result strung to which we will be copying our end characters
result := '';
//Loop through the whole string
for I := 1 to Length(sourceString) do
begin
//If specific haracter can't be found in sCharsToBeRemoved copy that character to the
//result string
if Pos(sourceString[I], sCharsToBeRemoved) = 0 then
begin
result := result + sourceString[I];
end;
end;
end;
Solution 3
The third solution relies on string helpers for replacing specific characters. This one is by far the fastest of the three needing about half the time that is needed by the first solution to process the same job
//Another approach of removing characters from source string that relies on Replace string helper
function RemoveChars3(sourceString: string; sCharsToBeRemoved: string):string;
var I: Integer;
begin
//Assign value of the source string to the result so we can work with result strin from now on
result := sourceString;
//Loop through the sCharsToBeRemoved string so we can then call Replace string helper in order
//to replace all occurrences of such character in sourceString;
for I := 1 to Length(sCharsToBeRemoved) do
begin
result := result.Replace(sCharsToBeRemoved[I],'');
end;
end;
Main advantages of this approach is that it is quite fast and could easily modified to be able to remove whole substrings and not only individual characters.
PS: In my testing your solution was actually the slowest needing about 20% more time than my first solution
TestTring
jfkldajflkajdflkajlkčfjaskljflakjflkdasjflkčjdfkldafjadklfjadklfjaldkakljfkldajflkčadjslfkjadklfjlkadčjflkajdflčkjadlkfjladkdjfkladjflkadjflkčjadklčfjaldkjfkladjfklajflkadjfkadgfkljdklfjawdkojfkladsjflčaksdjdfklčasjdklčfdfklčjadslkdfjlka
CharsToBeRemoved
asjk
Solution 1
1230 ms
Solution 2
1263 ms
Solution 3
534 ms
Your solution
1574 ms
This solution works with a very small footprint of code lines.
I just split the string on each occurence of a char which should be removed. After that I put the pieces together without the removed chars.
uses System.SysUtils;
function RemoveCharsFromString(sFullString: string; sCharsToBeRemoved: string): string;
var
splitted: TArray<String>;
begin
splitted := sFullString.Split(sCharsToBeRemoved.ToCharArray());
Result := string.Join('', splitted);
end;
string.Replace has an overload where you can pass flags to replace all instead of just one. Example:
sString := sString.Replace('1', '', [rfReplaceAll, rfIgnoreCase]);
Edit: Stringlist equivalent:
sString.Text := sString.Text.Replace('1', '', [rfReplaceAll, rfIgnoreCase]);
Working with strings spends more time, use PChar instead.
I think here's a slightly more optimized version
function RemoveCharsFromString(const AString, AChars: String): String;
var
i, j, k, LenString, LenChars : Integer;
PString, PChars : PChar;
label
Ends;
begin
PString := Pointer(AString);
PChars := Pointer(AChars);
LenString := AString.Length;
LenChars := AChars.Length;
k := 0;
for i := 0 to LenString - 1 do
begin
for j := 0 to LenChars - 1 do
if PString[i] = PChars[j] then
Goto Ends;
PString[k] := PString[i];
Inc(k);
Ends :
end;
PString[k] := #0;
Result := StrPas(PString);
end;
If you don't like Labels, use this code :
function RemoveCharsFromString(const AString, AChars: String): String;
var
i, j, k, LenString, LenChars : Integer;
PString, PChars : PChar;
found : Boolean;
begin
PString := Pointer(AString);
PChars := Pointer(AChars);
LenString := AString.Length;
LenChars := AChars.Length;
k := 0;
for i := 0 to LenString - 1 do
begin
found := False;
for j := 0 to LenChars - 1 do
if PString[i] = PChars[j] then
begin
found := True;
Break;
end;
if not found then
begin
PString[k] := PString[i];
Inc(k);
end;
end;
PString[k] := #0;
Result := StrPas(PString);
end;
You can call it like this :
sString := RemoveCharsFromString(sString, '0123456789.,');

Trouble with string in Delphi

I have a random string of numbers
(numbers can only be used once, only from 1-9, almost any length(min 1,max 9)):
var
Input: String;
begin
Input := '431829576'; //User inputs random numbers
And now I need to get specified number to front. How about 5.
var
Number: Integer;
begin
Number := 5;
and function executes with result 543182976.
I don't have any ideas how to make a function like this, Thanks.
Do you mean like this?
function ForceDigitInFront(const S: string; const Digit: Char): string;
begin
result := Digit + StringReplace(S, Digit, '', []);
end;
A more lightweight solution is
function ForceDigitInFront(const S: string; const Digit: Char): string;
var
i: Integer;
begin
result := S;
for i := 1 to Length(S) do
if result[i] = Digit then
begin
Delete(result, i, 1);
break;
end;
result := Digit + result;
end;
You could do it this way :
function ForceDigitInFront(const S: string; const Digit: Char): string;
var
dPos : Integer;
begin
Result := s;
dPos := Pos( Digit,S);
if (dPos <> 0) then begin // Only apply Digit in front if Digit exists !?
Delete( Result,dPos,1);
Result := Digit + Result;
end;
end;
If Digit is not in input string, the digit is not added here, but change this if it does not fit your implementation.
Here is a solution that reduces the numer of String allocations needed, as well as checks if the digit is already in the front:
function ForceDigitInFront(const S: string; const Digit: Char): string;
var
dPos : Integer;
begin
Result := s;
for dPos := 1 to Length(Result) do
begin
if Result[dPos] = Digit then
begin
if dPos > 1 then
begin
UniqueString(Result);
Move(Result[1], Result[2], (dPos-1) * SizeOf(Char));
Result[1] := Digit;
end;
Exit;
end;
end;
end;

How to get a specific field from delimited text

I have a string of delimited text ie:
Value1:Value2:Value3:Value4:Value5:Value6
How would I extract, for example, a specific value Ie:
Label.caption := GetValuefromDelimitedText(2); to get Value2
Thanks in advance
Paul
Something like that - if you like compact code (but not as performant as Davids):
function GetValueFromDelimitedText(const s: string; Separator: char; Index: Integer): string;
var sl : TStringList;
begin
Result := '';
sl := TStringList.Create;
try
sl.Delimiter := Separator;
sl.DelimitedText := s;
if sl.Count > index then
Result := sl[index];
finally
sl.Free;
end;
end;
Hope that helps
This should do it:
function GetValueFromDelimitedText(
const s: string;
const Separator: char;
const Index: Integer
): string;
var
i, ItemIndex, Start: Integer;
begin
ItemIndex := 1;
Start := 1;
for i := 1 to Length(s) do begin
if s[i]=Separator then begin
if ItemIndex=Index then begin
Result := Copy(s, Start, i-Start);
exit;
end;
inc(ItemIndex);
Start := i+1;
end;
end;
if ItemIndex=Index then begin
Result := Copy(s, Start, Length(s)-Start+1);
end else begin
Result := '';
end;
end;
This version allows you to specify the separator, you would obviously pass ':'. If you ask for an item beyond the end then the function will return the empty string. You could change that to an exception if you preferred. Finally, I have arranged that this uses 1-based indexing as per your example, but I personally would choose 0-based indexing.
If using Delphi XE or higher you can also use StrUtils.SplitString like this:
function GetValueFromDelimitedText (const Str: string; Separator: Char; Index: Integer) : string;
begin
Result := SplitString (Str, Separator) [Index];
end;
In production code, you should check that Index is indeed a valid index.
This method returns a TStringDynArray (a dynamic array of strings) so you can also use it like this (using enumerators):
for Str in SplitString (Str, Separator) do
Writeln (Str);
which can be very useful IMHO.

Find and Count Words in a String in Delphi?

I have a string comprising numerous words. How do I find and count the total amount of times that a particular word appears?
E.g "hello-apple-banana-hello-pear"
How would I go about finding all the "hello's" in the example above?
Thanks.
In Delphi XE you can use StrUtils.SplitString.
Something like this
var
Words: TstringDynArray;
Word: string;
WordCount: Integer;
begin
WordCount := 0;
Words := SplitString('hello-apple-banana-hello-pear', '-');
for Word in Words do
begin
if Word = 'hello' then
inc(WordCount);
end;
This would depend entirely on how you define a word and the text from which you wish to pull the words. If a "word" is everything between spaces, or "-" in your example, then it becomes a fairly simple task. If, however, you want to deal with hyphenated words, abbreviations, contractions, etc. then it becomes a lot more difficult.
More information please.
EDIT: After rereading your post, and if the example you give is the only one you want, then I'd suggest this:
function CountStr(const ASearchFor, ASearchIn : string) : Integer;
var
Start : Integer;
begin
Result := 0;
Start := Pos(ASearchFor, ASearchIn);
while Start > 0 do
begin
Inc(Result);
Start := PosEx(ASearchFor, ASearchIn, Start + 1);
end;
end;
This will catch ALL instances of a sequence of characters.
I'm sure there is plenty of code around to do this sort of thing, but it's easy enough to do it yourself with the help of Generics.Collections.TDictionary<K,V>.
program WordCount;
{$APPTYPE CONSOLE}
uses
SysUtils, Character, Generics.Collections;
function IsSeparator(const c: char): Boolean;
begin
Result := TCharacter.IsWhiteSpace(c);//replace this with whatever you want
end;
procedure PopulateWordDictionary(const s: string; dict: TDictionary<string, Integer>);
procedure AddItem(Item: string);
var
Count: Integer;
begin
if Item='' then
exit;
Item := LowerCase(Item);
if dict.TryGetValue(Item, Count) then
dict[Item] := Count+1
else
dict.Add(Item, 1);
end;
var
i, len, Start: Integer;
Item: string;
begin
len := Length(s);
Start := 1;
for i := 1 to len do begin
if IsSeparator(s[i]) then begin
AddItem(Copy(s, Start, i-Start));
Start := i+1;
end;
end;
AddItem(Copy(s, Start, len-Start+1));
end;
procedure Main;
var
dict: TDictionary<string, Integer>;
pair: TPair<string, Integer>;
begin
dict := TDictionary<string, Integer>.Create;
try
PopulateWordDictionary('hello apple banana Hello pear', dict);
for pair in dict do
Writeln(pair.Key, ': ', pair.Value);
finally
dict.Free;
end;
end;
begin
try
Main;
except
on E: Exception do
Writeln(E.ClassName, ': ', E.Message);
end;
end.
Output:
hello: 2
banana: 1
apple: 1
pear: 1
Note: I'm working with Delphi 2010 and don't have SplitString() available.
A very clever implementation I saw somewhere on the web:
{ Returns a count of the number of occurences of SubText in Text }
function CountOccurences( const SubText: string;
const Text: string): Integer;
begin
if (SubText = '') OR (Text = '') OR (Pos(SubText, Text) = 0) then
Result := 0
else
Result := (Length(Text) - Length(StringReplace(Text, SubText, '', [rfReplaceAll]))) div Length(subtext);
end; { CountOccurences }

How many different letters are in string

I have to write program that counts how many different letters are in string.
For example "abc" will give 3; and "abcabc" will give 3 too, because there are only 3 different letters.
I need to use pascal, but if you can help with code in different languages it would be very nice too.
Here is my code that does not work:
var s:string;
i,j,x,count:integer;
c:char;
begin
clrscr;
Readln(s);
c:=s[1];
x:=1;
Repeat
For i:=1 to (length(s)) do
begin
If (c=s[i]) then
begin
delete(s,i,1);
writeln(s);
end;
end;
c:=s[1];
x:=x+1;
Until length(s)=1;
Writeln(x);
x is the different letter counter;
Maybe my algorythm is very bad.. any ideas? Thank you.
You've got answers on how to do it, here's why your way doesn't work.
First of all intuitively you had a good idea: Start with the first char in the string, count it (you forgot to include the counting code), remove all occurrences of the same char in the string. The idea is inefficient, but it would work. You ran into trouble with this bit of code:
For i:=1 to (length(s)) do
begin
If (c=s[i]) then
begin
delete(s,i,1);
end;
end;
The trouble is, Pascal will take the Length(s) value when it sets up the loop, but your code changes the length of the string by removing chars (using delete(s,i,1)). You'll end up looking at bad memory. The secondary issue is that i is going to advance, it doesn't matter if it matched and removed an char or not. Here's why that's bad.
Index: 12345
String: aabbb
You're going to test for i=1,2,3,4,5, looking for a. When i is 1 you'll find a match, remove the first char, and your string is going to look like this:
Index: 1234
String: abbb
You're now testing with i=2, and it's not a match, because s[2] =b. You just skiped one a, and that given a is going to stay in the array an other round and cause your algorithm to count it twice. The "fixed" algorithm would look like this:
i := 1;
while i <= Length(s) do
if (c=s[i]) then
Delete(s,i,1)
else
Inc(i);
This is different: In the given example, if I found a match at 1, the cursor doesn't advance, so it sees the second a. Also because I'm using a while loop, not a for loop, I can't get in trouble with possible implementation details of the for loop.
Your algorithm has an other problem. After the loop that removes all occurrences of the first char in string you're preparing the next loop using this code:
c:=s[1];
The trouble is, if you feed this algorithm an string of the form aa (length=2, two identical chars), it's going to enter the loop, delete or occurrences of a (those turning s into an EMPTY string) and then attempt to read the first char of the EMPTY string.
One final word: Your algorithm should handle the empty string on input, returning an count=0. Here's the fixed algorithm:
var s:string;
i,count:integer;
c:char;
begin
Readln(s);
count:=0;
while Length(s) > 0 do
begin
Inc(Count);
c := s[1];
i := 1;
while i <= Length(s) do
begin
If (c=s[i]) then
delete(s,i,1)
else
Inc(i);
end;
end;
Writeln(Count);
Readln;
end.
I am a Delphi expert, so I don't quite know how restrictive plain Pascal is. Nevertheless, this is Delphi:
// Returns the number of *distinct* "ANSI" characters in Str
function NumChrs(const Str: AnsiString): integer;
var
counts: array[0..255] of boolean;
i: Integer;
begin
ZeroMemory(#counts[0], sizeof(boolean) * length(counts));
for i := 1 to length(Str) do
counts[ord(Str[i])] := true;
result := 0;
for i := 0 to high(counts) do
if counts[i] then
inc(result);
end;
The first line can be written
for i := 0 to high(counts) do
counts[i] := false;
if you cannot use the Windows API (or the Delphi FillChar function).
If you wish to have Unicode support (as in Delphi 2009+), you can do
// Returns the number of *distinct* Unicode characters in Str
function NumChrs(const Str: string): integer;
const
AllocBy = 1024;
var
FoundCodepoints: array of integer;
i: Integer;
procedure Push(Codepoint: integer);
var
i: Integer;
begin
for i := 0 to result - 1 do
if FoundCodepoints[i] = Codepoint then
Exit;
if length(FoundCodepoints) = result then
SetLength(FoundCodepoints, length(FoundCodepoints) + AllocBy);
FoundCodepoints[result] := Codepoint;
inc(result);
end;
begin
result := 0;
for i := 1 to length(Str) do
Push(ord(Str[i]));
end;
Here's my version. I'm not saying you'll get a great mark in your assignment if you hand this in.
function NumberOfUniqueChars(s: string): Integer;
var
i, j: Integer;
c: char;
begin
for i := 1 to Length(s) do
for j := i+1 to Length(s) do
if s[i]<s[j] then
begin
c := s[i];
s[i] := s[j];
s[j] := c;
end;
Result := 0;
for i := 1 to Length(s) do begin
if (i=1) or (s[i]<>c) then
inc(Result);
c := s[i];
end;
end;
And using a Delphi construct (not efficient, but clean)
function returncount(basestring: String): Integer;
var charstrings: TStringList;
I:Integer;
begin
Result := 0;
charstrings := TStringlist.create;
try
charstrings.CaseSensitive := False;
charstrings.Duplicates := DupIgnore;
for I := 1 to length(basestring) do
charstrings.Add(basestring[i]);
Result := charstrings.Count;
finally
charstrings.free;
end;
end;
Different languages are ok?
RUBY:
s = "abcabc"
=> "abcabc"
m = s.split(//)
=> ["a", "b", "c", "a", "b", "c"]
p = m & m
=> ["a", "b", "c"]
p.count
=> 3
A Delphi version. Same idea as #The Communist Duck Python version.
function GetNumChars(Str: string): Integer;
var
s: string;
c: Char;
begin
s := '';
for c in Str do
begin
if Pos(c, s) = 0 then
begin
s := s + c;
end;
end;
Result := Length(s);
end;
Just tossing in a set-alternative...
program CountUniqueChars;
{$APPTYPE CONSOLE}
uses
SysUtils;
var
InputStr: String;
CountedChr: Set of Char;
TotalCount: Integer;
I: Integer;
begin
Write('Text: ');
ReadLn(InputStr);
CountedChr := [];
TotalCount := 0;
for I := 1 to Length(InputStr) do
begin
Write('Checking: ' + InputStr[i]);
if (InputStr[i] in CountedChr)
then WriteLn(' --')
else begin
Include(CountedChr, InputStr[i]);
Inc(TotalCount);
WriteLn(' +1')
end;
end;
WriteLn('Unique chars: ' + IntToStr(TotalCount));
ReadLn;
end.
In Python, with explanation if you want it for any other language: (Since you wanted different languages)
s = 'aahdhdfrhr' #s is the string
l = [] #l is an empty list of some kind.
for i in s: #Iterate through the string
if i not in l: #If the list does not contain the character
l.append(i) #Add the character to the list
print len(l) #Print the number of characters in the list
function CountChars(const S:AnsiString):Integer;
var C:AnsiChar; CS:Set of AnsiChar;
begin
Result := 0;
CS := [];
for C in S do
if not (C in CS) then
begin
CS := CS + [C];
Inc(Result);
end;
end;

Resources