Is it Safe to use 'Unsafe' Thread Functions?

Please pardon my slightly humorous title. I use two different definitions of the word 'safe' in it (obviously).
I am rather new to threading (well, I have used threading for many years, but only very simple forms of it). Now I am faced with the challenge of writing parallel implementations of some algorithms, and the threads need to work on the same data. Consider the following newbie mistake:
const
  N = 2;

var
  value: integer = 0;

function ThreadFunc(Parameter: Pointer): integer;
var
  i: Integer;
begin
  for i := 1 to 10000000 do
    inc(value);
  result := 0;
end;

procedure TForm1.FormCreate(Sender: TObject);
var
  threads: array[0..N - 1] of THandle;
  i: Integer;
  dummy: cardinal;
begin
  for i := 0 to N - 1 do
    threads[i] := BeginThread(nil, 0, @ThreadFunc, nil, 0, dummy);
  if WaitForMultipleObjects(N, @threads[0], true, INFINITE) = WAIT_FAILED then
    RaiseLastOSError;
  ShowMessage(IntToStr(value));
end;
A beginner might expect the code above to display the message 20000000. Indeed, value starts out at 0, and then we inc it 20000000 times. However, since the inc procedure is not 'atomic', the two threads will conflict (I guess that inc does three things: it reads, it increments, and it saves), and so a lot of the incs will be effectively 'lost'. A typical value I get from the code above is 10030423.
The simplest workaround is to use InterlockedIncrement instead of Inc (which will be much slower in this silly example, but that's not the point). Another workaround is to place the inc inside a critical section (yes, that will also be very slow in this silly example).
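For illustration, a minimal sketch of the first workaround, assuming the Windows unit is in the uses clause (ThreadFuncInterlocked is a hypothetical name):
function ThreadFuncInterlocked(Parameter: Pointer): integer;
var
  i: Integer;
begin
  for i := 1 to 10000000 do
    InterlockedIncrement(value); // atomic read-modify-write on the shared counter
  result := 0;
end;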
Now, in most real algorithms, conflicts are not this common. In fact, they might be very uncommon. One of my algorithms creates DLA fractals, and one of the variables that I inc every now and then is the number of adsorbed particles. Conflicts here are very rare, and, more importantly, I really don't care if the variable sums up to 20000000, 20000008, 20000319, or 19999496. Thus, it is tempting not to use InterlockedIncrement or critical sections, since they just bloat the code and make it (marginally) slower, to no benefit (as far as I can see).
However, my question is: can there be more severe consequences of conflicts than a slightly 'incorrect' value of the incrementing variable? Can the program crash, for instance?
Admittedly, this question might seem silly, for, after all, the cost of using InterlockedIncrement instead of inc is rather low (in many cases, but not all!), and so it is (perhaps) stupid not to play safe. But I also feel that it would be good to know how this really works on a theoretical level, so I still think that this question is very interesting.

Your program won't ever crash due to a race on the increment of an integer that is only used as a count. All that can go wrong is that you don't get the correct answer. Obviously, if you were using the integer as an index into an array, or perhaps as a pointer, then you could have problems.
Unless you are incrementing this value incredibly frequently, it's hard to imagine that an interlocked increment would be expensive enough for you to notice the performance difference.
What's more, the most efficient approach is to have each thread maintain its own private count, and then sum all the individual thread counts when you join the threads at the end of the calculation. That way you get the best of both worlds: no contention on the incrementing, and the correct answer. Of course, you need to take measures to ensure that you don't get caught out by false sharing.
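For illustration, a minimal sketch of that idea, assuming each thread is passed its index through the Parameter argument (counts and CountingThreadFunc are hypothetical names). Writing the private count back only once, after the loop, also keeps the threads from touching adjacent elements of counts while they work, which sidesteps false sharing:
var
  counts: array[0..N - 1] of integer;

function CountingThreadFunc(Parameter: Pointer): integer;
var
  i: Integer;
  local: Integer;
begin
  local := 0; // private count, no contention
  for i := 1 to 10000000 do
    inc(local);
  counts[NativeInt(Parameter)] := local; // a single write per thread
  result := 0;
end;

// started with, for example:
//   threads[i] := BeginThread(nil, 0, @CountingThreadFunc, Pointer(i), 0, dummy);
// and summed after WaitForMultipleObjects returns:
//   total := 0;
//   for i := 0 to N - 1 do
//     inc(total, counts[i]);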

Related

Swap strings without reference counting

In QuickSort, a lot of time is spent on the swap: temp := var[i]; var[i] := var[j]; var[j] := temp. When the vars are integers, I time 140 msec for a large random array. When the vars are strings, the time is 750 msec. It appears to me that much of the difference is caused by the need to update the reference counts in all three assignments.
But is this necessary? After all, the reference counts for var[i] and var[j] will be the same before and after these three assignments. Would the following code corrupt things? (not that it solves a speed problem, but out of interest):
// P: PString;
Move(values[i], P, SizeOf(PString));
Move(values[j], values[i], SizeOf(PString));
Move(P, values[j], SizeOf(PString));
There is no temp variable. Only the two pointers to the strings are interchanged. And if this is okay, is there a Delphi function to swap 2 pointers?
What you are proposing is a well-known and valid optimisation. Rather than calling the Move function, it is better to perform direct assignments using casts to avoid reference counting code being generated.
var
  temp: Pointer;
....
temp := Pointer(values[i]);
Pointer(values[i]) := Pointer(values[j]);
Pointer(values[j]) := temp;
In order for this to work you need to be confident that no exceptions will be raised during the swap process. Simple assignments of valid memory will not lead to exceptions, so this concern can be readily dismissed.
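As far as I know there is no standard routine that swaps two pointers, but a tiny helper is easy to write (SwapPointers is a hypothetical name). Casting the string variables to Pointer at the call site suppresses the reference counting, exactly as in the assignments above:
procedure SwapPointers(var A, B: Pointer);
var
  temp: Pointer;
begin
  temp := A;
  A := B;
  B := temp;
end;

// usage, for example:
//   SwapPointers(Pointer(values[i]), Pointer(values[j]));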

Are concatenated Delphi strings held in a hidden temporary variable that retains a reference to the string?

I'm trying to understand memory issues in a Delphi server application: originally I suspected an outright leak, but now believe we're seeing memory hanging around longer than it should due to the compiler's use of a hidden temporary when dynamically concatenating strings with +, causing painful free-space memory fragmentation.
Background:
This is a suite of 32-bit server applications on Windows, Delphi version is quite old, I think it's 7 but is for sure pre-Unicode, and uses the Nexus 3 memory manager where I've written a DLL to hook all the allocate/free calls (and gigabytes of memory traces).
I have application source code but not the compiler; I am not the developer of this app (or even a Delphi dev) but have created extensive custom tools to monitor, trace, and analyze memory. I've been picking the .EXE apart in the IDA Pro disassembler.
Some sample code:
I've tried to whittle this down to the bare minimum case; this code is not intended to compile:
procedure TaskThread.RunWorkLoop;
begin
  while not Terminated do
  begin
    tsk := WaitForWorkToDo(); // this could sit for minutes at a time
    SetThreadName('Working on ' + tsk.Name);
    tsk.Run(); // THIS COULD TAKE A LONG TIME
    SetThreadName('Idle');
  end;
end;
SetThreadName() takes a const string parameter and hangs onto it so that other parts of the system know what this thread is doing.
My disassembly of the code shows that the compiler has allocated a hidden local temporary variable to receive the concatenation of the "Working on" and task name parts, and this is what's passed to SetThreadName, where it also retains a handle to the string.
While the task is running - and this could be 20 minutes - I believe there are two handles to the string. One is held within SetThreadName, the other is in the hidden temporary.
This is all fine and good.
Then, when the task is over and the thread name is set to 'Idle', SetThreadName() releases the original string and assigns the literal Idle.
BUT: I believe the hidden local temporary still retains a handle to that string, with a refcount=1, so it's going to take up space until either the procedure returns, or the next loop comes around to overwrite that hidden local temporary, releasing the old value.
And during this time, it's not accessible to the program, can't be explicitly released, and is serving no useful purpose but is still consuming memory.
For most procedures this doesn't matter because they start and finish relatively close to each other, so everything is released all at once, but in a looping server app, these can hang around much longer. This is causing us memory fragmentation.
It gets worse
In the actual application, it's more along the lines of:
SetThreadName(tsk.Name + '-' + FormatDateTime('mm/dd/yy hh:nn:ss', Now));
In this case, there are two hidden temporaries: one for the result of FormatDateTime, and the other for the overall concatenation result, in effect running as:
tmp1: String;
tmp2: String;
...
tmp1 := FormatDateTime('...');
tmp2 := tsk.Name + '-' + tmp1;
SetThreadName(tmp2);
I am certain I'm seeing the string result of FormatDateTime hanging around in memory long after the task has completed, and I've seen it literally be a single ~30-byte allocation sitting in the middle of a 1 megabyte memory section, surrounded by free space; Nexus3MM uses VirtualAlloc to allocate larger OS-level chunks.
That single 30-byte string will be released eventually, either on the next loop or when the procedure exits, so I'm certain it's not a leak, but I would rather that single 30-byte allocation sitting in the middle of a lonely one megabyte section actually go away when we're done with it so the whole section could be released to the OS.
But if it sticks around long enough, the memory manager is going to allocate something else from it, and this hole in memory gets more permanent.
We have very detailed busy/free memory maps and are sure that this fragmentation is killing us (this is certainly not the only cause).
My Questions:
1) Am I understanding this correctly?
2) If so, is the only workaround to elide the hidden temporaries by using explicit ones, where we do things like:
tmp1: String;
tmp2: String;
...
tmp1 := FormatDateTime('...');
tmp2 := tsk.Name + '-' + tmp1;
SetThreadName(tmp2);
tmp1 := ''; // release the date/time string
tmp2 := ''; // release the overall thread name string
I'm pretty confident I have to do this with the FormatDateTime intermediate result (I've seen it specifically), but am not sure about the overall concatenation.
This just feels wrong.
EDIT: Just an update a few weeks later. We've rewritten the central loop to use explicit temporaries, and it's actually made a noticeable (though not major) difference in memory fragmentation of some key server processes. We still have other things to look into, but it's clear to me that this was a road worth going down.
From my experience, it does work exactly like that. I'm not sure if this is by contract or by implementation. I guess with the recent addition of inline variable declaration, this might be slightly different now. But in pre-unicode Delphi, I believe it works exactly as you described.
All routines using variables (implicit or explicit) of a managed type, or a record containing one, will generate an implicit try/finally block in the routine, with the finally part clearing the reference. What your code really does is:
procedure TaskThread.RunWorkLoop;
var
  sImplicit: string;
begin
  sImplicit := '';
  try
    while not Terminated do
    begin
      tsk := WaitForWorkToDo(); // this could sit for minutes at a time
      sImplicit := 'Working on ' + tsk.Name;
      SetThreadName(sImplicit);
      tsk.Run(); // THIS COULD TAKE A LONG TIME
      SetThreadName('Idle');
    end;
  finally
    sImplicit := '';
  end;
end;
In your situation, since you never exit the routine where the implicit variable is used, it does remain in memory.
As for a solution, I believe what you propose would work. But you could also simply move the code to another method (or a local procedure).
procedure TaskThread.RunWorkLoop;

  procedure JustKeepWorking;
  begin
    tsk := WaitForWorkToDo(); // this could sit for minutes at a time
    SetThreadName('Working on ' + tsk.Name);
    tsk.Run(); // THIS COULD TAKE A LONG TIME
    SetThreadName('Idle');
  end;

begin
  while not Terminated do
  begin
    JustKeepWorking;
  end;
end;
Also, you might want to check this question for additional insight.

Is it necessary to do Multi-thread protection for a Boolean property in Delphi?

I found a Delphi library named EventBus and I think it will be very useful, since the Observer is my favorite design pattern.
In the process of learning its source code, I found a piece of code that may be due to multithreading security considerations, which is in the following (property Active's getter and setter methods).
TSubscription = class(TObject)
private
  FActive: Boolean;
  procedure SetActive(const Value: Boolean);
  function GetActive: Boolean;
  // ... other members
public
  constructor Create(ASubscriber: TObject;
    ASubscriberMethod: TSubscriberMethod);
  destructor Destroy; override;
  property Active: Boolean read GetActive write SetActive;
  // ... other methods
end;

function TSubscription.GetActive: Boolean;
begin
  TMonitor.Enter(Self);
  try
    Result := FActive;
  finally
    TMonitor.Exit(Self);
  end;
end;

procedure TSubscription.SetActive(const Value: Boolean);
begin
  TMonitor.Enter(Self);
  try
    FActive := Value;
  finally
    TMonitor.Exit(Self);
  end;
end;
Could you please tell me whether or not the lock protection for FActive is necessary, and why?
Summary
Let me start by making this point as clear as possible: Do not attempt to distill multi-threaded development into a set of "simple" rules. It is essential to understand how the data is shared in order to evaluate which of the available concurrency protection techniques would be correct for a particular situation.
The code you have presented suggests the original authors had only a superficial understanding of multi-threaded development. So it serves as a lesson in what not to do.
First, locking the Boolean for read/write access in that way serves no purpose at all. I.e. each read or write is already atomic.
Furthermore, in cases where the property does need protection for concurrent access: it fails abysmally to provide any protection at all.
The net effect is redundant ineffective code that can trigger pointless wait states.
Thread-safety
In order to evaluate 'thread-safety', the following concepts should be understood:
If 2 threads 'race' for the opportunity to access a shared memory location, one will be first, and the other second. In the absence of other factors, you have no control over which thread would 'start' its access first.
Your only control is to block the 'second' thread from concurrent access if the 'first' thread hasn't finished its critical work.
The word "critical" has loaded meaning and may take some effort to fully understand. Take note of the explanation later about why a Boolean variable might need protection.
Critical work refers to all the processing required for the operation on the shared data to be deemed complete.
It's related to concepts of atomic operations or transactional integrity.
The 'second' thread could either be made to wait for the 'first' thread to finish or to skip its operation altogether.
Note that if the shared memory is accessed concurrently by both threads, then there's the possibility of inconsistent behaviour based on the exact ordering of the internal sub-steps of each thread's processing.
This is the fundamental risk and area of concern when thinking about thread-safety. It is the base principle from which other principles are derived.
'Simple' reads and writes are (usually) atomic
No concurrent operations can interfere with the reading/writing of a single byte of data. You will always either get the value in its entirety or replace the value in its entirety.
This concept extends to multiple bytes up to the machine architecture bit size; but does have a caveat, known as tearing.
When a memory address is not aligned on the bit size, then there's the possibility of the bytes spanning the end of one aligned location into the beginning of the next aligned location.
This means that reading/writing the bytes may take 2 operations at the machine level.
As a result 2 concurrent threads could interleave their sub-steps resulting in invalid/incorrect values being read. E.g.
Suppose one thread writes $ffff over an existing value of $0000 while another reads.
"Valid" reads would return either $0000 or $ffff depending on which thread is 'first'.
If the sub-steps run concurrently, then the reading thread could return invalid values of $ff00 or $00ff.
(Note that some platforms might still guarantee atomicity in this situation, but I don't have the knowledge to comment in detail on this.)
To reiterate: single byte values (including Boolean) cannot span aligned memory locations. So they're not subject to the tearing issue above. And this is why the code in the question that attempts to protect the Boolean is completely pointless.
When protection is needed
Although reads and writes in isolation are atomic, it's important to note that when a value is read and impacts a write decision, then this cannot be assumed to be thread-safe. This is best explained by way of a simple example.
Suppose 2 threads invert a shared boolean value: FBool := not FBool;
2 threads means this happens twice, and once both threads have finished, the boolean should end up with its starting value. However, each inversion is a multi-step operation:
Read FBool into a location local to the thread (either stack or register).
Invert the value.
Write the inverted value back to the shared location.
If there's no thread-safety mechanism employed then the sub-steps can run concurrently. And it's possible that both threads:
Read FBool; both getting the starting value.
Both threads invert their local copies.
Both threads write the same inverted value to the shared location.
And the end result is that the value is inverted when it should have been reverted to its starting value.
Basically the critical work is clearly more than simply reading or writing the value. To properly protect the boolean value in this situation, the protection must start before the read, and end after the write.
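As an illustration only, a minimal sketch of what correct protection for that invert operation could look like, reusing the TMonitor locking from the question (ToggleActive is a hypothetical method):
procedure TSubscription.ToggleActive;
begin
  TMonitor.Enter(Self);
  try
    FActive := not FActive; // read, invert and write all happen under one lock
  finally
    TMonitor.Exit(Self);
  end;
end;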
The important lesson to take away from this is that thread-safety requires understanding how the data is shared. It's not feasible to produce an arbitrary generic safety mechanism without this understanding.
And this is why any such attempt as in the EventBus code in the question is almost certainly doomed to be deficient (or even an outright failure).

Is TStringList thread safe?

Is it ok to read data from TStringList without any form of synchronization? For example, synchronization with the main thread.
Example code
var
  MyStringList: TStringList; // declared globally

procedure TForm1.JvThread1Execute(Sender: TObject; Params: Pointer);
var
  x: integer;
begin
  for x := 0 to MaxInt do
    MyStringList.Add(FloatToStr(Random));
end;

procedure TForm1.ButtonClick(Sender: TObject);
var
  x: integer;
  SumOfRandomNumbers: double;
begin
  for x := 0 to MyStringList.Count - 1 do
    SumOfRandomNumbers := SumOfRandomNumbers + StrToFloat(MyStringList.Strings[x]);
end;
Or should I protect access to MyStringList with EnterCriticalSection?
var
  MyStringList: TStringList; // declared globally

procedure TForm1.JvThread1Execute(Sender: TObject; Params: Pointer);
var
  x: integer;
begin
  for x := 0 to MaxInt do
  begin
    EnterCriticalSection(MySemaphore);
    MyStringList.Add(FloatToStr(Random));
    LeaveCriticalSection(MySemaphore);
  end;
end;

procedure TForm1.ButtonClick(Sender: TObject);
var
  x: integer;
  SumOfRandomNumbers: double;
begin
  for x := 0 to MyStringList.Count - 1 do
  begin
    EnterCriticalSection(MySemaphore);
    SumOfRandomNumbers := SumOfRandomNumbers + StrToFloat(MyStringList.Strings[x]);
    LeaveCriticalSection(MySemaphore);
  end;
end;
First, no, TStringList is not thread-safe.
Second, attempting to make it so would be a terrible idea for a low-level container that in the vast majority of cases would not be shared across multiple threads.
Third, the naive code you propose to make it thread-safe is woefully insufficient. It falls well short of making it truly thread-safe - which is part of the problem in trying to do so generically.
In the text of your question you ask:
Is it ok to read data from TStringList without any form of synchronisation?
Yes it is okay. In fact, that is preferred because it is more efficient.
However, if the data is shared across threads, you may run into problems. Which is why you should minimise the amount of data (not just string lists) shared across threads. And if you do need to share data, do so in a suitably controlled fashion.
Expanding on point 3
The reason your code is not thread-safe is that it falls short of protecting all your data from shared access. This is a common misunderstanding in multi-threaded development: "I just need to wrap certain operations with locks and all will be fine."
The point is, if your list is shared, you are:
Sharing the structures that represent the container.
AND you are sharing the data members (the actual strings) themselves.
When dealing with strings, this goes a step further, because the way Delphi manages strings means they could be shared (through internal reference counting) with other strings of the same value in an entirely different area of the application.
While it is possible your proposed locking strategy might be suitable for your current requirements, it is far from being generally thread-safe.
Conclusion
If you want to write thread-safe code the onus is on you to:
Understand the data access paths.
Minimise sharing between threads (by far the best bang for buck).
And to implement the best strategy to share the data safely (of which there are many options, and locking is not guaranteed to be best in any case).
Sidenote
I indicated earlier that your locking technique only "might be suitable for your current requirements" because I do not believe you have really given an indication as to your real requirements. If you have, then you really do need to take note of the following:
In the code you have presented there would be absolutely no benefit in making your TStringList "thread-safe". You populate the list in a loop, and you read values in a second loop. You're doing absolutely nothing to use the data concurrently.
The closest your code should come to multi-threading is: It would be a good idea to process both loops off the main thread to avoid blocking the UI. In which case, the background thread should NOT share its TStringList instance. And can simply synchronise with the main thread to report the result (and possibly progress updates).
By not sharing data that doesn't need to be shared, you can bypass the need for locks entirely. They would be an unnecessary overhead. And you can be happy that TStringList doesn't have a built-in "thread-safety" mechanism.
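For illustration, a minimal sketch of that approach, assuming a Delphi version with anonymous thread support (the names are hypothetical). The list stays private to the worker thread, so no locking is needed, and only the final result crosses the thread boundary:
procedure TForm1.StartCalculation;
begin
  TThread.CreateAnonymousThread(
    procedure
    var
      List: TStringList;
      x: Integer;
      Sum: Double;
    begin
      List := TStringList.Create; // private to this thread, never shared
      try
        for x := 0 to 999999 do
          List.Add(FloatToStr(Random));
        Sum := 0;
        for x := 0 to List.Count - 1 do
          Sum := Sum + StrToFloat(List[x]);
      finally
        List.Free;
      end;
      TThread.Synchronize(nil,
        procedure
        begin
          ShowMessage(FloatToStr(Sum)); // hand only the result to the main thread
        end);
    end).Start;
end;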
No, it isn't. There is no mechanism inside TStringList that locks, for example, .Add() or .GetStrings().
Unfortunately, there is nothing built in like TThreadList, which is a thread-safe wrapper for TList. But you could easily build that on your own.
Here is a simple example of a synchronized decorator for TStringList, in which I cover the case of Add():
TThreadStringList = class
private
  FStringList: TStringList;
  FCriticalSection: TRtlCriticalSection;
  // ...
public
  function Add(const S: string): Integer;
  // ...
end;

// ...

function TThreadStringList.Add(const S: string): Integer;
begin
  EnterCriticalSection(FCriticalSection);
  try
    Result := FStringList.Add(S); // delegate to the wrapped list
  finally
    LeaveCriticalSection(FCriticalSection);
  end;
end;
It should be easy to apply this to all the other methods you need.
Bear in mind that you have to initialize the critical section before you can use it, and delete it afterwards, as sketched below.
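For completeness, a minimal sketch of that construction and cleanup, assuming a constructor and destructor are added to the declaration above:
constructor TThreadStringList.Create;
begin
  inherited Create;
  FStringList := TStringList.Create;
  InitializeCriticalSection(FCriticalSection);
end;

destructor TThreadStringList.Destroy;
begin
  DeleteCriticalSection(FCriticalSection);
  FStringList.Free;
  inherited;
end;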

System.Move and Array of String

I am trying to move some array elements (of string) to some other position.
When I use System.Move(), FastMM4 reports leaks.
Here is a small snippet to show the problem:
procedure TForm1.Button2Click(Sender: TObject);
type
  TArrayOfStr = array of string;
const
  Count1 = 100;
  String1 = 'some string '; {space at end}
var
  Array1: TArrayOfStr;
  Index1: Integer;
begin
  SetLength(Array1, Count1);
  Index1 := 0;
  while Index1 < Count1 do begin
    Array1[Index1] := String1 + IntToStr(Index1);
    Inc(Index1);
  end;
  System.Move(Array1[0], Array1[3], 2 * SizeOf(string)); {move 2 cells from cell 0 to cell 3}
  ShowMessage(Array1[3]);
end;
It probably has something to do with SizeOf(String) but I don't know what.
Could someone help me make the leaks go away?
Issues
The problem you are having has to do with the reference counting of the string.
Leaks
If there is already a string in the area you're overwriting, those strings will not get freed. These are the leaks you are reporting.
Potential access violations
You copy the string pointers, but you do not increase the reference count of the string. This will lead to access violations if the original strings ever get destroyed due to going out of scope. This is a very subtle bug and will bite you when you least expect it.
Best solution
It's far simpler to just let Delphi do the copying and then all internal bookkeeping will get done properly.
{move 2 cells from cell 0 to cell 3}
System.Move(Array1[0], Array1[3], 2 * SizeOf(string));
//This does not increase the reference count for the strings,
//leading to problems at cleanup.

Array1[3] := Array1[0];
Array1[4] := Array1[1]; //in a loop obviously :-)
//This increases the reference count of the strings.
Note that Delphi does not copy the strings, it just copies the pointers and increases the ref counts as needed. It also frees any strings as needed.
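For illustration, that loop could look something like this (CopyStrings is a hypothetical helper, using the TArrayOfStr type from the question; it assumes the source and target ranges do not overlap):
procedure CopyStrings(var Arr: TArrayOfStr; FromIndex, ToIndex, Count: Integer);
var
  i: Integer;
begin
  for i := 0 to Count - 1 do
    Arr[ToIndex + i] := Arr[FromIndex + i]; // plain assignments: the compiler maintains the reference counts
end;

// for example: CopyStrings(Array1, 0, 3, 2);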
Hack solution
You should manually clear the target area first, using:
for i := start to finish do Array1[i] := '';
The next part of this horrible hack is to manually increase the ref counts of the strings you've copied.
See: http://docwiki.embarcadero.com/RADStudio/Seattle/en/Internal_Data_Formats#Long_String_Types
procedure IncreaseRefCount(const str: string; HowMuch: integer = 1);
var
  Hack: PInteger;
begin
  Hack := pointer(str);
  Dec(Hack, 2); // step back to the string's reference count field
  AtomicIncrement(Hack^, HowMuch);
end;
System.Move(Array1[0], Array1[3], 2 * SizeOf(string));
IncreaseRefCount(Array1[3]);
.... do this for every copied item.
Note that this hack is not completely thread safe if you get the strings from somewhere outside your array.
However, if you are really, really in need of speed, it might be a way to gain a whopping 2% in the performance of the copy.
Warning
Don't use this code to decrease ref counts manually, you'll run into thread-safety issues!
Need for speed
It is unlikely that simply copying a few strings leads to slowness.
There is no way to get out of the clean up issues if you insist on using managed strings.
On the other hand the overhead of the reference counting is really not that bad, so I suspect the reason for the slowness lies elsewhere; somewhere we can't see because you haven't told us your problem.
I suggest you ask a new question explaining what you're trying to do, why and where the slowness is hurting you.
The String type in Delphi is a managed type. Delphi keeps track of references and dereferences and automatically releases the memory allocated to a string when it's no longer being referenced.
The reason for the leak is that you are bypassing Delphi's management of the string type. You are simply overwriting the pointer that references the 4th string in the array. (and for that matter the 5th as well because of 2 * ...) So you now have strings in memory that are no longer referenced. But Delphi doesn't realise this and cannot free the memory.
Solution: write Array1[3] := Array1[0];
Edit:
I've fully answered your question; giving you everything you need to: 1) Understand why you've got the memory leaks. 2) And make the leaks go away.
Yet you're not satisfied... In comments you've explained that you're trying to improve a phantom performance problem you've conjured up via an artificial benchmark.
It's been explained to you that the only reason Move is a little faster than normal string assignment is that string assignment needs to do additional bookkeeping to prevent memory leaks and access violations.
If you insist on using Move, you'll need to do said record keeping yourself. (Johan has even demonstrated how.)
And then you complain that this will slow down your Move "solution".
Seriously, take your pick: If you want to use string, you can have it a little faster with AVs and memory leaks OR a little slower but behaving correctly. There's no magic wand waving that's going to fix it for you.
You could choose to abandon the string type (and all the goodness it gives you). I.e. Use array of char instead. Of course, you'll have to do all the memory management yourself, and see how far multi-byte string copying helps you.
I still maintain that if you ask a new question demonstrating a specific performance problem you're trying to solve, you'll get much better feedback.
E.g. In comments you've mentioned that you're trying to improve the performance of TStringList.
I have previously encountered a performance problem with TStringList in older versions of Delphi. It's generally fine even working with hundreds of thousands of items. However, even with only 10,000 strings, CommaText was noticeably slow; and at 40,000 it was almost unbearable.
The solution didn't involve trying to bastardise reference counting: because that's not the reason it was slow. It was slow because the algorithm was a little naive and performed a huge number of incremental memory allocations. Writing a custom CommaText method solved it.
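Not the method that was actually used, but a minimal sketch of the general idea (JoinWithCommas is a hypothetical name, and it assumes no item needs quoting or escaping): pre-size the result once instead of growing it incrementally.
function JoinWithCommas(List: TStrings): string;
var
  I, Total, P, Len: Integer;
  S: string;
begin
  Result := '';
  if List.Count = 0 then
    Exit;
  // first pass: compute the final length (all items plus the separating commas)
  Total := List.Count - 1;
  for I := 0 to List.Count - 1 do
    Inc(Total, Length(List[I]));
  SetLength(Result, Total);
  // second pass: copy everything into the pre-sized buffer
  P := 1;
  for I := 0 to List.Count - 1 do
  begin
    if I > 0 then
    begin
      Result[P] := ',';
      Inc(P);
    end;
    S := List[I];
    Len := Length(S);
    if Len > 0 then
      Move(S[1], Result[P], Len * SizeOf(Char));
    Inc(P, Len);
  end;
end;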
I'm adding a fundamentally different answer in which I present an option for how you can do something you really and truly should not be doing at all.
Disclaimer: What I describe here is terrible advice and should not be done (but it would work). Feel free to learn from it, but don't use it.
As has already been discussed ad nauseam, your problem is that when you use Move you bypass reference counting.
The solution involves making "internal" reference counting unnecessary by working on the principle that no matter how many times a string is internally referenced, you'll hold only a single count to the string.
And you'll only remove that single count when you're certain you have no more "internal" references to the string.
Unfortunately, having abandoned internal reference counting, the only time you can be sure of this is when you've completely finished with your internal array.
At this point you must first manually clear the internal array.
Then reduce the reference count of all strings you had previously used by 1.
This implies you need a separate "master reference" to each string.
Warning
There are 2 immediate problems:
You need a second list to track the master reference count; wasting memory.
You cannot recover memory used by strings that are no longer referenced internally until you've finished with the internal array entirely because you've abandoned your ability to track internal reference counts.
(Technically this is still a memory leak, albeit controlled. And FastMM won't report it provided you cleanup correctly.)
Without further ado, some sample code:
//NOTE: Deliberate use of a fixed-size array instead of a dynamic one to
//avoid further complications. See Yet Another Warning after the code
//for explanation and resolution.
type
  TStringArray100 = array[0..99] of string;

  TBadStrings = class(TObject)
  private
    FMasterStrings: TStrings;
    FInternalStrings: TStringArray100;
  public
    ...
  end;

constructor TBadStrings.Create;
begin
  FMasterStrings := TStringList.Create;
end;

destructor TBadStrings.Destroy;
begin
  Clear;
  FMasterStrings.Free;
  inherited;
end;

procedure TBadStrings.Clear;
begin
  for I := 0 to 99 do
    Pointer(FInternalStrings[I]) := nil;
  //Should optimise to the equivalent of zero-filling
  //all 100 * SizeOf(String) bytes in one go.
  //NOTE: Only now is it safe to clear the master list.
  FMasterStrings.Clear;
end;

procedure TBadStrings.SetString(APos: Integer; AString: string);
begin
  FMasterStrings.Add(AString); //Hold a single reference count
  //Bypass reference counting to assign the string internally.
  //Equivalent to Move.
  Pointer(FInternalStrings[APos]) := Pointer(AString);
end;

//Usage
begin
  LBadStrings := TBadStrings.Create;
  try
    for I := 0 to 199 do
    begin
      //NOTE: 0 to 99 are set, then all overwritten with 100 to 199.
      //However, strings 0 to 99 are still in memory...
      LBadStrings.SetString(I mod 100, 'String ' + IntToStr(I));
    end;
  finally
    //...until cleanup.
    LBadStrings.Free;
  end;
end;
NOTE: You can add methods to do whatever you like using Move on FInternalStrings. It won't matter that those references aren't being tracked because the master reference can perform correct cleanup at the end. However....
WARNING: Anything you do to FInternalStrings MUST also bypass reference counting otherwise you'll have nasty side-effects. So it should go without saying you need to solidly guard access to the internal array. If client code gets direct access to the array, you can expect accidental 'abuse'.
Yet Another Warning: As commented in code this uses a fixed size array to avoid other problems. Said problems are that if you use a dynamic array, then resizing the array can apply reference counting. Increasing the array size shouldn't be a problem (I recall that being a pointer copy). However, when the size is decreased, elements that are discarded will be dereferenced as necessary. This means you'll have to take the precaution of first pointer nilling these elements before shrinking the array.
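For illustration, that precaution could look roughly like this (FInternalDyn and NewSize are hypothetical names, standing in for a dynamic-array variant of FInternalStrings):
// nil the slots that are about to be discarded so the shrink does not
// dereference strings that only the master list should release
for I := NewSize to Length(FInternalDyn) - 1 do
  Pointer(FInternalDyn[I]) := nil;
SetLength(FInternalDyn, NewSize);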
The above is a way you can bypass string reference counting in a controlled fashion. But let me reiterate what a terrible idea it is.
You should have no trouble concocting an artificial benchmark to demonstrate that it's faster. However, I seriously doubt it will provide any benefit in a real-world environment.
In fact, if it does you probably have a different problem entirely; because why on earth would you be shuffling the strings around so much that time spent there overshadows other aspects of your application?
procedure TamListVar<T>.Insert(aIndex: Integer; aItem: T);
begin
  InitCheck;
  if not IsIndex(aIndex) then
    Add(aItem)
  else
  begin
    SetLength(Items, FCount + 1);
    System.Move(Items[aIndex], Items[aIndex + 1],
      (FCount - aIndex) * SizeOf(Items[aIndex]));
    PPointer(@Items[aIndex])^ := nil;
    Items[aIndex] := aItem;
    inc(FCount);
  end;
end;
Here is the solution I have come up with. It is heavily inspired by System.Move().
As far as I could see, after doing a number of tests, it seems to work OK; no leaks are reported by FastMM4.
Obviously, this is not the hand-optimized asm routine I was after; but given my (lack of) talents in the asm area, this will have to do for now.
I'd be most appreciative if you commented on this, especially to point out any pitfalls, as well as any other (e.g. speed) improvements.
{ACount refers to the number of actual array elements (cells of strings), not their byte count.}
procedure MoveString(const ASource; var ATarget; ACount: NativeInt);
type
  PString = ^string;
const
  SzString = SizeOf(string);
var
  Source1: PString;
  Target1: PString;
begin
  Source1 := PString(@ASource);
  Target1 := PString(@ATarget);
  if Source1 = Target1 then Exit;
  while ACount > 0 do begin
    Target1^ := Source1^;
    //Source1^ := ''; {enable if you want to avoid duplicates}
    Inc(Source1);
    Inc(Target1);
    Dec(ACount);
  end;
end;
procedure TForm1.Button2Click(Sender: TObject);
type
  TArrayOfStr = array of string;
const
  Count1 = 100;
  String1 = 'some string '; {space at end}
var
  Array1: TArrayOfStr;
  Index1: Integer;
begin
  SetLength(Array1, Count1);
  Index1 := 0;
  while Index1 < Count1 do begin
    Array1[Index1] := String1 + IntToStr(Index1);
    Inc(Index1);
  end;
  MoveString(Array1[0], Array1[3], 2); {move 2 cells from cell 0 to cell 3}
  ShowMessage(Array1[3]); {should be 'some string 0'}
  MoveString(Array1[3], Array1[0], 2); {move 2 cells from cell 3 to cell 0}
  ShowMessage(Array1[0]); {should be 'some string 0'}
end;
