I am trying to move some array elements (of string) to some other position.
When I use System.Move(), FastMM4 reports leaks.
Here is a small snippet to show the problem:
procedure TForm1.Button2Click(Sender: TObject);
type
TArrayOfStr = array of string;
const
Count1 = 100;
String1 = 'some string '; {space at end}
var
Array1: TArrayOfStr;
Index1: Integer;
begin
SetLength(Array1, Count1);
Index1 := 0;
while Index1 < Count1 do begin
Array1[Index1] := String1 + IntToStr(Index1);
Inc(Index1);
end;
System.Move(Array1[0], Array1[3], 2 * SizeOf(string)); {move 2 cells from cell 0 to cell 3}
ShowMessage(Array1[3]);
end;
It probably has something to do with SizeOf(String) but I don't know what.
Could someone help me make the leaks go away?
Issues
The problem you are having has to do with the reference counting of the string.
Leaks
If there is already a string in the cells you're overwriting, those strings will not get freed. These are the leaks you are reporting.
Potential access violations
You copy the string pointers, but you do not increase the reference count of the string. This will lead to access violations if the original strings ever get destroyed due to going out of scope. This is a very subtle bug and will bite you when you least expect it.
Best solution
It's far simpler to just let Delphi do the copying and then all internal bookkeeping will get done properly.
{move 2 cells from cell 0 to cell 3}
System.Move(Array1[0], Array1[3], 2 * SizeOf(string));
//This does not increase the reference count of the strings,
//leading to problems at cleanup.

Array1[3] := Array1[0];
Array1[4] := Array1[1]; //in a loop obviously :-)
//This increases the reference count of the strings.
Note that Delphi does not copy the strings, it just copies the pointers and increases the ref counts as needed. It also frees any strings as needed.
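For completeness, the loop hinted at above could look something like the sketch below; the helper name CopyStrings is mine rather than anything from the RTL, and it assumes non-overlapping ranges:

procedure CopyStrings(var Arr: array of string; FromIndex, ToIndex, Count: Integer);
var
  i: Integer;
begin
  for i := 0 to Count - 1 do
    Arr[ToIndex + i] := Arr[FromIndex + i]; //plain assignments; the RTL maintains the ref counts
end;

//CopyStrings(Array1, 0, 3, 2); {same effect as the Move above, without the leak}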
Hack solution
You should manually clear the area first.
Using
for i:= start to finish do Array1[i]:= '';
The next part of this horrible hack is to manually increase the ref counts on the strings you've copied.
See: http://docwiki.embarcadero.com/RADStudio/Seattle/en/Internal_Data_Formats#Long_String_Types
procedure IncreaseRefCount(const str: string; HowMuch: integer = 1);
var
  Hack: PInteger;
begin
  Hack := Pointer(str);
  Dec(Hack, 2); //point at the reference count field in the string header
  AtomicIncrement(Hack^, HowMuch); //atomic in-place increment; do not write the result back with a plain store
end;
System.Move(Array1[0], Array1[3], 2 * SizeOf(string));
IncreaseRefCount(Array1[3]);
.... do this for every copied item.
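Put together, the hack amounts to something like the sketch below. MoveStringsUnsafe is a made-up name, it relies on the IncreaseRefCount routine above, it assumes the two ranges do not overlap, and the empty-string check avoids touching the "ref count" of a nil pointer:

procedure MoveStringsUnsafe(var Arr: array of string; FromIndex, ToIndex, Count: Integer);
var
  i: Integer;
begin
  for i := ToIndex to ToIndex + Count - 1 do
    Arr[i] := '';                         //release whatever is about to be overwritten
  System.Move(Arr[FromIndex], Arr[ToIndex], Count * SizeOf(string));
  for i := ToIndex to ToIndex + Count - 1 do
    if Arr[i] <> '' then
      IncreaseRefCount(Arr[i]);           //account for the extra pointer copies
end;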
Note that this hack is not completely thread safe if you get the strings from somewhere outside your array.
However, if you are really, really in need of speed it might be a solution to gain a whopping 2% in the performance of the copy.
Warning
Don't use this code to decrease ref counts manually, you'll run into thread-safety issues!
Need for speed
It is unlikely that simply copying a few strings leads to slowness.
There is no way to get out of the clean up issues if you insist on using managed strings.
On the other hand the overhead of the reference counting is really not that bad, so I suspect the reason for the slowness lies elsewhere; somewhere we can't see because you haven't told us your problem.
I suggest you ask a new question explaining what you're trying to do, why and where the slowness is hurting you.
The String type in Delphi is a managed type. Delphi keeps track of references and automatically releases the memory allocated to a string when it is no longer referenced.
The reason for the leak is that you are bypassing Delphi's management of the string type. You are simply overwriting the pointer that references the 4th string in the array (and, because of the 2 *, the 5th as well). So you now have strings in memory that are no longer referenced, but Delphi doesn't realise this and cannot free the memory.
Solution: write Array1[3] := Array1[0];
Edit:
I've fully answered your question; giving you everything you need to: 1) Understand why you've got the memory leaks. 2) And make the leaks go away.
Yet you're not satisfied... In comments you've explained that you're trying to improve a phantom performance problem you've conjured up via an artificial benchmark.
It's been explained to you that the only reason Move is a little faster than normal string assignment is that: string assignment needs to do additional record keeping to prevent memory leaks and access violations.
If you insist on using Move, you'll need to do said record keeping yourself. (Johan has even demonstrated how.)
And then you complain that this will slow down your Move "solution".
Seriously, take your pick: if you want to use string, you can have it a little faster with AVs and memory leaks, OR a little slower but behaving correctly. There's no magic wand waving that's going to fix it for you.
You could choose to abandon the string type (and all the goodness it gives you). I.e. Use array of char instead. Of course, you'll have to do all the memory management yourself, and see how far multi-byte string copying helps you.
I still maintain that if you ask a new question demonstrating a specific performance problem you're trying to solve, you'll get much better feedback.
E.g. In comments you've mentioned that you're trying to improve the performance of TStringList.
I have previously encountered a performance problem with TStringList in older versions of Delphi. It's generally fine even working with hundreds of thousands of items. However, even with only 10,000 strings, CommaText was noticeably slow; and at 40,000 it was almost unbearable.
The solution didn't involve trying to bastardise reference counting: because that's not the reason it was slow. It was slow because the algorithm was a little naive and performed a huge number of incremental memory allocations. Writing a custom CommaText method solved it.
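To illustrate the idea (this is a hedged sketch, not the actual code from back then): compute the final length once and fill a pre-sized result, instead of growing the string item by item. It operates on a TStrings from the Classes unit, ignores the quoting rules CommaText applies, and assumes a Unicode-era string:

function JoinStrings(Items: TStrings; const Sep: string): string;
var
  i, Len, P: Integer;

  procedure Append(const S: string);
  begin
    if S = '' then Exit;
    Move(Pointer(S)^, Result[P], Length(S) * SizeOf(Char));
    Inc(P, Length(S));
  end;

begin
  Result := '';
  if Items.Count = 0 then Exit;
  Len := Length(Sep) * (Items.Count - 1);
  for i := 0 to Items.Count - 1 do
    Inc(Len, Length(Items[i]));
  SetLength(Result, Len);                 //one allocation instead of thousands of incremental ones
  P := 1;
  Append(Items[0]);
  for i := 1 to Items.Count - 1 do
  begin
    Append(Sep);
    Append(Items[i]);
  end;
end;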
I'm adding a fundamentally different answer in which I present an option how you can do something you really and truly should not be doing at all.
Disclaimer: What I describe here is terrible advice and should not be done (but it would work). Feel free to learn from it, but don't use it.
As has already been discussed ad nauseam, your problem is that when you use Move you bypass reference counting.
The solution involves making "internal" reference counting unnecessary by working on the principle that no matter how many times a string is referenced internally, you'll hold only a single count to the string.
And you'll only remove that single count when you're certain you have no more "internal" references to the string.
Unfortunately, having abandoned internal reference counting, the only time you can be sure of this is when you've completely finished with your internal array.
At this point you must first manually clear the internal array.
Then reduce the reference count of all strings you had previously used by 1.
This implies you need a separate "master reference" to each string.
Warning
There are 2 immediate problems:
You need a second list to track the master reference count; wasting memory.
You cannot recover memory used by strings that are no longer referenced internally until you've finished with the internal array entirely because you've abandoned your ability to track internal reference counts.
(Technically this is still a memory leak, albeit controlled. And FastMM won't report it provided you clean up correctly.)
Without further ado, some sample code:
//NOTE: Deliberate use of fixed size array instead of dynamic to
//avoid further complications. See Yet Another Warning after code
//for explanation and resolution.
TStringArray100 = array[0..99] of string;
TBadStrings = class(TObject)
private
FMasterStrings: TStrings;
FInternalStrings: TStringArray100;
public
...
end;
constructor TBadStrings.Create()
begin
FMasterStrings := TStringList.Create;
end;
destructor TBadStrings.Destroy;
begin
Clear;
FMasterStrings.Free;
inherited;
end;
procedure TBadStrings.Clear;
var
  I: Integer;
begin
  for I := 0 to 99 do
    Pointer(FInternalStrings[I]) := nil;
  //Roughly equivalent to
  //FillChar(FInternalStrings, SizeOf(FInternalStrings), 0);
  //NOTE: Only now is it safe to clear the master list.
  FMasterStrings.Clear;
end;
procedure TBadStrings.SetString(APos: Integer; AString: string);
begin
FMasterStrings.Add(AString); //Hold single reference count
//Bypass reference counting to assign the string internally
//Equivalent to Move
Pointer(FInternalStrings[APos]) := Pointer(AString);
end;
//Usage
begin
LBadStrings := TBadStrings.Create;
try
for I := 0 to 199 do
begin
//NOTE 0 to 99 are set, then all overwritten with 100 to 199
//However strings 0 to 99 are still in memory...
LBadStrings.SetString(I mod 100, 'String ' + IntToStr(I));
end;
finally
//...until cleanup.
LBadStrings.Free;
end;
end;
NOTE: You can add methods to do whatever you like using Move on FInternalStrings. It won't matter that those references aren't being tracked because the master reference can perform correct cleanup at the end. However....
WARNING: Anything you do to FInternalStrings MUST also bypass reference counting otherwise you'll have nasty side-effects. So it should go without saying you need to solidly guard access to the internal array. If client code gets direct access to the array, you can expect accidental 'abuse'.
Yet Another Warning: As commented in code this uses a fixed size array to avoid other problems. Said problems are that if you use a dynamic array, then resizing the array can apply reference counting. Increasing the array size shouldn't be a problem (I recall that being a pointer copy). However, when the size is decreased, elements that are discarded will be dereferenced as necessary. This means you'll have to take the precaution of first pointer nilling these elements before shrinking the array.
The above is a way you can bypass string reference counting in a controlled fashion. But let me reiterate what a terrible idea it is.
You should have no trouble concocting an artificial benchmark to demonstrate that it's faster. However, I seriously doubt it will provide any benefit in a real-world environment.
In fact, if it does, you probably have a different problem entirely; because why on earth would you be shuffling the strings around so much that time spent there overshadows other aspects of your application?
procedure TamListVar<T>.Insert(aIndex: Integer; aItem: T);
begin
  InitCheck;
  if not IsIndex(aIndex) then
    Add(aItem)
  else
  begin
    SetLength(Items, FCount + 1);
    System.Move(Items[aIndex], Items[aIndex + 1],
      (FCount - aIndex) * SizeOf(Items[aIndex]));
    PPointer(@Items[aIndex])^ := nil;
    Items[aIndex] := aItem;
    Inc(FCount);
  end;
end;
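For symmetry with the Insert above, a Delete in the same style would need the reverse bookkeeping. A hedged sketch, assuming T is a pointer-sized managed type such as string (which the PPointer trick above already assumes):

procedure TamListVar<T>.Delete(aIndex: Integer);
begin
  if not IsIndex(aIndex) then Exit;
  Items[aIndex] := Default(T);  //properly release the element being removed
  System.Move(Items[aIndex + 1], Items[aIndex],
    (FCount - aIndex - 1) * SizeOf(Items[aIndex]));
  //the last slot now duplicates its neighbour's pointer; clear it without finalizing
  PPointer(@Items[FCount - 1])^ := nil;
  Dec(FCount);
end;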
Here is the solution I have come up with. It is heavily inspired by System.Move().
As far as I could see, after doing a number of tests, it seems to work OK --no leaks reported by FastMM4.
Obviously, this is not the hand-optimized asm routine I was after; but given my (lack of) talents in the asm area, this has to do for now.
I'd be most appreciative if you commented on this --especially to point out any pitfalls, as well as any other (e.g. speed) improvements.
{ACount refers to the number of actual array elements (cells of strings), not their byte count.}
procedure MoveString(const ASource; var ATarget; ACount: NativeInt);
type
PString = ^string;
const
SzString = SizeOf(string);
var
Source1: PString;
Target1: PString;
begin
Source1 := PString(@ASource);
Target1 := PString(@ATarget);
if Source1 = Target1 then Exit;
while ACount > 0 do begin
Target1^ := Source1^;
//Source1^ := ''; {enable if you want to avoid duplicates}
Inc(Source1);
Inc(Target1);
Dec(ACount);
end;
end;
procedure TForm1.Button2Click(Sender: TObject);
type
TArrayOfStr = array of string;
const
Count1 = 100;
String1 = 'some string '; {space at end}
var
Array1: TArrayOfStr;
Index1: Integer;
begin
SetLength(Array1, Count1);
Index1 := 0;
while Index1 < Count1 do begin
Array1[Index1] := String1 + IntToStr(Index1);
Inc(Index1);
end;
MoveString(Array1[0], Array1[3], 2); {move 2 cells from cell 0 to cell 3}
ShowMessage(Array1[3]); {should be 'some string 0'}
MoveString(Array1[3], Array1[0], 2); {move 2 cells from cell 3 to cell 0}
ShowMessage(Array1[0]); {should be 'some string 0'}
end;
Related
In QuickSort, a lot of time is spent on the swap temp:=var[i]; var[i]:=var[j]; var[j]:=temp. When the vars are integer I time 140 msec for a large random array. When the vars are string, the time is 750 msec. It appears to me that much of the difference is caused by the need to update the reference counts in all three assignments.
But is this necessary? After all, the reference counts for var[i] and var[j] will be the same before and after these three assignments. Would the following code corrupt things? (not that it solves a speed problem, but out of interest):
// P: PString;
move(values[i], P, SizeOf(PString));
move(values[j], values[i], SizeOf(PString));
move(P, values[j], SizeOf(PString));
There is no temp string variable. Only the two pointers to the strings are interchanged. And if this is OK, is there a Delphi function to swap 2 pointers?
What you are proposing is a well known and valid optimisation. Rather than calling the Move function it is better to perform direct assignments using casts to avoid reference counting code being generated.
var
  temp: Pointer;
....
temp := Pointer(values[i]);
Pointer(values[i]) := Pointer(values[j]);
Pointer(values[j]) := temp;
In order for this to work you need to be confident that no exceptions will be raised during the swap process. Simple assignments of valid memory will not lead to exceptions, so this concern can be readily dismissed.
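As for a ready-made routine to swap two pointers: I'm not aware of one in the RTL, but wrapping the idea in a tiny helper (the name SwapStrings is mine) keeps the cast noise out of the sort loop:

procedure SwapStrings(var A, B: string); inline;
var
  Temp: Pointer;
begin
  Temp := Pointer(A);
  Pointer(A) := Pointer(B);
  Pointer(B) := Temp;
end;

//in the QuickSort inner loop:
//SwapStrings(values[i], values[j]);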
I'm trying to understand memory issues in a Delphi server application: originally I suspected an outright leak, but now believe we're seeing memory hanging around longer than it should due to the compiler's use of a hidden temporary when dynamically concatenating strings with +, causing painful free-space memory fragmentation.
Background:
This is a suite of 32-bit server applications on Windows, Delphi version is quite old, I think it's 7 but is for sure pre-Unicode, and uses the Nexus 3 memory manager where I've written a DLL to hook all the allocate/free calls (and gigabytes of memory traces).
I have application source code but not the compiler; I am not the developer of this app (or even a Delphi dev) but have created extensive custom tools to monitor, trace, and analyze memory. I've been picking the .EXE apart in the IDA Pro disassembler.
Some sample code:
I've tried to whittle this down to the bare minimum case; this code is not intended to compile:
procedure TaskThread.RunWorkLoop
begin
while not Terminated do
begin
tsk := WaitForWorkToDo(); // this could sit for minutes at a time
SetThreadName('Working on ' + tsk.Name);
tsk.Run(); // THIS COULD TAKE A LONG TIME
SetThreadName('Idle');
end
end;
SetThreadName() takes a const string parameter and hangs onto it so that other parts of the system know what this thread is doing.
My disassembly of the code shows that the compiler has allocated a hidden local temporary variable to receive the concatenation of the "Working on" and task name parts, and this is what's passed to SetThreadName, where it also retains a handle to the string.
While the task is running - and this could be 20 minutes - I believe there are two handles to the string. One is held within SetThreadName, the other is in the hidden temporary.
This is all fine and good.
Then, when the task is over and the thread name is set to 'Idle', SetThreadName() releases the original string and assigns the literal Idle.
BUT: I believe the hidden local temporary still retains a handle to that string, with a refcount=1, so it's going to take up space until either the procedure returns, or the next loop comes around to overwrite that hidden local temporary, releasing the old value.
And during this time, it's not accessible to the program, can't be explicitly released, and is serving no useful purpose but is still consuming memory.
For most procedures this doesn't matter because they start and finish relatively close to each other, so everything is released all at once, but in a looping server app, these can hang around much longer. This is causing us memory fragmentation.
It gets worse
In the actual application, it's more along the lines of:
SetThreadName(tsk.Name + '-' + FormatDateTime('mm/dd/yy hh:nn:ss', Now));
In this case, there are two hidden temporaries: one for the result of FormatDateTime, and the other for the overall concatenation result, in effect running as:
tmp1: String;
tmp2: String;
...
tmp1 := FormatDateTime('...');
tmp2 := tsk.Name + '-' + tmp1;
SetThreadName(tmp2);
I am certain I'm seeing the string result of FormatDateTime hanging around in memory long after the task has completed, and I've seen it literally be a single ~30-byte allocation sitting in the middle of a 1 megabyte memory section, surrounded by free space; Nexus3MM uses VirtualAlloc to allocate larger OS-level chunks.
That single 30-byte string will be released eventually, either on the next loop or when the procedure exits, so I'm certain it's not a leak, but I would rather that single 30-byte allocation sitting in the middle of a lonely one megabyte section actually go away when we're done with it so the whole section could be released to the OS.
But if it sticks around long enough, the memory manager is going to allocate something else from it, and this hole in memory gets more permanent.
We have very detailed busy/free memory maps and are sure that this fragmentation is killing us (this is certainly not the only cause).
My Questions:
1) Am I understanding this correctly?
2) If so, is the only workaround to elide the hidden temporaries by using explicit ones, where we do things like:
tmp1: String;
tmp2: String;
...
tmp1 := FormatDateTime('...');
tmp2 := tsk.Name + '-' + tmp1;
SetThreadName(tmp2);
tmp1 := ''; // release the date/time string
tmp2 := ''; // release the overall thread name string
I'm pretty confident I have to do this with the FormatDateTime intermediate result (I've seen it specifically), but am not sure about the overall concatenation.
This just feels wrong.
EDIT: Just an update a few weeks later. We've rewritten the central loop to use explicit temporaries, and it's actually made a noticeable (though not major) difference in memory fragmentation of some key server processes. We still have other things to look into, but it's clear to me that this was a road worth going down.
From my experience, it does work exactly like that. I'm not sure if this is by contract or by implementation. I guess with the recent addition of inline variable declaration, this might be slightly different now. But in pre-unicode Delphi, I believe it works exactly as you described.
All routines using variables (implicit or explicit) of a managed type, or a record containing one, will generate an implicit try/finally block in the routine, with the finally part clearing the reference. What your code really does is:
procedure TaskThread.RunWorkLoop
var
sImplicit : string;
begin
sImplicit := '';
try
while not Terminated do
begin
tsk := WaitForWorkToDo(); // this could sit for minutes at a time
sImplicit := 'Working on ' + tsk.Name;
SetThreadName(sImplicit);
tsk.Run(); // THIS COULD TAKE A LONG TIME
SetThreadName('Idle');
end;
finally
sImplicit := '';
end;
end;
In your situation, since you never exit the routine where the implicit variable is used, it does remain in memory.
As for a solution, I believe what you propose would work. But you could also simply move the code to another method (or a local procedure).
procedure TaskThread.RunWorkLoop
procedure JustKeepWorking;
begin
tsk := WaitForWorkToDo(); // this could sit for minutes at a time
SetThreadName('Working on ' + tsk.Name);
tsk.Run(); // THIS COULD TAKE A LONG TIME
SetThreadName('Idle');
end;
begin
while not Terminated do
begin
JustKeepWorking;
end
end;
Also, you might want to check this question for additional insight.
In my application several threads increment some counter and only one (the main thread) reads this value. As far as I know, reading a 32-bit value is thread-safe if it is aligned on a double-word boundary, so I use code like this:
{$A8}
TMyStat = class
private
FCounter: Integer;
public
procedure IncCounter;
property Counter: Integer read FCounter;
...
procedure TMyStat.IncCounter;
begin
InterlockedIncrement(FCounter);
end;
But I'm not sure that it's safe to mix Interlocked functions and direct access to value.
Should I use InterlockedCompareExchange instead?
function TMyStat.GetCounter: Integer;
begin
Result := InterlockedCompareExchange(FCounter, 0, 0);
end;
You can use a normal read. As FCounter is 4-aligned, reading and writing is guaranteed to be atomic.
[This holds for Intel platforms, though. I really don't know how ARM behaves. I would guess that the behaviour is the same (reading of aligned value is atomic).]
Actually, if you only increment a counter and read it, you don't even need InterlockedIncrement. When you read the value you'll always get either the pre-increment or post-increment value. There's no way you get a mix of both.
Reading an aligned 32-bit value is atomic (i.e. all 32 bits are guaranteed to be consistent). But the read is not synchronized. You may not read the "current" value; but instead a value from the caches.
For example:
FCounter := 1; //number of threads running
FResult := 0; //the answer to everything
And then your thread:
procedure ThreadProc;
begin
FResult := 42; //set the answer before we indicate we're done
InterlockedDecrement(FCounter);
end;
And then your code checks that the thread is done:
if (FCounter <= 0) then
begin
//Thread is done; read the answer
ShowMessage('The answer is: '+IntToStr(FResult));
Exit;
end;
The answer is: 0
Even though your flag said the thread set the result, you read the result value out of your local cache.
Even though the reads of FCounter and FResult are atomic, they aren't synchronized.
If you are only using FCounter then you are fine. But as soon as you use anything else and expect them to be consistent, you also need something to enforce synchronization.
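For example, if the counter is really acting as a "done" flag for other data, an event (or any other proper synchronisation primitive) expresses that more safely. A minimal sketch; FDone and FResult are illustrative names, not from the question:

uses
  SyncObjs;

var
  FDone: TEvent;    //created elsewhere as TEvent.Create(nil, True, False, '')
  FResult: Integer;

procedure ThreadProc;
begin
  FResult := 42;    //write the payload first
  FDone.SetEvent;   //then signal; the waiter reads FResult only after this
end;

//main thread:
//if FDone.WaitFor(0) = wrSignaled then
//  ShowMessage('The answer is: ' + IntToStr(FResult));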
The strict answer to your question is that reading a 32-bit aligned value is already an atomic operation.
But it's entirely possible that people coming here to read this question might forget that the read being atomic is a small part of your worries.
I have a thread class TValidateInvoiceThread:
type
TValidateInvoiceThread = class(TThread)
private
FData: TValidationData;
FInvoice: TInvoice; // Do NOT free
FPreProcessing: Boolean;
procedure ValidateInvoice;
protected
procedure Execute; override;
public
constructor Create(const objData: TValidationData; const bPreProcessing: Boolean);
destructor Destroy; override;
end;
constructor TValidateInvoiceThread.Create(const objData: TValidationData;
const bPreProcessing: Boolean);
var
objValidatorCache: TValidationCache;
begin
inherited Create(False);
FData := objData;
objValidatorCache := FData.Caches.Items['TInvAccountValidator'];
end;
destructor TValidateInvoiceThread.Destroy;
begin
FreeAndNil(FData);
inherited;
end;
procedure TValidateInvoiceThread.Execute;
begin
inherited;
ValidateInvoice;
end;
procedure TValidateInvoiceThread.ValidateInvoice;
var
objValidatorCache: TValidationCache;
begin
objValidatorCache := FData.Caches.Items['TInvAccountValidator'];
end;
I create this thread in another class
procedure TInvValidators.ValidateInvoiceUsingThread(
const nThreadIndex: Integer;
const objValidatorCaches: TObjectDictionary<String, TValidationCache>;
const nInvoiceIndex: Integer; const bUseThread, bPreProcessing: Boolean);
begin
objValidationData := TValidationData.Create(FConnection, FAllInvoices, FAllInvoices[nInvoiceIndex], bUseThread);
objValidationData.Caches := objValidatorCaches;
objThread := TValidateInvoiceThread.Create(objValidationData, bPreProcessing);
FThreadArray[nThreadIndex] := objThread;
FHandleArray[nThreadIndex]:= FThreadArray[nThreadIndex].Handle;
end;
Then I execute it
rWait := WaitForMultipleObjects(FThreadsRunning, @FHandleArray, True, 100);
Note I have removed some code out of here to try to keep it a bit simpler to follow
The problem is that my Dictionary is becoming corrupt
If I put a breakpoint in the constructor all is fine
However, in the first line of the Execute method, the dictionary is now corrupt.
The dictionary itself is a global variable to the class
Do I need to do anything special to allow me to use Dictionaries inside a thread?
I have also had the same problem with a String List
Edit - additional information as requested
TInvValidators contains my dictionary
TInvValidators = class(TSTCListBase)
private
FThreadArray : Array[1..nMaxThreads] of TValidateInvoiceThread;
FHandleArray : Array[1..nMaxThreads] of THandle;
FThreadsRunning: Integer; // total number of supposedly running threads
FValidationList: TObjectDictionary<String, TObject>;
end;
procedure TInvValidators.Validate(
const Phase: TValidationPhase;
const objInvoices: TInvoices;
const ReValidate: TRevalidateInvoices;
const IDs: TList<Integer>;
const objConnection: TSTCConnection;
const ValidatorCount: Integer);
var
InvoiceIndex: Integer;
i : Integer;
rWait : Cardinal;
Flags: DWORD; // dummy variable used in a call to find out if a thread handle is valid
nThreadIndex: Integer;
procedure ValidateInvoiceRange(const nStartInvoiceID, nEndInvoiceID: Integer);
var
InvoiceIndex: Integer;
I: Integer;
begin
nThreadIndex := 1;
for InvoiceIndex := nStartInvoiceID - 1 to nEndInvoiceID - 1 do
begin
if InvoiceIndex >= objInvoices.Count then
Break;
objInvoice := objInvoices[InvoiceIndex];
ValidateInvoiceUsingThread(nThreadIndex, FValidatorCaches, InvoiceIndex, bUseThread, False);
Inc(nThreadIndex);
if nThreadIndex > nMaxThreads then
Break;
end;
FThreadsRunning := nMaxThreads;
repeat
rWait := WaitForMultipleObjects(FThreadsRunning, @FHandleArray, True, 100);
case rWait of
// one of the threads satisfied the wait, remove its handle
WAIT_OBJECT_0..WAIT_OBJECT_0 + nMaxThreads - 1: RemoveHandle(rWait + 1);
// at least one handle has become invalid outside the wait call,
// or more than one thread finished during the previous wait,
// find and remove them
WAIT_FAILED:
begin
if GetLastError = ERROR_INVALID_HANDLE then
begin
for i := FThreadsRunning downto 1 do
if not GetHandleInformation(FHandleArray[i], Flags) then // is handle valid?
RemoveHandle(i);
end
else
// the wait failed because of something other than an invalid handle
RaiseLastOSError;
end;
// all remaining threads continue running, process messages and loop.
// don't process messages if the wait returned WAIT_FAILED since we didn't wait at all
// likewise WAIT_OBJECT_... may return soon
WAIT_TIMEOUT: Application.ProcessMessages;
end;
until FThreadsRunning = 0; // no more valid thread handles, we're done
end;
begin
try
FValidatorCaches := TObjectDictionary<String, TValidationCache>.Create([doOwnsValues]);
for nValidatorIndex := 0 to Count - 1 do
begin
objValidator := Items[nValidatorIndex];
objCache := TValidationCache.Create(objInvoices);
FValidatorCaches.Add(objValidator.ClassName, objCache);
objValidator.PrepareCache(objCache, FConnection, objInvoices[0].UtilityType);
end;
nStart := 1;
nEnd := nMaxThreads;
while nStart <= objInvoices.Count do
begin
ValidateInvoiceRange(nStart, nEnd);
Inc(nStart, nMaxThreads);
Inc(nEnd, nMaxThreads);
end;
finally
FreeAndNil(FMeterDetailCache);
end;
end;
If I remove the repeat until and leave just WaitForMultipleObjects I still get lots of errors
You can see here that I am processing the invoices in chunks of no more than nMaxThreads (10)
When I reinstated the repeat until loop it worked on my VM but then access violated on my host machine (which has more memory available)
Paul
Before I offer guidance on how to resolve your problem, I'm going to give you a very important tip.
First ensure your code works single-threaded, before trying to get a multi-threaded implementation working. The point is that multi-threaded code adds a whole new layer of complexity. Until your code works correctly in a single thread, it has no chance of doing so in multiple threads. And the extra layer of complexity makes it extremely difficult to fix.
You might believe you've got a working single-threaded solution, but I'm seeing errors in your code that imply you still have a lot of resource management bugs. Here's one example with relevant lines only, and comments to explain the mistakes:
begin
try //try/finally is used for resource protection, in order to protect a
//resource correctly, it should be allocated **before** the try.
FValidatorCaches := TObjectDictionary<String, TValidationCache>.Create([doOwnsValues]);
finally
//However, in the finally you're destroying something completely
//different. In fact, there are no other references to FMeterDetailCache
//anywhere else in the code you've shown. This strongly implies an
//error in your resource protection.
FreeAndNil(FMeterDetailCache);
end;
end;
Reasons for not being able to use the dictionary
You say that: "in the first line of the Execute method, the dictionary is now corrupt".
For a start, I'm fairly certain that your dictionary isn't really "corrupt". The word "corrupt" implies that it's there, but its internal data is invalid resulting in inconsistent behaviour. It's far more likely that by the time the Execute method wants to use the dictionary, it has already been destroyed. So your thread is basically pointing to an area of memory that used to have a dictionary, but it's no longer there at all. (I.e. not "corrupt")
SIDE NOTE It is possible for your dictionary to truly become corrupt because you have multiple threads sharing the same dictionary. If different threads cause any internal changes to the dictionary at the same time, it could very easily become corrupt. But, assuming your threads are all treating the dictionary as read-only, you would need a memory overwrite to corrupt it.
So let's focus on what might cause your dictionary to be destroyed before the thread gets to use it. NOTE I can't see anything in the code provided, but there are 2 likely possibilities:
Your main thread destroys the dictionary before the child thread gets to use it.
One of your child threads destroys the dictionary when it is itself destroyed, leaving all the other threads unable to use it.
In the first case, this would happen as follows:
Main Thread: ......C......D........
Child Thread ---------S......
. = code being executed
C = child thread created
- = child thread exists, but isn't doing anything yet
S = OS has started the child thread
D = main thread destroys dictionary
The point of the above is that it's easy to forget that the main thread can reach a point where it decides to destroy the dictionary even before the child thread starts running.
As for the second possibility, this depends on what is happening inside the destructor of TValidationData. Since you haven't shown that code, only you know the answer to that.
Debugging to pinpoint the problem
Assuming the dictionary is being destroyed too soon, a little debugging can quickly pinpoint where/why the dictionary is being destroyed. From your question, it seems you've already done some debugging, so I'm assuming you'll have no trouble following these steps:
Put a breakpoint on the first line of the dictionary's destructor.
Run your code.
If you reach Execute before reaching the dictionary's destructor, then the thread should still be able to use the dictionary.
If you reach the dictionary's destructor before reaching Execute, then you simply need to examine the sequence of calls leading to the object's destruction.
Debugging in case of a memory overwrite
Keeping an open mind about the possibility of a memory overwrite... This is a little trickier to debug. But provided you can consistently reproduce the problem it should be possible to debug.
Put a breakpoint in the thread's destructor.
Run the app
When you reach the above breakpoint, find the address of the dictionary by pressing Ctrl + F7 and evaluating @FData.Caches.
Now add a Data Breakpoint (use the drop-down from the Breakpoints window) for the address and the size of the dictionary.
Continue running; the app will pause when the data changes.
Again, examine the call-stack to determine the cause.
Wrapping up
You have a number of questions and statements that imply misunderstandings about sharing data (dictionary/string list) between threads. I'll try cover those here.
There is nothing special required to use a Dictionary/StringList in a thread. It's basically the same as passing it to any other object. Just make sure the Dictionary/StringList isn't destroyed prematurely.
That said, whenever you share data, you need to be aware of the possibility of "race conditions". I.e. one thread attempts to access the shared data at the same time another thread is busy modifying it. If no threads are modifying the data, then there's no need for concern. But as soon as any thread is able to modify the data, access needs to be made "thread-safe". (There are a number of ways to do this; a minimal sketch follows these points, and you can also search for existing questions on SO.)
You mention: "The dictionary itself is a global variable to the class". Your terminology is not correct. A global variable is something declared at the unit level and is accessible anywhere. It's enough to simply say the dictionary is a member of or field of the class. When dealing with "globals", there are significantly different things to worry about; so best to avoid any confusion.
You may want to rethink how you initialise your threads. There are a few reasons why some entries of FHandleArray might not be initialised. Are you ok with this?
You mention AV on a machine that has more memory available. NOTE: Amount of memory is not relevant. And if you run in 32-bit mode you wouldn't have access to more than 4 GB in any case.
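As promised above, a minimal sketch of serialising access to the shared dictionary. The locking is mine, not from your code; FValidatorCaches is the field from the question, and TMonitor is just one of several options (a TCriticalSection works the same way):

function TInvValidators.FindCache(const AName: string): TValidationCache;
begin
  TMonitor.Enter(FValidatorCaches);
  try
    if not FValidatorCaches.TryGetValue(AName, Result) then
      Result := nil;
  finally
    TMonitor.Exit(FValidatorCaches);
  end;
end;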
Finally, to make a special mention:
Using multiple threads to perform dictionary lookups is extremely inefficient. A dictionary lookup is an O(1) operation. The overhead of threading will almost certainly slow you down unless you intend doing a significant amount of processing in addition to the dictionary lookup.
PS - (not so big) mistake
I noticed the following in your code, and it's a mistake.
procedure TValidateInvoiceThread.Execute;
begin
inherited;
ValidateInvoice;
end;
The TThread.Execute method is abstract, meaning there's no implementation. Attempting to call an abstract method will trigger an EAbstractError. Luckily as LU RD points out, the compiler is able to protect you by not compiling the line in. Even so, it would be more accurate to not call inherited here.
NOTE: In general, overridden methods don't always need to call inherited. You should be explicitly aware of what inherited is doing for you and decide whether to call it on a case-by-case basis. Don't go into auto-pilot mode of calling inherited just because you're overriding a virtual method.
Please pardon my slightly humorous title. I use two different definitions of the word 'safe' in it (obviously).
I am rather new to threading (well, I have used threading for many years, but only very simple forms of it). Now I am faced with the challenge of writing parallel implementations of some algorithms, and the threads need to work on the same data. Consider the following newbie mistake:
const
N = 2;
var
value: integer = 0;
function ThreadFunc(Parameter: Pointer): integer;
var
i: Integer;
begin
for i := 1 to 10000000 do
inc(value);
result := 0;
end;
procedure TForm1.FormCreate(Sender: TObject);
var
threads: array[0..N - 1] of THandle;
i: Integer;
dummy: cardinal;
begin
for i := 0 to N - 1 do
threads[i] := BeginThread(nil, 0, @ThreadFunc, nil, 0, dummy);
if WaitForMultipleObjects(N, @threads[0], true, INFINITE) = WAIT_FAILED then
RaiseLastOSError;
ShowMessage(IntToStr(value));
end;
A beginner might expect the code above to display the message 20000000. Indeed, first value is equal to 0, and then we inc it 20000000 times. However, since the inc procedure is not 'atomic', the two threads will conflict (I guess that inc does three things: it reads, it increments, and it saves), and so a lot of the incs will be effectively 'lost'. A typical value I get from the code above is 10030423.
The simplest workaround is to use InterlockedIncrement instead of Inc (which will be much slower in this silly example, but that's not the point). Another workaround is to place the inc inside a critical section (yes, that will also be very slow in this silly example).
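(That workaround amounts to changing the loop body as sketched below; the increment is now an atomic read-modify-write, so no updates are lost:)

for i := 1 to 10000000 do
  InterlockedIncrement(value); //atomic increment of the shared counter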
Now, in most real algorithms, conflicts are not this common. In fact, they might be very uncommon. One of my algorithms creates DLA fractals, and one of the variables that I inc every now and then is the number of adsorbed particles. Conflicts here are very rare, and, more importantly, I really don't care if the variable sums up to 20000000, 20000008, 20000319, or 19999496. Thus, it is tempting not to use InterlockedIncrement or critical sections, since they just bloat the code and makes it (marginally) slower to no (as far as I can see) benefit.
However, my question is: can there be more severe consequences of conflicts than a slightly 'incorrect' value of the incrementing variable? Can the program crash, for instance?
Admittedly, this question might seem silly, for, after all, the cost of using InterlockedIncrement instead of inc is rather low (in many cases, but not all!), and so it is (perhaps) stupid not to play safe. But I also feel that it would be good to know how this really works on a theoretical level, so I still think that this question is very interesting.
Your program won't ever crash due to a race on the increment of an integer that is only used as a count. All that can go wrong is that you don't get the correct answer. Obviously, if you were using the integer as an index into an array, or perhaps as a pointer, then you could have problems.
Unless you are incrementing this value incredibly frequently, it's hard to imagine that an interlocked increment would be expensive enough for you to notice the performance difference.
What's more, the most efficient approach is to get each thread to maintain its own private count, then sum all the individual thread counts when you join the threads at the end of the calculation. That way you get the best of both worlds: no contention on the incrementing, and the correct answer. Of course, you need to take measures to ensure that you don't get caught out by false sharing.
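A hedged sketch of that per-thread counter idea, reworking the question's example: each thread gets its own padded slot (64 bytes is the usual x86 cache-line size, stated here as an assumption) and its slot index is passed as the thread parameter instead of nil; the totals are summed after the wait:

type
  TPaddedCounter = record
    Value: Integer;
    Padding: array[0..59] of Byte; //pad to 64 bytes so two slots never share a cache line
  end;

var
  counts: array[0..N - 1] of TPaddedCounter;

function CountingThread(Parameter: Pointer): integer;
var
  i: Integer;
begin
  for i := 1 to 10000000 do
    Inc(counts[NativeInt(Parameter)].Value); //only this thread ever touches this slot
  result := 0;
end;

//in FormCreate:
//  threads[i] := BeginThread(nil, 0, @CountingThread, Pointer(NativeInt(i)), 0, dummy);
//after WaitForMultipleObjects:
//  total := 0;
//  for i := 0 to N - 1 do
//    Inc(total, counts[i].Value);
//  ShowMessage(IntToStr(total)); //now reliably 20000000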