CoreText memory leak from CTFramesetterCreateWithAttributedString

CoreText memory leak from CTFramesetterCreateWithAttributedString - memory-leaks

I'm coming across a problematic leak which - thanks to Instruments - appears to be coming from CTFrameSetterCreateWithAttributedString. The call stack is below.
1 CoreText -[_CTNativeGlyphStorage prepareWithCapacity:preallocated:]
2 CoreText -[_CTNativeGlyphStorage initWithCount:]
3 CoreText +[_CTNativeGlyphStorage newWithCount:]
4 CoreText TTypesetterAttrString::Initialize(__CFAttributedString const*)
5 CoreText TTypesetterAttrString::TTypesetterAttrString(__CFAttributedString const*)
6 CoreText TFramesetterAttrString::TFramesetterAttrString(__CFAttributedString const*)
7 CoreText CTFramesetterCreateWithAttributedString
The code generating this call stack is:
CFAttributedStringRef attrRef = (CFAttributedStringRef)self.attributedString;
CTFramesetterRef framesetter = CTFramesetterCreateWithAttributedString(attrRef);
CFRelease(attrRef);
...
CFRelease(framesetter);
self.attributedString gets released elsewhere. I feel like I'm correctly releasing everything else... Given all this, where could the leak be coming from? It's fairly significant for my purposes - 6-10 MB a pop. Thanks for any help.

Turns out this problem was substantially addressed by not using an ivar for a CFMutableArrayRef. We were using the Akosma Obj-C wrapper for CoreText, and in the AKOMultiColumnTextView class there's a CFMutableArrayRef _frames ivar which - for some reason - is hanging around even after CFRelease calls. Not having the _frames ivar is a bit expensive, but has drastically improved memory usage. Thanks Neevek for your contributions.

Related

Class 4 SDHC vs Class 10 SDHC cards [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 5 years ago.
Improve this question
I have been working on some programs that require data to be written/stored onto the SDHC cards, few MBs in size, Sandisk class 4 SDHC * Sandisk class 10 SDHC 16 GB cards in particular.
The results I have observed seems more strange. The write speeds of class 4 cards vs class 10 cards.
Commands used:
I have used dd command to write the data; something like:
dd if=file_10mb.img of=/dev/sdc conv=fsync bs=4096 count=2560
Measured the write speeds by:
iostat /dev/sdc 1 -m -t
Few figures:
Writing a 100MB file:
On class 10 card: 53 secs ->Avg. write speed = 2.03 MB_wrtn/sec
On class 4 card: 31 secs ->Avg. write speed = 2.62 MB_wrtn/sec
Writing a 10MB file:
On class 10 card: 5.7 secs ->Max. & Min. write speed = 1.85 & 1.15
MB_wrtn/sec
On class 4 card: 4 secs ->Max. & Min. write speed = 2.56 & 1.15
MB_wrtn/sec
I expected these results to be exactly opposite as class 10 cards should outperform class 4 cards.
I've tested these on two different cards to remove the probability of wrong readings due to aged cards. Also, the cards are fairly new.
Please let me know about the strange behaviour. Thanks in advance.

A brief research on internet lead me to this page: https://www.raspberrypi.org/forums/viewtopic.php?t=11258&p=123670
which talks about "erase blocks", the size of an "erase" operation; this erase block is generally bigger than a sector size, which is the minimum size for a write operation. On that page some example is shown:
16 GB SanDisk Extreme Pro: erase block size of 4 MB.
8 GB Transcend SDHC 150x: erase block size of 4 MB.
2 GB Transcend SD 150x: erase block size of 8 kB.
Now, your fsync options passed to dd means that after every write, a sync is performed on both the data and metadata, which could involve rewriting part of the FAT, or some other blocks if no FAT is used.
On a classic spinning magnetic disk, that would mean that the head travels a lot, every 4Kb; on a flash memory there is no head, but an erase operation is very costly. Moreover, flash memories have internal algorithms that reduce the wearing, so it becomes very difficult to know what really goes on underneath, inside the memory card.
The conclusion is that, as noted in a comment, 4K block size can be too small, and the fsync option slows down and can be very problematic. Get rid of fsync options, and perform again tests with different block sizes.
In reality, probably every different card has a preferred set of parameters. One way class 10 cards can work faster, can be to choose a big erase block. The time for erasing a block is more or less independent of its size, so a really big erase block effectively improves speed, by erasing more data in the same time. But if blocks are erased too often, speed is reduced instead.
The final answer, from inference, is that your set of parameters seem better suited for a class 4 card than for a class 10 card. In my opinion, your parameters are not well suited for anything, but nobody can be perfectly sure: flash memory cards are intricated. For example, often I record TV transmissions on my TV decoder; there are periods of time in which things go smoothly, and other periods not. 4 months ago the decoder was often complaining about "slow writing speed", with horrible results. Since a couple of months, everything is fine. I touched nothing, the flash USB memory is the same. Probably it entered another phase of its life...

Self-hosted WCF service maxing cpu load

I am looking into an issue at work with a WindowsService that is taking 100% CPU on a machine with 16 CPU's.
The service is hosting a self-hosted .NET WCF service.
I have received a crash dump which I have loaded up in windbg, in order to look for clues.
So what I have tried:
!threads :
ThreadCount: 646
UnstartedThread: 0
BackgroundThread: 643
PendingThread: 0
DeadThread: 2
Hosted Runtime: no
642 of the threads were Threadpool workers as following:
8 29 2a34 000000002068b510 3029220 Preemptive 0000000000000000:0000000000000000 0000000000563f50 0 MTA (Threadpool Worker)
~29s -> !CLRStack
000000003c66eb70 00000000770512fa [GCFrame: 000000003c66eb70]
000000003c66ec40 00000000770512fa [GCFrame: 000000003c66ec40]
000000003c66ec78 00000000770512fa [HelperMethodFrame: 000000003c66ec78] System.Threading.Monitor.Enter(System.Object)
000000003c66ed70 000007fef7af1c9c System.Threading.TimerQueueTimer.Fire()
000000003c66ede0 000007fef7a6c2f3 System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
000000003c66ee30 000007fef7a6c92a System.Threading.ThreadPoolWorkQueue.Dispatch()
000000003c66f388 000007fef8d57d33 [DebuggerU2MCatchHandlerFrame: 000000003c66f388]
~29s -> K
000000003c66e858 000007fefd7010dc ntdll!NtWaitForSingleObject+0xa
000000003c66e860 000007fef8d049bf KERNELBASE!WaitForSingleObjectEx+0x79
000000003c66e900 000007fef8d04977 clr!CLREventBase::WaitEx+0x16c
000000003c66e940 000007fef8d048f8 clr!CLREventBase::WaitEx+0x103
000000003c66e9a0 000007fef8e9c5de clr!CLREventBase::WaitEx+0x70
000000003c66ea30 000007fef8dc5a34 clr!WKS::GCHeap::WaitUntilGCComplete+0x2b
000000003c66ea60 000007fef8d0c4f4 clr!Thread::RareDisablePreemptiveGC+0x176
000000003c66eaf0 000007fef8dd1f3d clr!GCCoop::GCCoop+0x3d
000000003c66eb20 000007fef8e898cf clr!AwareLock::Contention+0x137
000000003c66ebe0 000007fef7af1c9c clr!JITutil_MonContention+0xaf
000000003c66ed70 000007fef7a6c2f3 mscorlib_ni+0x521c9c
000000003c66ede0 000007fef7a6c92a mscorlib_ni+0x49c2f3
000000003c66ee30 000007fef8d57d33 mscorlib_ni+0x49c92a
000000003c66eef0 000007fef8d556e6 clr!CallDescrWorkerInternal+0x83
000000003c66ef30 000007fef8d557af clr!CallDescrWorkerWithHandler+0x4a
000000003c66ef70 000007fef8eda2c9 clr!MethodDescCallSite::CallTargetWorker+0x2e6
000000003c66f120 000007fef8ee51b0 clr!QueueUserWorkItemManagedCallback+0x2a
000000003c66f200 000007fef8ee513e clr!DebuggerU2MCatchHandlerFrame::DebuggerU2MCatchHandlerFrame+0xa0
000000003c66f240 000007fef8ee50b5 clr!ManagedPerAppDomainTPCount::DispatchWorkItem+0x38e
000000003c66f340 000007fef8ee51eb clr!ManagedPerAppDomainTPCount::DispatchWorkItem+0x2bd
000000003c66f3d0 000007fef8eda224 clr!ManagedPerAppDomainTPCount::DispatchWorkItem+0x23b
000000003c66f430 000007fef8ee6baf clr!ManagedPerAppDomainTPCount::DispatchWorkItem+0xb4
000000003c66f5c0 000007fef8ee6ab3 clr!ThreadpoolMgr::ExecuteWorkRequest+0x4c
000000003c66f5f0 000007fef8eda8a6 clr!ThreadpoolMgr::WorkerThreadStart+0xf3
000000003c66f6b0 0000000076c9652d clr!Thread::intermediateThreadProc+0x7d
000000003c66f7f0 000000007702c541 kernel32!BaseThreadInitThunk+0xd
000000003c66f820 0000000000000000 ntdll!RtlUserThreadStart+0x1d
Im having a hard time interpreting the stacktraces since they dont hit any of my applicationcode.
Are they all just idle threadworkers, waiting for work?

Threads with WaitForSingleObject are not critical, since they are waiting and not consuming CPU time. But be aware that your dump is only a snapshot and you might have had bad luck when taking the snapshot.
For a performance analysis with WinDbg you'd need several dumps during high CPU and compare them. If they all have similar stack traces, that's fine and you can conclude something. If they are all very different, it's almost useless.
The command !runaway seems more interesting here, since it lists CPU times consumed per thread, so you can identify the one(s) which are on high CPU. Again: having two snapshots that you can compare is helpful, because the main thread may still have consumed more CPU time in total than some short-living 100% threads.
If you can't use a performance profiler, SysInternals Procdump can generate a series of dumps (-n) for you on high CPU (-c). Use -s to set the time between dumps. For .NET, don't forget -ma for full memory.
Other than that, 646 threads sounds a lot to me. The OS itself could be quite busy scheduling them.

Sounds like the issue could be related to GC. Since this is a self-hosted service, it will use the Workstation GC by default, unless you enable the server GC manually:
http://msdn.microsoft.com/en-us/library/ms229357(v=vs.110).aspx
Have you tried that and see if it makes any difference?

Perfview from Microsoft may be helpful. From the link:
http://blogs.msdn.com/b/dotnet/archive/2012/10/09/improving-your-app-s-performance-with-perfview.aspx
"Late last year, Vance Morrison, who is currently an architect on the .NET Framework Performance team, released PerfView, which is a new performance tool for .NET developers. PerfView helps you discover and investigate performance hotspots in .NET Framework apps, and enables you to deliver consistently high-performance apps to your customers.
Using PerfView, you can perform complex CPU performance analyses to solve hard-to-detect performance problems. PerfView's revolutionary grouping and folding features are what makes it possible to grasp and solve these difficult problems."

use WPRUI.exe to capture a trace and analyze the CPU usage with WPA.exe.
Microsoft explained how to analyze the created trace in the following video:
Defrag Tools: #42 - WPT - CPU Analysis
http://channel9.msdn.com/Shows/Defrag-Tools/Defrag-Tools-42-WPT-CPU-Analysis

Collect ETW with Perfview and follow the big % numbers.
try run in windbg ~*e!clstack => call stacks of all threads look for repeatable code.

Is there a leak in AVPlayers' init method?

I am working on an app that makes extensive use of AVfoundation. Recently I did some leak checking with Instruments. The "leaks" instrument was reporting a leak at a point a in the code where I was instantiating a new AVPlayer, like this:
player1 = [AVPlayer playerWithPlayerItem:playerItem1];
To reduce the problem, I created an entirely new Xcode project with for a single view application, using ARC, and put in the following line.
AVPlayer *player = [[AVPlayer alloc] init];
This produces the same leak report in Instruments. Below is the stack trace. Does anybody know why a simple call to [[AVPlayer alloc] init] would cause a leak? Although I am using ARC, I tried a turning it off and inserting the corresponding [player release]; instruction and it makes no difference. This seems to have to do specifically with AVPlayer.
0 libsystem_c.dylib malloc
1 libsystem_c.dylib strdup
2 libnotify.dylib token_table_add
3 libnotify.dylib notify_register_check
4 AVFoundation -[AVPlayer(AVPlayerMultitaskSupport) _iapdExtendedModeIsActive]
5 AVFoundation -[AVPlayer init]
6 TestApp -[ViewController viewDidLoad] /Users/jason/Synaptic Revival/Project Field Trip/software development/TestApp/TestApp/ViewController.m:22
7 UIKit -[UIViewController view]
--- 2 frames omitted ---
10 UIKit -[UIWindow makeKeyAndVisible]
11 TestApp -[AppDelegate application:didFinishLaunchingWithOptions:] /Users/jason/Synaptic Revival/Project Field Trip/software development/TestApp/TestApp/AppDelegate.m:24
12 UIKit -[UIApplication _callInitializationDelegatesForURL:payload:suspended:]
--- 3 frames omitted ---
16 UIKit _UIApplicationHandleEvent
17 GraphicsServices PurpleEventCallback
18 CoreFoundation __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__
--- 3 frames omitted ---
22 CoreFoundation CFRunLoopRunInMode
23 UIKit -[UIApplication _run]
24 UIKit UIApplicationMain
25 TestApp main /Users/jason/software development/TestApp/TestApp/main.m:16
26 TestApp start

This 48-bytes leak is confirmed by Apple as a known issue, which not only lives in AVPlayer but also in UIScrollView (I have an app happened to use both components.)
Please see this thread to get detail:
Memory leak every time UIScrollView is released
Here's the link to apple's answer on the thead (you may need a developer id to sign in):
https://devforums.apple.com/thread/144449?start=0&tstart=0
Apple's brief quote:
This is a known bug that will be fixed in a future release.
In the meantime, while all leaks are obviously undesirable this isn't going to cause any user-visible problems in the real world. A user would have to scroll roughly 22,000 times in order to leak 1 megabyte of memory, so it's not going to impact daily usage.
It seems any component that refers to notify_register_check and notify_register_mach_port will cause this issue.
Currently no obvious walk around or fix can be found. It is confirmed that this issue remains in iOS versions for 5.1 and 5.1.1. Hopefully apple can fix that in iOS 6 because it is really annoying and destructive.

Leaking in [AVPlayer addBoundaryTimeObserverForTimes]

I have an instance of AVPlayer in my application. I use the time boundary observing feature:
[self setTimeObserver:[player addBoundaryTimeObserverForTimes:watchedTimes
queue:NULL usingBlock:^{
NSLog(#"A: %i", [timeObserver retainCount]);
[player removeTimeObserver:timeObserver];
NSLog(#"B: %i", [timeObserver retainCount]);
[self setTimeObserver:nil];
}]];
The problem is that according to Instruments I am leaking some arrays and values somewhere around this code. I checked the retain count of the time-observing token returned by AVPlayer on places marked A and B in the sample code. At the A point the retain count is 2, at point B the retain count increases to 3 (!). Adding a local autorelease pool does not change anything. I know that retain count is not a reliable metric, but this seems to be fishy. Any ideas about why the retain count increases or about my leaks? The stack trace at the leak point looks like this:
0 libSystem.B.dylib calloc
1 libobjc.A.dylib _internal_class_createInstanceFromZone
2 libobjc.A.dylib class_createInstance
3 CoreFoundation __CFAllocateObject2
4 CoreFoundation +[__NSArrayI __new::]
5 CoreFoundation -[__NSPlaceholderArray initWithObjects:count:]
6 CoreFoundation +[NSArray arrayWithObjects:count:]
7 CoreFoundation -[NSArray sortedArrayWithOptions:usingComparator:]
8 CoreFoundation -[NSArray sortedArrayUsingComparator:]
9 AVFoundation -[AVPlayerOccasionalCaller initWithPlayer:times:queue:block:]
10 AVFoundation -[AVPlayer addBoundaryTimeObserverForTimes:queue:usingBlock:]
If I understand things correctly, AVPlayerOccasionalCaller is the “opaque” object returned by addBoundaryTimeObserverForTimes:queue:usingBlock:, or the time observer.

Do not use -retainCount.
The absolute retain count of an object is meaningless.
You should call release exactly same number of times that you caused the object to be retained. No less (unless you like leaks) and, certainly, no more (unless you like crashes).
See the Memory Management Guidelines for full details.
In this specific case, the retain count you are printing is entirely irrelevant. removeTimeObserver: is probably retaining and autoreleasing the object. Doesn't really matter; it is an implementation detail.
When using the Leaks template in Instrument, note that the Allocations instrument is configured to record reference counts. When you have detected a "leak", look at the list of reference count events for that object. There will likely be a stack where some code of yours is triggering an extra retain. If not, it might be a framework bug.

Visual C++ App crashes before main in Release, but runs fine in Debug

When in release it crashes with an unhandled exception: std::length error.
The call stack looks like this:
msvcr90.dll!__set_flsgetvalue() Line 256 + 0xc bytes C
msvcr90.dll!__set_flsgetvalue() Line 256 + 0xc bytes C
msvcr90.dll!_getptd_noexit() Line 616 + 0x7 bytes C
msvcr90.dll!_getptd() Line 641 + 0x5 bytes C
msvcr90.dll!rand() Line 68 C
NEM.exe!CGAL::Random::Random() + 0x34 bytes C++
msvcr90.dll!_initterm(void (void)* * pfbegin=0x00000003, void (void)* * pfend=0x00345560) Line 903 C
NEM.exe!__tmainCRTStartup() Line 582 + 0x17 bytes C
kernel32.dll!7c817067()
Has anyone got any clues?

Examining the stack dump:
InitTerm is simply a function that walks a list of other functions and executes each in step - this is used for, amongst other things, global constructors (on startup), global destructors (on shutdown) and atexit lists (also on shutdown).
You are linking with CGAL, since that CGAL::Random::Random in your stack dump is due to the fact that CGAL defines a global variable called default_random of the CGAL::Random::Random type. That's why your error is happening before main, the default_random is being constructed.
From the CGAL source, all it does it call the standard C srand(time(NULL)) followed by the local get_int which, in turn, calls the standard C rand() to get a random number.
However, you're not getting to the second stage since your stack dump is still within srand().
It looks like it's converting your thread into a fiber lazily, i.e., this is the first time you've tried to do something in the thread and it has to set up fiber-local storage before continuing.
So, a couple of things to try and investigate.
1/ Are you running this code on pre-XP? I believe fiber-local storage (__set_flsgetvalue) was introduced in XP. This is a long shot but we need to clear it up anyway.
2/ Do you need to link with CGAL? I'm assuming your application needs something in the CGAL libraries, otherwise don't link with it. It may be a hangover from another project file.
3/ If you do use CGAL, make sure you're using the latest version. As of 3.3, it supports a dynamic linking which should prevent the possibility of mixing different library versions (both static/dynamic and debug/nondebug).
4/ Can you try to compile with VC8? The CGAL supported platforms do NOT yet include VC9 (VS2008). You may need to follow this up with the CGAL team itself to see if they're working on that support.
5/ And, finally, do you have Boost installed? That's another long shot but worth a look anyway.
If none of those suggestions help, you'll have to wait for someone more knowledgable than I to come along, I'm afraid.
Best of luck.

Crashes before main() are usually caused by a bad constructor in a global or static variable.
Looks like the constructor for class Random.

Do you have a global or static variable of type Random? Is it possible that you're trying to construct it before the library it's in has been properly initialized?
Note that the order of construction of global and static variables is not fixed and might change going from debug to release.

Could you be more specific about the error you're receiving? (unhandled exception std::length sounds weird - i've never heard of it)
To my knowledge, FlsGetValue automatically falls back to TLS counterpart if FLS API is not available.
If you're still stuck, take .dmp of your process at the time of crash and post it (use any of the numerous free upload services - and give us a link) (Sounds like a missing feature in SO - source/data file exchange?)

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string