MODX getChunk alternative - modx

I'm looking for MODX getChunk() alternative mostly because it seems to be really slow when outputting a lot of times.
When I use it once in a snippet then I could hardly notice its speed, but if it's used in a loop then each second matters.
I'm outputting ~1300 images 100 per page as part of the gallery and it takes:
6-7 seconds when the output is placed in a chunk $output .= $modx->getChunk('chunkname');
2-3 seconds when the output is plain HTML
Does anyone know faster alternative to output the result of image query using chunk?

What does your chunk look like?
You might consider abandoning the getChunk() call and just inlining your html:
$output = '';
foreach ($images as $img) {
$output .= '<li><a href="'.$img['path'].'" alt="'.$img['name'].'" /></li>';
}
return $output;
Yeah yeah, it's bad practise but when faced with the alternative taking more than twice as long it's not a bad optimisation.

There's another solution from more of an architectural level - 1300 images is a huge amount to load on one page!
Depending on your design, why not load the first 20 - 30 and implement some kind of infinite scrolling, loading in the rest by ajax (in lots of 20 or so) when the user begins to scroll.
That's going to take the load off your server, save bandwidth, provide a faster user experience. And get around the slow getChunk call.

Related

Is there a way of loading an SVG drawing asynchronously with SkiaSharp?

I am using SkiaSharp to load an SVG drawing. It's a site plan, so is reasonably complex, and it takes a long time to load. On my Samsung Galaxy phone, it's about 3 seconds, and during that time the phone completely locks up, which is quite unacceptable.
I am using SkiaSharp.Extended.Svg.SKSvg, but cannot find an asynchronous version of the Load() method. Is there one? Or maybe a better way of doing this?
I am overlaying obejcts on top of the site plan and it has taken me some considerable time to get all the scaling and alignment sorted, so if at all possible, I'd like to stick with SkiaSharp rather than start with something completely different from scratch.
Thanks for your help!
3 seconds does sound a bit long...
It may not be the SVG part, but rather the loading of the file off the file system.
Either way, you might be able to just load the whole think in a background task (pseudocode):
var picture = Task.Run(() => {
var stream = LoadStream();
var svg = LoadSvg(stream);
return svg.GetPicture();
});

Objective C For loops with #autorelease and ARC

As part of an app that allows auditors to create findings and associate photos to them (Saved as Base64 strings due to a limitation on the web service) I have to loop through all findings and their photos within an audit and set their sync value to true.
Whilst I perform this loop I see a memory spike jumping from around 40MB up to 500MB (for roughly 350 photos and 255 findings) and this number never goes down. On average our users are creating around 1000 findings and 500-700 photos before attempting to use this feature. I have attempted to use #autorelease pools to keep the memory down but it never seems to get released.
for (Finding * __autoreleasing f in self.audit.findings){
#autoreleasepool {
[f setToSync:#YES];
NSLog(#"%#", f.idFinding);
for (FindingPhoto * __autoreleasing p in f.photos){
#autoreleasepool {
[p setToSync:#YES];
p = nil;
}
}
f = nil;
}
}
The relationships and retain cycles look like this
Audit has a strong reference to Finding
Finding has a weak reference to Audit and a strong reference to FindingPhoto
FindingPhoto has a weak reference to Finding
What am I missing in terms of being able to effectively loop through these objects and set their properties without causing such a huge spike in memory. I'm assuming it's got something to do with so many Base64 strings being loaded into memory when looping through but never being released.
So, first, make sure you have a batch size set on the fetch request. Choose a relatively small number, but not too small because this isn't for UI processing. You want to batch a reasonable number of objects into memory to reduce loading overhead while keeping memory usage down. Try 50 or 100 and see how it goes, then consider upping the batch size a little.
If all of the objects you're loading are managed objects then the correct way to evict them during processing is to turn them into faults. That's done by calling refreshObject:mergeChanges: on the context. BUT - that discards any changes, and your loop is specifically there to make changes.
So, what you should really be doing is batch saving the objects you've modified and then turning those objects back into faults to remove the data from memory.
So, in your loop, keep a counter of how many you've modified and save the context each time you hit that count and refresh all the objects that were processed so far. The batch on the fetch and the batch size to save should be the same number.
There's probably a big difference in size between your "Finding" objects and the associated images. So your primary aim should be to redesign your database in a way so that unfaulting (loading) a Finding object does not automatically load the base64 encoded image.
That's actually one of the major strengths of Code Data: Loading part of an object hierarchy. Just try to move the base64 encoded data to an own (managed) object so that Core Data does not load it. It will still be loaded as needed when the reference is touched.

FabricJS ApplyFilter is slow on large images

Taking a very basic stock example such as the redify filter, with a large image (1200x1024) I was trying to determine why it takes (what I think) is too long. After some investigating, I find that the delay occurs in fabricjs::ApplyFilter, where replacement.src = canvasEl.toDataURL('image/png'); (line 17933 in 1.6.2). That take a long time, even compared to the complete pixel run through by the filter.
Is there some way around this? Can I do something differently to speed up the process? TIA

in-memory string deduplication

Context: I'm writing something to process log data which involves loading several GB of data into memory and cross checking various things, finding correlations in data and writing the results out to another file. (This is essentially a cooking/denormalization step before loading into a Druid.io cluster.) I want to avoid having to write the information to a database for both performance and code simplicity - it is assumed that in the foreseeable future the volume of data processed at one time can be handled by adding memory to the machine.
My question is if it is a good idea to attempt to explicitly deduplicate strings in my code; and if so, what is a good approach. Many of the values in these log files are the same exact pieces of text (probably about 25% of the total text values in the file are unique, rough guess).
Since we're talking about gigs of data, and while ram is cheap and swap is possible, there is still a limit and if I'm careless I will very likely hit it. If I do something like this:
strstore := make(map[string]string)
// do some work that involves slicing and dicing some text, resulting in:
var a = "some string that we figured out that has about a 75% chance of being duplicate"
// note that there are about 10 other such variables that are calculated here, only "a" shown for simplicity
if _, ok := strstore[a]; !ok {
strstore[a] = a
} else {
a = strstore[a]
}
// now do some more stuff with "a" and keep it in a struct which is in
// some map some place
It would seem to me that this would have the effect of "reusing" existing strings, at the cost of a hash lookup and compare. Seemingly a good trade off.
However, this might not be that helpful if the strings that are in fact new cause memory to be fragmented and have various holes that are left unclaimed.
I could also try to keep one big byte array/slice which has the character data and index into that, but it would make the code hard to write (esp having to mess around with conversion between []byte and strings, which involves it's own allocation) and I would probably just be doing a poor job of something that is really the Go runtime's domain anyway.
Looking for any advice on approaches to this problem, or if anyone's experience with this sort of thing has yielded particularly useful mechanisms to address this.
There are a lot of interesting data structures and algorithms that you could use here. It depends on what you are trying do in the stats and processing stages.
I am not sure how compressible your logs are but you could pre-process the data, again depending on your uses cases : https://github.com/alecthomas/mph/blob/master/README.md
Take a look at some of these data structures as well for background :
https://github.com/Workiva/go-datastructures

Perl - creating text files with some data in lesser time - using threading

Whats the best way to generate 1000K text files? (with Perl and Windows 7) I want to generate those text files in as possible in less time (possibly withing 5 minutes)? Right now I am using Perl threading with 50 threads. Still it is taking longer time.
What will be best solution? Do I need to increase thread count? Is there any other way to write 1000K files in under five minutes? Here is my code
$start = 0;
$end = 10000;
my $start_run = time();
my #thr = "";
for($t=0; $t < 50; $t++) {
$thr[$t] = threads->create(\&files_write, $start, $end);
#start again from 10000 to 20000 loop
.........
}
for($t=0; $t < 50; $t++) {
$thr[$t]->join();
}
my $end_run = time();
my $run_time = $end_run - $start_run;
print "Job took $run_time seconds\n";
I don't want return result of those threads. I used detach() also but it didn't worked me.
For generating 500k files (with only size of 20kb) it took 1564 seconds (26min). Can I able to achieve within 5min?
Edited: The files_write will only take values from array predefined structure and write into file. thats it.
Any other solution?
The time needed depends on lots of factors, but heavy threading is probably not the solution:
creating files in the same directory at the same time needs probably locking in the OS, so it's better done not too much in parallel
the layout how the data gets written on disk depend on the amount of data and on how many writes you do in parallel. A bad layout can impact the performance a lot, especially on HDD. But even a SDD cannot do lots of parallel writes. This all depends a lot on the disk you use, e.g. it is a desktop system which is optimized for sequential writes or is it a server system which can do more parallel writes as required by databases.
... lots of other factors, often depending on the system
I would suggest that you use a thread pool with a fixed size of threads to benchmark, what the optimal number of threads is for your specific hardware. E.g. start with a single thread and slowly increase the number. My guess is, that the optimal number might be between factor 0.5 and 4 of the number of processor cores you have, but like I said, it heavily depends on your real hardware.
The slow performance is probably due to Windows having to lock the filesystem down while creating files.
If it is only for testing - and not critical data - a RAMdisk may be ideal. Try Googling DataRam RAMdisk.

Resources