Is it ok to instantiate many short-lived Dictionary<string, MyClass> or better to object-pool? - garbage-collection

I have a method that reads some values from the DB and then yields a Dictionary - could be many thousand times.
public IEnumerable<Dictionary<string, MyClass>> GetValues()
{
var results = new Dictionary<string, MyClass>();
// ... fill data from data reader
yield return results;
}
Consumer of this method only iterates over the result and uses the dictionary in a reading manner.
So... in regards to garbage collection and memory continuity: Would it make sense to do something like this?
private Dictionary<string, MyClass> _resultHolder = new Dictionary<string, MyClass>();
public IEnumerable<ReadOnlyDictionary<string, MyClass>> GetValues()
{
// Don't want the previous record values, so clear
_resultHolder.Clear();
// ... fill data from data reader
yield return _resultHolder;
}
As far as I remember there are different levels to the garbage collector... So will this result only ever end up in Level 0 anyway and therefore not cause any problems in regard to memory fragmentation?
Would this kind of object pooling be premature optimization?

Related

The performance issue of validating entity using value object

I have the following value object code which validates CustCode by some expensive database operations.
public class CustCode : ValueObject<CustCode>
{
private CustCode(string code) { Value = code; }
public static Result<CustCode> Create(string code)
{
if (string.IsNullOrWhiteSpace(code))
return Result.Failure<CustCode>("Code should not be empty");
// validate if the value is still valid against the database. Expensive and slow
if (!ValidateDB(code)) // Or web api calls
return Result.Failure<CustCode>("Database validation failed.");
return Result.Success<CustCode>(new CustCode(code));
}
public string Value { get; }
// other methods omitted ...
}
public class MyEntity
{
CustCode CustCode { get; }
....
It works fine when there is only one or a few entity instances with the type. However, it becomes very slow for method like GetAll() which returns a lot of entities with the type.
public async IAsyncEnumerable<MyEntity> GetAll()
{
string line;
using var sr = File.OpenText(_config.FileName);
while ((line = await sr.ReadLineAsync()) != null)
{
yield return new MyEntity(CustCode.Create(line).Value); // CustCode.Create called many times
}
}
Since data in the file was already validated before saving so it's actually not necessary to be validated again. Should another Create function which doesn't validate the value to be created? What's the DDD idiomatically way to do this?
I generally attempt not to have the domain call out to retrieve any additional data. Everything the domain needs to do its job should be passed in.
Since value objects represent immutable state it stands to reason that once it has managed to be created the values are fine. To this end perhaps the initial database validation can be performed in the integration/application "layer" and then the CustCode is created using only the value(s) provided.
Just wanted to add an additional point to #Eben Roux answer:
In many cases the validation result from a database query is dependent on when you run the query.
For example when you want to check if a phone number exists or if some product is in stock. The answers to those querys can change any second, and though are not suited to allow or prevent the creation of a value object.
You may create a "valid" object, that is (unknowingly) becoming invalid in the very next second (or the other way around). So why bother running an expensive validation, if the validation result is not reliable.

Modify object from multiple async streams in Dart

Imagine we had a object like this
class Foo {
List<int> data = [];
void addAndCheck(int n){
for(int number in data){
// check something
}
data.add(n);
}
}
and the imagine we spawn a bunch of subscriptions like this
Foo foo = Foo();
for(int i = 0; i++; i<10){
subscriptions.add(api.someRandomStream().listen((response){
foo.addAndCheck(response.value);
}));
}
As it stands, if this code is run it might work but as soon as the streams start emitting around the same time we get a exception: Concurrent modification during iteration
The cause is the for loop, but how can this problem be solved? In a language like Java there are things like ConcurrentHashMap, Collections.synchronizedList(...), etc..
If you get a concurrent modification error during the iteration, then you are doing something asynchronous inside the loop. That is, your function is probably async and there is at least one await inside the loop. That will allow another event to trigger while you are awaiting, and then modify the list.
There are several ways to avoid the exception, all with different trade-offs:
Don't do anything asynchronous in the loop, and make sure that nothing you do in there will call addAndCheck again. Then there should be no problem because the loop will complete before anyone else has a chance to modify the list. That obviously only works if you don't need to do something asynchronous.
Copy the list. If you do for(int number in [...data]) { ... } (or in data.toList() as it used to be written), then the list that you iterate is a different list than the one which is modified. It also means that you might not have checked all the elements that are actually in the list at the point you reach the add call.
Don't use an iterator. If you do for (int i = 0; i < data.length; i++) { var number = data[i]; ... } instead, you will not get a concurrent modification error from the iterator. If elements are added at the end of the list, then you will eventually reach them, and all is well. If elements are removed from the list, or added in any place other than at the end, then you might be skipping elements or seeing some of them twice, which may be bad for you.
Use a mutex. If you want to be sure that all the tests on existing elements are performed before any other element is added, then you need to prevent anything from happening while you are adding. Assume a Mutex class of some sort, which would allow you to write code like:
class Foo {
List<int> data = [];
final _mutex = Mutex();
void addAndCheck(int n) async {
await _mutex.acquire();
for(int number in data){
// check something
}
data.add(n);
_mutex.release();
}
}
(I found package:mutex by searching, I have no experience with it).
This might slow down your code, though, making every operation wait for the previous one to complete entirely.
In the end, only you can say which trade-off is best for the behavior of your code.

Actor's value sometimes returns null

I have an Actor and some other object:
object Config {
val readValueFromConfig() = { //....}
}
class MyActor extends Actor {
val confValue = Config.readValueFromConfig()
val initValue = Future {
val a = confValue // sometimes it's null
val a = Config.readValueFromConfig() //always works well
}
//..........
}
The code above is a very simplified version of what I actually have. The odd thing is that sometimes val a = confValue returns null, whereas if I replace it with val a = Config.readValueFromConfig() then it always works well.
I wonder, is this due to the fact that the only way to interact with an actor is sending it a message? Therefore, since val confValue is not a local variable, I must either use val a = Config.readValueFromConfig() (a different object, not an actor) or val a = self ! GetConfigValue and read the result afterwards?
val readValueFromConfig() = { //....}
This gives me a compile error. I assume you mean without parentheses?
val readValueFromConfig = { //....}
Same logic with different timing gives different result = a race condition.
val confValue = Config.readValueFromConfig() is always executed during construction of MyActor objects (because it's a field of MyActor). Sometimes this is returning null.
val a = Config.readValueFromConfig() //always works well is always executed later - after MyActor is constructed, when the Future initValue is executed by it's Executor. It seems this never returns null.
Possible causes:
Could be explained away if the body of readValueFromConfig was dependent upon another
parallel/async operation having completed. Any chance you're reading the config asynchronously? Given the name of this method, it probably just reads synchronously from a file - meaning this is not the cause.
Singleton objects are not threadsafe?? I compiled your code. Here's the decompilation of your singleton object java class:
public final class Config
{
public static String readValueFromConfig()
{
return Config..MODULE$.readValueFromConfig();
}
}
public final class Config$
{
public static final MODULE$;
private final String readValueFromConfig;
static
{
new ();
}
public String readValueFromConfig()
{
return this.readValueFromConfig;
}
private Config$()
{
MODULE$ = this;
this.readValueFromConfig = // ... your logic here;
}
}
Mmmkay... Unless I'm mistaken, that ain't thread-safe.
IF two threads are accessing readValueFromConfig (say Thread1 accesses it first), then inside method private Config$(), MODULE$ is unsafely published before this.readValueFromConfig is set (reference to this prematurely escapes the constructor). Thread2 which is right behind can read MODULE$.readValueFromConfig before it is set. Highly likely to be a problem if '... your logic here' is slow and blocks the thread - which is precisely what synchronous I/O does.
Moral of story: avoid stateful singleton objects from Actors (or any Threads at all, including Executors) OR make them become thread-safe through very careful coding style. Work-Around: change to a def, which internally caches the value in a private val.
I wonder, is this due to the fact that the only way to interact with an actor is sending it a message? Therefore, since val confValue is not a local variable, I must either use val a = Config.readValueFromConfig() (a different object, not an actor)
Just because it's not an actor, doesn't mean it's necessarily safe. It probably isn't.
or val a = self ! GetConfigValue and read the result afterwards?
That's almost right. You mean self ? GetConfigValue, I think - that will return a Future, which you can then map over. ! doesn't return anything.
You cannot read from an actor's variables directly inside a Future because (in general) that Future could be running on any thread, on any processor core, and you don't have any memory barrier there to force the CPU caches to reload the value from main memory.

ConcurrentModificationException with WeakHashMap

I have the code below but I'm getting ConcurrentModificationException, how should I avoid this issue? (I have to use WeakHashMap for some reason)
WeakHashMap<String, Object> data = new WeakHashMap<String, Object>();
// some initialization code for data
for (String key : data.keySet()) {
if (data.get(key) != null && data.get(key).equals(value)) {
//do something to modify the key
}
}
The Javadoc for WeakHashMap class explains why this would happen:
Map invariants do not hold for this class. Because the garbage
collector may discard keys at any time, a WeakHashMap may behave as
though an unknown thread is silently removing entries
Furthermore, the iterator generated under the hood by the enhanced for-loop you're using is of fail-fast type as per quoted explanation in that javadoc.
The iterators returned by the iterator method of the collections
returned by all of this class's "collection view methods" are
fail-fast: if the map is structurally modified at any time after the
iterator is created, in any way except through the iterator's own
remove method, the iterator will throw a
ConcurrentModificationException. Thus, in the face of concurrent
modification, the iterator fails quickly and cleanly, rather than
risking arbitrary, non-deterministic behavior at an undetermined time
in the future.
Therefore your loop can throw this exception for these reasons:
Garbage collector has removed an object in the keyset.
Something outside the code added an object to that map.
A modification occurred inside the loop.
As your intent appears to be processing the objects that are not GC'd yet, I would suggest using an iterator as follows:
Iterator<String> it = data.keySet().iterator();
int count = 0;
int maxTries = 3;
while(true) {
try {
while (it.hasNext()) {
String str = it.next();
// do something
}
break;
} catch (ConcurrentModificationException e) {
it = data.keySet().iterator(); // get a new iterator
if (++count == maxTries) throw e;
}
}
You can clone the key set first, but note that you hold the strong reference after that:
Set<KeyType> keys;
while(true) {
try {
keys = new HashSet<>(weakHashMap.keySet());
break;
} catch (ConcurrentModificationException ignore) {
}
}
for (KeyType key : keys) {
// ...
}
WeakHashMap's entries are automatically removed when no ordinary use of the key is realized anymore, this may happens in a different thread. While cloning the keySet() into a different Set a concurrent Thread may remove entries meanwhile, in this case a ConcurrentModificationException will 100% be thrown! You must synchronize the cloning.
Example:
Collections.synchronizedMap(data);
Please understand that
Collections.synchronizedSet(data.keySet());
Can not be used because data.keySet() rely on data's instance who is not synchronized here! More detail: synchronize(keySet) prevents the execution of methods on the keySet but keySet's remove-method is never called but WeakHashMap's remove-method is called so you have to synchronize over WeakHashMap!
Probably because your // do something in the iteration is actually modifying the underlying collection.
From ConcurrentModificationException:
For example, if a thread modifies a collection directly while it is iterating over the collection with a fail-fast iterator, the iterator will throw this exception.
And from (Weak)HashMap's keySet():
Returns a Set view of the keys contained in this map. The set is backed by the map, so changes to the map are reflected in the set, and vice-versa. If the map is modified while an iteration over the set is in progress (except through the iterator's own remove operation), the results of the iteration are undefined.

How to have a typed random-access read-only array structure

Still finding my feet with Haxe and am looking for way to have a array-like read-only collection that I can specify a type for at compile time
So ideally I need something like the following:
var collection:Collection<ItemType>;
var item:ItemType = collection[3];//or
var other:ItemType = collection.getAt(3);
//also, it would be good if it was iterable
for (item in collection)
{
//stuff
}
So, exactly like an Array, but read only. Would anybody be able to give me a few pointers, please.
Many thanks
Well, you can't have read-only array access as such, but you can do it with methods:
class ReadonlyArray<T> {
var source:Array<T>;
public function new(source) this.source = source
inline function get(index) return source[index]
inline function iterator() return index.iterator()
}
The overhead should be barely noticeable.

Resources