What is the lifespan of assets cached by a service worker?

Some of the articles I have read suggest that items cached by a service worker (via the web Cache API) are stored in the system forever.
I have come across a scenario where some of the cached resources are evicted automatically for users who revisit my website after a long time (roughly two months or more).
I know for a fact that assets cached via HTTP caching are removed by the browser after a certain time. Does the same apply to the service worker cache too?
If that is the case, how does the browser decide which assets to remove, and is there a way to tell the browser that if it removes something from the cache, it should remove everything cached under the same cache name?

It seems it lasts forever, until it doesn't :) (i.e. when storage space runs low)
https://developers.google.com/web/ilt/pwa/caching-files-with-service-worker
You are responsible for implementing how your script (service worker) handles updates to the cache. All updates to items in the cache must be explicitly requested; items will not expire and must be deleted. However, if the amount of cached data exceeds the browser's storage limit, the browser will begin evicting all data associated with an origin, one origin at a time, until the storage amount goes under the limit again. See Browser storage limits and eviction criteria for more information.
If the browser's storage is running low then the cached data may be evicted (see Browser storage limits and eviction criteria):
https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API/Browser_storage_limits_and_eviction_criteria
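
To the last part of the question: eviction isn't something a page can hook into, but you do control deletion from your own worker. A common pattern (a minimal sketch; the cache name here is illustrative) is to version the cache name and delete every other cache during activate, so stale entries get removed as a unit:

const CACHE_NAME = 'static-v2'; // illustrative versioned cache name

self.addEventListener('activate', (event) => {
  event.waitUntil(
    caches.keys().then((names) =>
      Promise.all(
        names
          .filter((name) => name !== CACHE_NAME)
          .map((name) => caches.delete(name)) // drops the whole named cache
      )
    )
  );
});

Separately, navigator.storage.persist() lets you ask the browser to exempt the origin from automatic eviction; the browser may decline, so treat it as a hint rather than a guarantee.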

Related

Design a simple cache manager in Node.js to manage files on disk

My server receives input data and generates an output file. I need to cache this output file so that when a user requests the same input again, the server returns the output file instantly. The cache manager needs:
Entry: inputId -> path to processed file (cached file)
Many server processes can set and get cache entries at the same time
Limit the total size of cached files; remove old cached files if the disk is full
Cached files expire and are removed after some time. On a cache hit, the expiry time of the cached file is reset.
Server processes can crash, or the computer can shut down at any time. The cache manager can discard incorrect data but must keep valid cached files.
Now, I'm using Redis as an LRU cache:
inputId -> filePath, with an expire time
A sorted set: inputId -> last access time of the file
Three Lua scripts implementing setCache(inputId, filePath), getCache(inputId), and removeCache(inputId)
Periodically check disk space to remove the least recently used files
Listen for the Redis key-expired event to remove the cached file (see the sketch below)
In general, I feel my implementation is not robust enough to handle process or computer crashes and restarts. I intend to save the cache index into a database.
I need some comments about my design. Am I reinventing the wheel? (Stack Overflow doesn't allow asking for library or documentation recommendations.)
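
For reference, a minimal sketch of the set/get half of this design in Node.js with ioredis (the key names, TTL value, and database number are assumptions for illustration, not part of the question):

const Redis = require('ioredis');

const redis = new Redis();
const TTL_SECONDS = 3600; // illustrative expiry window

// inputId -> filePath with an expiry; a sorted set tracks last access for LRU.
async function setCache(inputId, filePath) {
  await redis.multi()
    .set(`cache:${inputId}`, filePath, 'EX', TTL_SECONDS)
    .zadd('cache:lru', Date.now(), inputId)
    .exec();
}

// On a hit, refresh both the TTL and the LRU timestamp.
async function getCache(inputId) {
  const filePath = await redis.get(`cache:${inputId}`);
  if (filePath !== null) {
    await redis.multi()
      .expire(`cache:${inputId}`, TTL_SECONDS)
      .zadd('cache:lru', Date.now(), inputId)
      .exec();
  }
  return filePath; // null on a miss
}

// Expired-key events need "CONFIG SET notify-keyspace-events Ex" and a
// dedicated subscriber connection (a subscribed client can't run other commands).
const sub = new Redis();
sub.subscribe('__keyevent@0__:expired');
sub.on('message', (channel, key) => {
  // if key starts with "cache:", delete the file that backed it
});

Note the gap the question already identifies: MULTI keeps the Redis structures consistent with each other, but a crash between writing the file and writing the index can still leave orphans, so a startup sweep that reconciles the index against the files on disk is worth adding.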

Cache model for frequently requested items

I have a bunch of user-generated messages, each with a timestamp, message text, profile image, and other data. All clients (phones) using my Web API can request the latest messages, then scroll down and request older items. Obviously, the top messages are the hottest data in the whole list, and I want a cache with a caching policy and a clear understanding of newly requested messages: are the requested messages hot or not?
I created a stateless service with MemoryCache and now use it for my purposes. Are there any pitfalls I should take into account while working with it? Apart, of course, from the fact that I have five nodes, and a user can make a request to a node that has no cached copy; in that case the service goes to the data-layer service and loads the data from it.
UPD #1
Forgot to mention that this list of messages is updated with new entries from time to time.
UPD #2
I wrapped MemoryCache in an IReliableDictionary implementation and exposed it from a stateful service with my own StateManager implementation. Every time a request doesn't find an item in the collection, I go to Azure Storage and retrieve the actual data. After finishing, I realized the experiment is not useful because there is no way to scale this approach: if my app has a fixed number of partitioned Reliable Services working as a cache, I have no way to grow them by scaling out my Service Fabric cluster. If load increases, that fact will eventually hit me in the face :)
I still do not know a more efficient way to cache my super-hot, most-read messages, and I still have doubts about the Reliable Actors approach: it creates a huge amount of replicated data.
I think this is an ideal use of an actor.
The actor will be garbage collected after a period of time, so data won't stay in memory.
One actor per user.

Node.js application memory usage tracking and cleanup on exit

"A Node application is an instance of a Node Process Object".link
Is there a way in which local memory on the server can be cleared every time the node application exits.
[By application exit i mean that when each individual user of the website shuts down the tab on the browser]
node.js is a single process that serves all your users. There is no memory specific to a given user other than whatever state your own node.js code stores on the server on behalf of that user. If you have state like that, the typical ways to know when to clear it are as follows:
Offer a specific logout option in the web page, and when the user logs out, clear their state from memory. This doesn't catch all the ways a user might disappear, so it would typically be combined with the other options.
Have a recurring timer (say, every 10 minutes) that automatically clears the state of any user who has not made a web request within the last hour (or however long you want the timeout to be). This requires keeping a timestamp for each user, updated each time they access the site, which is easy to do in a middleware function (see the sketch after this list).
Have all your client pages keep a webSocket connection to the server, and when that webSocket connection has been closed and not re-established for a few minutes, assume the user no longer has any page open to your site and clear their state from memory.
Don't store user state in memory. Instead, use a persistent database with good caching. Then, when the user is no longer using your site, their state info will just age out of the database cache gracefully.
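
To make the recurring-timer option concrete, a minimal sketch (assuming Express; the header used to identify the user, the state map, and the timeouts are all illustrative):

const express = require('express');
const app = express();

const userState = new Map(); // hypothetical per-user state, keyed by user id
const IDLE_LIMIT_MS = 60 * 60 * 1000; // clear state after an hour of inactivity

// Middleware: stamp the user's last-seen time on every request.
app.use((req, res, next) => {
  const userId = req.get('x-user-id'); // illustrative; use your real session/user id
  if (userId) {
    const state = userState.get(userId) || {};
    state.lastSeen = Date.now();
    userState.set(userId, state);
  }
  next();
});

// Recurring sweep: drop state for users idle longer than the limit.
setInterval(() => {
  const now = Date.now();
  for (const [userId, state] of userState) {
    if (now - state.lastSeen > IDLE_LIMIT_MS) userState.delete(userId);
  }
}, 10 * 60 * 1000); // every 10 minutes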
Note: tracking overall memory usage in node.js is not a trivial task, so it's important to know exactly what you are measuring. Overall process memory usage is a combination of memory that is actually in use and memory that was previously used, is currently available for reuse, but has not been given back to the OS. You need to track the memory actually in use by node.js, not just the memory allocated to the process. A heap snapshot is one of the typical ways to see what is actually being used rather than what has been allocated from the OS (see the sketch below).
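
As a starting point, a sketch using only built-in Node APIs (the logging interval is arbitrary):

const v8 = require('v8');

// Log coarse numbers every minute: rss is what the OS has given the process,
// heapUsed is what live objects actually occupy, heapTotal is V8's reserved heap.
setInterval(() => {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  console.log(`rss=${rss} heapTotal=${heapTotal} heapUsed=${heapUsed}`);
}, 60 * 1000);

// For a per-object breakdown, dump a heap snapshot (Node 11.13+) and open it
// in Chrome DevTools.
function dumpHeap() {
  return v8.writeHeapSnapshot(); // writes a Heap-*.heapsnapshot file in the cwd
}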

Connecting to the new Azure Caching (DataCache, DataCacheFactory, & Connection Pooling)

The Windows Azure Caching Document says
"If possible, store and reuse the same DataCacheFactory object to conserve memory and optimize performance."
Has anyone seen any metrics or any quantification of how expensive this is?
One argument is that
"MaxConnectionsToServer setting... determines the number of chennels per DataCacheFactory that are opened to the cache cluster."
So if MaxConnectionsToServer = 1 and DataCacheFactory is a singleton in your app, then you've effectively syncronized all requests to your web server!
However, there is a lot of indication that the DataCacheFactory should be a singleton (e.g. created in Application_OnStart).
This is critical and I can't believe it is not in the Microsoft documentation. Is the DataCacheFactory treated the same in AppFabric, Azure Shared Caching, and Azure Caching? I just have a difficult time believing that Microsoft designed caching in a way that requires a singleton factory object. That is like requiring anyone who uses SqlConnection to have a singleton SqlConnectionFactory object in their application.
So, considering a relatively average web app (For example, 1,000s of requests per hour, ~ 100 objects in cache, the average request accesses 5 cached objects):
By default (and recommendation) how many Factory objects should there be at one time?
How long does it take to create a DataCacheFactory reference?
How long does it take to create a DataCache reference?
Should there only be one DataCacheFactory object per app and only one DataCache reference per request?
EDIT (answers in progress):
(1/2). Let Azure connection pooling handle the Factory objects
(3). Still testing...
(4). Still trying to figure out if I should re-use DataCache references
How about that, Microsoft did document best practices and it does involve connection pooling! Although not easy to find (at least for me).
It appears that the answer is simply not to use the DataCacheFactory object when implementing the newer Azure Caching, and to access the DataCache object directly:
"There are also new overloads to the DataCache constructor that make it simpler to create a cache client. In the past, it was always necessary to create a DataCacheFactory object that returns the target cache. Now it is possible to create the cache with the DataCache constructor directly. The following example creates a client to the default cache from the default section of the configuration file."
DataCache cache = new DataCache();
And to use connection pooling:
"With the latest Windows Azure SDK, connection pooling is enabled by default when you define your cache settings in the application or web configuration files. Because of this default behavior, it is important to set the size of the connection pool correctly. The connection pool size is configured with the maxConnectionsToServer attribute on the dataCacheClient element."
I wish Microsoft gave some guidance on how to configure the maxConnectionsToServer correctly but that can be determined through testing. The automatic connection pooling with the new Azure Caching is pretty cool :)
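For what it's worth, that setting lives in the dataCacheClient section of app.config/web.config; a minimal sketch (the client name and pool size here are illustrative, per the quoted docs):

<dataCacheClients>
  <!-- pool size set via maxConnectionsToServer, per the quote above -->
  <dataCacheClient name="default" maxConnectionsToServer="4">
    <!-- other client settings (auto-discovery, security, etc.) go here -->
  </dataCacheClient>
</dataCacheClients>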
I'm assuming you're referring to Shared Caching Service (previously known as Azure AppFabric Cache)
There is no cost per individual connection. However, when you purchase a Cache account, you're paying not only for the size of the cache but also for a particular number of connections.
The smallest cache account has 10 connections, while the most expensive one allows 160 concurrent connections. Thus, if you're concerned that you may run out of connections given the size of your account, it may be prudent to be careful about how many connections you open from your app.
More details
http://msdn.microsoft.com/en-us/library/windowsazure/hh697522.aspx

Azure DataCache MaxConnectionToServer

I am using the AppFabricCacheSessionStoreProvider and occasionally get the error
ErrorCode:SubStatus: There is a temporary failure. Please retry later. (The request failed because you exceeded quota limits for this hour. If you experience this often, upgrade your subscription to a higher one.) Additional Information: Throttling due to resource: Connections.
I am using a basic 128MB cache with a web role which has two instances. What is the default MaxConnectionsToServer value if it is not set? I think when I fire up a staging instance as well, it can cause this error (4 simultaneous instances). Will setting MaxConnectionsToServer to a higher value make it better or worse? I believe the 128MB cache has a limit of 5 connections, so should I set it to 1, which would mean only 4 connections could be used? The cache is not used elsewhere in the app.
The default for MaxConnectionsToServer is 1, so you shouldn't have to change this setting, but setting it to 1 explicitly will avoid confusing anyone else who looks at your config. If you set it to a higher value, you will see this problem more often.
The cache session provider seems to be a little slow at disposing of its connections to the cache when it no longer needs them. This means that if you're running a number of instances close to the limit for your cache size, you do seem to see this error. You're correct that a 128MB cache only allows 5 concurrent connections. If you want to avoid this problem, at the moment the only solution I'm aware of is to buy the next cache size up.
