|
nk4um User
Posts: 101
|
Keeping in mind that I have not actually tried to do this yet, my worry about implementing a cache as a resource in the address
space is that there would then be two caches to think about (the system standard cache and my caching resource) and only a
single set of metadata. To make effective, transparent use of my private cache I'd need to implement a different set of metadata,
and then all resources that would be cached would need to use that metadata instead of the standard metadata, so it would not be
a simple matter of dropping the caching resource on top. For example, if I want to cache some:resource using active:myCache I add the rewrite rule
<rewrite> <match>(some:resource.*)</match> <to>active:myCache+resource@$1</to> </rewrite>
|
but when the myCache accessor makes the subrequest to the original resource it hits the system cache first. The original
resource could have been set up to not get cached by the standard cache (e.g., using AlwaysExpiredMeta) but then myCache would
need to use different information to determine cacheability. And then, if I test this and find that myCache is ineffective,
the original resource needs to go back to using metadata that will allow it to be cached normally by the system.
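
To make the effect of the rewrite rule above concrete, here is a rough illustration using an ordinary regular expression. It is a Python sketch, not NetKernel code: the `rewrite` helper is hypothetical, NetKernel's `$1` is written `\1`, and it assumes the match has to cover the whole identifier.

```python
import re

# Hypothetical stand-in for the <rewrite> rule quoted above; not NetKernel code.
MATCH = re.compile(r"(some:resource.*)")
TO = r"active:myCache+resource@\1"  # "\1" plays the role of "$1"


def rewrite(identifier: str) -> str:
    """Return the rewritten identifier, or the original if the rule does not match."""
    m = MATCH.fullmatch(identifier)  # assumption: the rule must match the whole identifier
    return m.expand(TO) if m else identifier
```

Every identifier that starts with some:resource is wrapped in a request to active:myCache, and anything else passes through untouched. The subrequest that myCache then issues for the original identifier is the one that meets the system cache first.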
|
|
nk4um Moderator
Posts: 485
|
September 10, 2008 08:29
Hi Jeff,
Yes I can see how the caching solution you require will be common across a lot of your infrastructure and is likely a common
requirement when building scalable solutions.
I would suggest that we work toward getting the caching technology working independent of how it plugs into the NetKernel
infrastructure. Then we can try both approaches and see what looks best.
I would not agree that mapping the caching into address space is a clumsier approach. The degrees of freedom that this offers
are why ROC is a powerful abstraction. It becomes easy to reconfigure what and where things are cached. This would be my preferred
approach to getting started as you will be working within the abstraction of NetKernel. I would start with an accessor supporting
SOURCE and SINK verbs to do the cache GET and PUT operations respectively.
Cheers, Tony
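
A minimal sketch of the SOURCE/SINK idea suggested above, written in plain Python rather than against the NetKernel accessor API. The names `CacheResource`, `source`, `sink` and `fetch` are assumptions made for illustration only; a real accessor would dispatch on the request verb.

```python
from typing import Callable, Dict, Optional


class CacheResource:
    """Toy cache addressed by key: SOURCE reads an entry, SINK writes one."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def source(self, key: str) -> Optional[bytes]:
        # Cache GET: None signals a miss.
        return self._store.get(key)

    def sink(self, key: str, value: bytes) -> None:
        # Cache PUT: overwrite any previous entry for this key.
        self._store[key] = value


def fetch(cache: CacheResource, key: str, compute: Callable[[str], bytes]) -> bytes:
    """Return the cached value, computing and storing it on a miss."""
    hit = cache.source(key)
    if hit is not None:
        return hit
    value = compute(key)
    cache.sink(key, value)
    return value
```

Keeping GET and PUT as two verbs on one resource means the backing store (memory, disk, a shared cluster) can be swapped without touching the callers.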
|
|
nk4um User
Posts: 101
|
Our goal here with caching is a common one, to reduce load on our database server. We want to be able to cache (and importantly,
pre-cache) several tens of thousands of queries in near-html format (so that the processing to get to a complete page is minimal).
The standard cache would be less than ideal as it gets cleared on restart and would likely be unable to accommodate the number
of entries. A clustered memory space such as Terracotta would eliminate (or at least cushion) the size limit. A second consideration
is that we need high availability, so we would have at least 2 NK instances backing the service, so having them able to serve
out of a common cache (so that the heavy work only needs to be done once) would be a plus. I think a disk cache would meet our performance and scale needs, but as they say the proof is in the pudding so I need to
first make pudding :) I could implement a cache as a filtering resource, and rewrite everything to use that resource, e.g.,
<rewrite> <match>(.*)</match> <to>active:myCache+uri@$1</to> </rewrite>
|
but that seems clumsier than implementing a cachelet and letting NK handle the cache mapping.
|
|
nk4um Moderator
Posts: 485
|
Hi Jeff,
You are absolutely correct; in NetKernel 3 the cache is chosen by looking up the superstack of the request for a custom cache
and defaults to the system wide cache if none is found. That functionality is an integral part of the kernel but to be honest
I have never seen it utilized, as systems usually work best with one cache.
If you move away from the simplicity of one cache then there are many ways that you might want to partition requests across
them. One approach is to replace the global cache with a wrapper that does some switching based on the request. Another approach
is to insert an application level cache as an accessor in your modules.
I would suggest, given the specificity of what you want to cache, that the latter approach is better. I don't think there is
a problem with overhead as all the variables are there in the response metadata for you to utilise in your cache. There could
be an issue with remote invalidation synchronization if you have those two layers. In that case you might be better forcing
the system cache not to cache those values.
Could you give us an overview of the architecture you are thinking of that would utilise this networked cache? It may give
me some more ideas that I could suggest to you.
Cheers, Tony
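
The first approach mentioned above, a wrapper that switches between caches based on the request, could look roughly like this. It is a Python sketch under stated assumptions: `DictCache`, `SwitchingCache` and prefix-based routing are all hypothetical and are not how the NetKernel 3 kernel cache is implemented.

```python
from typing import Dict, List, Optional, Tuple


class DictCache:
    """Trivial in-memory cache used as a stand-in for any real cache."""

    def __init__(self) -> None:
        self.entries: Dict[str, str] = {}

    def get(self, request: str) -> Optional[str]:
        return self.entries.get(request)

    def put(self, request: str, value: str) -> None:
        self.entries[request] = value


class SwitchingCache:
    """Wrapper that picks a backing cache from the request identifier."""

    def __init__(self, default: DictCache, routes: List[Tuple[str, DictCache]]) -> None:
        self.default = default
        self.routes = routes  # (identifier prefix, cache) pairs, checked in order

    def _select(self, request: str) -> DictCache:
        for prefix, cache in self.routes:
            if request.startswith(prefix):
                return cache
        return self.default  # anything unrouted goes to the default cache

    def get(self, request: str) -> Optional[str]:
        return self._select(request).get(request)

    def put(self, request: str, value: str) -> None:
        self._select(request).put(request, value)
```

Because each request lands in exactly one backing cache, this arrangement avoids having two layers holding the same value, which is where the invalidation concern arises.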
|
|
nk4um User
Posts: 101
|
Well, I guess it is when reading the docs, but not as useful as I had hoped.
I'm not sure what would be the best approach to distributed caching, but the first step I'm trying to take is implementing
a basic cache in the first place. I'd like to apply a custom cache to just the results from one module while letting the
system cache handle everything else. However, since a cache implementation is found by going up the call stack, when a cache
is set in a module all subrequests must use that same cache implementation. So if I wanted to cache the output of my
dpml pages differently (say, to a slower but persistent cache of disk-based files, or using slightly different cache semantics
like using only the request and not the call stack as a key) using the built-in cache, it would end up trying to cache every
resource request that is made in the creation of that page. Needless to say, this could adversely affect both performance
and correctness.
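
As a rough illustration of the kind of cache described above (persistent, disk-based, keyed only on the request rather than the call stack), here is a small Python sketch. The `DiskCache` class, the one-file-per-entry layout and the SHA-256 file naming are assumptions for the example, not anything NetKernel provides.

```python
import hashlib
from pathlib import Path
from typing import Optional


class DiskCache:
    """One file per entry; the key is the request identifier alone."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, request: str) -> Path:
        # Hash the identifier so any request string maps to a safe file name.
        digest = hashlib.sha256(request.encode("utf-8")).hexdigest()
        return self.root / digest

    def get(self, request: str) -> Optional[bytes]:
        path = self._path(request)
        return path.read_bytes() if path.exists() else None

    def put(self, request: str, value: bytes) -> None:
        # Write to a temporary file, then rename, so readers never see a partial entry.
        path = self._path(request)
        tmp = path.with_name(path.name + ".tmp")
        tmp.write_bytes(value)
        tmp.replace(path)
```

Entries survive a restart because a new `DiskCache` pointed at the same directory finds the same files. The sketch leaves out expiry and size limits entirely.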
The alternative I guess is to rewrite all requests in my module to be wrapped by a caching accessor that handles this explicitly,
but then if I want to use the same caching logic (metadata, expiry, and so forth) I need to implement all that myself which
seems, well, redundant.
A possible enhancement to address this is to add in a module private cache that would be used for requests within a module
but not for subrequests to other modules. Looking at Cache.java I don't think this would be a complicated change (check
for a module private cache before walking the superstack and only check for an inherited cache in the walk) but there may well
be subtleties in the cache finding that are not immediately evident.
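
The lookup order proposed above can be sketched as follows. This is Python pseudologic, not a rendering of Cache.java: the `Module` record, its `private_cache` and `inherited_cache` fields, and the string cache names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

SYSTEM_CACHE = "system"


@dataclass
class Module:
    name: str
    inherited_cache: Optional[str] = None  # applies to this module and its subrequests
    private_cache: Optional[str] = None    # proposed: applies only within this module


def find_cache(superstack: List[Module]) -> str:
    """superstack[0] handles the request; later entries are its callers."""
    # Proposed step: a module private cache wins, but only for the handling module.
    if superstack and superstack[0].private_cache is not None:
        return superstack[0].private_cache
    # Existing behaviour as described in the thread: walk up for an inherited cache.
    for module in superstack:
        if module.inherited_cache is not None:
            return module.inherited_cache
    # Nothing found: fall back to the system-wide cache.
    return SYSTEM_CACHE
```

With this ordering a private cache on the page module never leaks into the modules it calls: for a subrequest the page module is no longer at the head of the superstack, so only inherited caches are considered.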
|
|
nk4um Moderator
Posts: 756
|
Hi Jeff,
I did a little experimenting with Terracotta when it first appeared and successfully shared an object between two NetKernel
instances. I can't recall the details but I remember that it would be tricky to implement a kernel-level distributed cache
with NK3 (though NK4 is planned for this). However if you want to share resource state between NKs then the best thing to
do is to have an object container shared inside an Accessor and then RESTfully treat that as a distributed resource.
We're really busy putting the last touches to NK4 before the developer conference next month in Virginia. As I mentioned,
this is definitely something we have planned in the NK4 roadmap.
Cheers
Pete
|
|
nk4um User
Posts: 101
|
I've been thinking about implementing a networked cache module using Terracotta or memcached. Has anyone done something similar and have any failures or successes to warn or encourage me?
|