Information-rich Cache

Information-rich Cache

I was surprised to learn that different stacks of software all do some type of computational analysis to better do its job. The CPU does it to figure out how to pipeline instructions. Compilers do it to feed CPUs better set of instructions. Front-end frameworks do it to figure out how to incrementally update the DOM.

We separate our tech stack into layers to make it more understandable, but it does seem like a pity that the same work is done multiple times. It might be one of those problems that seem the same on the surface, but it really isn't, so if you tried to unify them somehow, you'd get a big specification mess. Still, it does seem regrettable.

This brings me to caching. Caching is hard because the cache subsystem often has the least information about what has changed in the system. If the computational dependency graph was available to a cache, it'd know exactly what outputs to invalidate based on the inputs that were changed. Taken to the extreme, if you have this computational dependency graph, you can carry it all the way to the GPU. This only solves instances where we need to update caches initiated by a change in the data store. This is the problem of cache invalidation.

In the instances where we want to cache based on the user's actions, it's harder to know what to cache. This is not a problem of cache invalidation, but what to cache in the first place. A cache miss requires us to fetch the data as it's requested, which means we need to wait for new data when we've discovered that we're about to step off the edge of the world.

There are predictive techniques we can use to try to figure out what the user will do next, so we can do pre-fetching. I think this can be simplified if we know ahead of time the possible actions, and hence the new sets of data the user could possibly request. Just like there are only two ordinal directions on a map, if we had such a map–a state dependency graph, then we can pre-fetch the "next possible steps" ordered by likelihood so that the user would never fall off the edge of the world if they didn't walk too fast.

We currently don't encode this state dependency graph in any way, other than in HATEOS, which isn't in wide accepted use. But I think if we can have the same idea on the client, then it'd be able to ask the server for more data without involvement from the developer manually asking for data when the client is fetching data.