Lab notes #007 Reuse not possible
While moving forward through the implementation, it wasn't clear how the indexing worked in detail. How do you determine what segments were different between the live and storage-backed indices? By definition, aren't all segments in the live index novel? But more worryingly, what is the key used to store the segments in the storage?
It can't be the key used in the live index, because that maps to an individual datom. Rich says it's "like a UUID", but it's unclear exactly how. Should it be a randomly generated UUID or should it be a content-addressed hash? After finding a half-implemented version of Datomic in Rust on Github, and re-reading papers on Relaxed Radix Balanced Trees, it was clear to me you can use the upper bits of the key that captures the path down to the internal node just above the leaf node as the key to store the segment in storage.
Given that, I took a look at how the im crate implements its OrdMap. While it internally uses a BTree, it doesn't give me access to the node or a partial path down to the laste internal node. I need this, presumably to give a stable key for external storage.
Hence, it looks like I have to take a step back and try to implement the basic persistent data structure. It looks like Relaxed Radix Balanced Trees is the one to go for, given that it has all the properties I'm looking for.
Tried really hard to find a persistence library, since it seems like it's a commodity, and not something that's a differentiator. But given very specific needs, that isn't satisfied by the small number of libraries anyway, I'd need to implement it from scratch.