<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Interjected Future]]></title><description><![CDATA[Programming frontiers and technology optimism with a sprinkling of startups.]]></description><link>https://interjectedfuture.com/</link><image><url>https://interjectedfuture.com/favicon.png</url><title>Interjected Future</title><link>https://interjectedfuture.com/</link></image><generator>Ghost 4.48</generator><lastBuildDate>Sat, 10 Feb 2024 21:03:31 GMT</lastBuildDate><atom:link href="https://interjectedfuture.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[This is not a device. It's an invitation to explore the frontier.]]></title><description><![CDATA[<hr><figure class="kg-card kg-image-card"><img src="https://interjectedfuture.com/content/images/2024/01/rabbit_r1-1.jpg" class="kg-image" alt loading="lazy" width="818" height="600" srcset="https://interjectedfuture.com/content/images/size/w600/2024/01/rabbit_r1-1.jpg 600w, https://interjectedfuture.com/content/images/2024/01/rabbit_r1-1.jpg 818w" sizes="(min-width: 720px) 720px"></figure><p>Recently, the <a href="https://www.rabbit.tech/keynote">Rabbit r1 launched</a>. It&apos;s a new handheld device that promises a new primary interaction model with compute: by talking to them in the context of the real world. In turn, they would help us perform actions to help us with unstructured, but menial tasks throughout our</p>]]></description><link>https://interjectedfuture.com/this-is-not-a-device-its-an-invitation-to-explore-the-frontier/</link><guid isPermaLink="false">65a06ef0ffc0c60001f61378</guid><category><![CDATA[techno-optimism]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Sat, 27 Jan 2024 17:10:00 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2024/01/DALL-E-2024-01-26-11.25.39---Adjusted-illustration-for-the-post--This-is-not-a-device.-It-s-an-invitation-to-explore-the-frontier---keeping-the-previously-specified-attributes-but.jpg" medium="image"/><content:encoded><![CDATA[<hr><figure class="kg-card kg-image-card"><img src="https://interjectedfuture.com/content/images/2024/01/rabbit_r1-1.jpg" class="kg-image" alt="This is not a device. It&apos;s an invitation to explore the frontier." loading="lazy" width="818" height="600" srcset="https://interjectedfuture.com/content/images/size/w600/2024/01/rabbit_r1-1.jpg 600w, https://interjectedfuture.com/content/images/2024/01/rabbit_r1-1.jpg 818w" sizes="(min-width: 720px) 720px"></figure><img src="https://interjectedfuture.com/content/images/2024/01/DALL-E-2024-01-26-11.25.39---Adjusted-illustration-for-the-post--This-is-not-a-device.-It-s-an-invitation-to-explore-the-frontier---keeping-the-previously-specified-attributes-but.jpg" alt="This is not a device. It&apos;s an invitation to explore the frontier."><p>Recently, the <a href="https://www.rabbit.tech/keynote">Rabbit r1 launched</a>. It&apos;s a new handheld device that promises a new primary interaction model with compute: by talking to them in the context of the real world. In turn, they would help us perform actions to help us with unstructured, but menial tasks throughout our day.</p><p>The reaction was visceral. As evidenced by the entire thread of replies, otherwise reasonable people were overcome by unreasonable emotion against someone building something categorically new.</p><figure class="kg-card kg-embed-card"><blockquote class="twitter-tweet"><p lang="en" dir="ltr">Hot take: NO ONE actually wants AI to:<br>1) Book their vacations<br>2) Order a random pizza<br>3) Have access to their bank account <br><br>Why are we creating tech no one wants? <a href="https://t.co/nvwlBxv1HN">pic.twitter.com/nvwlBxv1HN</a></p>&#x2014; Matt Van Swol (@matt_vanswol) <a href="https://twitter.com/matt_vanswol/status/1745149206447140984?ref_src=twsrc%5Etfw">January 10, 2024</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</figure><p>Of course, hot takes for internet points is nothing new, and it has no cost to take up a dissenting view. But there&apos;s this insidious underlying pessimism whenever we introduce something categorically new: why are we making tech where I can&apos;t see the need?</p><p>I had originally set out to write about the pessimistic reaction to the Rabbit r1 (having no affiliation with the company), but this expanded to an expression of my broader view: that technological optimism is required for seeing into the future for the health of both the individual and society. </p><p>Surprisingly, I find this cynical perspective common even among tech workers, the people supposedly building the future.[^0] Our collective memories are so short that we can&apos;t remember how obviously useful technology started as toys with non-obvious utility. As if the technologies one uses today weren&apos;t once <a href="https://pessimistsarchive.org/">radical heresies</a> that transformed into an ambient miracle we take for granted. And as a technology optimist, I find it incredibly shortsighted to dismiss anything outright. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://interjectedfuture.com/content/images/2024/01/image-3.png" class="kg-image" alt="This is not a device. It&apos;s an invitation to explore the frontier." loading="lazy" width="1456" height="972" srcset="https://interjectedfuture.com/content/images/size/w600/2024/01/image-3.png 600w, https://interjectedfuture.com/content/images/size/w1000/2024/01/image-3.png 1000w, https://interjectedfuture.com/content/images/2024/01/image-3.png 1456w" sizes="(min-width: 720px) 720px"><figcaption>Hide your loved ones. Bicycles are coming. <a href="https://newsletter.pessimistsarchive.org/p/new-years-resolutions-of-yore-vices">More fear and uncertainty here</a>.</figcaption></figure><p>Here&apos;s where one might object: it&apos;s selective bias if we only look backward! Of course, that will make it seem all heretical new inventions will become commonplace. What about all the different technological innovations that failed?</p><p>I&apos;m not suggesting that every new thing that comes on the market has an inevitable climb to being indispensable and commonplace. But I am saying for categorically new technology, we can&apos;t tell just by thinking about it.</p><p>For sustaining technologies that have an existing market of jobs-to-be-done, we can accurately tell whether we can make use of a new technology or not. When a new smartphone comes on the market today, it does mostly the same job that I&apos;ve used other smartphones for before. Take pictures, chat with friends, navigate the world, check email, troll on social media, or hail a ride. I can accurately predict whether I&apos;d find any of these useful because it&apos;s easier to imagine something I&apos;m already doing, but better. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://interjectedfuture.com/content/images/2024/01/image-4.png" class="kg-image" alt="This is not a device. It&apos;s an invitation to explore the frontier." loading="lazy" width="640" height="504" srcset="https://interjectedfuture.com/content/images/size/w600/2024/01/image-4.png 600w, https://interjectedfuture.com/content/images/2024/01/image-4.png 640w"><figcaption>One-wheeled motorcycles and other <a href="https://www.themarginalian.org/2012/03/21/strange-invetions/">inventions that failed</a>.</figcaption></figure><p>But for potentially disruptive and categorically new technology that addresses completely new and underserved markets, it&apos;s very hard to see or predict where the utility will be. By definition, technology that caters to new and underserved markets addresses our changing habits, boosts our capabilities, or expands our possibilities&#x2013;something different from what we do now. Unless you happen to be smack middle of the underserved niche, it won&apos;t be obvious how a categorically new technology is useful. It takes work to move yourself into the niche to see a new technology&apos;s potential.</p><p>When the iPhone first came out, the biggest objection was its <a href="https://www.youtube.com/watch?v=eywi0h_Y5_U">lack of a physical keyboard</a>. But with <a href="https://zlib.pub/book/creative-selection-4c9706ek36s0">additional predictive technology</a>, a device with no keyboard was usable without fanfare. Initial objections loom so large in the pessimist&apos;s mind that they can&apos;t get beyond it to ask more forward-looking questions&#x2013;to move themselves into the niche of a categorically new product. Doing so is to explore, to say to oneself, &quot;What if this did work? Then what?&quot;</p><figure class="kg-card kg-embed-card kg-card-hascaption"><iframe width="200" height="113" src="https://www.youtube.com/embed/NtOgnq8lLtw?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen title="Autocorrect - we have a live/hate relationship"></iframe><figcaption>Perhaps some fanfare.</figcaption></figure><p>An explorer would notice that a multitouch interface expands the set of mobile applications beyond email. Then the forward-looking question was, &quot;<a href="https://www.youtube.com/watch?v=szrsfeyLzyg">What mobile applications does this enable</a>?&quot; An explorer would see the smartphone made computing intensely personal for the first time by being always on you in your pocket. Then the forward-looking question was, &quot;<a href="https://www.afterbabel.com/p/academic-pressure-social-media">How does this affect how people view each other or themselves</a>?&quot; It may not be clear what the answers are, but at least you&apos;re starting to ask targeted questions about the future. Because, unlike technology in existing markets, it&apos;s hard for us to imagine something we&apos;ve never done before. </p><p>Hence, this is my own hot take: <strong>The Rabbit r1 isn&apos;t a device. It&apos;s an invitation to explore a new frontier.</strong></p><!--kg-card-begin: html--><div class="kg-card kg-button-card" style="border: 1px solid #ddd; padding: 1em; border-radius: 0.5em;">
   Enjoying what you&apos;re reading?
   <a class="kg-btn kg-btn-accent" style="margin-left: 1em" href="https://interjectedfuture.com/this-is-not-a-device-its-an-invitation-to-explore-the-frontier">Subscribe for more</a>
</div><!--kg-card-end: html--><h3 id="all-is-not-lost">All is not lost</h3><p>Well, not everyone is pessimistic. Rabbit sold 10k <s>devices</s> invitations to explore on the first day, and sold out of 5 batches.</p><p>What I&apos;m excited about the Rabbit r1 is that it brings real-world context into an LLM. My own experiences with LLMs conclude context is very important for LLMs to be useful. To that end, we currently pair a search over a database (Retrieval Augmented Generation, RAG) with an LLM. But we can pair it with more than just a database. Rabbit provides a multimodal context of the real world. It makes the real world a &quot;database&quot; that an LLM can query, and that opens up many categorically different use cases than we&apos;ve had in the past year using LLMs in a chatroom context. </p><p>And lots of people on <a href="https://twitter.com/localghost/status/1745153927690223841">this thread</a> have positive outlooks about its prospects too.</p><figure class="kg-card kg-embed-card"><blockquote class="twitter-tweet"><p lang="en" dir="ltr">lots of folks coming down hard on rabbit.<br><br>i&apos;d rather live in a world with more stuff like this being made, wouldn&apos;t you? <a href="https://t.co/pWMoMuVcGe">https://t.co/pWMoMuVcGe</a></p>&#x2014; Aaron Ng (@localghost) <a href="https://twitter.com/localghost/status/1745153927690223841?ref_src=twsrc%5Etfw">January 10, 2024</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</figure><figure class="kg-card kg-embed-card"><blockquote class="twitter-tweet"><p lang="en" dir="ltr">People are so used to a mature market where they get a handful of iterative/boring devices a year<br><br>News flash: we&#x2019;re in the midst of a paradigm shift and we get to have fun again</p>&#x2014; Ben South (@bnj) <a href="https://twitter.com/bnj/status/1745181950200156454?ref_src=twsrc%5Etfw">January 10, 2024</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</figure><figure class="kg-card kg-embed-card"><blockquote class="twitter-tweet"><p lang="en" dir="ltr">Great point! We all want innovation and competitors, and even if big companies can do it better, I&#x2019;ll support the small companies on principle and knowing there&#x2019;s always gems &#x1F48E; to be found.</p>&#x2014; AlphaRomeoSierra (@ARomeoSierra) <a href="https://twitter.com/ARomeoSierra/status/1745467799781175452?ref_src=twsrc%5Etfw">January 11, 2024</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</figure><figure class="kg-card kg-embed-card"><blockquote class="twitter-tweet"><p lang="en" dir="ltr">Great to see folks trying things and opening our eyes to possibilities. Not all will be here long term but they do provide something to build on and perhaps help us rethink how we get through our day.</p>&#x2014; Gary Greco (@garyzface) <a href="https://twitter.com/garyzface/status/1745154524132835493?ref_src=twsrc%5Etfw">January 10, 2024</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</figure><p>There are still explorers of frontiers out there, looking towards the future. I think it&apos;s important that we keep that sense of excitement for the future alive. Technology doesn&apos;t progress all on its own. It only happens when people who see a problem are motivated to find a solution, either by using one or by creating one.</p><p>Why is seeing the future important? How do you set yourself up to have the greatest chance of seeing the future?</p><p>If you want to hear more on this line of thought about optimism and seeing the future, <a href="https://interjectedfuture.com/this-is-not-a-device-its-an-invitation-to-explore-the-frontier">subscribe</a> or <a href="https://twitter.com/iamwil">follow me on Twitter</a>. <br><br><em>Thanks to Eric Dementhon for reading the drafts.</em></p><hr><p>[^0] For non-tech workers, I find disruptive innovative tech doesn&apos;t even register in their heads. They don&apos;t even know what they&apos;re looking at when you show it to them.</p>]]></content:encoded></item><item><title><![CDATA[An incomplete Unicit Merkle Tree]]></title><description><![CDATA[<div class="kg-card kg-callout-card kg-callout-card-blue"><div class="kg-callout-emoji">&#x1F449;</div><div class="kg-callout-text">When I had mistakenly thought that <a href="https://interjectedfuture.com/lab-notes/lab-note-030-the-trick-to-unicity-of-prolly-trees/">Prolly Trees weren&apos;t unict</a>, I thought maybe an interesting way forward was to leverage the same mechanism CRDTs use to converge.<br><br>Below is a mathematical description of what I called a Lattice Merkle Tree, still incomplete. I had time-boxed this, so</div></div>]]></description><link>https://interjectedfuture.com/an-incomplete-unicit-merkle-tree/</link><guid isPermaLink="false">65863015ffc0c60001f5f5ea</guid><category><![CDATA[programming]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Thu, 25 Jan 2024 17:05:00 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2024/01/DALL-E-2024-01-08-11.29.14---Hero-image-for-the-article--Inventing-the-Lattice-Merkle-Tree---depicted-in-an-origami-art-style-using-pastel-colors.-The-image-should-creatively-repr.jpg" medium="image"/><content:encoded><![CDATA[<div class="kg-card kg-callout-card kg-callout-card-blue"><div class="kg-callout-emoji">&#x1F449;</div><div class="kg-callout-text">When I had mistakenly thought that <a href="https://interjectedfuture.com/lab-notes/lab-note-030-the-trick-to-unicity-of-prolly-trees/">Prolly Trees weren&apos;t unict</a>, I thought maybe an interesting way forward was to leverage the same mechanism CRDTs use to converge.<br><br>Below is a mathematical description of what I called a Lattice Merkle Tree, still incomplete. I had time-boxed this, so I got as far as I could in a limited amount of time. <br><br>I&apos;m putting it out there as a pin, to come back to it at a later time. If anyone wants to carry the torch from here on out, they&apos;re welcome to.</div></div><img src="https://interjectedfuture.com/content/images/2024/01/DALL-E-2024-01-08-11.29.14---Hero-image-for-the-article--Inventing-the-Lattice-Merkle-Tree---depicted-in-an-origami-art-style-using-pastel-colors.-The-image-should-creatively-repr.jpg" alt="An incomplete Unicit Merkle Tree"><p>We want a data structure that holds an ordered set of keys, as a way to build composite CRDTs. <em>The keys must have a total order</em>.</p><p>This data structure should be like a B+tree, Prolly Tree, and Merkle Tree. It should have the following properties:</p><ul><li>High-fanout on each node.</li><li>The tree is balanced and low in height to keep query complexity around O(log n)</li><li>Balancing the tree will change as little of the tree as possible upon insertion of keys.</li><li>The tree is unicit, where the order of insertion doesn&apos;t alter the state of the tree. The same ordered set of keys will yield the same tree.</li></ul><p>The construction of the tree is as follows:</p><ul><li>A leaf node contains entries that are an ordered subsequence of keys in the set. </li><li>A branch node contains entries of the content hash and the range of keys of every child node.</li><li>Regardless of type, the ideal node size is \(N\).</li><li>It&apos;s a Merkle tree, so each node is identified with a hash of its contents. A leaf node is a hash of its keys. A branch node is a hash of the concatenations of the child node hashes and ranges.</li></ul><h2 id="balance-with-history-independence">Balance with history independence</h2><p>As with any tree, we want ours to be <strong>balanced</strong>, where the number of entries in each node should be about the same. Balanced trees are desirable for efficient queries. At the same time, we want <strong>unicity</strong>, where the same set of ordered keys must result in the same tree, regardless of insertion order. Unicit trees are desirable for Merkle Trees for fast equality comparisons by simply comparing root hashes.</p><p>In B+trees. nodes are balanced on insert, but the structure of the tree is not history-independent. Two B+trees could be different, even if they contain the same set of keys.</p><p>In Merkle Search Trees and Prolly Trees, nodes are probabilistically balanced. While the expected value of nodes is \(N\), the variance of node sizes is large. There is a variant of Prolly Trees with a tighter variance of node sizes, but it is not history-independent. </p><p>It turns out unicity is in conflict with balancing nodes. Could we use some of the same math in CRDTs to get a unique balanced tree regardless of the insertion order of keys?</p><p>To do that, we&apos;ll need:</p><ul><li>An element type that describes how the ordered set of keys is divided between leaf nodes.</li><li>A partially ordered set of elements defined with a binary relation that orders two elements, \(A \leq B\)</li><li>A merge function that combines two elements into a third element that is closed over the element type.</li></ul><p>The binary relation of a partially ordered set needs to have the following properties: Reflexive, Antisymmetric, and Transitive.</p><p>The merge function needs to have the following properties: Commutative, Associative, and Idempotent.</p><p>We&apos;ll define these properties later.</p><h2 id="multi-split-description">Multi-Split Description</h2><p>How do we define the elements in the partially ordered set that can describe how keys are split between leaf nodes? We define <em>Multi-split descriptions (MSD)</em> that describe the splits.</p><ul><li>A multi-split description can be represented as a sequence of <em>key groups</em>: \(A = [K_0^A, K_1^A, \ldots, K_i^A]\), where each key group \(K_j^A\) is an ordered subset of keys.</li><li>The key groups are ordered in the sense that if \(j&lt;k\), then every key in \(K_j\)&#x200B; is less than every key in \(K_k\)&#x200B; according to the total order of keys.</li><li>Each key group \(K_j\)&#x200B; represents a set of keys that would reside together in a node if the tree were split at this point.</li></ul><!--kg-card-begin: html--><div class="kg-card kg-button-card" style="border: 1px solid #ddd; padding: 1em; border-radius: 0.5em;">
   Enjoying what you&apos;re reading?
   <a class="kg-btn kg-btn-accent" style="margin-left: 1em" href="https://interjectedfuture.com/an-incomplete-unicit-merkle-tree/#/portal">Subscribe for more</a>
</div><!--kg-card-end: html--><h2 id="partially-ordered-set">Partially Ordered Set</h2><p>To get a partially ordered set (poset) of MSDs, we&apos;d have to define a binary relation \(\leq\) that orders two MSDs. Keep in mind that the keys in a key group of an MSD have a total order, but the MSDs themselves have a partial order.</p><p>The binary relation has two criteria: a primary criterion based on <em>closeness/variance</em> and a secondary criterion based on <em>subsequences</em>.</p><h3 id="primary-criterion-closenessvariance">Primary Criterion: Closeness/Variance</h3><p>Each MSD \(A = [K_0^A, K_1^A, \ldots, K_i^A]\) is evaluated based on how closely each key group \(K_i^A\)&#x200B; adheres to a target node size \(N\) and the variance in sizes across the key groups. The metric for closeness/variance is</p><p>$$M_A = \sum_{i=0}^{n} (\text{length}(K_i^A) - N)^2$$</p><p>where \(n\) is the total number of key groups in the multi-split description \(A\), \(K_i^A\) is the ith key group of MSD \(A\), and \(N\) is the target node size. </p><h3 id="secondary-criterion-subsequences">Secondary Criterion: Subsequences</h3><p>There can be multiple MSDs with the same primary metric \(M\), so the secondary criterion serves as a tie-breaker and indication of partial order based on subsequences. &#xA0;For two multi-split descriptions \(A\) and \(B\), \(A\) is considered less than or equal to \(B\) under this criterion if each key group in \(A\) (\(K_i^A\)), is a subsequence of the corresponding key group in \(B\), (\(K_i^B\)).</p><h3 id="binary-relation-definition">Binary Relation Definition</h3><p>For two multi-split descriptions \(A\) and \(B\), we define \(A \leq B\) in the poset if:</p><ol><li><strong>Primary</strong>: The closeness/variance measure \(M_A&#x200B; \geq M_B\). Notice the direction of the inequality is reversed. The closer \(M\) is to zero, the better balanced it is. </li><li><strong>Secondary</strong>: If two descriptions \(A\) and \(B\) have the same closeness/variance measure \(M_A = M_B\)&#x200B;, then \(A \leq B\) if each key group \(K_i^A\) in \(A\) is a subsequence of the corresponding key group \(K_i^B\) in \(B\).</li></ol><h3 id="required-properties-of-a-partially-ordered-set">Required Properties of a Partially Ordered Set</h3><ul><li><strong>Reflexivity</strong>: \(A \leq A\) <br>Every multi-split description \(A\) is reflexive with itself since \(M_A = M_A\)&#x200B; and each key group \(K_i^A\)&#x200B; is trivially a subsequence of itself. Therefore, \(A \leq A\).<br></li><li><strong>Antisymmetry</strong>: If \(A \leq B\) and \(B \leq A\), then \(A = B\). <br>Assume \(A \leq B\) and \(B \leq A\). Hence, \(M_A \geq M_B\) and \(\text{for each } i, K_i^A \text{ is a subsequence of } K_i^B\). Also, \(M_B \geq M_A\) and \(\text{for each } i, K_i^B \text{ is a subsequence of } K_i^A\). From \(M_A \geq M_B\) and \(M_B \geq M_A\), we deduce that \(M_A = M_B\). And since each \(K_i^A\)&#x200B; is a subsequence of \(K_i^B\) and each \(K_i^B\) is a subsequence of \(K_i^A\), each pair of corresponding key groups must be identical. That is, for each \(i\), \(K_i^A = K_i^B\).<br></li><li><strong>Transitivity</strong>: If \(A \leq B\) and \(B \leq C\), then \(A = C\) <br>If \(A \leq B\) (meaning \(M_A \geq M_B\)&#x200B; and subsequences align) and \(B \leq C\) (meaning \(M_B \geq M_C\)&#x200B; and subsequences align), then \(M_A \geq M_C\)&#x200B; and the subsequence condition from \(A\) to \(C\) will also hold, thus \(A \leq C\). Therefore, If \(A \leq B\) and \(B \leq C\), then \(A \leq C\).</li></ul><h2 id="merge-function">Merge function</h2><p>Now that we have a partially ordered set (poset) of MSDs, we need to define a merge function. The merge function combines two MSDs and returns a third MSD that represents the Least Upper Bound of the two MSDs in the join-semilattice.</p><p>$$A \sqcup B = C$$</p><p>where \(A\) and \(B\) are MSDs that combine into the merged MSD \(C\).</p><p>An outline of this merge function is as follows:</p><ul><li>For every key group \(K_i^A\) in \(A\), merge it with its corresponding key group \(K_i^B\) in \(B\). Recall each key group is a totally ordered set, and each key is unique, so they can be merged according to a set merge. </li><li>For the key group \(K_c^{A \sqcup B}\) that changed at index \(c\), due to the insertion or deletion of a key, we rebalance the splits between the key group and its neighbouring siblings. This <em>has</em> to be a function that only relies upon the changed key group \(K_{c}^{A \sqcup B}\), and its previous sibling \(K_{c-1}^{A \sqcup B}\) and next sibling \(K_{c+1}^{A \sqcup B}\).</li><li>After this rebalancing, we change the content hash references to the child node on the parent node recursively back up to the root node.</li></ul><p>The rebalancing function must be a pure function of the three key groups under consideration. There is a list of possible &quot;moves&quot; to rebalance the three key groups:</p><ul><li>Do nothing (leave \(K_{c}^{A \sqcup B}\) as is)</li><li>Move the leftmost key of \(K_{c}^{A \sqcup B}\) to \(K_{c-1}^{A \sqcup B}\)</li><li>Move the rightmost key of \(K_{c}^{A \sqcup B}\) to \(K_{c+1}^{A \sqcup B}\)</li><li>Borrow the rightmost key of \(K_{c-1}^{A \sqcup B}\) to \(K_{c}^{A \sqcup B}\)</li><li>Borrow the leftmost key of \(K_{c+1}^{A \sqcup B}\) to \(K_{c}^{A \sqcup B}\)</li><li>Split \(K_{c}^{A \sqcup B}\) into two nodes (at the middle index or median key value).</li></ul><p>For each move, calculate the closeness metric. </p><p>$$M_{A \sqcup B} = \sum_{i=c-1}^{c+1} (\text{length}(K_i^{A \sqcup B}) - N)^2$$</p><p>where \(K_i^{A \sqcup B}\) denotes the original state of the key groups. Because the rest of the tree is unchanged, we don&apos;t need to take into account the contribution of the unchanged nodes to the closeness metric computation. </p><p>We also need to calculate the closeness metric of both A and B at the point of insertion before the merge.</p><p>$$M_{A} = \sum_{i=c-1}^{c+1} (\text{length}(K_i^{A}) - N)^2$$</p><p>$$M_{B} = \sum_{i=c-1}^{c+1} (\text{length}(K_i^{B}) - N)^2$$</p><p>Given the closeness metrics, \(M_{A}\)<em>, </em>\(M_{B}\)<em>, </em>and all moves of \(M_{A \sqcup B}\), we pick the move with the <strong>max metric that&apos;s less than the metric of both A and B</strong>. This should give us the least upper bound of the merge.</p><p>$$\max { m \in M_{A \sqcup B} \mid m &lt; M_{A} \text{ and } m &lt; M_{B} }$$</p><p>Remember, the smaller the metric, the greater the binary relation. So we want the maximum metric (least) with a metric less than (upper bound) both A and B.</p><p>In the case of a split, we have a special case. We need to rebalance the parent node and repeat this calculation, recursively up to the root node, or until we get to an ancestor branch node that doesn&apos;t need to rebalance. </p><p>In the case of a tie, we choose the move where every key group \(K_i^{A \sqcup B}\) is a supersequence of the key group \(K_j^{A \sqcup B}\) from the other move. </p><p>We want to pick the move that gives us an MSD that&apos;s greater than the MSD of the other move per the binary relation of the poset as described above. </p><h3 id="required-properties-of-a-merge-function">Required Properties of a Merge Function</h3><p>This merge function has to satisfy three algebraic properties: commutativity, associativity, and idempotence. Objects that satisfy all three properties are called semilattices.</p><ul><li><strong>Commutativity:</strong> \(A \sqcup B = B \sqcup A\)</li><li><strong>Associativity:</strong> \((A \sqcup B) \sqcup C = A \sqcup (B \sqcup C)\)</li><li><strong>Idempotence:</strong> \((A \sqcup B) \sqcup B = A \sqcup B\)</li></ul><p>For all of these, it relies on properties of sets and a property of pure functions.</p><ol><li>Merging sets have all of the algebraic properties above. </li><li>Pure functions return the same output given the same inputs.</li></ol><p>The keys have a total order and are unique. Each key group is an ordered set. Therefore, a merged key group \(K_{i}^{A \sqcup B}\) is the identical regardless of merge order \(K_{i}^{A \sqcup B} = K_{i}^{B \sqcup A}\). The rebalance function is pure. </p><p>Therefore, given the same set of input key groups, the resulting key groups are the same. Hence, it does not matter the order the merge function is applied for commutivity and associativity. Idempotence relies on the idempotence of the set merge function&#x2013;it doesn&apos;t matter how many times an element is inserted, the merged set will be the same.</p><h2 id="join-semi-lattice">Join semi-lattice</h2><p>With both a partially ordered set of multi-split descriptions and an algebraic merge function, we can combine the two into a join semi-lattice that can maintain the same splits regardless of the insertion order.</p><p>This should give us a unicit tree where the same data gives us the same tree, making it easy to compare hashes as a Merkle tree.</p><h2 id="addendum-monotonically-increasing-state">Addendum: monotonically increasing state</h2><p>So far, the formulation for this only addresses insertions. However, it should also work for deletions, but we&apos;d need a different representation of a key group to maintain a monotonically increasing state. It should be enough to add tombstones: two sets for every key group&#x2013;a set for addendums, and a set for deletions.</p><p>CRDTs often have to keep this monotonically increasing state around to retroactively sync any replicas in the future. We don&apos;t have that requirement. For single-threaded use, I think we can collapse the state after every reversal from insertion to deletion or vice versa. For multi-threaded use, we can keep the state around while there are concurrent edits, but after all changes are committed, we can probably compress that state.</p><hr><p>I designed this with the help of GPT. It probably warrants a blog post at some point. But in the meantime, you can see the transcripts here:</p><ul><li><a href="https://chat.openai.com/share/12f5c547-b110-48dd-9488-3039a7d1ccae">Designing Lattice Merkle Trees Part 1</a></li><li><a href="https://chat.openai.com/g/g-FeCSRVPMd-jeff-dean-aura/c/034770a7-2d04-4b1b-bd0c-3979510bcaa2">Designing Lattice Merkle Trees Part 2</a></li><li><a href="https://chat.openai.com/share/e8fce0dd-9c29-4f62-85ac-ffb760653472">GPT writing Isabelle code</a></li></ul><p>And the resulting incomplete Isabelle code.</p><ul><li>The incomplete initial strategy of <a href="https://gist.github.com/iamwilhelm/e984efe79b031b8eaac8673b0c1a0618">proving a full implementation</a>.</li><li>The incomplete simplified strategy of <a href="https://gist.github.com/iamwilhelm/2ed1ea08bd352d4ca273e4894e08593d">proving properties on node sizes</a>.</li></ul>]]></content:encoded></item><item><title><![CDATA[Year 2023 in Review]]></title><description><![CDATA[<p><a href="https://interjectedfuture.com/2021-year-in-review/">Compared to 2021</a>, I was much more focused this year. Compared to 2022, I have built more conviction. And starting in September, I started to have full days to work, rather than just the nights and weekends.</p><p>Unlike 2021, I wasn&apos;t trying a lot of different things. I</p>]]></description><link>https://interjectedfuture.com/year-2023-in-review/</link><guid isPermaLink="false">65875a80ffc0c60001f5fc38</guid><category><![CDATA[retro review]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Tue, 02 Jan 2024 18:07:00 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2024/01/DALL-E-2024-01-02-10.24.11---Illustration-for-a-self-reflective-blog-post-about-personal-achievements-and-future-goals--using-a-lines-and-flat-color-art-style.-The-image-depicts-a.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://interjectedfuture.com/content/images/2024/01/DALL-E-2024-01-02-10.24.11---Illustration-for-a-self-reflective-blog-post-about-personal-achievements-and-future-goals--using-a-lines-and-flat-color-art-style.-The-image-depicts-a.jpg" alt="Year 2023 in Review"><p><a href="https://interjectedfuture.com/2021-year-in-review/">Compared to 2021</a>, I was much more focused this year. Compared to 2022, I have built more conviction. And starting in September, I started to have full days to work, rather than just the nights and weekends.</p><p>Unlike 2021, I wasn&apos;t trying a lot of different things. I was mainly trying to focus and build conviction in the direction I needed to go forward.</p><p>At the beginning of this year, I wasn&apos;t sure if building a database in the browser was the right move. Do I want to do this because I am attracted to the problem, or am I attracted to the solution? I&apos;ve fallen down the rabbit hole of the latter before, and it results in a whole lot of wasted time, even if there were challenges and fun to be had along the way. </p><p>Hence, I <a href="https://docs.google.com/spreadsheets/d/1NYb0YSRR7-iG0SvipYvWRvq3JRZIsBLnJSb47olhEpQ/edit?usp=sharing">surveyed the existing databases</a> as a local-first solution and found most wanting. Some came close, like ElectricSQL, Evolu, and CR-SQLite, but I don&apos;t think they have the same long-term focus that I&apos;m envisioning. And yet, that wasn&apos;t enough to build conviction. So I <a href="https://interjectedfuture.com/immutable-database-for-interactive-apps/">sought to try and build a to-do list app</a>, both as a demonstration of needs, as well as a way to explore questions about why applications didn&apos;t use avatars.</p><figure class="kg-card kg-embed-card"><blockquote class="twitter-tweet"><p lang="en" dir="ltr">Starting small. Adding tasks to a todo list. <a href="https://t.co/oQgeTr4Ifc">pic.twitter.com/oQgeTr4Ifc</a></p>&#x2014; Wil Chung (@iamwil) <a href="https://twitter.com/iamwil/status/1651633439291109381?ref_src=twsrc%5Etfw">April 27, 2023</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</figure><p>It did take a while to work through a bunch of issues on the application side, and I did learn quite a bit about application-level issues one would run into when it came to migrations. </p><figure class="kg-card kg-embed-card"><blockquote class="twitter-tweet"><p lang="en" dir="ltr">Was curious what diamond-shaped state change with diamond-shaped schema migrations looked like--it&apos;s a hypercube. If a DAG, it&apos;s a semi-lattice.<br><br>Every level is N step away from root. Migrations point down or right. State changes point down or left. <a href="https://t.co/cshQoEExSo">pic.twitter.com/cshQoEExSo</a></p>&#x2014; Wil Chung (@iamwil) <a href="https://twitter.com/iamwil/status/1689699525886361601?ref_src=twsrc%5Etfw">August 10, 2023</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</figure><p>However, at some point, I just ran into a roadblock with off-the-shelf CRDT libraries. Almost all of them were focused on a document-orientated way of modelling data. I thought there was much value to a relational model that could sync. </p><p>So at the end of August, I ended back where I was in January: it&apos;s the right thing to build a database in the browser. After making the call, I made relatively quick progress. I have a model of how relational data can be synced, how to leverage existing databases, and the tactical details of how the whole thing would fit together.</p><p>At the end of this year, I find myself deep in the rabbit hole, eager to pop my way all the way up the stack to get something usable. </p><p>When it comes to syncing, I think I have all the pieces I need. However, when it comes to incrementality, that&apos;s a whole topic I have yet to dive into. But I think just like the CRDTs, I think just committing fully to diving in will get me up to speed pretty quickly. I&apos;ve shown myself this fall that I could catch up pretty quickly, not only because I&apos;ve been looking at this from the sidelines for a long time, but also because I can now <a href="https://interjectedfuture.com/ai-assist/accelerated-learning-with-gpt-academic-paper-reader/">enlist the help of GPTs</a> to help me get up to speed quickly.</p><p>Aside from this main line of work, <a href="https://twitter.com/sridatta">Sri</a> and I recorded <a href="https://www.youtube.com/@techniumpod">The Technium Podcast</a> together. Our original impetus for starting it was just that we talked about this sort of stuff anyway over the pandemic, why not just record it and put it out there? </p><figure class="kg-card kg-image-card"><a href="https://www.youtube.com/channel/UCl_rEKDGBw4myn0uOnPxYsg"><img src="https://interjectedfuture.com/content/images/2023/12/image-2.png" class="kg-image" alt="Year 2023 in Review" loading="lazy" width="2000" height="1335" srcset="https://interjectedfuture.com/content/images/size/w600/2023/12/image-2.png 600w, https://interjectedfuture.com/content/images/size/w1000/2023/12/image-2.png 1000w, https://interjectedfuture.com/content/images/size/w1600/2023/12/image-2.png 1600w, https://interjectedfuture.com/content/images/2023/12/image-2.png 2274w" sizes="(min-width: 720px) 720px"></a></figure><p>And while fun, it took a while to edit each episode. We tried to structure everything so we could do as little editing as possible, and yet it took us away from our other main thrust of work. With 300 subscribers, it was hard to justify the timesink. So after the third season, we decided to take a hiatus.</p><p>However, it seems like some people do enjoy our work after feeling our absence. With some time away, we&apos;ve come to think of 300 as an auditorium of people, which is quite a lot! In addition, building an audience is an activity with superlinear returns if one can get the flywheel going. So it&apos;s probably worth putting in the effort despite the low absolute numbers. To get us through this period, we&apos;ve changed our perspective to this being a beacon to interesting people. So we&apos;ll be talking about how we&apos;ll proceed in the coming year on a new season.</p><figure class="kg-card kg-image-card"><img src="https://interjectedfuture.com/content/images/2023/12/image-3.png" class="kg-image" alt="Year 2023 in Review" loading="lazy" width="1626" height="1054" srcset="https://interjectedfuture.com/content/images/size/w600/2023/12/image-3.png 600w, https://interjectedfuture.com/content/images/size/w1000/2023/12/image-3.png 1000w, https://interjectedfuture.com/content/images/size/w1600/2023/12/image-3.png 1600w, https://interjectedfuture.com/content/images/2023/12/image-3.png 1626w" sizes="(min-width: 720px) 720px"></figure><p>With the additional time, I started writing more again, committing to weekly updates. These are less polished, more to jot down progress and accountability, starting with <a href="https://interjectedfuture.com/lab-notes/lab-notes-021-crdt-in-depth/">Lab Note #021</a>. </p><p>A friend of mine told me that I&apos;m very good at consistently living in the future, but I&apos;m not very good at selling the future. So to help fix that, I&apos;ll also write more complete and long-form thoughts in longer posts. They should take no longer than three weeks to write, just to keep some consistency going. This year, I managed to put out three. That should be an easy bar to clear. Check them out if you haven&apos;t already.</p><ul><li><a href="https://interjectedfuture.com/state-management-library-front-ends-are-looking-for-is-a-database/">The state management library front-ends are looking for is a database</a></li><li><a href="https://interjectedfuture.com/trade-offs-between-different-crdts/">Trade-offs between Different CRDTs</a></li><li><a href="https://interjectedfuture.com/crdts-turned-inside-out/">CRDTs Turned Inside Out</a></li></ul><hr><p>I think a lot lately about building conviction, and why it was so tough for me to find it this year. It&apos;s the classic mid-wit move of talking yourself out of everything when you know slightly too much. This year, I learned that you just don&apos;t know how things will turn out ahead of time, no matter how much you think about it or ask other people. Instead, you must light your conviction based on your initial hunch, and run as fast as you can to the end before the candle of conviction burns out. </p><p>What I hope for 2024 is that I will continue to accelerate the pace of learning, and keep building conviction for what I&apos;m doing. But importantly, this year, I&apos;m committing to systematically find a way to learn how to work faster. Hope you had a great year too, and commit to finding direction in your own work.</p>]]></content:encoded></item><item><title><![CDATA[CRDTs Turned Inside Out]]></title><description><![CDATA[<div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">&#x1F530;</div><div class="kg-callout-text">If CRDTs are new to you, here are some introductory links.<br>&#x2003;- <a href="https://jakelazaroff.com/words/an-interactive-intro-to-crdts/">An interactive intro to CRDTs</a><br>&#x2003;- <a href="https://www.bartoszsypytkowski.com/the-state-of-a-state-based-crdts/">An introduction to state-based CRDTs</a><br>&#x2003;- <a href="https://www.youtube.com/watch?v=vBU70EjwGfw">CRDTs for non-academics</a><br>&#x2003;- <a href="https://www.youtube.com/watch?v=x7drE24geUw">CRDT: The Hard Parts</a><br>&#x2003;- <a href="https://christophermeiklejohn.com/crdt/2014/07/22/readings-in-crdts.html">Readings in CRDTs</a><br>&#x2003;- <a href="https://crdt.tech">crdt.tech</a></div></div><p>Last time, I discussed</p>]]></description><link>https://interjectedfuture.com/crdts-turned-inside-out/</link><guid isPermaLink="false">65684f2c09770b0001cd7b13</guid><category><![CDATA[crdt]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Fri, 15 Dec 2023 17:00:00 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2023/11/DALL-E-2023-11-30-01.15.28---Illustration-for-an-article-titled--Merkle-Data-Structures-as-CRDTs--using-pastel-colors.-The-image-is-centered-on-a-stylized--colorful-representation.png" medium="image"/><content:encoded><![CDATA[<div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">&#x1F530;</div><div class="kg-callout-text">If CRDTs are new to you, here are some introductory links.<br>&#x2003;- <a href="https://jakelazaroff.com/words/an-interactive-intro-to-crdts/">An interactive intro to CRDTs</a><br>&#x2003;- <a href="https://www.bartoszsypytkowski.com/the-state-of-a-state-based-crdts/">An introduction to state-based CRDTs</a><br>&#x2003;- <a href="https://www.youtube.com/watch?v=vBU70EjwGfw">CRDTs for non-academics</a><br>&#x2003;- <a href="https://www.youtube.com/watch?v=x7drE24geUw">CRDT: The Hard Parts</a><br>&#x2003;- <a href="https://christophermeiklejohn.com/crdt/2014/07/22/readings-in-crdts.html">Readings in CRDTs</a><br>&#x2003;- <a href="https://crdt.tech">crdt.tech</a></div></div><img src="https://interjectedfuture.com/content/images/2023/11/DALL-E-2023-11-30-01.15.28---Illustration-for-an-article-titled--Merkle-Data-Structures-as-CRDTs--using-pastel-colors.-The-image-is-centered-on-a-stylized--colorful-representation.png" alt="CRDTs Turned Inside Out"><p>Last time, I discussed the <a href="https://interjectedfuture.com/trade-offs-between-different-crdts">trade-offs between more traditional CRDTs</a>, such as the State-based CRDT, Op-based CRDT, and Delta-based CRDT. </p><p>There is another class of CRDTs: Merkle CRDTs.</p><p>Most ink spilled on CRDTs focuses on consistently merging data from different replicas. That&apos;s just half the story. Without a way of storing when concurrent edits happen, merges can&apos;t make consistent decisions about conflict resolution. </p><p>Normally, this is the job of <a href="https://haslab.wordpress.com/2011/07/08/version-vectors-are-not-vector-clocks/">version vectors</a> in State-based CRDTs and the log of causal operations in Op-based CRDTs. But if you&apos;re familiar with blockchains or Git, we notice that <a href="https://en.wikipedia.org/wiki/Merkle_tree">Merkle Trees and DAGs</a> (Directed Acyclical Graph) can also do the same job. </p><p>The following is how we construct this bookkeeping, exposed as an inherent structure of the concurrent state, rather than hidden away within a container. Let&apos;s look at the two variations:</p><ul><li><a href="https://arxiv.org/pdf/2004.00107.pdf">Merkle-DAG CRDTs</a></li><li><a href="https://inria.hal.science/hal-02303490v1/document">Merkle Search Tree CRDTs</a></li></ul><h2 id="merkle-dag-crdt">Merkle-DAG CRDT</h2><p>The Merkle-DAG (Directed Acyclical Graph) models CRDTs as a growing DAG of Merkle nodes about concurrent updates as events occur. This is a Merklized version of <a href="https://www.bartoszsypytkowski.com/operation-based-crdts-protocol/">Op-based CRDTs</a>, where the internal bookkeeping grows with the number of state updates, rather than with the number of replicas.</p><p>Each node in the Merkle-DAG can carry a payload of operations, state snapshots, or even deltas. This node payload must be a CRDT for the Merkle-DAG to be a CRDT. However, these payload CRDTs can be much simpler in their implementation because the bookkeeping on concurrent edits is delegated up to the Merkle-DAG.</p><p>It&apos;s as if a state-based CRDT was turned inside-out. The common bookkeeping elements are now the container (the DAG), and the payload-specific bookkeeping is inside of the payload CRDT.</p><p>Beyond simpler bookkeeping for payload CRDTs, Merkle-DAG CRDTs have an advantage that non-Merklized CRDTs don&apos;t. They can determine equality by comparing root hashes, and they can determine diffs by walking down the DAG.</p><p>This reduces the bandwidth for syncing between replicas. </p><ol><li>A remote replica broadcasts its root hash to our local replica. </li><li>The local replica compares its root hash with the remote hash to determine if something has changed. </li><li>If the root hashes are different, it then asks the remote replica to send every node hash as it walks down the DAG from the root. </li><li>When the remote replica sends a hash that the local replica already has, the path from the root to this node is the list of all nodes missing in the local replica.</li></ol><p>This is quite a bit of chatter over the network, but the hashes don&apos;t have to be delivered in causal order any more. Merkle DAGs are by definition a causal chain of hashes. Hence, the nodes can be reordered at the receiver, regardless of how they were sent, as long as all of the nodes were received.</p><p>An upside of keeping the Merkle-DAG is that the data kept by a Merkle-DAG is immutable. It can be reconstructed and queried at any point in history. Immutability means diffs are easy, undos are easy, and provenance can be verified. But again, the downside here is a history of changes that grows with the number of events. However, Merkle Trees uses <a href="https://medium.com/@dumindux/how-immutable-data-structures-e-g-immutable-js-are-optimized-using-structural-sharing-e4424a866d56">structural sharing</a> as a form of compression to lower the space cost of storing such a data structure. </p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">&#x1F4A1;</div><div class="kg-callout-text">- Merging of two Merkle DAGs is done by comparing hashes and walking down a DAG.<br>- Node payload is a CRDT, and its merge function still needs to be commutative, associative, and idempotent.</div></div><div class="kg-card kg-callout-card kg-callout-card-green"><div class="kg-callout-emoji">&#x1F44D;</div><div class="kg-callout-text">- Smaller amount of data sent over the network to sync by comparing hashes.<br>- Sync protocol is simpler due to not needing causal order delivery, but still requires some chatter over the wire to find diff to send.<br>- Data is immutable so diffs, undos, and provenance is easy.</div></div><div class="kg-card kg-callout-card kg-callout-card-red"><div class="kg-callout-emoji">&#x1F44E;</div><div class="kg-callout-text">- Internal data structure grows with the number of updates.</div></div><h2 id="merkle-search-tree-crdt">Merkle Search Tree CRDT</h2><p>The second kind of Merkle CRDT is covered in the <a href="https://inria.hal.science/hal-02303490v1/document">Merkle Search Tree paper</a>.</p><figure class="kg-card kg-image-card"><img src="https://interjectedfuture.com/content/images/2023/12/image-1.png" class="kg-image" alt="CRDTs Turned Inside Out" loading="lazy" width="858" height="724" srcset="https://interjectedfuture.com/content/images/size/w600/2023/12/image-1.png 600w, https://interjectedfuture.com/content/images/2023/12/image-1.png 858w" sizes="(min-width: 720px) 720px"></figure><p>A Merkle Search Tree (MST) is a tree that maintains an ordered set of keys. It does this by maintaining a few invariants. </p><ol><li>All the keys in a node are ordered.</li><li>All child nodes between two keys in a node contain entries lexicographically between the two keys.</li><li>The leaf layer is layer zero. The layer above is layer one, with increasing layer numbers until the root. The layer a key belongs to is dictated by the number of leading zeros in the hash of the key.</li><li>A node has a content ID that&apos;s a hash of all its keys concatenated with the content IDs of all its children. A child node is referenced by its content id.</li></ol><p>Keys live in all nodes of an MST tree, unlike a B-tree where keys are only at the leaves. This construction makes MSTs advantageous in two ways: it is self-balancing and it is unicit.</p><!--kg-card-begin: html--><div class="kg-card kg-button-card" style="border: 1px solid #ddd; padding: 1em; border-radius: 0.5em;">
   Enjoying what you&apos;re reading?
   <a class="kg-btn kg-btn-accent" style="margin-left: 1em" href="https://interjectedfuture.com/crdts-turned-inside-out/#/portal">Subscribe for more</a>
</div><!--kg-card-end: html--><p>A MST is self-balancing through both invariants #2 and #3. When a key is inserted in the tree, the layer is determined by the key&apos;s hash and the node in the layer is determined by the key&apos;s value. When the key is inserted in a node between two keys, it necessarily splits the child node between the two keys into two nodes due to invariant #2. Given a hash is in base <em>B</em> and invariant #3, the average number of keys in layer <em>L</em> is <em>B</em> times the average number of keys in layer <em>L + 1</em>. Hence, the average node size is <em>B</em>. Due to the placement of a key being completely deterministic, the tree self-balances.</p><p>Unicity means, for a given set of keys, the MST is unique. It implies the MST&apos;s construction does not depend on the order of the keys that were inserted. This is extremely desireable, because we can tell if two MSTs are different simply by comparing their root hashes regardless of the insertion order. If the root hash is the same, then the two trees are the same. </p><p>To construct a CRDT out of an MST, we simply add a payload to each key that&apos;s also a CRDT. Since the MST is an ordered set, merging sets have well-defined semantics, and merging the payloads is delegated to the payload itself to resolve. </p><p>And just like the Merkle-DAG construction, the payload CRDT doesn&apos;t need to be as complicated. For example, to construct a Grow-Only CRDT, we select the set { &#x23CA;, &#x23C9; } (called &quot;bottom&quot; and &quot;top&quot; respectively) as the payload CRDT. All the values of the payload start as &quot;bottom&quot; by default. Merging means taking the max of &quot;top&quot; vs &quot;bottom&quot; (top always wins in a max operation). </p><p>Unfortunately, the paper leaves out the construction of other CRDTs using MSTs, as it was left as an exercise for the reader. However, there are two other possible constructions I think will work.</p><p><a href="https://dl.acm.org/doi/pdf/10.1145/3380787.3393678">Causal Length CRDTs</a> for add-remove sets could be implemented in MSTs with &#x2124; (the set of integers) as the payload. If the number is even the key doesn&apos;t exist, and if it&apos;s odd, then it does. Merging would just be taking the max of the two replica&apos;s payloads.</p><p><a href="https://www.bartoszsypytkowski.com/operation-based-crdts-arrays-1/">Replicated Growable Arrays</a> (RGA) could be implemented by using virtual pointers concatted with the replica ID as the key. Since the MST keeps the keys ordered, we can read back the string by traversing in key order. This would need a mechanism outside of the MST to insert and assign tombstones for deletions. [^3]</p><p>Beyond that, there is an undetailed edge case. What if the key you&apos;re adding has a lot more (or a lot less) leading zeros in its hash than the number of leading zeros in the current root layer? Do you just add nodes in between the root and the new node? What should go in those in-between nodes? If there are no keys in these in-between nodes and just child content IDs, there would be no way to discern which child to descend into on a query. My guess is to simply insert a ghost key that doesn&apos;t exist in the set but only serves as a range bookend for queries. There are implementations to check out for details. [^4]</p><p>As for downsides, it&apos;s also vulnerable to attackers constructing a key with a large number of zeros and trying to insert it. That would result in a tall tree, reducing queries from O(log n) to O(n). And while the node size is on average <em>B</em>, the variance is spread pretty wide, so it can be common to have nodes that are too small or too large.</p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">&#x1F4A1;</div><div class="kg-callout-text">- Merging of two Merkle Search Trees is done similarly to a Merkle Tree: by comparing hashes and walking down the tree.<br>- Node payload is a CRDT, and its merge function still needs to be commutative, associative, and idempotent.</div></div><div class="kg-card kg-callout-card kg-callout-card-green"><div class="kg-callout-emoji">&#x1F44D;</div><div class="kg-callout-text">- Smaller amount of data sent over the network to sync by comparing hashes<br>- Sync protocol is simpler due to not needing causal order delivery, but still requires some chatter over the wire to find diff to send<br>- Self-balancing tree<br>- Trees are unicit<br>- Internal bookkeeping doesn&apos;t grow with either replicas or the number of updates</div></div><div class="kg-card kg-callout-card kg-callout-card-red"><div class="kg-callout-emoji">&#x1F44E;</div><div class="kg-callout-text">- Possible for attackers to construct tall trees to slow query times<br>- Variance of node sizes is large</div></div><h2 id="the-surprising-connection">The surprising connection</h2><p>After this and the <a href="https://interjectedfuture.com/trade-offs-between-different-crdts/">previous survey</a> on CRDTs, I came away with some clarity on them. </p><p>CRDTs need two main ingredients to work, a merge/application operator that&apos;s commutative and associative, and some bookkeeping to keep track of concurrent or causal edits. This bookkeeping is like an uncollapsed superposition of state while the system isn&apos;t sure if there are any more concurrent edits. If we can be sure that all replicas are synced, we can either collapse the state or simply throw it away if we don&apos;t need to travel back in time.</p><p>For typical CRDTs, this bookkeeping is internal, and you&apos;re given an interface to set and query, so one can pretend this is a regular state that can be updated in place. Merkle CRDTs turn this inside out, exposing the inherent structure of concurrent edits and uncollapsed state for users to see and exploit.</p><p>This mirrors closely with how Git is structured. That surprised me at first. But as I dove into Git&apos;s history, I found this tidbit in passing.</p><blockquote>The important part of a merge is not how it handles conflicts (which need to be verified by a human anyway if they are at all interesting), but that it should meld the history together right so that you have a new solid base for future merges.<br><br>In other words, the important part is the _trivial_ part: the naming of the parents, and keeping track of their relationship. Not the clashes.<br><a href="https://wincent.com/blog/a-look-back-bram-cohen-vs-linus-torvalds">Linus Travolus vs Bram Cohen</a></blockquote><p>The values are the contents of the file, and the bookkeeping on concurrent edits happens outside of the value. What stops Git from being a CRDT is that its default 3-way merge isn&apos;t commutative and associative.</p><p>As history proved, it&apos;s the bookkeeping data structure that&apos;s important for merges, not the conflict resolution itself. But in the case of CRDTs, it&apos;s important to constrain the merge with algebraic properties [^2].</p><h2 id="the-search-continues">The search continues</h2><p>Merkle Search Trees have lots of nice properties, such as unicity, self-balancing, and a CRDT that doesn&apos;t grow with the number of replicas or events. But I think we can do better. For one, the variance of the node sizes is large in MSTs, which makes for uneven trees. </p><p>In my search, I found <a href="https://github.com/attic-labs/noms/blob/master/doc/intro.md#prolly-trees-probabilistic-b-trees">Prolly Trees</a>, and will take a look at them. If anyone has other suggestions for data structures to look at to implement composable CRDTs, <a href="https://twitter.com/iamwil">let me know</a>. In the meantime, <a href="https://interjectedfuture.com/crdts-turned-inside-out/#/portal">subscribe</a> or <a href="https://twitter.com/iamwil">follow me on twitter</a>.</p><hr><p>[^1] If you aren&apos;t familiar with the internals of Git, I&apos;d recommend looking into it. &#xA0;Git is the only piece of software where learning its internal data structure made it easier to use. <br>- <a href="https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain">Git Internals: Plumbing and Porcelain</a><br>- <a href="https://marklodato.github.io/visual-git-guide/index-en.html">A Visual Guide to Git</a></p><p>[^2] The algebraic properties we need to maintain for a merge are commutativity, associativity, and idempotency. This forces every replica to arrive at the same merged state without coordinating with each other.</p><p>[^3] Of course, actual RGA implementations are far more optimized than this outline.</p><p>[^4] There are two existing implementations of MSTs. </p><ul><li><a href="https://github.com/domodwyer/merkle-search-tree">https://github.com/domodwyer/merkle-search-tree</a></li><li><a href="https://github.com/jrhy/mast">https://github.com/jrhy/mast</a></li></ul>]]></content:encoded></item><item><title><![CDATA[Trade-offs between Different CRDTs]]></title><description><![CDATA[<p>What are the trade-offs between different kinds of CRDTs (Conflict-free Replicated Data Types)? Most introductory talks cover just state-based and operations-based CRDTs because that&apos;s what the original paper formulated. But since then, there have been other variations, and I haven&apos;t seen much written about them in</p>]]></description><link>https://interjectedfuture.com/trade-offs-between-different-crdts/</link><guid isPermaLink="false">654332dfae74a20001e74e69</guid><category><![CDATA[crdt]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Thu, 07 Dec 2023 17:00:00 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2023/12/DALL-E-2023-11-30-00.56.45---A-hero-image-for-the-article-titled--Trade-offs-between-Different-CRDTs--in-a-mix-of-illustrative-line-art-and-pixel-art-styles-using-pastel-colors.-T.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://interjectedfuture.com/content/images/2023/12/DALL-E-2023-11-30-00.56.45---A-hero-image-for-the-article-titled--Trade-offs-between-Different-CRDTs--in-a-mix-of-illustrative-line-art-and-pixel-art-styles-using-pastel-colors.-T.jpg" alt="Trade-offs between Different CRDTs"><p>What are the trade-offs between different kinds of CRDTs (Conflict-free Replicated Data Types)? Most introductory talks cover just state-based and operations-based CRDTs because that&apos;s what the original paper formulated. But since then, there have been other variations, and I haven&apos;t seen much written about them in blog posts, so I&apos;ll cover their trade-offs here. </p><ul><li><a href="https://www.bartoszsypytkowski.com/the-state-of-a-state-based-crdts/">State-based CRDTs</a> (convergent CRDTs)</li><li><a href="https://www.bartoszsypytkowski.com/operation-based-crdts-protocol/">Operation-based CRDTs</a> (commutative CRDTs)</li><li><a href="https://arxiv.org/pdf/1410.2803.pdf">&#x3B4;-state-based CRDTs</a></li><li><a href="https://web.archive.org/web/20230607175939/https://novasys.di.fct.unl.pt/~alinde/publications/a12-van_der_linde.pdf">&#x394;-state-based CRDTs</a></li></ul><p>The basics of <a href="https://crdt.tech/">CRDTs</a> are covered in a lot of other different places [^1], so I won&apos;t try to speed run it here. Let&apos;s just jump into the differences.</p><h2 id="state-based-crdts-convergent-crdts">State-based CRDTs (Convergent CRDTs)</h2><p>In State-based CRDTs, a replica&apos;s state can be updated by a merge operation with another replica&apos;s state. But if the <a href="https://architecturenotes.co/fallacies-of-distributed-systems/">network is unreliable and can deliver events out of order</a>, how would all the replicas achieve convergence and agreement of all their states? The trick is simply to restrict ourselves to data structures and operations that are immune to net-splits and out-of-order delivery. </p><p>First, the internal data structure of the CRDT must be monotonically increasing. Second, the merge operation has to be commutative, associative, and idempotent with respect to the internal data structures.</p><p>These constraints make the CRDT immune to the unreliable network. As long as all replicas eventually see every state update event, they&apos;re guaranteed to converge to the same state. Due to the use of vector clocks, replicas coming in and out of the network are easily handled. The internal bookkeeping grows linearly with the number of replicas.</p><p>The rest of the design problem is how to build useful data structures out of only monotonically increasing elements. Luckily, there is already a <a href="https://github.com/pfrazee/crdt_notes/tree/68c5fe81ade109446a9f4c24e03290ec5493031f#portfolio-of-basic-crdts">menagerie of CRDTs</a> that can model numbers, arrays, maps, strings, and other common data structures.</p><p>State-based CRDTs are theoretically sound, but in practice, it has a glaring downside: they require sending the entire state of a replica over the network to other replicas. This can be prohibitive for all but the smallest states like counters and registers. That&apos;s because a CRDT needs to maintain internal bookkeeping of &quot;which replica said what&quot; at some logical time in order to resolve conflicts for any new state changes. The end-user queries this internal bookkeeping for a current value, like a functional view calculated from the actual state, the internal bookkeeping. For anything beyond a counter or register, this internal bookkeeping can be too big for sets and maps. And as the number of replicas grows, it can grow too big too quickly.</p><p>Therefore, most current CRDT implementations are not state-based but op-based. </p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">&#x1F4A1;</div><div class="kg-callout-text">- Uses merge function to converge<br>- Merge function needs to be commutative, associative, and idempotent<br>- Internal data structure needs to be monotonically increasing</div></div><div class="kg-card kg-callout-card kg-callout-card-green"><div class="kg-callout-emoji">&#x1F44D;</div><div class="kg-callout-text">- Easy syncing protocol that just broadcasts new state to other replicas<br>- Does not need to keep a history to sync (as we&apos;ll see later)</div></div><div class="kg-card kg-callout-card kg-callout-card-red"><div class="kg-callout-emoji">&#x1F44E;</div><div class="kg-callout-text">- Sending the entire state over the wire is impractical for all but the simplest CRDTs<br>- Internal data structure grows linearly with the number of replicas<br>- Harder to accommodate replicas that come in and out of the network</div></div><h2 id="operation-based-crdts-commutative-crdts">Operation-based CRDTs (Commutative CRDTs)</h2><p>Current practical implementations of CRDTs opt for a way to ship smaller pieces of data over the wire while retaining the same properties as a state-based CRDT. We will give up a merge function and instead use a defined set of operations that can be used on the state. It&apos;s akin to defining <a href="https://guide.elm-lang.org/effects/">a command in an effect system</a>. Like before, the operations will need to be constrained to give us the property of uncoordinated syncing between replicas. The operations will need to be both commutative and associative, but unlike the merge function, they do not have to be idempotent.</p><p>Without idempotency, the syncing protocol now cannot just be an out-of-order delivery of updates like in state-based CRDTs. Instead, we require the operations to be delivered in causal order. When a replica syncs and catches up to the latest state, the operations are applied in causal order. Any concurrent operations will result in the same state, due to the commutative and associative properties of the operations. We&apos;re shifting part of the complexity of the merge function into the sync protocol.</p><!--kg-card-begin: html--><div class="kg-card kg-button-card" style="border: 1px solid #ddd; padding: 1em; border-radius: 0.5em;">
   Enjoying what you&apos;re reading?
   <a class="kg-btn kg-btn-accent" style="margin-left: 1em" href="https://interjectedfuture.com/trade-offs-between-different-crdts/#/portal">Subscribe for more</a>
</div><!--kg-card-end: html--><p>Our internal bookkeeping to converge to the same state with concurrent operations doesn&apos;t use vector clocks. Instead, we can keep a log of the causal history of all the operations that occurred. Our internal bookkeeping no longer grows with the number of replicas, but instead with the number of events. </p><p>Op-based CRDTs can scale easily with the number of replicas and with an unknown number of replicas that come in and out of the network easily. However, like blockchains, it has an ever-growing internal bookkeeping if it wants to allow any replica to sync regardless of how long it has been offline.</p><p>There are two ways to address this ever-growing bookkeeping. One or the other may be feasible depending on the requirements of your application. </p><p>The first is to adopt what accountants do: close books at the end of every month and quarter. All replicas keep a rolling window of the latest N days of data and throw away data that&apos;s older than N days. And if a replica has been offline longer than a certain number of days, then they cannot be expected to sync and will need to reload from the latest. </p><p>The second is to keep all the history, but find ways to compress it. This is what the CRDT library <a href="https://youtu.be/x7drE24geUw?si=syBk3NTxeDQekk30&amp;t=3201">Automerge does</a>. This can sound like a bad idea, but we already have software that keeps its entire history around that developers use every day: <a href="https://git-scm.com/book/en/v2/Git-Internals-Git-Objects">Git</a>. Of course, not all application requirements allow us to do this, but in my opinion, we don&apos;t do enough of it. Disks are comparatively cheap now, and the <a href="https://interjectedfuture.com/persistent-data-structure-redux/">benefits of immutability</a> far outweigh the downsides.</p><p>To query for the value of an op-based CRDT, we can apply all operations from the history log to an empty state to get the current state. Alternatively, we can keep a cache of the latest state in memory and update it on every operation.</p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">&#x1F4A1;</div><div class="kg-callout-text">- Applies operations to converge state<br>- Operations need to be commutative and associative<br>- Internal data structure needs to be monotonically increasing</div></div><div class="kg-card kg-callout-card kg-callout-card-green"><div class="kg-callout-emoji">&#x1F44D;</div><div class="kg-callout-text">- Small amount of data sent over the network to sync<br>- Easier to accommodate replicas that come in and out of the network<br>- Data is immutable, hence diffs, undos, and comparisons are easy.</div></div><div class="kg-card kg-callout-card kg-callout-card-red"><div class="kg-callout-emoji">&#x1F44E;</div><div class="kg-callout-text">- Syncing protocol needs to implement causal delivery of operations <br>- Internal data structure is a history of operations that grows linearly with the number of operations</div></div><h2 id="delta-state-crdts">Delta-state CRDTs</h2><p>Delta-state CRDTs try to solve the data-over-the-wire problem differently. Instead of breaking the merge function into operations to send over the wire, we try to calculate the minimum amount of state to send to other replicas to sync.</p><p>There are two versions of delta-state CRDTs. In the <a href="https://arxiv.org/pdf/1410.2803.pdf">first &#x3B4;-state CRDT paper</a>, we do something similar to op-based CRDTs: we create a set of operations, called delta-mutators) that are used to update the state. But instead of sending these operations over the wire, delta-mutators generate a diff between the states before the delta-mutator was applied and after it was applied. The diff is then stored in a buffer to be sent out to all other replicas. </p><p>But this buffer isn&apos;t a queue. The buffer holds the diff between the current state and the state when the buffer was last sent to other replicas. This is possible because the delta-mutator diffs are composable. This means if we have two diffs, d<sub>12</sub> (a diff between State X<sub>1</sub> and State X<sub>2</sub>) and d<sub>23</sub> (a diff between State X<sub>2</sub> and State X<sub>3</sub>), the composition of the two diffs would be D<sub>13</sub> (computed with d<sub>12</sub> &#x2A06; d<sub>23</sub>). Hence, the application of d<sub>12</sub> and then d<sub>23</sub> is the same as the application of the composition D<sub>13</sub>. The paper calls these compositions, delta-groups. </p><p>And because the diffs exist in the same join-semilattice as the states, you can use the same merge function to merge the diffs received from other replicas into your own local state.</p><p>The <a href="https://web.archive.org/web/20230607175939/https://novasys.di.fct.unl.pt/~alinde/publications/a12-van_der_linde.pdf">second &#x394;-state paper</a> takes a slightly different approach. While the &#x1D6FF;-state CRDT has a replica send the same diff to all other replicas, in the &#x394;-state CRDT, we notice that the information about which replica has which part of the state is already encoded in the vector clock internal bookkeeping of the CRDTs. Hence, we can tailor the diff that we send to each replica. Therefore, we can toss the buffer and calculate the exact diff of the state that another replica needs. But by trading off the buffer, you now need to keep track of tombstones.</p><p>Delta-CRDTs retain the idempotency property of state-based CRDTs but do not require lots of bandwidth to sync replicas. </p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">&#x1F4A1;</div><div class="kg-callout-text">- Merge either full state or deltas to converge<br>- Merge function needs to be commutative, associative, and idempotent<br>- Internal data structure needs to be monotonically increasing</div></div><div class="kg-card kg-callout-card kg-callout-card-green"><div class="kg-callout-emoji">&#x1F44D;</div><div class="kg-callout-text">- Small amount of data sent over the network to sync<br>- Syncing protocol is simple and just needs to broadcast to all other replicas</div></div><div class="kg-card kg-callout-card kg-callout-card-red"><div class="kg-callout-emoji">&#x1F44E;</div><div class="kg-callout-text">- Internal data structure grows with the number of replicas<br>- Internal data structure can get complex, like using <a href="https://www.bartoszsypytkowski.com/optimizing-state-based-crdts-part-2/">Dots</a>.<br>- Merge function implementation can get complex.</div></div><h2 id="the-search-continues">The Search Continues</h2><p>While all of these CRDTs have strengths and weaknesses, I&apos;m looking for something where I don&apos;t have to compromise. Next time, I&apos;ll cover my search into Merkle CRDTs. In the meantime, <a href="https://twitter.com/iamwil">follow me on twitter</a>.</p><hr><p>[^1]: Here&apos;s some introductory material on CRDTs I found helpful<br>	- <a href="https://jakelazaroff.com/words/an-interactive-intro-to-crdts/">An interactive intro to CRDTs</a><br>	- <a href="https://www.bartoszsypytkowski.com/the-state-of-a-state-based-crdts/">An introduction to state-based CRDTs</a><br>	- <a href="https://www.youtube.com/watch?v=vBU70EjwGfw">CRDTs for non-academics</a><br>	- <a href="https://www.youtube.com/watch?v=x7drE24geUw">CRDT: The Hard Parts</a><br>	- <a href="https://christophermeiklejohn.com/crdt/2014/07/22/readings-in-crdts.html">Readings in CRDTs</a><br>	- <a href="https://crdt.tech">crdt.tech</a></p>]]></content:encoded></item><item><title><![CDATA[Information-rich Cache]]></title><description><![CDATA[<p>I was surprised to learn that different stacks of software all do some type of computational analysis to better do its job. The CPU does it to figure out how to pipeline instructions. Compilers do it to feed CPUs better set of instructions. Front-end frameworks do it to figure out</p>]]></description><link>https://interjectedfuture.com/information-rich-cache/</link><guid isPermaLink="false">63fe86a3dbfa25000190afd2</guid><category><![CDATA[programming]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Wed, 05 Apr 2023 16:57:00 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2023/12/iamwil_In_a_lush_futuristic_forest_clearing_a_colossal_tree_wit_9a2bbb6c-20a9-43af-abeb-9aec15b926a8-1.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://interjectedfuture.com/content/images/2023/12/iamwil_In_a_lush_futuristic_forest_clearing_a_colossal_tree_wit_9a2bbb6c-20a9-43af-abeb-9aec15b926a8-1.jpg" alt="Information-rich Cache"><p>I was surprised to learn that different stacks of software all do some type of computational analysis to better do its job. The CPU does it to figure out how to pipeline instructions. Compilers do it to feed CPUs better set of instructions. Front-end frameworks do it to figure out how to incrementally update the DOM. </p><p>We separate our tech stack into layers to make it more understandable, but it does seem like a pity that the same work is done multiple times. It might be one of those problems that seem the same on the surface, but it really isn&apos;t, so if you tried to unify them somehow, you&apos;d get a big specification mess. Still, it does seem regrettable. </p><p>This brings me to caching. <strong>Caching is hard because the cache subsystem often has the least information about what has changed in the system.</strong> If the computational dependency graph was available to a cache, it&apos;d know exactly what outputs to invalidate based on the inputs that were changed. Taken to the extreme, if you have this computational dependency graph, you can carry it all the way to the GPU. This only solves instances where we need to update caches initiated by a change in the data store. This is the problem of cache invalidation. </p><p>In the instances where we want to cache based on the user&apos;s actions, it&apos;s harder to know what to cache. This is not a problem of cache invalidation, but what to cache in the first place. A cache miss requires us to fetch the data as it&apos;s requested, which means we need to wait for new data when we&apos;ve discovered that we&apos;re about to step off the edge of the world. </p><p>There are predictive techniques we can use to try to figure out what the user will do next, so we can do pre-fetching. I think this can be simplified if we know ahead of time the possible actions, and hence the new sets of data the user could possibly request. Just like there are only two ordinal directions on a map, <strong>if we had such a map&#x2013;a state dependency graph, then we can pre-fetch the &quot;next possible steps&quot; ordered by likelihood</strong> so that the user would never fall off the edge of the world if they didn&apos;t walk too fast. </p><p>We currently don&apos;t encode this state dependency graph in any way, other than in HATEOS, which isn&apos;t in wide accepted use. But I think if we can have the same idea on the client, then it&apos;d be able to ask the server for more data without involvement from the developer manually asking for data when the client is fetching data.</p>]]></content:encoded></item><item><title><![CDATA[Imagining a sample TodoMVC with an immutable database]]></title><description><![CDATA[As an exercise to see what the developer experience might look like, I outlined what it might look like for a React-like single-page app. The app that's outlined below is a variation on the TodoMVC.]]></description><link>https://interjectedfuture.com/immutable-database-for-interactive-apps/</link><guid isPermaLink="false">63d558a5dbfa25000190a881</guid><category><![CDATA[programming]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Thu, 02 Feb 2023 02:01:44 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2023/12/DALL-E-2023-12-12-19.35.29---Illustration-for-an-article-titled--Imagining-a-Sample-TodoMVC-with-an-Immutable-Database-.-The-image-depicts-a-sleek-and-modern-digital-workspace--wh.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://interjectedfuture.com/content/images/2023/12/DALL-E-2023-12-12-19.35.29---Illustration-for-an-article-titled--Imagining-a-Sample-TodoMVC-with-an-Immutable-Database-.-The-image-depicts-a-sleek-and-modern-digital-workspace--wh.jpg" alt="Imagining a sample TodoMVC with an immutable database"><p>In the last post, I detailed the <a href="https://interjectedfuture.com/state-management-library-front-ends-are-looking-for-is-a-database/">various problems that could be solved by an immutable database on the front-end</a>. In summary, the four state management problems, and how they are solved by an immutable database are as follows:</p><ul><li>1&#xFE0F;&#x20E3; State is a cross-cutting concern: Now we can query both UI state and server state from anywhere.</li><li>2&#xFE0F;&#x20E3; Weak equivalence hinders re-rendering: No more <code>useCallback</code>, <code>useMemo</code>, or <code>key</code> annotations to help optimize rendering. The renderer can figure that all out on its own.</li><li>3&#xFE0F;&#x20E3; Distance between code and data breeds complexity: Developers can treat server state as local using database replication. The data will automatically get synced.</li><li>4&#xFE0F;&#x20E3; State transitions need linear traces: Application state is now observable by querying the transaction log of the immutable database.</li></ul><p>As an exercise to see what the developer experience might look like, I outlined what it might look like for a React-like single-page app. The app that&apos;s outlined below is a variation on the <a href="https://todomvc.com/">TodoMVC</a>.</p><h2 id="the-todo-mvc">The Todo MVC</h2><p>The outline is of a TodoMVC with a slight variation. It will have a side panel that holds details of the focused task. There will be a main panel with a list of tasks and a side panel with the details of the task. </p><figure class="kg-card kg-image-card"><img src="https://interjectedfuture.com/content/images/2023/02/todomvc-1.png" class="kg-image" alt="Imagining a sample TodoMVC with an immutable database" loading="lazy" width="1170" height="938" srcset="https://interjectedfuture.com/content/images/size/w600/2023/02/todomvc-1.png 600w, https://interjectedfuture.com/content/images/size/w1000/2023/02/todomvc-1.png 1000w, https://interjectedfuture.com/content/images/2023/02/todomvc-1.png 1170w" sizes="(min-width: 720px) 720px"></figure><p>Most of the uninteresting components will be at the end as a reference to fill in details, such as the data model and pure functional components. Let&apos;s focus on the interactive components with state.</p><h4 id="main-panel-a-list-of-tasks">Main panel: a list of tasks</h4><p>Here&apos;s the main panel with a list of tasks.</p><pre><code class="language-Javascript">let MainPanel = () =&gt; {    
    let tasks = useQuery(
    	{ find: [&quot;?t&quot;, &quot;?isDone&quot;, &quot;?title&quot;, &quot;?assignedTo&quot;] },
        { where: [
            [&quot;?t&quot;, &quot;task/assignedTo&quot;, &quot;?u&quot;],
            [&quot;?a&quot;, &quot;app/selectedUser&quot;, &quot;?u&quot;],

            [&quot;?t&quot;, &quot;task/isDone&quot;, &quot;?showComplete&quot;],
            [&quot;?a&quot;, &quot;app/taskCompleteFilter&quot;, &quot;?showComplete&quot;],

            [&quot;?t&quot;, &quot;task/isDone&quot;, &quot;?isDone&quot;],
            [&quot;?t&quot;, &quot;task/title&quot;, &quot;?title&quot;],
            [&quot;?u&quot;, &quot;user/name&quot; &quot;?assignedTo&quot;],
        ] })
    
    return &lt;div&gt;
        &lt;TaskFilterByCompletion /&gt;
        &lt;ul&gt;
            {tasks.map((task) =&gt; {
                &lt;TaskLineItem task={task} /&gt;
            })
        &lt;/ul&gt;
    &lt;/div&gt;
}</code></pre><p>The main panel does two things: query for the tasks and show the list of tasks. The list of tasks can be filtered for done or incomplete tasks with the <code>TaskFilterByCompletion</code> component.</p><p>The front-end has an immutable database that acts as the store. It holds both the application state and the remote server state. In this case, it holds the task filter criteria (<code>selectedUser</code> and <code>taskCompleteFilter</code>) and the list of tasks respectively. This means the list of tasks can be retrieved with a join in a single query.</p><div class="kg-card kg-callout-card kg-callout-card-yellow"><div class="kg-callout-emoji">2&#xFE0F;&#x20E3;</div><div class="kg-callout-text">Weak equivalence hinders re-rendering:<br>With a persistent data structure, all objects and functions would know whether they&apos;ve changed between renders.&#xA0;</div></div><h4 id="side-panel-task-details">Side Panel: Task details</h4><pre><code class="language-Javascript">let SidePanel = () =&gt; {
    let user = useQuery(
        { find: [&quot;?u&quot;, &quot;?name&quot;] },
        { where: [
        	[&quot;?u&quot;, &quot;user/name&quot;, &quot;?name&quot;],
            [&quot;?a&quot;, &quot;app/selectedUser&quot;, &quot;?name&quot;],
        ] }
    );
    
    let task = useQuery(
    	{ find: [&quot;?t&quot;, &quot;?title&quot;, &quot;?description&quot;, &quot;?labels&quot;] },
        { where: [
            [&quot;?t&quot;, &quot;task/title&quot;, &quot;?title&quot;],
            [&quot;?a&quot;, &quot;app/selectedTask&quot;, &quot;?t&quot;],
            
            [&quot;?t&quot;, &quot;task/description&quot;, &quot;?description&quot;],
            [&quot;?t&quot;, &quot;task/labels&quot; &quot;?l&quot;],
            [&quot;?l&quot;, &quot;label/name&quot; &quot;?labels&quot;],
        ] }
    );
    
    return &lt;div&gt;
    	&lt;UserSelector user={user} /&gt;
    	&lt;TaskDetails task={task} /&gt;
    &lt;/div&gt;
}</code></pre><p>In the side panel, we query for the user that&apos;s currently being selected, and the details of a task. </p><p>Because the state is accessible anywhere in the component tree, we can query for the application state here, and pass what we need down to the pure functional components. There is no <code>useState</code> to pull up and down the component hierarchy. There is no prop drilling from the top-level store. There is no top-level <code>Context</code> component to nest and split up to optimize rendering.</p><div class="kg-card kg-callout-card kg-callout-card-yellow"><div class="kg-callout-emoji">1&#xFE0F;&#x20E3;</div><div class="kg-callout-text">State is a cross-cutting concern:<br>With a single store for application state accessible by query anywhere in the component tree, it helps solve the problem of cross-cutting concerns of state in an interactive app.</div></div><h4 id="interactive-components">Interactive Components</h4><p>The interactive components are where the action&apos;s at, so let&apos;s take a look at the <code>TaskLineItem</code></p><pre><code class="language-Javascript">let TaskLineItem = (task: Task) =&gt; {
    let db = useDb();
    
    let changeTaskTitle = useTx(
        { in: [&quot;?id, &quot;?title&quot;] },
        { update: [ [&quot;?id&quot;, &quot;task/title&quot;, &quot;?title&quot;] ] }
    );
        
    let handleToggleDone = (_) =&gt; {
        db.tx({ &quot;db/id&quot;: task.id, &quot;task/isDone&quot;: !task.isDone });
    }
    
    let handleChangeTaskName = (evt) =&gt; {
    	changeTaskTitle({ id: task.id, title: evt.value });
    }
    
    let handleFocus = () =&gt; {
    	db.selectTask({ selectedTask: task.id });
    }
    
    let handleBlur = () =&gt; {
    	db.selectTask({ selectedTask: null });
    }
    
    return &lt;li onFocus={handleFocus} onBlur={handleBlur}&gt;
    	&lt;Checkbox isChecked={task.isDone} onToggle={handleToggleDone} /&gt;
        &lt;TextField text={task.title} onChange={handleChangeTaskName} /&gt;
    &lt;/li&gt;
}</code></pre><p>There is no query here, because the component is passed the task, but the query could have existed as just well here. When a database is immutable and local, there is not an N+1 problem, and just querying for a task when you need it within a component is not a problem.</p><p>Outside of a query for data, it has handlers and transaction queries to change the state. In <code>handleToggleDone</code>, the transaction is written directly inside the handler. In <code>changeTaskTitle</code>, the transaction can be specified as a locally reusable hook. Lastly, a transaction can be centralized across an application in a single file, so that state transitions for a particular feature can be seen all in the same place. It&apos;s also possible to add guards to these transactions, because they&apos;re effectively state transitions in a state machine.</p><pre><code>// specify direct queries, or as a higher level state machine

let db.selectTask = useTx(
    { in: [&quot;?selectedTask&quot;] },
    { where: [ [&quot;?a&quot;, &quot;app/selectedTask&quot;] ] },
    { update: [ [&quot;?a&quot;, &quot;app/selectedTask&quot;, &quot;?selectedTask&quot;] ] },    
);</code></pre><div class="kg-card kg-callout-card kg-callout-card-yellow"><div class="kg-callout-emoji">4&#xFE0F;&#x20E3;</div><div class="kg-callout-text">State transitions need linear traces:<br>An advantage of having a single store is that state transitions are all in the same place, and they can be linearized to aid in understanding. This is possible either by inspecting the file defining the state machine, or better yet, by querying the transaction log of the database.</div></div><p>We can ask the database all the state transitions that lead to the current value shown in a UI. When a user hits an error, the transaction events can be sent over to the developers to reproduce the bug exactly. This can drastically help debugging.</p><h3 id="query-language">Query language</h3><p>The query is modeled after Datomic&apos;s variant of Datalog. One realization after doing this exercise, is that while Datalog is powerful and simple to join, it&apos;s quite verbose. </p><p>A major requirement of any query language to make this work are two-fold:</p><ul><li>The query has to be parameterizable. The <code>where</code> clauses are usually filter conditions for the query that needs input from the user. We typically do string interpolation for raw SQL queries which is a security issue. ORMs do this right by make queries parameterizable. GraphQL also does this, but it&apos;s a little verbose.</li><li>The query has to be composable. SQL does have subqueries and common table expressions now. GraphQL has fragments. Both are great help, but I don&apos;t see a lot of tools or organizing principles to help developers organize these into a useful library of composable queries.</li></ul><p>I&apos;ll have to have a closer look at <a href="https://en.wikipedia.org/wiki/Language_Integrated_Query">LINQ</a>, <a href="https://learnsql.com/blog/what-is-common-table-expression/">CTE</a>, and <a href="https://www.scattered-thoughts.net/#imp_v3">Imp</a> to see if there&apos;s something that works for this problem.</p><hr><p>References to other parts of the Todo MVC that isn&apos;t particularly important to the discussion, but help fill out the mental model:</p><h3 id="data-model">Data model</h3><p>The data model is a little bit more involved with task details such as labeled and the assigned user, but nothing too complicated.</p><pre><code class="language-Typescript">type Task = {
    isDone: boolean;
    title: string;
    assignedTo: Option&lt;User&gt;;
    labels: List&lt;Label&gt;;
}

type User = {
    name: string;
}

type Label = {
    name: string;
}</code></pre><h3 id="top-level-page">Top level page</h3><pre><code class="language-Javascript">let App = () =&gt; {
	return &lt;div&gt;
    	&lt;MainPanel /&gt;
        &lt;SidePanel /&gt;
    &lt;/div&gt;
}</code></pre><h3 id="pure-display-components">Pure Display Components</h3><pre><code class="language-Javascript">let TaskDetails = (task: Task) =&gt; {
    return &lt;&gt;
    	&lt;h4&gt;{task.title}&lt;/h4&gt;
        &lt;p&gt;{task.description}&lt;/p&gt;
        &lt;Labels labels={task.labels} /&gt;
    &lt;/&gt;
}

let Label = (label: Label) =&gt; {
	return &lt;span&gt;
    	{label.name}
    &lt;/span&gt;
}</code></pre>]]></content:encoded></item><item><title><![CDATA[The state management library front-ends are looking for is a database]]></title><description><![CDATA[State is a cross-cutting concern. Weak equivalence hinders re-rendering. Distance between code and data breeds complexity.  State transitions need linear traces.]]></description><link>https://interjectedfuture.com/state-management-library-front-ends-are-looking-for-is-a-database/</link><guid isPermaLink="false">63cc0fc3dbfa250001909f85</guid><category><![CDATA[programming]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Wed, 25 Jan 2023 05:06:00 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2023/01/iamwil_a_more_complex_shaped_3D_puzzle_natural_lighting_cute_3D_27cd2990-3ffc-4c84-886f-51470aa9fefe.png" medium="image"/><content:encoded><![CDATA[<img src="https://interjectedfuture.com/content/images/2023/01/iamwil_a_more_complex_shaped_3D_puzzle_natural_lighting_cute_3D_27cd2990-3ffc-4c84-886f-51470aa9fefe.png" alt="The state management library front-ends are looking for is a database"><p>Most programmers manipulated state directly ever since we first learned to program. We think of state manipulation as being inherently tied to programming. It&apos;s not until we&apos;ve experienced being completely overwhelmed by system complexity that we start to learn one major source of that complexity is having to maintain, manipulate, and keep track of state.</p><p>As outlined in <a href="https://blog.acolyer.org/2015/03/20/out-of-the-tar-pit/">Out of the Tar Pit</a>, a large source of complexity is <a href="https://www.worldofbs.com/minimize-state/">due to state management</a>. The sentiment is also echoed in various pockets of different programming ecosystems <a href="https://www.youtube.com/watch?v=-6BsiVyC1kM">trying to find a better way</a>. </p><p>Recently, I went back and looked for blog post rants on React, and reactions they produced on HN and lobsters (listed at the end of this post), and the following summarizes some of the major sources of complexity due to state management.</p><h3 id="state-is-a-cross-cutting-concern">State is a cross-cutting concern</h3><p>A source of complexity is that application state is a cross-cutting concern in an interactive app of even the most modest complexity. It&apos;s not uncommon for multiple sub-trees of an app to require access to the same data model. While &quot;views as a pure function of state&quot; is a good architectural choice, it also implies the shape of the view is the same as the shape of the data model. This isn&apos;t commonly the case. </p><p>For example, in an app like Figma, the data model for a visual element is used across all sub-components, such as the hierarchy sidebar, the attribute sidebar, and the main canvas view. This means the data model of the visual component can&apos;t live in any one of the three as there wouldn&apos;t be a clean way for the state of the visual component to be communicated across sibling branches of the component tree. </p><p>The advice for such a situation is to pull the data model up to the common ancestor of all three sections, which in this case is the root component. When we do this, there are downsides.</p><p>The application state now needs to be propagated all the way down to the descendant component that displays it. That requires prop drilling, which introduces a lot of boilerplate and manual threading the deeper the descendant lives. While we can use Context to forgo prop drilling, this choice can un-intuitively affect rendering performance. For large apps, Context often needs to be split and ordered so that the outer-most Context is unlikely to change.</p><h3 id="weak-equivalence-hinders-re-rendering">Weak equivalence hinders re-rendering</h3><p>React simplifies developers&apos; mental model by modeling rendering as blitting the entire component tree onto the screen on every frame. In reality, it incrementally updates the DOM by keeping track of which parts of the data model have changed, and hence which component to re-render. However, it&apos;s hard for React to know when data has changed in every case because Javascript was not built with immutability as a core tenet. It doesn&apos;t have a good definition of equivalence to enable the rendering engine to tell when two objects or two functions are the same. Hence, it has a hard time telling if attributes have changed from the last render.</p><p>To compensate, React requires developers to give it hints manually using keys in lists, useMemo, and useCallback. Unlike types in Haskell, these mechanisms aren&apos;t tools to help think through the architectural design. Rather they are just bookkeeping you need to do to satisfy a performance issue with the renderer.</p><p>In the ideal case, the whole system would be able to propagate an incremental change in the data model through to an incremental change in the DOM model without manual annotations from the developer. It can do that if the underlying runtime and language could tell when two objects or two functions were the same or not from frame to frame.</p><h3 id="distance-between-data-and-code-breeds-complexity">Distance between data and code breeds complexity</h3><p>Beyond managing the local front-end state, a modern interactive app has to manage the remote state on the server. When data is in a different place than the code that needs to operate on it, complexity ensues. A lot of things can go wrong when we try to get data &quot;over there&quot; where it is stored to &quot;over here&quot; where the code is, and we fail most of the time and paper over the results.</p><p>Web apps are naturally distributed systems. We could mostly pretend that they weren&apos;t in multi-page apps (MPAs) because the business logic lived close enough to the data that we could pretend they were in the same place. Any transactional writes could be pushed down to the database and any views were stateless, so we could blow it away for a new application state on any action. [1] The UI state was simply the URL the user was currently visiting.</p><p>This simple architecture was easy to build and scale in a distributed manner, but it sorely lacked the responsiveness of desktop apps for end-users. In a desire to mimic this responsiveness without reloading the entire page in this distributed environment, developers moved the UI state and certain business logic [2] to the front-end under the banner of single-page apps (SPAs).</p><p>What we gained in responsiveness, we paid for in complexity. Now that business logic was on the front-end, the server data is now &quot;over there&quot;. So we now need a way to query, fetch, store, and apply the data from &quot;over there&quot; to &quot;over here&quot; over an unreliable network. And we need to do it while retaining state on the front-end.</p><p>As is well-known in the <a href="https://nighthacks.com/jag/res/Fallacies.html">Eight Fallacies in Distributed Systems</a>, this is a recipe for a whole host of additional complexity. To name a few:</p><ul><li>Deduping multiple requests for the same data into a single request.</li><li>Knowing when data is out-of-date or expired.</li><li>Updating out-of-date data in the background.</li><li>Updates that don&apos;t trample on other updates from other users.</li><li>Retry policy on failures.</li></ul><p>React Query and Hasura Subscriptions can help mitigate a lot of the problems here, but they don&apos;t go far enough. Either move UI state and business logic back to the server and give HTML a more expressive language to declare responsive interfaces [3], or move the working set of data to the front-end, so it can be treated as local data.</p><h3 id="state-transitions-need-linear-traces">State transitions need linear traces</h3><p>Redux is an implementation of the Flux-proposed architecture of one-way data binding and serialization of state transitions. This made it much easier to reason about the effect of changing state over its two-way binding predecessors. </p><p>Effectively, it&apos;s a state machine that serializes states between state transitions in a reliable way. It&apos;s certainly a simplifying way to think about state changes, and it&apos;s a pity that Javascript is such an impedance mismatch that it generates a generous amount of boilerplate.</p><p>Modeling state management as a state machine is only a step up if the developer was previously just <a href="https://www.smashingmagazine.com/2018/01/rise-state-machines/">thinking about state changes as a nested series of if-else statements</a>. While state machines bring clarity by making impossible states impossible, a state machine expressed linearly in text (even as a DSL) is still quite verbose. Unless there is a better notation, it can&apos;t be used as a design and thinking tool without graphic visualizations. [4] So if a developer was already thinking about reducers as state machine transitions, most of the benefit is actually from the serialization of state transitions.</p><p>With a serialization (linear list) of state transitions, it&apos;s easier to trace a history of changes to the application state using messages as labels of user intent. More importantly, serialization also brings deterministic reproducibility to state changes and leaves the potential for conflict resolution when there are multiple concurrent sources of change to the front-end store.</p><h3 id="pointing-to-an-immutable-database">Pointing to an Immutable Database</h3><p>This isn&apos;t an exhaustive list of the troubles with state management in front-end interactive apps on the web. However, they all seem to point to the same place: an immutable database for building interactive apps on the web.</p><p>Why does the front-end need a database for state management? Many of the problems listed above are problems that databases have solved before for backend or distributed systems. To start, one advantage is that databases centralize the access points for data and are accessible anywhere from the app. This helps solve the problem that state is a cross-cutting concern for even the most modestly complex app.</p><p>Databases have two other features that help the state management problem: serialized transactional writes and replication.</p><p>Databases have long dealt with the hard problem of coordinating writes between multiple clients with transactions. Databases also keep a serialized log of these transactions to establish durability guarantees in case of catastrophic events like power loss. Querying this log would solve the original problem flux attempted to solve. It&apos;d be a way of tracing the effect of a state change on the entire visual component tree over time.</p><p>Databases also feature replication between primary and secondary instances. This can be used to fetch new data from the server to the relevant working set for the front-end. Currently, developers manually fetch data from the server. But to do it well, developers need to account for timeouts with retries, cache invalidation for performance, deduping multiple requests and batching them, and pagination with lazy loading of data, all for efficiency. Most developers simply don&apos;t handle these common edge cases, and just leave users with infinite spinners, slow loads, or inconsistent views. It&apos;s even more challenging when data was changed on the server by another client, and we need to push the change to the browser to update its view without a manual refresh from the user. By leveraging database replication, between client and server, the front-end developer can treat data as completely local and not worry about network-induced state management from the server. [6]</p><p>But why an immutable database? An immutable database is like other databases except that once data is written, it&apos;s never changed. To &quot;update&quot; a piece of data, the new data is appended and versioned. For example, a table storing the current President of the United States wouldn&apos;t delete the previous president when a new president is elected. The new president would simply be appended to the series of presidents as a new version.</p><p>Keeping every version of data sounds like it would take too much space. In practice, this isn&apos;t a problem. Typically, only a small part of the data in any dataset is updated at a time. Therefore the data between versions have most of their data shared in common. Immutable databases leverage persistent data structures that rely on structural sharing that exploit the commonality between versions. [5]</p><p>It&apos;s this persistent data structure and structural sharing that makes it easier for the system to reason about equivalence. Two objects are the same if they point to the same part of the data structure storing the data. We can also tell exactly how two different versions are different due to the structural sharing in the persistent data structure. With that, we can tell a renderer exactly what has changed, and what it needs to re-render in turn.</p><p>In addition, immutability means there doesn&apos;t need to be coordination for reads, because once written, data will never change. That drastically reduces the complexity of multiple users in a system.</p><p>Lastly, an immutable database isn&apos;t enough. It needs to support incremental queries. Typical database queries assume that the data is stationary and it&apos;s the queries that need to be flexible. Incremental queries turn this around, where it&apos;s the query that&apos;s stationary, but data is streaming in over time. Incremental queries are more commonly found in real-time streaming databases and systems for big data, but they can be useful in interactive apps for the rendering pipeline.</p><p>When the application state held in the immutable database changes, an immutable query can update its result and propagate the difference all the way to the rendering engine, which in turn can efficiently tell what visual components in the tree need to change. This simplifies the state management for developers because they no longer need to give hints to the framework about which parts of the app to re-render.</p><p>To sum up:</p><ul><li>State is a cross-cutting concern: A database would allow a component from anywhere in the app query for application state.</li><li>Weak equivalence hinders re-rendering: An immutable database leverage persistent data structures, which makes it easy to tell whether data has changed or not between renders.</li><li>Distance between code and data breeds complexity: Developers can treat application state as local, rather than remote using database replication. In an immutable database data will never change out from under a read.</li><li>State transitions need linear traces: The database can also serialize transactions and provide an immutable log for tracing, regardless of where in the application the write is coming from.</li></ul><h3 id="a-step-back">A step back</h3><p>All this hullabaloo about a database on the front-end is only necessary if we decide that our web app definitely needs to move the UI state and business logic to the browser. This isn&apos;t the only architectural choice we can make. In fact, Rails Hotwire and Phoenix Liveview take a different tack. They keep most of the high-level UI state and business logic on the backend, close to the server database, and only ship small UI state and a (mostly) immutable view to the browser.</p><p>And given the amount of state management problems in the state-of-the-art front-end frameworks, you&apos;d do well to consider hard whether you&apos;d actually need it. </p><p>But given that you do, I&apos;d argue that the state management library front-ends are looking for is a database.</p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">&#x1F4E7;</div><div class="kg-callout-text">If you have insights or have thought about this at all. Feel free to email me or DM me @iamwil on Twitter.&#xA0;</div></div><hr><p>[1] Besides the user session cookie.<br>[2] Business logic outside of access control.<br>[3] Interestingly, this is the direction Rails Hotwire and Pheonix Liveview take.<br>[4] State management libraries that focus on modeling the problem as a state machine are mistaken for this reason.<br>[5] This is how Git stores its data on disk. Blockchains also use this data storage pattern.<br>[6] That said, the developer may still need to fetch data through a 3rd party API whose database he doesn&apos;t control. In this instance, it seems like algebraic effects would be better suited.</p><p><strong>Problems with State Management in React-likes</strong></p><ul><li><a href="https://marmelab.com/blog/2022/09/20/react-i-love-you.html">React I Love You, but You&apos;re Bringing Me Down</a></li><li><a href="https://acko.net/blog/get-in-zoomer-we-re-saving-react/">Get in Zoomer, We&apos;re saving React</a></li><li><a href="https://frontendmastery.com/posts/the-new-wave-of-react-state-management/">The new wave of React state management</a></li><li><a href="https://blog.isquaredsoftware.com/2021/01/context-redux-differences/">Why React Context is Not a &quot;State Management&quot; Tool (and Why It Doesn&apos;t Replace Redux)</a></li><li><a href="https://leerob.io/blog/react-state-management">Past, Present, and Future of React State Management</a></li><li><a href="https://www.matuzo.at/blog/2023/single-page-applications-criticism/">Why I&apos;m not the biggest fan of Single Page Applications</a></li></ul>]]></content:encoded></item><item><title><![CDATA[Persistent Data Structure Redux]]></title><description><![CDATA[<p>This is a review of all the resources I found and learned about persistent data structures in the course of looking for a solution for maintaining state.</p><h2 id="complexity-from-maintaining-state">Complexity from maintaining state</h2><p>One of the hard things about programming is <a href="https://interjectedfuture.com/the-broken-half-of-interactive-programs/">how to maintain and manage state</a>, especially at scale. In interactive</p>]]></description><link>https://interjectedfuture.com/persistent-data-structure-redux/</link><guid isPermaLink="false">63af5ccbdbfa250001909b57</guid><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Fri, 30 Dec 2022 23:50:31 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2023/11/DALL-E-2023-11-27-15.29.41---Illustration-for-the-article-title--Persistent-Data-Structure-Redux-.-The-scene-is-set-in-a-futuristic-library--where-ancient-and-modern-elements-blen.png" medium="image"/><content:encoded><![CDATA[<img src="https://interjectedfuture.com/content/images/2023/11/DALL-E-2023-11-27-15.29.41---Illustration-for-the-article-title--Persistent-Data-Structure-Redux-.-The-scene-is-set-in-a-futuristic-library--where-ancient-and-modern-elements-blen.png" alt="Persistent Data Structure Redux"><p>This is a review of all the resources I found and learned about persistent data structures in the course of looking for a solution for maintaining state.</p><h2 id="complexity-from-maintaining-state">Complexity from maintaining state</h2><p>One of the hard things about programming is <a href="https://interjectedfuture.com/the-broken-half-of-interactive-programs/">how to maintain and manage state</a>, especially at scale. In interactive programs, this manifests itself as all the bookkeeping required to fetch data from remote APIs, send and retrieve data from the database, and maintain the state of the view not dependent on the data (such as the state of a pull-down or current animation state of a UI component).</p><p>In my search for prior art on the topic, I came across <a href="https://www.datomic.com/">Datomic</a>, a functional immutable database. It&apos;s a database you can treat as a value in the functional sense, rather than a &quot;place&quot; where you can fetch and update data.</p><p>Why is this a problem? It turns out that much of the complexity in our modern web systems stems from the problem of maintaining state &quot;over there&quot;. This phenomenon is outlined nicely in Juan Pedro Bolivar Puente&apos;s talk.</p><figure class="kg-card kg-embed-card"><iframe width="200" height="113" src="https://www.youtube.com/embed/sPhpelUfu8Q?start=59&amp;feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen title="CppCon 2017: Juan Pedro Bolivar Puente &#x201C;Postmodern immutable data structures&#x201D;"></iframe></figure><p>When data can change out from under you after you read it, there&apos;s a lot of work to compensate for it. Without that work, it&apos;s hard for humans to reason about, because we tend to think and reason in a sequence where only one thing happens at a time. So we just start adding the complexity, thinking that there isn&apos;t a way out.</p><figure class="kg-card kg-image-card"><img src="https://interjectedfuture.com/content/images/2022/12/Screen-Shot-2022-12-30-at-2.08.25-PM.png" class="kg-image" alt="Persistent Data Structure Redux" loading="lazy" width="2000" height="1147" srcset="https://interjectedfuture.com/content/images/size/w600/2022/12/Screen-Shot-2022-12-30-at-2.08.25-PM.png 600w, https://interjectedfuture.com/content/images/size/w1000/2022/12/Screen-Shot-2022-12-30-at-2.08.25-PM.png 1000w, https://interjectedfuture.com/content/images/size/w1600/2022/12/Screen-Shot-2022-12-30-at-2.08.25-PM.png 1600w, https://interjectedfuture.com/content/images/2022/12/Screen-Shot-2022-12-30-at-2.08.25-PM.png 2190w" sizes="(min-width: 720px) 720px"></figure><h2 id="immutability-is-the-star">Immutability is the star</h2><p>Some people get hung up about types in functional programming when immutability is the star. Immutability guarantees that something won&apos;t change out from under you after you read or write it. This single thing simplifies the architecture of our systems drastically when we can swing it. </p><p>But what about databases? Could databases, explicitly used for maintaining state, be used as a value? Yes, if the database never ever deleted a record, and updates are actually just writing a new version. That&apos;s <a href="https://www.youtube.com/watch?v=V6DKjEbdYos">Rich Hickey&apos;s insight when he built Datomic</a>. </p><p>What would it take to build your own functional immutable database? You need your own persistent data structure.</p><h2 id="persistent-data-structures">Persistent data structures</h2><p>Persistent data structures are immutable. Once written they can never change. Once you read it, you&apos;re guaranteed that some other thread isn&apos;t going to come around and change it. Any updates to the persistent data structures actually amount to creating a new version with the update, so any older versions stay intact.</p><p>Now, there&apos;s no coordination required from reading data, which simplifies the problem of reading data. Writing data also doesn&apos;t require coordination if a single source of truth doesn&apos;t need to be maintained. But if you want everyone to agree on a single view of truth, then writes typically needs to be serialized.</p><p>One objection to this scheme may be that it would take up too much space. In practice, each version of the data uses structural sharing. That&apos;s where new versions of the data structure reuse data that didn&apos;t change from the previous version.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://fuzhe1989.github.io/images/2020-11/persistent-data-structure-02.png" class="kg-image" alt="Persistent Data Structure Redux" loading="lazy"><figcaption>From https://fuzhe1989.github.io/2017/11/07/persistent-data-structure/</figcaption></figure><p>Chances are you use version control software like Git, and it uses a very similar idea. Every commit structurally shares the files that didn&apos;t change from the previous commit.</p><p>I initially started looking at surveys and talks on persistent data structures. </p><ul><li><a href="https://web.archive.org/web/20210415014736/http://www.cs.tau.ac.il/~haimk/papers/persistent-survey.ps">Persistent data structure survey</a></li><li><a href="https://www.youtube.com/watch?v=T0yzrZL1py0">MIT 6.851 Advanced Data Structures Lecture 1</a> [<a href="http://courses.csail.mit.edu/6.851/spring12/lectures/">lecture and note list</a>]</li><li>Okasaki&apos;s <a href="https://www.cs.cmu.edu/~rwh/students/okasaki.pdf">Purely Functional Data Structures</a></li></ul><p>I found them to be rather broad, so I started looking at specific tree algorithms. A good place to start for any database retrieval algorithm is <a href="https://blog.jcole.us/2013/01/10/btree-index-structures-in-innodb/">B+ Trees</a>.</p><p>I thought it was a good starting place to implement a B+ Tree in Rust to get familiar with the language [1]. Turns out typical data structures such as a linked list are bad starting points for Rust. These data structures usually have two pointers pointing at the same piece of data, which violates the basic ownership enforcement required in Rust programs. This required me to unlearn many of the things I learned when I learned C. The following were good reads on this topic: </p><ul><li><a href="https://cglab.ca/~abeinges/blah/rust-btree-case/">Rust Collections Case Study: BTreeMap</a></li><li><a href="https://rust-leipzig.github.io/architecture/2016/12/20/idiomatic-trees-in-rust/">Idiomatic tree and graph-like structures in Rust</a></li></ul><p>After reading about the weaknesses of B+ Trees, and looking through a bunch of others, I happened upon the <a href="https://en.wikipedia.org/wiki/Hash_array_mapped_trie">Hash Array Map Tries (HAMTs)</a> by Phil Bagwell. I saw an implementation of this in the <a href="https://github.com/bodil/im-rs">im.rs library</a>. The confluence of im.rs, Bagwell&apos;s work on <a href="http://lampwww.epfl.ch/papers/idealhashtrees.pdf">Ideal Hash Trees</a>, and Clojure&apos;s own implementation lead me to Relaxed Radix Balanced Tries (RRBTs), a variation on the <a href="https://en.wikipedia.org/wiki/Radix_tree">Radix Trie</a>.</p><p>Radix Tries structures itself after the key, leading to large fanouts, and keeping all data at the leaves of the trie. By keeping the trie balanced as an invariant, we can achieve O(log n) lookups.</p><p>Clojure uses some variant of this, which is well described by L&apos;Orange in his blog post series, as well as Krukow&apos;s posts.</p><ul><li><a href="https://hypirion.com/musings/understanding-persistent-vector-pt-1">Understanding Clojure&apos;s Persistent Vectors, pt. 1</a>, <a href="https://hypirion.com/musings/understanding-persistent-vector-pt-2">pt. 2</a>, <a href="Understanding Clojure&apos;s Persistent Vectors, pt. 3">pt. 3</a></li><li><a href="https://hypirion.com/musings/understanding-clojure-transients">Understanding Clojure&apos;s Transients</a></li><li><a href="https://hypirion.com/musings/persistent-vector-performance-summarised">Persistent Vector Performance Summarised</a></li><li><a href="http://blog.higher-order.net/2009/09/08/understanding-clojures-persistenthashmap-deftwice">Understanding Clojure&apos;s PersistentHashMap (deftwice...)</a></li><li><a href="http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap-part-ii.html">Assoc and Clojure&apos;s PersistentHashMap: part ii</a></li></ul><p>But specifically, RRBTs I found three main sources in Bagwell, L&apos;Orange, Stucki, and Bol&#xEC;var Puete.</p><ul><li><a href="https://infoscience.epfl.ch/record/169879/files/RMTrees.pdf">RRB-Trees: Efficient Immutable Vectors</a></li><li><a href="https://hypirion.com/thesis.pdf">Improving RRB-Tree Performance through Transience</a></li><li><a href="https://www.cs.purdue.edu/homes/rompf/papers/stucki-icfp15.pdf">RRB Vector: A Practical General Purpose Immutable Sequence</a></li><li><a href="https://raw.githubusercontent.com/nicolasstucki/scala-rrb-vector/master/documents/Master%20Thesis%20-%20Nicolas%20Stucki%20-%20Turning%20Relaxed%20Radix%20Balanced%20Vector%20from%20Theory%20into%20Practice%20for%20Scala%20Collections%20(Print).pdf">Turning Relaxed Radix Balanced Vector from Theory into Practice for Scala Collections</a></li><li><a href="https://public.sinusoid.es/misc/immer/immer-icfp17.pdf">Persistence for the Masses: RRB-Vectors in a Systems Language</a></li><li><a href="https://www.youtube.com/watch?v=sPhpelUfu8Q">CppCon 2017: Juan Pedro Bolivar Puente &#x201C;Postmodern immutable data structures&#x201D;</a></li><li>&#x1F419;&#x1F431; <a href="https://github.com/Magnap/persistent-vector-rs/blob/master/src/lib.rs">Magnap/persistent-vector-rs</a></li><li>&#x1F419;&#x1F431; <a href="https://github.com/hypirion/c-rrb/">hypirion/c-rrb</a></li></ul><p>I found that I had to piece together information from all of these sources to figure out an implementation. But once it&apos;s in your head, it&apos;s actually pretty intuitive. The most complicated method to implement is concatenation, as it has to keep a relaxed version of its invariants to guarantee its properties. But with both `concatenation` and `split`/`slice` you can implement insertion. </p><p>In the process of doing the implementation, <a href="https://openai.com/blog/chatgpt/">OpenAI released ChatGPT</a>. I tried using it to help, as an experiment. </p><p>I asked it to explain and convert the pseudocode to Rust. It was a mild success. While I couldn&apos;t use the code it generated, the comments and code in Rust made it more clear what was going on.</p><figure class="kg-card kg-embed-card"><blockquote class="twitter-tweet"><p lang="en" dir="ltr">I&apos;ve also used ChatGPT as a sounding board to help me understand papers. Here, I ask it to convert pseudocode to Rust and ask it to explain every line to me. <a href="https://t.co/SaqoQnS16F">pic.twitter.com/SaqoQnS16F</a></p>&#x2014; Wil Chung (@iamwil) <a href="https://twitter.com/iamwil/status/1600738380970479616?ref_src=twsrc%5Etfw">December 8, 2022</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
</figure><h2 id="defaults-matter">Defaults matter</h2><p>Javascript has a bunch of immutable libraries, but to my knowledge, it&apos;s not overwhelmingly popular. I think languages like Clojure that build it in as a default have a leg up on other languages. </p><p>Immutable data structures should be the default in a new modern programming language, and mutation of state should be an intentional act by the programmer. The advantage of immutability everywhere you can swing it, even in places you didn&apos;t think you could before should be able to create wins in reducing complexity in our software.</p><hr><p>[1] I decided to go with Rust instead of Zig, since I was a new systems programmer. I figured having the guard rails would really help. Turns out Rust is a really complicated language.</p>]]></content:encoded></item><item><title><![CDATA[Compiling Composable Reactivity]]></title><description><![CDATA[What if the changes to the input result in state changes that can't be computed fast enough to keep the user interface snappy for an interactive program? Reactivity to the rescue. It's a way to drastically reduce the amount of computation required to compute an output given a change to the input.]]></description><link>https://interjectedfuture.com/compiling-composable-reactivity/</link><guid isPermaLink="false">626e31acc728c800010ec50b</guid><category><![CDATA[programming]]></category><category><![CDATA[web]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Tue, 16 Aug 2022 22:56:31 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2022/09/1206817536_A_young_scientist_pouring_liquids_into_a_beaker_with_multi_colored_reactions_in_a_warm_and_brightly_lit_room_with_wooden_panels_and_hanging_plants__by_George_Grie__rendered_by_Unreal_engine__In_a_warmer_color_pa.png" medium="image"/><content:encoded><![CDATA[<img src="https://interjectedfuture.com/content/images/2022/09/1206817536_A_young_scientist_pouring_liquids_into_a_beaker_with_multi_colored_reactions_in_a_warm_and_brightly_lit_room_with_wooden_panels_and_hanging_plants__by_George_Grie__rendered_by_Unreal_engine__In_a_warmer_color_pa.png" alt="Compiling Composable Reactivity"><p>The first programs were more like programmable calculators. Fire it up with some inputs and wait for the output. Then in 1963, Ivan Sutherland <a href="https://en.wikipedia.org/wiki/Sketchpad">conceived</a> of a program that facilitated a continuous dialog between computers and humans. You could type or move a pen, and the computer screen updated in real-time. It was the first interactive program. </p><figure class="kg-card kg-embed-card"><iframe width="200" height="150" src="https://www.youtube.com/embed/YB3saviItTI?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen title="Ivan Sutherland Sketch Pad Demo - Short version"></iframe></figure><p>Though commonplace today, it was non-obvious and took a leap of insight to see it and its potential. But despite approximately half a century of building interactive software, we&apos;re still figuring out how to do it well.</p><figure class="kg-card kg-image-card"><img src="https://interjectedfuture.com/content/images/2022/08/Paper.Journal.1.png" class="kg-image" alt="Compiling Composable Reactivity" loading="lazy" width="1560" height="1170" srcset="https://interjectedfuture.com/content/images/size/w600/2022/08/Paper.Journal.1.png 600w, https://interjectedfuture.com/content/images/size/w1000/2022/08/Paper.Journal.1.png 1000w, https://interjectedfuture.com/content/images/2022/08/Paper.Journal.1.png 1560w" sizes="(min-width: 720px) 720px"></figure><p>The typical interactive program of today generally has this structure. Starting at the bottom of the interactive loop, a component tree is rendered on-screen. A user clicks on an interactive element, like a button, and generates an event. The event is sent to the state manager to update the application state. Then, the component tree is rendered as a pure function of application state, thus completing the cycle.</p><p><em>In </em><a href="https://interjectedfuture.com/the-broken-half-of-interactive-programs/"><em>The Broken Half of Interactive Programs</em></a>, we focused on the stateful interactive side of the main loop (yellow box). In this post, we&apos;ll examine the side of the loop where we render components as a pure function of state (green arrow) [0].</p><p>What could be simpler? If all complexities of dealing with state are pushed to the stateful side of the main loop, then the functional render side is just dependent on its inputs from the application state. However, there are times when this doesn&apos;t hold true. What if the changes to the input result in state changes or view renderings that can&apos;t be computed fast enough to keep the user interface snappy for an interactive program? Reactivity to the rescue. It&apos;s a way to drastically reduce the amount of computation required to compute an output given a change to the input.</p><p>In this post, we&apos;ll cover the following:</p><ul><li>Establish why we&apos;d want reactive systems from a user-centric view.</li><li>Attempt to distill and identify the core problem in reactive systems.</li><li>Discuss the tradeoffs for design choices addressing this core problem.</li><li>Speculate on using composable derivatives for reactivity.</li></ul><h3 id="what-and-why-of-reactivity">What and why of reactivity?</h3><p>Reactivity is where a program recomputes state changes when its inputs change, at an environmentally determined speed (such as the user&apos;s clicks), and not by the program itself. Additionally, reactive systems allow a design where developers specify <em>what</em> the dynamic result should be, without specifying the <em>how</em> thus simplifying the system mental model for programmers. [1]</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://interjectedfuture.com/content/images/2022/08/Paper.Journal.2.png" class="kg-image" alt="Compiling Composable Reactivity" loading="lazy" width="1560" height="1170" srcset="https://interjectedfuture.com/content/images/size/w600/2022/08/Paper.Journal.2.png 600w, https://interjectedfuture.com/content/images/size/w1000/2022/08/Paper.Journal.2.png 1000w, https://interjectedfuture.com/content/images/2022/08/Paper.Journal.2.png 1560w" sizes="(min-width: 720px) 720px"><figcaption>Non-exhaustive examples of programs in each quadrant</figcaption></figure><p>Why is reactivity a desired property for interactive programs? Interactive programs facilitate a dialog between human users and computers to find solutions to problems the human user is trying to solve. Practically speaking, making computers wait for humans is preferred over making humans wait for computers. The creative flow for problem-solving breaks when humans have to wait. When a program is slow, the human user never knows if the computer is still processing or if it has gotten stuck. It&apos;s rarely apparent whether to retry or how much longer to wait. This distraction becomes top of mind, rather than the problem at hand.</p><p>In contrast, a responsive interactive program helps the human-computer dialog iterate towards a solution. This dialog becomes a core loop where the user immediately sees the results of making a change. Each result generates ideas of what to try and change next. The quicker the cycle, the faster a user can iterate through a problem space in search of a creative solution. [2]</p><p>Games call this loop a core game loop, but we can simply call it a <em>core loop</em>. The time budget available to return a computed result is defined by how fast a human can cycle through the core loop (remember, we want computers to wait on humans). The faster the cycle, the shorter the time budget. </p><p>And yet, not all interactive programs require the use of reactive techniques. Games are the best example of interactive programs that rarely use reactive techniques. Why is this? A game&apos;s goal is to be fun, and fun derives from <a href="https://theoryoffun.com/">allowing the player to make meaningful choices</a> in the simulated world. An exact accurate simulation of the world is often secondary (and sometimes in opposition) to the fun. Hence, a game can oftentimes get away with approximate answers, so they opt to use the simplest and fastest computation, not the most accurate one. This gives them much leeway to be fast without reactive techniques. [3] That said, there is <a href="https://github.com/pmndrs/react-three-csg">recent work</a> on <a href="https://acko.net/blog/live-headless-react/">reactive</a> <a href="https://codesandbox.io/s/csg-bunny-usegroups-tewiso?file=/src/App.js">3D scene rendering</a> for the web.</p><p>However, there are still good reasons to use reactive techniques for both interactive and non-interactive programs. One reason is in domains that require exactly the right answer without redoing all the work. For non-interactive programs, good examples would be <a href="https://www.gnu.org/software/make/">build</a> <a href="https://bazel.build/">systems</a> and <a href="https://github.com/frankmcsherry/blog/blob/master/posts/2018-05-19.md">borrow-checkers</a>. For interactive programs, the poster child is the venerable <a href="https://www.microsoft.com/en-us/microsoft-365/excel">spreadsheet</a>.</p><p>Another good reason to leverage reactivity is when the size of the dataset is too large to conceivably return a computational result in milliseconds through brute force. Even with fast computers, there are computational jobs that are beyond the reach of brute force to return a result within milliseconds. For example, <a href="https://flink.apache.org/">streaming</a> <a href="https://spark.apache.org/">systems</a> for big data often need to process petabytes of data. These systems leverage reactivity to make processing practical and tractable.</p><p>Lastly, a good reason to use reactivity is when a required external subsystem is too slow. For example, <a href="https://reactjs.org/">Front-end</a> <a href="https://svelte.dev/">web</a> <a href="https://vuejs.org/">apps</a> render using browser DOM elements, which are well-known to be heavyweight objects that are expensive to create and destroy. </p><p>But then, why not just write imperative code to just minimally update the DOM? This can work, if the scope of the state changes required by the application is small enough. For example, if there&apos;s just a single list of elements to update, such as in a todo list or pins on a map. But just as manual memory management is manageable by a human up to a certain point, so it is with manually updating the DOM. A more complex UI with inter-relate parts, such as a vector drawing app is likely to tax a developer on what state changes should update which part of the view and when. At least in front-end web apps and Rust GUI ecosystems, we&apos;ve found declarative one-way data flow modeling of GUIs to be an easier mental model of the system for developers. </p><p>A declarative interface, by definition, removes control from the developer so they only have to specify the <em>what</em>, not the <em>how&#x2013;</em>the exact series of steps. This responsibility of the <em>how</em> doesn&apos;t go away. It&apos;s merely shifted from the developer to the reactive library. But now, efficiency is also the responsibility of the library. Reactivity allows front-end web frameworks to support a declarative interface while ceding control to the library to maintain efficiency by minimizing changes to the slow DOM tree. </p><p>In short, reactivity is a complexity that system implementers are willing to deal with in order to make the core loop practical when brute-forcing an exact answer isn&apos;t. [4]</p><h3 id="the-core-problem">The core problem</h3><p>The design space for reactive systems is large, and different people have covered different sets of considerations subject to different constraints. At the end of this post, there is an overview of various posts on the design space for various reactive systems.</p><p>To summarize the various attempts to parameterize the design space of reactive systems, we ask: What is the core problem when designing such a system?</p><blockquote>The fastest program is a program that does nothing at all.<br>- Programming Koan</blockquote><blockquote>To make a fast program, just never make it slow.<br>- Paul Buchheit</blockquote><p>Fast programs are largely about exploiting the inherent data structure of a problem domain to avoid doing work. This can involve not doing work that you&apos;ve already done, or not doing work that doesn&apos;t affect the final output. <a href="https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html">Grep is famously fast</a> because it does so little. </p><p>Reactive systems model computation carefully to answer the question: If the input changes by this small amount, what&apos;s the minimum re-computation to get the same answer as the full computation? </p><p>To answer this question, creators of reactive systems need fine-grained control over the execution of code to decide which computations to recompute. We achieve this with two techniques commonly used together.</p><p>First, we ask the developer to rewrite their code more explicitly as a computational graph of data dependencies. In Apache Spark, it&apos;s a program with only the functional operators it makes available. In React, it&apos;s the react hooks with data dependencies relating them to one another. In build systems like Make, it&apos;s explicit file dependencies between build steps.</p><p>Second, the reactive system implements a scheduler to decide which computation should be executed and when. This is often embedded in the reactive library. For languages that don&apos;t allow runtime modification, such as javascript, the scheduler is implemented in userland. In fact, <a href="https://overreacted.io/react-as-a-ui-runtime/">React is actually a runtime</a> on top of a browser. [5]</p><p>Now then, should the system propagate deltas or should it perform diffs between the subsequent computational graphs? </p><p>Propagating deltas means that each node computes how a change in input will change its output. This change in output, rather than the output value itself, is propagated through the computational graph. </p><p>On the other hand, performing diffs means that we build an entire new graph on every cycle through the core loop. And then we diff the current vs previous computational graph to see what has changed. </p><p>In <a href="https://www.scattered-thoughts.net/writing/an-opinionated-map-of-incremental-and-streaming-systems">An Opinionated Map of Incremental and Streaming Systems</a>, the difference between the two approaches is called <em>unstructured vs structured systems.</em></p><p>Structured systems typically propagate deltas through the computational graph, but the downside is that they restrict the operations available to the user. Unstructured systems allow for any type of user operation, so it typically executes diffs. The downside is the implementation complexity, and if the update semantics aren&apos;t internally consistent, it is frustrating to debug. </p><p>This is a base design decision that affects everything else since that is the core problem: it&apos;s the mechanism by which we determine whether or not to recompute. Let&apos;s compare the trade-offs between the two designs.</p><h3 id="the-limits-of-a-structured-system">The limits of a structured system</h3><p>Structured systems restrict the set of operators available to the user. But with this carefully picked set of functional operations, deltas can be propagated through the entire computational graph. Streaming frameworks seem to favor this design. For example, <a href="https://timelydataflow.github.io/differential-dataflow/">Differential Dataflow</a> <a href="https://youtu.be/zsL2YxMXaKw?t=1059">propagates deltas</a> along with multi-dimensional counters to label intermediate computational results at different points in time and in loops, so it knows how to combine them correctly when all of the input data for an output data has been seen. </p><p>Because we know what has changed, rather than just that something has changed, we don&apos;t need to perform tree diffs or define equivalence. All the problems in unstructured systems mentioned above are answered by the delta. </p><p>Why not default to propagating deltas in all reactive systems? </p><p>It seems to be partially a cultural stumbling block. Structured systems would require mainstream developers to rewrite their algorithms in this limited set of functional operators. Given the limited (but growing!) popularity of functional programming, it&apos;s a tall ask. Big-data streaming systems have orders-of-magnitude bigger workloads than front-end web frameworks, so the choice of delta propagation over tree diffs is likely because of performance. If a limited operator set is the trade-off for performance and tractability of large problems, users of streaming systems are more than willing to make that trade.</p><p>That said, even if this trade-off was made willingly, it can be hard to use. MapReduce is <a href="https://www.researchgate.net/publication/221460084_YSmart_Yet_another_SQL-to-MapReduce_translator">famously hard to translate into queries by hand</a>, and even functional systems like Apache Spark provide more operators than just map and reduce now for jobs that query. It now has a SQL variant on top of the functional language for this very use case.</p><h3 id="the-baggage-with-unstructured-systems">The baggage with unstructured systems</h3><p>In unstructured systems, we can allow arbitrary computation, but because unstructured systems don&apos;t propagate deltas, it has to take the diff between the previous application state/tree and the current application state/tree in the computational cycle. It has to have an answer to the equivalence question: &quot;what does it mean when two things are the same?&quot;, so it can make decisions about whether or not to recompute and in what order.</p><p>Why would that be difficult? Equivalence is already a subject of debate amongst mathematicians, not to mention amongst programmers. Some software systems treat things with the same memory location as equivalent. Other systems treat pure functional computational nodes with the same inputs as equivalent. </p><p>These attempted one-size-fits-all answers are problematic. They&apos;re insensitive to the problem domain&apos;s inherent data structure that yields exploits for faster execution. One class of these exploits is to skip recompute when changes to input don&apos;t change the output in a meaningful way. For example, if the problem domain is compiling code, changing a comment shouldn&apos;t trigger a re-compilation. Hence, many of these unstructured systems have a default consistent and predictable (if dumb) recomputation heuristic but leave an escape hatch ajar for the user to override for their specific domain.</p><p>But leaving it up to the user to wire up case-by-case is a recipe for bugs and glitches, and it&apos;s easy for a user to get it wrong and end up with a hard-to-debug system. If a computational node decides to re-run more often than needed, then it is over-subscribed. (<a href="https://www.joshwcomeau.com/react/why-react-re-renders/">correct but not efficient</a>) If a computational node decides to re-run less often than needed, then is it under-subscribed. (minimal computation&#x2013;but at worst incorrect answers, at best just glitchy intermediate answers that are hard to debug) As <a href="https://acko.net/blog/the-incremental-machine/">The Incremental Machine</a> and <a href="https://medium.com/hackernoon/becoming-fully-reactive-an-in-depth-explanation-of-mobservable-55995262a254">An In-depth Explanation of MobX</a> both noted, this is a cache invalidation problem in disguise.</p><p>How come it&apos;s a hard problem to know when the data changes in unstructured systems? We know exactly when the data changes--it&apos;s when the dirty bit gets set. The reason we set a dirty bit is that triggering recomputes at this moment would result in updating all the children to intermediate values. This means that those children might be processed again as we traverse down the computation graph. Until all nodes have been marked, the process doesn&apos;t have enough information to ascertain the minimal set of computational nodes to run for the correct output. The decision of whether or not to invalidate cannot be done locally, one step at a time in a single pass.</p><h3 id="other-complications">Other complications</h3><p>So far, we&apos;ve mostly considered this problem of designing reactive systems as deciding whether or not to recompute specific nodes of a computational graph. The other knobs for tweaking the design of reactive systems have to do with how they would work with other commonly used language features.</p><p>Some of these systems don&apos;t allow for loops and functions. It is true you might not need a loop to iterate across datasets if you have a map operator, but what if you need loops to iterate across time and across a core loop cycle? Sometimes it would be needed to iterate a dataset to a fixed point. How would reactive systems model functions? Is it possible to reuse parts of the computation graph, or even pass it around to insert dynamically? <a href="https://www.youtube.com/watch?v=G6a5G5i4gQU">Self-adjusting computations</a> explicitly address this use case. </p><p>The problem of reactivity is complicated when a computational node in our graph has a data dependency that lives outside of our system across the network. How closely the application is expected to sync with remote data? Does the node wait forever for data? How often should it check? What happens on error, and should the user know? Most of these questions can&apos;t be answered by the reactive system and are left to the user to manually hook up. As a result, it&apos;s often error-prone and feels glitchy to the end-user without a concerted effort and attention paid by the developer. [6] This is a problem most acutely felt by front-end frameworks, since streaming systems typically have a reliable data store to draw from. </p><p>That&apos;s not all. A reactive system&apos;s design is further complicated if the system is expected to handle old data. Since some of these formulations of reactive systems keep internal state&#x2013;such as when a <em>fold</em> is supported as an operator&#x2013;complications arise when parallel or concurrent computation is introduced, such as data races. That&apos;s where front-end frameworks fall over, as they&apos;re not built to address these problems, and people reach for streaming systems built for big data with all their constraints.</p><p>These complications are mentioned for completeness, rather than as an attempt to address them. The decision of whether and how much to recompute is at the heart of the problem, and the two approaches between structured and unstructured are limited in their perspective ways. Is there a way we can have the best of both worlds?</p><h3 id="does-discrete-differentiation-exist">Does Discrete Differentiation exist?</h3><p>In this search through the current state of reactive systems, one question loomed in my mind. As opposed to the common numeric differentiation, does discrete differentiation exist? For example, how would we compute, represent, and compute the derivative of a functional operation such as string append? More importantly, is there a Chain Rule for almost any functional computation?</p><!--kg-card-begin: html--><p>
    In numeric computations, if we want to compute
</p>

<pre>f(x<sub>t</sub>) where x<sub>t</sub> = x<sub>t-1</sub> + &#x394;</pre> 

<p>
    (current input is the previous input with a little change), we can take one of two routes. We can either compute it directly or we can compute 
</p>

<pre>f(x<sub>t-1</sub>) + f&apos;(&#x394;)</pre>

<p>    
    In reactive systems, we&apos;d prefer the latter if <code>f(x<sub>t</sub>)</code> is expensive to compute, but <code>f&apos;(&#x394;)</code> (the derivative) isn&apos;t. 
</p><!--kg-card-end: html--><p>Just finding the derivative of a specific function isn&apos;t enough. To remove the constraint of limited operators, we want a Chain Rule to help us compose the derivatives. The Chain Rule would compute the compositional delta from end-to-end. That means we wouldn&apos;t need to calculate diffs between trees, mark dirty bits, or worry about equivalence problems when comparing two trees. It would also be internally consistent as inputs changed.</p><p>What are some open questions?</p><p>In addition, not all operators have a delta to propagate. Cryptographic functions like SHA256 are, by design, incrementation proof. The whole point of their design is to generate outputs that are statistically white-noise when their input changes a little bit. But even if most functional operators had a delta, it&apos;s not yet clear to me what it would look like for classic operators like append, concat, push, pop, etc.</p><p>And as often noted in introductions to <a href="https://www.youtube.com/watch?v=wG_nF1awSSY">auto-differentiation</a> for deep learning, symbolic differentiation for numeric computations often becomes untenable because of the combinatorial explosion of terms as you take the nth derivative. The deeper the neural network, the higher the derivative. Worse, the terms from one function are often repeated and entangled in different parts of the entire expression&#x2013;a result of a composition that doesn&apos;t compose cleanly. </p><p>What would a possible solution look like? The closest thing I&apos;ve come across is Conal Elliott&apos;s <a href="http://conal.net/papers/compiling-to-categories/compiling-to-categories.pdf">Compiling to Categories</a>. [7]</p><h3 id="compiling-to-categories">Compiling to Categories</h3><p>The idea is to <a href="http://conal.net/papers/compiling-to-categories/compiling-to-categories.pdf">compile a program into a category theory representation,</a> called <a href="https://en.wikipedia.org/wiki/Cartesian_closed_category">Closed Cartesian Category</a> (CCC), as an intermediate representation (IR) of code. We can then use a &quot;template&quot; to compile this IR into anything else. In our case, we want to compile into functions that propagate deltas but also still compose.</p><p>In Conal Elliott&apos;s demos, he compiles a short program into CCC. And then, using a &quot;template&quot;, he can <a href="https://youtu.be/vzLK_xE9Zy8?t=1827">compile it into a graphical DAG representation</a>. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://interjectedfuture.com/content/images/2022/08/Screen-Shot-2022-08-04-at-10.47.41-PM.png" class="kg-image" alt="Compiling Composable Reactivity" loading="lazy" width="1800" height="1284" srcset="https://interjectedfuture.com/content/images/size/w600/2022/08/Screen-Shot-2022-08-04-at-10.47.41-PM.png 600w, https://interjectedfuture.com/content/images/size/w1000/2022/08/Screen-Shot-2022-08-04-at-10.47.41-PM.png 1000w, https://interjectedfuture.com/content/images/size/w1600/2022/08/Screen-Shot-2022-08-04-at-10.47.41-PM.png 1600w, https://interjectedfuture.com/content/images/2022/08/Screen-Shot-2022-08-04-at-10.47.41-PM.png 1800w" sizes="(min-width: 720px) 720px"><figcaption>From http://conal.net/papers/compiling-to-categories/compiling-to-categories.pdf</figcaption></figure><p>Then, with the flick of a different template, he&apos;s able to <a href="https://youtu.be/vzLK_xE9Zy8?t=2160">compile the same code into Verilog</a>, a language for programming FPGAs. With another, he&apos;s able to <a href="https://youtu.be/vzLK_xE9Zy8?t=2345">compile it into GLSL for a GPU</a>.</p><p>But the real magic happens when we compile the IR into code with functions that are function-like but aren&apos;t traditional functions&#x2013;they&apos;re embellished in some way. </p><p>One such example is functions that keep track of their deltas&#x2013;differentiable functions. Differentiable functions are useful for constructing neural networks with auto-differentiation since manually writing derivatives for each layer is error-prone and hard to debug. With a template, he&apos;s able to <a href="https://youtu.be/vzLK_xE9Zy8?t=2584">compile a program to be auto-differentiable</a>. </p><p>Auto-differentiation isn&apos;t a far stretch from incremental and reactive systems. And indeed, he <a href="http://conal.net/papers/compiling-to-categories/compiling-to-categories.pdf">has a section where he addresses incremental systems</a>. [8] The purported advantage of this construction of a reactive system is that differentiable functions compose cleanly, and it doesn&apos;t hold state, hence making it easily parallelizable. </p><h3 id="discussion">Discussion</h3><p>While reactivity is deceptively simple when pure functions are involved, it&apos;s easily complicated by temporal locality, the choice of propagating differences, and how equivalence is determined. All of this complexity is absorbed by the implementor in hopes of providing a logical model for the end user that is functional (or declarative), and will run fast enough to produce a result in the short time allotted by the user&apos;s core loop cycle speed.</p><p>It&apos;s intriguing whether this construction of compiling to and from Closed Cartesian Categories can yield a process by which reactive systems are much simpler to build and maintain. Even better if it yields faster systems with more flexibility in operations. </p><p>Could reactive systems be more broadly used to build databases and query languages? The answer seems to be yes. <a href="https://github.com/vmware/differential-datalog">Differential Datalog</a> and <a href="https://github.com/stevedekorte/skipdb">SkipDB</a> are two attempts in this direction. Could we build a runtime that is reactive by default? Could we provide a better developer experience with a reactive compiler? Or will it just collapse on itself with complexity and be hard to use/debug due to inconsistencies?</p><p>Through all this, it may not be worth doing any of it at all, depending on the domain. Today&apos;s computers are faster at doing math, than it is at loading data from memory. Memoization may be slower than just redoing the computation all over again. It&apos;d really have to be an order of magnitude slower than loading from main memory for these techniques to be worth it. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://interjectedfuture.com/content/images/2022/08/part101_infographics_v08.png" class="kg-image" alt="Compiling Composable Reactivity" loading="lazy" width="2000" height="1667" srcset="https://interjectedfuture.com/content/images/size/w600/2022/08/part101_infographics_v08.png 600w, https://interjectedfuture.com/content/images/size/w1000/2022/08/part101_infographics_v08.png 1000w, https://interjectedfuture.com/content/images/size/w1600/2022/08/part101_infographics_v08.png 1600w, https://interjectedfuture.com/content/images/2022/08/part101_infographics_v08.png 2400w" sizes="(min-width: 720px) 720px"><figcaption>Other than division, all math ops are in the first order of magnitude. <a href="http://ithare.com/infographics-operation-costs-in-cpu-clock-cycles/">Source</a></figcaption></figure><p>That said, it&apos;s clear to me that the current technique of diffing states in unstructured systems has lots of problems, and that propagating deltas with composable constructions seems like a more promising way to go.</p><p>Many thanks to <a href="https://news.ycombinator.com/user?id=ericd">Eric Dementhon</a> and <a href="https://twitter.com/sridatta">Sri Thatipamala</a> for reading drafts.</p><hr><h3 id="supplemental-overview-of-the-design-space-for-reactive-systems">Supplemental overview of the design space for reactive systems</h3><p><a href="https://raphlinus.github.io/ui/druid/2019/11/22/reactive-ui.html">Towards a unified theory of reactive UI</a> focuses on reactive systems for building GUIs in Rust. The primary mental model is simplifying GUI rendering as a pipeline of steps transforming application state through intermediate representations to become a set of draw calls to the GPU. The design decisions for such a system are:</p><ul><li><strong>data structure vs a computational trace.</strong> The representations of various states after transformations can either be a tree data structure or an execution trace (a linear order of instructions).</li><li><strong>push vs pull interface.</strong> Is code recomputed through a function call (pull), or are changes notified to those subscribing to those changes (pull)?</li><li><strong>explicit deltas vs diffing.</strong> To implement a declarative UI and to provide incremental computation, we&apos;d need to discern what has changed. Should we propagate deltas from cycle to cycle in the core loop? Or should we diff the current and previous trees representing state in the pipeline?</li><li><strong>identity by: object reference vs key path vs unique id.</strong> Lastly, given a changing state tree from cycle to cycle, how do we get a stable identity for nodes requiring change?</li></ul><p><a href="https://dl.acm.org/doi/pdf/10.1145/3236774">Build Systems a la Carte</a> focuses more on build systems. While different on the surface, it&apos;s effectively the same problem that GUIs try to tackle. A build system must be a) correct and b) rebuild out-of-date keys at most once. By correct, it means when the build system completes, the target key and all its dependencies should be up-to-date. And build order of the out-of-date keys should respect their dependencies. Hence, it focuses on two key design parameters that are orthogonal:</p><ul><li><strong>What order to build</strong>. This decision ensures the dependencies of a computational node are computed before the node itself is computed.</li><li><strong>Whether to build</strong>. This decision ensures nodes aren&apos;t computed more than needed. </li></ul><p><a href="https://www.scattered-thoughts.net/writing/an-opinionated-map-of-incremental-and-streaming-systems">An Opinionated Map of Incremental and Streaming Systems</a>, on the other hand, breaks down the variety of reactive systems into a decision tree. </p><figure class="kg-card kg-image-card"><img src="https://interjectedfuture.com/content/images/2022/08/Screen-Shot-2022-08-05-at-8.54.22-PM.png" class="kg-image" alt="Compiling Composable Reactivity" loading="lazy" width="1532" height="1314" srcset="https://interjectedfuture.com/content/images/size/w600/2022/08/Screen-Shot-2022-08-05-at-8.54.22-PM.png 600w, https://interjectedfuture.com/content/images/size/w1000/2022/08/Screen-Shot-2022-08-05-at-8.54.22-PM.png 1000w, https://interjectedfuture.com/content/images/2022/08/Screen-Shot-2022-08-05-at-8.54.22-PM.png 1532w" sizes="(min-width: 720px) 720px"></figure><ul><li><strong>Structured vs Unstructured.</strong> Structured systems restrict the operations available to the user, and they typically propagate deltas through the computational graph. Unstructured systems allow for any type of user operation, and typically execute diffs between the current and previous computational graph to determine which nodes require recomputation. (similar to <a href="https://raphlinus.github.io/ui/druid/2019/11/22/reactive-ui.html">explicit deltas vs diffing</a> mentioned above)</li><li><strong>Low Temporal Locality vs High Temporal Locality.</strong> For streaming systems, data can arrive out of order. How do we incrementally update the result when old data arrives (low temporal locality)? Or do we have a tight window where we ignore old data (high temporal locality)?</li><li><strong>Internally Consistent vs Internally Inconsistent.</strong> When systems are incremental, the internal state will vacillate as values stream into the system. The systems that aren&apos;t internally consistent could be in a state that doesn&apos;t make sense when read. This can be a problem. In GUIs, this manifests itself as visual jank when a page is loading.</li><li><strong>Eager vs Lazy. </strong>An eager system actively computes new outputs whenever the inputs change. A lazy system waits until the output is requested. Lazy systems, however, can&apos;t be used for monitoring use cases.</li></ul><p><a href="https://blog.janestreet.com/breaking-down-frp/">Breaking down FRP</a> focuses more on reactive systems for web apps. It identifies the variety of properties you&apos;d want from a functional reactive system, but notes you can&apos;t have them all in the same system. They are trade-offs that you&apos;d have to pick and choose. The following properties are</p><ul><li><strong>History-sensitivity.</strong> How do we redo our computation incrementally when old data shows up in our front-end? (similar to temporal locality above)</li><li><strong>Efficiency.</strong> This refers to both space efficiency and computational efficiency. Minimize both the amount of past to remember, and minimize the amount of recomputation on an input change. However, it&apos;s hard to have both.</li><li><strong>Dynamism.</strong> This is the ability to reconfigure the computational graph over time, as a response to changing input. But a dynamically changing graph makes it harder to determine what it should do when old data shows up.</li><li><strong>Ease of Reasoning. </strong>Clean semantics that makes it easy to reason about the behavior of the system. (This is related to internally consistent vs internally inconsistent above)</li></ul><hr><p>[0] <a href="https://staltz.com/unidirectional-user-interface-architectures.html">A survey of unidirectional front-end architectures</a> shows us some variations.</p><p>When people say UI is a pure function of state, the state they&apos;re referring to is application data, not the state of various widgets. While you can consider widget states as part of the application state, it&apos;s too fine-grained. The end user rarely cares about the state of a widget.</p><p>[1] Some choice quotes on a definition of a reactive system. </p><blockquote>The essence of functional reactive programming is to specify the dynamic behavior of a value completely at the time of declaration<br><a href="https://svelte.dev/blog/svelte-3-rethinking-reactivity">https://svelte.dev/blog/svelte-3-rethinking-reactivity</a></blockquote><blockquote>Reactive programs maintain a continuous interaction with their environment, at a speed which is determined by the environment, not by the program itself.<br>G&#xE9;rard Berry (RR-1065, INRIA. 189)</blockquote><p>[2] In <a href="https://www.youtube.com/watch?v=PUv66718DII">Inventing on Principle</a>, Bret Victor puts it in a different way. </p><blockquote>...here&apos;s something I&apos;ve come to believe: <strong>Creators need an immediate connection to what they&apos;re creating.</strong><br><br>And what I mean by that, is when you&apos;re making something, if you make a change or you make a decision, you need to see the effects of that immediately. There can&apos;t be a delay, and there can&apos;t be anything hidden.</blockquote><p>In fact, most consumer-facing applications are of this nature: playing games, writing emails, constructing 3D models, and more. </p><p>[3] There are other reasons games are fast, such as using languages that allow for <a href="https://en.wikipedia.org/wiki/Single_instruction,_multiple_data">SIMD</a>. That said, the data pipe to the GPU is slow, so games tend to batch up all the triangles it wants to draw and push them out to the GPU all at once. &#xA0;</p><p>[4] I&apos;m referring to the complexity we&apos;re willing to deal with as implementers, not as client-user developers.</p><p>[5] It seems both Haskell and Julia allow a programmer to change the runtime. Wouldn&apos;t be surprised if both Lisp and Smalltalk allow this as well. <br><br>React is a runtime because it actually has a scheduler that decides when and how much to run the hooks and rendering. It will yield to the browser when its time is up, and <a href="https://www.youtube.com/watch?v=ZCuYPiUIONs">resume where it left off</a> when it regains control from the browser. </p><p>Not all front-end frameworks are runtimes. Svelte, takes a different approach. It compiles reactive code into imperative code. It is the imperative code that is run in the browser.</p><p>[6] Immutability and passing of values help here. If interested, <a href="https://www.hyperfiddle.net/">Hyperfiddle</a> and <a href="https://www.youtube.com/watch?v=9TYfcyvSpEQ">Datomic</a> offer a refreshing look at the problem here. </p><p>[7] Conal Elliott is the originator of <a href="https://blog.danlew.net/2017/07/27/an-introduction-to-functional-reactive-programming/">Functional</a> <a href="https://blog.janestreet.com/breaking-down-frp/">Reactive</a> <a href="http://conal.net/papers/icfp97/">Programming</a>, which influenced <a href="elm-lang.org/">Elm</a>, which in turn influenced <a href="https://redux.js.org/">Redux</a>. But since then, he&apos;s moved on to this idea of compiling to categories.</p><p>[8] Incremental Computation is discussed in <a href="http://conal.net/papers/compiling-to-categories/compiling-to-categories.pdf">section 7.6</a></p>]]></content:encoded></item><item><title><![CDATA[The Broken Half of Interactive Programs]]></title><description><![CDATA[While most front-ends today have largely agreed on the design of the rendering subsystem, we seem to be grappling with design of the interactive subsystem. ]]></description><link>https://interjectedfuture.com/the-broken-half-of-interactive-programs/</link><guid isPermaLink="false">626e31acc728c800010ec521</guid><category><![CDATA[web]]></category><category><![CDATA[programming]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Thu, 14 Apr 2022 04:54:14 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2022/04/pexels-vanessa-loring-5966631.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://interjectedfuture.com/content/images/2022/04/pexels-vanessa-loring-5966631.jpg" alt="The Broken Half of Interactive Programs"><p>We still don&apos;t know how to structure interactive programs. </p><p>In the last couple of years, web front-ends evolved quickly to converge to a design with two subsystems that depend on each other. The first subsystem renders components in a view as a pure function of application state. The second subsystem leverages a state manager and some way to deal with side effects to provide interactivity.</p><p>This interactivity not only responds to user input, but also to network messages. While most front-ends today have largely agreed on the design of the rendering subsystem, we seem to be grappling with design of the interactive subsystem. We devise different ways to deal with side effects without clear tradeoffs. We can&apos;t decide where state should live, and how to manage state changes in a flexible way.</p><p>What should the the interactive subsystem of the program look like to keep the render side of the loop functional?</p><p>If you have any insight into this, <a href="mailto://iamwil@gmail.com">email me</a> or tweet <a href="https://twitter.com/iamwil">@iamwil</a>.</p><h2 id="the-evolution-of-front-ends">The evolution of front-ends</h2><p>Front-end architecture for the web has evolved quickly in the last couple of years. Early on, front-ends weren&apos;t much of a program. It was a hodge-podge of event handlers, and as the front-end grew in size, the corresponding complexity grew even faster. It became a rat&apos;s nest of dependencies of a component&apos;s state with incoming changes from both the user and the server.</p><p>This complexity only worsened with the desire to give users a low latency experience. To that end, we now build Single Page Apps, which manage their state in addition to handling user events and server updates. It became hard to keep views consistent both internally and with the state on the server.</p><p>We needed a way to organize the front-end programs.</p><!--kg-card-begin: markdown--><p>To that end, circa 2010, we&apos;ve had a slew of front-end frameworks such as <a href="http://backbonejs.org/">Backbone</a>, <a href="https://knockoutjs.com/index.html">Knockout</a>, <a href="https://www.emberjs.com/">Ember</a>, and <a href="https://angularjs.org/">AngularJS</a>, all with differing opinions on the topic. <a name="bref1"></a><sup>[<a href="#ref1">1</a>]</sup> Around 2013, <a href="https://reactjs.org/">React</a> introduced a simplification of front-end programs using functional concepts and the virtual DOM to bring <a href="https://oandre.gal/concepts/immediate-mode-vs-retained-mode/">immediate-mode rendering</a> to web developers. And in 2015, <a href="https://redux.js.org/">Redux</a> introduced a reducer as a sequential message processor to manage state changes.</p>
<!--kg-card-end: markdown--><p>While we&apos;ve had other libraries since then to challenge React and Redux, we seemed to have stabilized on this architecture that pairs functional rendering with one-way data flow with a managed state/effects. These two halves of the architecture are tied end to end to form a core loop of our application that makes it easier to reason about.</p><h2 id="the-core-loop">The Core Loop</h2><p>The core loop of most front-end architectures looks something like this:</p><figure class="kg-card kg-image-card"><img src="https://interjectedfuture.com/content/images/2022/04/Paper.Journal.1.png" class="kg-image" alt="The Broken Half of Interactive Programs" loading="lazy" width="1560" height="1170" srcset="https://interjectedfuture.com/content/images/size/w600/2022/04/Paper.Journal.1.png 600w, https://interjectedfuture.com/content/images/size/w1000/2022/04/Paper.Journal.1.png 1000w, https://interjectedfuture.com/content/images/2022/04/Paper.Journal.1.png 1560w" sizes="(min-width: 720px) 720px"></figure><!--kg-card-begin: markdown--><ol>
<li>One-way data flow from the state at the top of the application down to the rendered components. <a name="bref2"></a><sup>[<a href="#ref2">2</a>]</sup> Components are a pure function of state.</li>
<li>User actions are handled in the component rendering the UI. The handler emits messages to the state manager. The state manager changes the state based on the message.</li>
<li>State changes trigger a re-calculation and re-rendering of the components.</li>
</ol>
<!--kg-card-end: markdown--><p>Hence, the core loop of the application is easy to understand. On one side, application views are rendered as a pure function of the state. We can call this the render side. On the other side, user interactions invoke state changes that trigger re-renders. We can call this the interaction side.</p><p>It&apos;s worth noting at this point this design has analogs in other ecosystems. In enterprise back-ends, there&apos;s a design pattern called <a href="https://blog.ndepend.com/hexagonal-architecture/">Hexagonal Architecture</a> (or more aptly named: &quot;Ports and Adaptors&quot;). It makes the general recommendation: write applications with <a href="https://www.destroyallsoftware.com/screencasts/catalog/functional-core-imperative-shell">a functional core, and an imperative shell</a>.</p><p>&quot;Functional core, imperative shell&quot; encourages programmers to build the core of a program with pure functions until the point where the program needs to interact with the outside world. This happens at the boundaries of the program, such as user interactions, databases, and the network. At that boundary, use an imperative shell to manage these side effects.</p><p>We see web front-end applications converging on the same design. Let&apos;s look at each side in turn, though we&apos;ll spend more time on one than the other.</p><h2 id="the-render-side">The Render Side</h2><p>On the render side, we can see it has an analog in games. Before React, front-end programmers had to maintain both the application state and the DOM state in sync. Syncing these two states can be hard to do consistently and is error-prone. It contributes to the complexity and can introduce view consistency bugs. React introduced <a href="https://oandre.gal/concepts/immediate-mode-vs-retained-mode/">immediate-mode rendering</a> to web developers without calling it that. </p><p>In games, a frame is rendered entirely from the game state. No rendering results from previous frames are used to render the current frame. In other words, rendering is a pure function of state. While this can make a rendering API more low-level, it frees the programmer from syncing the application state with the renderer&apos;s state.</p><p>The advantage of this architecture is that the core is much easier to reason about and test. Pure functions are only affected by their inputs. And one doesn&apos;t need to mock up a graph of objects just to test a pure function.</p><blockquote>&#x2026; Because the problem with object-oriented languages is they&#x2019;ve got all<br>this implicit environment that they carry around with them. <strong>You wanted a banana but what you got was a gorilla holding the banana</strong> and the entire jungle.<br>- Joe Armstrong, creator of Erlang Programming Language</blockquote><p>The front-end library and ecosystem in Javascript seemed to have converged on a functional design on the render side. I won&apos;t expand on the benefits of functional programming here, as that is <a href="https://www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf">covered</a> <a href="http://raganwald.com/2014/12/20/why-why-functional-programming-matters-matters.html">elsewhere</a>. The current design issues rest on the other side of the core loop: the interaction side.</p><h2 id="the-interaction-side">The Interaction Side</h2><p>On the interaction side, the story is a little bit more complicated. On this side, a programmer has to contend with both user input and network comms with the server, both commonly regarded as side effects.</p><p>While pure functions are easier to reason about and test, it&apos;s impossible to build useful programs with just pure functions. Many useful programs are interactive programs that need to respond to user input or the outside world through the network. That&apos;s why we have an interaction side of the loop.</p><p>But by their nature, handling user interactions and network responses are not pure functions; the output of the side effects depends on inputs that are outside of the program, and hence outside of its control. To keep the useful design of a pure functional core with interactive elements, we&apos;re forced to push all side effects to the boundaries of the program. Then the question remains, how to best handle the impure interactive elements? This is where front-end programs and the larger functional programming communities differ.</p><h3 id="effects-with-monads">Effects with Monads</h3><p>In contrast to enterprise back-ends that use the hexagonal architecture, pure functional languages like Elm and Haskell, a program cannot explicitly execute side effects. When the pure functional core runs up against the boundary of the program, it uses monads and monad transformers to compose these side effects together. The composed side effects are queued up for the underlying runtime to execute. In Elm, they&apos;re called <em>commands</em>, and in Haskell, they&apos;re called <em>I/O monads</em>. One can think of it as using commands and I/O monads to tell the effects manager in the runtime what side effects to run.</p><p>On one hand, effect managers have the advantage of keeping the rest of the program purely functional and decoupling the event handler from the state manager. However, control flow is now harder to follow and no longer explicit. Typical stack traces are useless here, and we&apos;d have to use other tools to trace a message across the effect manager to its corresponding state update. To complicate tracing, a single event can emit one or more messages for state updates.</p><h3 id="effects-with-algebraic-effects">Effects with Algebraic Effects</h3><p>Monads and effect managers aren&apos;t the only solution to dealing with side effects at the boundaries of a program in a pure functional language. Experimental languages such as <a href="https://www.eff-lang.org/">Eff</a> and <a href="https://koka-lang.github.io/koka/doc/index.html">Koka</a> both feature algebraic effects as a way to manage side effects. </p><p><a href="https://overreacted.io/algebraic-effects-for-the-rest-of-us/">An algebraic effect is like a resumable exception</a>. Instead of using the underlying runtime to execute side effects, the programmer implements the side effect in what looks like a &quot;catch block&quot;. The side effect can then be invoked anywhere in the &quot;try block&quot;, even if it&apos;s invoked deep in the call stack. An invoked algebraic effect will jump back up to the &quot;catch block&quot; to execute the effect, and then resume back where it was invoked after the effect has been completed.</p><p>The only design I know of that remotely leverages algebraic effects on the interaction side are React Hooks. Unlike Redux and <a href="https://guide.elm-lang.org/architecture/">The Elm Architecture</a> (TEA) which keep all application states at the root of the component tree, hooks allow us to choose at which level state can live in the component tree. This can be useful for scoping view state, such as which dropdown was selected, to its local component tree. If the view state needs to be seen across sibling components, we have the option of pulling it up to the lowest common ancestor. Hooks also keep the view and relevant state together, which is easier than tracing messages across the effects manager to discern how a message would change the application state.</p><h3 id="effects-with-linear-types">Effects with Linear Types</h3><!--kg-card-begin: markdown--><p>Lastly, a less well-known method is using linear types to manage side effects.<a name="bref4"></a><sup>[<a href="#ref4">4</a>]</sup> Linear types are a <a href="https://en.wikipedia.org/wiki/Substructural_type_system">type system</a> that restricts a program to only a single reference to a variable. Side effects, such as network fetches, rely on the state of the world for their output. Since we can&apos;t rewind and rerun the world, it&apos;s impossible to do a network fetch twice and guarantee the same output as a pure function. Hence linear types are a way to constrain the programmer to make sure a side effect is never run more than once.</p>
<!--kg-card-end: markdown--><p>I don&apos;t know of any front-end interaction side design that leverages linear types. Perhaps there are Rust GUI libraries that would count. If you know of any with an interesting design, <a href="https://twitter.com/iamwil">let me know</a>.</p><p>Much of the effort from functional programming has been on how to deal with side effects at the boundaries. But for the design of the interaction side, there are other things to consider. Let&apos;s explore a couple below.</p><h2 id="where-should-state-live">Where should state live?</h2><p>As mentioned earlier, Redux and TEA keep a single global state at the root component, with a single reducer to manage state changes. Global state is possible because writes are sequenced linearly by the underlying runtime, and we&apos;re not subject to unpredictable data races.</p><p>On the upside, we can see all our state changes and updates in one place, the reducer. The reducer is usually implemented as one big case statement, but it should be seen as a finite state machine (FSM). With it, we get the advantages of an FSM: we can enumerate all good states and their transitions and <a href="https://www.youtube.com/watch?v=IcgmSRJHu_8&amp;t=1s">make impossible states impossible</a>.</p><p>Another upside of keeping all application states global is that it becomes trivial to implement infinite undo and time-traveling features. However, I&apos;ve rarely seen web apps leverage this. This is probably because outside of database-less web apps, this requirement would cascade supporting undo all the way into the database, which is no easy thing, as <a href="https://www.youtube.com/watch?v=D6nYfttnVco&amp;t=663s">most databases mutate in-place</a>.</p><p>The downside is that it&apos;s hard to get the granularity of the state updates correct. If it&apos;s too fine, the event handler may need to generate many messages for a single user action, which makes it harder to trace and debug. If it&apos;s too coarse, the programmer may find duplication between the different state changes in reaction to messages. In addition, with actions that fetch data over the network, there often needs to be intermediate states that represent the state of data while in transition and error to account for network failures. And while Redux has the concept of sub-reducers that compose, in practice, I found it cumbersome to move state changes in reducers up and down to keep the code organized and in lock step with my current understanding of the domain problem.</p><p>Additionally, an application may need to keep two different types of state, a <em>domain-model state</em> (i.e. list of todos that have been done) and a <em>view state</em> (i.e. the selected dropdown in a list of todos). In an application with global state, reusing the same component in another part of the app requires adding a new global view state. This is extra bookkeeping a programmer would need to keep track of. Hooks are better in this regard by bundling the view state of a component entirely within the component. Reusing a component involves just using the component, and not about coordinating the global view state with the component it references.</p><h2 id="how-should-state-be-referenced">How should state be referenced?</h2><p>A fundamental assumption of React is that the structure of the state tree largely matches the structure of the component tree. But this isn&apos;t always the case. In an illustration app (like Figma), the state of a shape (such as its x, y coordinates) has a single source of truth. However, that state is being referenced in the following sibling components: in the main canvas to draw the shape, in the right sidebar to show its relation to other components, and in the left sidebar to display its properties.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://interjectedfuture.com/content/images/2022/04/Screen-Shot-2022-04-09-at-5.44.01-PM.png" class="kg-image" alt="The Broken Half of Interactive Programs" loading="lazy" width="2000" height="1119" srcset="https://interjectedfuture.com/content/images/size/w600/2022/04/Screen-Shot-2022-04-09-at-5.44.01-PM.png 600w, https://interjectedfuture.com/content/images/size/w1000/2022/04/Screen-Shot-2022-04-09-at-5.44.01-PM.png 1000w, https://interjectedfuture.com/content/images/size/w1600/2022/04/Screen-Shot-2022-04-09-at-5.44.01-PM.png 1600w, https://interjectedfuture.com/content/images/2022/04/Screen-Shot-2022-04-09-at-5.44.01-PM.png 2106w" sizes="(min-width: 720px) 720px"><figcaption>From <a href="https://www.youtube.com/watch?v=_ISAA_Jt9kI">Recoil&apos;s presentation</a> of the problem.</figcaption></figure><p>In a case like this, the recommendation is often to pull the shape&apos;s state up to the first common ancestor between all the siblings. However, in React at least, this has the issue of forcing unnecessary re-renders in sibling components that don&apos;t use the shape state.</p><p><a href="https://recoiljs.org/">Recoil</a> is another state management library from Facebook that tries to solve this problem by introducing querying of the state, which decouples the shape of the data from the shape of the component tree. Solid.js claims to only <a href="https://www.solidjs.com/guides/reactivity#considerations">render all components once</a>, but rerun all its hooks, so it may not have this problem.</p><h2 id="what-if-we-make-time-explicit">What if we make time explicit?</h2><p>What if we make time explicit? If we treat user interactions not as a lone event in time, but as a signal of user interactions over all time? That way, we can treat it as an immutable value that we can plug into pure functions.</p><!--kg-card-begin: markdown--><p>Well-read readers would recognize this is what functional reactive programming (FRP) poses as a solution to structuring the interaction side of the core loop. <a name="bref3"></a><sup>[<a href="#ref3">3</a>]</sup></p>
<!--kg-card-end: markdown--><p>On paper, it doesn&apos;t sound like a bad idea. However, we&apos;ve done that experiment once. Elm started as a language embracing FRP, but <a href="https://elm-lang.org/news/farewell-to-frp">decided to say farewell to it</a>. It turns out TEA killed off FRP in the name of ease of learning and understandability. That may either be an indictment of the method or that most programmers weren&apos;t conceptually ready for it back then.</p><p>But perhaps it just needs to be introduced in a different guise. <a href="https://www.solidjs.com/">Solid.js</a> uses a hooks-like API for its state, but instead of calling it createState, it&apos;s called createSignal.</p><h2 id="how-should-state-be-updated">How should state be updated?</h2><p>What if state was versioned instead of updated? States would be considered immutable and append-only. We could treat the entire future and history of the state as a value, where previous versions are immutable and future versions are also immutable, but just not yet revealed to us.</p><p>At first thought, immutable state should make renders faster, because it&apos;d be easier to compare equality to decide whether a component depending on a state needs to re-render or not. The immutable values could also be shipped around and make it easier to communicate with remote resources like the server&apos;s database. However, it also means that state would need to be garbage collected when old states are no longer being used or referenced, and this might impact performance.</p><p>However, as I describe this, it sounds suspiciously like FRP, so I might have to make a separate post to figure out the implications.</p><h2 id="what-about-in-games">What about in games?</h2><p>As a passing thought, what do games do? They&apos;re the most interactive of all programs, to the point they&apos;re considered a medium.</p><p>But as far as I can tell, they don&apos;t do much hand-wringing. They just mutate state and be done with it. While the rendering system is a pure function of state, that&apos;s largely handled by the game engine. Most game code deals with the state imperatively, and I don&apos;t know of any games that are written in a functional language. However, the <a href="http://sevangelatos.com/john-carmack-on/">best game programmers understand the value of a functional style</a> in their programs, so at least there&apos;s some agreement on the &quot;functional core&quot; part.</p><p>How do games manage updating game state in a manageable way? I do know that finite state machines are commonly in use. And state can get complex with open world systemic games, where the desired property of the game is a combinatorial explosion of behaviors from a combination of game mechanics, such as <a href="https://www.pcgamer.com/how-cats-get-drunk-in-dwarf-fortress-and-why-its-creators-havent-figured-out-time-travel-yet/">cats vomiting in taverns in Dwarf Fortress</a>.</p><p>But overall, I&apos;m less familiar with what developers typically do here, the problems they run up against. <a href="https://twitter.com/iamwil">Let me know</a> if you know.</p><h2 id="discussion">Discussion</h2><p>In summary, it seems like the properties we&apos;d want in the interaction side are the following:</p><ol><li>View state and its transitions should be bundled by default with a component to make it easier to reuse the component. The view state and its transitions can be separated from its default component if necessary to make variations, such as dropdowns with or without search.</li><li>Domain-model state should be accessible to any component but scoped if necessary to avoid re-rendering.</li><li>State managers are finite state machines. It should be easy to group states and transitions for reuse and composability. It should be just as easy to break a grouped state into its component states and transitions to refactor.</li><li>State management should include pre-built states and transitions to handle network fetches, waits, and failures.</li><li>User actions and server pushes should be traceable from the event emittance, through the application state changes, all the way to the re-rendering of the components.</li></ol><p>Taken together, I wonder if an in-memory database with some type of restricted update semantics would be what I&apos;m looking for.</p><p>Do these problems and properties resonate with you? Are these systemic problems that you see in front-end libraries and architectures? If you have thoughts or insights, <a href="https://twitter.com/iamwil">let me know</a>.</p><p>Thanks to <a href="https://twitter.com/sridatta"><em>Sri Thatipamala</em></a> for reading drafts.</p><hr><!--kg-card-begin: html--><div style="margin-top: 1em">
    <a name="ref1">[1]</a> <a href="https://blog.logrocket.com/history-of-frontend-frameworks/">A history of front-end frameworks</a>
    <a href="#bref1">&#x21A9;</a>
</div>
<div style="margin-top: 1em">
	<a name="ref2">[2]</a> This is generally true, though you can have local state within a component that won&apos;t be seen by parents unless it&apos;s in reference to a context, and won&apos;t be seen by children unless it&apos;s passed down.
    <a href="#bref2">&#x21A9;</a>
</div>
<div style="margin-top: 1em">
	<a name="ref3">[3]</a> FRP is one of those vaguely precise terms that many people have taken their own interpretations of it and muddied the waters. Conal Elliot, the originator has tried to redefine his intention with a new term, Denotational Programming.
    <a href="#bref3">&#x21A9;</a>
</div>
<div style="margin-top: 1em">
	<a name="ref4">[4]</a> Linear types are often used interchangably with Uniquness types. They are <a href="https://en.wikipedia.org/wiki/Uniqueness_type#Relationship_to_linear_typing">different in a subtle way</a>, but doesn&apos;t seem to matter in this context. 
<blockquote>In linear logic, variables of a non-linear type can be coerced to a linear type (dereliction). Harrington phrases it well: in linear logic, &quot;linear&quot; means &quot;will not be duplicated&quot; whereas in uniqueness typing, &quot;unique&quot; means &quot;has not been duplicated&quot;.
<br>
- from <a href="https://web.archive.org/web/20201109030040/http://lambda-the-ultimate.org/node/2708">Uniqueness Typing Simplified</a></blockquote>
    <a href="#bref4">&#x21A9;</a>
</div>
<!--kg-card-end: html--><p>Photo by <a href="https://www.pexels.com/photo/avocado-fruits-cut-in-half-on-brown-surface-5966631/">Vanessa Loring</a></p>]]></content:encoded></item><item><title><![CDATA[Destroyed at the Boundaries]]></title><description><![CDATA[Shared mutable state invites complexity into our programs. Programming languages help with this complexity inside a program, but not across network boundaries between programs.]]></description><link>https://interjectedfuture.com/destroyed-at-the-boundary/</link><guid isPermaLink="false">626e31acc728c800010ec51a</guid><category><![CDATA[web]]></category><category><![CDATA[distributed systems]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Tue, 05 Apr 2022 13:51:45 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2022/04/pexels-sirli-jung-9684571-1.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://interjectedfuture.com/content/images/2022/04/pexels-sirli-jung-9684571-1.jpg" alt="Destroyed at the Boundaries"><p>Today&#x2019;s web developers are distributed systems engineers, they just don&#x2019;t know it yet. And unfortunately for them, most languages and tools fail programmers exactly when they need them most: in a world full of network boundaries.</p>
<p>Programming languages help with complexity stemming from shared mutable state inside a program, but not across network boundaries between programs. This is especially true of web apps, commonly with two network boundaries. <sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup> We could initially ignore it, circa 2007 by pushing shared state down into the database and treating browser pages as a pure function of state. But today with single-page apps keeping state, serverless cloud functions, and microservices backends, this is no longer the case.</p>
<p>I think much of the complexity can be tamped down if we had practical abstractions across these network boundaries. What should they look like? Or are network failures an <a href="https://ably.com/blog/8-fallacies-of-distributed-computing">inescapable complexity</a> that we can&apos;t abstract over?</p>
<p>If you have any insight into this, <a href="mailto://iamwil@gmail.com">email me</a> or tweet <a href="https://twitter.com/iamwil">@iamwil</a>.</p>
<h2 id="the-network-boundary">The network boundary</h2>
<p>Let&apos;s pick on pure functional programming for a second. It deals with shared mutable state by banishing mutable state within a program altogether and pushing it to the edge of the program. Then I/O monads and algebraic effects populate the edges to keep out the mutable world.</p>
<p>However, pure functional programming has nothing to say about composing programs across this network boundary. <sup class="footnote-ref"><a href="#fn2" id="fnref2">[2]</a></sup> And it&apos;s not just pure functional programming. Most languages and paradigms address complexity stemming from shared mutable state within a program by constraining the context in which state is used.</p>
<p>For example, <a href="https://doc.rust-lang.org/book/ch16-03-shared-state.html">Rust&apos;s smart pointers and ownership rules</a> make it explicit who can update a variable at any one time. <a href="https://clojure.org/reference/vars">Clojure&apos;s atom/ref/var/agent reference types</a> achieve a similar end along different dimensions. The <a href="https://en.wikipedia.org/wiki/Hexagonal_architecture_(software)">Hexagonal Architecture</a> preaches a pure functional core with an imperative shell. <sup class="footnote-ref"><a href="#fn3" id="fnref3">[3]</a></sup> None of these address the composition across the network boundary.</p>
<p>While Java (along with its JVM ilk, Scala and Clojure) are often used to build distributed systems and web apps, these languages and their ecosystem of libraries also make it apparent when you&apos;re reaching across the network to another program. As for Javascript, the most popular language for web apps, it has little to no support across the network boundary either, needless to say. <sup class="footnote-ref"><a href="#fn4" id="fnref4">[4]</a></sup></p>
<p>The only languages that directly address this are ones that were designed to build distributed systems from the get-go. <a href="https://learnyousomeerlang.com/">Erlang</a>/<a href="https://elixir-lang.org/">Elixir</a> leverages the actor model and message passing over the network. <a href="https://www.ponylang.io/discover/#what-makes-pony-different">Pony</a> uses a type system to check that code is dead-lock and data-race free. <a href="https://www.unison-lang.org/learn/the-big-idea/">Unison</a> sends functions across the wire instead of data, because Unison code is immutable and content-addressable. However, these languages are currently not particularly popular for building web applications. <sup class="footnote-ref"><a href="#fn5" id="fnref5">[5]</a></sup></p>
<p>What is it about web applications that makes them susceptible to this network boundary problem? Simply put, web application workloads are dominated by I/O, disk, and network, rather than the CPU. For any particular CRUD app, most of the work is mapping data from structure to structure, and waiting on the network. We used to be able to ignore the problem of shared mutable states across network boundaries by making prudent architectural design decisions.</p>
<p>Around 2007, web developers could ignore the problem of shared mutable state in the system with server-side rendered web architecture. On the back-end, you could scale by making the application servers as stateless as possible, and pushing that state down into the database. As a result, we delegated all the hard parts of concurrent programming to the database, which they already specialized in. On the front-end, the view for an endpoint was a pure function of the state of the database. There were no inconsistencies in the page because like <a href="https://oandre.gal/concepts/immediate-mode-vs-retained-mode/">immediate-mode</a> rendering, we throw away the entire page and request a new one when the user navigates away from the page.</p>
<p>But today, web developers can no longer ignore the shared mutable state across network boundaries. On the back-end, we have microservices architecture that complicates it into a distributed system. On the front-end, we&apos;ve started constructing thick clients on the browser that manage state of their own, under the banner of more responsive/low latency apps. No longer a pure function of state, front-ends are rife with inconsistent views and introduce complexities like <a href="https://www.gatsbyjs.com/docs/conceptual/react-hydration/">hydration</a>. <sup class="footnote-ref"><a href="#fn6" id="fnref6">[6]</a></sup></p>
<p>The composable properties offered by powerful abstractions in modern languages, such as ownership, immutability, monads, algebraic effects, linear types, etc. help manage the complexity of shared mutable state within a program. And yet, this composability all get destroyed at the boundaries of the program. We still haven&apos;t figured this out, even though our software systems are full of network boundaries.</p>
<h2 id="the-two-boundaries">The Two Boundaries</h2>
<p>Most web apps have two major boundaries: the browser/server boundary and the server/database boundary. Let&apos;s go over the current design problems with these two boundaries.</p>
<h3 id="the-browserserver-boundary">The Browser/Server Boundary</h3>
<p>On this network boundary, the latency is higher and less reliable. The question is: how to send code and state over the network to deliver a fast-loading application and keep a consistent view with new user input and events, even in a loss of connectivity?</p>
<p>For client-side rendered apps (single page apps), most of the current activity in the space is centered around <a href="https://openbase.com/categories/js/best-react-state-management-libraries">state management in the browser</a>. However, these libraries are largely unopinionated about how you fetch and sync the server state with the client state. <sup class="footnote-ref"><a href="#fn7" id="fnref7">[7]</a></sup> Some sort of async effect handler is provided, and it&apos;s up to you to manually fetch data from a server endpoint and the method to update the application state from the data received from the server. <sup class="footnote-ref"><a href="#fn8" id="fnref8">[8]</a></sup></p>
<p>There are a few libraries that specifically focus on this problem, such as <a href="https://react-query.tanstack.com/overview">React Query</a> and <a href="https://hasura.io/learn/graphql/intro-graphql/graphql-subscriptions/">Hasura Subscriptions</a>. Both provide functions to tie the fetching of data, displaying of data, and mutation of data together, so that there&apos;s deep integration between all three stages to keep them in sync. However, the API is more like the <a href="https://www.youtube.com/watch?v=ZQ5_u8Lgvyk&amp;t=232s">engine type of reuse</a>, where it&apos;s the library that&apos;s in control, and you&apos;re just filling in the blanks. This can be problematic if you ever need to do something the engine doesn&apos;t provide. They don&apos;t provide compositional properties over the network boundary.</p>
<table>
<thead>
<tr>
<th><img src="https://interjectedfuture.com/content/images/2022/03/Screen-Shot-2022-03-24-at-11.42.02-PM.png" alt="Destroyed at the Boundaries" loading="lazy"></th>
</tr>
</thead>
<tbody>
<tr>
<td><center>From Casey Muratori&apos;s <a href="https://www.youtube.com/watch?v=ZQ5_u8Lgvyk&amp;t=232s">Designing and Evaluating Reusable Components</a></center></td>
</tr>
</tbody>
</table>
<p>On the other side of this same boundary, server-side rendering advocates have long considered client-side rendered apps as complex and slow-to-first-render. Instead of sending data from the server to the client for it to render, the server renders a partial view of the data and sends it over to the client. The client wouldn&apos;t do a full page reload, but take the partial view and insert it into the current page. <sup class="footnote-ref"><a href="#fn9" id="fnref9">[9]</a></sup> The initial load is quick, because it&apos;s a thinner client. And the user still experiences an app that doesn&apos;t refresh on every action, but now every action is subject to a round-trip latency.</p>
<h3 id="the-serverdatabase-boundary">The Server/Database Boundary</h3>
<p>No matter how well you constrain mutable state in your programming language in a web app, you&apos;ll need to deal with the database: it&apos;s the ultimate global shared mutable state.</p>
<p>On this network boundary, the data is often &quot;over there&quot; with the database and the business logic is on the application server &quot;over here&quot;. The question is: do you bring the code to the data or the data to the code? Current designs often choose the latter, which means how the data is structured, represented, and accessed will affect systems on both sides of the boundary. This wouldn&apos;t be an issue if both sides agree, but oftentimes, they don&apos;t. How do we resolve this tension?</p>
<p>Traditionally, databases weren&apos;t designed to run arbitrary business logic&#x2013;we don&apos;t move the application code to the data. Instead, we move a subset of the data to the application server, by sending the database a query, effectively short declarative programs to get a subset of the entire data set.</p>
<p>Databases are also typically designed with in-place updates of data, which Rich Hickey calls <a href="https://youtu.be/-6BsiVyC1kM?t=245">Place-orientated programming (PLOP)</a>. <sup class="footnote-ref"><a href="#fn10" id="fnref10">[10]</a></sup> PLOP means that you&apos;re if you&apos;re reading data, some other participant might have updated it since you last read it. Therefore, reading values for consistent views and incrementing counts require coordination between participants, which complicates the client.</p>
<p>As a consequence of these design choices, we&apos;re wary of making too many small requests due to the overhead of making round trips. It&apos;s up to the client to now batch up queries, which also complicates the client. For example, when implementing a GraphQL server, a front-end request can traverse any number of tables and microservices. It&apos;s up to the server to batch up those requests to keep the overhead for a request low.</p>
<p>Because we typically bring the data to the application code, the way it&apos;s represented and structured couples software systems on each side of the boundary. Databases often choose a set and relational representation of data, because that aligns with their goals of data storage and access. On the other hand, application servers tend to process data as objects and dictionaries, as mainstream languages tend to be structured around object-orientated programming. This tension results in friction and complication at this network boundary called the <a href="https://en.wikipedia.org/wiki/Object%E2%80%93relational_impedance_mismatch">object-relational impedance mismatch</a>. <sup class="footnote-ref"><a href="#fn11" id="fnref11">[11]</a></sup></p>
<p>A short summary of the mismatch is summarized in the table below.</p>
<table>
<thead>
<tr>
<th><img src="https://interjectedfuture.com/content/images/2022/03/Screen-Shot-2022-03-26-at-6.27.09-AM.png" alt="Destroyed at the Boundaries" loading="lazy"></th>
</tr>
</thead>
<tbody>
<tr>
<td><center>Lifted from <a href="https://www.youtube.com/watch?v=dTg6Vaypo1c">The Impedance Mismatch is Our Fault</a> by Stuart Halloway</center></td>
</tr>
</tbody>
</table>
<p>We&apos;ve tried to solve this with object-relational mapping (ORM) libraries, but often to <a href="https://www.martinfowler.com/bliki/OrmHate.html">no avail</a>. <sup class="footnote-ref"><a href="#fn12" id="fnref12">[12]</a></sup></p>
<p>Does it need to be that way? What if we moved the code closer to the data instead? It&apos;s typically big data systems that move the code to data, but there do seem to be some applications that make the choice of putting business logic in the database as stored procedures. However, this architectural choice currently also has friction as it often doesn&apos;t work with other developer tools and workflow, like version control. In addition, scaling out via replicas introduces consistency issues.</p>
<p>Does the data need to be updated in-place? What if data is immutable and append-only? Then, reads can be done without coordination with other participants. Favoring immutable data has seen success in a variety of software, such as React, immediate-mode rendering, Clojure, Kafka, and others. It stands to reason, that immutable data would find a chance for success here too.</p>
<p>At this point, I&apos;m not advocating for any particular solution. In the design space of cutting a boundary between systems, we seem to be making the decision manually for each application. The place where we make the cut has significant design implications for the rest of the system, and perhaps we can rethink where that boundary is cut, and whether we need to do it manually at all.</p>
<h2 id="composition-across-the-network-boundary">Composition across the network boundary</h2>
<p>How do we compose over the network boundary? Are the <a href="https://ably.com/blog/8-fallacies-of-distributed-computing">eight fallacies of distributed computing</a> impossible to abstract over because of its <a href="https://blog.acolyer.org/2015/03/20/out-of-the-tar-pit/">inherent complexity</a>? Or is it akin to memory management, where we insisted on doing it by hand until the abstraction became so good that we couldn&apos;t justify it any longer for large swaths of programming domains?</p>
<p>Like all beginner optimists, I suspect we should be able to. TCP/IP abstracted the network, so data looks like it came in order. HTTP request/response abstracted over TCP/IP, so it looks like servers are engaged in a conversation. When it comes to distributed computing, however, we&apos;ve taken abstractions that work on a single-threaded computational model and tried to shoehorn them into an environment whose reality invalidates fundamental assumptions of the model, such as calling a function would execute it once and only once. The harder we&apos;ve worked to maintain it, the more complex our systems have become. <sup class="footnote-ref"><a href="#fn13" id="fnref13">[13]</a></sup></p>
<p>What would an abstraction across the network boundary look like? A promising place to start would be to assume an environment that can fail as a network does. It seems to me it should have a couple of properties.</p>
<ol>
<li>The fundamental assumptions of the abstraction we use shouldn&apos;t be invalidated by the <a href="https://ably.com/blog/8-fallacies-of-distributed-computing">eight fallacies</a>, such as functions called means only-once execution.</li>
<li>The abstraction should allow the programmer to focus on two tasks: initiating the next computation and getting the data ready for the next computation to consume.</li>
<li>Whatever abstraction works over the network environment should also work in a single-threaded, single-computer environment without needless over-complication.</li>
<li>Failures need to be taken into account as part of the abstraction. There should be a way for network failures to manifest themselves in the abstraction, so users can handle different cases of failure at a higher level.</li>
</ol>
<h2 id="discussion">Discussion</h2>
<p>Web developers are distributed systems engineers in diguise. The myriad of libraries at both the browser/server boundary and the server/database boundary are all searching for the right place to make a boundary in the design space. None of them feel quite right, resulting in <a href="https://tonsky.me/blog/the-web-after-tomorrow/">both a frustrating user and developer experience</a>.</p>
<p>Is there a unifying pattern or abstraction we can take from distributed systems to simplify the complexity that accrues and accumulates at these boundaries for web applications? What are people trying today to ease the complexities at these boundaries? Do you have insights into what&apos;s already been tried, and what might be pitfalls in such an abstraction?</p>
<p>Thanks to <em><a href="https://twitter.com/sridatta">Sri Thatipamala</a></em>, <em>Eric Dementhon</em>, and *<a href="https://twitter.com/dtran320">David Tran</a> for reading drafts.</p>
<p>Photo by <a href="https://www.pexels.com/photo/stainless-steel-faucet-on-white-ceramic-sink-9684571/">Sirli Jung</a> from Pexels</p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p>I think this problem also exists for programs traditionally bound by CPU, such as games and compilers, as even those programs make network calls nowadays. However, I&apos;ll mostly speak to web applications as that&apos;s what I know well and am concerned with here. <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn2" class="footnote-item"><p>I think Paul Chisano put it best about Haskell, a pure functional programming language: &quot;<a href="http://pchiusano.github.io/2017-01-20/why-not-haskell.html">If Haskell is so great, why hasn&apos;t it taken over the world? The reason I&#x2019;ll give is that Haskell&#x2019;s otherwise excellent composability is destroyed at I/O boundaries, just like every other language.</a>&quot; <a href="#fnref2" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn3" class="footnote-item"><p>Which if you squint, the <a href="https://deque.blog/2017/07/06/hexagonal-architecture-a-less-declarative-free-monad/">Hexagonal Architecture is precise what pure functional languages enforce</a>: a pure core with free monads around the boundaries. <a href="#fnref3" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn4" class="footnote-item"><p>As for <a href="https://insights.stackoverflow.com/survey/2021#technology-most-popular-technologies">other popular languages to</a> <a href="https://becominghuman.ai/top-10-best-web-application-development-languages-8204aad91bc4">build</a> <a href="https://medium.com/javarevisited/top-5-programming-languages-for-web-development-in-2021-f6fd4f564eb6">web</a> <a href="https://fortyseven47.com/blog/lang-web-app-development/">apps</a> in, Python, PHP, Ruby, C#, and Go. They all have some type of support for concurrent programming, but haven&apos;t focused on distributed computing. <a href="#fnref4" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn5" class="footnote-item"><p>Elixir has been getting some minor traction as a language for web dev, mostly due to the <a href="https://www.phoenixframework.org/">existence of a web framework</a>, rather than its inherent support for distributed computing. I surmise this is because the distributed problems didn&apos;t rear their heads until we started adding state to frontends and scaled beyond what a single database could handle. <a href="#fnref5" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn6" class="footnote-item"><p>Ever notice how on some web apps, the notification count is still wrong after you&apos;ve read your messages on different devices? Consistency problems ensue when front-end manages its own state, and it isn&apos;t just a pure function of state. <a href="#fnref6" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn7" class="footnote-item"><p>There are only a few that actively try to help with client state in relation to the server, such as <a href="https://recoiljs.org/docs/recoil-sync/introduction">Recoil Sync</a> and <a href="https://jotai.org/docs/basics/async">Jotai Async</a>. However, they don&apos;t try to build abstractions over a network data fetch. <a href="#fnref7" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn8" class="footnote-item"><p>There is currently nascent work on <a href="https://reactjs.org/docs/concurrent-mode-suspense.html">React Suspense</a>, but it&apos;s not meant to be a full solution, but rather a way for data fetching libraries to deeply integrate with React to design and orchestrate the different loading states. <a href="#fnref8" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn9" class="footnote-item"><p><a href="https://dockyard.com/blog/2018/12/12/phoenix-liveview-interactive-real-time-apps-no-need-to-write-javascript">Phoenix Liveview</a>, <a href="https://hotwired.dev/">Hotwire</a>, <a href="https://nextjs.org/docs/advanced-features/react-18/server-components">React Server Components</a>, <a href="https://markojs.com/docs/marko-vs-react/">Marko</a>, <a href="https://github.com/BuilderIO/qwik">Qwik</a>, and <a href="https://docs.astro.build/en/core-concepts/component-hydration/">Astro</a> are all different takes on the server-side rendering for rich interactive apps. <a href="#fnref9" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn10" class="footnote-item"><p>As Rich Hickey has argued in his <a href="https://www.youtube.com/watch?v=V6DKjEbdYos&amp;t=314s">numerable</a> <a href="https://www.youtube.com/watch?v=-6BsiVyC1kM">talks</a>. <a href="#fnref10" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn11" class="footnote-item"><p>Or as it&apos;s called <a href="http://blogs.tedneward.com/post/the-vietnam-of-computer-science/">Vietnam of Computer Science</a>. However, I heard in <a href="https://open.spotify.com/episode/25yhLUzpNcPSfIy1pMsyHU?si=afd1818d17d442f7">a recent podcast that DHH would beg to differ</a>, as he believes the ORM did us a solid. <a href="#fnref11" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn12" class="footnote-item"><p>This stopping-the-world stems from a single-threaded model of computing. It&apos;s a vestige from the early days of computing where we had no choice but to update everything in-place since memory was expensive and scarce. Those days are long gone, but we still think in terms of updating data in-place. <a href="#fnref12" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn13" class="footnote-item"><p>But it&apos;s not like the people that came before us weren&apos;t smart enough. In fact, the people that came up with the <a href="https://ably.com/blog/8-fallacies-of-distributed-computing">Eight Fallacies of Distributed Computing all came from people that worked at Sun</a>. Java is widely used in distributed systems. It just doesn&apos;t have as much influence on web applications. <a href="#fnref13" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Year in Review]]></title><description><![CDATA[A review of the wanderings in 2021, and a focus for 2022.]]></description><link>https://interjectedfuture.com/2021-year-in-review/</link><guid isPermaLink="false">626e31acc728c800010ec510</guid><category><![CDATA[retro review]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Mon, 24 Jan 2022 20:00:00 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2022/04/pexels-max-vakhtbovych-6077368.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://interjectedfuture.com/content/images/2022/04/pexels-max-vakhtbovych-6077368.jpg" alt="Year in Review"><p>The New Year is as good of a time to reboot something as any. To encourage more writing, I&apos;ll change this around. I&apos;ll talk more about the incomplete thoughts I have and what I&apos;m up to. Those are easier to write.</p><p>Seeing that it&apos;s the New Year, I was reflecting on the things I worked on this year.</p><p><strong>Things I built this past year:</strong></p><p>At the beginning of 2021, I didn&apos;t know what I was going to do next. I spent this past year trying different things. Looking back, I seemed to have tried a bunch of things, more than I&apos;d remembered.</p><h5 id="payment-workflow-builder">Payment workflow builder</h5><p>The first few weeks of 2021 was spent building a payment workflow builder. At <a href="https://pulley.com/">Pulley</a>, implementing employee option purchasing with ACH payments without holding the funds was a pain. Inspired by page builders, I tried a payment flow builder to direct the flow of money. While it did seem promising, when asking devs on reddit, most of them seem to have relatively simple flows for money. I never launched it. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://interjectedfuture.com/content/images/2022/01/Screen-Shot-2021-02-14-at-11.46.11-AM.png" class="kg-image" alt="Year in Review" loading="lazy" width="480" height="270"><figcaption>You&apos;d build a flow on the left and see it on the right.</figcaption></figure><h5 id="automation-cookbook">Automation Cookbook</h5><p>A longstanding problem I want solved are chainable background jobs&#x2013;really an ETL pipeline for web apps, rather than for data science. I seem to keep running into this problem at most every job over the years. Workflow automation tools have an issue where people don&apos;t exactly know what to do with it. Hence, I wanted to collect user stories and use cases, so I started a newsletter called <a href="https://automationcookbook.io/">automationcookbook.io</a>. I even interviewed someone on their process, but I ended boring myself to tears doing this. None of the use cases seemed particularly compelling. It felt like swimming in the long tail.</p><figure class="kg-card kg-image-card"><img src="https://interjectedfuture.com/content/images/2022/01/Screen-Shot-2022-01-08-at-8.42.42-PM.png" class="kg-image" alt="Year in Review" loading="lazy" width="2000" height="1054" srcset="https://interjectedfuture.com/content/images/size/w600/2022/01/Screen-Shot-2022-01-08-at-8.42.42-PM.png 600w, https://interjectedfuture.com/content/images/size/w1000/2022/01/Screen-Shot-2022-01-08-at-8.42.42-PM.png 1000w, https://interjectedfuture.com/content/images/size/w1600/2022/01/Screen-Shot-2022-01-08-at-8.42.42-PM.png 1600w, https://interjectedfuture.com/content/images/size/w2400/2022/01/Screen-Shot-2022-01-08-at-8.42.42-PM.png 2400w" sizes="(min-width: 720px) 720px"></figure><h5 id="gather-town-clone">Gather Town Clone</h5><p>I didn&apos;t really have a name for this project, but it&apos;s like <a href="https://www.gather.town/">Gather Town</a>&#x2013;like a Zoom meets Pokemon. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://interjectedfuture.com/content/images/2022/01/Screen-Shot-2021-06-15-at-7.07.57-AM.png" class="kg-image" alt="Year in Review" loading="lazy" width="800" height="428" srcset="https://interjectedfuture.com/content/images/size/w600/2022/01/Screen-Shot-2021-06-15-at-7.07.57-AM.png 600w, https://interjectedfuture.com/content/images/2022/01/Screen-Shot-2021-06-15-at-7.07.57-AM.png 800w" sizes="(min-width: 720px) 720px"><figcaption>I was envisioning it&apos;d be kinda like Zapier, where you can fetch alerts from twitter and post them into a chat room on Slack.</figcaption></figure><p>I originally came at this from a remote work meets visual programming perspective. You and your coworkers would walk around an environment (doesn&apos;t have to be an office) to talk to each other over video chat when close in proximity. But also you&apos;d collaboratively build business workflow processes (that can actually get executed) together by connecting buildings (that fetch or transform data) with packages (containing data) that travel to and from the buildings.</p><p>I deployed this at a domain I already had handy: <a href="https://tetracasa.com/#/">tetracasa.com</a>. You can walk around, and if someone else is also there, you can video chat with them. However, there&apos;s no visual programming implemented, because I ran into conceptual issues. I recognized I had no strong opinion on who the user was, what they&apos;d use it for, and the issue at hand. I just had an intution that there was something useful and intereseting here, but I couldn&apos;t articulate it.</p><p>The more I looked into other attempts at visual programming, the more I wasn&apos;t convinced connecting buildings together with traveling data packages would be manageable. In the game <a href="https://www.factorio.com/">Factorio</a>, it also has a similar model&#x2013;factories connected by conveyor belts. The near un-manageability in Factorio is half the fun, but it seems like a weakness in a collaborative programming environment.</p><p>The other problem I wrestled with was the issue of <a href="https://www.inkandswitch.com/end-user-programming/#embodiment">embodiment</a>. In a game-like 2D environment for programming, how do you represent abstract data structures and functions? I had little mainstream prior art I could really build upon, so it felt more like a research problem. At the time, I wanted to avoid doing research if I didn&apos;t have to.</p><p>I got lost in the weeds. and didn&apos;t have an angle of attack for some of these problems, so I decided to take a break and swing back to this later. But I took a long detour.</p><h5 id="react-hooks-for-data-pipelines">React Hooks for Data Pipelines</h5><p>I fell into a rabbit hole learning about React Hooks and was surprised to learn it was inspired by <a href="https://overreacted.io/algebraic-effects-for-the-rest-of-us/">Algebraic Effects</a>. This post on <a href="https://acko.net/blog/live-headless-react/">Headless React</a> showed it was possible to use the React Hooks API for 3D scene rendering. </p><p>Going back to the chainable background jobs, I never thought the API for most workflow automation/ETL software like <a href="https://airflow.apache.org/">Airflow</a> of connecting nodes with edges was that robust. But a React Hooks offered a way to declaratively state a pipeline.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://interjectedfuture.com/content/images/2022/01/Screen-Shot-2022-01-08-at-10.00.47-PM.png" class="kg-image" alt="Year in Review" loading="lazy" width="1432" height="864" srcset="https://interjectedfuture.com/content/images/size/w600/2022/01/Screen-Shot-2022-01-08-at-10.00.47-PM.png 600w, https://interjectedfuture.com/content/images/size/w1000/2022/01/Screen-Shot-2022-01-08-at-10.00.47-PM.png 1000w, https://interjectedfuture.com/content/images/2022/01/Screen-Shot-2022-01-08-at-10.00.47-PM.png 1432w" sizes="(min-width: 720px) 720px"><figcaption>Just a taste of it. The <a href="https://gitlab.com/iamwil/refract/-/blob/master/src/ingestor.ts">full file is here</a>.</figcaption></figure><p>As a sample workflow, I used it to pull down the interest rates of all markets on <a href="https://compound.finance/">Compound</a>, and recording it over time. It is for the most part pretty easy to use, and I think promising. However, it&apos;s easy to let an uninitialized state variable slip by and having that cause an error. </p><p>The full project is called <a href="https://gitlab.com/iamwil/refract">Refract</a>, for the lack of a better name. Of everything I worked on, this was the most satisfying because it came the closest to scratching my own long-standing itch. But more importantly, I started to see React in a different way&#x2013;it&apos;s dragging mainstream developers halfway towards accepting effect managers as a norm. I&apos;m starting to be able to articulate an opinion.</p><p>I have other features I&apos;d want to add to this, one of which was data provenance. You should be able to rewind and look at any piece of data that you processed. I was stuck thinking about it and stepped away for a bit, and got sucked into the next three things.</p><h5 id="nft-with-crafting-mechanics">NFT with crafting mechanics</h5><p>Inspired by <a href="lootproject.com/">Loot Project</a>, This was a side project meant to be 2 weeks, but it took the better part of a quarter. I had meant to jump back into crypto somehow, so this seemed like a good small project. Turns out it was lots of deadends and redesign to fit around the limits of Solidity and the Ethereum VM. </p><p>It&apos;s basically Cards Against Humanities mechanics, of combining blank cards with answer cards, but with shit that cryptonerds say. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://interjectedfuture.com/content/images/2022/01/minting_blank4-1.gif" class="kg-image" alt="Year in Review" loading="lazy" width="1501" height="753"><figcaption>Here&apos;s what it looks like to mint a pack of cards</figcaption></figure><p>And then crafting the black and white cards together will get you a rainbow card with the blank filled in. People can heart it with eth.</p><p>Was it worth it? I&apos;d almost say no. The interesting parts of what might make this a compelling project to buyers had been stripped away to keep scope. Text manipulation is Solidity is a pain, and the limitations of a Solidity contract is annoying to work around. Given the amount of time this took, I should have ejected quickly like I did with the Payment Flow and the Automation Cookbook. I think I just wanted to finish something.</p><p>If I were to retro-justify it with anything, is that I learned how painful it was (again) to write smart contracts. Maybe this is something I should pay attention to, and do something about.</p><p>I haven&apos;t deployed this yet, but will in the coming weeks. It&apos;s hard to justify, because it&apos;ll cost me about $2500 to deploy this thing onto mainnet.</p><h5 id="the-technium-podcast">The Technium Podcast</h5><p>I started a podcast with Sri in the last couple of months. As a result of some of the things I&apos;ve talked about in the newsletter and our pandemic conversations, he suggested we just record our conversations. </p><p>It&apos;s a podcast about the (usually software) technologies Sri and I see on the horizon. It usually follows the arc of: </p><ul><li>What is it?</li><li>What are people using it for today?</li><li>What would the world look like if it was pervasive?</li></ul><p>We <a href="https://www.youtube.com/channel/UCl_rEKDGBw4myn0uOnPxYsg/videos">covered topics</a> from <a href="https://www.youtube.com/watch?v=OWY2epLu-eA">The Metaverse</a> and <a href="https://www.youtube.com/watch?v=OirCPLOCm9s&amp;t=1s">Datalog</a> to <a href="https://www.youtube.com/watch?v=wKDpO_KePO8">CLIP/DALL-E</a> and <a href="https://www.youtube.com/watch?v=0omUH_VqbJ8">Zero Knowledge Proofs</a>. We&apos;ve already finished <a href="https://www.youtube.com/watch?v=OirCPLOCm9s&amp;list=PLZ8NrLhP6GzsOaZaVBugC86_vBchIeCxx">the first season</a>, and we&apos;re starting up our second season with better production values&#x2013;new mics!</p><figure class="kg-card kg-embed-card kg-card-hascaption"><iframe width="200" height="113" src="https://www.youtube.com/embed/hp9TB65yT7c?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe><figcaption>Season 2 Eps 1 on End-User Programming</figcaption></figure><p>Overall, this has been good. First, it&apos;s been reflective. We don&apos;t often see ourselves on screen or hear our selves talk. But being forced to do it while editing in post, it&apos;s like looking in the mirror. I noticed my verbal ticks are putting &quot;right?&quot; at the end of sentences, with the occasional &quot;like&quot; thrown in. And sometimes, I start a sentence in the middle, only to abort to restart somewhere else. I can see it&apos;s because I had never had to articulate that particular thought before, and I&apos;m putting together the sentence as I&apos;m going.</p><p>Second, it&apos;s been an outlet. I&apos;ve been excited about this kind of stuff for a long time and didn&apos;t really find someone to nerd out with out it until now. Even in Silicon Valley, people find it easy to be cynics or skeptics, so it&apos;s hard to see a potential future. </p><p>We&apos;ll continue to keep recording episodes as long as we amuse each other. Not sure where this will lead, but so far so good.</p><h5 id="impromptu-spices">Impromptu Spices</h5><p>Over the summer, I started <a href="https://impromptu.shop/">selling a Beef Noodle Soup spice blend</a>. I thought it was a dumb idea. But early ambition and ideas are fragile, and I&apos;ve talked myself out of too many things. This time I figured I&apos;d just do it and see what happens. A couple of you bought a bottle when it was announced on this newsletter (thanks!), but I needed validation from strangers if y&apos;all were just being nice. Jess suggested I tell a <a href="https://www.facebook.com/groups/taiwanesehomecooking/posts/1000005010850773/">Taiwanese Cooking facebook group</a> about it.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://interjectedfuture.com/content/images/2022/01/Screen-Shot-2022-01-12-at-1.41.44-PM.png" class="kg-image" alt="Year in Review" loading="lazy" width="1092" height="820" srcset="https://interjectedfuture.com/content/images/size/w600/2022/01/Screen-Shot-2022-01-12-at-1.41.44-PM.png 600w, https://interjectedfuture.com/content/images/size/w1000/2022/01/Screen-Shot-2022-01-12-at-1.41.44-PM.png 1000w, https://interjectedfuture.com/content/images/2022/01/Screen-Shot-2022-01-12-at-1.41.44-PM.png 1092w" sizes="(min-width: 720px) 720px"><figcaption>&quot;Professional&quot; iPhone photo shoot.</figcaption></figure><p>The demand surprised me, and I ended up selling out twice, and scrambled to make more. I was <a href="https://www.taiwaneseamerican.org/2021/12/beef-noodle-soup-seasoning-wil-chung/">featured on TaiwaneseAmerican.org</a> in an interview about it in December. Last quarter, I sold over 300 bottles as well as a couple refill bags of the stuff. </p><p>This is the first time that I made something people want, however small. It definitely felt different, like people were pulling it out of me. I had just tripped over a latent demand. That said, I&apos;m waiting to see how many return customers there are to see if this really is a thing.</p><p>I think there&apos;s a couple reasons for the latent demand. One, the product is pretty tasty. So that&apos;s a base line. But two, it has a compelling narrative for a specific group of people. I think Taiwanese Americans are so hungry for representative products that they&apos;re willing to pay shipping for it. </p><p>To me, this is new, since I&apos;ve always got a sense for what can be possible, but I&apos;ve always found it hard to craft a compelling narrative for others before I&apos;m able to show them. It&apos;ll be something that I&apos;ll focus on more for other things I work on going forward.</p><p><strong>In Review</strong></p><p>At the beginning of 2021, I didn&apos;t know what I should be doing, but I did let go of the idea that I <em>should</em> be doing this or that. This past year was just exploring and trying things without talking myself out of it. That&apos;s lead to, of all things, a spice blend.</p><p>I had gotten better at ejecting out of unpromising projects or dead-ends, but it&apos;s still a skill that can be honed better. That has to be balanced with just doing things to see what happens, because there are always surprises when you do something. </p><p>While this year&apos;s exploration was helpful in getting unstuck, I feel like I haven&apos;t really accrued deep expertise in any one particular vertical. For 2022, I&apos;m going to be more focused on something. Looking back, for the past couple of years, I find myself reading and learning about programming languages, <a href="http://worrydream.com/MediaForThinkingTheUnthinkable/">Tools for Thinking</a> previously hard-to-think-about thoughts, and human-computer augmentation with any chance I get&#x2013;regardless of what I&apos;m actually working on. And in my work, I&apos;ve always thought a lot about the affordance of my code for other developers, to the point of being derided as too clean.</p><p>And yet, I&apos;ve been denying myself from going deep into this topic because it didn&apos;t fit the narrative I wanted for myself. It wasn&apos;t something my peers seemed to value. I&apos;ve killed projects working on these things because, I didn&apos;t think it&apos;d have a market. I&apos;ve not joined things because I was so focused on something else. I really just didn&apos;t know how to make this something sustainable. So this year, I&apos;m also letting that go, but this means it points me in a direction that I don&apos;t really have a name for. </p><p>Will I be any good at this? Do I actually have an original line of thought, an angle of attack, to contribute to the state of programming and thinking tools? I can see that behind the Gather Town clone and Refract is a viewpoint on programming, but it was vague and not well articulated. In retrospect, that has to come from writing to clarify my thoughts to more clearly communicate the currently scattered and faceted intuition I have on the topic.</p><p>So the near-term plan is to do two things. One, write to articulate the problems in the space. Two, learn how to write interpreters. </p><p>Hope you have a plan for your 2022, and that you have lots to look forward to! </p><hr><!--kg-card-begin: html--><span>Photo by <a href="https://www.pexels.com/@max-artbovich?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Max Vakhtbovych</a> from Pexels</span><!--kg-card-end: html-->]]></content:encoded></item><item><title><![CDATA[Easy Backfilling with Hooks and Components]]></title><description><![CDATA[I implemented a backfill mechanism with the basic hooks, so when the pipeline is brought back up after some downtime, it'll not only do its job of ingesting new blocks for interest rates, but it will also see how many of the last 128 blocks it missed and ingest those. ]]></description><link>https://interjectedfuture.com/easy-backfilling-with-hooks-and-components/</link><guid isPermaLink="false">626e31acc728c800010ec50d</guid><category><![CDATA[pipeline]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Sat, 21 Aug 2021 07:40:23 GMT</pubDate><media:content url="https://images.unsplash.com/photo-1617806118233-18e1de247200?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=MnwxMTc3M3wwfDF8c2VhcmNofDQ3fHxpbnRlcmlvciUyMGRlc2lnbnxlbnwwfHx8fDE2Mjk1MzE1NjA&amp;ixlib=rb-1.2.1&amp;q=80&amp;w=2000" medium="image"/><content:encoded><![CDATA[<img src="https://images.unsplash.com/photo-1617806118233-18e1de247200?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=MnwxMTc3M3wwfDF8c2VhcmNofDQ3fHxpbnRlcmlvciUyMGRlc2lnbnxlbnwwfHx8fDE2Mjk1MzE1NjA&amp;ixlib=rb-1.2.1&amp;q=80&amp;w=2000" alt="Easy Backfilling with Hooks and Components"><p>Ever since my realization that <a href="https://acko.net/blog/live-headless-react/">React&apos;s hook API was useful in other contexts</a>, I&apos;ve been excited about its application to a workflow software like Apache Airflow. Riding on that momentum, I built <a href="https://interjectedfuture.com/hooks-for-pipelines/">a bare-bones React for data pipelines</a>, which for now, I&apos;ll call by its codename Refract.</p><p>So beyond just getting it to work, I wanted to address the backfilling problem I had building ETL pipeline in the past. When fetching incoming data and running it through a transformation, something inevitably can go wrong, resulting in downtime.</p><p>Hence, when bringing the service back up, you might have missed event data and will need to backfill it. This is often extra work that I may not have had time for to get it up and running. But when something breaks, I spend more time manually backfilling data.</p><p>Using Refract, I built an example app that fetches interest rates from the <a href="https://compound.finance/markets">Compound smart contract for all their markets</a> directly from an Ethereum node.</p><p>I implemented a backfill mechanism with the basic hooks, so when the pipeline is brought back up after some downtime, it&apos;ll not only do its job of ingesting new blocks for interest rates, but it will also see how many of the last 128 blocks it missed and ingest those. [1] </p><p>Most visual programming where nodes are connected via wires makes reuse hard to do because they don&apos;t distinguish between a component and its instance. But because Refract uses components and hooks much like React does, the main pipeline as well as the missing block hook are both reusable. </p><pre><code class="language-typescript">export const ApyIngestor = props =&gt; {
  let { providerUri, tokenListPath, mainnetPath, mainnetAbiPath } = props;

  // ...various hooks...

  return [
    elem(MainLine, {
      key: &quot;0&quot;,
      knex,
      tokens: filteredTokens,
      contractGenerator,
      blockHeader
    }),

    elem(Backfill, {
      key: &quot;1&quot;,
      web3,
      knex,
      tokens: filteredTokens,
      contractGenerator,
      latestBlockHeader: blockHeader
    })
  ];
};


export const Backfill = props =&gt; {
  let { web3, knex, tokens, contractGenerator, latestBlockHeader, children } =
    props;

  const { missingKeys, numRemainingKeys, setNumRemainingKeys } = useMissingKeys(
    knex,
    &quot;annual_percentage_yields&quot;,
    &quot;blocknumber&quot;,
    latestBlockHeader?.number
  );

  let [pastBlockHeader, setPastBlockHeader] = Refract.useState(
    &quot;past blockheader&quot;,
    null
  );

  Refract.useEffect(
    &quot;get past block&quot;,
    () =&gt; {
      if (!web3 || !missingKeys || !numRemainingKeys) return;

      (async () =&gt; {
        missingKeys.forEach(async blocknumber =&gt; {
          const bh = await web3.eth.getBlock(blocknumber);
          setPastBlockHeader(bh);
          setNumRemainingKeys(numRemainingKeys - 1);
        });
      })();
    },
    [missingKeys, numRemainingKeys]
  );

  return [
    elem(MainLine, {
      key: &quot;0&quot;,
      knex,
      tokens,
      contractGenerator,
      blockHeader: pastBlockHeader
    })
  ];
};
</code></pre><p>Backfill is something specific to the domain, so I decided to leave it as something the programmer constructs, but with reusable hooks to help make it easier.</p><p>For example, we backfill in ascending order, processing the oldest first. Otherwise, the Ethereum node might flush the old block before we get a chance to get around to it. Also, we try to generate a new missing keys list on every new block, but only if we haven&apos;t finished processing the last generated backfill list yet. That&apos;s what the <code>numRemainingKeys</code> count is for. We return it and its complement <code>setNumRemainingKeys</code> from the hook, so the client can update it.</p><p>So far, I&apos;ve liked the experience of writing Backfill with this API. When I refactor and move either components or hooks around, I don&apos;t really have to change much; I can just cut and paste. And I can reuse components because each place I use them is a different instanciation. </p><p>Another advantage is that I don&apos;t really have to worry about cache invalidation any longer. In the initial non-Refract implementation, I did notice I reloaded the ABI files from disk on every new block. It&apos;s wasteful, but I didn&apos;t change it because it didn&apos;t seem to affect run time. But in Refract, it just worked. The caveat is that I have to get the dependency array correct in useEffect hooks, which sometimes I can miss. Also, forgetting that some variables might be undefined in the beginning, so I have to use option chaining. </p><p>So far so good. </p><p>[1] Unless you&apos;re talking to a full archive node, Infura only keeps the last 128 blocks.</p>]]></content:encoded></item><item><title><![CDATA[What are DeFi's Prospects?]]></title><description><![CDATA[In the last couple of years, people have been building a global, open alternative to every financial products you use today, accessible to anyone in the world with a smartphone and internet connection. ]]></description><link>https://interjectedfuture.com/what-are-defis-prospects/</link><guid isPermaLink="false">626e31acc728c800010ec4fa</guid><category><![CDATA[cryptocurrency]]></category><category><![CDATA[defi]]></category><dc:creator><![CDATA[Wil]]></dc:creator><pubDate>Fri, 13 Aug 2021 17:42:26 GMT</pubDate><media:content url="https://interjectedfuture.com/content/images/2021/08/pexels-jason-boyd-3209045-1.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://interjectedfuture.com/content/images/2021/08/pexels-jason-boyd-3209045-1.jpg" alt="What are DeFi&apos;s Prospects?"><p>By now, most of you have heard of cryptocurrencies, such as Bitcoin or Ethereum. Even in Silicon Valley, it&apos;s been a polarizing technology. Some think of it as seismic as the internet, whereas others think of it as a giant scam. As <a href="https://www.youtube.com/watch?v=g6iDZspbRMg">John Oliver puts it</a>, &quot;Cryptocurrencies: It&apos;s all the things you don&apos;t understand about computers mashed with all the things you don&apos;t understand about money.&quot;</p><p>The most prominent aspect of cryptocurrencies is the price and volatility, but that&apos;s hardly the interesting bit. In the last couple of years, people have been building a global, open alternative to every financial products you use today, accessible to anyone in the world with a smartphone and internet connection. </p><p>These financial services are grouped under the umbrella term, <em>Decentralized Finance</em> (DeFi). It&apos;s one of the fastest growing sectors in cryptocurrencies, with <a href="https://defipulse.com/">$26 billion of cryptocurrencies deposited</a>, and growing exponentially as of this writing.</p><p>DeFi products are different from their traditional counterparts, in <a href="https://www.forbes.com/sites/lukefitzpatrick/2020/05/20/defi-is-reinventing-global-finance-faster-than-the-fed-can-print-money/">a number of different ways</a>, but to me, the most interesting aspect is that these financial services are interoperable and composable without coordination or permission.</p><p><a href="https://compound.finance/">Compound</a> is an automated money market&#x2013;a lending and borrowing platform for cryptocurrencies. As a borrower, you can borrow crypto with an over-collateralized loan, usually with other crypto as collateral. As a lender, you can lend your crypto to earn continuously compounding interest. When you lend on Compound, you receive cTokens representing your contribution in the lending pool. These cTokens can be bought, traded, and sold on exchanges because cTokens themselves are also cryptocurrencies.</p><p><a href="https://pooltogether.com/">PoolTogether</a> is no-loss prize game. Multiple users trade their cryptocurrency for tickets to pool their crypto together. PoolTogether then uses the pooled crypto to lend in Compound. The interest earned is then used as prize money that gets distributed randomly to ticket holders like a lottery. These tickets can be bought, traded, and sold on exchanges because the tickets themselves are also cryptocurrencies.</p><p>PoolTogether didn&apos;t have to coordinate with Compound to implement no-loss prize games. And if a new DeFi product wanted to use cTokens and tickets, they can do so without coordinating with either party. This permission-less innovation mirrors the growth of web integrations to provide value to users that each individual service couldn&apos;t do alone.</p><p>There are <a href="https://defipulse.com/defi-list">many other DeFi products</a> out there, and each successful DeFi product is yet another building block for future DeFi products in a positive feedback loop. By the looks of it, the pace of innovation in this space will only accelerate. [1]</p><p>That said, I think it will still be a long while before the DeFi ecosystem will substantially affect the day to day real world workings of the economy. For me, the tipping point will be when businesses can seriously consider using crypto to help integrate financial services into their products for a seamless customer experience. Before, I was more bullish, but now I have reservations.</p><p>For one, the regulatory environment is uncertain in this space, as the SEC hasn&apos;t made definitive rulings on certain aspects of crypto. If there&apos;s draconian regulation, it could easily cool business adoption in this space. However, this won&apos;t kill crypto, because the decentralization ethos of crypto has made it more resistant to state-level actors.</p><p>Second, cryptocurrencies have some serious user experience problems. There&apos;s a definite learning curve, and unless you have sufficiently motivated users, most people aren&apos;t going to want to learn all new ways of transacting online. While there is some work on a great user experience, most of the attention and work in the crypto space is about infrastructure, inventing new financial services, and memes.</p><p>Third, fintech companies are the default go-to for tech companies that want to deeply integrate a financial service into their product for a more seamless customer experience. They&apos;ve coined this movement as <em>Embedded Finance</em>. Examples are <a href="https://www.uber.com/us/en/drive/insurance/">Uber Insurance</a> for drivers or <a href="https://www.shopify.com/capital">Shopify Capital</a> to give loans to small businesses. The fintech companies powering financial services aren&apos;t your mother&apos;s Bank of America, but rather, companies like <a href="https://stripe.com/">Stripe</a> and <a href="https://mercury.com/">Mercury</a>. They are fierce competitors and are entirely focused on customer experience.</p><p>Cryptocurrencies will likely remain as a secondary alternative financial system for speculative assets if it can&apos;t find an in-road to being the go-to default for businesses that wants to integrate financial services into their product for a better customer experience. There&apos;s only two other paths I can see at the moment. They could provide financial services for companies in spaces that are turning from illegal to legal, like weed delivery companies, since traditional banks won&apos;t service them. Or they&apos;d have to provide services that can&apos;t exist in traditional finance, such as <a href="https://decrypt.co/resources/non-fungible-tokens-nfts-explained-guide-learn-blockchain">non-fungible tokens</a> (NFTs). </p><p>The other valuable contribution these DeFi products create is the governance model with their community of users. Some DeFi products issue governance tokens as a part of its ecosystem or as natural by-product of using it. These governance tokens can then be used to vote on provisions that dictate the policy the DeFi product should adopt. </p><p>For example, <a href="https://makerdao.com/">MakerDAO</a> is a DeFi product that locks up Ethereum as collateral, and issues a loan for less than the USD value of the locked Ethereum. [2] This loan has a variable interest rate, which is controlled and voted on by holders of the Maker governance token. It&apos;s deliberately designed with a system of incentives to give users of MakerDAO a voice to dictate policy, even if they don&apos;t work at the company.</p><p>While social media companies can sometimes have an internal committee that decides policy, it has no actual system of government to allow users to voice their desires from outside the company. That result in the quagmire we see today with the debates over the censoring power of Big Tech. I think DeFi products have communities that are much better suited to address this class of problems.</p><p>It&apos;s hard to see the value in cryptocurrencies. There&apos;s definitely a lot of hype, meme-ing, and religious fervor that get in the way. In addition, many of the problems crypto projects solve are only applicable if you&apos;re already using cryptocurrencies. The early internet was similar. Google, <a href="https://en.wikipedia.org/wiki/Delicious_(website)">Del.icio.us</a>, Flickr, Yahoo Mail, and Paypal were only useful if you were already using the internet. [3] It wasn&apos;t until later we had web services that affected the real world and other industries like Uber, Netflix, and Airbnb. Will crypto be able to jump the gap? For now, I&apos;m cautiously optimistic about DeFi&apos;s prospects.</p><p>For those of you that are curious:</p><ul><li><a href="https://nakamoto.com/beginners-guide-to-defi/">A beginner&apos;s guide to DeFi</a> by Linda Xie</li><li>Coinbase&apos;s <a href="https://blog.coinbase.com/a-beginners-guide-to-decentralized-finance-defi-574c68ff43c4">beginner&apos;s guide to DeFi</a></li><li>Nakamoto&apos;s <a href="https://nakamoto.com/beginners-guide-to-defi/">beginner&apos;s guide to DeFi</a></li><li>DeFi indexes: <a href="https://defipulse.com/">DefiPulse</a> and <a href="https://dappradar.com/">DappRadar</a></li></ul><p>I haven&apos;t worked in crypto in the last year and a half, but I wanted to catch up with developments since. For those of you working in crypto, what should I look at besides <a href="https://medium.com/aave/sneak-peek-at-flash-loans-f2b28a394d62">flash loans</a>, <a href="https://linda.mirror.xyz/df649d61efb92c910464a4e74ae213c4cab150b9cbcc4b7fb6090fc77881a95d">social tokens</a>, and <a href="https://decrypt.co/resources/what-is-yield-farming-beginners-guide">yield farming</a>?</p><p><em>Photo by <a href="https://www.pexels.com/@jason-boyd-1388339?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Jason Boyd</a> from <a href="https://www.pexels.com/photo/photo-of-living-room-3209045/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Pexels</a></em></p>]]></content:encoded></item></channel></rss>