state entropy musings

bevacqua · bevacqua · commit 3aa131143bd7 · 2017-09-21T07:48:34.000-03:00
diff --git a/chapters/ch04.asciidoc b/chapters/ch04.asciidoc
@@ -501,28 +501,39 @@ In this section, we'll discuss ways of eliminating and containing state, as well
 
 ==== 4.3.1 Current State: It's Complicated
 
+The problem with state is that, as an application grows, its state tree inevitably grows with it, and for this reason large applications are hopelessly complex. We should highlight that this complexity exists in the whole, but not necessarily in individual pieces. This is why breaking an application into ever smaller components might reduce local complexity even when it increases overall complexity. That is to say, breaking a single large function into a dozen small functions might make the overall application more complex, as there would be a dozen pieces where there used to be one, but it also makes each aspect of the previously-large function simpler when we focus on the small function that now covers it. The individual pieces of a large, complicated system are thus easier to maintain, without requiring a complete or even vast understanding of the system as a whole.
 
+At its heart, state is mutable. Even if the variable bindings themselves are immutable, as we'll consider in section 4.3.4, the complete picture is mutable. A function might return a different object every time, and we may even make that object immutable so that the object itself doesn't change either, but anything that consumes the function receives a different object each time. Different objects mean different references, meaning the state as a whole mutates.
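
A minimal sketch of that last point, assuming a hypothetical `getSettings` factory that isn't part of any real API: each returned object is frozen, yet every call hands consumers a new reference.

```javascript
// Hypothetical factory: returns a fresh, frozen object on every call.
function getSettings() {
  return Object.freeze({ theme: 'dark', fontSize: 14 })
}

const first = getSettings()
const second = getSettings()

console.log(Object.isFrozen(first)) // true: the object itself can't change
console.log(first.theme === second.theme) // true: same contents
console.log(first === second) // false: different references
```

Any consumer that compares references, for instance to decide whether to re-render, would treat `first` and `second` as two distinct pieces of state.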
 
+Consider a game of chess, where each of two players starts with 16 pieces, each deterministically assigned a position on a chessboard. The initial state is always the same. As each player inputs their actions, moving and trading pieces, the system state mutates. A few moves into the game, there is a good chance we'll be facing a game state we haven't ever experienced before. Computer program state is a lot like a game of chess, except there's more nuance in the way of user input, and an infinitude of possible board positions and state permutations.
 
+In the world of web development, a human decides to open a new tab in their favorite web browser and then googles for "cat in a pickle gifs". The browser allocates a new process through a system call to the operating system, which physically shifts some bits around on the hardware that lies inside the human's computer. Before the HTTP request hits the network, we need to hit DNS servers, engaging in the elaborate process of resolving `google.com` into an IP address. The browser then checks whether there's a ServiceWorker installed, and assuming there isn't one, the request finally takes the default route of querying Google's servers for "cat in a pickle gifs". Naturally, Google receives this request at one of the front-end edges of its public network, in charge of balancing the load and routing requests to healthy back-end services. The query goes through a variety of analyzers that attempt to break it down to its semantic roots, stripping the query down to its essential keywords in an attempt to better match relevant results. As the search engine figures out the 10 most relevant results for "cat pickle gif" out of billions of pages in its index -- which was of course primed by a different system that's also part of the whole -- Google pulls down a highly targeted piece of relevant advertisement about cat gifs, matching what its sophisticated ad network believes is the demographic the human making the query belongs to. Google also figures out whether the user is authenticated through a session cookie sent in an HTTP header, and the search results page starts being constructed and streamed to the human, who now appears impatient and fidgety. As the first bits of HTML begin streaming down the wire, the search engine produces its results and hands them back to the front-end servers, which include them in the HTML stream that's sent back to the human.
+The web browser has been working hard at this too, parsing the incomplete pieces of HTML that have been streaming down the wire as best it can, even daring to launch other admirable and equally mind-boggling requests for HTTP resources presumed to be JavaScript, CSS, font, and image files as the HTML continues to stream down the wire. As the first few chunks of HTML are converted into a DOM tree, the browser would finally be able to begin rendering bits and pieces of the page on the screen, if it weren't still waiting on those equally mind-boggling CSS and font requests. As the CSS stylesheets and fonts are transmitted, the browser begins modelling the CSSOM and getting a more complete picture of how to turn the HTML and CSS plain text chunks provided by Google servers into a graphical representation that the human finds pleasant. Browser extensions get a chance to meddle with the content, removing the highly targeted piece of relevant advertisement about cat gifs before I even realize Google hoped I wouldn't block ads this time around. A few seconds have passed by since I first decided to search for cats in a pickle. Needless to say, thousands of others brought similarly inane requests to the same systems during this time.
 
-.. section on how state is terrible and the more state there is the less predictable our code becomes.
+Not only does this example demonstrate the marvelous machinery and infrastructure that fuels even our most flippant daily computing experiences, but it also illustrates how abundantly hopeless it is to make sense of a system as a whole, let alone its comprehensive state at any given point in time. After all, where do we draw the boundaries? Within the code we wrote? The code that powers our customers' computers? Their hardware? The code that powers our servers? Their hardware? The internet as a whole? The power grid?
 
+==== 4.3.2 Eliminating Incidental State
 
+We've established that the overall state of a system has little to do with our ability to comprehend parts of that same system. Our focus in reducing entropy must then lie in the individual aspects of the system. It's for this reason that breaking apart large pieces of code is so effective. We're reducing the amount of state local to each given aspect of the system, and that's the kind of state that's worth taking care of, since it's what we can keep in our heads and make sense of.
 
+Whenever there's persistence involved, there's going to be a discrepancy between ephemeral state and realized state. In the case of a web application, we could define ephemeral state as any user input that hasn't resulted in state being persisted yet, such as an unsaved user preference that will be lost unless it's persisted. We can say realized state is the state that has been persisted, and that different programs might have different strategies on how to convert ephemeral state into realized state. A web application might adopt an Offline-First pattern where ephemeral state is automatically synchronized to an IndexedDB database in the browser, and eventually realized by updating the state persisted on a back-end system. When the Offline-First page is reloaded, unrealized state may be pushed to the back-end or discarded.
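
The distinction could be sketched as follows. This is an in-memory illustration only: `createPreferences` and its methods are hypothetical names, and a plain object stands in for whatever persistence layer (IndexedDB, a back-end) a real application would use.

```javascript
// Hypothetical helper: keeps realized (persisted) and ephemeral (unsaved)
// state apart, with plain objects standing in for a real persistence layer.
function createPreferences(persisted) {
  let realized = { ...persisted } // state that has been persisted
  let ephemeral = {} // user input that hasn't been persisted yet

  return {
    set(key, value) {
      ephemeral[key] = value
    },
    get current() {
      return { ...realized, ...ephemeral }
    },
    get dirty() {
      return Object.keys(ephemeral).length > 0
    },
    save() {
      // Ephemeral state becomes realized state.
      realized = { ...realized, ...ephemeral }
      ephemeral = {}
    },
    discard() {
      ephemeral = {}
    }
  }
}

const prefs = createPreferences({ theme: 'light' })
prefs.set('theme', 'dark')
console.log(prefs.current.theme, prefs.dirty) // dark true
prefs.discard()
console.log(prefs.current.theme, prefs.dirty) // light false
```

Whether unsaved input is pushed with `save` or dropped with `discard` on reload is the strategy decision the paragraph above describes.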
+
+Incidental state can occur when we have a piece of data that's used in several parts of an application, and which is derived from other pieces of data. When the original piece of data is updated, it wouldn't be hard to inadvertently leave the derived pieces of data in their current state, making them stale in comparison with the updated original pieces of data. As an example, consider a piece of user input in Markdown and the HTML representation derived from that piece of Markdown. If the piece of Markdown is updated but the previously compiled pieces of HTML are not, then different parts of the system might display different bits of HTML out of what was apparently the same single Markdown source.
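
As a rough illustration of the Markdown example, the sketch below compares a stored `html` property with one that's recomputed on every read. The `compile` function is a hypothetical stand-in that only understands `*emphasis*`, not a real Markdown compiler.

```javascript
// Hypothetical stand-in for a real Markdown compiler; handles *emphasis* only.
function compile(markdown) {
  return `<p>${ markdown.replace(/\*(.+?)\*/g, '<em>$1</em>') }</p>`
}

// Stored derived state: html is computed once and goes stale on update.
const stored = { markdown: 'cats *rule*', html: compile('cats *rule*') }
stored.markdown = 'dogs *drool*'
console.log(stored.html) // <p>cats <em>rule</em></p>

// Recomputed derived state: html is always derived from the current markdown.
const derived = {
  markdown: 'cats *rule*',
  get html() {
    return compile(this.markdown)
  }
}
derived.markdown = 'dogs *drool*'
console.log(derived.html) // <p>dogs <em>drool</em></p>
```

The getter trades a little computation on each read for never having a second copy to keep in sync.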
+
+When we persist derived state, we're putting the original and the derived data at risk of falling out of sync. This isn't the case just when dealing with persistence layers, but can occur in a few other scenarios as well. Caching layers may become stale when the underlying original piece of content is updated but we forget to invalidate the derived content. Database denormalization is another common occurrence of this problem, whereby creating a lot of derived state can result in synchronization problems. An example of this might be forum software where user profiles are denormalized into comments in an effort to save a database roundtrip, but then when users update their profile, their old comments preserve a stale avatar, signature, or display name. To avoid this kind of issue, we should always consider recomputing derived state from its roots. Even though doing so won't always be possible or even practical, encouraging this kind of thinking will, if anything, increase awareness about the subtle intricacies of denormalized state.
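
The forum scenario might look something like the sketch below, where a `Map` stands in for a users table and all names (`users`, `renderDenormalized`, `renderNormalized`) are made up for illustration.

```javascript
// Hypothetical in-memory stand-in for a users table.
const users = new Map([[1, { id: 1, displayName: 'ada' }]])

// Denormalized: the display name is copied into the comment when it's written.
const denormalized = {
  authorId: 1,
  authorName: users.get(1).displayName,
  text: 'first!'
}

// Normalized: the comment only references its author.
const normalized = { authorId: 1, text: 'first!' }

function renderDenormalized(comment) {
  return `${ comment.authorName }: ${ comment.text }`
}

function renderNormalized(comment) {
  // Recompute the derived bit from its root, at the cost of an extra lookup.
  const author = users.get(comment.authorId)
  return `${ author.displayName }: ${ comment.text }`
}

// The user updates their profile after both comments were written.
users.get(1).displayName = 'countess'

console.log(renderDenormalized(denormalized)) // ada: first!
console.log(renderNormalized(normalized)) // countess: first!
```

The extra lookup in `renderNormalized` is the roundtrip that denormalization tries to save, which is why recomputing from the roots won't always be practical.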
+
+==== 4.3.3 Containing State
+
+.. when we must have state, keep it as constrained as possible
 
 
 
 
 
 
 
-==== 4.3.2 Eliminating Incidental State
 
-.. where possible always recompute state from data rather than rely on state
 
-==== 4.3.3 Containing State
 
-.. when we must have state, keep it as constrained as possible
 
 ==== 4.3.4 Leveraging Immutability