<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
<meta charset="utf-8" />
<meta name="author" content="Lijero" />
<!-- <meta name="description" content="" /> -->
<meta name="keywords" content="Lijero,code,operating system,distributed,network,resource graph,abstraction" />
<meta name="robots" content="index,follow" />
<title>Beyond Operating Systems - Lijero</title>
<meta name="viewport" content="width=device-width, initial-scale=1" />
<link href="/favicon.png" rel="icon" type="image/png" />
<link href="/res/common.css" rel="stylesheet" type="text/css" />
<script type="text/javascript">
var _paq = _paq || [];
/* tracker methods like "setCustomDimension" should be called before "trackPageView" */
_paq.push(["setDocumentTitle", document.domain + "/" + document.title]);
_paq.push(["setCookieDomain", "*.lijero.co"]);
_paq.push(["setDomains", ["*.lijero.co"]]);
_paq.push(['trackPageView']);
_paq.push(['enableLinkTracking']);
(function() {
var u="//p.lijero.co/";
_paq.push(['setTrackerUrl', u+'foo']);
_paq.push(['setSiteId', '1']);
var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
g.type='text/javascript'; g.async=true; g.defer=true; g.src=u+'foo.js'; s.parentNode.insertBefore(g,s);
})();
</script>
</head>
<body>
<nav>
<div id="navbrand">Lijero</div>
<ul>
<li><a href="/">Home</a></li>
</ul>
</nav>
<article>
<header>
<h1>Beyond Operating Systems - Abstracting Over Networks</h1>
<div class="metadata">
<span><img src="/res/icon/person.svg" alt="Author" /> <span class="author"><a href="https://lijero.co/about/lijero">Lijero</a></span></span>
<span><img src="/res/icon/clock.svg" alt="Date" /> Posted 2017-10-06</span>
</div>
</header>
<h2>Introduction</h2>
<p>
I've come to get my crazy idea smashed apart. Or to open an innovative discussion! Whatever it is, I don't wanna waste your time, so I'll be brief. Let me know if anything needs clarification, since I'm shrinking a couple of notebooks into one article.
</p>
<h3>Programs running on the internet itself</h3>
<p>
I want to write software for the whole internet! That can mean a lot of things, though:
</p>
<ul>
<li>Websites without a host</li>
<li>Multiplayer games and apps without even <em>thinking</em> about servers</li>
<li>Easy distributed stuff like cryptocurrencies</li>
</ul>
<p>
Operating systems mean we can write software for <em>any</em> computer, but I want to write software for <em>every</em> computer at once! A <strong>globally distributed operating system</strong>.
</p>
<h3>Building up to a global OS</h3>
<p>
It's gonna take a lot of layers of abstraction. Bear with me, I'll get there eventually!
</p>
<h2>Step 1: The Resource Graph</h2>
<p>
If you're going to write a distributed operating system, it better <strong>not matter where data comes from</strong>. So <strong>unify every resource</strong>, from memory to files to HTTP calls, into a <em>resource graph</em>. That said, you can't make assumptions about your surroundings either: <strong>implicit dependencies have got to go</strong>.
</p>
<h3>Unifying resources</h3>
<p>
We still need to know what data is, which can be pretty tough when you don't know where it's coming from. The solution is an <strong>extremely strong type system</strong>. The type system tells software what data actually is. That said, different resources have different properties, so this has to be expressed in the type system as well.
</p>
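<p>
To make that a bit more concrete, here's a rough sketch in Python (standing in for the hypothetical strongly-typed language; every name here is invented for illustration) of what it could look like for a resource's type to say what the data is and how its source behaves, while the consumer never cares where the bytes come from:
</p>
<pre><code># A sketch of resources whose type says what the data is and how its source
# behaves, so consumers never need to know where the bytes actually live.
from dataclasses import dataclass
from typing import Callable, FrozenSet

@dataclass(frozen=True)
class ResourceType:
    content: str                # what the data is, e.g. "text/plain"
    properties: FrozenSet[str]  # source properties, e.g. {"immutable", "remote"}

@dataclass(frozen=True)
class Resource:
    rtype: ResourceType
    fetch: Callable[[], bytes]  # the only way to get the bytes out

def memory_resource(data, content):
    # Backed by local memory: cheap, immutable, always available.
    rtype = ResourceType(content, frozenset({"immutable", "local"}))
    return Resource(rtype, lambda: data)

def http_resource(get, url, content):
    # Backed by a network call: same interface, different properties.
    rtype = ResourceType(content, frozenset({"remote", "high-latency"}))
    return Resource(rtype, lambda: get(url))

# A consumer only checks the type, never the origin.
def word_count(res):
    assert res.rtype.content == "text/plain"
    return len(res.fetch().split())

print(word_count(memory_resource(b"hello distributed world", "text/plain")))</code></pre>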
<h3>Killing implicit dependencies</h3>
<p>
This is where the resource graph comes in. A graph is just a general way to structure data; it's basically a generalized filesystem. The important difference is that <strong>there is no root</strong>. We pass data around by building, rebuilding, and handing off resource graphs, rather than relying on ACLs over one global filesystem or whatever. We pass around references.
</p>
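<p>
Here's a tiny sketch of what "no root" means in practice, again in Python with made-up names: a program can only reach the nodes it has been handed references to, and anything else simply doesn't exist from its point of view.
</p>
<pre><code># A sketch of a rootless resource graph: there is no global "/" to walk.
# A program can only reach nodes it has been handed references to.
class Node:
    def __init__(self, value=None, **edges):
        self.value = value
        self.edges = dict(edges)   # named references to other nodes

    def child(self, name):
        return self.edges[name]

# Build a graph and hand a program only the piece it needs.
logs    = Node(value=b"...log data...")
secrets = Node(value=b"...keys...")
appdata = Node(logs=logs)          # appdata can reach logs, but not secrets

def log_size(graph):
    # This program was given `appdata`; `secrets` is unreachable,
    # because no path of references leads there.
    return len(graph.child("logs").value)

print(log_size(appdata))</code></pre>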
<h3>Programming with a resource graph</h3>
<p>
Today's languages are not well-suited to this model of data. Neither are computers: you can't have a pointer to an HTTP resource. Also, it needs to be enforced, so RIP native binaries. Not that you could have them anyway, since the internet runs on a lot of architectures. Instead, you'd need a bytecode interpreter that can actually handle it all natively.
</p>
<p>
Also, I swear I am not a Lisp shill and I had no intention of it, but Lisp happens to be really good at dealing with linked data structures. It'd probably be really convenient for dealing with this sort of thing. Though this system is strongly typed, unlike Lisp.
</p>
<h3>There's more to resources than CRUD</h3>
<p>
There needs to be an actual way to implement all these fancy types of nodes. And how can we not care that the source of something is HTTP if we need to make a query? The solution is simple: <strong>make every node a computation</strong>. And by computation I mean closure.
</p>
<p>
I mean, how did you <em>think</em> the nodes were implemented? Most nodes are actually a procedure to fetch the requested content. But you can also make nodes with inputs and outputs, such as an HTTP query. Of course, we need to know where we're getting the data from, so that must have been put in earlier. The resources with that information have been enclosed in the node.
</p>
<p>
Nodes are functions. They're partially applied with a portion of the graph they require to make closures. In fact, <strong>functions are nodes too</strong>. So are all processes. You pass in a parameter graph and get back data.
</p>
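<p>
A rough sketch of that idea, with all the names (and the fake network primitive) invented for illustration: a node is a function partially applied with the part of the graph it encloses, and calling it means passing in a parameter graph and getting data back.
</p>
<pre><code># A sketch of "every node is a computation": a node is a function partially
# applied with the part of the graph it needs (its enclosed resources), and
# later called with a parameter graph to produce data.
from functools import partial

def http_query_node(enclosed, params):
    # `enclosed` carries the resources baked in at creation time
    # (which endpoint to talk to, and the primitive that performs the request).
    # `params` is the graph passed in at call time.
    send = enclosed["network_send"]        # raw capability enclosed earlier
    url = enclosed["endpoint"]
    return send(url, params["query"])

# Creating the node = enclosing part of the graph.
fake_send = lambda url, q: ("response from %s for %r" % (url, q)).encode()
weather_node = partial(http_query_node,
                       {"network_send": fake_send,
                        "endpoint": "weather.example"})

# Using the node = passing in a parameter graph, getting data back.
print(weather_node({"query": "tomorrow"}))</code></pre>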
<p>
Now I hope it makes sense how we can have all the features of a kernel without syscalls. Your malloc or whatever is enclosed with the relevant information, including any synchronization references.
</p>
<h2>Step 2: Computational Economy</h2>
<p>
So what if we <strong>assign an expense to a function</strong>? We can count instructions, give them different expenses, and sum them up to get a final price. We can calculate the range of potential prices by calculating branches as lumps and adding them each time a branch is taken. From here, we can extend it to the cost of data (e.g. memory) by calculating <strong>quantity of data over computational time</strong>. Bandwidth costs are obvious too. This is how you can implement schedulers, when you expose the interpreter as a resource.
</p>
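<p>
As a toy sketch of instruction-level expense accounting (the instruction set and all the prices are made up), running a program just sums the per-instruction costs into a final price:
</p>
<pre><code># A toy metered interpreter: each instruction kind has a price, and running a
# program sums those prices into a final expense. All costs are made up.
COST = {"add": 1, "mul": 3, "load": 2, "store": 2}

def run(program, env):
    expense = 0
    for op, args in program:
        expense += COST[op]
        if op == "load":
            env["acc"] = env[args]
        elif op == "store":
            env[args] = env["acc"]
        elif op == "add":
            env["acc"] += args
        elif op == "mul":
            env["acc"] *= args
    return env, expense

prog = [("load", "x"), ("mul", 10), ("add", 5), ("store", "y")]
env, price = run(prog, {"x": 4})
print(env["y"], "cost:", price)   # 45 cost: 8</code></pre>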
<h4>What about the halting problem?</h4>
<p>
Totality proofs (manual or automatic), and relating expense to the input. Where it cannot be proven, the maximum expense can be considered infinite. In many of those cases, it would make sense to put an <strong>upper bound on the amount you're willing to pay to run a function</strong>.
</p>
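<p>
Here's a minimal sketch of that upper bound (the step function and costs are invented): the metered runner simply refuses to keep going once the running expense passes what you're willing to pay, whether or not the computation would ever terminate.
</p>
<pre><code># A sketch of bounding a possibly-non-total computation by the amount you are
# willing to pay: execution stops once the running expense exceeds the budget.
class BudgetExceeded(Exception):
    pass

def run_with_budget(step, state, cost_per_step, budget):
    # `step` advances the computation and returns (new_state, done).
    expense = 0
    done = False
    while not done:
        expense += cost_per_step
        if expense &gt; budget:
            raise BudgetExceeded("refusing to spend more than %d" % budget)
        state, done = step(state)
    return state, expense

# A loop whose termination we have not proven: bound it anyway.
def collatz_step(n):
    n = n // 2 if n % 2 == 0 else 3 * n + 1
    return n, n == 1

print(run_with_budget(collatz_step, 27, 1, 200))   # reaches 1 within the budget</code></pre>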
<h3>Paying for our remote access</h3>
<p>
Servers can also assign an expense to queries if you modify e.g. HTTP to carry it. <strong>HTTP is just a kind of RPC anyway</strong>, and it now behaves exactly like one in our resource graph. Now you can pay them for your access to their resource graph!
</p>
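<p>
As a trivial sketch of what a priced query could look like (the handler, the toy pricing rule, and the "x-expense-charged" field are all invented for illustration, not part of real HTTP):
</p>
<pre><code># A sketch of an RPC-flavoured HTTP exchange where the server attaches an
# expense to answering the query. The field names and pricing are invented.
def handle_request(database, query):
    # The server prices the work it did and bills the caller for it.
    rows = [row for row in database if query in row]
    expense = 1 + len(database)            # toy pricing: one unit per row scanned
    return {"body": rows, "x-expense-charged": expense}

db = ["alpha", "beta", "alphabet"]
reply = handle_request(db, "alpha")
print(reply["body"], "charged:", reply["x-expense-charged"])</code></pre>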
<p><strong>We've now turned computational expense into a currency!</strong></p>
<p>
How cryptocurrencies are implemented is out of the scope of this article, but we can now turn this into one, and trade our currency for computation.
</p>
<p>
But to continue: if accessing someone else's resources has a cost, we should probably be <em>paying</em> for it, right? Paying for goods and services. Seems familiar. Maybe our "money" really is just that: a currency. Are y'all familiar with cryptocurrencies? Because we're well on our way to making one. Explaining how blockchains like Bitcoin work is way out of the scope of this post (or perhaps not, but at any rate, it's too long to explain in this one). To SUPER summarize it, we can use cryptographic magic to sell portions of transactions and such with cryptographic proof; tl;dr, a distributed currency. I'll have to explain how it works, how you can make contracts to prevent scams, and how to deal with redundant computation in a different post, because you have no idea just how much further I can push this (i.e. I don't have time for that sort of digression).
</p>
<p>
The end result is that a pair of computers can trade resources and do all sorts of crazy stuff right under the nose of the program itself. This can be generalized into a more sensible system by decoupling the local per-instruction expense from the currency to allow competition between suppliers, but that's pretty complicated too, and I want to stick to the realm of abstractions rather than defining the economic system. Anyway, from there you can just extend this abstraction over an entire network of computers, and buy and sell resources as needed.
</p>
<p>
You're probably seeing a lot of issues at this point (what if the other computers lie, what if they drop my data, what about latency, what about network partitions, what if they steal my data, how do I deal with all of these transactions in a performant manner), but don't worry, I'll get to them later. Actually, there are going to be a <em>lot</em> of places coming up where I'll have to get to it later, so I'm going to stop giving this notice, but I will cover them. Also, if you're thinking "this keeps sounding less and less like osdev", you're right. I'm no longer strictly discussing operating systems: I've generalized the concept of an operating system, which abstracts over hardware and other such things, to an entire network.
</p>
<p>
Now that's a pretty big claim, but none of what I've said so far is all that impressive. Most of us have enough power on our computers not to need to buy more, and just doing these tasks the old-fashioned way is a heck of a lot easier for the benefits so far (though even the ideas still within the realm of typical operating systems are more justification than a lot of hobby OSes have). The thing is, I'm claiming to abstract over the internet, but really all I've talked about is loaning resources and paying for data. Applications haven't changed that much, only their execution platform and a couple of conveniences. For example, your HTTP server still holds all your data; you're just getting paid for holding it (though you can pay other people to act like a CDN). The issue is that while the computation is distributed and space can be loaned, data is still fundamentally centralized. Sure, it's spread over a bunch of computers, but there's no concept of shared data. Your applications can run all <em>over</em> the internet, but not <em>on</em> it.
</p>
<h2>Running <em>ON</em> the Internet -- Fully distributed applications</h2>
<p>
Now we're finally at my point. What I want is to be able to write fully distributed multi-user applications in a similar way to how we write standard applications now. There shouldn't be a server or host or central authority; you shouldn't have to deal with figuring out where data is stored, or with the fine details of protocols, and such things. Publishing pages or files Web 1.0-style shouldn't require any hosting, and <em>all</em> data should receive the benefits of torrenting automatically. Interactive content (Web 2.0, IRC, etc.) should behave similarly. Games shouldn't need to worry about hosting (all multiplayer games should be as easy as LAN hosting is today, and large servers should be a hell of a lot cheaper, i.e. free). A distributed data store.
</p>
<h2>Getting back to details</h2>
<p>
If you can't think of a billion potential issues with this, I'm disappointed in you. But don't worry, I've reached peak abstraction, and now I can start working my way back down into details and issues. I'm probably going to go from easier issues to harder issues and low-level issues to high-level issues, just fyi. I know this is written like an FAQ but I keep building on important details here.
</p>
<h4>Aren't interpreters slow? How is it a good idea to run your entire operating system on them?</h4>
<p>
They are typically slow, but as they're the core of the entire everything, a lot of optimization would go into it. It'd probably be rather low-level bytecode, and have an extremely efficient interpreter. Some degree of JITting would probably occur, when there is very high confidence it won't open up a bug. Since resources are managed in a graph and implicit dependencies are impossible, I'd assume people could take advantage of these properties to achieve an extremely high level of parallelism, like they do in functional programming languages.
</p>
<h4>How would it be safe to run unknown code on your computer?</h4>
<p>
It's not, but you're probably doing it right now with your browser. There's the obvious distinction between going to trusted websites and getting pushed data from strangers, though. There's some inevitable risk in any internet connection, and the best you can do is minimize that risk. I would try to keep the inner interpreter extremely simple and small, perhaps even going as far as to formally verify it, to prevent exploits; JITting should be used with extreme caution; I would sandbox processes pushed to me when possible; and I would also try to take the precautions used by OSes such as OpenBSD. Overall though, I'm reasonably confident in it, and the complete lack of implicit dependencies makes it a hell of a lot more secure than running random executables, even sandboxed ones, and brings it closer to e.g. JS. The language wouldn't even deal with numeric pointers (to real memory, not as in the concept), to prevent accidental pointer arithmetic bugs.
</p>
<h4>How do I access raw resources from the interpreted language?</h4>
<p>
The interpreter would pass operations for raw resource access into the boot program, which can then create closures wrapping these resources (e.g. a memory allocator, a partition reader, other device wrappers) and keep building abstractions on these, with only the abstractions being included in the resource graph for normal programs. How these raw resources are provided is an implementation detail. It could potentially involve bootstrap stages and assembly output.
</p>
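<p>
A rough sketch of that boot-time wrapping, with the primitive names and the fake disk all invented: the boot program receives the raw operations, wraps them in narrower closures, and only the wrappers end up in the graph handed to normal programs.
</p>
<pre><code># A sketch of boot-time wrapping: the interpreter hands the boot program raw
# primitives, the boot program wraps them in safer closures, and only the
# wrappers ever appear in the resource graph given to normal programs.
def boot(raw):
    # `raw` is the privileged graph the interpreter passes in at boot.
    read_block = raw["disk_read_block"]

    def read_partition(offset, length):
        # A narrowed abstraction: only blocks inside one partition are reachable.
        base = 2048
        return read_block(base + offset, length)

    # Normal programs get this graph; the raw primitives are not in it.
    return {"partition1_read": read_partition}

# Stand-in primitive for the sketch.
fake_disk = lambda block, length: b"x" * length
user_graph = boot({"disk_read_block": fake_disk})
print(user_graph["partition1_read"](0, 16))</code></pre>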
<h4>Okay, well how do these abstractions work? <strong>What is a graph node anyhow?</strong></h4>
<p>
Each node is essentially a function call, though most of the actual function calls would be optimized away. You pass in a graph with the correct data types, and assign the result to another local graph node. The closures are actually more like partially applied functions in this sense, since the graph has to be explicitly passed in and must contain exactly the required functions and nothing more, but I call them closures because a node "captures" the resource access primitives from its creator, and then gets given actual input later. Resources like an HTTP call are essentially just lazily-evaluated procedures.
</p>
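<p>
A small sketch of the "exactly the required functions and nothing more" rule (all invented names): when a node is called, the passed-in parameter graph is checked against the node's declared requirements, and anything extra is rejected.
</p>
<pre><code># A sketch of calling a node with an explicit parameter graph that must match
# the node's declared requirements exactly: nothing missing, nothing extra.
def make_node(required, body):
    def call(graph):
        if set(graph) != set(required):
            raise TypeError("parameter graph must be exactly %r" % sorted(required))
        return body(graph)
    return call

adder = make_node({"a", "b"}, lambda g: g["a"] + g["b"])
print(adder({"a": 2, "b": 3}))       # 5
# adder({"a": 2, "b": 3, "extra": 9}) would raise: extra capabilities rejected.</code></pre>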
<h4>How is it performant to have a program distributed over multiple computers? <strong>How does the network manage latency and throughput?</strong></h4>
<p>
Usually, it isn't. Loaning resources isn't the point; it's just a necessary abstraction for building the rest of the network, for things like contracts and managing the distributed data store. Usually, the actual computation and its related data will be transferred as a block (lazily, of course) to the actual place of execution, and only independent batch jobs will be transferred. A computer would rarely actually pay for a computation unless necessary (think swap space), will do computations locally when possible, and locally-initiated computations always take priority over sold ones. There are only two purposes other than internal infrastructure and massively distributed computation projects: conserving battery life, and accessing remote resources.
</p>
<p>
Often it doesn't make sense to download large portions of a database when the content you actually want is a lot smaller, so you just pay someone who already has it to do it for you. If this is a common case and that host gets overwhelmed, its price goes up until someone else finds it cheaper to grab the data than to keep paying for queries, and then you have two cheap servers. They don't have to maintain a clone of the database, obviously, but in most cases, for a commonly-accessed resource, it makes sense, because you can resell (seed) the data yourself, or you'll want to retrieve it again and have it there. This is how static (as opposed to stateful, not dynamic) data propagates through the network. You can tell whether something is common by announcing that you have it, and holding onto it.
</p>
<p>
This means that the more popular a resource is, the faster and cheaper it becomes to retrieve, because more people have it. If it becomes less popular, or otherwise overhosted, people will automatically drop it because it's no longer profitable to host. Additionally, the more people host something, the more likely there is a version close to you, which reduces latency. This is especially true of geographically popular data. Beyond that, it is trivial to chunk data, so that end users receive <strong>all data like a bittorrent</strong>. Through these mechanisms, <strong>static data would be much faster than on the normal internet, since all data behaves like a torrent or CDN.</strong> Stateful data still receives many of these benefits, but that'll have to wait for later.
</p>
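<p>
Here's a toy model of that demand-driven replication, with completely made-up numbers: when the per-query price of the existing hosts rises above the cost of just fetching your own copy, another peer replicates the data and the price settles back down.
</p>
<pre><code># A toy model of demand-driven replication: when the price of querying the
# existing hosts rises above the cost of just fetching your own copy, a new
# peer replicates the data, and the per-query price falls again.
BASE_PRICE = 1.0
COPY_COST  = 12.0

def per_query_price(demand, hosts):
    # More demand per host pushes the price up; more hosts pushes it down.
    return BASE_PRICE * demand / hosts

hosts = 1
for demand in (5, 20, 60, 200):
    while per_query_price(demand, hosts) &gt; COPY_COST:
        hosts += 1            # someone finds replicating cheaper than paying
    print("demand", demand, "hosts", hosts,
          "price %.2f" % per_query_price(demand, hosts))</code></pre>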
<p>
I had to take a break from writing this post, so hopefully I don't miss anything. I probably will, having lost my train of thought now.
</p>
<h4>Isn't it expensive to track computations like that, what with CPU caches &amp; branch prediction?</h4>
<p>
You can calculate the expense of each block in advance, and only need to add the sum for a block to the total when a branch is taken. Locally initiated code could under some circumstances be free and not need to be tracked at all.
</p>
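<p>
A minimal sketch of that block-level metering (instruction set, costs, and block names all invented): each basic block's expense is summed ahead of time, so at run time the meter only adds one number per branch taken.
</p>
<pre><code># A sketch of block-level metering: each basic block's expense is summed once,
# ahead of time, and the running total only changes when a branch is taken.
COST = {"add": 1, "mul": 3, "cmp": 1, "jmp": 1}

def block_cost(block):
    return sum(COST[op] for op, _ in block)

blocks = {
    "entry": [("cmp", None), ("jmp", None)],
    "then":  [("mul", 2), ("add", 1)],
    "else":  [("add", 7)],
}
precomputed = {name: block_cost(b) for name, b in blocks.items()}

# At run time, following the trace of taken branches is all the meter needs.
trace = ["entry", "then"]
print("expense:", sum(precomputed[b] for b in trace))   # 2 + 4 = 6</code></pre>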
<h4>Can this have any <em>benefit</em> to security, as far as local programs go?</h4>
<p>
Isolation of programs &amp; capability-based security: no implicit dependencies or global resource graph to exploit. You can't traverse e.g. the whole filesystem because the concept of "the whole filesystem" isn't meaningful.
</p>
<h2>The implementation of distributed state</h2>
<p>
Oh lord. Now I have to move onto the implementation of distributed state. I'm not really looking forward to this, since everything I've said so far is just the tip of the iceberg as far as details go. Most of my countless notebook pages are spent trying to work around the insane difficulties of building a general case for distributed applications. Can't I just stick to hype? Shit. Man, and that's not even counting explaining the usual blockchain problems.
</p>
<h3>The guarantees of a distributed system</h3>
<p>
What we're essentially looking at is an extremely complex internet overlay network, implemented on top of the operating system I've described so far, to provide guarantees about data. We have opened the gates of hell. I suppose I should start off by listing the properties that such a network needs to be able to have:
</p>
<ul>
<li> The typical cryptographic guarantees: confidentiality, integrity, availability, and non-repudiation</li>
<li>Agreement upon the order in which things occurred <em>with low latency</em> (FUCK the PACELC theorem). Sometimes a partial ordering is acceptable</li>
<li>Prove that things actually happened to peers who did not observe them (this is incredibly complex if you don't want to use a blockchain)</li>
<li>Separation of identity / pseudonymity / persistent identification: actually, this is by far the easiest thing here</li>
<li>The ability to locate and retrieve public data</li>
<li>Anonymity (note: I'm not willing to even think about anonymity networks. I figure this is separation of identity plus layering connections over I2P or some other existing anonymity network; I2P is just built for this sort of thing)</li>
<li>Privacy of the data</li>
<li>Provable secrets, such as player position in an FPS</li>
<li>Reforming the network in the case of a partition</li>
<li>The ability to update the program while maintaining the same data store</li>
<li>Guaranteeing consistent data across the entire network</li>
<li>Preventing lying about data</li>
<li>Separation of semantic units of data: only perform necessary replication of calculations, and only provide the guarantees necessary for a particular block of data</li>
<li>Resilience against censorship, malicious firewalls, DDoS, and such</li>
<li>Continued functionality in high-latency and/or low-bandwidth environments, and over unreliable connections</li>
<li>Protection against <em>spam</em> and cheaters</li>
<li>Multi-modality for when server architectures just work best for performance</li>
<li>Protection against data loss, especially for the ability to do stuff like long-term remote file hosting</li>
<li>Plausible deniability</li>
</ul>
</article>
<footer>
<div class="license"><a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="/res/cc-by-sa-small.png" /></a> This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.</div>
<div>Open Iconic — <a href="https://www.useiconic.com/open">www.useiconic.com/open</a> / MIT Licensed, attribution optional</div>
<div>This page is friendly to text browsers.</div>
</footer>
</body>
</html>