Backing up dscs-spec from its repo

2018-03-17 03:02:39 +00:00 · 2018-03-17 03:02:39 +00:00 · 3062b05aa9
parent e2e8d2ed42
commit 3062b05aa9
1 changed files with 163 additions and 0 deletions
--- a/dscs-spec.md
+++ b/dscs-spec.md
@@ -0,0 +1,163 @@
 # Distributed State Coordination Suite
 ## Introduction
 The source for this draft is maintained on GitHub. Suggested changes should be
 submitted as pull requests at https://github.com/lijerom/dscs-specs. The original
 author is not an expert, so if something seems strange, you're probably right.
 The Distributed State Coordination Suite (DSCS) is intended to form a generic
 network for distributed applications that provides the same guarantees as a
 centralized client/server model, with similar or better performance. It is a
 protocol suite for coordinating state, data distribution, and network topology
 in a peer-to-peer network.
 ## Design
 ### The flaws of distributed systems
 There are four incompatible guarantees that a distributed data store can make:
 * Everyone is synchronized (consistency)
 * Everyone can access the data (availability)
 * Minimization of latency (latency)
 * Tolerance to network splits/partitions (partition tolerance)
 #### The choice between consistency and latency
 The PACELC theorem states that even when there is no network partition, one must
 choose between consistency and latency. This is an inherent flaw of distributed
 systems. The trade-off can be made acceptable by immediately performing
 computations on incoming data, and then retroactively integrating late packets.
 This is similar to client-side prediction in modern games. It allows the
 presentation of an acceptably imperfect state to the consumer, or when
 perfection is needed, a head start on the relevant computations.
 This, in general, would be extremely difficult with stateful computations.
 However, functional reactive programming allows deriving a body of state
 from a body of computations, preventing such conflicts by ensuring the
 independence of the vast majority of computations, and easily allowing
 cascading updates when necessary, while preserving still-valid results.
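As a rough illustration of the idea above (not part of DSCS itself), the following Python sketch keeps a timestamp-ordered event log and derives state purely as a fold over it. A late event is inserted where it belongs, the states computed before it are kept, and only the states after it are recomputed. The `Event` fields, the `add`/`mul` operations, and the `PredictiveLog` class are hypothetical names chosen for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(order=True, frozen=True)
class Event:
    timestamp: int  # time assigned by the sender (assumed comparable across peers)
    sender: str
    op: str         # hypothetical operations: "add" or "mul"
    arg: int


def apply(state, event):
    """Pure step function: the next state depends only on its inputs."""
    if event.op == "add":
        return state + event.arg
    if event.op == "mul":
        return state * event.arg
    raise ValueError(f"unknown operation: {event.op}")


class PredictiveLog:
    def __init__(self, initial=0):
        self.events = []          # kept sorted by timestamp
        self.states = [initial]   # states[i] is the state after events[:i]

    def receive(self, event):
        """Integrate an event, even a late one, and return the current state."""
        i = bisect_right(self.events, event)
        self.events.insert(i, event)
        # States up to position i are still valid; recompute only the suffix.
        del self.states[i + 1:]
        for later in self.events[i:]:
            self.states.append(apply(self.states[-1], later))
        return self.states[-1]


log = PredictiveLog()
log.receive(Event(1, "a", "add", 2))    # state is 2
log.receive(Event(3, "b", "mul", 10))   # predicted state is 20
log.receive(Event(2, "c", "add", 3))    # late packet: corrected state is 50
```

The consumer can be shown 20 immediately and 50 once the late packet arrives, without ever mutating state directly.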
 #### Maintaining trust in the face of network partitions
 While the authenticity of a packet to an identity can be verified with digital
 signatures, signatures cannot prove the order of state changes
 (transactions), or that no transactions are being intentionally left out. In
 addition, it's usually impractical to reach the current state by replaying an
 enormous list of transactions, so there must be a way to trust an opaque blob
 of state.
 This can be handled in a variety of ways:
 * Don't have state. Static data can be trusted solely by digital signatures.
 * Don't care about trust. There is no real persistent state, only the here and
 now. An example might be a real-time chat program. Chat logs from one side of
 the network can optionally be relayed when the networks rejoin, though it is
 not strictly trustworthy. Timestamps can prevent replay attacks.
 * Use proof-of-work (i.e. blockchains). The chain with the most work behind it
 was too expensive to spoof, so it must be true. However, proof-of-work
 computations result in high latency, and in the event of a network partition
 any transactions on the smaller chain are nullified, destroying
 availability.[1] This relies on there being a certain level of activity though,
 if my understanding is correct.
 * Use trusted observers. While this sacrifices being perfectly distributed, a
 dynamic web-of-trust can allow the verification of all data, as long as enough
 people are connected. Any actions that occurred during a period when the chain of
 trust was insufficiently configured would result in the nullification of that
 data or a netsplit.
 * Use third-party trusted observers. Rather than a dynamic friend-to-friend
 web-of-trust, what is in essence a server is used, probably as a supplement
 to the dynamic system. This is the least trustworthy system, but it is no
 worse than a client/server model, and sometimes better.[2]
 The DSCS will support all of these options, and they may be chosen as appropriate
 for an application.
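To make the trusted-observer option a little more concrete, here is a minimal Python sketch of accepting an opaque blob of state only when enough trusted observers vouch for its hash. It is an illustration under assumptions, not a DSCS mechanism: the function names, the `quorum` parameter, and the shape of `attestations` are hypothetical, and each attestation is assumed to have already been authenticated with a digital signature elsewhere.

```python
import hashlib


def state_digest(blob: bytes) -> str:
    """Identify an opaque blob of state by its SHA-256 hash."""
    return hashlib.sha256(blob).hexdigest()


def is_trusted(blob, attestations, trusted, quorum):
    """Accept the blob if at least `quorum` trusted observers attest to its digest.

    attestations: dict mapping observer id -> digest that observer vouches for
                  (assumed to be signature-checked before reaching this point)
    trusted:      set of observer ids this peer trusts
    """
    digest = state_digest(blob)
    vouchers = {
        observer
        for observer, attested in attestations.items()
        if observer in trusted and attested == digest
    }
    return len(vouchers) >= quorum
```

With three trusted observers and a quorum of two, a blob is accepted as long as any two of them agree on its hash. Attestations from untrusted identities are ignored.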
 [1] In a lot of situations, this is actually acceptable, since only small,
 local portions of the network are likely to disconnect at any time, which is why
 cryptocurrencies like Bitcoin work. If two networks are inherently going to have
 an unreliable link, one can simply run two independent networks. In the case of
 a cryptocurrency, atomic exchanges between them can be securely done while the
 network is up, allowing currency availability on either side, while still
 allowing trade between them. There was an excellent article on this, but I seem
 to have lost it.
 [2] At the present time I foresee only two instances where this would be
 necessary. One might be if the network is too irregular for a sufficient
 number of your trusted observers to be persistently available, yet nobody can
 trust each other anyway. The other is when a secret needs to be kept to prevent
 cheating in a game (e.g. a player's location, or the location of a secret base),
 in conjunction with a much more complex system to handle this described below.
 #### The choice between consistency and availability
 According to the CAP theorem, in the event of a network partition, either
 consistency or availability must be sacrificed. A sacrifice in availability
 is equivalent to the client/server model, where the data on the other part of
 the network is simply unavailable. Since the entire point of DSCS is distributed
 state, implying massive redundancy, this would often not be an issue. The
 exceptions are soft partitions (described below), blockchains (described
 above), and cases where a complete lock is intentionally forced. Otherwise, the
 networks will simply run independently and later merge, at the cost of each
 side being inconsistent with the other until that point.
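The "run independently and merge" case can be sketched as follows. This is only one possible merge rule, chosen for illustration: the two sides' transaction logs are unioned by transaction id and replayed in timestamp order, so the last writer wins for each key. The last-writer-wins rule, the dictionary fields (`id`, `timestamp`, `key`, `value`), and the function names are assumptions of this example, not something the draft prescribes.

```python
def merge_logs(side_a, side_b):
    """Union two transaction logs, drop duplicates, and order deterministically."""
    merged = {tx["id"]: tx for tx in side_a}
    merged.update({tx["id"]: tx for tx in side_b})
    # Ties on timestamp are broken by id so every peer computes the same order.
    return sorted(merged.values(), key=lambda tx: (tx["timestamp"], tx["id"]))


def replay(log):
    """Derive key/value state from an ordered log; later writes overwrite earlier ones."""
    state = {}
    for tx in log:
        state[tx["key"]] = tx["value"]
    return state


# One transaction predates the partition; the rest happened on one side only.
shared = {"id": "t1", "timestamp": 1, "key": "x", "value": 1}
side_a = [shared, {"id": "t2", "timestamp": 5, "key": "x", "value": 2}]
side_b = [
    shared,
    {"id": "t3", "timestamp": 7, "key": "x", "value": 3},
    {"id": "t4", "timestamp": 6, "key": "y", "value": 9},
]
merged_state = replay(merge_logs(side_a, side_b))  # {"x": 3, "y": 9}
```

Because the merge is a set union followed by a deterministic sort, both sides arrive at the same state regardless of which log is passed first. Until the merge happens, each side sees only its own value of `x`.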
 #### The choice between consistency and latency, revisited
 The consistency/latency trade-off described by the PACELC theorem can be made
 acceptable by immediately performing computations on the data coming in, and
 then retroactively including the late packets. This is similar to client-side
 prediction in modern games. It allows presenting an acceptably inaccurate state
 to the user, or when perfect accuracy is absolutely necessary, reduces the
 latency caused by computation. That task can be made easier by functional
 reactive programming (wherein the body of state is derived from a body of
 calculations rather than being directly set, which would make retroactively
 changing state extremely difficult).
 ### Performance and scalability
 A serious internet network needs to scale with inevitable exponential growth,
 provide high bandwidth, and, very importantly, minimize latency, without
 sacrificing any of its guarantees.
 #### Distributed data
 In a DSCS network, depending upon the trust system, a very large number of peers
 at any given time redundantly store data. If the data is broken up into chunks,
 each connected peer can provide a chunk, resulting in extremely high throughput,
 similar to how BitTorrent functions.
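A minimal sketch of the chunking idea, assuming a trusted manifest of per-chunk SHA-256 hashes (for example, one covered by a digital signature). Peers are modelled as plain dictionaries from chunk index to bytes, and chunks are requested round-robin so that different peers serve different chunks. All names here are hypothetical, and a real implementation would fetch chunks in parallel over the network rather than in a loop.

```python
import hashlib


def split_into_chunks(data: bytes, size: int):
    """Cut the data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def make_manifest(chunks):
    """List the expected hash of every chunk, in order."""
    return [hashlib.sha256(chunk).hexdigest() for chunk in chunks]


def assemble(manifest, peers):
    """Rebuild the data, taking each chunk from the first peer with a valid copy.

    peers: list of dicts mapping chunk index -> chunk bytes (a stand-in for
           remote peers; missing keys mean the peer does not hold that chunk)
    """
    parts = []
    for index, expected in enumerate(manifest):
        for offset in range(len(peers)):
            # Start with a different peer for each chunk to spread the load.
            peer = peers[(index + offset) % len(peers)]
            chunk = peer.get(index)
            if chunk is not None and hashlib.sha256(chunk).hexdigest() == expected:
                parts.append(chunk)
                break
        else:
            raise LookupError(f"no peer has a valid copy of chunk {index}")
    return b"".join(parts)
```

Since every chunk is checked against the manifest, a peer that serves corrupted data is simply skipped in favour of another copy.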
 #### Routing optimization
 DSCS will have an optional protocol for self-organizing networks in which, for
 example, the fastest nodes carry more of the traffic.
 #### Soft partitions
 ### Anonymity, privacy, and security
 #### Friend-to-friend routing
 #### Compatibility with existing anonymity overlay networks
 #### End-to-end encryption
 #### Digital signatures and identities
 ### Spam and DoS resilience
 #### The web-of-trust revisited
 #### Proof-of-work
 #### Throughput rationing
 #### Secret identities (?)
 ### Adaptability
 #### Providing options
 #### Fallbacks and stripping
 #### Modularity and reusability