A Concept Model of Noosphere

Warning

Heavy-duty concept design ahead! This note exploring the concept design of Noosphere is not for the faint-hearted. For those who aren’t very familiar with the concepts of decentralized storage (which includes me!) or who are learning concept design for the first time, this is likely not the best introduction to concepts.

Why conceptualize Noosphere?

This note applies my theory of concept design to Gordon Brander’s Noosphere. The reasons for expressing Noosphere in terms of concepts are:

  • Separation of concerns: to separate design aspects as cleanly as possible so they can be more easily understood, and so that opportunities to reuse existing concepts can be identified;
  • Design innovation: to capture the novelty of the design, which I expect to be not in the invention of new concepts but in the combination of old concepts (demonstrating what Margaret Boden calls “combinational creativity”);
  • Mental model: to provide a compelling and tractable mental model for users (in this case probably programmers building atop the platform);
  • Product family: to show, from the independence of concepts, that the design represents a family of possible designs generated by the possible concept subsets;
  • Design decisions: to highlight key design decisions and suggest other points in the design space that might be considered;
  • Design rationale: to convey the rationale for the design by articulating the purpose of individual concepts and exploring the properties that concepts assure.

In these respects, Noosphere is no different from any other system I have considered in the past. But unlike most of them, Noosphere is a platform and not an application or a service, and this presents useful challenges to concept design:

  • Interface vs. implementation: Concepts capture observable behavior and eschew any implementation details. In an app, this distinction is straightforward, but for a platform it’s murkier. For example, Noosphere’s use of public key cryptography for signing objects requires that they be constructed in a particular order (for example, with version pointers inside the signed envelope so that their validity can be assured too); arguably this is an implementation detail.

  • General vs. specific usage: An application generally dictates a particular usage, but a platform is intended to support a variety of different usages. Some line needs to be drawn between the most particular and the most general. Notably, I’ve taken the view that Noosphere should be independent of Subconscious (Brander’s notebook application that it supports) but that peer-to-peer sharing of objects is essential to the platform.

Because of these challenges, Noosphere offers a useful case study of applying concept design in a different domain, and a chance to explore some of the complications that arise.

Basic Concepts

Noosphere assumes a setting of decentralized file sharing, so we start there; without this, it’s hard to motivate Noosphere’s concepts. Because Noosphere itself doesn’t prescribe any particular file sharing protocol, we’ll make the concept pretty minimal:

concept Stash <Content, Principal>: Memo, Stash
purpose
  decentralized, replicated storage of immutable content
principle
  // chain of copies retains owner and content
  new (s1); new (s1, c, o, m1); copy (m1, s2, m2); copy (m2, s3, m3)
  {m3.content = c and m3.owner = o}
state
  memos: Stash one -> set Memo
  owner: Memo -> one Principal
  content: Memo -> one Content
actions
  new (out s: Stash)
  // create a new stash
  new (s: Stash, c: Content, o: Principal, out m: Memo)
  // create a new memo in a stash from given contents and owner
  copy (m: Memo, to: Stash, out m': Memo)
  // create copy m' of memo m in stash to with same owner and content as m
  delete (m: Memo)
  delete (s: Stash)

Notation notes:

  • The header gives a name to the concept (Stash) and lists its parameter types (Content and Principal) and the types of objects that the concept generates (Memo and Stash).
  • The operational principle is an archetypal scenario illustrating how the concept fulfills its purpose. In this case, there’s a sequence of actions followed by a (Hoare-style) assertion. It says: if you create a stash s1, and then a new memo m1 in that stash, and you copy the memo to another stash s2 and then to a further stash s3, then afterwards the content of the final copied memo m3 will be the original content inserted into m1, and its owner will be the original owner. In other words, just what you’d expect: memos can be copied around and retain content and ownership.
  • The state is a bunch of relations marked with multiplicities in Alloy style. So memos, for example, maps elements of the set Stash to the set Memo, and the multiplicities say that each stash maps to a set (any number) of memos, and each memo is mapped to by exactly one stash (that is, memos are not shared across stashes).
  • The actions define possible updates on the state. Observer actions aren’t generally needed because the state is assumed to be visible.
  • I like to overload names to make things succinct, so Stash is both the name of the concept and the name of a type (set of items) generated by it. Note that a concept is not like an OO class, but is (computationally) just a state machine, although it’s common for a concept to bear the name of a significant type of item that it manages. Action names are overloaded too: the actions for creating new stashes and memos have the same name, as do the deletion actions.

Design notes

  • Genericity. Concepts should be as generic as possible. So this concept does not constrain the type of memo content (it could be text files, videos, etc) nor who the principals are that act as owners. In Noosphere, the content will turn out to be text and the principals will be represented by public keys. The unconstrained types are given as parameters to the concept, using the standard kind of parametric polymorphism familiar from languages like ML.
  • Stashes and memos. I’ve coined the term stash for the memo container. In Noosphere, it’s called a sphere, but I’m avoiding that term for now since a sphere includes more than a collection of memos, and I’ll want to describe it as a composition of concepts. The term memo is Noosphere’s.
  • Stash vs. principal. Note that stashes aren’t owned, and there is no constraint that all the memos in a stash have the same owner. This flexibility allows a stash to be used both as an owner’s repository for the memos they create, and as storage for memos received from others. A stash is not necessarily a peer in the P2P file-sharing sense either, since a user might want to store both their own stash of owned memos and a stash of received memos on the same machine.
  • No discovery protocol. The concept is intentionally minimal and does not describe how memos are found; that will be addressed in part by other concepts later.
  • Ownership preservation. The copy action preserves ownership. A user can copy the text of a memo and make a new memo that they own, but can’t claim ownership of a memo copied from another stash.
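
To make the Stash concept concrete, here is a minimal Python sketch of it as a state machine. It is only an illustration, not Noosphere's implementation: the class name `StashConcept`, the integer identifiers, and the method names `new_stash`, `new_memo`, `delete_memo` and `delete_stash` (Python cannot overload `new` and `delete`) are all assumptions, as is the choice that deleting a stash also deletes its memos.

```python
from itertools import count


class StashConcept:
    """Illustrative state machine for the Stash concept (names are hypothetical)."""

    def __init__(self):
        self._ids = count(1)
        self.memos = {}    # memos: Stash one -> set Memo
        self.owner = {}    # owner: Memo -> one Principal
        self.content = {}  # content: Memo -> one Content

    def new_stash(self):
        s = next(self._ids)
        self.memos[s] = set()
        return s

    def new_memo(self, s, c, o):
        m = next(self._ids)
        self.memos[s].add(m)
        self.content[m] = c
        self.owner[m] = o
        return m

    def copy(self, m, to):
        # The copy is a fresh memo with the owner and content of the original.
        return self.new_memo(to, self.content[m], self.owner[m])

    def delete_memo(self, m):
        for members in self.memos.values():
            members.discard(m)
        del self.content[m]
        del self.owner[m]

    def delete_stash(self, s):
        # Assumption: deleting a stash deletes the memos it holds.
        for m in list(self.memos[s]):
            self.delete_memo(m)
        del self.memos[s]


# Operational principle: a chain of copies retains owner and content.
stash = StashConcept()
s1, s2, s3 = stash.new_stash(), stash.new_stash(), stash.new_stash()
m1 = stash.new_memo(s1, "hello", "alice")
m2 = stash.copy(m1, s2)
m3 = stash.copy(m2, s3)
assert stash.content[m3] == "hello" and stash.owner[m3] == "alice"
```

Note that `copy` takes no owner argument, which is how the sketch reflects ownership preservation: the only way to obtain a memo with a different owner is to call `new_memo` with the same content.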

In the simplest intended application of Noosphere, memos are text notes:

concept Note: Buffer, Note
purpose
  provide editable text notes
principle
  // editing and saving creates a series of immutable notes
  new (b); edit (b, t1); save (b, n1); edit (b, t2); save (b, n2)
  {n1 != n2 and n1.content = t1 and n2.content = t2}
state
  content: Note -> one Text
  current: Buffer -> one Text
actions
  new (out b: Buffer)
  // create new empty buffer
  open (n: Note, out b: Buffer)
  // open a new buffer with the text content of note n
  edit (b: Buffer, t: Text)
  // replace current text of buffer b with t
  save (b: Buffer, out n: Note)
  // create new note n whose content is current text in b

Design notes:

  • Immutability. This describes a minimal concept in which the user edits a mutable buffer and every save generates a fresh, immutable note.
  • Editing abstracted away. Editors are complicated things whose details aren’t relevant here, so we model editing as an action that just replaces the text of the buffer with a new value.
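
The Note concept can be sketched in the same style. This is a hedged illustration in Python; the class name `NoteConcept` and the integer identifiers for buffers and notes are assumptions made for the example.

```python
from itertools import count


class NoteConcept:
    """Illustrative state machine for the Note concept (names are hypothetical)."""

    def __init__(self):
        self._ids = count(1)
        self.content = {}  # content: Note -> one Text (set once, never updated)
        self.current = {}  # current: Buffer -> one Text

    def new(self):
        b = next(self._ids)
        self.current[b] = ""
        return b

    def open(self, n):
        b = next(self._ids)
        self.current[b] = self.content[n]
        return b

    def edit(self, b, t):
        # Editing is abstracted to replacing the whole buffer text.
        self.current[b] = t

    def save(self, b):
        # Every save produces a fresh note; existing notes are untouched.
        n = next(self._ids)
        self.content[n] = self.current[b]
        return n


# Operational principle: editing and saving creates a series of immutable notes.
notes = NoteConcept()
b = notes.new()
notes.edit(b, "first draft")
n1 = notes.save(b)
notes.edit(b, "second draft")
n2 = notes.save(b)
assert n1 != n2
assert notes.content[n1] == "first draft" and notes.content[n2] == "second draft"
```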

Even though notes are immutable, the history of a note will be represented in Noosphere by a chain of versions:

concept Version <Item>
purpose
  let users access older versions of items
principle
  // if you derive a new version then rollback,
  // you're where you started
  derive (i1, i2); rollback (i2, i) {i = i1}
state
  private pred: Item -> lone Item
actions
  derive (p, i: Item)
  // when i has no pred, make p the pred of i
  rollback (i: Item, out p: Item)
  // when i has pred p, return it

Design notes:

  • Immutability. The derive action is the only way to update the state, so existing predecessor bindings cannot be altered. The intent (suggested by the action being called derive rather than register, eg) is that the derive action is executed when the new item is created, so its predecessor is established from the start.
  • Multiple successors. Although an item has at most one predecessor, it can have multiple successors.
  • Navigating back only. Perhaps most significantly, the concept does not offer any way to traverse from an item to its successor(s); although this is perhaps even more useful than navigating backwards, the way that versioning is implemented (by embedding a predecessor pointer) does not permit it. There will be another means of finding more recent versions in Noosphere through another concept. (To enforce this, I’ve marked the state component as private and provided an observer action rollback that reads it only in one direction.)
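
A small Python sketch of the Version concept shows the one-directional navigation. The class name `VersionConcept` is hypothetical, and raising an exception is just one way of modeling an action whose precondition does not hold.

```python
class VersionConcept:
    """Illustrative state machine for the Version concept (names are hypothetical)."""

    def __init__(self):
        # private pred: Item -> lone Item
        self._pred = {}

    def derive(self, p, i):
        # Only permitted when i has no predecessor yet, so bindings never change.
        if i in self._pred:
            raise ValueError("item already has a predecessor")
        self._pred[i] = p

    def rollback(self, i):
        # The only observer: reads the relation from item to predecessor.
        if i not in self._pred:
            raise KeyError("item has no predecessor")
        return self._pred[i]


# Operational principle: derive a new version, then roll back to the start.
versions = VersionConcept()
versions.derive("v1", "v2")
versions.derive("v1", "v2-alt")  # an item may have several successors
assert versions.rollback("v2") == "v1"
assert versions.rollback("v2-alt") == "v1"
```

There is deliberately no method that maps `"v1"` to its successors; a client would have to find them by some other means.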

In a decentralized storage system, in which the object you’re looking for can migrate from peer to peer, location-based addressing won’t work. Instead a scheme is used in which addresses of objects are based on their contents:

concept ContentAddress <Object, Content>: Address
purpose
  support addressing of objects in decentralized storage
principle
  // if you register two objects with same content
  // they will have the same address and a lookup on that
  // address over a set including both will return one of them
  register (o1, c, a1); register (o2, c, a2); lookup ({o1, o2}, a2, o)
  {a1 = a2 and (o = o1 or o = o2)}
state
  addr: Object -> one Address
  content: Object -> one Content
  hash: Content lone -> one Address
actions
  register (o: Object, c: Content, out a: Address)
  // register address a for object o with content c
  lookup (s: set Object, a: Address, out o: Object)
  // return any object o in set s with address a

Design notes:

  • Addresses not names. A content-based address is sometimes called a “name”. But the term “address” is more accurate and will be less confusing in Noosphere, since we’ll be introducing names through a different concept.
  • Addresses unique over content. The hash state component represents the fixed relationship between contents and their hashes. The multiplicities of this relation say that each content maps to one address, and each address is mapped to by at most one content. Hash collisions are in theory possible, but are assumed not to happen.
  • Addresses not unique over objects. In contrast, the addresses of objects are not unique, since two distinct objects may have the same contents and therefore the same address. In fact, this is essential in a decentralized storage system, because there will be many copies of a single object, and each must have the same address.
  • Non-deterministic lookup. The lookup action takes a set of objects within which to conduct a search, and arbitrarily returns an object within that set with the requested address. If no such object exists, the action fails.
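
The following Python sketch illustrates the ContentAddress concept. It is a sketch under stated assumptions: SHA-256 over UTF-8 text stands in for whatever hash function a real system uses, the class name `ContentAddressConcept` is hypothetical, and a failed lookup is modeled by raising `LookupError`.

```python
import hashlib


class ContentAddressConcept:
    """Illustrative state machine for ContentAddress (names are hypothetical)."""

    def __init__(self):
        self.addr = {}     # addr: Object -> one Address
        self.content = {}  # content: Object -> one Content

    @staticmethod
    def hash(content):
        # Assumption: SHA-256 plays the role of the fixed hash relation.
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def register(self, o, c):
        a = self.hash(c)
        self.addr[o] = a
        self.content[o] = c
        return a

    def lookup(self, objects, a):
        # Returns an arbitrary object in the given set with address a.
        for o in objects:
            if self.addr.get(o) == a:
                return o
        raise LookupError("no object in the set has that address")


# Operational principle: equal content gives equal addresses.
ca = ContentAddressConcept()
a1 = ca.register("o1", "same text")
a2 = ca.register("o2", "same text")
found = ca.lookup({"o1", "o2"}, a2)
assert a1 == a2 and found in {"o1", "o2"}
```

Which of the two objects `lookup` returns is left unspecified, matching the non-deterministic lookup described above.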

Content addresses can act as primitive names for memos, and have the advantage of simplicity (being derivable directly from content) and persistence (always identifying the same thing). But as names they suffer from two disadvantages. First, persistence is not always what’s wanted: the author of a memo might want to update the content and have existing names refer to the new content. Second, addresses are not human readable.

Noosphere uses petnames to overcome these problems:

concept PetName <Object>: Dir
purpose
  decentralized human-readable naming
principle
  // immutable directories built in a series that preserves old bindings
  new (d1); bind (d1, n1, o1, d2); bind (d2, n2, o2, d3); lookup (d3, n1, o)
  {o = o1}
state
  bindings: Dir -> String -> lone Object
actions
  new (out d: Dir)
  // create an empty directory
  bind (d: Dir, n: String, o: Object, out d': Dir)
  // create new directory which adds binding of name n to object o
  lookup (d: Dir, n: String, out o: Object)
  // return object bound to name in directory

Design notes:

  • Polymorphic target. The concept is polymorphic in the target type of a name, so anything can be named (even a directory).
  • Immutable directories. Directories are immutable, so each binding action creates a new directory. This might seem burdensome since binding a collection of n new objects will produce n new directories, but this will be mitigated by the ability of users to publish only occasionally (see User concept below).
  • String names. For simplicity, I’ve made names strings, but a richer design would make the concept polymorphic in the name type, allowing a separate path name concept (for example) which could then support hierarchical naming.
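
A Python sketch of the PetName concept makes the immutability of directories visible. The class name `PetNameConcept` and the integer directory identifiers are hypothetical. The sketch also assumes that binding a name that is already bound replaces the old binding in the new directory, which the concept leaves unstated.

```python
from itertools import count


class PetNameConcept:
    """Illustrative state machine for the PetName concept (names are hypothetical)."""

    def __init__(self):
        self._ids = count(1)
        self.bindings = {}  # bindings: Dir -> String -> lone Object

    def new(self):
        d = next(self._ids)
        self.bindings[d] = {}
        return d

    def bind(self, d, n, o):
        # A new directory is created; the old one is left untouched.
        d2 = next(self._ids)
        self.bindings[d2] = {**self.bindings[d], n: o}
        return d2

    def lookup(self, d, n):
        # Raises KeyError when the name is unbound in directory d.
        return self.bindings[d][n]


# Operational principle: later directories preserve old bindings.
names = PetNameConcept()
d1 = names.new()
d2 = names.bind(d1, "todo", "addr-1")
d3 = names.bind(d2, "ideas", "addr-2")
assert names.lookup(d3, "todo") == "addr-1"
assert "todo" not in names.bindings[d1]  # older directories are unchanged
```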

Our final concept represents users:

concept User <Asset>: User
purpose
  centralize some assets
principle
  // after a series of asset replacements,
  // the latest asset is the one given in the
  // last replacement
  new (a1, u); replace (u, a2) {u.asset = a2}
state
  asset: User -> one Asset
actions
  new (a: Asset, out u: User)
  // create a new user with a given asset
  replace (u: User, a: Asset)
  // set u's asset to be a

Design notes:

  • Mutable state. In a conventional app, the User concept just provides unique identifiers for authentication (which is typically a separate concept), and might include basic profile fields such as display name and email address. In a decentralized system like Noosphere, it has a more significant role because owners of content can act as arbiters identifying the latest (and thus most authentic) versions. Introducing users as mutable stores is thus a small concession to centralization.
  • Just one asset. For simplicity, each user has only a single associated asset. This will turn out (when we instantiate the concept) to be the user’s latest petname directory. More realistically, each user might have a collection of assets tied to keys (in the style of a domain’s DNS bindings).
  • Concept identification tactic. One simple way to identify missing concepts is to anticipate the concept composition, and note type parameters for which concrete, instantiating types have yet to be provided. The Stash concept, for example, introduces a type parameter for the content of memos; this will be instantiated with the Note type from the Note concept. Since Stash has a Principal type parameter, there will need to be some concept binding it to a concrete type of users or cryptographic keys. This User concept will play that role.

Concept Composition

With the concepts in hand, we can now assemble them into a coherent system:

app Noosphere
includes
  concept Note: Buffer, Note
  concept Version <Address>
  concept Stash <Note + Dir, User>: Memo, Stash
  concept ContentAddress <Memo, Note>: Address
  concept PetName <Address>: Dir
  concept User <Dir>: User

Notation notes:

  • Type instantiation. The included concepts have their type parameters instantiated with concrete types that correspond to the types generated by other concepts. For example, the Note concept exports a Note type; this then appears as the second argument in the instantiation of ContentAddress, saying that the notes of the Note concept will be the contents of the ContentAddress concept.

Design notes:

  • Versioning addresses. The items that are versioned by the Version concept are content addresses; these addresses in turn resolve to memos, which contain notes. An alternative would have been to version notes instead, but this seemed too abstract. Indeed, regarding notes as directly obtainable would eliminate the role of content addresses. In the implementation, the predecessor fields of memos hold content addresses that are then used to find memos whose contained notes have the corresponding predecessor content. By mapping addresses to addresses, the conceptual model of versioning does deviate slightly from the implementation, whose versioning representation is not homogeneous, and instead maps a memo to a content address. This non-homogeneity is tolerable only because the versioning concept is not separated out.
  • Directories as assets. The asset held by each user is a petname directory holding the user’s most recent bindings of petnames to (content addresses of) memos.
  • Directories in stashes. The Stash concept is instantiated with a union type for the content of memos, allowing memos to hold either petname directories or notes. All stash objects are owned, so directories will be owned too. This is important because our concept design allows any user to create new petnames for memos, including memos owned by other users, and in practice one might want to rely only on petnames created by owners.
  • Addressing notes, not text. The contents of the ContentAddress concept are notes and not the text inside them, since two notes that happen to have the same text must be regarded as distinct items (with distinct predecessors) and must therefore have distinct addresses.
  • Names for addresses. The PetName concept is instantiated with the target of the names being the addresses of the ContentAddress concept. Both content addresses and petnames can be used to locate memos. When used in links, the content addresses will be suitable for permanent and unchanging references, and the petnames for allowing references to the most recent versions of memos.

Action synchronizations

The actions of the composed system are described as synchronizations of the actions of the constituent concepts. We’ll consider just a few of the more interesting and central synchronizations.

The fundamental action involves creating memos:

sync save_edit_as_memo (owner: User, buf: Buffer, stash: Stash,
  pred: Note, name: String,
  out note: Note, memo: Memo, addr: Address, dir: Dir)

  Note.save (buf, note)
  Stash.new (stash, note, owner, memo)
  ContentAddress.register (memo, note, addr)
  Version.derive (ContentAddress.addr[pred], addr)
  PetName.bind (User.asset[owner], name, addr, dir)
  User.replace (owner, dir)

The context is assumed to be a user (who will become the owner of the new memo) who saves text in a buffer they are editing into a (presumably local) stash. A predecessor and a petname for the new note are also required; these might be provided by default, by using the last note that was saved as the predecessor, and keeping its petname as the name of the new note.

The synchronization causes actions to be performed in each of the other concepts:

  • Note: The saving of the buffer results in creation of a new note;
  • Stash: The new note is inserted into the stash, creating a memo;
  • ContentAddress: A content address is computed based on the note and associated with the memo;
  • Version: The (content address of the) predecessor note is recorded to be the predecessor of the (content address of the) new note;
  • PetName: The petname is bound to the (content address of the) new note in the current petname directory of the user, creating a new directory;
  • User: the new petname directory is stored as the user’s current directory.
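
The steps of this synchronization can be sketched as a single Python function over the combined state of the concepts. This is a rough illustration only. The dictionary names are hypothetical stand-ins for the concept state components; the address is assumed to be a SHA-256 hash of the note's identity and text (so that two notes with equal text get distinct addresses); and allowing `pred` to be `None` for a first note is an assumption, since the sync itself always takes a predecessor.

```python
import hashlib
from itertools import count

_ids = count(1)

note_text = {}     # Note.content: note -> text
buffer_text = {}   # Note.current: buffer -> text
stash_memos = {}   # Stash.memos: stash -> set of memos
memo_owner = {}    # Stash.owner: memo -> user
memo_content = {}  # Stash.content: memo -> note
address_of = {}    # ContentAddress: note or memo -> address
pred_of = {}       # Version.pred: address -> predecessor address
bindings = {}      # PetName.bindings: dir -> {name: address}
asset = {}         # User.asset: user -> current dir


def save_edit_as_memo(owner, buf, stash, pred, name):
    # Note.save: freeze the buffer text as a fresh note.
    note = next(_ids)
    note_text[note] = buffer_text[buf]
    # Stash.new: wrap the note in a memo owned by `owner`.
    memo = next(_ids)
    stash_memos[stash].add(memo)
    memo_owner[memo], memo_content[memo] = owner, note
    # ContentAddress.register: hash of note identity plus text (an assumption).
    addr = hashlib.sha256(f"{note}:{note_text[note]}".encode()).hexdigest()
    address_of[note] = address_of[memo] = addr
    # Version.derive: record the predecessor's address, if there is one.
    if pred is not None:
        pred_of[addr] = address_of[pred]
    # PetName.bind: a new directory extending the user's current one.
    new_dir = next(_ids)
    bindings[new_dir] = {**bindings[asset[owner]], name: addr}
    # User.replace: the new directory becomes the user's asset.
    asset[owner] = new_dir
    return note, memo, addr, new_dir


# Set up one user with an empty directory, one stash and one buffer.
stash_memos["local"] = set()
bindings[0] = {}
asset["alice"] = 0
buffer_text["buf"] = "draft one"
n1, m1, a1, d1 = save_edit_as_memo("alice", "buf", "local", None, "todo")
buffer_text["buf"] = "draft two"
n2, m2, a2, d2 = save_edit_as_memo("alice", "buf", "local", n1, "todo")

assert bindings[asset["alice"]]["todo"] == a2  # petname follows the latest note
assert bindings[d1]["todo"] == a1              # older directory still names the old one
assert pred_of[a2] == a1                       # version chain links the addresses
```

The usage at the end follows the default described above: the second save uses the previously saved note as its predecessor and keeps its petname.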

How does the newly created memo reach other users? Rather than imagining the user interface design for a completed peer-to-peer system, we can just specify a few key actions.

A user can get a memo’s content address by looking up a petname in a directory that they have in a local stash:

sync resolve_name_to_address_local (owner: User, dir: Dir, name: String,
  stash: Stash, memo: Memo, out addr: Address)

  memo in Stash.memos[stash]
  dir = Stash.content[memo]
  owner = Stash.owner[memo]

  PetName.lookup (dir, name, addr)

Notation notes:

  • Precondition. The sync includes a precondition saying that the directory in which the name lookup occurs is the content of a memo in the given stash, and its owner is the expected owner.

You can also go to a user’s current petname directory for the most up-to-date binding of a petname:

sync resolve_name_to_address_auth (owner: User, name: String,
  out addr: Address)

  PetName.lookup (User.asset[owner], name, addr)

Notation notes:

  • State queries. The directory that is presented as the first argument to the lookup action is obtained by a query of the User concept, looking up the asset associated with the user owner. A similar state query was used in the first sync above to find the content address of the predecessor note.

With a content address in hand, a memo can now be located by resolving the address within the set of memos of a given stash:

sync get_memo_by_address (s: Stash, a: Address, out m: Memo)
  ContentAddress.lookup (Stash.memos[s], a, m)

Notation notes:

  • Failure. In concept design, an action may fail (or more exactly fail to occur) if its preconditions are not satisfied. So implicitly this action will only occur when a memo actually exists in the given stash with the given address. Of course, in an implementation, appropriate errors should be reported.

For memos to be available in a local stash, they must be copied from peer stashes:

sync copy_memo_between_peers (from, to: Stash, memo_from: Memo, out memo_to: Memo)
  memo_from in Stash.memos[from]
  Stash.copy (memo_from, to, memo_to)
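
How lookup and copying interact can be sketched in Python as follows. This is only an illustration: the dictionary and function names are hypothetical, the string `"addr-123"` is a placeholder for a real content address, and a failed action is modeled by returning `None`.

```python
from itertools import count

_ids = count(1)
stash_memos = {"peer": set(), "local": set()}  # Stash.memos
memo_owner = {}                                 # Stash.owner
memo_content = {}                               # Stash.content
memo_addr = {}                                  # ContentAddress.addr


def new_memo(stash, content, owner, addr):
    m = next(_ids)
    stash_memos[stash].add(m)
    memo_owner[m], memo_content[m], memo_addr[m] = owner, content, addr
    return m


def get_memo_by_address(stash, addr):
    # Any memo in the stash with this address will do; None models failure.
    for m in stash_memos[stash]:
        if memo_addr[m] == addr:
            return m
    return None


def copy_memo_between_peers(src, dst, memo_from):
    # Precondition of the sync: the memo must be in the source stash.
    if memo_from not in stash_memos[src]:
        return None
    # The copy has the same owner and content, and hence the same address.
    return new_memo(dst, memo_content[memo_from],
                    memo_owner[memo_from], memo_addr[memo_from])


original = new_memo("peer", "note text", "alice", "addr-123")
assert get_memo_by_address("local", "addr-123") is None  # not yet available locally
copied = copy_memo_between_peers("peer", "local", original)
found = get_memo_by_address("local", "addr-123")
assert found == copied and memo_owner[found] == "alice"
```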

You can find the address of the predecessor of a note:

sync get_predecessor_addr (m: Memo, out a: Address)
  Version.rollback (ContentAddress.addr[Stash.content[m]], a)

Reflections and next steps

  • Errors. The concept model no doubt contains errors arising from my lack of knowledge of Noosphere and decentralized storage.
  • Modeling liberties. It also takes some modeling liberties, most notably in treating content addresses as being derived directly from the identities of notes (rather than from the bits that comprise them).
  • Immutability modeling. The immutability of various objects is modeled here by simply having immutable relations in concept states; in the implementation, the immutability is assured as a side-effect of the use of content addresses (since modifying the content of a memo would change its address). Since content, for the purpose of content addressing, includes predecessor pointers, the content address also ensures the immutability of predecessors. I considered representing this more explicitly in the concept model but it created modularity problems (and anyway seems to be an implementation detail).
  • Role of versioning. The Version concept seems weak, as evidenced by its not very compelling operational principle. It’s not clear to me how versioning is used. One scenario might be that, when a memo is copied between stashes, the receiving stash checks to see if the new memo is a newer version of an existing memo in the stash, and if so replaces it. How this would work with naming, however, is not clear.
  • Links. In Subconscious, the note storage app built on Noosphere, the text of a note can include links to other notes. A link might refer to a content address or a petname (or perhaps both, allowing an archival referent and an updatable one).
  • Public keys. In Noosphere, public keys are used as principals and also to ensure the integrity and authenticity of memos. I don’t think this would be hard to model. I left it out for now, because it’s mostly an implementation detail, with the concept aspects captured by the User concept and the owner relation in Stash.
  • Hierarchical storage. I noted above that by introducing a Pathname concept, and using path names as petnames instead of strings, the name space could simulate a hierarchical folder structure (as in Git).
  • Spheres. A sphere in Noosphere combines the roles of User and Stash. In the concept model these are separated to allow the possibility that a user has multiple stashes.
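
The point made under immutability modeling, that including the predecessor pointer in the hashed content protects the version chain, can be illustrated with a short Python sketch. The function name `address` and the use of SHA-256 over a simple text encoding are assumptions for the example, not Noosphere's actual encoding.

```python
import hashlib


def address(text, pred_addr):
    # The predecessor pointer is part of the content that gets hashed.
    payload = f"{pred_addr or ''}\n{text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


a1 = address("first version", None)
a2 = address("second version", a1)

# Altering the first version yields a different address, so the pointer
# embedded in the second version no longer matches the altered content.
tampered = address("first version (edited)", None)
assert tampered != a1

# Re-pointing the second version at the altered one changes its address too.
assert address("second version", tampered) != a2
```

In this sketch, nothing needs to declare the objects immutable: any change to a memo's content or predecessor simply produces an object with a different address.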