DHT

Introduction & Technical Overview of SAFE Consensus

The features included in decentralized networks can be quite varied based on the proposed goals of the technology. From the sharing capabilities offered by Bittorrent to user privacy enabled via Tor’s routing protocol, network designs directly reflect the mission set forth by their architects. Within autonomous networks that rely on data and system integrity where network critical actions may fail or produce faulty outputs, consensus mechanisms are an important feature which optimise reliability. Just like in greater society where important business or policy decisions are typically deferred to a board or committee rather than depending on a single individual, computer systems which manage data and user accounts in a diverse environment face quite a lot of potential for parts of that system to be inaccurate or unresponsive. In commonly owned and openly participatory networks, the risk of malicious behavior adds even greater importance of consensus around its current state and actions taken.

As a decentralized, peer-to-peer network with an initial focus on reliable storage and communication, the SAFE Network requires a high standard for data integrity. User files and account information are secured and stored in such a way that no major outage should affect access of data for the main network. While most P2P networks gain in security from a global network of distributed nodes (the SAFE Network further obfuscates traffic using global multiple hop routing), critical decisions for maintaining security of stored data are kept “localised” in SAFE for increased efficiency and privacy.

The Nature of Consensus: Global & Local

Before diving into the specifics of SAFE consensus, let’s do a bit of comparison with other recent developments in decentralized consensus design. One of the more interesting implementations was introduced several years ago with the launch of Bitcoin. The combination of proof of work and blockchain technologies has enabled an extremely reliable way to track a permanent and ordered list of cryptographically secure transactions. All nodes within the Bitcoin Network store and verify a copy of the blockchain which protects against tampering with historical transactions and faulty reporting of newly created ones. Unfortunately, the global and public nature of Bitcoin’s consensus process creates drawbacks with regard to efficiency and privacy. . The fact that all nodes in the network need to track and agree on a single, infinitely growing ledger has proven scaling problems and simplifies the ability for deep analysis of the ledger and user profiling. While various efforts are looking to solve theses issues, the years of research carried out by the MaidSafe team has resulted in a consensus mechanism designed specifically for privacy and efficiency – a different goal than the proof of concept architected by Satoshi Nakamoto for Bitcoin. This protocol is the basis of the SAFE Network and when compared to Bitcoin takes a very different approach, enabling actions and verifying network states based on local node consensus.

Those following MaidSafe may know of our preference for the emulation of natural systems which have hundreds of thousands of years of testing in diverse environments and harsh conditions. This philosophy can be extended to help understand a high level reasoning for our approach to consensus. Animal societies of all kinds localise decisions to reach agreements about immediate threat levels and other states of being while brains have evolved to localise neuron function for efficiency. Additionally, local consensus allows for the more sophisticated societies formed around humans to make intelligent decisions about sensitive actions such as an elected committee deciding on substantial policy changes for a community. Of course, these social situations come with their own vulnerabilities if the individuals involved in consensus decisions have similar self interested goals which do not reflect the interest of that which they govern. However thankfully, in computer networks, measures can be implemented which prevent local consensus abuse (or misuse) by nodes and it all starts with the foundation on which the network is built upon.

XOR Close Group Consensus

A recent post on this blog titled Structuring Networks with XOR outlines the basics of defining closeness within the SAFE Network’s distributed hash table (DHT) implementation. If you are not familiar with the foundation of Kademlia-based DHTs, that post will be a prerequisite to effectively understanding consensus process in SAFE that we will now dive deeper into. As we explore how such local consensus is approached using XOR closeness, it is important to keep in mind that “closeness” in this sense does not refer to geographical closeness, rather from a perspective of network address. So when nodes have IDs which are close in distance according to the XOR calculation, their physical locations could be on opposite sides of the planet.

By relating network elements in terms of XOR closeness, a unique group of the closest nodes to a given object can be determined and subsequently act in a managerial role for it. As long as these objects have unique IDs which can be translated to binary, everything from data to nodes can be related to each other in terms of distance and assigned a close group of network nodes (or as we call them, Vaults). This close group of Vaults can take on a variety of purposes depending on the object they surround, but center on management of data and node integrity consensus processes. The graph below shows how we can relate any object with ID n to its four closest online Vaults.

closest-group

Whether the data is public keys, encrypted personal files and client account information, or cryptocurrencies, close group authority is the basis for the SAFE Network’s ability to self-manage and self-heal. As long as nodes are not able to predetermine their location in the DHT address space, the inclusion within a close group is essentially random and drastically reduces any chance of group members colluding to disrupt the integrity of an action on particular piece of data. A future post detailing the security against various types of attacks will dive deeper into concepts like how IDs are assigned but for the purpose of understanding the consensus mechanism, we can view it as random. Further, each group consensus requires a quorum of vaults for authority approval which protects against a minority of unreliable nodes. The exact quorum to group size ratio will be investigated as part of ongoing test networks to balance efficiency with security. Additionally, as vaults in the network go on and offline (referred to as network churn), the members in close groups will be in a constant state of flux to accommodate for new or lost nodes.

Node and Group Authorities

The variety of close group authorities formed in SAFE are fundamentally determined based on the ID of the object which the vaults within that group surround. These distinct consensus responsibilities are referred to as personas. Client nodes act as a complementary authority for user authorised actions within the network and differ from Vault nodes as they do not route data but instead act as an interface for users to connect into the network and access or upload data. Each Vault node can also be considered an authority with the extremely limited capabilities of responding to requests for data they store. Using cryptographic key signing, the network verifies authority based on messages sent by Clients, personas and individual Vaults.

Some actions require just Client and Vault cryptographic authorisation (such as reading data already uploaded to the network) while others involve at least one persona consensus as well (such as storing new data to the network). Autonomous actions require no Client authority and solely rely on persona consensus and Vault cryptographic authorisation (such as network reconfiguration to reassign data storage when a Vault goes offline). These autonomous processes are what enables the SAFE Network’s ability to heal itself and manage nodes and stored data without the need for any centralised authority or Client action. This is a major difference from Bittorrent’s implementation of Kademlia which does not provide availability of data – if a few Bittorrent nodes hosting an niche piece of content all eventually go offline, there is no network procedure for reallocating that data and it therefore becomes inaccessible.

The Four Authorities

The network’s routing protocol defines four authorities. There are two node authorities: Client and ManagedNode (Vault storing data); and two fundamental persona types: ClientManager and NaeManager (Network Addressable Element) consensus groups. Fundamental persona types are defined based on placement in the DHT and XOR closeness to object IDs while ManagedNodes are determined based on their inclusion within a NaeManager group. The persona types are subcategorised into specialised responsibilities based on the type of data or account which they surround. It is expected that personas will overlap, meaning a single Vault might be included within several close groups simultaneously while also storing data assigned to it.

Authority Network
component
Persona Group
Sub-types
Responsibility
Client Client node N/A Private & public
data access
ClientManager
Persona
Vault node
close group
MaidManager Client authentication
& PUT allowance
MpidManager Client inbox/outbox
NaeManager
Persona
Vault node
close group
Data-
Manager
Immutable
data
GET rewards & data
relocation
Structured
data
GET rewards, data
updates & data
relocation
ComputeManager (TBA) TBA
ManagedNode Vault node N/A Store data & respond
to data requests

Client

While Clients have authority outside of group consensus, as previously mentioned, they have limited control and are never in a position which affects data reliability. Clients are the windows into the network for users and therefore will control data and messages from a client-side perspective such as encryption and decryption for uploaded data. For each client connection into the network, there is an anonymising proxy node which relays all data to and from destinations within the network, but the proxy does not have the ability to read any of it (for those familiar with Tor, this function is akin to a “guard node”).1

ClientManager

The ClientManager persona consists of Vaults closest to a Client ID and is subcategorised into MaidManager and MpidManager personas. MaidManager (Maid Anonymous ID) is adjacent to the personal ID associated with a Client and has the responsibility of managing that user’s personal account functions, such as upload allowance and authentication. MpidManger (Maid Public ID) surrounds the public ID associated with a Client and is responsible for maintaining the Client’s inbox and outbox for sending messages to other Clients.

NaeManager

The NaeManager persona consists of Vaults closest to network addressable elements such as data chunks. The initial release of SAFE will focus on implementing the persona type DataManager to take on the task of enforcing data storage reliability with future plans for ComputeManager persona type for reliably computing data. DataManager is further subcategorised into functions managing immutable data and structured data. ImmutableDataManager are a group of Vaults closest to the ID of an immutable chunk of data and manages GET rewards (safecoin farming) for successful ManagedNode responses and the relocation of data when one of these goes offline. Immutable data chunks are encrypted pieces of user uploaded files with IDs derived from the data chunk itself. A file is only able to be reassembled by users with access to the specific data map, more commonly known as a symmetric key file. StructuredDataManager is closest to the ID of structured data which are small, mutable data files owned by users on the network such as DNS entries, safecoin and data maps for immutable file reassembly. In addition to managing GET rewards for ManagedNodes storing the file and relocation responsibilities, StructuredDataManager will also acknowledge updates initiated by owners of the data (such reassigning ownership to another user).

ManagedNode

Like Clients, ManagedNodes have limited control in the authority functions they take on as individual Vaults. They only have control over data which they store and responses to requests for that data. Since all uploaded data is stored redundantly over at least a few ManagedNodes they are never in total control of any data piece. All Vaults in the network may store data and will take on this limited ManagedNode authority over a piece of data when assigned to the DataManager group surrounding that data ID. This means all DataManagers will also be ManagedNodes storing that data. The role a Vault takes on as a ManagedNode storing (or eventually computing) a piece of data is directly dependent on its role as a DataManager group member for that data, but the two authorities are nonetheless distinct.

The illustration below shows relationships of a Client (n) and the closest online Vaults which make up their ClientManager (n+2, n+5, n+7, n-8), a data chunk (m) and the closest online Vaults which make up its DataManager (m-1, m-2, m-3, m+7) and those DataManager Vaults acting also as ManagedNodes storing m.

circle-groups

Mapping Network Actions

With these various roles and authorities in mind, let’s explore some actions which can give a more complete view of how the network functions. For simplicity, we’ll use the same groups as the previous illustration for all examples. Each action within the network originates from one of the four authorities. Client initiates actions for storing new data or modifying and reading existing data they have access to while DataManager authority initiates restoring a piece of data when a ManagedNode becomes unresponsive. A ManagedNode will never initiate an action and only acts in authority to requests for data. Every action additionally involves cryptographic verification of authorities involved at each step.

PUT

When a logged in user uploads a new piece of data to the SAFE Network, a series of authorities come into play for securely and privately storing it. If a Client is putting a standard file type (such as a document or image) onto the network, it will first locally self-encrypt the data into chunks while creating a data map (symmetric key) and upload each data piece to be stored and managed within its own location on the DHT. As mentioned, these immutable data chunks have unique IDs derived from the contents of the encrypted chunk itself while the decryption key, or data map, is uploaded with its own unique ID within a structured data file. The self-encrypt video linked above illustrates how the data map is both a decryption key and list of data chunk IDs which double as pointers for locating them in the network. The authorities involved in uploading a single piece of data (whether immutable or structured) are as follows:

circle-groups-PUT

The Client sends a signed message via bootstrap relay node with the data chunk to its own ID which is picked up by the MaidManager close group in that part of the network. After checking the authority comes from the Client, a quorum of Vaults within this group must confirm the storage allowance for the Client before deducting an amount then sending the data and a message signed by the group to the location in the network where the ID for that data exists. In the case that the data already exists, no allowance is deducted and the Client is instead given access to existing data. If it does not exist, the message from the MaidManager is picked up by the closest group of Vaults to the data ID as a new DataManager, which checks the authority coming from the MaidManager persona. DataManager Vaults then initiate storage on each Vault in the group as individual ManagedNodes. Each ManagedNode sends a success response (or fail in the case of insufficient resources) back to data ID which is again picked up by the DataManager and forwarded back to the Client ID which is in turn picked up by the ClientManager and forwarded back to the Client.

GET

The action of reading a piece of data which a user has access to is a simpler process as there is no need for MaidManager consensus. Clients can send messages directly to ManagedNodes so long as they know the ID for the data which they store locally. In fact, there is no direct consensus needed to retrieve a data piece however in order to reward the data holder with proof of resource, DataManagers confirm successful response of data.

circle-groups-GET

The Client sends a message to the ID for the data they are looking for which is picked up by the closest ManagedNode among the DataManager group and responds with the data itself. If there is a problem obtaining the data from this Vault, a short timeout will trigger the second closest ManagedNode to instead respond with the data and so on. The DataManager group confirms response by the Vault and sends a reward message with a newly created structured data representing a unique safecoin to the public address of that particular node. A future post will go more in depth into safecoin creation, handling and cost metrics.

Relocate Stored Data

When the network churns and ManagedNodes holding data go offline, it is only natural that the network assigns data storage responsibilities to another node for preserving data retrievability. This is a network critical action and is one of the few instances of actions being initiated by a persona rather than a Client. The absence of a missing ManagedNode will be detected quickly as periodic heartbeat messages are sent between all connected nodes.

circle-groups-churn

A ManagedNode storing a particular piece of data that is unresponsive to a connected node will have its closest Vaults alerted. Once confirmed by the DataManager maintaining that data chunk, they choose the next closest Vault to the data ID to become a new ManagedNode and member of the DataManager group. The newly chosen ManagedNode sends a success response to the rest of the close group which is then confirmed.

Close Group Strategy

Decentralized data reliability is an important feature of systems which remove dependence on central parties. Furthermore, such systems which aim to preserve privacy for users must also consider their methods used for consensus and understand the trade-offs. Bitcoin’s global consensus around a single public ledger helps guarantee network and transaction status but lacks in scalability and privacy. By segregating vault responsibilities in SAFE based on XOR closeness, the network is able to achieve reliable data and network status maintenance without the need to reveal any information globally. The potential for attacks on consensus mechanisms also varies with their implementations. While no system can claim absolute security, sufficient measures can be put in place to reduce potential by increasing the difficulty and necessary resources for staging such an attack. In SAFE, for example, a series of close group consensus in a PUT forces attackers to control a quorum in multiple close groups. Additionally, network churn in such a large address space facilitates the constant distribution of new IDs makes Sybil attacks more difficult to attain as nodes are constantly showing up in random parts of the network. Node ranking can also be used in close groups to detect disagreements in consensus and downgrade or push out unagreeable nodes.

With the previous introduction of XOR properties in DHTs and overview of their use for close group consensus within SAFE, we hope we have provided a better general understanding of data reliability in the network. This authority process is used for every action on the network including managing safecoin ownership and transfer. Expect future posts which dive into details of attacks on close group consensus (and mitigations), data types in SAFE, and safecoin functionality including the built-in reward system for data holders, applications and public content. In the meantime, questions or discussion about the consensus approach are welcome in our community forum.

 

1This proxy also serves as the initial bootstrap node for introducing nodes back into the network whether a Client or Vault. All nodes start out as a client and negotiate connections to their future peers via their proxy node. The bootstrap node has no relationship to the Client or Vault (in terms of closeness) and is randomly chosen from a list of hardcoded nodes in a bootstrap config file, taken from a cache of previously used bootstrap nodes or through a service discovery mechanism such as a vault already connected to SAFE on the same local area network (LAN).

Further resources:

The Language of the Network (2014)

Autonomous Network whitepaper (2010)

‘Peer to Peer’ Public Key Infrastructure whitepaper (2010)

MaidSafe Distributed Hash Table whitepaper (2010)

MaidSafe Distributed File System whitepaper (2010)

Structuring Networks with XOR

A prerequisite to understanding the SAFE Network on a technical level, including the consensus process, requires knowledge of the underlying structure which powers it as a decentralized, autonomous system. Peer-to-peer networks can be categorised into two general types: structured and unstructured. In unstructured networks, there is no explicit organization of nodes and the process of joining and forming connections to others is random. In structured networks, like those which use a distributed hash table (DHT) such as Bittorrent or the SAFE Network, the overlay protocol includes a structure for organizing the network and makes purposeful connections to specific nodes more efficient.

One of the most widely adopted DHT’s, named Kademlia, was originally popularised through its implementation in Bittorrent’s Mainline DHT which removed dependence on central trackers for finding the locations of nodes and data stored on the network. Kademlia employs a rather simple operation called “exclusive or” (XOR) to establish a mathematical relationship between node identifiers. While SAFE uses a modified version of Kademlia, the XOR operation is consistent across implementations and understanding this equation will give insight to all networks based from Kademlia.

Comparing Bits

To best understand how XOR facilitates a structured, p2p network, let’s start from the very basics of the operation which at its foundation, compares two inputs and then outputs their difference. The input numbers used are binary, meaning they are made of only 0’s and 1’s. The mathematical symbol for a XOR operation is ⊕.

To show the simplicity of calculating the XOR output of two binary numbers, let’s first look at an example with fairly large numbers as inputs:

Input A: 11001010011010100100010101011110
Input B: 01001000101110011010011111000101

Now, to find the XOR output, simply compare the bits (a bit is a single digit in a binary number) individually and in order. Where the bits are the same, place a zero (0) and where the bits differ, place a one (1).

The table below shows the calculation of the 32-bit inputs we chose where input A is the first row in yellow, input B the second row in blue and the XOR output last in green.

1 1 0 0 1 0 1 0 0 1 1 0 1 0 1 0 0 1 0 0 0 1 0 1 0 1 0 1 1 1 1 0
0 1 0 0 1 0 0 0 1 0 1 1 1 0 0 1 1 0 1 0 0 1 1 1 1 1 0 0 0 1 0 1
1 0 0 0 0 0 1 0 1 1 0 1 0 0 1 1 1 1 1 0 0 0 1 0 1 0 0 1 1 0 1 1

 

Since the first bit in input A is 1 and the first bit in input B is 0, the XOR output for that digit is 1. Meanwhile, the second bit in both numbers is 1 so the XOR output for that digit is 0 and the third bit in each number is 0 so the XOR output for that digit is also 0. By comparing each digit down the line as the same or different, we arrive at an XOR output of 10000010110100111110001010011011. The decimal conversion of that value is 2194924187 which is not such a straightforward calculation, however, it can be helpful to know how the pattern of binary counting works:

0=0
1=1
10=2
11=3
100=4
101=5
110=6
111=7
And so on...

Properties of XOR

Now, to get a grasp on the usefulness of XOR calculations, let’s take a step back and focus on 1-bit numbers as our inputs.

XOR operations on 1-bit numbers (0, 1)
Input A Input B Output C Operation A⊕B==C
0 0 0 0⊕0==0
0 1 1 0⊕1==1
1 0 1 1⊕0==1
1 1 0 1⊕1==0

 

Using the table above (which shows every possible combination of those values), we can see that regardless of the input values, if they are equal to each other, the output is zero. Alternatively, if input values are not equal, the output is a non-zero value (1 in the case in 1-bit values). The last characteristic we can gather from this table is that if we swap A for B then C stays the same which in mathematics is called commutative and can be expressed as:

if A ⊕ B == C therefore B ⊕ A == C

Furthermore (but a bit more difficult to tell in this simple example), if we swap A or B for C, the new output will be the value which C replaced and can be expressed as:

if A ⊕ B == C therefore C ⊕ B == A and A ⊕ C == B

We can now observe how these properties hold true with slightly larger binary values.

XOR operations on 2-bit numbers (00, 01, 10, 11)
Input A Input B Output C Operation A⊕B==C
00 01 01 00⊕01==01
00 11 11 00⊕11==11
01 01 00 01⊕01==00
01 10 11 01⊕10==11
10 11 01 10⊕11==01
11 01 10 11⊕01==10
11 11 00 11⊕11==00

 

Using the table above (which only shows a sample of possible combinations) we can still see that equal inputs give an output of zero (00), unequal inputs give a non-zero output and the property where swapping any of A, B or C for each other in the operation holds valid (highlighted in coloured rows). As the binary numbers grow larger, these characteristics of XOR operations will continue to hold. Additionally, we can deduce that the XOR output of two values (also called XOR distance) A and B will always be unique for those inputs. In other words, there is only one value B at a given distance C from given value A which can be expressed as

if A ⊕ B == C then never A ⊕ B == D and never A ⊕ D == C

XOR Relationships in Networks

With basic understanding of XOR characteristics, let’s now explore how it maps onto a peer-to-peer network using binary tree graphs.

sm-binary-trees

The two graphs above illustrate the simple tables we used to explain XOR properties with the left side being a 1-bit network (two nodes) and the right side, a 2-bit network (four nodes). Within each graph, a step to the left at a vertex point adds a zero (0) bit to the end of the number and a step to the right adds a one (1).

big-binary-tree

For better understanding of these properties in larger networks, the graph above shows a 5-bit XOR network consisting of 32 nodes (00000 to 11111 in binary, 0-31 in decimal) and follows the same vertex stepping rule. The two blue coloured nodes, 12 (01100) and 15 (01111), have an XOR distance of 00011 (3 in decimal) while the two orange nodes 16 (10000) and 31 (11111) have an XOR distance of 01111 (15). However, even though the blue node 15 and the orange node 16 are next to each other in the line, their XOR distance is even larger at 11111 (31) and shows that XOR distance does not follow the same distance assumptions as we are used to. Since every node has a unique distance to every other node in the network, it becomes quite simple to reliably relate them to each other using this property and therefore finding nodes that are closest to each other is straightforward by doing an XOR calculation on a node ID and the smallest distances (think back on the property of swapping values in XOR calculations).

Say we want to find the closest 4 values to the input value of 01010 (10) then we can XOR the input value with the 4 closest non-zero distances 00001 (1), 00010 (2), 00011 (3) and 00100 (4).

Inputs 01010
⊕00001
01010
⊕00010
01010
⊕00011
01010
⊕00100
Output (11) 01011 (8) 01000 (9) 01001 (14) 01110

 

Now, if we take one of those closest values, say, 01110 (14) and again find the closest 4 non-zero values to it, we get a unique set of outputs.

Inputs 01110
⊕00001
01110
⊕00010
01110
⊕00011
01110
⊕00100
Output (15) 01111 (12) 01100 (13) 01101 (10) 01010

 

Since the XOR distance between a particular value and every other value is unique, the closest value set will also be unique. With that in mind, imagine a network using XOR calculations to relate node IDs where each node establishes connections with and stores data about their closest nodes. By communicating with the group of nodes closest and asking what each of their closest nodes are, any single node can eventually locate any other node in the network creating a distributed database.

Decentralization Requirements

It is extremely important in XOR networks that each node has a unique ID. In decentralized networks where there is no central party to issue and enforce unique IDs, this task requires a large enough ID range (ie. with 128-bit or 256-bit numbers) to reduce chance of overlap and a secure hashing algorithm to prevent predetermination of a node’s ID and therefore a node’s placement in the network. Due to the necessarily high ID space and random placement, decentralized networks will not have nodes occupying every value in the ID range and therefore, the closest nodes are most likely not going to be one of the closest 3 nodes like the example above. Regardless of what the actual closest nodes are, however, the relationships between them allows for each node to maintain a narrow perspective of the network and use their established connections for scoping beyond that when needed. This type of relay network makes it easier to discover data and route messages in a targeted but decentralized manner.

This concept is the foundation for Kademlia-based distributed hash tables (understand the name better, now?), including those used in Bittorrent and SAFE. In Bittorrent, this discovery pattern allows nodes to discover which other nodes are storing a particular file. In SAFE, the data on the network is identified with IDs in the same range as nodes so that the data itself can be mathematically related to nodes for storage purposes in the same way nodes are related to each other. XOR closeness is the basis for the SAFE Network’s consensus processes to ensure data reliability and security. This will be covered in more detail in a future post now that we have established an understanding of XOR properties in structured, p2p networks.

The Architecture of the SAFE Network

Despite our best efforts, it is not uncommon to come across comments in forums from people advising that they aren’t really sure what MaidSafe is. They normally go something like “It sounds really cool, but I don’t really know what decentralising the Internet means”.  We have spent about 12 months now explaining the concept to people, giving talks, producing videos, papers and other documentation, so you might think these comments would be frustrating. But, you would be wrong.  In fact, this lack of understanding is …well…understandable.

Firstly, the concept of decentralisation is difficult to get your head around if you are not familiar with Bitcoin and other decentralised/distributed networks. People still ask us: “Where are the servers really?” I suspect this comes from the fact that the client/server model is so entrenched in our thoughts and understanding of how computers should work. Another reason may be that we use potentially inadequate phrases like ‘decentralise the Internet’ to try and communicate our offering concisely. An unfortunate requirement of the buzz word and catch phrase driven attitude of modern media. A by product of so many companies and technologies all competing for a slice of our precious time.

To overcome these challenges it may be helpful to try and apply some context and look at how the Internet currently works, with a view to explaining how the SAFE Network compares. Please excuse the brevity of this explanation, given that this is a blog post discussing a huge topic this will be a Felix Baumgartner height view of the main concepts. For a more comprehensive explanation check out the following post by Rus Shuler, it’s an excellent resource.

The most fundamental layers of the Internet, as described by the OSI model, are comprised of the cables, routers, switches, servers, and connected devices that exist. At this level, data is at it’s most basic level, 1’s and 0’s. Each of these hardware devices require a unique location address (in a TCP/IP network) in order that they are able to find one another, these are called IP (Internet Protocol) addresses.

Maid OSI

You may have heard that the current system of IPv4 is currently running out of IP addresses (there are 2^32) and the Internet Engineering Task Force has been trying to transition to IPv6 for around 25 years, however, uptake has been slow.  You can see this shown on the diagram as the network layer and it is here that data is transferred using the message content and the destination (IP) address, the network layer then decides how to deliver it. These messages are typically split into fragments, or data packets. 

The existing Internet

It is at this point that the existing Internet and the SAFE Network diverge. The transport layer enables data of variable length to be transferred between a source and it’s destination, checking to ensure that each data packet is received at the destination and re submitting those that don’t. A helpful analogy to think about the transport layer is as a post office, classifying and sending the letters and packages to their destination. Examples of two common layer four protocols are Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP).

The session layer governs connections between devices; setting up, managing and terminating exchanges and dialogues between applications. The session layer responds to requests from the presentation layer, which is itself responsible for transforming data from the session layer into a format that the application layer will accept and display. This can include features such as encryption, decryption and data compression.

Finally we have the application layer, the highest level of the OSI conceptual model and the one that all Internet users will be familiar with. The application layer displays information received from lower layers to the user and it is the one that we use to interact with each application. At this level we will see protocols such as Hypertext Transfer Protocol (HTTP) being the basis for data communication on the web, as well as Simple Mail Transfer Protocol (SMTP) for email transmission.

The SAFE Network

It is not the intention to go into any great detail on the SAFE stack, that is for another post. The purpose here is to give an overview and to show how it compares and contrasts with the OSI model.

As depicted in the following diagram, the SAFE Network is an overlay network, utilising the Internet’s existing hardware and network layers. It is worth noting at this point that while we are focussing on the differences of both networks in the software layer, it is anticipated that the SAFE Network will have a significant impact within the hardware layer. With network resources being provided by the SAFE Network’s users, we will potentially experience a reduction in the number of servers deployed and a change in how they are being used. 

Maid STACK

Sitting on top of the hardware and network layers of the existing Internet infrastructure, the SAFE Network utilises the UDP transport protocol, with a Connected Reliable Udp eXchange (CRUX) sitting a layer above. The combination of these two layers, and the Routing layer above, provides the reliability and congestion control of TCP while also helping to deal with the tricky issue of NAT traversal.

Network Address Translation (NAT) is commonly used to address the problem of IPv4 address exhaustion mentioned earlier, enabling private IP addresses to exist behind one public IP address. As the NAT has no way of determining which device the incoming message is intended for, this poses a problem specifically for P2P programs, such as file sharing services like Torrents, or VOIP applications like Skype. The use of CRUX, UDP and Routing enables the SAFE Network to traverse all NAT routers.

Sitting above UDP and CRUX, the Routing layer is a Distributed Hash Table (DHT), based on Kademlia, and governs the routing between nodes on the network. Each node locally stores information about other nodes that it is connected to. Information exchange in routing typically involves data traversing a number of intermediate nodes prior to reaching the destination. This communication is provided by CRUX and an acknowledgement/retransmission mechanism is also provided to ensure reliable message delivery.

Passport and Vaults complete the network libraries with everything above taking place within the users computer/device. Passport facilities the provision of key types that validate the user, enabling them to perform various tasks, such as transferring ownership of a safecoin or accessing a file. These keys require no central authority to validate them. The vaults reside on the machines of the network’s farmers. These vaults perform multiple functions including; managing data, storing data, looking after clients, managing other network nodes and the well being of the entire network.

The Application programming interface (API) sits on top of the Encryption layer and provides a way for SAFE clients (applications) to connect to the underlying network. Theses APIs (there are two of them) will be used by application developers on the network to enable their applications to utilise the networks features. The REST API enables simple commands to be made, such as Put, Get and Delete, while the POSIX like NFS API provides much more nuanced access, enabling more advanced functionality. Drive is a cross platform virtual drive, utilised by both API’s, that presents the network as a native drive on the users computer.

So, hopefully this post has helped explain what the SAFE Network is and how it compares to the Internet as it stands today. SAFE has the potential to impact the physical layers of the existing Internet, through a reduced number of servers, as well as many of the software layers that transfer our data around the network. We have said in the past that we will replace the parts of the Internet and we think this will happen in time, but possibly complimenting the existing Internet is a better way to think about SAFE in the nearer term.

This undertaking has been a huge task for all involved and has not been taken on lightly, or just to be able to do things differently. The overarching drive behind the SAFE Project has always been to provide privacy, security and freedom to all users of the Internet and to offer these benefits within the existing Internet infrastructure was believed to be an impossible task. In our view a significant restructure was required.

The Internet would also benefit from decentralisation elsewhere. Less reliance on Internet Service Providers through Mesh Networking technologies would be of huge advantage, removing a further central point of control and ensuring that all data is treated with the same equal importance. The ability to openly share and collaborate on information will be a key driver for the advancement of the human race, something that is too important to be controlled by companies or governments.