Data Discovery & Connections

Mutable Content

One of the most powerful properties of IPFS is that any piece of data or content you store on the network cannot be modified without changing the Content Identifier (CID) for that data, since the CID is created (in part) by hashing the content. There are various ways to build mutable data on top of IPFS, and in this lesson we will look at a few of them.
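The link between content and identifier can be illustrated with a minimal sketch. It uses a plain SHA-256 hex digest as a stand-in for a CID; a real CID also encodes a version, a codec, and a multihash prefix, and files are typically chunked before hashing, so the digest below is not the CID that an IPFS node would report. The function name `content_digest` and the sample bytes are made up for illustration.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the data (a simplified stand-in for a CID)."""
    return hashlib.sha256(data).hexdigest()


original = b"hello ipfs"
edited = b"hello ipfs!"  # a one-character change

# Identical bytes always produce the same identifier...
print(content_digest(original) == content_digest(b"hello ipfs"))
# ...while any edit, however small, produces a different one.
print(content_digest(original) == content_digest(edited))
```

This is why an address based on a hash can never follow an update on its own: the updated content is, by construction, a different address.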

Mutable Content | ResNetLabs on Tour – David Dias

See the full set of resources on the ResNetLab Tutorials page

Nodes, Peers, and the Swarm

A Peer is any connected node on IPFS that relays and/or stores information on the network. You can either find peers using the DHT and Kademlia, or connect to a peer directly. The set of peers that you (as a peer) are directly connected to is called a Swarm.

IPFS Nodes are programs that run on a computer and exchange data with other IPFS nodes. Bootstrap nodes are the well-known peers that a new node contacts when it first enters the IPFS network, so that it can discover other peers.

The DHT

The public Distributed Hash Table (DHT) is the shared record of which peers provide which content. It is used, along with Kademlia, to discover content-addressed data in a peer-to-peer network. The DHT is the mechanism that allows a peer-to-peer network to work without the client-server model that the Web2 internet runs on.

IPFS and the DHT

The DHT is a distributed system for mapping keys to values. In IPFS, the DHT is used as the fundamental component of the content routing system. It maps what the user is looking for (a CID) to the peer that is actually storing the matching content. There are 3 types of key-value pairings that are mapped using the DHT:

  • Provider Records – These map a data identifier (i.e., a multihash) to a peer that has advertised that it has, and is willing to provide you with, that content. This is used by IPFS to find content, and by IPNS to find pubsub peers

  • IPNS Records – These map an IPNS key (i.e., hash of a public key) to an IPNS record (i.e., a signed and versioned pointer to some path like /ipfs/bafyXYZ)

  • Peer Records – These map a peerID to a set of multiaddresses at which the peer may be reached. This is used by IPFS when we know of a peer with content, but do not know its address, and used for manual connections
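The three record types above can be pictured as three separate key-value maps. The following is a toy, single-process sketch only: the real DHT spreads these records across many peers, and the class name `ToyDHT`, its methods, and all keys and addresses below are hypothetical placeholders rather than any IPFS API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ToyDHT:
    """In-memory stand-in for the three kinds of DHT records."""
    provider_records: Dict[str, Set[str]] = field(default_factory=dict)  # multihash -> peer IDs
    ipns_records: Dict[str, str] = field(default_factory=dict)           # IPNS key -> path
    peer_records: Dict[str, List[str]] = field(default_factory=dict)     # peer ID -> multiaddresses

    def add_provider(self, multihash: str, peer_id: str) -> None:
        """Record that a peer advertises the content behind this multihash."""
        self.provider_records.setdefault(multihash, set()).add(peer_id)

    def find_content(self, multihash: str) -> Dict[str, List[str]]:
        """Look up providers, then look up where each provider can be reached."""
        providers = self.provider_records.get(multihash, set())
        return {peer: self.peer_records.get(peer, []) for peer in sorted(providers)}

    def resolve_name(self, ipns_key: str) -> str:
        """Return the path an IPNS key currently points to."""
        return self.ipns_records[ipns_key]


dht = ToyDHT()
dht.peer_records["peer-A"] = ["/ip4/192.0.2.10/tcp/4001"]  # placeholder address
dht.add_provider("multihash-of-report", "peer-A")
dht.ipns_records["ipns-key-of-alice"] = "/ipfs/bafyXYZ"

print(dht.find_content("multihash-of-report"))
print(dht.resolve_name("ipns-key-of-alice"))
```

Note that finding content takes two lookups in this sketch: a provider record tells you who has the data, and a peer record tells you how to reach them.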

Read More in the docs

Kademlia

Kademlia is a distributed hash table for decentralized peer-to-peer computer networks designed by Petar Maymounkov and David Mazières in 2002. It specifies the structure of the network and the exchange of information through node lookups.

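Kademlia makes it easier and quicker to find peers with content by, essentially, measuring how ‘close’ two identifiers are (for example, a peer ID and the key being looked up) and ranking peers by that closeness. Closeness here is the XOR of the two identifiers, not geographic or network distance. Read the paper to learn about Kademlia more in-depth.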
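Kademlia's notion of closeness is the bitwise XOR of two identifiers, interpreted as a number. A minimal sketch of ranking peers by that distance follows; the 4-bit identifiers are made up to keep the arithmetic readable (real identifiers are far longer), and `closest_peers` is an illustrative helper, not a library function.

```python
from typing import List


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance: the XOR of two identifiers, read as an integer."""
    return a ^ b


def closest_peers(target: int, peer_ids: List[int], k: int = 2) -> List[int]:
    """Return the k peer IDs with the smallest XOR distance to the target key."""
    return sorted(peer_ids, key=lambda pid: xor_distance(pid, target))[:k]


target_key = 0b1011
peers = [0b0001, 0b1000, 0b1010, 0b1111]

# Distances to the target: 0b0001 -> 10, 0b1000 -> 3, 0b1010 -> 1, 0b1111 -> 4
print([bin(p) for p in closest_peers(target_key, peers)])  # ['0b1010', '0b1000']
```

A lookup repeatedly asks the closest peers it knows about for peers that are closer still, so each round narrows the search toward the nodes responsible for the key.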

The InterPlanetary Name System (IPNS)

Since IPFS uses CIDs, if you were to share an IPFS address such as /ipfs/QmbezGequPwcsWo8UL4wDF6a8hYwM1hmbzYv2mnKkEWaUp with someone, you would need to give the person a new link every time you update the content, because every change would result in a new CID.

The InterPlanetary Name System (IPNS) solves this issue by creating a link that can be updated. Thus, IPNS bridges the gap to Web3 by providing functionality that Web2 users are already familiar with. For example, when you go to a user’s website, you expect to find that same website at the same link or URL in the future. If that website is updated, you will see those updates as well.

This “link” in IPNS is called a name, and the name is the hash of a public key. The name is associated with a record that contains information about the hash it points to and is signed by the public key’s corresponding private key. This allows new records to be signed and published at any time. Using IPNS means that when someone searches for your website using your name, they will receive the most up-to-date content, as expected in today’s internet. You can learn more about IPNS and how to use it here.

Pubsub + IPNS

Publish/Subscribe (PubSub) is a messaging protocol for communicating quickly with other peers. Whenever a peer publishes a message, subscribing peers receive it almost instantly. This protocol is not specific to IPFS or IPNS; it belongs to libp2p. Paired with IPNS, it allows for quick delivery of records over the network: with PubSub enabled on IPNS, updates to a record can be shared virtually instantly with subscribers.

IPNS is a self-certifying mutable pointer, meaning that any name that gets published is signed by a private key, and anyone else can verify that signature using just the name, because the name is derived from the matching public key.

How it Works

  • Publishing - When you first publish an IPNS name, you create a brand new record containing the CID to point to, a timestamp, and a cryptographic signature of the record, created with the private key, to establish that you are the owner and to certify the name.

  • Searching - Kubo uses the DHT to find peers that hold the queried record. This method has gotten faster over the years, but is still limited by the speed of DHT resolution itself. Alternative transports can be used to improve resolution speed (see PubSub).

  • Validity - A record is valid for 24 hours by default, but you can change its validity to be longer. When someone has your record, but it has expired, they will have to go to the DHT to find a new, valid version of your record.

  • Keys - A name is a hash of a public key in a key pair. By default, the first public key used by Kubo is the same as the one for identifying your peer (PeerID). You can generate new key pairs with Kubo and use them to create additional IPNS records.
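The publishing and validity rules above can be sketched as a small data model. This is a simplified illustration, not Kubo's record format: the signature is left out entirely (the standard library has no public-key signing), the `sequence` field is an assumed way to express the "versioned" aspect of a record, and all names (`ToyIpnsRecord`, `publish`, `resolve`) are hypothetical. The 24-hour default comes from the text above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

DEFAULT_LIFETIME = timedelta(hours=24)  # default validity stated in the text above


@dataclass(frozen=True)
class ToyIpnsRecord:
    value: str             # path the name points to, e.g. "/ipfs/<cid>"
    sequence: int          # assumed version counter, bumped on every publish
    valid_until: datetime  # after this moment the record counts as expired
    # A real record also carries a signature made with the private key (omitted).


def publish(value: str, previous: Optional[ToyIpnsRecord], now: datetime,
            lifetime: timedelta = DEFAULT_LIFETIME) -> ToyIpnsRecord:
    """Create a new record that supersedes the previous one, if any."""
    sequence = 0 if previous is None else previous.sequence + 1
    return ToyIpnsRecord(value, sequence, now + lifetime)


def is_valid(record: ToyIpnsRecord, now: datetime) -> bool:
    return now < record.valid_until


def resolve(cached: Optional[ToyIpnsRecord], now: datetime,
            fetch_from_dht: Callable[[], ToyIpnsRecord]) -> ToyIpnsRecord:
    """Use the cached record while it is valid; otherwise go back to the DHT."""
    if cached is not None and is_valid(cached, now):
        return cached
    return fetch_from_dht()


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
first = publish("/ipfs/bafyOLD", None, t0)    # placeholder paths
second = publish("/ipfs/bafyNEW", first, t0)

# One hour later the cached record is still usable; after 25 hours it is not.
print(resolve(first, t0 + timedelta(hours=1), lambda: second).value)
print(resolve(first, t0 + timedelta(hours=25), lambda: second).value)
```

The `fetch_from_dht` callback stands in for the slow path described under Searching; a longer lifetime means fewer trips down that path, at the cost of stale records lingering longer.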

Subscribe to Content with IPNS

To make IPNS work over PubSub, a persistence layer is added. Now, when you ask for a name, you subscribe to a PubSub topic based on that name and connect to a peer that is following the same name. That peer then sends you the latest version of the record.
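The subscribe-then-catch-up behaviour can be sketched with a toy, in-process message bus. This is only an analogy: in the real network the latest record comes from another peer following the same name, not from a central object, and the class `ToyNamePubSub`, the topic format, and the names below are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class ToyNamePubSub:
    """Toy bus: one topic per name, remembering the latest record per topic."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)
        self._latest: Dict[str, str] = {}  # the "persistence layer"

    @staticmethod
    def topic_for(name: str) -> str:
        # Hypothetical topic format; the real topic derivation differs.
        return "/toy-ipns/" + name

    def subscribe(self, name: str, on_record: Callable[[str], None]) -> None:
        topic = self.topic_for(name)
        self._subscribers[topic].append(on_record)
        # A newcomer is sent the latest known record right away.
        if topic in self._latest:
            on_record(self._latest[topic])

    def publish(self, name: str, record: str) -> None:
        topic = self.topic_for(name)
        self._latest[topic] = record
        for callback in list(self._subscribers[topic]):
            callback(record)


bus = ToyNamePubSub()
early: List[str] = []
late: List[str] = []

bus.subscribe("example-name", early.append)
bus.publish("example-name", "/ipfs/v1")
bus.subscribe("example-name", late.append)  # joins after v1, still receives it
bus.publish("example-name", "/ipfs/v2")

print(early)  # ['/ipfs/v1', '/ipfs/v2']
print(late)   # ['/ipfs/v1', '/ipfs/v2']
```

Without the stored latest record, the late subscriber would see nothing until the next publish, which is the gap the persistence layer is meant to close.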

Watch the Mutable Content video above to learn more about IPNS over PubSub

The Public DHT | LabWeek 2021