Connecting Peers
Understanding Nodes
Although libp2p was originally developed to work with IPFS, you can use it to create p2p applications that have no relationship to IPFS at all. The modularity of libp2p allows you to take whichever pieces you need in your project.
A central term in a p2p network is the node. The concept of a node is pretty broad in software engineering, but in libp2p, it usually refers to a single peer of the network (essentially, a computer that might send or receive messages within the p2p network).
For example, the following snippet uses go-libp2p (the Go implementation of libp2p) to create a node with default settings.
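package main

import (
	"fmt"

	libp2p "github.com/libp2p/go-libp2p"
)

func main() {
	// Create a node (a libp2p host) with the default settings.
	host, err := libp2p.New()
	if err != nil {
		panic(err)
	}
	defer host.Close()
	// Use the host so the program compiles: print the node's peer ID.
	fmt.Println("Peer ID:", host.ID())
}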
Nodes establish connections with other nodes, so we need a way to locate and identify them. Multiaddresses specify the location of a node (e.g. its IP address and port), and peer identifiers specify its identity (i.e. once you reach the node, you can check that it is indeed the node you were looking for).
Peer Identity
Every node in the network is uniquely identified by its peer id, which is generated from the node's public key. A peer id is a multihash: a digest prefixed with a code for the hash function used and the digest length (the same way that multihashes are used in CIDs). How the digest is produced depends on the size of the public key:
- If the public key’s length is more than 42 bytes, then the multihash digest is generated by applying a hash function to the public key.
- If the public key’s length is less than or equal to 42 bytes, then the multihash digest is the public key itself, and the hash function code is the identity hash. The identity hash means that the output digest is the same as the input.
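The two rules above can be sketched in a few lines of Go. This is a minimal illustration, not the go-libp2p implementation: the function name `peerIDMultihash` is hypothetical, the input is assumed to be the already serialized public key, and SHA-256 (multihash code 0x12) is assumed as the hash function for long keys, which the text above does not name.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

const (
	identityCode = 0x00 // multihash code for the identity "hash"
	sha256Code   = 0x12 // multihash code for SHA-256 (assumed hash function)
	maxInlineKey = 42   // threshold, in bytes, from the rules above
)

// peerIDMultihash (hypothetical helper) returns <code><digest length><digest>
// for a serialized public key. Both lengths used here fit in one byte.
func peerIDMultihash(pubKey []byte) []byte {
	if len(pubKey) <= maxInlineKey {
		// Identity hash: the digest is the key itself.
		out := []byte{identityCode, byte(len(pubKey))}
		return append(out, pubKey...)
	}
	// Longer keys are hashed, so the digest has a fixed size.
	digest := sha256.Sum256(pubKey)
	out := []byte{sha256Code, byte(len(digest))}
	return append(out, digest[:]...)
}

func main() {
	shortKey := make([]byte, 36) // placeholder bytes, not a real key
	longKey := make([]byte, 100) // placeholder bytes, not a real key
	fmt.Printf("short key multihash: %x\n", peerIDMultihash(shortKey))
	fmt.Printf("long key multihash:  %x\n", peerIDMultihash(longKey))
}
```

In practice, peer ids are usually shown in a text encoding of these bytes, which this sketch leaves out.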
Multiaddress
Every node is uniquely identified by its peer id, which is derived from its public key. However, we still need to know where and how to establish the connection. A multiaddress allows you to specify the transport for the connection (TCP, UDP, QUIC…) and how to reach the node to establish the connection (IP, DNS…).
The following example specifies a connection to the IPv4 address 127.0.0.1 using the TCP protocol on port 8080.
/ip4/127.0.0.1/tcp/8080
The main advantage of multiaddresses is that they are self-describing. Just by looking at the address, you can tell which protocols are involved in the connection.
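To illustrate what self-describing means, here is a minimal Go sketch that splits a textual multiaddress into protocol/value pairs. It is a simplification, not the real multiaddr library: the `component` type and `parseMultiaddr` function are hypothetical, and the sketch assumes every protocol is followed by exactly one value, which holds for `ip4` and `tcp` but not for every protocol.

```go
package main

import (
	"fmt"
	"strings"
)

// component is one protocol/value pair of a multiaddress (hypothetical type).
type component struct {
	Protocol string
	Value    string
}

// parseMultiaddr splits a textual multiaddress into its components.
// Simplifying assumption: each protocol carries exactly one value.
func parseMultiaddr(addr string) ([]component, error) {
	if !strings.HasPrefix(addr, "/") {
		return nil, fmt.Errorf("multiaddress must start with '/': %q", addr)
	}
	parts := strings.Split(strings.Trim(addr, "/"), "/")
	if len(parts)%2 != 0 {
		return nil, fmt.Errorf("expected protocol/value pairs in %q", addr)
	}
	comps := make([]component, 0, len(parts)/2)
	for i := 0; i < len(parts); i += 2 {
		comps = append(comps, component{Protocol: parts[i], Value: parts[i+1]})
	}
	return comps, nil
}

func main() {
	comps, err := parseMultiaddr("/ip4/127.0.0.1/tcp/8080")
	if err != nil {
		panic(err)
	}
	for _, c := range comps {
		fmt.Printf("protocol=%s value=%s\n", c.Protocol, c.Value)
	}
}
```

The address from the example yields two components: how to reach the node (`ip4`) and which transport to use (`tcp`).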
Establishing a connection
Now that we know how to reach our peers, we can create a connection. When we create a connection in libp2p, a process called connection bootstrapping occurs. This process is responsible for:
- Handshake: establish the raw connection with the peer.
- Security protocol: negotiate the security protocol for the raw connection (e.g. TLS).
- Stream multiplexer: negotiate the stream multiplexer protocol.
Note that libp2p does NOT allow connections that are not secured or multiplexed. When establishing the connection, we will have two actors: the initiator and the responder.
Handshake
The handshake starts the connection with the peer and verifies that the peer can understand the multistream-select protocol.
After the handshake, the two peers must negotiate other protocols (security, multiplexer, application…), so they need a common way to carry out those negotiations. The Multistream protocol (multistream-select) lets peers state which protocols they support and agree on one. Therefore, both peers must first agree on the same version of Multistream.
# Request: Do you understand "/multistream/1.0.0"?
> /multistream/1.0.0
# Response: I do.
+ /multistream/1.0.0
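The exchange above shows only the message text. As far as the multistream-select specification describes it, each message travels on the wire as a varint length prefix, followed by the text and a trailing newline. The text above does not cover this framing, so treat the following Go sketch as an assumption about the wire format; `encodeMessage` is a hypothetical helper name.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeMessage (hypothetical helper) frames one multistream-select message:
// <varint length of text plus newline><text>\n
// This framing is an assumption taken from the multistream-select spec.
func encodeMessage(msg string) []byte {
	payload := msg + "\n"
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, uint64(len(payload)))
	return append(buf[:n], payload...)
}

func main() {
	framed := encodeMessage("/multistream/1.0.0")
	fmt.Printf("length prefix: %#x\n", framed[0])
	fmt.Printf("framed bytes:  %q\n", framed)
}
```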
Security
Now, the two peers will use Multistream to negotiate the security protocol, which allows the connection to be encrypted.
Usually, several protocols are tried during the negotiation, with a preference order. For example, you might try to use TLS first. If your peer does not support TLS, you might try Noise. When a security protocol is accepted, the connection is upgraded to use that protocol.
# Request: Do you understand "/tls/1.0.0"?
> /tls/1.0.0
# Response: I do not
+ na
# Request: Do you understand "/noise/1.0.0"?
> /noise/1.0.0
# Response: I do
+ /noise/1.0.0
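The negotiation above can be modeled as a small loop: the initiator proposes protocols in preference order until the responder echoes one back instead of answering na. The Go sketch below runs entirely in memory and skips the network and wire format; the `respond` and `negotiate` functions are hypothetical names, not go-libp2p APIs.

```go
package main

import (
	"errors"
	"fmt"
)

const na = "na" // answer for an unsupported protocol

// respond plays the responder: echo the proposal if supported, else "na".
func respond(supported map[string]bool, proposal string) string {
	if supported[proposal] {
		return proposal
	}
	return na
}

// negotiate plays the initiator: propose protocols in preference order and
// stop at the first one that the responder echoes back.
func negotiate(preferred []string, supported map[string]bool) (string, error) {
	for _, proto := range preferred {
		if respond(supported, proto) == proto {
			return proto, nil
		}
	}
	return "", errors.New("no protocol in common")
}

func main() {
	// The responder in this example only understands Noise.
	responder := map[string]bool{"/noise/1.0.0": true}

	agreed, err := negotiate([]string{"/tls/1.0.0", "/noise/1.0.0"}, responder)
	if err != nil {
		panic(err)
	}
	fmt.Println("agreed on:", agreed)
}
```

With these inputs the sketch reproduces the exchange above: TLS is refused with na, then Noise is accepted.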
Multiplexer
Once the connection is secured, a multiplexer protocol is negotiated. Multiplexing means opening up multiple distinct logical streams to a peer on a single connection. For example, you may want to interact with a peer on the DHT, but you may also want to ping that peer. Multiplexing lets you do both of these on the same underlying connection.
Because the data will be sent over the same connection, we need an abstraction called a stream. A stream represents the data for a specific protocol in a given connection. Data flowing through the connection is tagged with its corresponding stream id. For example, a single connection might carry streams for two protocols: rendezvous (with protocol id /rendezvous/1.0.0) and identify (with protocol id /ipfs/id/1.0.0).
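To make the stream abstraction concrete, the following Go sketch separates interleaved data from one connection back into per-stream data using the stream id. It is a toy model, not a real multiplexer such as yamux or mplex: the `frame` type and `demux` function are hypothetical, and real multiplexers also handle stream setup, flow control, and closing.

```go
package main

import "fmt"

// frame is a chunk of data tagged with the stream it belongs to
// (hypothetical type; real multiplexers define their own framing).
type frame struct {
	StreamID uint32
	Payload  []byte
}

// demux regroups interleaved frames by stream id, preserving the order of
// the payloads within each stream.
func demux(frames []frame) map[uint32][]byte {
	streams := make(map[uint32][]byte)
	for _, f := range frames {
		streams[f.StreamID] = append(streams[f.StreamID], f.Payload...)
	}
	return streams
}

func main() {
	// Stream 1 and stream 2 share one connection, so their frames interleave.
	frames := []frame{
		{StreamID: 1, Payload: []byte("hel")},
		{StreamID: 2, Payload: []byte("pi")},
		{StreamID: 1, Payload: []byte("lo")},
		{StreamID: 2, Payload: []byte("ng")},
	}
	for id, data := range demux(frames) {
		fmt.Printf("stream %d: %s\n", id, data)
	}
}
```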
Negotiating protocols
After the connection is established (i.e. handshake, security, and multiplexing), peers exchange what application protocols they support. Because multistream-select is used, the procedure to agree on an application protocol is the same as for the security and multiplexing negotiations: the protocol identifier is sent to the peer, and na is answered if the protocol is not supported.
Every supported protocol is assigned a handler, which manages the data for that protocol.
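One way to picture the relationship between protocols and handlers is a lookup table keyed by protocol id. The Go sketch below is an in-memory illustration only: the `registry` type, its methods, and the `/echo/1.0.0` protocol id are hypothetical and are not the go-libp2p API.

```go
package main

import "fmt"

// handler processes the data received for one protocol (hypothetical type).
type handler func(data []byte) string

// registry maps protocol ids to their handlers.
type registry struct {
	handlers map[string]handler
}

func newRegistry() *registry {
	return &registry{handlers: make(map[string]handler)}
}

// setHandler declares that a protocol is supported and how to handle it.
func (r *registry) setHandler(protocolID string, h handler) {
	r.handlers[protocolID] = h
}

// open answers a proposal: the protocol id and its handler if supported,
// otherwise "na" and no handler.
func (r *registry) open(protocolID string) (string, handler) {
	h, ok := r.handlers[protocolID]
	if !ok {
		return "na", nil
	}
	return protocolID, h
}

func main() {
	reg := newRegistry()
	reg.setHandler("/echo/1.0.0", func(data []byte) string { return string(data) })

	if reply, h := reg.open("/echo/1.0.0"); h != nil {
		fmt.Println("reply:", reply, "handled:", h([]byte("hi")))
	}
	reply, _ := reg.open("/unknown/1.0.0")
	fmt.Println("reply:", reply)
}
```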
Peer Discovery
Every peer has an identifier, which is generated from its public key. However, the identifier alone is not enough to find out where the peer is located. For this purpose, libp2p exposes two interfaces: Advertiser and Discoverer.
The Advertiser offers services to the network, which means that it shares the protocols that it supports with the rest of the network. The Discoverer is able to find peers.
There are two main implementations: mDNS and the Kademlia Distributed Hash Table (DHT). The Kademlia DHT is used to discover peers in the IPFS network. Other implementations are also valid as long as they comply with the previously mentioned interfaces.
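The split between the two roles can be sketched with a toy in-memory implementation. Note the assumptions: the interfaces below are deliberately simplified stand-ins that reuse the names Advertiser and Discoverer. The real go-libp2p interfaces have different signatures (they work with contexts, options, and peer address information), and `memoryDiscovery` is a hypothetical type that stands in for mDNS or the DHT.

```go
package main

import "fmt"

// Advertiser announces that a peer offers a service.
// Simplified stand-in, not the go-libp2p signature.
type Advertiser interface {
	Advertise(service, peerID string)
}

// Discoverer finds the peers that offer a service.
// Simplified stand-in, not the go-libp2p signature.
type Discoverer interface {
	FindPeers(service string) []string
}

// memoryDiscovery keeps advertisements in a map (hypothetical type).
type memoryDiscovery struct {
	peers map[string][]string
}

func newMemoryDiscovery() *memoryDiscovery {
	return &memoryDiscovery{peers: make(map[string][]string)}
}

func (m *memoryDiscovery) Advertise(service, peerID string) {
	m.peers[service] = append(m.peers[service], peerID)
}

func (m *memoryDiscovery) FindPeers(service string) []string {
	// Return a copy so callers cannot modify the internal state.
	return append([]string(nil), m.peers[service]...)
}

// Compile-time checks that memoryDiscovery satisfies both interfaces.
var _ Advertiser = (*memoryDiscovery)(nil)
var _ Discoverer = (*memoryDiscovery)(nil)

func main() {
	d := newMemoryDiscovery()
	d.Advertise("/rendezvous/1.0.0", "peerA") // placeholder peer names
	d.Advertise("/rendezvous/1.0.0", "peerB")
	fmt.Println(d.FindPeers("/rendezvous/1.0.0"))
}
```

Any other mechanism could take the place of the map, as long as it offers the same two operations.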