dash-docs/_includes/guide_p2p_network.md

425 lines
20 KiB
Markdown

{% comment %}
This file is licensed under the MIT License (MIT) available on
http://opensource.org/licenses/MIT.
{% endcomment %}
{% assign filename="_includes/guide_p2p_network.md" %}
## P2P Network
{% include helpers/subhead-links.md %}
{% autocrossref %}
The Bitcoin network protocol allows full nodes
([peers][peer]{:#term-peer}{:.term}) to collaboratively maintain a
[peer-to-peer network][network]{:#term-network}{:.term} for block and
transaction exchange. Many SPV clients also use this protocol to connect
to full nodes.
Consensus rules do not cover networking, so Bitcoin programs may use
alternative networks and protocols, such as the [high-speed block relay
network][] used by some miners and the [dedicated transaction
information servers][electrum server] used by some wallets that provide
SPV-level security.
To provide practical examples of the Bitcoin peer-to-peer network, this
section uses Bitcoin Core as a representative full node and [BitcoinJ][]
as a representative SPV client. Both programs are flexible, so only
default behavior is described. Also, for privacy, actual IP addresses
in the example output below have been replaced with [RFC5737][] reserved
IP addresses.
{% endautocrossref %}
### Peer Discovery
{% include helpers/subhead-links.md %}
{% autocrossref %}
When started for the first time, programs don't know the IP
addresses of any active full nodes. In order to discover some IP
addresses, they query one or more DNS names (called [DNS seeds][dns
seed]{:#term-dns-seed}{:.term}) hardcoded into Bitcoin Core and
BitcoinJ. The response to the lookup should include one or more [DNS
A records][] with the IP addresses of full nodes that may accept new
incoming connections. For example, using the [Unix `dig`
command][dig command]:
;; QUESTION SECTION:
;seed.bitcoin.sipa.be. IN A
;; ANSWER SECTION:
seed.bitcoin.sipa.be. 60 IN A 192.0.2.113
seed.bitcoin.sipa.be. 60 IN A 198.51.100.231
seed.bitcoin.sipa.be. 60 IN A 203.0.113.183
[...]
The DNS seeds are maintained by Bitcoin community members: some of them
provide dynamic DNS seed servers which automatically get IP addresses
of active nodes by scanning the network; others provide static DNS
seeds that are updated manually and are more likely to provide IP
addresses for inactive nodes. In either case, nodes are added to the
DNS seed if they run on the default Bitcoin ports of 8333 for mainnet
or 18333 for testnet.
<!-- paragraph below based on Greg Maxwell's email in
http://comments.gmane.org/gmane.comp.bitcoin.devel/5378 -->
DNS seed results are not authenticated and a malicious seed operator or
network man-in-the-middle attacker can return only IP addresses of
nodes controlled by the attacker, isolating a program on the attacker's
own network and allowing the attacker to feed it bogus transactions and
blocks. For this reason, programs should not rely on DNS seeds
exclusively.
Once a program has connected to the network, its peers can begin to send
it `addr`
(address<!--noref-->) messages with the IP addresses and port numbers of
other peers on the network, providing a fully decentralized method of
peer discovery. Bitcoin Core keeps a record of known peers in a
persistent on-disk database which usually allows it to connect directly
to those peers on subsequent startups without having to use DNS seeds.
However, peers often leave the network or change IP addresses, so
programs may need to make several different connection attempts at
startup before a successful connection is made. This can add a
significant delay to the amount of time it takes to connect to the
network, forcing a user to wait before sending a transaction or checking
the status of payment.
<!-- reference for "Bitcoin Core...11 seconds" below:
https://github.com/bitcoin/bitcoin/pull/4559 -->
To avoid this possible delay, BitcoinJ always uses dynamic DNS seeds to
get IP addresses for nodes believed to be currently active.
Bitcoin Core also tries to strike a balance between minimizing delays
and avoiding unnecessary DNS seed use: if Bitcoin Core has entries in
its peer database, it spends up to 11 seconds attempting to connect to
at least one of them before falling back to seeds; if a connection is
made within that time, it does not query any seeds.
<!-- reference for Bitcoin Core behavior described below: search for
"FixedSeeds" in src/net.cpp; BitcoinJ has IPv4 seeds in its chainparams
and a function to use them, but I don't see that function being used in
any of the examples/wallet templates (but I'm not Java fluent, so
maybe PEBKAC). -@harding -->
Both Bitcoin Core and BitcoinJ also include a hardcoded list of IP
addresses and port numbers to several dozen nodes which were active
around the time that particular version of the software was first
released. Bitcoin Core will start attempting to connect to these nodes
if none of the DNS seed servers have responded to a query within 60
seconds, providing an automatic fallback option.
As a manual fallback option, Bitcoin Core also provides several
command-line connection options, including the ability to get a list of
peers from a specific node by IP address, or to make a persistent
connection to a specific node by IP address. See the `-help` text for
details. BitcoinJ can be programmed to do the same thing.
**Resources:** [Bitcoin Seeder][], the program run by several of the
seeds used by Bitcoin Core and BitcoinJ. The Bitcoin Core [DNS Seed
Policy][]. The hardcoded list of IP addresses used by Bitcoin Core and
BitcoinJ is generated using the [makeseeds script][].
{% endautocrossref %}
### Connecting To Peers
{% include helpers/subhead-links.md %}
{% autocrossref %}
Connecting to a peer is done by sending a `version` message, which
contains your version number, block, and current time to the remote
node. The remote node responds with its own `version` message. Then both
nodes send a `verack` message to the other node to indicate the
connection has been established.
Once connected, the client can send to the remote node `getaddr` and `addr` messages to gather additional peers.
In order to maintain a connection with a peer, nodes by default will send a message to peers before 30 minutes of inactivity. If 90 minutes pass without a message being received by a peer, the client will assume that connection has closed.
{% endautocrossref %}
### Initial Block Download
{% include helpers/subhead-links.md %}
{% autocrossref %}
Before a full node can validate unconfirmed transactions and
recently-mined blocks, it must download and validate all blocks from
block 1 (the block after the hardcoded genesis block) to the current tip
of the best block chain. This is the Initial Block Download (IBD) or
initial sync.
Although the word "initial" implies this method is only used once, it
can also be used any time a large number of blocks need to be
downloaded, such as when a previously-caught-up node has been offline
for a long time. In this case, a node can use the IBD method to download
all the blocks which were produced since the last time it was online.
Bitcoin Core uses the IBD method any time the last block on its local
best block chain has a block header time more than 24 hours in the past.
Bitcoin Core 0.10.0 will also perform IBD if its local best block chain is
more than 144 blocks lower than its local best headers chain (that is,
the local block chain is more than about 24 hours in the past).
{% endautocrossref %}
#### Blocks-First
{% include helpers/subhead-links.md %}
{% autocrossref %}
Bitcoin Core (up until version [0.9.3][bitcoin core 0.9.3]) uses a
simple initial block download (IBD) method we'll call *blocks-first*.
The goal is to download the blocks from the best block chain in sequence.
The first time a node is started, it only has a single block in its
local best block chain---the hardcoded genesis block (block 0). This
node chooses a remote peer, called the sync node, and sends it the
`getblocks` message illustrated below.
![First GetBlocks Message Sent During IBD](/img/dev/en-ibd-getblocks.svg)
In the header hashes field of the `getblocks` message, this new node
sends the header hash of the only block it has, the genesis block
(6fe2...0000 in internal byte order). It also sets the stop hash field
to all zeroes to request a maximum-size response.
Upon receipt of the `getblocks` message, the sync node takes the first
(and only) header hash and searches its local best block chain for a
block with that header hash. It finds that block 0 matches, so it
replies with 500 block inventories (the maximum response to a
`getblocks` message) starting from block 1. It sends these inventories
in the `inv` message illustrated below.
![First Inv Message Sent During IBD](/img/dev/en-ibd-inv.svg)
Inventories are unique identifiers for information on the network. Each
inventory contains a type field and the unique identifier for an
instance of the object. For blocks, the unique identifier is a hash of
the block's header.
The block inventories appear in the `inv` message in the same order they
appear in the block chain, so this first `inv` message contains
inventories for blocks 1 through 501. (For example, the hash of block 1
is 4860...0000 as seen in the illustration above.)
The IBD node uses the received inventories to request 128 blocks from
the sync node in the `getdata` message illustrated below.
![First GetData Message Sent During IBD](/img/dev/en-ibd-getdata.svg)
It's important to blocks-first nodes that the blocks be requested and
sent in order because each block header references the header hash of
the preceding block. That means the IBD node can't fully validate a
block until its parent block has been received. Blocks that can't be
validated because their parents haven't been received are called orphan
blocks; a subsection below describes them in more detail.
Upon receipt of the `getdata` message, the sync node replies with each
of the blocks requested. Each block is put into serialized block format
and sent in a separate `block` message. The first `block` message sent
(for block 1) is illustrated below.
![First Block Message Sent During IBD](/img/dev/en-ibd-block.svg)
The IBD node downloads each block, validates it, and then requests the
next block it hasn't requested yet, maintaining a queue of up to 128
blocks to download. When it has requested every block for which it has
an inventory, it sends another `getblocks` message to the sync node
requesting the inventories of up to 500 more blocks. This second
`getblocks` message contains multiple header hashes as illustrated
below:
![Second GetBlocks Message Sent During IBD](/img/dev/en-ibd-getblocks2.svg)
Upon receipt of the second `getblocks` message, the sync node searches
its local best block chain for a block that matches one of the header
hashes in the message, trying each hash in the order they were received.
If it finds a matching hash, it replies with 500 block inventories
starting with the next block from that point. But if there is no
matching hash (besides the stopping hash), it assumes the only block the
two nodes have in common is block 0 and so it sends an `inv` starting with
block 1 (the same `inv` message seen several illustrations above).
This repeated search allows the sync node to send useful inventories even if
the IBD node's local block chain forked from the sync node's local block
chain. This fork detection becomes increasingly useful the closer the
IBD node gets to the tip of the block chain.
When the IBD node receives the second `inv` message, it will request
those blocks using `getdata` messages. The sync node will respond with
`block` messages. Then the IBD node will request more inventories with
another `getblocks` message---and the cycle will repeat until the IBD
node is synced to the tip of the block chain. At that point, the node
will accept blocks sent through the regular block broadcasting described
in a later subsection.
{% endautocrossref %}
##### Blocks-First Advantages & Disadvantages
{:.no_toc}
{% include helpers/subhead-links.md %}
{% autocrossref %}
The primary advantage of blocks-first IBD is its simplicity. The primary
disadvantage is that the IBD node relies on a single sync node for all
of its downloading. This has several implications:
* **Speed Limits:** All requests are made to the sync node, so if the
sync node has limited upload bandwidth, the IBD node will have slow
download speeds. Note: if the sync node goes offline, Bitcoin Core
will continue downloading from another node---but it will still only
download from a single sync node at a time.
* **Download Restarts:** The sync node can send a non-best (but
otherwise valid) block chain to the IBD node. The IBD node won't be
able to identify it as non-best until the initial block download nears
completion, forcing the IBD node to restart its block chain download
over again from a different node. Bitcoin Core ships with several
block chain checkpoints at various block heights selected by
developers to help an IBD node detect that it is being fed an
alternative block chain history---allowing the IBD node to restart
its download earlier in the process.
* **Disk Fill Attacks:** Closely related to the download restarts, if
the sync node sends a non-best (but otherwise valid) block chain, the
chain will be stored on disk, wasting space and possibly filling up
the disk drive with useless data.
* **High Memory Use:** Whether maliciously or by accident, the sync node
can send blocks out of order, creating orphan blocks which can't be
validated until their parents have been received and validated.
Orphan blocks are stored in memory while they await validation,
which may lead to high memory use.
All of these problems are addressed in part or in full by the
headers-first IBD method used in Bitcoin Core 0.10.0.
**Resources:** The table below summarizes the messages mentioned
throughout this subsection. The links in the message field will take you
to the reference page for that message.
| **Message** | [`getblocks`][getblocks message] | [`inv`][inv message] | [`getdata`][getdata message] | [`block`][block message]
| **From→To** | IBD→Sync | Sync→IBD | IBD→Sync | Sync→IBD
| **Payload** | One or more header hashes | Up to 500 block inventories (unique identifiers) | One or more block inventories | One serialized block
{% endautocrossref %}
### Block Broadcasting
{% include helpers/subhead-links.md %}
{% autocrossref %}
At the start of a connection with a peer, both nodes send `getblocks` messages containing the hash of the latest known block. If a peer believes they have newer blocks or a longer chain, that peer will send an `inv` message which includes a list of up to 500 hashes of newer blocks, stating that it has the longer chain. The receiving node would then request these blocks using the command `getdata`, and the remote peer would reply via `block`<!--noref--> messages. After all 500 blocks have been processed, the node can request another set with `getblocks`, until the node is caught up with the network. Blocks are only accepted when validated by the receiving node.
New blocks are also discovered as miners publish their found blocks, and these messages are propagated in a similar manner. Through previously established connections, an `inv` message is sent with the new block hashed, and the receiving node requests the block via the `getdata` message.
{% endautocrossref %}
#### Orphan Blocks
{% include helpers/subhead-links.md %}
{% autocrossref %}
Blocks-first nodes may download orphan blocks---blocks whose previous
block header hash field refers to a block header this node
hasn't seen yet. In other words, orphan blocks have no known parent
(unlike stale blocks, which have known parents but which aren't part of
the best block chain).
![Difference Between Orphan And Stale Blocks](/img/dev/en-orphan-stale-definition.svg)
When a blocks-first node downloads an orphan block, it will not validate
it. Instead, it will send a `getblocks` message to the node which sent
the orphan block; the broadcasting node will respond with an `inv` message
containing inventories of any blocks the downloading node is missing (up
to 500); the downloading node will request those blocks with a `getdata`
message; and the broadcasting node will send those blocks with a `block`
message. The downloading node will validate those blocks, and once the
parent of the former orphan block has been validated, it will validate
the former orphan block.
{% endautocrossref %}
### Transaction Broadcasting
{% include helpers/subhead-links.md %}
{% autocrossref %}
In order to send a transaction to a peer, an `inv` message is sent. If a `getdata` response message is received, the transaction is sent using `tx`. The peer receiving this transaction also forwards the transaction in the same manner, given that it is a valid transaction.
{% endautocrossref %}
#### Memory Pool
{% include helpers/subhead-links.md %}
{% autocrossref %}
Full peers may keep track of unconfirmed transactions which are eligible to
be included in the next block. This is essential for miners who will
actually mine some or all of those transactions, but it's also useful
for any peer who wants to keep track of unconfirmed transactions, such
as peers serving unconfirmed transaction information to SPV clients.
Because unconfirmed transactions have no permanent status in Bitcoin,
Bitcoin Core stores them in non-persistent memory, calling them a memory
pool or mempool. When a peer shuts down, its memory pool is lost except
for any transactions stored by its wallet. This means that never-mined
unconfirmed transactions tend to slowly disappear from the network as
peers restart or as they purge some transactions to make room in memory
for others.
Transactions which are mined into blocks that later become stale blocks may be
added back into the memory pool. These re-added transactions may be
re-removed from the pool almost immediately if the replacement blocks
include them. This is the case in Bitcoin Core, which removes stale
blocks from the chain one by one, starting with the tip (highest block).
As each block is removed, its transactions are added back to the memory
pool. After all of the stale blocks are removed, the replacement
blocks are added to the chain one by one, ending with the new tip. As
each block is added, any transactions it confirms are removed from the
memory pool.
SPV clients don't have a memory pool for the same reason they don't
relay transactions. They can't independently verify that a transaction
hasn't yet been included in a block and that it only spends UTXOs, so
they can't know which transactions are eligible to be included in the
next block.
{% endautocrossref %}
### Misbehaving Nodes
{% include helpers/subhead-links.md %}
{% autocrossref %}
Take note that for both types of broadcasting, mechanisms are in place to punish misbehaving peers who take up bandwidth and computing resources by sending false information. If a peer gets a banscore above the `-banscore=<n>` threshold, he will be banned for the number of seconds defined by `-bantime=<n>`, which is 86,400 by default (24 hours).
{% endautocrossref %}
### Alerts
{% include helpers/subhead-links.md %}
{% autocrossref %}
In case of a bug or attack,
the Bitcoin Core developers provide a
[Bitcoin alert service](https://bitcoin.org/en/alerts) with an RSS feed
and users of Bitcoin Core can check the error field of the `getinfo` RPC
results to get currently active alerts for their specific version of
Bitcoin Core.
These messages are aggressively broadcast using the `alert` message, being sent to each peer upon connect for the duration of the alert.
These messages are signed by a specific ECDSA private key that only a small number of developers control.
**Resource:** More details about the structure of messages and a complete list of message types can be found in
the [P2P reference section][section P2P reference].
{% endautocrossref %}