dash-docs/_includes/ref_p2p_networking.md
David A. Harding 1634212dd5
Dev Docs: Add P2P Messages That Request Or Reply With Data
Adds to the devel reference page detailed documentation on the following
messages: block, getblocks, getdata, getheaders, headers, inv, mempool,
merkleblock, notfound, and tx.

Adds to the devel examples page an example of requesting and parsing a
merkleblock message.

Adds to the devel docs overview pages links to the above two new
P2P sections.

Tweaks the autocrossref plugin ignore pattern to not crossref in the
middle of a GIF image name; this allows the inclusion of animated GIFs.
2014-11-12 12:39:36 -05:00

27 KiB
Raw Blame History

P2P Network

{% autocrossref %}

This section describes the Bitcoin P2P network protocol (but it is not a specification). It does not describe the discontinued direct [IP-to-IP payment protocol][], the [BIP70 payment protocol][payment protocol], the [GetBlockTemplate mining protocol][section getblocktemplate], or any network protocol never implemented in an official version of Bitcoin Core.

All peer-to-peer communication occurs entirely over TCP.

Note: unless their description says otherwise, all multi-byte integers mentioned in this section are transmitted in little-endian order.

{% endautocrossref %}

Constants And Defaults

{% autocrossref %}

The following constants and defaults are taken from Bitcoin Core's [chainparams.cpp][core chainparams.cpp] source code file.

Network Default Port [Start String][]{:#term-start-string}{:.term} Max nBits
Mainnet 8333 0xf9beb4d9 0x1d00ffff
Testnet 18333 0x0b110907 0x1d00ffff
Regtest 18444 0xfabfb5da 0x207fffff

Note: the testnet start string and nBits above are for testnet3; the original testnet used a different string and higher (less difficult) nBits.

Command line parameters can change what port a node listens on (see -help). Start strings are hardcoded constants that appear at the start of all messages sent on the Bitcoin network; they may also appear in data files such as Bitcoin Core's block database. The nBits displayed above are in big-endian order; they're sent over the network in little-endian order.

Bitcoin Core's [chainparams.cpp][core chainparams.cpp] also includes other constants useful to programs, such as the hash of the genesis blocks for the different networks as well as the alert keys for mainnet and testnet.

{% endautocrossref %}

Protocol Versions

{% autocrossref %}

The table below lists some notable versions of the P2P network protocol, with the most recent versions listed first. (If you would like to help document older protocol versions, please [open an issue][docs issue].)

Version Implementation Major Changes
70002 Bitcoin Core 0.9.0 (March 2014) Added reject message (see [BIP61][]). Send multiple inv messages in response to mempool message if necessary.
70001 Bitcoin Core 0.8.0 (February 2013) Added filterload message, filteradd message, filterclear message, and merkleblock message; also added relay field to version message and MSG_FILTERED_BLOCK inventory type to getdata message (see [BIP37][]).
60002 Bitcoin Core 0.7.0 (September 2012) Added mempool message and extended getdata message to allow download of memory pool transactions (see [BIP35][]).
60001 Bitcoin Core 0.6.1 (May 2012) Added pong message (see [BIP31][]).
60000 Bitcoin Core 0.6.0 (March 2012) Separated protocol version from Bitcoin Core version (see [BIP14][]).

{% endautocrossref %}

Message Headers

{% autocrossref %}

All messages in the network protocol use the same container format, which provides a required multi-field header and an optional payload. The header format is:

Bytes Name Data Type Description
4 start string char[4] Magic bytes indicating the originating network; used to seek to next message when stream state is unknown.
12 command name char[12] ASCII string which identifies what message type is contained in the payload. Followed by nulls (0x00) to pad out byte count; for example: version\0\0\0\0\0.
4 payload size uint32_t Number of bytes in payload. The current maximum number of bytes ([MAX_SIZE][max_size]) allowed in the payload by Bitcoin Core is 32 MiB---messages with a payload size larger than this will be dropped or rejected.
4 checksum char[4] First 4 bytes of SHA256(SHA256(payload)) in internal byte order.

If payload is empty, as in verack and getaddr messages, the checksum is always 0x5df6e0e2 (SHA256(SHA256(<empty string>))).

The checksum field was introduced in protocol version 209; Bitcoin Core 0.2.9, May 2010.

The following example is an annotated hex dump of a mainnet message header from a verack message which has no payload.

{% highlight text %} f9beb4d9 ................... Start string: Mainnet 76657261636b000000000000 ... Command name: verack + null padding 00000000 ................... Byte count: 0 5df6e0e2 ................... Checksum: SHA256(SHA256()) {% endhighlight %}

{% endautocrossref %}

Data Messages

{% autocrossref %}

The following network messages all request or provide data related to transactions and blocks.

Overview Of P2P Protocol Data Request And Reply Messages

Many of the data messages use [inventories][inventory]{:#term-inventory}{:.term} as unique identifiers for for transactions and blocks. Inventories have a simple 36-byte structure:

Bytes Name Data Type Description
4 type identifier uint32_t The type of object which was hashed. See list of type identifiers below.
32 hash char[32] SHA256(SHA256()) hash of the object in internal byte order.

The currently-available type identifiers are:

Type Identifier Name Description
1 [MSG_TX][msg_tx]{:#term-msg_tx}{:.term} The hash is a TXID.
2 [MSG_BLOCK][msg_block]{:#term-msg_block}{:.term} The hash is of a block header.
3 [MSG_FILTERED_BLOCK][msg_filtered_block]{:#term-msg_filtered_block}{:.term} The hash is of a block header; identical to MSG_BLOCK. When used in a getdata message, this indicates the response should be a merkleblock message rather than a block message (but this only works if a bloom filter was previously configured). Only for use in getdata messages.

Type identifier zero and type identifiers greater than three are reserved for future implementations. Bitcoin Core ignores all inventories with one of these unknown types.

{% endautocrossref %}

Block

{% autocrossref %}

The block message transmits a single serialized block in the format described in the [serialized blocks section][section serialized blocks]. See that section for an example hexdump.

It is sent in reply to a getdata message which had an inventory type of MSG_BLOCK and the header hash of the particular block being requested.

{% endautocrossref %}

GetBlocks

{% autocrossref %}

The getblocks message requests an inv message that provides block header hashes starting from a particular point in the block chain. It allows a peer which has been disconnected or started for the first time to get the data it needs to request the blocks it hasn't seen.

Peers which have been disconnected may have stale blocks in their locally-stored block chain, so the getblocks message allows the requesting peer to provide the receiving peer with multiple header hashes at various heights on their local chain. This allows the receiving peer to find, within that list, the last header hash they had in common and reply with all subsequent header hashes.

Note: the receiving peer itself may respond with an inv message containing header hashes of stale blocks. It is up to the requesting peer to poll all of its peers to find the best block chain.

If the receiving peer does not find a common header hash within the list, it will assume the last common block was the genesis block (block zero), so it will reply with in inv message containing header hashes starting with block one (the first block after the genesis block).

Bytes Name Data Type Description
4 version uint32_t The protocol version number; the same as sent in the version message.
Varies hash count compactSize uint The number of header hashes provided not including the stop hash. There is no limit except that the byte size of the entire message must be below the [MAX_SIZE][max_size] limit; typically from 1 to 200 hashes are sent.
Varies block header hashes char[32] One or more block header hashes (32 bytes each) in internal byte order. Hashes should be provided in reverse order of block height, so highest-height hashes are listed first and lowest-height hashes are listed last.
32 stop hash char[32] The header hash of the last header hash being requested; set to all zeroes to request an inv message with 500 header hashes (the maximum which will ever be sent as a reply to this message).

The following annotated hexdump shows a getblocks message. (The message header has been omitted.)

{% highlight text %} 71110100 ........................... Protocol version: 70001 02 ................................. Hash count: 2

d39f608a7775b537729884d4e6633bb2 105e55a16a14d31b0000000000000000 ... Hash #1

5c3e6403d40837110a2e8afb602b1c01 714bda7ce23bea0a0000000000000000 ... Hash #2

00000000000000000000000000000000 00000000000000000000000000000000 ... Stop hash {% endhighlight %}

{% endautocrossref %}

GetData

{% autocrossref %}

The getdata message requests one or more data objects from another node. The objects are requested by an inventory, which the requesting node typically previously received by way of an inv message.

The response to a getdata message can be a tx message, block message, merkleblock message, or notfound message.

This message cannot be used to request arbitrary data, such as historic transactions no longer in the memory pool or relay set. Full nodes may not even be able to provide older blocks if they've pruned old transactions from their block database. For this reason, the getdata message should usually only be used to request data from a node which previously advertised it had that data by sending an inv message.

The format and maximum size limitations of the getdata message are identical to the inv message; only the message header differs.

{% endautocrossref %}

GetHeaders

{% autocrossref %}

The getheaders message requests a headers message that provides block headers starting from a particular point in the block chain. It allows a peer which has been disconnected or started for the first time to get the headers it hasnt seen yet.

The getheaders message is nearly identical to the getblocks message, with one minor difference: the inv reply to the getblocks message will include no more than 500 block header hashes; the headers reply to the getheaders message will include as many as 2,000 block headers.

Headers

The headers message sends one or more block headers to a node which previously requested certain headers with a getheaders message.

Bytes Name Data Type Description
Varies count compactSize uint Number of block headers up to a maximum of 2,000.
Varies headers block_header Block headers: each 80-byte block header is in the format described in the [block headers section][block header] with an additional 0x00 suffixed. This 0x00 is called the transaction count, but because the headers message doesn't include any transactions, the transaction count is always zero.

The following annotated hexdump shows a headers message. (The message header has been omitted.)

{% highlight text %} 01 ................................. Header count: 1

02000000 ........................... Block version: 2 b6ff0b1b1680a2862a30ca44d346d9e8 910d334beb48ca0c0000000000000000 ... Hash of previous block's header 9d10aa52ee949386ca9385695f04ede2 70dda20810decd12bc9b048aaab31471 ... Merkle root 24d95a54 ........................... Unix time: 1415239972 30c31b18 ........................... Target (bits) fe9f0864 ........................... Nonce

00 ......... Transaction Count (0x00) {% endhighlight %}

{% endautocrossref %}

Inv

{% autocrossref %}

The inv message (inventory message) transmits one or more inventories of objects known to the transmitting peer. It can be sent unsolicited to announce new transactions or blocks, or it can be sent in reply to a getblocks message or mempool message.

The receiving peer can compare the inventories from an inv message against the inventories it has already seen, and then use a follow-up message to request unseen objects.

Bytes Name Data Type Description
Varies count compactSize uint The number of inventory entries.
Varies inventory inventory One or more inventory entries up to a maximum of 50,000 entries.

The following annotated hexdump shows an inv message with two inventory entries. (The message header has been omitted.)

{% highlight text %} 02 ................................. Count: 2

01000000 ........................... Type: MSG_TX de55ffd709ac1f5dc509a0925d0b1fc4 42ca034f224732e429081da1b621f55a ... Hash (TXID)

01000000 ........................... Type: MSG_TX 91d36d997037e08018262978766f24b8 a055aaf1d872e94ae85e9817b2c68dc7 ... Hash (TXID) {% endhighlight %}

{% endautocrossref %}

MemPool

{% autocrossref %}

The mempool message requests the TXIDs of transactions that the receiving node has verified are valid but which have not yet appeared in a block. That is, transactions which are in the receiving node's memory pool. The response to the mempool message is one or more inv messages containing the TXIDs in the usual inventory format. The mempool message was introduced in protocol version 60002 as implemented in Bitcoin Core 0.7.0 (September 2012).

Sending the mempool message is mostly useful when a program first connects to the network. Full nodes can use it to quickly gather most or all of the unconfirmed transactions available on the network; this is especially useful for miners trying to gather transactions for their transaction fees. SPV clients can set a filter before sending a mempool to only receive transactions that match that filter; this allows a recently-started client to get most or all unconfirmed transactions related to its wallet.

The inv response to the mempool message is, at best, one node's view of the network---not a complete list of unconfirmed transactions on the network. Here are some additional reasons the list might not be complete:

  • Before Bitcoin Core 0.9.0, the response to the mempool message was only one inv message. An inv message is limited to 50,000 inventories, so a node with a memory pool larger than 50,000 entries would not send everything. Later versions of Bitcoin Core send as many inv messages as needed to reference its complete memory pool.

  • The mempool message is not currently fully compatible with the filterload message's BLOOM_UPDATE_ALL and BLOOM_UPDATE_P2PUBKEY_ONLY flags. Mempool transactions are not sorted like in-block transactions, so a transaction (tx2) spending an output can appear before the transaction (tx1) containing that output, which means the automatic filter update mechanism won't operate until the second-appearing transaction (tx1) is seen---missing the first-appearing transaction (tx2). It has been proposed in [Bitcoin Core issue #2381][] that the transactions should be sorted before being processed by the filter.

There is no payload in a mempool message. See the [message header section][message header] for an example of a message without a payload.

{% endautocrossref %}

MerkleBlock

{% autocrossref %}

The merkleblock message is a reply to a getdata message which requested a block using the inventory type MSG_MERKLEBLOCK. It is only part of the reply: if any matching transactions are found, they will be sent separately as tx messages.

This message is part of the bloom filters described in BIP37, added in protocol version 70001 and implemented in Bitcoin Core 0.8.0 (February 2013).

If a filter has been previous set with the filterload message, the merkleblock message will contain the TXIDs of any transactions in the requested block that matched the filter, as well as any parts of the block's merkle tree necessary to connect those transactions to the block header's merkle root. The message also contains a complete copy of the block header to allow the client to hash it and confirm its proof of work.

Bytes Name Data Type Description
80 block header block_header The block header in the format described in the [block header section][block header].
4 transaction count uint32_t The number of transactions in the block (including ones that don't match the filter).
Varies hash count compactSize uint The number of hashes in the following field.
Varies hashes char[32] One or more hashes of both transactions and merkle nodes in internal byte order. Each hash is 32 bits.
Varies flag byte count compactSize uint The number of flag bytes in the following field.
Varies flags byte[] A sequence of bits packed eight in a byte with the least significant bit first. May be padded to the nearest byte boundary but must not contain any more bits than that. Used to assign the hashes to particular nodes in the merkle tree as described below.

The annotated hexdump below shows a merkleblock message which corresponds to the examples below. (The message header has been omitted.)

{% highlight text %} 01000000 ........................... Block version: 1 82bb869cf3a793432a66e826e05a6fc3 7469f8efb7421dc88067010000000000 ... Hash of previous block's header 7f16c5962e8bd963659c793ce370d95f 093bc7e367117b3c30c1f8fdd0d97287 ... Merkle root 76381b4d ........................... Time: 1293629558 4c86041b ........................... nBits: 0x04864c * 256**(0x1b-3) 554b8529 ........................... Nonce

07000000 ........................... Transaction count: 7 04 ................................. Hash count: 4

3612262624047ee87660be1a707519a4 43b1c1ce3d248cbfc6c15870f6c5daa2 ... Hash #1 019f5b01d4195ecbc9398fbf3c3b1fa9 bb3183301d7a1fb3bd174fcfa40a2b65 ... Hash #2 41ed70551dd7e841883ab8f0b16bf041 76b7d1480e4f0af9f3d4c3595768d068 ... Hash #3 20d2a7bc994987302e5b1ac80fc425fe 25f8b63169ea78e68fbaaefa59379bbf ... Hash #4

01 ................................. Flag bytes: 1 1d ................................. Flags: 1 0 1 1 1 0 0 0 {% endhighlight %}

Note: when fully decoded, the above merkleblock message provided the TXID for a single transaction that matched the filter. In the network traffic dump this output was taken from, the full transaction belonging to that TXID was sent immediately after the the merkleblock message as a tx message.

Parsing A MerkleBlock

As seen in the annotated hexdump above, the merkleblock message provides three special data types: a transaction count, a list of hashes, and a list of one-bit flags.

You can use the transaction count to construct an empty merkle tree. We'll call each entry in the tree a node; on the bottom are TXID nodes---the hashes for these nodes are TXIDs; the remaining nodes (including the merkle root) are non-TXID nodes---they may actually have the same hash as a TXID, but we treat them differently.

Example Of Parsing A MerkleBlock Message

Keep the hashes and flags in the order they appear in the merkleblock message. When we say "next flag" or "next hash", we mean the next flag or hash on the list, even if it's the first one we've used so far.

Start with the merkle root node and the first flag. The table below describes how to evaluate a flag based on whether the node being processed is a TXID node or a non-TXID node. Once you apply a flag to a node, never apply another flag to that same node or reuse that same flag again.

Flag TXID Node Non-TXID Node
0 Use the next hash as this node's TXID, but this transaction didn't match the filter. Use the next hash as this node's hash. Don't process any descendant nodes.
1 Use the next hash as this node's TXID, and mark this transaction as matching the filter. The hash needs to be computed. Process the left child node to get its hash; process the right child node to get its hash; then concatenate the two hashes as 64 raw bytes and hash them to get this node's hash.

Any time you descend into a node for the first time, evaluate the next flag. Never use a flag at any other time.

When processing a child node, you may need to process its children (the grandchildren of the original node) or further-descended nodes before returning to the parent node. This is expected---keep processing depth first until you reach a TXID node or a non-TXID node with a flag of 0.

After you process a TXID node or a non-TXID node with a flag of 0, stop processing flags and begin to ascend the tree. As you ascend, compute the hash of any nodes for which you now have both child hashes or for which you now have the sole child hash. See the [merkle tree section][section merkle trees] for hashing instructions. If you reach a node where only the left hash is known, descend into its right child (if present) and further descendants as necessary.

However, if you find a node whose left and right children both have the same hash, fail. This is related to CVE-2012-2459.

Continue descending and ascending until you have enough information to obtain the hash of the merkle root node. If you run out of flags or hashes before that condition is reached, fail. Then perform the following checks (order doesn't matter):

  • Fail if there are unused hashes in the hashes list.

  • Fail if there are unused flag bits---except for the minimum number of bits necessary to pad up to the next full byte.

  • Fail unless the hash of the merkle root node is identical to the merkle root in the block header.

  • Fail if the block header is invalid. Remember to ensure that the hash of the header is less than or equal to the target threshold encoded by the nBits header field. Your program should also, of course, attempt to ensure the header belongs to the best block chain and that the user knows how many confirmations this block has.

For a detailed example of parsing a merkleblock message, please see the corresponding [merkleblock examples section][section merkleblock example].

Creating A MerkleBlock

It's easier to understand how to create a merkleblock message after you understand how to parse an already-created message, so we recommend you read the parsing section above first.

Create a complete merkle tree with TXIDs on the bottom row and all the other hashes calculated up to the merkle root on the top row. For each transaction that matches the filter, track its TXID node and all of its ancestor nodes.

Example Of Creating A MerkleBlock Message

Start processing the tree with the merkle root node. The table below describes how to process both TXID nodes and non-TXID nodes based on whether the node is a match, a match ancestor, or neither a match nor a match ancestor.

TXID Node Non-TXID Node
Neither Match Nor Match Ancestor Append a 0 to the flag list; append this node's TXID to the hash list. Append a 0 to the flag list; append this node's hash to the hash list. Do not descend into its child nodes.
Match Or Match Ancestor Append a 1 to the flag list; append this node's TXID to the hash list. Append a 1 to the flag list; process the left child node. Then, if the node has a right child, process the right child. Do not append a hash to the hash list for this node.

Any time you descend into a node for the first time, a flag should be appended to the flag list. Never put a flag on the list at any other time, except when processing is complete to pad out the flag list to a byte boundary.

When processing a child node, you may need to process its children (the grandchildren of the original node) or further-descended nodes before returning to the parent node. This is expected---keep processing depth first until you reach a TXID node or a node which is neither a TXID nor a match ancestor.

After you process a TXID node or a node which is neither a TXID nor a match ancestor, stop processing and begin to ascend the tree until you find a node with a right child you haven't processed yet. Descend into that right child and begin processing again.

After you fully process the merkle root node according to the instructions in the table above, processing is complete. Pad your flag list to a byte boundary and construct the merkleblock message using the template near the beginning of this subsection.

{% endautocrossref %}

NotFound

{% autocrossref %}

The notfound message is a reply to a getdata message which requested an object the receiving node does not have available for relay. (Nodes are not expected to relay historic transactions which are no longer in the memory pool or relay set. Nodes may also have pruned spent transactions from older blocks, making them unable to send those blocks.)

The format and maximum size limitations of the notfound message are identical to the inv message; only the message header differs.

{% endautocrossref %}

Tx

{% autocrossref %}

The tx message transmits a single transaction in the raw transaction format. It can be sent in a variety of situations;

  • Transaction Response: Bitcoin Core and BitcoinJ will send it in response to a getdata message that requests the transaction with an inventory type of MSG_TX.

  • MerkleBlock Response: Bitcoin Core will send it in response to a getdata message that requests a merkleblock with an inventory type of MSG_MERKLEBLOCK. (This is in addition to sending a merkleblock message.) Each tx message in this case provides a matched transaction from that block.

  • Unsolicited: BitcoinJ will send a tx message unsolicited for transactions it originates.

For an example hexdump of the raw transaction format, see the [raw transaction section][raw format].

{% endautocrossref %}