Dev Docs: Describe Serialized Block Header And Block Format

* Replace current description of the block header with a better
description.

    * Describe the various version numbers.

    * Describe how the merkle root is constructed.

    * Describe how nBits is parsed and how to correctly create it to
      avoid negative values.

* Describe the serialized block format used to calculate max block size.
David A. Harding 2014-11-06 21:45:12 -05:00
parent 1863fad011
commit a8f8f750c8
GPG key ID: 4B29C30FF29EC4B7
13 changed files with 872 additions and 82 deletions

@@ -2,94 +2,214 @@
The following subsections briefly document core block details.
### Block Contents
### Block Headers
{% autocrossref %}
This section describes [version 2 blocks][v2 block]{:#term-v2-block}{:.term}, which are any blocks with a
block height greater than 227,835. (Version 1 and version 2 blocks were
intermingled for some time before that point.) Future block versions may
break compatibility with the information in this section. You can determine
the version of any block by checking its `version` field using
bitcoind RPC calls.
Block headers are serialized in the 80-byte format described below and then
hashed as part of Bitcoin's proof-of-work algorithm, making the
serialized header format part of the consensus rules.
As of version 2 blocks, each block consists of four root elements:
| Bytes | Name | Data Type | Description
|-------|---------------------|-----------|----------------
| 4 | version | uint32_t | The [block version][]{:#term-block-version}{:.term} number indicates which set of block validation rules to follow. See the list of block versions below.
| 32 | previous block hash | char[32] | A SHA256(SHA256()) hash in internal byte order of the previous block's header. This ensures no previous block can be changed without also changing this block's header.
| 32 | merkle root hash | char[32] | A SHA256(SHA256()) hash in internal byte order. The merkle root is derived from the hashes of all transactions included in this block, ensuring none of those transactions can be modified without modifying the header. See the [merkle trees section][section merkle trees] below.
| 4 | time | uint32_t | The [block time][]{:#term-block-time}{:.term} is a Unix epoch time when the miner started hashing the header (according to the miner). Must be strictly greater than the median time of the previous 11 blocks. Full nodes will not accept blocks with headers more than two hours in the future according to their clock.
| 4 | nBits | uint32_t | An encoded version of the target threshold this block's header hash must be less than or equal to. See the nBits format described below.
| 4 | nonce | uint32_t | An arbitrary number miners change to modify the header hash in order to produce a hash less than or equal to the target threshold. If all 32-bit values are tested, the time can be updated or the coinbase transaction can be changed and the merkle root updated.
1. A [magic number][block header magic]{:#term-block-header-magic}{:.term} (0xd9b4bef9).
The hashes are in internal byte order; the other values are all
in little-endian order.
2. A 4-byte unsigned integer indicating how many bytes follow until the
end of the block. Although this field would suggest maximum block
sizes of 4 GiB, max block size is currently capped at 1 MB and the
default max block size (used by most miners) is 750 KB (although
this will likely increase over time).
An example header in hex:
3. An 80-byte block header described in the section below.
{% highlight text %}
02000000 .......................... Block version: 2
4. One or more transactions.
b6ff0b1b1680a2862a30ca44d346d9e8
910d334beb48ca0c0000000000000000 ... Hash of previous block's header
9d10aa52ee949386ca9385695f04ede2
70dda20810decd12bc9b048aaab31471 ... Merkle root
The first transaction in a block must be a [coinbase transaction][]{:#term-coinbase-tx}{:.term} which should collect and
24d95a54 ........................... Unix time: 1415239972
30c31b18 ........................... Target: 0x1bc330 * 256**(0x18-3)
fe9f0864 ........................... Nonce
{% endhighlight %}
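The field layout above can be checked against the example header with a short Python 3 sketch. The function and key names (`parse_block_header`, `header_hash`, `previous_block_hash`, and so on) are illustrative assumptions, not identifiers taken from Bitcoin Core.

```python
import hashlib
import struct

# The example header from the text above, one field per line.
HEADER_HEX = (
    "02000000"                          # version
    "b6ff0b1b1680a2862a30ca44d346d9e8"  # previous block hash (first half)
    "910d334beb48ca0c0000000000000000"  # previous block hash (second half)
    "9d10aa52ee949386ca9385695f04ede2"  # merkle root (first half)
    "70dda20810decd12bc9b048aaab31471"  # merkle root (second half)
    "24d95a54"                          # time
    "30c31b18"                          # nBits
    "fe9f0864"                          # nonce
)


def parse_block_header(raw):
    """Split an 80-byte serialized header into its six fields."""
    if len(raw) != 80:
        raise ValueError("a serialized block header is exactly 80 bytes")
    # "<" selects little-endian with no padding: 4 + 32 + 32 + 4 + 4 + 4 = 80.
    version, prev_hash, merkle_root, time, nbits, nonce = struct.unpack(
        "<I32s32sIII", raw
    )
    return {
        "version": version,
        "previous_block_hash": prev_hash,  # raw bytes, internal byte order
        "merkle_root": merkle_root,        # raw bytes, internal byte order
        "time": time,
        "nbits": nbits,
        "nonce": nonce,
    }


def header_hash(raw):
    """SHA256(SHA256()) of the serialized header, in internal byte order."""
    return hashlib.sha256(hashlib.sha256(raw).digest()).digest()


header = parse_block_header(bytes.fromhex(HEADER_HEX))
print(header["version"], header["time"], hex(header["nbits"]))
```

The two hashes are left as raw bytes because the header stores them in internal byte order; only the four integer fields are decoded as little-endian.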
{% endautocrossref %}
#### Block Versions
{% autocrossref %}
* **Version 1** was introduced in the genesis block (January 2009).
* **[Version 2][v2 block]{:#term-v2-block}{:.term}** was introduced in
Bitcoin Core 0.7.0 (September 2012) as a soft fork. As described in
BIP34, valid version 2 blocks require a [block height parameter in the
coinbase][coinbase block height]. Also described in BIP34 are rules
for rejecting certain blocks; based on those rules, Bitcoin Core 0.7.0
and later versions began to reject version 2 blocks without the block
height in coinbase at block height 224,412 (March 2013) and began to
reject new version 1 blocks three weeks later at block height 227,930.
<!-- source for heights: my (@harding) own headers dump and counting
script -->
* **Version 3** blocks will likely be introduced in the near future as
specified in draft BIP62. Possible changes include:
* Reject version 3 blocks that include any version 2 transactions
that don't adhere to any of the version 2 transaction rules.
These rules are not yet described in this documentation; see
BIP62 for details.
* A soft fork rollout of version 3 blocks identical to the rollout
used for version 2 blocks (described briefly in BIP62 and in more
detail in BIP34).
{% endautocrossref %}
#### Merkle Trees
{% autocrossref %}
*For an overview of merkle trees, see the [block chain guide][merkle
tree].*
The merkle root is constructed using all the TXIDs of transactions in
this block, but first the TXIDs are placed in order as required by the
consensus rules:
* The coinbase transaction's TXID is always placed first.
* Any input within this block can spend an output which also appears in
this block (assuming the spend is otherwise valid). However, the TXID
corresponding to the output must be placed at some point before the
TXID corresponding to the input. This ensures that any program parsing
block chain transactions linearly will encounter each output before it
is used as an input.
If a block only has a coinbase transaction, the coinbase TXID is used as
the merkle root hash.
If a block only has a coinbase transaction and one other transaction,
the TXIDs of those two transactions are placed in order, concatenated as
64 raw bytes, and then SHA256(SHA256()) hashed together to form the
merkle root.
If a block has three or more transactions, intermediate merkle tree rows
are formed. The TXIDs are placed in order and paired, starting with the
coinbase transaction's TXID. Each pair is concatenated together as 64
raw bytes and SHA256(SHA256()) hashed to form a second row of
hashes. If there is an odd number of TXIDs, the last TXID is
concatenated with a copy of itself and hashed. If there are more than
two hashes in the second row, the process is repeated to create a third
row (and, if necessary, repeated further to create additional rows).
Once a row is obtained with only two hashes, those hashes are concatenated and
hashed to produce the merkle root.
<!-- built block 170's merkle root with Python to confirm left-to-right order
for A|B concatenation demonstrated below:
sha256(sha256("82501c1178fa0b222c1f3d474ec726b832013f0a532b44bb620cce8624a5feb1169e1e83e930853391bc6f35f605c6754cfead57cf8387639d3b4096c54f18f4".decode("hex")).digest()).digest().encode("hex_codec")
-->
![Example Merkle Tree Construction](/img/dev/en-merkle-tree-construction.svg)
TXIDs and intermediate hashes are always in internal byte order when they're
concatenated, and the resulting merkle root is also in internal byte
order when it's placed in the block header.
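The row-by-row construction described above can be sketched in Python 3 as follows. The helper names `dsha256` and `merkle_root` are assumptions chosen for illustration, and the TXIDs are expected as 32-byte values already in internal byte order and already in the required block order.

```python
import hashlib


def dsha256(data):
    """SHA256(SHA256(data)) as raw bytes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids):
    """Compute a merkle root from ordered 32-byte TXIDs (internal byte order)."""
    if not txids:
        raise ValueError("a block always contains at least a coinbase transaction")
    row = list(txids)
    # A single TXID is its own merkle root, so the loop is skipped in that case.
    while len(row) > 1:
        if len(row) % 2 == 1:
            # Odd count: pair the last hash with a copy of itself.
            row.append(row[-1])
        # Concatenate each pair as 64 raw bytes and hash it to build the next row.
        row = [dsha256(row[i] + row[i + 1]) for i in range(0, len(row), 2)]
    return row[0]
```

With three TXIDs `a`, `b` and `c`, this returns `dsha256(dsha256(a + b) + dsha256(c + c))`, matching the duplicate-the-last-hash rule.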
{% endautocrossref %}
#### Target nBits
{% autocrossref %}
The target threshold is a 256-bit unsigned integer compared to the 256-bit
SHA256(SHA256()) header hash (treated also as an unsigned integer).
However, the header field *nBits* provides only 32 bits of space, so the
target number uses a less precise format called "compact" which works
like a base-256 version of scientific notation:
![Converting nBits Into A Target Threshold](/img/dev/en-nbits-overview.svg)
As a base-256 number, nBits can be quickly parsed as bytes the same way
you might parse a decimal number in base-10 scientific notation:
![Quickly Converting nBits](/img/dev/en-nbits-quick-parse.svg)
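As a rough Python 3 illustration of the base-256 scientific notation, the high byte of nBits can be read as the exponent and the low three bytes as the significand. This sketch deliberately ignores the sign bit, and the function name `nbits_to_target` is an assumption made for the example.

```python
def nbits_to_target(nbits):
    """Expand a compact nBits value into an integer target (sign bit ignored)."""
    exponent = nbits >> 24            # number of base-256 digits in the target
    significand = nbits & 0x00FFFFFF  # the three most significant digits
    if exponent <= 3:
        # Fewer than three digits: drop the low-order bytes of the significand.
        return significand >> (8 * (3 - exponent))
    return significand << (8 * (exponent - 3))


# The nBits value from the example header: 0x1bc330 * 256**(0x18 - 3).
print(hex(nbits_to_target(0x181BC330)))
```

The shift by `8 * (exponent - 3)` bits is the same as multiplying by `256**(exponent - 3)`.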
<!-- Source for paragraph below: Bitcoin Core src/tests/bignum_tests.cpp:
num.SetCompact(0x04923456);
BOOST_CHECK_EQUAL(num.GetHex(), "-12345600");
BOOST_CHECK_EQUAL(num.GetCompact(), 0x04923456U);
-->
Although the target threshold should be an unsigned integer, the
original nBits implementation inherits properties from a signed data
class, allowing the target threshold to be negative if the high bit of
the significand is set. This is useless---the header hash is
treated as an unsigned number, so it can never be equal to or lower than a
negative target threshold. Bitcoin Core deals with this in two ways:
<!-- source for "Bitcoin Core converts..." src/main.h GetBlockWork() -->
* When parsing nBits, Bitcoin Core converts a negative target
threshold into a target of zero, which the header hash can equal (in
theory, at least).
* When creating a value for nBits, Bitcoin Core checks to see if it will
produce an nBits which will be interpreted as negative; if so, it
divides the significand by 256 and increases the exponent by 1 to
produce the same number with a different encoding.
Some examples taken from the Bitcoin Core test cases:
| nBits | Target | Notes
|------------|------------------|----------------
| 0x01003456 | &nbsp;0x00 |
| 0x01123456 | &nbsp;0x12 |
| 0x02008000 | &nbsp;0x80 |
| 0x05009234 | &nbsp;0x92340000 |
| 0x04923456 | -0x12345600 | High bit set (0x80 in 0x92).
| 0x04123456 | &nbsp;0x12345600 | Inverse of above; no high bit.
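
The sign handling and re-encoding described above can be sketched in Python 3 and checked against the rows of the table. This is an approximation written for illustration rather than Bitcoin Core's own code, and the names `parse_nbits`, `usable_target` and `target_to_nbits` are assumptions.

```python
def parse_nbits(nbits):
    """Return (magnitude, is_negative) for a compact nBits value."""
    exponent = nbits >> 24
    word = nbits & 0x007FFFFF  # significand without its high (sign) bit
    if exponent <= 3:
        word >>= 8 * (3 - exponent)
        magnitude = word
    else:
        magnitude = word << (8 * (exponent - 3))
    is_negative = word != 0 and (nbits & 0x00800000) != 0
    return magnitude, is_negative


def usable_target(nbits):
    """Treat a negative target as zero, as described for parsing above."""
    magnitude, is_negative = parse_nbits(nbits)
    return 0 if is_negative else magnitude


def target_to_nbits(target):
    """Encode a non-negative target so it is never read back as negative."""
    size = (target.bit_length() + 7) // 8  # number of base-256 digits
    if size <= 3:
        compact = target << (8 * (3 - size))
    else:
        compact = target >> (8 * (size - 3))
    if compact & 0x00800000:
        # High bit of the significand is set: divide the significand by 256
        # and increase the exponent by 1.
        compact >>= 8
        size += 1
    return compact | (size << 24)
```

Encoding drops any precision below the three most significant base-256 digits, so `target_to_nbits` followed by `parse_nbits` only round-trips targets that already fit the compact format.
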
Difficulty 1, the minimum allowed difficulty, is represented on mainnet
and the current testnet by the nBits value 0x1d00ffff. Regtest mode uses
a different difficulty 1 value---0x207fffff, the highest possible value
below uint32_max which can be encoded; this allows near-instant building
of blocks in regtest mode.
{% endautocrossref %}
### Serialized Blocks
{% autocrossref %}
Under current consensus rules, a block is not valid unless its
serialized size is less than or equal to 1 MB. All fields described
below are counted towards the serialized size.
| Bytes | Name | Data Type | Description
|----------|--------------|------------------|----------------
| 80 | block header | block_header | The block header in the format described in the [block header section][block header].
| *Varies* | txn_count | compactSize uint | The total number of transactions in this block, including the coinbase transaction.
| *Varies* | txns | raw transaction | Every transaction in this block, one after another, in raw transaction format. Transactions must appear in the data stream in the same order their TXIDs appeared in the first row of the merkle tree. See the [merkle tree section][section merkle trees] for details.
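
A minimal Python 3 sketch of reading the first two fields of a serialized block follows. It assumes that 1 MB means 1,000,000 bytes and that compactSize uses the commonly documented encoding (values below 0xfd in one byte; otherwise a 0xfd, 0xfe or 0xff prefix followed by a 2-, 4- or 8-byte little-endian integer), neither of which is spelled out in this section. The names `read_compact_size` and `split_block` are hypothetical.

```python
MAX_BLOCK_SIZE = 1000000  # assumption: "1 MB" taken as 1,000,000 bytes


def read_compact_size(data, offset):
    """Decode a compactSize uint at offset; return (value, next_offset)."""
    first = data[offset]
    if first < 0xFD:
        return first, offset + 1
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    start = offset + 1
    return int.from_bytes(data[start:start + width], "little"), start + width


def split_block(raw):
    """Return (header, txn_count, txn_bytes) without parsing the transactions."""
    if len(raw) > MAX_BLOCK_SIZE:
        raise ValueError("serialized block exceeds the maximum block size")
    if len(raw) < 81:
        raise ValueError("too short to hold a header and a transaction count")
    header = raw[:80]
    txn_count, offset = read_compact_size(raw, 80)
    return header, txn_count, raw[offset:]
```

The header, the count and the raw transactions are all part of `len(raw)`, which is why the size check is applied to the whole byte string.
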
The first transaction in a block must be a [coinbase
transaction][]{:#term-coinbase-tx}{:.term} which should collect and
spend any transaction fees paid by transactions included in this block.
All blocks with a block height less than 6,930,000 are entitled to
receive a [block reward][]{:#term-block-reward}{:.term} of newly created bitcoin value, which also
should be spent in the coinbase transaction. (The block reward started
at 50 bitcoins and is being halved every 210,000 blocks---approximately once every four years. As of
June 2014, it's 25 bitcoins.) A coinbase transaction is invalid if it
tries to spend more value than is available from the transaction
fees and block reward.
receive a block subsidy of newly created bitcoin value, which also
should be spent in the coinbase transaction. (The block subsidy started
at 50 bitcoins and is being halved every 210,000 blocks---approximately
once every four years. As of November 2014, it's 25 bitcoins.)
The coinbase transaction has the same basic format as any other
transaction, but it references a single non-existent UTXO and a special
[coinbase field][]{:#term-coinbase-field}{:.term} replaces the field that would normally hold a signature script and
secp256k1 signature. In version 2 blocks, the coinbase parameter must begin with
the current block's block height and may contain additional arbitrary
data or a script up to a maximum total of 100 bytes.
Together, the transaction fees and block subsidy are called the [block
reward][]{:#term-block-reward}{:.term}. A coinbase transaction is
invalid if it tries to spend more value than is available from the
block reward.
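The halving schedule and the coinbase limit can be sketched in Python 3 using integer arithmetic. The sketch assumes amounts are counted in units of 1/100,000,000 of a bitcoin, and the names `block_subsidy` and `coinbase_value_is_valid` are illustrative.

```python
UNITS_PER_BITCOIN = 100000000  # assumption: smallest unit is 1e-8 bitcoin
HALVING_INTERVAL = 210000      # blocks between halvings, from the text above


def block_subsidy(height):
    """Newly created value a block at this height may claim."""
    halvings = height // HALVING_INTERVAL
    # Each halving is an integer right shift, so the subsidy eventually hits 0.
    return (50 * UNITS_PER_BITCOIN) >> halvings


def coinbase_value_is_valid(height, total_fees, coinbase_output_value):
    """A coinbase may not spend more than the block reward (fees + subsidy)."""
    return coinbase_output_value <= total_fees + block_subsidy(height)
```

Under these assumptions the subsidy first reaches zero at block height 6,930,000, which matches the height limit stated above.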
{% endautocrossref %}
### Block Header
{% autocrossref %}
The 80-byte block header contains the following six fields:
| Field | Bytes | Format |
|-------------------|--------|--------------------------------|
| 1. Version | 4 | Unsigned Int |
| 2. hashPrevBlock | 32 | Unsigned Int (SHA256 Hash) |
| 3. hashMerkleRoot | 32 | Unsigned Int (SHA256 Hash) |
| 4. Time | 4 | Unsigned Int (Epoch Time) |
| 5. Bits | 4 | Internal Bitcoin Target Format |
| 6. Nonce | 4 | (Arbitrary Data) |
1. The *[block version][]{:#term-block-version}{:.term}* number indicates which set of block validation rules
to follow so Bitcoin Core developers can add features or
fix bugs. As of block height 227,836, all blocks use version number
2.
2. The *hash of the previous block header* puts this block on the
block chain and ensures no previous block can be changed without also
changing this block's header.
3. The *merkle root* is a hash derived from hashes of all the
transactions included in this block. It ensures no transactions can
be modified in this block without changing the block header hash.
4. The *[block time][]{:#term-block-time}{:.term}* is the approximate time when this block was created in
Unix Epoch time format (number of seconds elapsed since
1970-01-01T00:00 UTC). The time value must be greater than the
median time of the previous 11 blocks. No peer will accept a block with a
time currently more than two hours in the future according to the
peer's clock.
5. *Bits* translates into the target threshold value---the maximum allowed
value for this block's hash. The bits value must match the network
difficulty at the time the block was mined.
6. The *[header nonce][]{:#term-header-nonce}{:.term}* is an arbitrary input that miners can change to test different
hash values for the header until they find a hash value less than or
equal to the target threshold. If all values within the nonce's four
bytes are tested, the time can be updated or the
coinbase transaction can be changed and the merkle
root updated.
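The two block time rules from item 4 can be sketched in Python 3 as below. The function names are assumptions, `previous_times` is taken to be a non-empty list of earlier block times ordered oldest to newest, and `local_clock` stands in for whatever clock the checking peer uses.

```python
TWO_HOURS = 2 * 60 * 60  # seconds


def median_time_past(previous_times):
    """Median of the (up to) 11 most recent block times."""
    recent = sorted(previous_times[-11:])
    return recent[len(recent) // 2]


def block_time_is_acceptable(block_time, previous_times, local_clock):
    """Check a block time against both rules from item 4."""
    # Rule 1: must be greater than the median of the previous 11 block times.
    if block_time <= median_time_past(previous_times):
        return False
    # Rule 2: must not be more than two hours ahead of the peer's clock.
    return block_time <= local_clock + TWO_HOURS
```

Only the first rule depends purely on block chain data; the second depends on each peer's clock, so a block rejected as too far in the future may become acceptable later.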
{% endautocrossref %}