From 0af6ef09ff5a2c76d5e027a2ab8fa9798c56f354 Mon Sep 17 00:00:00 2001
From: "David A. Harding"
Date: Sun, 2 Nov 2014 22:21:36 -0500
Subject: [PATCH] Dev Docs: Detail Transaction Format

Provides a detailed description of the transaction format, replacing an
example hexdump taken from the wiki.

I'm putting this in the transaction section as the format is necessary
for the creation of txids, which are used as merkle leaves (so are
covered by consensus rules). However, this is also the format used by
several P2P network messages to transmit transactions, so I'll be
linking back to it from there as I document those messages.
---
 _autocrossref.yaml              |   8 +
 _includes/guide_transactions.md |   2 +-
 _includes/ref_transactions.md   | 270 ++++++++++++++++++++++----------
 _includes/references.md         |   9 +-
 4 files changed, 203 insertions(+), 86 deletions(-)

diff --git a/_autocrossref.yaml b/_autocrossref.yaml
index 3ec9ec2a..bd8eeb31 100644
--- a/_autocrossref.yaml
+++ b/_autocrossref.yaml
@@ -46,6 +46,9 @@ coinbase: coinbase transaction
 coinbase transaction:
 coinbase transactions: coinbase transaction
 coinbase field:
+compactsize uint: compactsize unsigned integer
+compactsize unsigned integer:
+compactsize unsigned integers: compactsize unsigned integer
 confirm:
 confirmed:
 confirmation:
@@ -168,6 +171,9 @@ public keys: public key
 public key infrastructure: pki
 '`r`': r
 raw format:
+raw transaction: raw format
+raw transactions: raw format
+raw transaction format: raw format
 rawtransaction format: raw format
 receipt:
 recurrent rebilling:
@@ -325,3 +331,5 @@ CVE-2012-2459:
 '`walletlock`': rpc walletlock
 '`walletpassphrase`': rpc walletpassphrase
 '`walletpassphrasechange`': rpc walletpassphrasechange
+
+Bitcoin Core 0.9.3:
diff --git a/_includes/guide_transactions.md b/_includes/guide_transactions.md
index d667704b..267fd3bc 100644
--- a/_includes/guide_transactions.md
+++ b/_includes/guide_transactions.md
@@ -575,7 +575,7 @@ maximum.
 Since sequence numbers are not used by the network for any other purpose,
 setting any sequence number to zero is sufficient to enable locktime.

-Locktime itself is an unsigned 4-byte number which can be parsed two ways:
+Locktime itself is an unsigned 4-byte integer which can be parsed two ways:

 * If less than 500 million, locktime is parsed as a block height. The
   transaction can be added to any block which has this height or higher.
diff --git a/_includes/ref_transactions.md b/_includes/ref_transactions.md
index d90742e4..fe73a5e4 100644
--- a/_includes/ref_transactions.md
+++ b/_includes/ref_transactions.md
@@ -64,7 +64,8 @@ Page][wiki script], with an authoritative list in the `opcodetype` enum
 of the Bitcoin Core [script header file][core script.h]

 ![Warning icon](/img/icon_warning.svg)
-**Signature script modification warning:** Signature scripts are not signed, so anyone can modify them. This
+**Signature script modification warning:**
+Signature scripts are not signed, so anyone can modify them. This
 means signature scripts should only contain data and data-pushing op codes
 which can't be modified without causing the pubkey script to fail.
 Placing non-data-pushing op codes in the signature script currently
@@ -189,103 +190,206 @@ against the extracted checksum, and then remove the version byte.

 {% autocrossref %}

-Bitcoin transactions are broadcast between peers and stored in the
-block chain in a serialized byte format, called [raw format][]{:#term-raw-format}{:.term}. Bitcoin Core
-and many other tools print and accept raw transactions encoded as hex.
+Bitcoin transactions are broadcast between peers
+in a serialized byte format, called [raw format][]{:#term-raw-format}{:.term}.
+It is this form of a transaction which is SHA256(SHA256()) hashed to create
+the TXID and, ultimately, the merkle root of a block containing the
+transaction---making the transaction format part of the consensus rules.
-The binary form of a raw transaction is SHA256(SHA256()) hashed to create
-its TXID. Bitcoin Core RPCs use a reversed byte order for hashes; see the [subsection about hash byte
-order][section hash byte order] for details.
+Bitcoin Core and many other tools print and accept raw transactions
+encoded as hex.

-A sample raw transaction is the first non-coinbase transaction, made in
-[block 170][block170]. To get the transaction, use the `getrawtransaction` RPC with
-that transaction's txid (provided below):
+As of Bitcoin Core 0.9.3 (October 2014), all transactions use the
+version 1 format described below. (Note: transactions in the block chain
+are allowed to list a higher version number to permit soft forks, but
+they are treated as version 1 transactions by current software.)
+
+A raw transaction has the following top-level format:
+
+| Bytes    | Name         | Data Type           | Description
+|----------|--------------|---------------------|-------------
+| 4        | version      | uint32_t            | Transaction version number; currently version 1. Programs creating transactions using newer consensus rules may use higher version numbers.
+| *Varies* | tx_in count  | compactSize uint    | Number of inputs in this transaction.
+| *Varies* | tx_in        | *See TxIn Below*    | Transaction inputs.
+| *Varies* | tx_out count | compactSize uint    | Number of outputs in this transaction.
+| *Varies* | tx_out       | *See TxOut Below*   | Transaction outputs.
+| 4        | lock_time    | uint32_t            | A time (Unix epoch time) or block number. See the [locktime parsing rules][].
+
+A transaction may have multiple inputs and outputs, so the TxIn and
+TxOut structures may recur within a transaction. CompactSize unsigned
+integers are a form of variable-length integers; they are described in
+the [CompactSize section][CompactSize unsigned integer].
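As a rough illustration of the top-level layout, the sketch below reads the fixed-width `version` and `lock_time` fields and double-SHA256 hashes the raw bytes to obtain a TXID. The hex string is the sample transaction itemized in the Example later in this section. The helper name `txid_from_raw` is hypothetical, and the byte reversal assumes the usual convention that RPCs display hashes in reversed byte order.

```python
import hashlib
import struct

# Sample raw transaction, copied from the itemized Example in this section.
RAW_TX_HEX = (
    "01000000"                            # version
    "01"                                  # number of inputs
    "7b1eabe0209b1fe794124575ef807057"
    "c77ada2138ae4fa8d6c4de0398a14f3f"    # outpoint TXID (internal byte order)
    "00000000"                            # outpoint index
    "49"                                  # bytes in signature script: 73
    "48"                                  # push 72 bytes as data
    "30450221008949f0cb400094ad2b5eb3"
    "99d59d01c14d73d8fe6e96df1a7150de"
    "b388ab8935022079656090d7f6bac4c9"
    "a94e0aad311a4268e082a725f8aeae05"
    "73fb12ff866a5f01"                    # signature
    "ffffffff"                            # sequence number
    "01"                                  # number of outputs
    "f0ca052a01000000"                    # satoshis
    "19"                                  # bytes in pubkey script: 25
    "76a914cbc20a7664f2f69e5355aa427045bc15e7c6c77288ac"
    "00000000"                            # lock_time
)


def txid_from_raw(raw: bytes) -> str:
    """Hypothetical helper: SHA256(SHA256(raw)), reversed for display."""
    digest = hashlib.sha256(hashlib.sha256(raw).digest()).digest()
    return digest[::-1].hex()


raw = bytes.fromhex(RAW_TX_HEX)
# Both fixed-width fields are little-endian unsigned 32-bit integers.
version = struct.unpack_from("<I", raw, 0)[0]
lock_time = struct.unpack_from("<I", raw, len(raw) - 4)[0]
print(version, lock_time, txid_from_raw(raw))
```

Only the first and last four bytes can be read at fixed offsets; everything between them has to be walked field by field because the input and output sections vary in length.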
 {% endautocrossref %}

-~~~
-> bitcoin-cli getrawtransaction \
-      f4184fc596403b9d638783cf57adfe4c75c605f6356fbc91338530e9831e9e16
+**TxIn: A Transaction Input (Non-Coinbase)**

-0100000001c997a5e56e104102fa209c6a852dd90660a20b2d9c352423e\
-dce25857fcd3704000000004847304402204e45e16932b8af514961a1d3\
-a1a25fdf3f4f7732e9d624c6c61548ab5fb8cd410220181522ec8eca07d\
-e4860a4acdd12909d831cc56cbbac4622082221a8768d1d0901ffffffff\
-0200ca9a3b00000000434104ae1a62fe09c5f51b13905f07f06b99a2f71\
-59b2225f374cd378d71302fa28414e7aab37397f554a7df5f142c21c1b7\
-303b8a0626f1baded5c72a704f7e6cd84cac00286bee000000004341041\
-1db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a690\
-9a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f\
-656b412a3ac00000000
-~~~
+{% autocrossref %}

-A byte-by-byte analysis by Amir Taaki (Genjix) of this transaction is
-provided below. (Originally from the Bitcoin Wiki
-[OP_CHECKSIG page](https://en.bitcoin.it/wiki/OP_CHECKSIG); Genjix's
-text has been updated to use the terms used in this document.)
+Each non-coinbase input spends an outpoint from a previous transaction.
+(Coinbase inputs are described separately after the example section below.)

-~~~
-01 00 00 00              version number
-01                       number of inputs (var_uint)
+| Bytes    | Name             | Data Type            | Description
+|----------|------------------|----------------------|--------------
+| 36       | previous_output  | *See Outpoint Below* | The previous outpoint being spent.
+| *Varies* | script bytes     | compactSize uint     | The number of bytes in the signature script. Maximum is 10,000 bytes.
+| *Varies* | signature script | char[]               | A script-language script which satisfies the conditions placed in the outpoint's pubkey script. Should only contain data pushes; see the [signature script modification warning][].
+| 4        | sequence         | uint32_t             | Sequence number; see [sequence number][]. Default for Bitcoin Core and almost all other programs is 0xffffffff.
-input 0:
-c9 97 a5 e5 6e 10 41 02  previous tx hash (txid)
-fa 20 9c 6a 85 2d d9 06
-60 a2 0b 2d 9c 35 24 23
-ed ce 25 85 7f cd 37 04
-00 00 00 00              previous output index
+{% endautocrossref %}

-48                       size of signature script (var_uint)
+**Outpoint: The Specific Part Of A Specific Output**

-Signature script for input 0:
-47                       push 71 bytes to stack
-30 44 02 20 4e 45 e1 69
-32 b8 af 51 49 61 a1 d3
-a1 a2 5f df 3f 4f 77 32
-e9 d6 24 c6 c6 15 48 ab
-5f b8 cd 41 02 20 18 15
-22 ec 8e ca 07 de 48 60
-a4 ac dd 12 90 9d 83 1c
-c5 6c bb ac 46 22 08 22
-21 a8 76 8d 1d 09 01
-ff ff ff ff              sequence number
+{% autocrossref %}

-02                       number of outputs (var_uint)
+Because a single transaction can include multiple outputs, the outpoint
+structure includes both a TXID and an output index number to refer to
+a specific output.

-output 0:
-00 ca 9a 3b 00 00 00 00  amount = 10.00000000 BTC
-43                       size of pubkey script (var_uint)
+| Bytes | Name  | Data Type | Description
+|-------|-------|-----------|--------------
+| 32    | hash  | char[32]  | The TXID of the transaction holding the output to spend. The TXID is a hash provided here in internal byte order.
+| 4     | index | uint32_t  | The output index number of the specific output to spend from the transaction. The first output is 0x00000000.
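To make the TxIn and outpoint layouts concrete, here is a minimal sketch that walks one non-coinbase input: the 36-byte outpoint, the compactSize script length, the signature script, and the 4-byte sequence number. The function names `read_compact_size` and `parse_txin` are hypothetical, the sample bytes are the input from the Example later in this section, and the sketch assumes well-formed data (no bounds or canonical-encoding checks).

```python
import struct


def read_compact_size(buf, pos):
    """Decode a compactSize uint starting at buf[pos]; return (value, next_pos)."""
    first = buf[pos]
    if first < 0xfd:
        return first, pos + 1
    fmt, width = {0xfd: ("<H", 2), 0xfe: ("<I", 4), 0xff: ("<Q", 8)}[first]
    return struct.unpack_from(fmt, buf, pos + 1)[0], pos + 1 + width


def parse_txin(buf, pos=0):
    """Hypothetical parser for one non-coinbase TxIn; return (fields, next_pos)."""
    txid_internal = buf[pos:pos + 32]                  # outpoint hash, internal byte order
    (index,) = struct.unpack_from("<I", buf, pos + 32)  # outpoint index
    script_len, pos = read_compact_size(buf, pos + 36)
    sig_script = buf[pos:pos + script_len]
    (sequence,) = struct.unpack_from("<I", buf, pos + script_len)
    fields = {
        "txid": txid_internal[::-1].hex(),  # reversed for display
        "index": index,
        "sig_script": sig_script,
        "sequence": sequence,
    }
    return fields, pos + script_len + 4


# The single input of the sample transaction itemized in the Example.
SAMPLE_TXIN = bytes.fromhex(
    "7b1eabe0209b1fe794124575ef807057"
    "c77ada2138ae4fa8d6c4de0398a14f3f"
    "00000000"
    "49"
    "48"
    "30450221008949f0cb400094ad2b5eb3"
    "99d59d01c14d73d8fe6e96df1a7150de"
    "b388ab8935022079656090d7f6bac4c9"
    "a94e0aad311a4268e082a725f8aeae05"
    "73fb12ff866a5f01"
    "ffffffff"
)

fields, end = parse_txin(SAMPLE_TXIN)
print(fields["txid"], fields["index"], len(fields["sig_script"]), hex(fields["sequence"]))
```

Returning the next read position alongside the parsed fields lets a caller loop over `tx_in count` inputs without knowing their sizes in advance.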
-Pubkey script for output 0:
-41                       push 65 bytes to stack
-04 ae 1a 62 fe 09 c5 f5
-1b 13 90 5f 07 f0 6b 99
-a2 f7 15 9b 22 25 f3 74
-cd 37 8d 71 30 2f a2 84
-14 e7 aa b3 73 97 f5 54
-a7 df 5f 14 2c 21 c1 b7
-30 3b 8a 06 26 f1 ba de
-d5 c7 2a 70 4f 7e 6c d8
-4c
-ac                       OP_CHECKSIG
+{% endautocrossref %}

-output 1:
-00 28 6b ee 00 00 00 00  amount = 40.00000000 BTC
-43                       size of pubkey script (var_uint)
+**TxOut: A Transaction Output**

-Pubkey script for output 1:
-41                       push 65 bytes to stack
-04 11 db 93 e1 dc db 8a
-01 6b 49 84 0f 8c 53 bc
-1e b6 8a 38 2e 97 b1 48
-2e ca d7 b1 48 a6 90 9a
-5c b2 e0 ea dd fb 84 cc
-f9 74 44 64 f8 2e 16 0b
-fa 9b 8b 64 f9 d4 c0 3f
-99 9b 86 43 f6 56 b4 12
-a3
-ac                       OP_CHECKSIG
+{% autocrossref %}

-00 00 00 00              locktime
-~~~
+Each output spends a certain number of satoshis, placing them under
+control of anyone who can satisfy the provided pubkey script.

+| Bytes    | Name            | Data Type        | Description
+|----------|-----------------|------------------|--------------
+| 8        | value           | int64_t          | Number of satoshis to spend. May be zero; the sum of all outputs may not exceed the sum of satoshis previously spent to the outpoints provided in the input section. (Exception: coinbase transactions spend the block subsidy and collected transaction fees.)
+| 1+       | pk_script bytes | compactSize uint | Number of bytes in the pubkey script. Maximum is 10,000 bytes.
+| *Varies* | pk_script       | char[]           | Defines the conditions which must be satisfied to spend this output.
+
+**Example**
+
+The sample raw transaction itemized below is the one created in the
+[Simple Raw Transaction section][section simple raw transaction] of the
+Developer Examples. It spends a previous pay-to-pubkey output by paying
+to a new pay-to-pubkey-hash (P2PKH) output.
+
+{% highlight text %}
+01000000 ................................... Version
+
+01 ......................................... Number of TxIns
+|
+| 7b1eabe0209b1fe794124575ef807057
+| c77ada2138ae4fa8d6c4de0398a14f3f ......... Outpoint TXID
+| 00000000 ................................. Outpoint index number
+|
+| 49 ....................................... Bytes in sig. script: 73
+| | 48 ..................................... Push 72 bytes as data
+| | | 30450221008949f0cb400094ad2b5eb3
+| | | 99d59d01c14d73d8fe6e96df1a7150de
+| | | b388ab8935022079656090d7f6bac4c9
+| | | a94e0aad311a4268e082a725f8aeae05
+| | | 73fb12ff866a5f01 ..................... Secp256k1 signature
+|
+| ffffffff ................................. Sequence number: UINT32_MAX
+
+01 ......................................... Number of outputs
+| f0ca052a01000000 ......................... Satoshis (49.99990000 BTC)
+|
+| 19 ....................................... Bytes in pubkey script: 25
+| | 76 ..................................... OP_DUP
+| | a9 ..................................... OP_HASH160
+| | 14 ..................................... Push 20 bytes as data
+| | | cbc20a7664f2f69e5355aa427045bc15
+| | | e7c6c772 ............................. PubKey hash
+| | 88 ..................................... OP_EQUALVERIFY
+| | ac ..................................... OP_CHECKSIG
+
+00000000 ................................... locktime: 0 (a block height)
+{% endhighlight %}
+
+
+**Coinbase Input: The Input Of The First Transaction In A Block**
+
+The first transaction in a block, called the coinbase transaction, must
+have exactly one input, called a coinbase. The coinbase input currently
+has the following format.
+
+| Bytes        | Name               | Data Type        | Description
+|--------------|--------------------|------------------|--------------
+| 32           | hash (null)        | char[32]         | A 32-byte null, as a coinbase has no previous outpoint.
+| 4            | index (UINT32_MAX) | uint32_t         | 0xffffffff, as a coinbase has no previous outpoint.
+| *Varies*     | script bytes       | compactSize uint | The number of bytes in the coinbase script, up to a maximum of 100 bytes.
+| *Varies* (4) | height             | script           | The block height of this block as required by BIP34. Uses script language: starts with a data-pushing op code that indicates how many bytes to push to the stack followed by the block height as a little-endian unsigned integer. This script must be as short as possible, otherwise it may be rejected. The data-pushing op code will be 0x03 and the total size four bytes until block 16,777,216 about 300 years from now.
+| *Varies*     | coinbase script    | *None*           | Arbitrary data not exceeding 100 bytes minus the (4) height bytes. Miners commonly place an extra nonce in this field to update the block header merkle root during hashing.
+| 4            | sequence           | uint32_t         | Sequence number; see [sequence number][].
+
+Most (but not all) blocks prior to block height 227,836 used block
+version 1 which did not require the height parameter to be prefixed to
+the coinbase script. The block height parameter is now required.
+
+Although the coinbase script is arbitrary data, if it includes the
+bytes used by any signature-checking operations such as `OP_CHECKSIG`,
+those signature checks will be counted as signature operations (sigops)
+towards the block's sigop limit. To avoid this, you can prefix all data
+with the appropriate push operation.
+
+An itemized coinbase transaction:
+
+{% highlight text %}
+01000000 .............................. Version
+
+01 .................................... Number of inputs
+| 00000000000000000000000000000000
+| 00000000000000000000000000000000 ... Previous outpoint TXID
+| ffffffff ............................ Previous outpoint index
+|
+| 29 .................................. Bytes in coinbase
+| |
+| | 03 ................................ Bytes in height
+| | | 4e0105 .......................... Height: 328014
+| |
+| | 062f503253482f0472d35454085fffed
+| | f2400000f90f54696d65202620486561
+| | 6c74682021 ........................ Arbitrary data
+| 00000000 ............................ Sequence
+
+01 .................................... Output count
+| 2c37449500000000 .................... Satoshis (25.04275756 BTC)
+| 1976a914a09be8040cbf399926aeb1f4
+| 70c37d1341f3b46588ac ................ P2PKH script
+| 00000000 ............................ Locktime
+{% endhighlight %}
+
+{% endautocrossref %}
+
+### CompactSize Unsigned Integers
+
+{% autocrossref %}
+
+The raw transaction format and several peer-to-peer network messages use
+a type of variable-length integer to indicate the number of bytes in a
+following piece of data.
+
+Bitcoin Core code and this document refer to these variable-length
+integers as compactSize. Many other documents refer to them as var_int
+or varInt, but this risks conflation with other variable-length integer
+encodings---such as the CVarInt class used in Bitcoin Core for
+serializing data to disk. Because it's used in the transaction format,
+the format of compactSize unsigned integers is part of the consensus
+rules.
+
+For numbers from 0 to 252, compactSize unsigned integers look like
+regular unsigned integers. For other numbers up to 0xffffffffffffffff, a
+byte is prefixed to the number to indicate its length---but otherwise
+the numbers look like regular unsigned integers in little-endian order.
+
+| Value                 | Bytes Used | Format
+|-----------------------|------------|-----------------------------------------
+| <= 252                | 1          | uint8_t
+| <= 0xffff             | 3          | 0xfd followed by the number as uint16_t
+| <= 0xffffffff         | 5          | 0xfe followed by the number as uint32_t
+| <= 0xffffffffffffffff | 9          | 0xff followed by the number as uint64_t
+
+For example, the number 515 is encoded as 0xfd0302.
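The table above can be sketched as a pair of helpers. The names `encode_compact_size` and `decode_compact_size` are hypothetical, and this is only an illustration of the size thresholds: it does not reject non-minimal encodings (for example, 0xfd followed by a value below 253), which a consensus-critical parser would need to consider.

```python
import struct


def encode_compact_size(n: int) -> bytes:
    """Encode n as a compactSize unsigned integer (illustrative sketch)."""
    if n < 0 or n > 0xffffffffffffffff:
        raise ValueError("value out of range for compactSize")
    if n <= 252:
        return struct.pack("<B", n)            # 1 byte, no prefix
    if n <= 0xffff:
        return b"\xfd" + struct.pack("<H", n)  # 0xfd + uint16_t
    if n <= 0xffffffff:
        return b"\xfe" + struct.pack("<I", n)  # 0xfe + uint32_t
    return b"\xff" + struct.pack("<Q", n)      # 0xff + uint64_t


def decode_compact_size(buf: bytes, pos: int = 0):
    """Decode a compactSize uint at buf[pos]; return (value, bytes consumed)."""
    first = buf[pos]
    if first < 0xfd:
        return first, 1
    fmt, width = {0xfd: ("<H", 2), 0xfe: ("<I", 4), 0xff: ("<Q", 8)}[first]
    return struct.unpack_from(fmt, buf, pos + 1)[0], 1 + width


print(encode_compact_size(515).hex())                 # fd0302
print(decode_compact_size(bytes.fromhex("fd0302")))   # (515, 3)
```

The little-endian `struct` format codes (`<H`, `<I`, `<Q`) mirror the uint16_t, uint32_t, and uint64_t rows of the table.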
+
+{% endautocrossref %}
diff --git a/_includes/references.md b/_includes/references.md
index 917d0ecd..b8c0a6fb 100644
--- a/_includes/references.md
+++ b/_includes/references.md
@@ -26,6 +26,7 @@
 [child public key]: /en/developer-guide#term-child-public-key "In HD wallets, a public key derived from a parent public key or a corresponding child private key"
 [coinbase field]: /en/developer-reference#term-coinbase-field "A special input-like field for coinbase transactions"
 [coinbase transaction]: /en/developer-reference#term-coinbase-tx "A special transaction which miners must create when they generate a block"
+[compactsize unsigned integer]: /en/developer-reference#compactsize-unsigned-integers "A type of variable-length integer"
 [confirm]: /en/developer-guide#term-confirmation "A transaction included in a block currently on the block chain"
 [confirmed]: /en/developer-guide#term-confirmation "A transaction included in a block currently on the block chain"
 [confirmed transactions]: /en/developer-guide#term-confirmation "Transactions included in a block currently on the block chain"
@@ -242,6 +243,7 @@
 [rpc walletpassphrasechange]: /en/developer-reference#walletpassphrasechange


+[Bitcoin Core 0.9.3]: /en/release/v0.9.3
 [bitcoin URI subsection]: /en/developer-guide#bitcoin-uri
 [bitcoinpdf]: https://bitcoin.org/bitcoin.pdf
 [core executable]: /en/download
@@ -255,14 +257,17 @@
 [devguide payment processing]: /en/developer-guide#payment-processing
 [devguide wallets]: /en/developer-guide#wallets
 [devref wallets]: /en/developer-reference#wallets
+[locktime parsing rules]: /en/developer-guide#locktime_parsing_rules
 [Merge Avoidance subsection]: /en/developer-guide#merge-avoidance
 [micropayment channel]: /en/developer-guide#term-micropayment-channel
 [raw transaction format]: /en/developer-reference#raw-transaction-format
 [RPC]: /en/developer-reference#remote-procedure-calls-rpcs
 [RPCs]: /en/developer-reference#remote-procedure-calls-rpcs
-[section hash byte order]: /en/developer-reference#hash-byte-order
-[section verifying payment]: /en/developer-guide#verifying-payment
 [section detecting forks]: /en/developer-guide#detecting-forks
+[section hash byte order]: /en/developer-reference#hash-byte-order
+[section simple raw transaction]: /en/developer-examples#simple-raw-transaction
+[section verifying payment]: /en/developer-guide#verifying-payment
+[signature script modification warning]: /en/developer-reference#signature_script_modification_warning
 [transaction object format]: /en/developer-reference#term-transaction-object-format
 [Verification subsection]: /en/developer-guide#verifying-payment
 [X509Certificates]: /en/developer-examples#term-x509certificates