rusty-blockparser

rusty-blockparser is a Bitcoin Blockchain Parser that enables data extraction of various types (e.g.: blocks, transactions, scripts, public keys / hashes, balances) and full UTXO dumps.

Supported Blockchains

Bitcoin, Namecoin, Litecoin, Dogecoin, Myriadcoin, Unobtanium and NoteBlockchain.

IMPORTANT: A local unpruned copy of the blockchain with an intact block index and blk files, downloaded with Bitcoin Core 0.15.1+ or a similar client, is required. If you are not sure whether your local copy is valid, you can pass --verify to validate the block data and block merkle trees.
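To illustrate what validating a block's merkle tree involves, here is a minimal sketch. It is not the project's code. Bitcoin combines pairs of hashes with double SHA-256; to keep the example dependency-free, the hash function is passed in by the caller, and the name `merkle_root` is an assumption made for this example.

```rust
/// Computes a Bitcoin-style merkle root from 32-byte leaf hashes.
/// `hash` stands in for double SHA-256 (assumption: supplied by the caller).
pub fn merkle_root<H>(leaves: &[[u8; 32]], hash: H) -> Option<[u8; 32]>
where
    H: Fn(&[u8]) -> [u8; 32],
{
    if leaves.is_empty() {
        return None;
    }
    let mut level: Vec<[u8; 32]> = leaves.to_vec();
    while level.len() > 1 {
        // With an odd number of nodes, Bitcoin duplicates the last one.
        if level.len() % 2 == 1 {
            let last = *level.last().unwrap();
            level.push(last);
        }
        // Hash each concatenated pair to build the next level up.
        level = level
            .chunks(2)
            .map(|pair| {
                let mut buf = [0u8; 64];
                buf[..32].copy_from_slice(&pair[0]);
                buf[32..].copy_from_slice(&pair[1]);
                hash(&buf)
            })
            .collect();
    }
    Some(level[0])
}
```

A verifier would compare the result against the hashMerkleRoot field of the block header and reject the block on a mismatch.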

Supported Transaction Types

Bitcoin and Bitcoin Testnet transactions are parsed using rust-bitcoin. This includes transactions of type P2SH, P2PKH, P2PK, P2WSH, P2WPKH, P2TR, P2MS, OP_RETURN and SegWit.

Bitcoin forks (e.g. Dogecoin, Litecoin) are evaluated via a custom script implementation, which covers P2PK, P2PKH, P2SH, P2MS and OP_RETURN.
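For readers unfamiliar with these script types, the following sketch shows how several of them can be recognized from the raw scriptPubKey bytes alone. It is an illustration based on the standard byte templates, not the parser's actual implementation; `ScriptType` and `classify` are hypothetical names, and P2PK and P2MS are left out for brevity.

```rust
/// A subset of the script types named above; `Unknown` covers the rest.
#[derive(Debug, PartialEq, Eq)]
pub enum ScriptType {
    P2pkh,
    P2sh,
    P2wpkh,
    P2wsh,
    P2tr,
    OpReturn,
    Unknown,
}

/// Classifies a raw scriptPubKey by its standard byte template.
pub fn classify(script: &[u8]) -> ScriptType {
    match script {
        // OP_DUP OP_HASH160 <20 bytes> OP_EQUALVERIFY OP_CHECKSIG
        [0x76, 0xa9, 0x14, .., 0x88, 0xac] if script.len() == 25 => ScriptType::P2pkh,
        // OP_HASH160 <20 bytes> OP_EQUAL
        [0xa9, 0x14, .., 0x87] if script.len() == 23 => ScriptType::P2sh,
        // OP_0 <20 bytes>
        [0x00, 0x14, ..] if script.len() == 22 => ScriptType::P2wpkh,
        // OP_0 <32 bytes>
        [0x00, 0x20, ..] if script.len() == 34 => ScriptType::P2wsh,
        // OP_1 <32 bytes>
        [0x51, 0x20, ..] if script.len() == 34 => ScriptType::P2tr,
        // OP_RETURN followed by arbitrary data
        [0x6a, ..] => ScriptType::OpReturn,
        _ => ScriptType::Unknown,
    }
}
```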

Analysis and Data Extraction

Data is extracted via callbacks that are built on top of the core parser. They can easily be extended to extract specific types of information and can be found here.

Extract Balances of all known addresses

The command balances extracts the balance of all known addresses and dumps it to a csv file called balances.csv with the following format:

balances.csv: address ; balance

Extract all UTXOs along with their corresponding address balances

The command unspentcsvdump can be used to dump all UTXOs with their corresponding address balance to a csv file called unspent.csv. The csv file is in the following format:

NOTE: The total size of the csv dump is at least 8 GiB (height 635000).
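The idea behind tracking unspent outputs and deriving balances from them can be sketched as follows. This is a simplified illustration, not the callback's actual code: `UtxoSet` and its methods are hypothetical names, and transaction ids and addresses are kept as plain strings for readability.

```rust
use std::collections::HashMap;

/// Identifies an output: (transaction id, output index).
type OutPoint = (String, u32);

/// In-memory set of unspent outputs, keyed by outpoint.
#[derive(Default)]
pub struct UtxoSet {
    // outpoint -> (address, value)
    unspent: HashMap<OutPoint, (String, u64)>,
}

impl UtxoSet {
    /// Records a newly created output.
    pub fn add_output(&mut self, txid: &str, index: u32, address: &str, value: u64) {
        self.unspent
            .insert((txid.to_string(), index), (address.to_string(), value));
    }

    /// Removes an output referenced by an input; returns false if it was not in the set.
    pub fn spend(&mut self, prev_txid: &str, prev_index: u32) -> bool {
        self.unspent
            .remove(&(prev_txid.to_string(), prev_index))
            .is_some()
    }

    /// Sums the remaining unspent values per address.
    pub fn balances(&self) -> HashMap<String, u64> {
        let mut out = HashMap::new();
        for (address, value) in self.unspent.values() {
            *out.entry(address.clone()).or_insert(0) += value;
        }
        out
    }
}
```

Because every unspent output has to stay in memory until it is spent, a structure like this grows with the size of the UTXO set, which is in line with the higher memory figures listed for unspentcsvdump and balances under Memory Usage.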

Show OP_RETURN data

The command opreturn can be used to show all embedded OP_RETURN data in the terminal that contains valid UTF8.
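A rough sketch of the underlying check: take the data pushed after OP_RETURN and keep it only if it decodes as UTF-8. The function name `op_return_text` is hypothetical, and the sketch handles only direct pushes and OP_PUSHDATA1, so it is narrower than a full script parser.

```rust
/// Returns the first data push after OP_RETURN as text, if it is valid UTF-8.
/// Handles direct pushes (1 to 75 bytes) and OP_PUSHDATA1 only.
pub fn op_return_text(script: &[u8]) -> Option<&str> {
    // 0x6a is OP_RETURN; anything else is not an OP_RETURN script.
    let rest = match script {
        [0x6a, rest @ ..] => rest,
        _ => return None,
    };
    let (len, data) = match rest {
        // Opcodes 0x01..=0x4b push that many bytes directly.
        [n @ 0x01..=0x4b, data @ ..] => (*n as usize, data),
        // 0x4c is OP_PUSHDATA1: the next byte holds the length.
        [0x4c, n, data @ ..] => (*n as usize, data),
        _ => return None,
    };
    // `get` returns None if the script is shorter than the announced length.
    std::str::from_utf8(data.get(..len)?).ok()
}
```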

Extract full CSV dump

The command csvdump dumps all data to csv files. This data can be imported into a database for further analysis. NOTE: The total size of the csv dump is at least 731 GiB (height 635,000).

The files are in the following format:

blocks.csv: block_hash ; height ; version ; blocksize ; hashPrev ; hashMerkleRoot ; nTime ; nBits ; nNonce
transactions.csv: txid ; hashBlock ; version ; lockTime
tx_in.csv: txid ; hashPrevOut ; indexPrevOut ; scriptSig ; sequence
tx_out.csv: txid ; indexOut ; height ; value ; scriptPubKey ; address
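When post-processing the dump outside a database, a record can be mapped onto a typed struct. The sketch below does this for tx_out.csv, using the field order listed above. The struct name, the numeric types and the assumption that `;` never occurs inside a field are illustrative guesses, not guarantees made by the tool.

```rust
/// One record of tx_out.csv, in the field order listed above.
#[derive(Debug, PartialEq)]
pub struct TxOutRow {
    pub txid: String,
    pub index_out: u32,
    pub height: u64,
    pub value: u64, // assumed to be an integer amount
    pub script_pubkey: String,
    pub address: String,
}

/// Parses one `;`-separated line; returns None on a malformed record.
pub fn parse_tx_out(line: &str) -> Option<TxOutRow> {
    // Trim each field so both "a;b" and "a ; b" are accepted.
    let mut fields = line.split(';').map(str::trim);
    let row = TxOutRow {
        txid: fields.next()?.to_string(),
        index_out: fields.next()?.parse().ok()?,
        height: fields.next()?.parse().ok()?,
        value: fields.next()?.parse().ok()?,
        script_pubkey: fields.next()?.to_string(),
        address: fields.next()?.to_string(),
    };
    // Reject records with unexpected extra fields.
    if fields.next().is_some() {
        return None;
    }
    Some(row)
}
```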

Load data directly into ClickHouse

The command clickhouse loads blockchain data directly into a ClickHouse database. This includes blocks, transactions, inputs, and outputs. The data is loaded asynchronously for better performance.
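One common way to decouple parsing from database writes is to hand rows to a background writer that inserts them in batches. The sketch below shows that pattern with standard-library threads and channels only. It is an assumption about the general approach, not a description of how this callback is implemented; `load_in_batches` and the `flush` parameter (which stands in for the actual insert) are hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends rows to a background thread that groups them into batches and
/// passes each batch to `flush`. Returns the number of batches flushed.
pub fn load_in_batches<T, F>(rows: Vec<T>, batch_size: usize, flush: F) -> usize
where
    T: Send + 'static,
    F: Fn(&[T]) + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<T>();

    // Writer thread: collects rows until a batch is full, then flushes it.
    let writer = thread::spawn(move || {
        let mut batch = Vec::with_capacity(batch_size);
        let mut batches = 0;
        for row in rx {
            batch.push(row);
            if batch.len() == batch_size {
                flush(&batch);
                batch.clear();
                batches += 1;
            }
        }
        // Flush whatever is left once the sender has been dropped.
        if !batch.is_empty() {
            flush(&batch);
            batches += 1;
        }
        batches
    });

    // Producer side: in a parser this would happen while blocks are read.
    for row in rows {
        tx.send(row).expect("writer thread stopped");
    }
    drop(tx); // closing the channel lets the writer finish

    writer.join().expect("writer thread panicked")
}
```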

Usage:

This will create four tables in ClickHouse:

  1. blocks - information about blocks
  2. transactions - information about transactions
  3. tx_inputs - information about transaction inputs
  4. tx_outputs - information about transaction outputs

You can specify custom table names:

Table structures:

  1. blocks table:

    • height - block height
    • hash - block hash
    • prev_hash - previous block hash
    • merkle_root - merkle root
    • timestamp - block timestamp
    • size - block size in bytes
    • tx_count - number of transactions

  2. transactions table:

    • height - block height
    • block_hash - block hash
    • tx_hash - transaction hash
    • version - transaction version
    • locktime - locktime
    • input_count - number of inputs
    • output_count - number of outputs

  3. tx_inputs table:

    • height - block height
    • block_hash - block hash
    • tx_hash - transaction hash
    • input_index - input index
    • prev_tx_hash - previous transaction hash
    • prev_output_index - previous output index
    • script_sig - script signature (in hex)
    • sequence - sequence number

  4. tx_outputs table:

    • height - block height
    • block_hash - block hash
    • tx_hash - transaction hash
    • output_index - output index
    • value - value in satoshis
    • script_pubkey - public key script (in hex)

See the block and transaction specifications if some of the fields are unclear. If you want to insert the csv files into MySQL, see sql/schema.sql (but be aware that it hasn't been tested or used for quite some time). It contains all table structures and SQL statements for bulk inserting. Also see sql/views.sql for some query examples.

Show blockchain statistics

The command simplestats can be used to show blockchain statistics, e.g.:

  • Show the txid of transactions that contain specific script types
  • Total numbers like number of blocks, number of transactions, biggest tx in value or size
  • Averages like block size, time between blocks, txs, inputs and outputs

Usage

Usage: rusty-blockparser [OPTIONS] [COMMAND]

Commands:
  unspentcsvdump  Dumps the unspent outputs to CSV file
  csvdump         Dumps the whole blockchain into CSV files
  simplestats     Shows various Blockchain stats
  balances        Dumps all addresses with non-zero balance to CSV file
  opreturn        Shows embedded OP_RETURN data that is representable as UTF8
  clickhouse      Loads blockchain data directly into ClickHouse database
  help            Print this message or the help of the given subcommand(s)

Options:
      --verify
          Verifies merkle roots and block hashes
  -v...
          Increases verbosity level. Info=0, Debug=1, Trace=2 (default: 0)
  -c, --coin <NAME>
          Specify blockchain coin (default: bitcoin) [possible values: bitcoin, testnet3, namecoin, litecoin, dogecoin, myriadcoin, unobtanium, noteblockchain]
  -d, --blockchain-dir <blockchain-dir>
          Sets blockchain directory which contains blk.dat files (default: ~/.bitcoin/blocks)
  -s, --start <HEIGHT>
          Specify starting block for parsing (inclusive)
  -e, --end <HEIGHT>
          Specify last block for parsing (inclusive) (default: all known blocks)
  -h, --help
          Print help
  -V, --version
          Print version

Example Usage

To make an unspentcsvdump of the Bitcoin blockchain, your command would look like this:

# ./blockparser unspentcsvdump /path/to/dump/
[6:02:53] INFO - main: Starting rusty-blockparser v0.7.0 ...
[6:02:53] INFO - index: Reading index from ~/.bitcoin/blocks/index ...
[6:02:54] INFO - index: Got longest chain with 639626 blocks ...
[6:02:54] INFO - blkfile: Reading files from ~/.bitcoin/blocks ...
[6:02:54] INFO - parser: Parsing Bitcoin blockchain (range=0..) ...
[6:02:54] INFO - callback: Using `unspentcsvdump` with dump folder: /path/to/dump ...
[6:03:04] INFO - parser: Status: 130885 Blocks processed. (left: 508741, avg: 13088 blocks/sec)
...
[10:28:47] INFO - parser: Status: 639163 Blocks processed. (left: 463, avg: 40 blocks/sec)
[10:28:57] INFO - parser: Status: 639311 Blocks processed. (left: 315, avg: 40 blocks/sec)
[10:29:07] INFO - parser: Status: 639452 Blocks processed. (left: 174, avg: 40 blocks/sec)
[10:29:17] INFO - parser: Status: 639596 Blocks processed. (left: 30, avg: 40 blocks/sec)
[10:29:19] INFO - parser: Done. Processed 639626 blocks in 266.43 minutes. (avg: 40 blocks/sec)
[10:32:01] INFO - callback: Done. Dumped all 639626 blocks:
        -> transactions: 549390991
        -> inputs:       1347165535
        -> outputs:      1359449320
[10:32:01] INFO - main: Fin.

Installing

This tool should run on Windows, OS X and Linux. All you need is rust and cargo.

IMPORTANT: Building with --release is essential for performance.

Tested on Gentoo Linux with rust-stable 1.85.0

Memory Usage

Memory usage depends on the callback used:

  • simplestats: ~100MB
  • csvdump: ~100MB
  • unspentcsvdump: ~18GB
  • balances: ~18GB

NOTE: These values were measured while parsing up to block height 639,631 (17.07.2020).

Contributing

Use the issue tracker to report problems, suggestions and questions. You may also contribute by submitting pull requests.

If you find this project helpful, please consider making a donation:

1LFidBTeg5joAqjw35ksebiNkVM8azFM1K

Customizing the tool for your coin

The tool can easily be customized for your coin. This section outlines the changes that need to be made and is aimed at users who are not familiar with Rust or blockchains. The coin name used in this example is NoCoinium.

  • The main change is in src/blockchain/parser/types.rs.
  • Add a new entry pub struct NoCoinium similar to the other definitions.
  • You will then need to add an impl Coin for NoCoinium. You can copy an existing block, e.g. the one for Bitcoin. The changes you need to make are highlighted below as comments.
  • Finally, tie these changes together within impl FromStr for CoinType under match coin_name. The first part is the string passed as an argument to the program (see the next bullet point), and the name within from() is the struct name used above.
  • The next change is in src/main.rs. In the fn parse_args(), add your coin to the array of coins. The case you use here must match the value you pass with the -c argument when running the parser.
  • Finally, add your coin name to the README.md file so others know your coin is supported.
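The steps above can be sketched in a self-contained form as follows. The `Coin` trait and `CoinType` struct here are simplified stand-ins defined only for this example. The real definitions in src/blockchain/parser/types.rs may have more or different methods, so copy an existing coin there rather than this sketch. The magic value and folder name are placeholders.

```rust
use std::path::PathBuf;
use std::str::FromStr;

/// Simplified stand-in for the project's `Coin` trait (assumption).
pub trait Coin {
    fn name(&self) -> String;
    fn magic(&self) -> u32;
    fn default_folder(&self) -> PathBuf;
}

/// Step 1: a new struct for the coin.
pub struct NoCoinium;

/// Step 2: describe the coin.
impl Coin for NoCoinium {
    // Human-readable coin name.
    fn name(&self) -> String {
        String::from("NoCoinium")
    }
    // Network magic that precedes every block in the blk files.
    // Placeholder: take the real value from your coin's chain parameters.
    fn magic(&self) -> u32 {
        0xdead_beef
    }
    // Block directory relative to the user's home folder (placeholder).
    fn default_folder(&self) -> PathBuf {
        PathBuf::from(".nocoinium").join("blocks")
    }
}

/// Simplified stand-in for the project's `CoinType` (assumption).
#[derive(Debug, PartialEq)]
pub struct CoinType {
    pub name: String,
    pub magic: u32,
    pub default_folder: PathBuf,
}

impl<T: Coin> From<T> for CoinType {
    fn from(coin: T) -> Self {
        CoinType {
            name: coin.name(),
            magic: coin.magic(),
            default_folder: coin.default_folder(),
        }
    }
}

/// Step 3: map the command-line string to the coin.
impl FromStr for CoinType {
    type Err = String;

    fn from_str(coin_name: &str) -> Result<Self, Self::Err> {
        match coin_name {
            // Left: the string passed via -c. Right: the struct defined above.
            "nocoinium" => Ok(CoinType::from(NoCoinium)),
            other => Err(format!("unknown coin: {}", other)),
        }
    }
}
```

With these pieces in place, `"nocoinium".parse::<CoinType>()` yields the coin description, which mirrors what happens when the string given to -c is resolved.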