Envelope Specification
by Curtis Ellis on 14 January 2021
Category
Expiry date
Licence
No intellectual property has been sought for this standard
Applies to
Wallets will need to use this standard to read/write transactions containing it. Data services will need to use this standard to filter transactions based on data within it.
Overview
Attribute | Description
Version | 0.3
Author | Curtis Ellis
Reviewers | Attila Aros, Connor Murray, David Case, Dylan Murray, Jaime Salom Viñado, Jonathan Aird, Liam Missin, Lucas Rohenaz, Mathias Wulff, MrZ, Nithin Mani, Roger Taylor, Xiaohui Liu, Ye Chen
Tags and Categories |
Publication Date | 07/06/2021
Valid Until |
Copyright | 2021 Bitcoin Association
IP Generation | No intellectual property has been sought for this standard
Known Implementations | https://github.com/tokenized/envelope
Applies to | Wallets will need to use this standard to read/write transactions containing it. Data services will need to use this standard to filter transactions based on data within it.
BRFC ID |
Acknowledgements |
Status | Public Review
Visibility | PUBLIC

Background

Currently, data stored within a Bitcoin SV transaction typically utilizes a standard unspendable output that begins with OP_FALSE OP_RETURN. The first push data after that is generally the protocol identifier which specifies the format/encoding of the subsequent data in the output script.

This standard aims to improve the way data is stored within Bitcoin transactions by creating a general purpose data envelope that allows for efficient identification, structuring, and interoperability (or layering) of 1 or more data protocols. We will refer to these other protocols as sub-protocols for the purposes of this document.

Problem Statement

When a software application sees an output script that begins with OP_FALSE OP_RETURN, it knows that the output is provably unspendable and likely contains data that has meaning outside of the Bitcoin protocol. Currently, it is difficult for blockchain indexing software to determine the encoding and formatting of that data, because many different protocols are already in use and there are no constraints on which protocols, or how many, are stored within a Bitcoin transaction. In other words, interacting software is unable to quickly determine what it knows or does not know about the data. This uncertainty can add systemic costs to IT infrastructure that is interested in some or all of the data.

Objectives

This standard aims to provide a framework for specifying and combining different protocols embedded in Bitcoin transactions. It also aims to be agnostic to the way the data is stored within Script, and equally supportive of all types of protocols and protocol identifiers.

The protocol needs to be as simple and lightweight as possible to enable easy integration with various services. It must also be efficient in processing speed, to support high-throughput systems, and in storage footprint, to keep mining fees as low as possible.

The Envelope protocol also aims to provide a framework to allow for the interoperability of sub-protocols. For example, if you define a data format protocol, but want to support encryption or Metanet, you can simply use this specification to combine those protocols on top of your protocol without your protocol having to know anything about Metanet or encryption. This helps on both sides. It helps the protocol developers to concentrate on the specific functionality they want to provide, and it also helps software developers to support more functionality by not having to implement specific support for combinations of functionality like Metanet, encryption, compression, and many other features for each protocol.

Scope

The Envelope protocol covers everything required to provide a common system for identifying the protocol(s) used to encode on-chain data embedded in scripts, and it provides a framework for interoperability between these protocols.

Out of scope

Protocols used within an Envelope and their meaning are out of scope. The Envelope protocol simply defines the framework that allows for the use, interoperability, and layering of sub-protocols.

Methods and Concepts

Data Location Within Bitcoin Scripts

Envelope data can be provided in any part of the script that is not meant to be executed. This protocol provides some recommended scripts, although it is agnostic to the exact script used.

  • An unspendable locking script that starts with OP_FALSE OP_RETURN followed by the envelope data is the most easily identifiable as unspendable by a parser.
  • A spendable locking script followed by OP_RETURN and then the envelope data is also fairly easy for a parser to identify, although the parser must read through the spendable script to find the OP_RETURN.
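
As a minimal sketch of the first recommended form only, the check below looks for a leading OP_FALSE OP_RETURN and returns whatever follows it. The function name `envelope_candidate` is hypothetical and not part of this standard; the opcode byte values (0x00 and 0x6a) are the standard Bitcoin script values.

```python
OP_FALSE = 0x00   # standard Bitcoin script opcode value
OP_RETURN = 0x6A  # standard Bitcoin script opcode value


def envelope_candidate(script: bytes):
    """Return the bytes after a leading OP_FALSE OP_RETURN, or None.

    Hypothetical helper: it only recognises the unspendable form and does
    not search spendable scripts for a later OP_RETURN.
    """
    if len(script) >= 2 and script[0] == OP_FALSE and script[1] == OP_RETURN:
        return script[2:]
    return None


# An unspendable output yields its data; any other script yields None.
print(envelope_candidate(bytes.fromhex("006a02bd01")))
print(envelope_candidate(bytes.fromhex("76a914")))
```

Handling the second form would require walking the spendable script op code by op code, which this sketch deliberately leaves out.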

Push data

Push data in this document refers to Bitcoin script push operations that, when in executable code, would cause a value to be pushed to the stack. They consist of an op code that specifies the size of the data or the method of specifying the size of the data, followed by the data.

For data sizes up to 75 bytes (hex 0x4b), the op code is a single byte containing the size. So, for a push data containing 32 bytes of data, you write the byte containing the value 32 (0x20) followed by the 32 bytes of data.

For longer data sizes, there are:

  • OP_PUSHDATA1 (0x4c)
  • OP_PUSHDATA2 (0x4d)
  • OP_PUSHDATA4 (0x4e)

Each of those op codes is followed by the specified number of bytes (1, 2, or 4) containing a little endian integer that represents the length of the data, followed by the data itself.
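
The size rules above can be sketched as a small encoder. This is an illustration under the assumption that the smallest op code able to hold the length is always chosen; the function name `encode_push_data` is hypothetical.

```python
OP_PUSHDATA1 = 0x4C
OP_PUSHDATA2 = 0x4D
OP_PUSHDATA4 = 0x4E


def encode_push_data(data: bytes) -> bytes:
    """Prefix data with the smallest push op code that can hold its length."""
    size = len(data)
    if size <= 75:
        # Single-byte op code whose value is the size itself.
        prefix = bytes([size])
    elif size <= 0xFF:
        prefix = bytes([OP_PUSHDATA1]) + size.to_bytes(1, "little")
    elif size <= 0xFFFF:
        prefix = bytes([OP_PUSHDATA2]) + size.to_bytes(2, "little")
    elif size <= 0xFFFFFFFF:
        prefix = bytes([OP_PUSHDATA4]) + size.to_bytes(4, "little")
    else:
        raise ValueError("data too large for a single push")
    return prefix + data


# 32 bytes of data gets the one-byte prefix 0x20; 100 bytes gets 0x4c 0x64.
print(encode_push_data(bytes(32))[:1].hex())
print(encode_push_data(bytes(100))[:2].hex())
```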

Examples

Single byte push op (up to 75 bytes)

Script: 0x080102030405060708

The first byte is an op code that tells the bitcoin script interpreter to push the next 8 bytes of the script onto the stack.

OP_PUSHDATA1

Script: 0x4c640102030405…

The first byte 0x4c is the value for OP_PUSHDATA1 that tells the interpreter that the next byte specifies the number of bytes to read from the script and push onto the stack. 0x64 is the value 100 telling the interpreter to push the next 100 bytes of the script onto the stack.

OP_PUSHDATA2

Script: 0x4d15320102030405…

The first byte 0x4d is the value for OP_PUSHDATA2 that tells the interpreter that the next 2 bytes, using little endian, specify the number of bytes to read from the script and push onto the stack. 0x1532 is the little endian representation of the number 12,821, telling the interpreter to push the next 12,821 bytes of the script onto the stack.

OP_PUSHDATA4

Script: 0x4e153247000102030405…

The first byte 0x4e is the value for OP_PUSHDATA4 that tells the interpreter that the next 4 bytes, using little endian, specify the number of bytes to read from the script and push onto the stack. 0x15324700 is the little endian representation for the number 4,665,877 telling the interpreter to push the next 4,665,877 bytes of the script onto the stack.
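
Reading a push data back out of a script follows the same rules in reverse. The sketch below is illustrative only; the function name `read_push_data` and its return shape (the data plus the offset of the next op code) are assumptions, not part of this standard.

```python
def read_push_data(script: bytes, offset: int = 0):
    """Return (data, next_offset) for the push operation at offset."""
    opcode = script[offset]
    offset += 1
    if opcode <= 75:
        # The op code itself is the size.
        size = opcode
    elif opcode in (0x4C, 0x4D, 0x4E):
        # OP_PUSHDATA1/2/4: a 1, 2, or 4 byte little endian length follows.
        width = {0x4C: 1, 0x4D: 2, 0x4E: 4}[opcode]
        if offset + width > len(script):
            raise ValueError("truncated length field")
        size = int.from_bytes(script[offset:offset + width], "little")
        offset += width
    else:
        raise ValueError(f"op code 0x{opcode:02x} is not a push data")
    end = offset + size
    if end > len(script):
        raise ValueError("truncated push data")
    return script[offset:end], end


# The single byte push example from above: 8 bytes of data, 9 bytes consumed.
data, end = read_push_data(bytes.fromhex("080102030405060708"))
print(data.hex(), end)  # 0102030405060708 9
```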

Numbers

Numbers used to specify the number of protocol identifiers and the number of push datas will use standard Bitcoin “script” format.

For numbers 1 through 16, use op codes OP_1 through OP_16 (hex byte values 0x51 through 0x60). A push data is used for larger numbers. The push data’s data is a little endian number, where the first byte is the least significant and the last byte is the most significant. The highest bit of the most significant byte (the last byte) is a sign flag: if it is set then the number is negative. If that byte is exactly 0x80 then it carries only the sign and is not used as part of the value; otherwise the sign bit is cleared and the remaining bits of that byte are part of the value. To represent a positive number whose highest bit would otherwise be set, like 128 (0x80), a zero byte is added as the most significant byte (the last byte). For the purposes of this protocol, only positive numbers are valid, since they are counts of protocol identifiers or push datas, so the only special rule needed is to add a zero byte if the last byte has its most significant bit set. Otherwise, trailing zero bytes are removed when encoding.

Examples

Less than or equal to 16

To specify 3 protocol IDs or push datas simply use the specific op code OP_3.

Script: OP_3

Greater than 16

To specify 25 protocol IDs or push datas, a push data must be used to push the value 25 onto the stack.

Script: 0x0119

0x01 tells the interpreter to push 1 byte to the stack. The hex value 0x19 equals 25 (in decimal).

Highest bit set

To specify 128 protocol IDs or push datas, a push data with an extra zero byte is needed to distinguish the value from a negative number. This is because script numbers treat the highest bit of the most significant byte as a sign flag, so a number whose last byte has that bit set is read as negative.

Script: 0x028000

0x02 tells the interpreter to push 2 bytes to the stack. The byte 0x80 would represent the value 128 if it were an unsigned integer, but as a script number its highest bit is read as the sign flag, which leaves no value bits. To keep the number positive, a zero byte 0x00 must be added to the end. To clarify, the interpreter would interpret 0x0180 as negative zero, not as 128.

Values higher than 8 bits

To specify values over 255, more than 1 value byte using little endian is required.

Script: 0x021532

0x02 tells the interpreter to push 2 bytes to the stack. The bytes 0x1532 in little endian represent the value 12,821.
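
The number rules in this section can be sketched as follows, assuming the standard Bitcoin script number encoding (little endian, with the highest bit of the last byte acting as a sign flag). The function names `encode_script_number`, `decode_script_number`, and `encode_count` are hypothetical.

```python
def encode_script_number(value: int) -> bytes:
    """Encode a non-negative integer as a minimal script number."""
    if value < 0:
        raise ValueError("Envelope counts are never negative")
    if value == 0:
        return b""
    out = value.to_bytes((value.bit_length() + 7) // 8, "little")
    if out[-1] & 0x80:
        # Highest bit set: append a zero byte so the number stays positive.
        out += b"\x00"
    return out


def decode_script_number(data: bytes) -> int:
    """Decode a script number; the last byte's highest bit is the sign."""
    if not data:
        return 0
    magnitude = int.from_bytes(data[:-1] + bytes([data[-1] & 0x7F]), "little")
    return -magnitude if data[-1] & 0x80 else magnitude


def encode_count(count: int) -> bytes:
    """Encode a count as OP_1..OP_16, or as a push data for larger values."""
    if count < 1:
        raise ValueError("this sketch only encodes positive counts")
    if count <= 16:
        return bytes([0x50 + count])  # OP_1 is 0x51, OP_16 is 0x60
    number = encode_script_number(count)
    return bytes([len(number)]) + number


# The four examples from this section.
print(encode_count(3).hex(), encode_count(25).hex(),
      encode_count(128).hex(), encode_count(12821).hex())
# 53 0119 028000 021532
```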

Non-executable Data

Non-executable data is data within a Bitcoin script that is provably not executed by the interpreter. Such data is found:

  • After an OP_RETURN
  • Within an OP_FALSE OP_IF block
  • In a push data that is removed with OP_DROP after being pushed

Sub-protocol

Sub-protocol is used to refer to protocols that are used within the Envelope protocol and referenced by the Envelope protocol.

Specification

The data identified as non-executable by the methods above is in the following format:

  • It starts with a push data that contains the value 0xbd01 to identify the envelope protocol version 1.
  • The first byte is the envelope protocol and the second is the version. A push data op code to push 2 bytes is 0x02, so the first piece of data should be 0x02bd01.

This is followed by 1 or more self-contained sections of data as shown below:

Section

Envelope data is divided into sections to allow for multiple sets of sub-protocols to be included. Each section of data starts with an Envelope section header that provides information about the data in that section. The Envelope section header is as follows:

  1. A number that specifies the number of protocol identifiers (e.g. OP_1).
  2. A push data for each protocol identifier being used. A protocol identifier can be any unique data wrapped in a push data. The order of the identifiers is the order used to decode, and the reverse order should be used to encode. For example, the first might be an encryption protocol, and the next a data format protocol. When the data is written, it is first formatted according to the data format and then encrypted. When read, it is first decrypted and then read in the data format.
  3. A number that specifies the number of push datas (or op codes) that are encoded using the specified protocols.
  4. After the specified number of push datas, if the non-executable part of the script still has data remaining, then it is assumed that a new envelope section starts with a new section header, including a new set of protocol identifiers.

Name | Type | Note
Protocol Count | Script Number (OP_1, …) |
Protocol Identifiers | push data | Repeats “Protocol Count” times
Push Data Count | Script Number (OP_1, …) |
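
A section header followed by its push datas can be assembled as sketched below. This is an illustration, not the reference implementation: the helper names (`push`, `count`, `build_section`, `build_envelope_script`) are hypothetical, and the script is assumed to use the OP_FALSE OP_RETURN form with "proto1" as a made-up protocol identifier.

```python
ENVELOPE_PREFIX = bytes.fromhex("02bd01")  # push of 0xbd01: Envelope, version 1


def push(data: bytes) -> bytes:
    """Wrap data in the smallest push operation that fits its length."""
    size = len(data)
    if size <= 75:
        return bytes([size]) + data
    for opcode, width in ((0x4C, 1), (0x4D, 2), (0x4E, 4)):
        if size < 1 << (8 * width):
            return bytes([opcode]) + size.to_bytes(width, "little") + data
    raise ValueError("data too large for a single push")


def count(n: int) -> bytes:
    """Encode a positive count as OP_1..OP_16 or a pushed script number."""
    if n < 1:
        raise ValueError("this sketch only encodes positive counts")
    if n <= 16:
        return bytes([0x50 + n])
    number = n.to_bytes((n.bit_length() + 7) // 8, "little")
    if number[-1] & 0x80:
        number += b"\x00"  # keep the script number positive
    return bytes([len(number)]) + number


def build_section(protocol_ids, push_datas) -> bytes:
    """Protocol count, protocol identifiers, push data count, push datas."""
    out = count(len(protocol_ids))
    for protocol_id in protocol_ids:
        out += push(protocol_id)
    out += count(len(push_datas))
    for data in push_datas:
        out += push(data)
    return out


def build_envelope_script(sections) -> bytes:
    """OP_FALSE OP_RETURN, the Envelope prefix, then each section in order."""
    script = bytes([0x00, 0x6A]) + ENVELOPE_PREFIX
    for protocol_ids, push_datas in sections:
        script += build_section(protocol_ids, push_datas)
    return script


# One section, one hypothetical protocol, one push data.
print(build_envelope_script([([b"proto1"], [b"some proto1 data"])]).hex())
```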

Sub-protocols are processed in a specific order so that it is clear which push datas apply to each of them. For example, if the first protocol specified is for encryption, then the first push data can be a header that defines how the data following it is encrypted. Or, if the first protocol is for Metanet, then the first push data can contain the Metanet data and the rest of the push datas can be interpreted according to the protocols that follow. So, for example, you could specify Metanet first, so that it is unencrypted, then specify an encryption protocol, followed by a data format protocol. When processing, the Metanet protocol can “eat” the first push data, then leave the rest to be decrypted.
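
The processing order described above can be sketched with toy handlers. Everything here is hypothetical: the identifiers MNET, CRYPT, and proto1, the handler signatures, and the one-byte XOR "encryption" (a stand-in that is not real encryption) are assumptions made only to show each protocol consuming its own push datas and passing the rest on.

```python
def mnet_handler(pushes, state):
    # Hypothetical Metanet-like protocol: "eats" only the first push data.
    state["mnet"] = pushes[0]
    return pushes[1:]


def crypt_handler(pushes, state):
    # Hypothetical encryption protocol: the first push data is a header
    # holding a one-byte XOR key; the remaining push datas are decrypted.
    key = pushes[0][0]
    return [bytes(b ^ key for b in p) for p in pushes[1:]]


def proto1_handler(pushes, state):
    # Hypothetical data format protocol: reads what is left as ASCII text.
    state["proto1"] = [p.decode("ascii") for p in pushes]
    return []


HANDLERS = {b"MNET": mnet_handler, b"CRYPT": crypt_handler, b"proto1": proto1_handler}


def process_section(protocol_ids, pushes):
    """Apply each protocol in the listed (decode) order."""
    state = {}
    remaining = list(pushes)
    for protocol_id in protocol_ids:
        remaining = HANDLERS[protocol_id](remaining, state)
    return state


# Encoding applies the protocols in reverse: format first, then encrypt.
key = 0x2A
secret = bytes(b ^ key for b in b"hello")
result = process_section(
    [b"MNET", b"CRYPT", b"proto1"],
    [b"node-info", bytes([key]), secret],
)
print(result)  # {'mnet': b'node-info', 'proto1': ['hello']}
```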

Examples

The following are possible sub-protocols (for example purposes only).

Simple Single Protocol

OP_FALSE OP_RETURN 0x02bd01 OP_1 0x06 “proto1” OP_1 0x10 “some proto1 data”

Data | Description
OP_FALSE OP_RETURN | Specifies that the script is unspendable and contains data.
0x02bd01 | This is the push data containing the Envelope protocol ID that specifies that the following data is in accordance with the Envelope protocol.
OP_1 | Only 1 protocol is used for the following data.
0x06 “proto1” | The 0x06 specifies a push data containing 6 bytes. The bytes are ASCII “proto1” and specify a hypothetical data protocol.
OP_1 | Only 1 push data is used for the proto1 protocol data.
0x10 “some proto1 data” | This push data is the data according to the proto1 protocol.

Encrypted Data Protocol

OP_FALSE OP_RETURN 0x02bd01 OP_2 0x05 “CRYPT” 0x06 “proto1” OP_2 0x1b0000000100... 0x…

Data | Description
OP_FALSE OP_RETURN | Specifies that the script is unspendable and contains data.
0x02bd01 | The push data containing the Envelope protocol ID that specifies the following data is in accordance with the Envelope protocol.
OP_2 | 2 protocols are used for the following data.
0x05 “CRYPT” | The 0x05 specifies a push data containing 5 bytes. The bytes are ASCII “CRYPT” and specify the hypothetical encryption protocol.
0x06 “proto1” | The 0x06 specifies a push data containing 6 bytes. The bytes are ASCII “proto1” and specify the hypothetical data protocol.
OP_2 | There are 2 push datas used by the preceding protocols.
0x1b0000000100... | A push data containing the header data for the hypothetical “CRYPT” protocol. The specifics are irrelevant for this discussion.
0x… | A push data containing the proto1 data that is encrypted with CRYPT.

Encrypted And Unencrypted Data Protocol

OP_FALSE OP_RETURN 0x02bd01 OP_2 0x05 “CRYPT” 0x06 “proto1” OP_2 0x1b0000000100... 0x... OP_1 0x06 “proto1” OP_1 0x10 “some proto1 data”

Data | Description
OP_FALSE OP_RETURN | Specifies that the script is unspendable and contains data.
0x02bd01 | This is the push data containing the Envelope protocol ID which specifies that the following data is in accordance with the Envelope protocol.
OP_2 | 2 protocols are used for the related data.
0x05 “CRYPT” | The 0x05 specifies a push data containing 5 bytes. The bytes are ASCII “CRYPT” and specify the encryption protocol.
0x06 “proto1” | The 0x06 specifies a push data containing 6 bytes. The bytes are ASCII “proto1” and specify a hypothetical data protocol.
OP_2 | There are 2 push datas used by the preceding protocols.
0x1b0000000100... | A push data containing the header data for the hypothetical “CRYPT” protocol. The specifics are irrelevant for this discussion.
0x… | A push data containing the proto1 data that is encrypted with CRYPT.
OP_1 | This is the start of a new section since the 2 push datas specified in the previous section header were consumed. OP_1 specifies that only 1 protocol is used for the following data.
0x06 “proto1” | The 0x06 specifies a push data containing 6 bytes. The bytes are ASCII “proto1” and specify a hypothetical data protocol.
OP_1 | Only one push data is used for the proto1 protocol data.
0x10 “some proto1 data” | This push data is the data according to the proto1 protocol.
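
Splitting envelope data into its sections can be sketched as below. The function names are hypothetical, the input is assumed to be the bytes that follow OP_FALSE OP_RETURN, and only push datas are handled (bare op codes inside a section are not). The CRYPT header and encrypted bytes in the usage example are placeholder values, not the elided data from the examples above.

```python
def read_push(script: bytes, pos: int):
    """Return (data, next_pos) for the push operation at pos."""
    if pos >= len(script):
        raise ValueError("unexpected end of script")
    op = script[pos]
    pos += 1
    if op <= 75:
        size = op
    elif op in (0x4C, 0x4D, 0x4E):
        width = {0x4C: 1, 0x4D: 2, 0x4E: 4}[op]
        size = int.from_bytes(script[pos:pos + width], "little")
        pos += width
    else:
        raise ValueError("expected a push data")
    if pos + size > len(script):
        raise ValueError("truncated push data")
    return script[pos:pos + size], pos + size


def read_count(script: bytes, pos: int):
    """Read OP_1..OP_16, or a pushed positive script number."""
    if pos >= len(script):
        raise ValueError("unexpected end of script")
    op = script[pos]
    if 0x51 <= op <= 0x60:
        return op - 0x50, pos + 1
    data, pos = read_push(script, pos)
    if not data or data[-1] & 0x80:
        raise ValueError("count must be a positive script number")
    return int.from_bytes(data, "little"), pos


def parse_envelope(payload: bytes):
    """Return a list of (protocol_ids, push_datas) tuples, one per section."""
    if payload[:3] != bytes.fromhex("02bd01"):
        raise ValueError("not an Envelope version 1 payload")
    pos, sections = 3, []
    while pos < len(payload):
        id_count, pos = read_count(payload, pos)
        protocol_ids = []
        for _ in range(id_count):
            protocol_id, pos = read_push(payload, pos)
            protocol_ids.append(protocol_id)
        push_count, pos = read_count(payload, pos)
        push_datas = []
        for _ in range(push_count):
            data, pos = read_push(payload, pos)
            push_datas.append(data)
        sections.append((protocol_ids, push_datas))
    return sections


# Two sections; b"\xaa\xbb" and b"\x01\x02\x03" are placeholder bytes.
payload = (bytes.fromhex("02bd01")
           + b"\x52\x05CRYPT\x06proto1\x52\x02\xaa\xbb\x03\x01\x02\x03"
           + b"\x51\x06proto1\x51\x10some proto1 data")
for protocol_ids, push_datas in parse_envelope(payload):
    print(protocol_ids, push_datas)
```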

Layering Diagram

This diagram shows how multiple sub-protocols can be combined via appending or layering depending on the specifics of the sub-protocol. This is an example and the actual data can be in a different place in the script and include different sub-protocols. This Envelope consists of 2 Envelope sections. The first uses the MNET sub-protocol and the second uses the CRYPT and proto1 sub-protocols.

History

Artifact | Description
Errata |
Previous versions |
Change Log | Protobuf was determined to be too complex for this purpose and so was replaced with simpler bitcoin like binary data encoding. Extensions were determined to be too complex and not agnostic enough for a top level protocol and so Envelope was changed to be completely agnostic but to enable combining of sub-protocols to allow MetaNet, encryption, and other standards to be combined with data formatting protocols. Standard Bitcoin script numbers were determined to be better than a simpler custom number format.
Decision Log | Extensions (Between versions 0.1 and 0.2)

The original protocol contained optional “extensions” for metanet and encryption. It was decided that the base protocol should be completely agnostic and provide the ability to combine multiple protocols in different ways to enable that type of functionality more dynamically.
This was based on feedback from Jaime Salom Viñado, Jonathan Aird, and Jack Davies.

Encoding Change (Between versions 0.1 and 0.2)

The original protocol was a pushdata for the Envelope protocol ID followed by a pushdata containing Protobuf data. It was decided that a simpler encoding is more appropriate since there should be no extensions, so very few fields and very little data needs to be at the Envelope protocol level. Simpler is better.
This was based on feedback from Roger Taylor and Jonathan Aird.

Numbers (Between versions 0.2 and 0.3)

Version 0.2 had a simplified version of bitcoin script numbers used for protocol id and push data counts because negative numbers are not needed. It was decided to use standard bitcoin script numbers instead so it is more standardized even though they require special rules to handle negative numbers.
This was based on feedback from Jonathan Aird and Roger Taylor.

Relationships

Relationship | Description
IP licences and dependencies | This standard was created as an independent work under the auspices of the Bitcoin SV Technical Standards Committee. Whilst best efforts have been made to ensure that this standard and its implementations do not infringe intellectual property rights of any third party, Bitcoin Association can offer no guarantee relating to third party intellectual property rights.
Copyright | © 2021 Bitcoin Association. All rights reserved. Unless otherwise specified, or required in the context of its implementation on BSV Blockchain, no part of this standard may be reproduced or utilised otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission of Bitcoin Association.
Extends |
Modifies |
Deprecates |
Depends On |
Prior Art | https://github.com/bitcoin-sv-specs/op_return https://bitcom.bitdb.network/#/
Existing Solution |
References |
