Questions tagged [hash-tree]

A hash tree (or Merkle tree) is a method of hierarchically hashing data. Hash trees allow efficient parallel hashing and updates, and make it possible to verify partial data.

A hash tree (or Merkle tree, named after Ralph Merkle, who patented them in 1979 for use in combination with the Lamport one-time signature scheme) is a method of hierarchically hashing data.

In a typical hash tree implementation, the data to be hashed is first split into a number of segments (if it is not already so structured). Each of these segments is hashed using an ordinary cryptographic hash function, with the resulting hash values forming the lowest level of the tree.

The hash values at the lowest level are then divided into groups, often of fixed size. (For example, in a binary hash tree, each group would consist of two hash values.) These groups are then each hashed, with the resulting hash values forming the second-lowest level of the tree, and so on. This process is repeated to build the tree from the bottom up, until, at the top level of the tree, only a single hash value remains.
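The bottom-up construction described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the names `sha256` and `merkle_root`, the choice of SHA-256, and the rule that an unpaired hash is promoted unchanged to the next level are all assumptions made for the example. It also uses the same hash function at every level, which a practical scheme should not do.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Ordinary cryptographic hash used for every node (an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


def merkle_root(segments) -> bytes:
    """Return the top-level hash of a binary hash tree over a non-empty list of byte segments."""
    if not segments:
        raise ValueError("need at least one segment")
    # Lowest level of the tree: one hash per data segment.
    level = [sha256(s) for s in segments]
    # Repeatedly hash groups of two until a single hash value remains.
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            group = level[i:i + 2]
            if len(group) == 2:
                next_level.append(sha256(group[0] + group[1]))
            else:
                # Assumption: a hash without a partner is promoted unchanged.
                next_level.append(group[0])
        level = next_level
    return level[0]


root = merkle_root([b"seg0", b"seg1", b"seg2"])
```

How an odd number of hashes is handled differs between real schemes, so the promotion rule here should be read as one possible choice only.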

The single hash value at the top level can be used to verify the entire dataset. However, if the hash values at intermediate levels of the tree are also transmitted, they can be used to verify parts of the data without requiring knowledge of the rest. Also, if parts of the data are changed, the hash tree can be efficiently updated by recomputing only the hashes on the path from each changed segment to the root. These features make hash trees useful for verifying the integrity of filesystems or databases, as well as for detecting errors in data transmitted out of order, e.g. over peer-to-peer file transfer networks.
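As a rough illustration of partial verification, the sketch below builds all levels of a binary tree, extracts the sibling hashes along the path from one segment to the root, and checks that segment against the root alone. The helper names (`build_levels`, `make_proof`, `verify`), the use of SHA-256, and the promotion of unpaired hashes are assumptions of this example, and the segment list is assumed to be non-empty.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(segments):
    """Return every level of the tree, lowest level first, root level last."""
    levels = [[h(s) for s in segments]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        nxt = [
            h(cur[i] + cur[i + 1]) if i + 1 < len(cur) else cur[i]  # promote unpaired hash
            for i in range(0, len(cur), 2)
        ]
        levels.append(nxt)
    return levels


def make_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from the leaf at `index` up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):  # no entry where the node was promoted unchanged
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(segment, proof, root):
    """Recompute the path to the root from one segment and its sibling hashes."""
    node = h(segment)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


segments = [b"s0", b"s1", b"s2", b"s3", b"s4"]
levels = build_levels(segments)
root = levels[-1][0]
proof = make_proof(levels, 2)
ok = verify(b"s2", proof, root)
```

The proof contains at most one hash per tree level, so its size grows logarithmically with the number of segments. Updating a changed segment touches the same path: only the hashes that `verify` recomputes need to be replaced.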

Another benefit of hash trees over conventional hashing is that, since different branches of the hash tree may be computed independently, the hash calculation and verification can be easily parallelized.
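A minimal sketch of this idea, assuming SHA-256 and a thread pool from the Python standard library: the lowest-level hashes are independent of each other, so they can be computed concurrently. The function name `leaf_hashes_parallel` and the `workers` parameter are hypothetical, and whether threads give a real speedup depends on segment size and the runtime.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def leaf_hashes_parallel(segments, workers=4):
    """Hash each segment independently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: hashlib.sha256(s).digest(), segments))


hashes = leaf_hashes_parallel([b"a" * 4096, b"b" * 4096, b"c" * 4096])
```

The same approach applies to the upper levels, since each group of child hashes can be hashed without waiting for other groups on the same level.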

Some care is needed when designing practical tree hashing schemes to avoid collision or preimage attacks, particularly relating to the choice of hash functions used at different levels of the tree, as well as the handling of input data whose length in segments is not a power of two (or, more generally, of the number of children per node). For example, in a naïve hash tree implementation, where the same hash function is used for all levels and the hashes of child nodes are simply concatenated to obtain the input for the parent node's hash, an attacker can trivially generate second preimages of the root hash simply by submitting the concatenation of the second-level hashes as a message.
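The second-preimage problem can be demonstrated with a short sketch. Here `root` computes a binary tree root with optional prefixes for leaf and internal-node inputs, and `forge` builds a shorter message whose segments are concatenations of adjacent leaf hashes. Both names are hypothetical; the one-byte prefixes `0x00` for leaves and `0x01` for internal nodes follow the domain separation used in RFC 6962, but the rest of the tree layout is an assumption of this example, and `forge` assumes an even number of segments.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def root(segments, leaf_prefix=b"", node_prefix=b""):
    """Binary hash tree root; empty prefixes give the naive scheme."""
    level = [h(leaf_prefix + s) for s in segments]
    while len(level) > 1:
        level = [
            h(node_prefix + level[i] + level[i + 1]) if i + 1 < len(level) else level[i]
            for i in range(0, len(level), 2)
        ]
    return level[0]


def forge(segments, leaf_prefix=b""):
    """Attacker's message: each segment is the concatenation of two adjacent leaf hashes."""
    hashes = [h(leaf_prefix + s) for s in segments]
    return [hashes[i] + hashes[i + 1] for i in range(0, len(hashes), 2)]


data = [b"A", b"B", b"C", b"D"]

# Naive scheme: the two-segment forged message has the same root as the original.
naive_collides = root(data) == root(forge(data))

# With distinct prefixes for leaves and internal nodes, the forgery no longer matches.
separated_collides = root(data, b"\x00", b"\x01") == root(forge(data, b"\x00"), b"\x00", b"\x01")
```

In the naive case the forged segments hash to exactly the values of the original second-lowest level, so everything above them is identical. With separate prefixes, a leaf input can never be interpreted as an internal-node input, which is what blocks this particular attack.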

101 questions
24
votes
2 answers

What is the purpose of using different hash functions for the leaves and internals of a hash tree?

We previously learned that the THEX hash tree specification, which is used by some P2P systems, requires that two different hash functions be used: one for the leaf nodes (hashes of input data), and one for the internal/branch nodes (hashes of…
user1114
18
votes
2 answers

Merkle hash tree updates

It seems that Merkle hash tree (MHT) traversals have been discussed somewhat in the literature, but there does not appear to be much written on inserting, deleting, and updating leaves. Is this lack of material regarding updating MHTs possibly due…
user3150164
17
votes
3 answers

Is there a standardized tree hash?

SHA-1, SHA-2, and the standardized version of SHA-3 are all sequential. This is impractical for hashing very large files distributed across machines. Any sequential hash can be straightforwardly converted into an efficiently parallelized hash…
Geoffrey Irving
16
votes
1 answer

How to calculate the hash of an unordered set

Suppose I have a set of elements, each with a known hash (e.g. SHA-2). How can I calculate the hash of the set? By this I mean an unordered set, so the order of elements is undefined and shall not play any role in determining the hash of the set. In theory…
ragazzojp
15
votes
4 answers

Proof of non-membership on a Merkle tree?

Assume a user $U$ and a server $S$. $U$ uploads its data and wants later to perform an authenticity check. It also sends a Merkle tree to the server. Let’s say we would like $U$ to ask for a specific element in the tree. $S$ then returns the leaf…
curious
13
votes
2 answers

Efficient Incremental Updates to Large Merkle Tree

I have a data set with 300 million entries, and every 5 minutes 4000 random entries in this table change. I need to calculate the Merkle root on this data set to validate integrity multiple times every 5 minutes. Assuming SHA-224 hashes this would be…
bytemaster
11
votes
1 answer

Tiger Tree Hash vs generic Merkle Tree

Is there any advantage of using Tiger Tree Hash over any other hash function organized as a Merkle tree? Are there maybe any properties of TIGER that, say, SHA-2 or BLAKE in a Merkle tree do not have? And in general, is there any point of choosing…
toriningen
10
votes
1 answer

Is hashing large files CPU or I/O bound?

For SHA3, if I were to hash large files (> 1GB), would that operation be CPU or I/O bound? Suppose I were to exploit parallelism and either use multicore/SIMD execution or generate a hash tree, would that also be constrained by the same resource?
sha3_tree
10
votes
3 answers

What is the reason to separate domains in the internal hash algorithm of a Merkle tree hash?

RFC 6962 states: "Note that the hash calculations for leaves and nodes differ. This domain separation is required to give second preimage resistance." That means that whenever the hash computes on leaves a distinct known element is…
curious
9
votes
1 answer

Are there cryptographic hash functions with homomorphic properties?

Are there cryptographic hash functions that have homomorphism-like properties? E.g. satisfying the following relation $h(a || b) = h(a) · h(b)$, where $h(x)$ is the hash function itself, $x || y$ is concatenation and $x · y$ is some hash-specific…
toriningen
8
votes
1 answer

Is there a hash tree scheme designed for complex data structures?

I have a JSON object with private data. It has the following (complex!) structure: { name: "JB", age: 35, children: [ { name: "Alice", age: "5", favColor: "pink" }, { …
8
votes
0 answers

Are there conventions for signing JSON as a tree, to allow proofs of signed subtrees?

Given some JSON with a chosen encoding, you can obviously cryptographically-sign the whole thing as a binary blob. However, it might be useful if the logical structure of the JSON-compatible object were leveraged to control what is signed, in such…
gojomo
8
votes
1 answer

Is it possible to build short proofs of arbitrary computations over a big list?

With the use of Merkle trees, you can prove the presence of an element in a very big list, with an amount of information logarithmic in the size of the whole tree. Merkle proofs, thus, probabilistically confirm the result of this specific…
MaiaVictor
8
votes
1 answer

How does a "Tiger Tree Hash" handle data whose size isn't a power of two?

Constructing a hash tree is simple enough if the data fits into a number of blocks that is a power of two.

        root = H(H(A+B)+H(C+D))
              /          \
        H(A+B)            H(C+D)
        /    \            /    \
       A      B          C      D

In…
user1114
8
votes
3 answers

Stateless hash based public key cryptography?

Merkle-Winternitz signatures based on fractal hash trees are an attractive alternative to other post-quantum cryptographic schemes, in particular since they are conceptually simple, the security properties are easily understood and they are easy to…
Henrick Hellström