Hash Functions in XRPL | XRPL Security & Cryptography | XRP Academy - XRP Academy
Advanced · 55 min

Hash Functions in XRPL

Learning Objectives

Explain the three security properties of cryptographic hash functions—pre-image resistance, second pre-image resistance, and collision resistance—with practical examples of what each prevents

Map specific hash function applications throughout XRPL's architecture, understanding why SHA-256, SHA-512, and RIPEMD-160 are each used where they are

Evaluate SHA-256's security margins against current and foreseeable attacks, distinguishing between theoretical attacks and practical threats

Analyze how Merkle trees use hash functions to enable efficient verification of account states without requiring complete ledger history

Assess length extension attacks and proper hash construction, understanding why naive hashing can create vulnerabilities

Imagine a machine that turns any input into a short, fixed-size fingerprint that:

  • Is completely different if you change even one bit of input
  • Cannot be reversed to recover the original input
  • Makes it practically impossible to find two different inputs producing the same output

This machine is a cryptographic hash function, and XRPL relies on one at nearly every layer.

Every transaction has a unique hash that identifies it. Every ledger version contains a hash linking to the previous version. Your XRPL address is derived from hashes of your public key. The Merkle trees that prove account balances are built from hashes.

Break the hash function, and you break the blockchain. That's why understanding hash security isn't academic—it's the foundation of ledger integrity.


A cryptographic hash function is a mathematical function with specific security properties that make it useful for security applications. Understanding these properties reveals what hash functions can and cannot guarantee.

Hash Function Basics:

Input: Any data of any length
Output: Fixed-size digest (256 bits for SHA-256)

- Same input always produces same output
- No randomness involved
- Anyone can verify: H(x) → y

- Computing hash takes microseconds
- Linear in input size (longer input = proportionally longer time)
- No special hardware required

- Easy: x → H(x)
- Hard: H(x) → x (finding x given output)
- This asymmetry is the foundation of security
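
The basics above (fixed-size output, determinism) can be checked directly. This is a minimal sketch using Python's standard `hashlib`; the helper name `sha256_hex` is only an illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hex characters."""
    return hashlib.sha256(data).hexdigest()


# Fixed-size output: one byte and one million bytes both give 256 bits.
short_digest = sha256_hex(b"x")
long_digest = sha256_hex(b"x" * 1_000_000)
print(len(short_digest), len(long_digest))

# Deterministic: the same input always produces the same digest.
print(sha256_hex(b"Send 100 XRP") == sha256_hex(b"Send 100 XRP"))
```

The easy direction is the one-line call to `hashlib.sha256`. Nothing in the library offers the reverse direction.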

Definition: Given a hash output h, it should be computationally infeasible to find any input x such that H(x) = h.

Pre-Image Resistance:

- Attacker has: hash value h
- Attacker wants: any x where H(x) = h
- Difficulty: ~2^n for n-bit hash

Why It Matters:

  • You publish: Transaction + Hash

  • Attacker sees: Hash

  • Attacker cannot: Find different transaction with same hash

  • Database stores: H(password)

  • Attacker steals: H(password)

  • Attacker cannot: Compute password from hash

  • Pre-image resistance: 256 bits

  • Requires: ~2^256 operations to find pre-image

  • Status: No attack significantly better than brute force

Definition: Given an input x₁, it should be computationally infeasible to find a different input x₂ such that H(x₁) = H(x₂).

Second Pre-Image Resistance:

- Attacker has: Original document x₁ and its hash H(x₁)
- Attacker wants: Different document x₂ with H(x₂) = H(x₁)
- Difficulty: ~2^n for n-bit hash

Why It Matters:

  • Original: "Send 100 XRP to Alice" → hash h

  • Attacker wants: "Send 100,000 XRP to Attacker" → same hash h

  • If possible: Could substitute malicious transaction

  • Reality: Computationally infeasible for good hash functions

  • Contract signed with hash commitment

  • Attacker cannot create different contract with same hash

  • Digital signatures remain valid only for original document

  • Second pre-image resistance: ~256 bits

  • No known attack better than generic 2^256 search

Definition: It should be computationally infeasible to find any two different inputs x₁ and x₂ such that H(x₁) = H(x₂).

Collision Resistance:

- Attacker wants: ANY two different inputs with same hash
- No constraint: Attacker chooses both inputs
- Difficulty: ~2^(n/2) due to birthday paradox

- 23 people → 50% chance two share birthday
- Not comparing to specific date: any match counts
- For N = 365 possibilities: about 1.18 × √365 ≈ 23 people needed for 50%, on the order of √N rather than N

- 256-bit hash → 2^256 possible outputs
- Birthday attack → ~2^128 hashes to find collision
- Collision resistance = n/2 bits for n-bit hash
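
The gap between 2^n and 2^(n/2) is easier to believe after seeing it on a small scale. The sketch below computes the exact birthday probability, then attacks SHA-256 truncated to a few bits. The truncation and the helper names are assumptions made purely for demonstration; full 256-bit SHA-256 is far out of reach of this approach.

```python
import hashlib
from itertools import count


def birthday_probability(people: int, days: int = 365) -> float:
    """Exact probability that at least two of `people` share a birthday."""
    p_all_unique = 1.0
    for i in range(people):
        p_all_unique *= (days - i) / days
    return 1.0 - p_all_unique


def truncated_hash(data: bytes, bits: int) -> int:
    """Top `bits` bits of SHA-256(data): a deliberately weak toy hash."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int):
    """Birthday search: any two inputs with equal truncated hashes."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = truncated_hash(msg, bits)
        if h in seen:
            return seen[h], msg, i + 1  # two messages, hashes computed
        seen[h] = msg


def find_preimage(target: int, bits: int):
    """Brute-force search for an input hitting one specific target."""
    for i in count():
        msg = b"guess-" + str(i).encode()
        if truncated_hash(msg, bits) == target:
            return msg, i + 1


print(birthday_probability(22), birthday_probability(23))
_, _, collision_tries = find_collision(16)
_, preimage_tries = find_preimage(truncated_hash(b"target", 16), 16)
print("collision tries:", collision_tries, "pre-image tries:", preimage_tries)
```

On a 16-bit toy hash the collision search typically needs a few hundred attempts, while the pre-image search typically needs tens of thousands. The same square-root relationship separates 2^128 from 2^256.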

Why It Matters:

  • Collisions could create ambiguous proofs

  • Two different states appearing identical

  • Would break efficient verification

  • "I'll reveal my bid later, here's the hash"

  • If collision exists: could reveal different bid

  • Breaks fairness of auctions, voting, etc.

  • Collision resistance: 128 bits

  • Requires: ~2^128 operations to find collision

  • Status: No collision ever found

SHA-256 Security Properties:

Property              | Security Level | Attack Complexity | Status
----------------------|----------------|-------------------|--------
Pre-image             | 256 bits       | 2^256 operations  | Unbroken
Second pre-image      | 256 bits       | 2^256 operations  | Unbroken
Collision             | 128 bits       | 2^128 operations  | Unbroken

- Pre-image: Find ANY x for given H(x) = h
- Second pre-image: Find x₂ ≠ x₁ for given x₁ where H(x₂) = H(x₁)
- Collision: Find ANY x₁, x₂ where H(x₁) = H(x₂)

Collision is "easiest" to find (only 128-bit security)
but 128-bit is still astronomically difficult.

XRPL uses multiple hash functions for different purposes. Understanding why reveals careful engineering decisions.

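SHA-256 and SHA-512Half in XRPL:

XRPL computes its 256-bit identifiers (transaction IDs, ledger hashes, tree node hashes) with SHA-512Half: the first 256 bits of a SHA-512 digest. The output size and generic security levels match those given for SHA-256 above. SHA-256 itself is used in address derivation and in address checksums.

- Every signed transaction hashed with SHA-512Half
- Hash becomes transaction ID (64 hex characters)
- Used for lookups, references, deduplication

- Each ledger contains hash of previous ledger
- Creates tamper-evident chain
- Modifying history changes all subsequent hashes

- Proposals identified by hash
- Validators agree on ledger hash
- Enables efficient comparison without full data

- Account states hashed in Merkle tree
- Root hash in ledger header
- Enables state verification proofs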

Transaction Details:
Input: Raw transaction bytes
Output: 256-bit (32-byte) hash
Encoding: Usually hex (64 characters)
Example: 7F83B1657FF1FC53B92DC18148A1D65DFC2D4B1FA3D677284ADDD200126D9069
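
As a hedged sketch of how a 256-bit transaction ID can be computed: XRPL's reference documentation describes the ID as SHA-512Half (the first 256 bits of SHA-512) over a 4-byte "TXN" prefix followed by the signed transaction's binary serialization. The function names are illustrative, and the blob is a dummy placeholder, not a valid transaction.

```python
import hashlib

# Hash prefix for transaction IDs: ASCII "TXN" followed by a zero byte.
TXN_ID_PREFIX = bytes.fromhex("54584E00")


def sha512_half(data: bytes) -> bytes:
    """First 32 bytes (256 bits) of the SHA-512 digest."""
    return hashlib.sha512(data).digest()[:32]


def transaction_id(signed_tx_blob: bytes) -> str:
    """Uppercase hex ID for a serialized, signed transaction blob."""
    return sha512_half(TXN_ID_PREFIX + signed_tx_blob).hex().upper()


# Dummy bytes standing in for a real serialized transaction (assumption).
dummy_blob = bytes.fromhex("1200002280000000")
print(transaction_id(dummy_blob))
```

The prefix gives domain separation: the same bytes hashed for a different purpose (for example as a ledger header) would produce an unrelated digest.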
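SHA-512 in XRPL:

- SHA-512Half (first 256 bits of SHA-512) for transaction IDs, ledger hashes, and tree nodes
- Ed25519 signature scheme internally
- Key derivation processes
- Where extended output needed

- 512-bit output (64 bytes)
- Security: 512-bit pre-image, 256-bit collision (256-bit and 128-bit when truncated to SHA-512Half)
- Often faster than SHA-256 on 64-bit processors
- Same design as SHA-256, with 64-bit words and a larger state

- SHA-256: Address derivation and checksums
- SHA-512: Required by EdDSA specification; truncated form used for IDs
- Both from SHA-2 family, well-analyzed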
RIPEMD-160 in XRPL:

Purpose: Compress public key hashes for addresses

- SHA-256 output: 32 bytes
- RIPEMD-160 output: 20 bytes
- Shorter addresses = better UX
- Combined: SHA-256(pubkey) → RIPEMD-160(result)

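- RIPEMD-160 alone: 80-bit collision resistance, 160-bit pre-image resistance
- Used after SHA-256: the combined output is still 160 bits, so generic attacks are bounded by that size
- Impersonating an existing address requires a second pre-image (~2^160 work), not a collision

- Bitcoin chose RIPEMD-160 in 2009
- XRPL adopted the same SHA-256 → RIPEMD-160 construction
- Legacy decision, still secure for purpose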
XRPL Hash Function Usage:

                        ┌────────────────┐
                        │   Public Key   │
                        │   (33 bytes)   │
                        └───────┬────────┘
                                │
                           SHA-256
                                │
                        ┌───────▼────────┐
                        │  Hash of Key   │
                        │   (32 bytes)   │
                        └───────┬────────┘
                                │
                          RIPEMD-160
                                │
                        ┌───────▼────────┐
                        │   Account ID   │
                        │   (20 bytes)   │
                        └───────┬────────┘
                                │
                          Base58Check
                                │
                        ┌───────▼────────┐
                        │    Address     │
                        │  (25-35 chars) │
                        └────────────────┘
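
The pipeline in the diagram can be sketched as follows. The assumptions are a 0x00 type prefix for account addresses, a checksum of the first four bytes of double SHA-256, and XRPL's own base58 alphabet. Note that `hashlib.new("ripemd160")` depends on the local OpenSSL build and may raise `ValueError` where legacy algorithms are disabled, so the demo line only exercises the encoding step.

```python
import hashlib

# XRPL's base58 dictionary (differs from Bitcoin's ordering).
XRPL_ALPHABET = "rpshnaf39wBUDNEGHJKLM4PQRST7VWXYZ2bcdeCg65jkm8oFqi1tuvAxyz"
ACCOUNT_PREFIX = b"\x00"  # type prefix for classic account addresses


def account_id_from_pubkey(pubkey: bytes) -> bytes:
    """SHA-256 then RIPEMD-160; needs RIPEMD-160 support in OpenSSL."""
    sha = hashlib.sha256(pubkey).digest()
    return hashlib.new("ripemd160", sha).digest()


def checksum(payload: bytes) -> bytes:
    """First 4 bytes of SHA-256(SHA-256(payload))."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def base58_encode(data: bytes) -> str:
    """Base58 with the XRPL alphabet; leading zero bytes become 'r'."""
    n = int.from_bytes(data, "big")
    chars = []
    while n > 0:
        n, rem = divmod(n, 58)
        chars.append(XRPL_ALPHABET[rem])
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return XRPL_ALPHABET[0] * leading_zeros + "".join(reversed(chars))


def encode_address(account_id: bytes) -> str:
    """Encode a 20-byte account ID as a checksummed address string."""
    if len(account_id) != 20:
        raise ValueError("account ID must be exactly 20 bytes")
    payload = ACCOUNT_PREFIX + account_id
    return base58_encode(payload + checksum(payload))


# Encoding step only, using an arbitrary 20-byte value as the account ID.
print(encode_address(bytes(range(20))))
```

Because the type prefix is a zero byte and zero bytes map to the first alphabet character, every classic address begins with "r". The checksum catches typos before funds are sent to a mistyped address.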

Transaction Processing:

┌─────────────────┐   SHA-512Half    ┌─────────────────┐
│   Transaction   │ ───────────────► │ Transaction ID  │
│     (bytes)     │                  │   (32 bytes)    │
└─────────────────┘                  └─────────────────┘

Ledger Chain:

┌─────────┐    ┌─────────┐    ┌─────────┐
│Ledger N │    │Ledger   │    │Ledger   │
│         │◄───│ N+1     │◄───│ N+2     │
│Hash: H₁ │    │Hash: H₂ │    │Hash: H₃ │
│Prev: H₀ │    │Prev: H₁ │    │Prev: H₂ │
└─────────┘    └─────────┘    └─────────┘
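
The tamper-evidence of the ledger chain can be shown with a toy model. This sketch reduces each "ledger" to an opaque body plus the previous hash and uses plain SHA-256 for brevity. Real XRPL ledger headers contain many more fields and are hashed differently, so treat the structure and names as assumptions.

```python
import hashlib

GENESIS_PREV = bytes(32)  # placeholder "previous hash" for the first entry


def header_hash(prev_hash: bytes, body: bytes) -> bytes:
    """Hash that commits to both the previous hash and this ledger's body."""
    return hashlib.sha256(prev_hash + body).digest()


def build_chain(bodies):
    """Link each body to its predecessor through the predecessor's hash."""
    chain, prev = [], GENESIS_PREV
    for body in bodies:
        current = header_hash(prev, body)
        chain.append({"prev": prev, "body": body, "hash": current})
        prev = current
    return chain


def verify_chain(chain) -> bool:
    """Recompute every hash and check every back-link."""
    prev = GENESIS_PREV
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if header_hash(entry["prev"], entry["body"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


chain = build_chain([b"ledger N", b"ledger N+1", b"ledger N+2"])
print(verify_chain(chain))
```

Editing the body of an early entry invalidates its stored hash. Fixing that hash then breaks the next entry's back-link, and so on down the chain, which is why rewriting history means recomputing every later hash.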


---

Understanding SHA-256's construction reveals why it's trusted for security-critical applications.

SHA-256 Structure (Merkle-Damgård):

Input:   Message of any length
Output:  256-bit (32-byte) digest

1. Padding: Add bits to make length multiple of 512
2. Parsing: Divide into 512-bit blocks
3. Initialization: Set 8 initial hash values
4. Processing: Apply compression function to each block
5. Output: Final hash values concatenated

┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐
│ Block 1 │   │ Block 2 │   │ Block 3 │   │ Block n │
└────┬────┘   └────┬────┘   └────┬────┘   └────┬────┘
     │             │             │             │
     ▼             ▼             ▼             ▼
   ┌───┐         ┌───┐         ┌───┐         ┌───┐
IV─►│ f │────────►│ f │────────►│ f │───...──►│ f │──► Hash
   └───┘         └───┘         └───┘         └───┘

f = compression function
IV = initialization vector
Each application updates 256-bit state
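
The block-by-block structure is visible through `hashlib`'s incremental interface. In this small sketch, feeding the message 64 bytes (512 bits) at a time gives the same digest as hashing it in one call, because the internal state simply carries over from block to block. The helper name is illustrative.

```python
import hashlib


def sha256_streamed(message: bytes, chunk_size: int = 64) -> str:
    """Hash a message by feeding it to SHA-256 one chunk at a time."""
    hasher = hashlib.sha256()
    for offset in range(0, len(message), chunk_size):
        hasher.update(message[offset:offset + chunk_size])
    return hasher.hexdigest()


message = b"A" * 200  # spans several 512-bit blocks
print(sha256_streamed(message) == hashlib.sha256(message).hexdigest())
```

The library handles padding when the digest is requested, so callers never add it themselves.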
SHA-256 Compression (High-Level):

- 64 rounds per 512-bit block
- Mix message bits into state
- Apply non-linear operations
- Shift and rotate bits

- AND, OR, XOR (bit-wise logic)
- Addition modulo 2^32
- Right-rotation and right-shift

- Mix bits thoroughly (diffusion)
- Create non-linear relationships (confusion)
- Reversible individually, irreversible combined
- Efficient on standard CPUs

- Each bit of output depends on all input bits
- After all rounds: thoroughly scrambled
- Cannot work backward without knowing all inputs
Avalanche Effect Demonstration:

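Input 1: "Send 100 XRP"
Hash (illustrative placeholder, not a real digest): 8a7b1c3d9e4f2a5b6c7d8e9f0a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b

Input 2: "Send 100 XRQ" (last letter changed: P = 0x50, Q = 0x51, a one-bit difference)
Hash (illustrative placeholder, not a real digest): 3f2e1d0c9b8a7f6e5d4c3b2a1f0e9d8c7b6a5f4e3d2c1b0a9f8e7d6c5b4a3f2e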

- 1 bit changed in input
- ~128 bits changed in output (50%)
- Completely unpredictable which bits flip
- No correlation between input change and output change

- Can't make "small" changes to get desired hash
- Any modification is detectable
- No way to predict output from input structure
- Underpins all integrity guarantees
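
The avalanche effect can be measured rather than taken on faith. This short sketch hashes the two messages from the demonstration with real SHA-256 and counts how many bits differ. The helper name `bit_difference` is illustrative.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


m1, m2 = b"Send 100 XRP", b"Send 100 XRQ"
input_bits = bit_difference(m1, m2)
output_bits = bit_difference(
    hashlib.sha256(m1).digest(), hashlib.sha256(m2).digest()
)
print(f"input bits changed:  {input_bits}")
print(f"output bits changed: {output_bits} of 256")
```

Repeating the experiment with other message pairs gives counts scattered around 128, with no visible relationship to which input bit was flipped.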
SHA-256 Security Status:

Best Known Attacks:

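  1. Brute Force Pre-Image: ~2^256 operations, infeasible

  2. Birthday Attack (Collision): ~2^128 operations, infeasible

  3. Length Extension Attack: affects naive H(secret || message) constructions, not the core security properties

  4. Reduced-Round Attacks: published results cover only versions with fewer than the full 64 rounds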

  • SHA-256 published: 2001

  • 20+ years of cryptanalysis

  • No practical attack discovered

  • Trillions of dollars protected

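  • Grover's algorithm: Quadratic speedup for search

  • Pre-image: 2^256 → 2^128 (still infeasible)

  • Collision: 2^128 → roughly 2^85 in theory (Brassard-Høyer-Tapp algorithm, which needs very large quantum memory); Grover alone does not reduce it to 2^64

  • A move to a longer digest (such as 384 bits) may eventually be prudent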


Merkle trees combine hash functions with tree structures to enable efficient verification of large data sets—critical for blockchain scalability.

Merkle Tree Basics:

Purpose: Commit to large data set with single hash
Property: Can prove any element belongs with O(log n) data

1. Hash each data element (leaves)
2. Pair hashes and hash pairs (parents)
3. Continue until single root hash
4. Root commits to entire data set

Example (4 transactions):

                 ┌────────────┐
                 │    Root    │
                 │    Hash    │
                 └─────┬──────┘
                       │
         ┌─────────────┴─────────────┐
         │                           │
    ┌────▼────┐                 ┌────▼────┐
    │ Hash AB │                 │ Hash CD │
    └────┬────┘                 └────┬────┘
         │                           │
   ┌─────┴─────┐               ┌─────┴─────┐
   │           │               │           │
┌──▼──┐     ┌──▼──┐         ┌──▼──┐     ┌──▼──┐
│Tx A │     │Tx B │         │Tx C │     │Tx D │
│Hash │     │Hash │         │Hash │     │Hash │
└─────┘     └─────┘         └─────┘     └─────┘

Root = H(Hash_AB || Hash_CD)
Hash_AB = H(Hash_A || Hash_B)
Hash_CD = H(Hash_C || Hash_D)
Hash_A = H(Transaction_A)
... etc
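
The construction rules above translate almost line for line into code. This is a minimal sketch of a binary Merkle root using SHA-256. Duplicating the last hash on odd-sized levels is one common convention, adopted here as an assumption, and the function names are illustrative.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest used for both leaves and internal nodes."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves) -> bytes:
    """Hash leaves, then repeatedly hash adjacent pairs up to one root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the odd one out
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


txs = [b"Transaction_A", b"Transaction_B", b"Transaction_C", b"Transaction_D"]
print(merkle_root(txs).hex())
```

Changing a single byte in any transaction changes its leaf hash, every hash on the path above it, and therefore the root.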

Proving Transaction Inclusion:

Scenario: Prove Tx B is in the tree without revealing all transactions

- Tx B data (to hash)
- Hash of Tx A (sibling)
- Hash CD (uncle)

1. Hash Tx B → Hash_B
2. Combine with provided Hash_A → Hash_AB = H(Hash_A || Hash_B)
3. Combine with provided Hash_CD → Root = H(Hash_AB || Hash_CD)
4. Compare computed root with known root
5. Match → Tx B was included

- For n leaves: log₂(n) hashes needed
- 1 million transactions: only 20 hashes (~640 bytes)
- 1 billion transactions: only 30 hashes (~960 bytes)

- Light clients verify inclusion
- No need to download entire ledger
- Trustless verification with minimal data
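
The verification steps above can be sketched as a proof generator and checker. This example assumes a binary tree with a power-of-two number of leaves and 32-byte SHA-256 hashes. The names `build_levels`, `make_proof`, `verify_proof` and `proof_size_bytes` are hypothetical.

```python
import hashlib
import math


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """All tree levels, leaf hashes first, root last (power-of-two sizes)."""
    n = len(leaves)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch needs a power-of-two leaf count")
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def make_proof(levels, index):
    """Sibling hashes from leaf to root, each tagged 'sibling is on the left'."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the root from one leaf plus its sibling path."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


def proof_size_bytes(n_leaves: int, hash_size: int = 32) -> int:
    """Proof size for n leaves: ceil(log2 n) hashes."""
    return math.ceil(math.log2(n_leaves)) * hash_size


txs = [b"Tx A", b"Tx B", b"Tx C", b"Tx D"]
levels = build_levels(txs)
root = levels[-1][0]
proof_b = make_proof(levels, 1)  # Hash_A (sibling), then Hash_CD (uncle)
print(verify_proof(b"Tx B", proof_b, root))
print(proof_size_bytes(1_000_000), proof_size_bytes(1_000_000_000))
```

The verifier never sees Tx A, Tx C or Tx D, only two hashes, yet any change to them would alter the root it compares against.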
XRPL Account State Tree:

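- Ledger objects (accounts, trust lines, offers) are leaves
- Internal nodes are hashes of their children
- XRPL's tree (the SHAMap) branches 16 ways per node rather than 2, but the proof idea is the same
- Root hash in ledger header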

State Proof:
"Prove account rXYZ has balance of 1000 XRP"

- Account object (balance, sequence, etc.)
- Merkle path to root
- Root hash (in ledger header)

1. Hash account object
2. Follow Merkle path to root
3. Compare with ledger root hash
4. Match → Account state verified

- Don't need full ledger history
- Can verify any account state
- Proof size: O(log n) regardless of ledger size
- Enables SPV (Simplified Payment Verification)
Merkle Tree Security:

- Changing any leaf changes root
- Path through tree must be recalculated
- Tampering is detectable

- Root hash commits to exact set of leaves
- Cannot add/remove leaves without changing root
- Total binding: any modification visible

- Cannot create valid proof for non-existent leaf
- Would require a collision or second pre-image in the hash function
- With a 256-bit SHA-2 hash: computationally infeasible

Attack Scenarios:

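  1. Modify Transaction: the leaf hash changes, so the recomputed root no longer matches

  2. Add Fake Transaction: the set of leaves changes, so the root changes

  3. Create Fake Proof: requires a hash collision or second pre-image, which is computationally infeasible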


Hash functions can be misused. Understanding length extension attacks reveals why construction matters as much as algorithm choice.

Length Extension Attack:

Vulnerable Pattern:
MAC = H(secret || message)

The Problem:
SHA-256 (like MD5, SHA-1, and SHA-512) is vulnerable to length extension.

Attacker knows:
- H(secret || message)
- Length of secret
- (NOT the secret itself!)

Attacker can compute:
- H(secret || message || padding || extension)
- Without knowing the secret!

Why it works:
- SHA-256 output IS the internal state
- Attacker continues hashing from that state
- Can append arbitrary data
- Valid hash for extended message

Example Attack:
Original: MAC = SHA-256("secret" || "amount=100")
Attacker: Computes SHA-256("secret" || "amount=100" || padding || "&amount=1000000")
Result: Valid MAC for malicious message
Impact: Could authorize fraudulent transactions if protocol is naive
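
To make the attack concrete, the sketch below resumes SHA-256 from a published digest. Because `hashlib` does not expose the internal state, it includes a compact pure-Python compression function, with the round constants derived from prime roots as the SHA-256 specification defines them. Function names are illustrative, and the code is for study, not production use.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF


def _primes(n):
    found, candidate = [], 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


def _icbrt(x):
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << 40
    while lo < hi:
        mid = (lo + hi + 1) // 2
        lo, hi = (mid, hi) if mid ** 3 <= x else (lo, mid - 1)
    return lo


# Fractional bits of cube roots (round constants) and square roots (IV).
K = [_icbrt(p << 96) & MASK for p in _primes(64)]
IV = [math.isqrt(p << 64) & MASK for p in _primes(8)]


def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def compress(state, block):
    """One application of the SHA-256 compression function."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, hh = state
    for i in range(64):
        t1 = (hh + (_rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25))
              + ((e & f) ^ (~e & g)) + K[i] + w[i]) & MASK
        t2 = ((_rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22))
              + ((a & b) ^ (a & c) ^ (b & c))) & MASK
        a, b, c, d, e, f, g, hh = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, hh))]


def padding(msg_len):
    """SHA-256 padding for a message of msg_len bytes."""
    zeros = b"\x00" * ((55 - msg_len) % 64)
    return b"\x80" + zeros + struct.pack(">Q", msg_len * 8)


def extend(digest, known_len, extension):
    """Forge H(secret || message || glue || extension) from the digest alone.

    known_len is len(secret) + len(message); the secret itself is not needed.
    Returns the suffix to append to the message and the forged digest.
    """
    glue = padding(known_len)
    state = list(struct.unpack(">8I", digest))  # digest == internal state
    tail = extension + padding(known_len + len(glue) + len(extension))
    for i in range(0, len(tail), 64):
        state = compress(state, tail[i:i + 64])
    return glue + extension, struct.pack(">8I", *state)


secret, message = b"secret", b"amount=100"
mac = hashlib.sha256(secret + message).digest()  # what the attacker observes
suffix, forged = extend(mac, len(secret) + len(message), b"&amount=1000000")
print(hashlib.sha256(secret + message + suffix).digest() == forged)
```

The final line plays the role of the server: it recomputes the naive MAC over the extended message using the real secret and finds that it matches the value the attacker produced without that secret.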
Why This Matters for Blockchains:

Scenario 1: Naive API Authentication
Bad design: signature = SHA-256(api_key || request)
Attack: Extend request with malicious parameters
Result: Valid signature for attacker's modified request

Scenario 2: Transaction Malleability
If transaction hash computed naively
Attacker might extend transaction data
Could create transaction with same authorization but different effect

- Transactions have fixed structure
- Signature covers specific fields
- No arbitrary extension possible
- Not vulnerable to length extension in practice
HMAC: Hash-based Message Authentication Code

Construction:
HMAC(K, M) = H((K ⊕ opad) || H((K ⊕ ipad) || M))

Where:
K = Secret key
M = Message
opad = Outer padding (0x5c repeated)
ipad = Inner padding (0x36 repeated)

Why HMAC resists length extension:
  • Double hashing breaks length extension
  • Inner hash produces intermediate result
  • Outer hash consumes intermediate result
  • Cannot extend without knowing key

Where HMAC is used:
  • Used in key derivation
  • Secure message authentication
  • Standard cryptographic construction

SHA-3 alternative:
  • SHA-3 (Keccak) uses different construction
  • Sponge function, not Merkle-Damgård
  • Inherently resistant to length extension
  • XRPL uses SHA-2 family (established before SHA-3)
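
The HMAC formula can be checked against Python's standard `hmac` module. This sketch builds HMAC-SHA-256 by hand from the definition, with the key zero-padded to the 64-byte block size (or hashed first if longer), and adds a small verification helper. In real code, prefer the library call. The helper names are illustrative.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 block size in bytes


def hmac_sha256_manual(key: bytes, message: bytes) -> bytes:
    """HMAC(K, M) = H((K ^ opad) || H((K ^ ipad) || M)), built by hand."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()  # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")    # then zero-padded to a block
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag using the library HMAC and a constant-time comparison."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


key, msg = b"api-key", b"amount=100"
tag = hmac_sha256_manual(key, msg)
print(verify_tag(key, msg, tag))
```

`hmac.compare_digest` avoids leaking, through timing, how many leading bytes of a guessed tag were correct.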
Choosing Hash Functions:

For Transaction/Data IDs:
✓ SHA-256 (standard, well-analyzed)
✓ Direct application safe
✓ No secret key involved

For MACs/Authentication:
✓ HMAC-SHA-256 (or HMAC-SHA-512)
✗ NOT plain SHA-256(key || message)
✗ Vulnerable to length extension

For Password Hashing:
✗ NOT plain SHA-256 (too fast)
✓ bcrypt, scrypt, Argon2
✓ Designed to be slow

For Key Derivation:
✓ HKDF (HMAC-based Key Derivation Function)
✓ PBKDF2 (with high iteration count)
✗ NOT plain hash of seed

- SHA-512Half for IDs and commitments; SHA-256 in address derivation
- HMAC in key derivation
- Proper constructions throughout
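
As a hedged illustration of the "not a plain hash" guidance, the sketch below derives a key from a passphrase with PBKDF2-HMAC-SHA-256 from the standard library. The iteration count and salt length are assumptions chosen for illustration, and the function names are hypothetical. For password storage specifically, a memory-hard function such as scrypt or Argon2 is generally preferable.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumption: tune to your hardware and threat model
SALT_BYTES = 16


def derive_key(passphrase: str, salt: bytes = None, iterations: int = ITERATIONS):
    """Return (salt, 32-byte key) derived with PBKDF2-HMAC-SHA-256."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)  # fresh random salt per passphrase
    key = hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, iterations)
    return salt, key


def check_passphrase(passphrase: str, salt: bytes, expected_key: bytes,
                     iterations: int = ITERATIONS) -> bool:
    """Re-derive with the stored salt and compare in constant time."""
    _, key = derive_key(passphrase, salt, iterations)
    return hmac.compare_digest(key, expected_key)


# A low iteration count keeps this demo quick; use the default in practice.
salt, key = derive_key("correct horse battery staple", iterations=1_000)
print(check_passphrase("correct horse battery staple", salt, key, iterations=1_000))
```

A single SHA-256 call costs an attacker almost nothing per guess. The iteration count multiplies that cost, and the salt prevents one precomputed table from serving many targets.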

---

Understanding historical attacks on hash functions contextualizes SHA-256's security.

MD5: Completely Broken
  • 1992: Published
  • 2004: First collision found
  • 2008: Rogue CA certificate attack
  • Today: Collision in seconds on laptop

Status: NEVER use for security
Still seen: File checksums (integrity, not security)

Attack Reality:
Public tools such as HashClash can generate MD5 collisions in seconds on ordinary hardware.

SHA-1: Practically Broken

  • 1995: Published
  • 2005: Theoretical attack published
  • 2017: SHAttered - practical collision
  • Cost: ~$110,000 in GPU compute

Status: Deprecated, actively being phased out
Chrome, Firefox: Reject SHA-1 certificates

  • Two different PDFs with same SHA-1 hash
  • Demonstrated real-world exploitability
  • Took 6,500 years of CPU time + 110 years of GPU time
  • First practical SHA-1 collision
SHA-256 vs Broken Predecessors:

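Property          | MD5         | SHA-1       | SHA-256
------------------|-------------|-------------|----------------
Output bits       | 128         | 160         | 256
Collision bits    | 64 (theory) | 80 (theory) | 128 (theory)
Best attack       | <64         | <80         | N/A (no attack)
Status            | BROKEN      | BROKEN      | SECURE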

Why SHA-256 Remains Secure:

  1. Larger State:

  2. More Rounds:

  3. Improved Operations:

  4. 20+ Years of Analysis:

Hypothetical SHA-256 Break:

If collision resistance fails:
- Could create two transactions with same hash
- Merkle tree proofs become ambiguous
- Impact: Significant but limited

If pre-image resistance fails:
- Could find transaction matching target hash
- Could potentially forge transaction IDs
- Impact: Severe, would require protocol changes

If second pre-image resistance fails:
- Could replace any transaction with malicious one
- Would break all integrity guarantees
- Impact: Catastrophic

Response options:
1. Amendment process to upgrade hash function
2. SHA-3 or SHA-384 as replacement
3. Coordinated protocol migration, planned well ahead as Y2K remediation was
4. Cryptographic agility enables response

Risk assessment:
- Probability of SHA-256 break: Very low
- Consequence if broken: High
- Mitigation available: Yes (protocol upgrades)
- Action needed now: Monitor research, prepare contingency

---

SHA-256 has withstood 20+ years of intensive cryptanalysis. Unlike MD5 and SHA-1 which were broken, SHA-256 shows no signs of weakness. The absence of any attack significantly better than brute force after two decades of well-funded research suggests genuine security.

128-bit collision resistance is adequate for all foreseeable classical computing scenarios. The computational resources required to find a SHA-256 collision exceed what's practically achievable. Even with optimistic projections of computing power growth, this margin remains comfortable for decades.

Merkle trees provide mathematically sound proofs of inclusion and state. The security reduction is clear: if the hash function is collision-resistant, Merkle proofs are unforgeable. This isn't a heuristic—it's a provable security property.

⚠️ Quantum computers weaken hash security margins. Grover's algorithm halves pre-image security (256 to 128 bits), and the best known quantum collision search (Brassard-Høyer-Tapp) lowers the collision cost from about 2^128 to about 2^85 operations in theory, while demanding very large quantum memory. This is a known theoretical concern. The timeline for quantum computers capable of this is uncertain but likely 15+ years.

⚠️ New mathematical breakthroughs could change the landscape. Cryptographic history includes surprises. While SHA-256 appears solid, the impossibility of proving security means unknown attacks could exist.

⚠️ Implementation quality varies. The hash function itself is secure, but implementations can introduce timing side channels or other vulnerabilities. Each implementation should be independently verified.

🔴 Using broken hash functions for security is common despite warnings. MD5 and SHA-1 are still found in production systems. XRPL itself relies on the SHA-2 family, but ecosystem tools and integrations may not be as careful.

🔴 Length extension attacks threaten naive constructions. Anyone building authentication or commitment schemes using SHA-256 must understand this vulnerability. Using HMAC instead of plain hashing prevents the attack.

🔴 Hash function output ≠ random. While hash outputs appear random, they're deterministic. Using hash outputs where true randomness is required (key generation) introduces predictability if inputs are predictable.

The SHA-2 family (SHA-256, and SHA-512 truncated to 256 bits) is a sound choice for XRPL's integrity needs. After two decades of cryptanalysis by motivated attackers (breaking it would bring fame and fortune), no practical attack exists. The 128-bit collision resistance provides margins far beyond any conceivable classical computing attack.

However, "secure hash function" doesn't mean "use however you want." Construction matters: HMAC for authentication, proper key derivation functions for keys, and appropriate algorithms for each use case. XRPL's design reflects this understanding, using hash functions correctly throughout its architecture.


Assignment: Create a comprehensive diagram and documentation mapping every use of hash functions in XRPL's architecture, explaining why each specific function was chosen for each purpose.

Requirements:

Part 1: Visual Map

  • Every location where hash functions are used in XRPL

  • Which specific hash function is used at each point

  • Data flow through hashing operations

  • Relationships between hashed values

  • Transaction hashing → Transaction ID

  • Address derivation (full path from private key)

  • Ledger chain hashing

  • Merkle tree construction for transactions

  • State tree for account data

  • Signature scheme internals (where applicable)

Part 2: Function Justification Table

  • Location in architecture
  • Hash function used
  • Input format
  • Output format/size
  • Security property relied upon
  • Why this specific function was chosen
  • Alternatives considered (if known)

Part 3: Security Analysis

  • What attack would be enabled if this hash were broken?
  • What's the current security margin?
  • What would the upgrade path be?

Part 4: Implementation Notes

  • Code libraries that implement these hashes in XRPL ecosystem

  • Known implementation considerations

  • Testing/verification approaches

  • Completeness of map (30%)

  • Technical accuracy (30%)

  • Justification quality (20%)

  • Presentation clarity (20%)

Time Investment: 4-5 hours

Value: This map serves as a reference for understanding XRPL's integrity architecture and for evaluating the security implications of any proposed changes to hash function usage.



  • Rogaway & Shrimpton: "Cryptographic Hash-Function Basics" (foundational paper)
  • NIST FIPS 180-4: Secure Hash Standard (SHA-256 specification)
  • Merkle: "A Digital Signature Based on a Conventional Encryption Function"
  • Wang et al.: "Finding Collisions in the Full SHA-1" (breakthrough attack)
  • Stevens et al.: "The First Collision for Full SHA-1" (SHAttered attack)
  • Kelsey & Schneier: "Second Preimages on n-bit Hash Functions"
  • Bertoni et al.: "The Keccak Reference" (SHA-3 specification)
  • NIST SP 800-185: SHA-3 Derived Functions

For Next Lesson:
We'll examine digital signatures—how ECDSA and EdDSA transform hash functions and elliptic curves into unforgeable proofs of authorization. Understanding signature generation and verification reveals how your private key authorizes transactions without ever being revealed.


End of Lesson 3

Total words: ~6,100
Estimated completion time: 55 minutes reading + 4-5 hours for deliverable

Key Takeaways

1. Cryptographic hash functions provide three distinct security properties.

Pre-image resistance prevents finding inputs for given outputs. Second pre-image resistance prevents finding alternative inputs with matching hashes. Collision resistance prevents finding any two inputs that hash identically. SHA-256 provides 256-bit, 256-bit, and 128-bit security respectively for these properties.

2. XRPL uses multiple hash functions for specific purposes.

SHA-512Half (SHA-512 truncated to 256 bits) handles transaction identification, ledger chaining, and tree node hashing. SHA-512 also supports Ed25519 signatures. SHA-256 followed by RIPEMD-160 compresses public keys into shorter account IDs. Each choice reflects specific requirements.

3. Merkle trees transform hash functions into efficient verification structures.

With a single root hash, you can prove inclusion of any element using only O(log n) hashes. This enables light clients, state proofs, and scalable verification—critical for practical blockchain operation.

4. The avalanche effect makes hash outputs completely unpredictable.

Changing one bit of input changes approximately half the output bits in an unpredictable pattern. This property underpins all integrity guarantees: any modification is detectable because it produces a completely different hash.

5. Proper construction prevents hash function misuse.

Length extension attacks break naive MAC constructions like H(secret || message). Using HMAC or SHA-3 prevents this vulnerability. XRPL uses appropriate constructions throughout, but developers building on XRPL must understand these requirements for their own code.