advanced•55 min

Hash Functions in XRPL

Name: XRPL Security & Cryptography

Learning Objectives

Explain the three security properties of cryptographic hash functions—pre-image resistance, second pre-image resistance, and collision resistance—with practical examples of what each prevents

Map specific hash function applications throughout XRPL's architecture, understanding why SHA-256, SHA-512, and RIPEMD-160 are each used where they are

Evaluate SHA-256's security margins against current and foreseeable attacks, distinguishing between theoretical attacks and practical threats

Analyze how Merkle trees use hash functions to enable efficient verification of account states without requiring complete ledger history

Assess length extension attacks and proper hash construction, understanding why naive hashing can create vulnerabilities

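Introduction: The Digital Fingerprint Machine

Imagine a machine that turns any input into a short, fixed-size fingerprint, a fingerprint that:

- Is completely different if you change even one bit of input
- Cannot be reversed to recover the original input
- Makes it practically impossible to find two different inputs producing the same output

This machine is a cryptographic hash function, and XRPL relies on it at nearly every layer.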

Every transaction has a unique hash that identifies it. Every ledger version contains a hash linking to the previous version. Your XRPL address is derived from hashes of your public key. The Merkle trees that prove account balances are built from hashes.

Break the hash function, and you break the blockchain. That's why understanding hash security isn't academic—it's the foundation of ledger integrity.

Section 1: Cryptographic Hash Function Properties

A cryptographic hash function is a mathematical function with specific security properties that make it useful for security applications. Understanding these properties reveals what hash functions can and cannot guarantee.

Hash Function Basics:

Input: Any data of any length
Output: Fixed-size digest (256 bits for SHA-256)

- Same input always produces same output
- No randomness involved
- Anyone can verify: H(x) → y

- Computing hash takes microseconds
- Linear in input size (longer input = proportionally longer time)
- No special hardware required

- Easy: x → H(x)
- Hard: H(x) → x (finding x given output)
- This asymmetry is the foundation of security
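The basic properties above can be checked directly with Python's standard `hashlib` module. This is a minimal sketch; the helper name `sha256_hex` is an assumption made for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 lowercase hex characters."""
    return hashlib.sha256(data).hexdigest()

short_digest = sha256_hex(b"abc")
long_digest = sha256_hex(b"a" * 1_000_000)

# Deterministic: hashing the same bytes again gives the same digest.
assert sha256_hex(b"abc") == short_digest
# Fixed size: 256 bits = 64 hex characters, whatever the input length.
assert len(short_digest) == len(long_digest) == 64

print(short_digest)
print(long_digest)
```

Going the other way, from a printed digest back to the input, has no corresponding library call; that asymmetry is the point.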

Definition: Given a hash output h, it should be computationally infeasible to find any input x such that H(x) = h.

Pre-Image Resistance:

- Attacker has: hash value h
- Attacker wants: any x where H(x) = h
- Difficulty: ~2^n for n-bit hash

Why It Matters:

You publish: Hash of a transaction (a commitment)
Attacker sees: Hash
Attacker cannot: Find any transaction that produces that hash

Database stores: H(password)
Attacker steals: H(password)
Attacker cannot: Compute password from hash

SHA-256 pre-image resistance: 256 bits
Requires: ~2^256 operations to find pre-image
Status: No attack significantly better than brute force

Definition: Given an input x₁, it should be computationally infeasible to find a different input x₂ such that H(x₁) = H(x₂).

Second Pre-Image Resistance:

- Attacker has: Original document x₁ and its hash H(x₁)
- Attacker wants: Different document x₂ with H(x₂) = H(x₁)
- Difficulty: ~2^n for n-bit hash

Why It Matters:

Original: "Send 100 XRP to Alice" → hash h
Attacker wants: "Send 100,000 XRP to Attacker" → same hash h
If possible: Could substitute malicious transaction
Reality: Computationally infeasible for good hash functions

Contract signed with hash commitment
Attacker cannot create different contract with same hash
Digital signatures remain valid only for original document

SHA-256 second pre-image resistance: ~256 bits
No known attack better than generic 2^256 search

Definition: It should be computationally infeasible to find any two different inputs x₁ and x₂ such that H(x₁) = H(x₂).

Collision Resistance:

- Attacker wants: ANY two different inputs with same hash
- No constraint: Attacker chooses both inputs
- Difficulty: ~2^(n/2) due to birthday paradox

- 23 people → 50% chance two share birthday
- Not comparing to specific date—any match counts
- For 365 possibilities: about 1.18 × √365 ≈ 23 people are needed for 50%

- 256-bit hash → 2^256 possible outputs
- Birthday attack → ~2^128 hashes to find collision
- Collision resistance = n/2 bits for n-bit hash
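The birthday bound can be seen at small scale. The sketch below computes the exact birthday probability and then searches for a collision in a deliberately weakened hash: SHA-256 truncated to 24 bits. The truncation and the helper names are assumptions for illustration only; nothing here threatens full SHA-256.

```python
import hashlib
import math

def birthday_collision_probability(people: int, days: int = 365) -> float:
    """Exact probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

def truncated_hash(data: bytes, bits: int) -> int:
    """Toy hash: keep only the top `bits` bits of SHA-256 (demo only, not secure)."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - bits)

def find_collision(bits: int):
    """Hash the counters 0, 1, 2, ... until two share a truncated digest."""
    seen = {}
    counter = 0
    while True:
        msg = str(counter).encode()
        value = truncated_hash(msg, bits)
        if value in seen:
            return seen[value], msg, counter + 1
        seen[value] = msg
        counter += 1

print(round(birthday_collision_probability(23), 3))  # 0.507
first, second, attempts = find_collision(bits=24)
print(first, second, attempts, "hashes tried; sqrt(2^24) =", math.isqrt(2**24))
```

The number of hashes tried should land in the low thousands, close to √(2^24) = 4096, rather than near the 2^24 ≈ 16.7 million a targeted search would need.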

Why It Matters:

Collisions could create ambiguous proofs
Two different states appearing identical
Would break efficient verification

"I'll reveal my bid later, here's the hash"
If collision exists: could reveal different bid
Breaks fairness of auctions, voting, etc.

SHA-256 collision resistance: 128 bits
Requires: ~2^128 operations to find collision
Status: No collision ever found

SHA-256 Security Properties:

Property              | Security Level | Attack Complexity | Status
----------------------|----------------|-------------------|--------
Pre-image             | 256 bits       | 2^256 operations  | Unbroken
Second pre-image      | 256 bits       | 2^256 operations  | Unbroken
Collision             | 128 bits       | 2^128 operations  | Unbroken

- Pre-image: Find ANY x for given H(x) = h
- Second pre-image: Find x₂ ≠ x₁ for given x₁ where H(x₂) = H(x₁)
- Collision: Find ANY x₁, x₂ where H(x₁) = H(x₂)

Collision is "easiest" to find (only 128-bit security)
but 128-bit is still astronomically difficult.

Section 2: Hash Functions in XRPL Architecture

XRPL uses multiple hash functions for different purposes. Understanding why reveals careful engineering decisions.

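SHA-256 in XRPL:

Note: rippled computes its 256-bit object identifiers (transaction IDs, ledger hashes, state-tree node hashes) with SHA-512Half, the first 256 bits of a SHA-512 digest, rather than with SHA-256 itself. The 256-bit security levels discussed in this lesson apply to those digests unchanged. Plain SHA-256 appears in address derivation and in Base58Check checksums.

- Every transaction is hashed to a 256-bit digest
- Hash becomes transaction ID (64 hex characters)
- Used for lookups, references, deduplication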

- Each ledger contains hash of previous ledger
- Creates tamper-evident chain
- Modifying history changes all subsequent hashes

- Proposals identified by hash
- Validators agree on ledger hash
- Enables efficient comparison without full data

- Account states hashed in Merkle tree
- Root hash in ledger header
- Enables state verification proofs

Transaction Details:
Input: Raw transaction bytes
Output: 256-bit (32-byte) hash
Encoding: Usually hex (64 characters)
Example: 7F83B1657FF1FC53B92DC18148A1D65DFC2D4B1FA3D677284ADDD200126D9069
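As a hedged sketch of how a transaction ID is formed: the XRP Ledger documentation describes the ID as SHA-512Half over a four-byte hash prefix (`TXN` plus a zero byte) followed by the signed, serialized transaction. The function names below are assumptions, and the input is placeholder bytes; producing a real serialized transaction is out of scope here.

```python
import hashlib

# Hash prefix placed in front of a signed transaction before hashing:
# the ASCII bytes "TXN" followed by one zero byte.
TRANSACTION_ID_PREFIX = bytes.fromhex("54584E00")

def sha512_half(data: bytes) -> bytes:
    """First 32 bytes (256 bits) of the SHA-512 digest."""
    return hashlib.sha512(data).digest()[:32]

def transaction_id(signed_tx_blob: bytes) -> str:
    """Uppercase hex ID for an already-serialized, signed transaction blob."""
    return sha512_half(TRANSACTION_ID_PREFIX + signed_tx_blob).hex().upper()

# Placeholder bytes, NOT a real serialized transaction.
example_blob = bytes.fromhex("DEADBEEF")
tx_id = transaction_id(example_blob)
print(tx_id, len(tx_id))
```

The prefix gives domain separation: the same bytes hashed as a transaction and as some other ledger object produce different identifiers.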

SHA-512 in XRPL:

- Ed25519 signature scheme internally
- Key derivation processes
- Where extended output needed

- 512-bit output (64 bytes)
- Security: 512-bit pre-image, 256-bit collision
- Often faster than SHA-256 in software on 64-bit processors
- Same design as SHA-256, with 64-bit words and a larger state

- SHA-256: Standard for commitments, IDs
- SHA-512: Required by EdDSA specification
- Both from SHA-2 family, well-analyzed

RIPEMD-160 in XRPL:

Purpose: Compress public key hashes for addresses

- SHA-256 output: 32 bytes
- RIPEMD-160 output: 20 bytes
- Shorter addresses = better UX
- Combined: SHA-256(pubkey) → RIPEMD-160(result)

- RIPEMD-160 alone: 80-bit collision resistance
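- Used after SHA-256: the 160-bit output caps the combination's collision resistance at about 80 bits
- Forging an address requires a pre-image (~160 bits of work), not a collision

- Bitcoin chose the SHA-256 then RIPEMD-160 construction in 2009
- XRPL adopted the same construction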
- Legacy decision, still secure for purpose

XRPL Hash Function Usage:

                        ┌────────────────┐
                        │   Public Key   │
                        │   (33 bytes)   │
                        └───────┬────────┘
                                │
                           SHA-256
                                │
                        ┌───────▼────────┐
                        │  Hash of Key   │
                        │   (32 bytes)   │
                        └───────┬────────┘
                                │
                          RIPEMD-160
                                │
                        ┌───────▼────────┐
                        │   Account ID   │
                        │   (20 bytes)   │
                        └───────┬────────┘
                                │
                          Base58Check
                                │
                        ┌───────▼────────┐
                        │    Address     │
                        │  (25-35 chars) │
                        └────────────────┘
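The derivation path in the diagram can be sketched as follows. This is a minimal illustration, not rippled's code: the function names are assumptions, the alphabet is the XRP Ledger's Base58 dictionary, and the type prefix `0x00` is the one used for account addresses. Note that some OpenSSL builds disable RIPEMD-160, in which case `hashlib.new("ripemd160")` raises `ValueError`.

```python
import hashlib

XRPL_ALPHABET = "rpshnaf39wBUDNEGHJKLM4PQRST7VWXYZ2bcdeCg65jkm8oFqi1tuvAxyz"
ACCOUNT_ID_TYPE_PREFIX = b"\x00"

def account_id_from_public_key(public_key: bytes) -> bytes:
    """SHA-256, then RIPEMD-160, giving the 20-byte account ID.
    May raise ValueError where the local OpenSSL build lacks RIPEMD-160."""
    sha = hashlib.sha256(public_key).digest()
    return hashlib.new("ripemd160", sha).digest()

def base58_encode(data: bytes, alphabet: str = XRPL_ALPHABET) -> str:
    """Encode bytes as a base-58 string using the given alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = alphabet[remainder] + encoded
    # Each leading zero byte becomes one copy of the alphabet's first character.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return alphabet[0] * leading_zeros + encoded

def encode_address(account_id: bytes) -> str:
    """Base58Check: type prefix + account ID + 4-byte double-SHA-256 checksum."""
    if len(account_id) != 20:
        raise ValueError("account ID must be 20 bytes")
    payload = ACCOUNT_ID_TYPE_PREFIX + account_id
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return base58_encode(payload + checksum)

# The all-zero account ID is a convenient check value.
print(encode_address(bytes(20)))
```

With an all-zero account ID the result should match the documented special address ACCOUNT_ZERO, `rrrrrrrrrrrrrrrrrrrrrhoLvTp`. The checksum catches typos; it adds no security against a deliberate attacker.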

Transaction Processing:

┌─────────────────┐   SHA-512Half    ┌─────────────────┐
│   Transaction   │ ───────────────► │ Transaction ID  │
│     (bytes)     │                  │   (32 bytes)    │
└─────────────────┘                  └─────────────────┘

Ledger Chain:

┌─────────┐    ┌─────────┐    ┌─────────┐
│Ledger N │    │Ledger   │    │Ledger   │
│         │◄───│ N+1     │◄───│ N+2     │
│Hash: H₁ │    │Hash: H₂ │    │Hash: H₃ │
│Prev: H₀ │    │Prev: H₁ │    │Prev: H₂ │
└─────────┘    └─────────┘    └─────────┘

---

Section 3: SHA-256 Deep Dive

Understanding SHA-256's construction reveals why it's trusted for security-critical applications.

SHA-256 Structure (Merkle-Damgård):

Input:   Message of any length
Output:  256-bit (32-byte) digest

1. Padding: Add bits to make length multiple of 512
2. Parsing: Divide into 512-bit blocks
3. Initialization: Set 8 initial hash values
4. Processing: Apply compression function to each block
5. Output: Final hash values concatenated

┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐
│ Block 1 │   │ Block 2 │   │ Block 3 │   │ Block n │
└────┬────┘   └────┬────┘   └────┬────┘   └────┬────┘
     │             │             │             │
     ▼             ▼             ▼             ▼
   ┌───┐         ┌───┐         ┌───┐         ┌───┐
IV─►│ f │────────►│ f │────────►│ f │───...──►│ f │──► Hash
   └───┘         └───┘         └───┘         └───┘

f = compression function
IV = initialization vector
Each application updates 256-bit state
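Steps 1 and 2 (padding and parsing) are simple enough to sketch directly. SHA-256 padding appends a single `0x80` byte, then zero bytes, then the message length in bits as a 64-bit big-endian integer, so that the total is a multiple of 64 bytes (512 bits). The function names are assumptions for illustration.

```python
def sha256_padding(message_length: int) -> bytes:
    """Padding SHA-256 appends to a message of `message_length` bytes."""
    # 1 marker byte + zero bytes + 8 length bytes must fill up to a 64-byte boundary.
    zero_bytes = (55 - message_length) % 64
    bit_length = (message_length * 8).to_bytes(8, "big")
    return b"\x80" + b"\x00" * zero_bytes + bit_length

def split_blocks(message: bytes):
    """Pad the message and cut it into the 512-bit blocks fed to the compression function."""
    padded = message + sha256_padding(len(message))
    return [padded[i:i + 64] for i in range(0, len(padded), 64)]

for size in (3, 55, 56, 64):
    blocks = split_blocks(b"a" * size)
    print(size, "bytes ->", len(blocks), "block(s)")
```

A 55-byte message is the largest that still fits in one block; at 56 bytes the length field no longer fits and a second block is needed.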

SHA-256 Compression (High-Level):

- 64 rounds per 512-bit block
- Mix message bits into state
- Apply non-linear operations
- Shift and rotate bits

- AND, OR, XOR (bit-wise logic)
- Addition modulo 2^32
- Right-rotation and right-shift

- Mix bits thoroughly (diffusion)
- Create non-linear relationships (confusion)
- Reversible individually, irreversible combined
- Efficient on standard CPUs

- Each bit of output depends on all input bits
- After all rounds: thoroughly scrambled
- Cannot work backward without knowing all inputs

Avalanche Effect Demonstration (the digests shown are illustrative placeholders, not real SHA-256 outputs):

Input 1: "Send 100 XRP"
Hash: 8a7b1c3d9e4f2a5b6c7d8e9f0a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b

Input 2: "Send 100 XRQ" (one letter changed)
Hash: 3f2e1d0c9b8a7f6e5d4c3b2a1f0e9d8c7b6a5f4e3d2c1b0a9f8e7d6c5b4a3f2e

- 1 bit changed in input
- ~128 bits changed in output (50%)
- Completely unpredictable which bits flip
- No correlation between input change and output change

- Can't make "small" changes to get desired hash
- Any modification is detectable
- No way to predict output from input structure
- Underpins all integrity guarantees
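The avalanche effect can be measured rather than asserted. The sketch below hashes the two messages from the demonstration and counts how many of the 256 output bits differ; the helper name is an assumption.

```python
import hashlib

def bits_differing(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")

msg1 = b"Send 100 XRP"
msg2 = b"Send 100 XRQ"  # 'P' (0x50) and 'Q' (0x51) differ in exactly one bit

input_bits_changed = bits_differing(msg1, msg2)
output_bits_changed = bits_differing(
    hashlib.sha256(msg1).digest(), hashlib.sha256(msg2).digest()
)

print("input bits changed: ", input_bits_changed)
print("output bits changed:", output_bits_changed, "of 256")
print(hashlib.sha256(msg1).hexdigest())
print(hashlib.sha256(msg2).hexdigest())
```

The output count should fall near 128, though the exact figure depends on the inputs and is not predictable without running the hash.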

SHA-256 Security Status:

Best Known Attacks:

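Brute Force Pre-Image: ~2^256 operations, infeasible
Birthday Attack (Collision): ~2^128 operations, infeasible
Length Extension Attack: affects naive H(secret || message) constructions, not the three core properties
Reduced-Round Attacks: reach only weakened variants with fewer than the full 64 rounds

SHA-256 published: 2001
20+ years of cryptanalysis
No practical attack discovered
Trillions of dollars protected

Grover's algorithm: Quadratic speedup for search
Pre-image: 2^256 → 2^128 (still infeasible)
Collision: 2^128 → roughly 2^85 in theory (Brassard-Høyer-Tapp algorithm), and only with very large quantum memory
A longer digest such as SHA-384 is the conservative long-term hedge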

Section 4: Merkle Trees and Efficient Verification

Merkle trees combine hash functions with tree structures to enable efficient verification of large data sets—critical for blockchain scalability.

Merkle Tree Basics:

Purpose: Commit to large data set with single hash
Property: Can prove any element belongs with O(log n) data

1. Hash each data element (leaves)
2. Pair hashes and hash pairs (parents)
3. Continue until single root hash
4. Root commits to entire data set

Example (4 transactions):

                 ┌────────────┐
                 │    Root    │
                 │    Hash    │
                 └─────┬──────┘
                       │
         ┌─────────────┴─────────────┐
         │                           │
    ┌────▼────┐                 ┌────▼────┐
    │ Hash AB │                 │ Hash CD │
    └────┬────┘                 └────┬────┘
         │                           │
   ┌─────┴─────┐               ┌─────┴─────┐
   │           │               │           │
┌──▼──┐     ┌──▼──┐         ┌──▼──┐     ┌──▼──┐
│Tx A │     │Tx B │         │Tx C │     │Tx D │
│Hash │     │Hash │         │Hash │     │Hash │
└─────┘     └─────┘         └─────┘     └─────┘

Root = H(Hash_AB || Hash_CD)
Hash_AB = H(Hash_A || Hash_B)
Hash_CD = H(Hash_C || Hash_D)
Hash_A = H(Transaction_A)
... etc

Proving Transaction Inclusion:

Scenario: Prove Tx B is in the tree without revealing all transactions

- Tx B data (to hash)
- Hash of Tx A (sibling)
- Hash CD (uncle)

1. Hash Tx B → Hash_B
2. Combine with provided Hash_A → Hash_AB = H(Hash_A || Hash_B)
3. Combine with provided Hash_CD → Root = H(Hash_AB || Hash_CD)
4. Compare computed root with known root
5. Match → Tx B was included

- For n leaves: log₂(n) hashes needed
- 1 million transactions: only 20 hashes (~640 bytes)
- 1 billion transactions: only 30 hashes (~960 bytes)

- Light clients verify inclusion
- No need to download entire ledger
- Trustless verification with minimal data
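The inclusion proof for Tx B can be sketched in a few functions. This follows the lesson's binary tree with plain SHA-256 and a power-of-two leaf count; those choices and all function names are assumptions for illustration, and rippled's actual tree layout differs.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return every tree level, from the leaf hashes up to the single root hash."""
    if len(leaves) == 0 or len(leaves) & (len(leaves) - 1):
        raise ValueError("this sketch needs a power-of-two number of leaves")
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def make_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs on the path from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1            # the other node in this pair
        proof.append((level[sibling], sibling < index))
        index //= 2                    # move up to the parent's position
    return proof

def verify_proof(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the root from one leaf plus its sibling hashes."""
    current = h(leaf)
    for sibling_hash, sibling_is_left in proof:
        pair = sibling_hash + current if sibling_is_left else current + sibling_hash
        current = h(pair)
    return current == root

transactions = [b"Tx A", b"Tx B", b"Tx C", b"Tx D"]
levels = build_levels(transactions)
root = levels[-1][0]
proof_b = make_proof(levels, 1)        # Hash_A, then Hash_CD

print(len(proof_b), verify_proof(b"Tx B", proof_b, root))   # 2 True
print(verify_proof(b"Tx B (forged)", proof_b, root))        # False
```

The verifier never sees Tx A, Tx C, or Tx D themselves, only two 32-byte hashes, which is why proof size grows with log₂(n) rather than n.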

XRPL Account State Tree:

- Accounts are leaves
- Internal nodes are hashes
- Root hash in ledger header

State Proof:
"Prove account rXYZ has balance of 1000 XRP"

- Account object (balance, sequence, etc.)
- Merkle path to root
- Root hash (in ledger header)

1. Hash account object
2. Follow Merkle path to root
3. Compare with ledger root hash
4. Match → Account state verified

- Don't need full ledger history
- Can verify any account state
- Proof size: O(log n) regardless of ledger size
- Enables SPV (Simplified Payment Verification)

Merkle Tree Security:

- Changing any leaf changes root
- Path through tree must be recalculated
- Tampering is detectable

- Root hash commits to exact set of leaves
- Cannot add/remove leaves without changing root
- Total binding: any modification visible

- Cannot create valid proof for non-existent leaf
- Would require collision in hash function
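- With SHA-256: computationally infeasible

Attack Scenarios:

Modify Transaction: the leaf hash changes, so the recomputed root no longer matches
Add Fake Transaction: the leaf set changes, so the root changes
Create Fake Proof: requires a hash collision or second pre-image, which is computationally infeasible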

Section 5: Length Extension and Proper Construction

Hash functions can be misused. Understanding length extension attacks reveals why construction matters as much as algorithm choice.

Length Extension Attack:

Vulnerable Pattern:
MAC = H(secret || message)

The Problem:
SHA-256 (and MD5, SHA-1, SHA-512) are vulnerable to length extension.

Attacker knows:
- H(secret || message)
- The message and the length of the secret
- (NOT the secret itself!)

Attacker can compute:
- H(secret || message || padding || extension)
- Without knowing the secret!

Why it works:
- SHA-256 output IS the internal state
- Attacker continues hashing from that state
- Can append arbitrary data
- Valid hash for extended message

Example Attack:
Original: MAC = SHA-256("secret" || "amount=100")
Attacker: Computes SHA-256("secret" || "amount=100" || padding || "&amount=1000000")
Result: Valid MAC for malicious message
Impact: Could authorize fraudulent transactions if protocol is naive
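The example attack can be reproduced end to end. Because `hashlib` does not expose SHA-256's internal state, the sketch below carries its own compression function and derives the round constants from their definition (fractional bits of the cube roots of the first 64 primes). All names are assumptions for illustration; the final assertion checks the forgery against `hashlib`.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def _first_primes(count):
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

def _icbrt(n):
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

# Round constants: first 32 fractional bits of the cube roots of the first 64 primes.
K = [_icbrt(p << 96) & MASK for p in _first_primes(64)]

def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def compress(state, block):
    """One application of the SHA-256 compression function to a 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        h, g, f, e, d, c, b, a = g, f, e, (d + t1) & MASK, c, b, a, (t1 + t2) & MASK
    return [(x + y) & MASK for x, y in zip(state, [a, b, c, d, e, f, g, h])]

def padding(length):
    """SHA-256 padding for a message of `length` bytes."""
    return b"\x80" + b"\x00" * ((55 - length) % 64) + (length * 8).to_bytes(8, "big")

def extend(known_digest, known_length, extension):
    """Forge H(secret || message || glue || extension) from H(secret || message).
    known_length = len(secret) + len(message); the secret itself is never used."""
    glue = padding(known_length)
    state = list(struct.unpack(">8I", known_digest))  # the digest IS the state
    tail = extension + padding(known_length + len(glue) + len(extension))
    for i in range(0, len(tail), 64):
        state = compress(state, tail[i:i + 64])
    return glue, struct.pack(">8I", *state)

secret, message, extension = b"secret", b"amount=100", b"&amount=1000000"
mac = hashlib.sha256(secret + message).digest()          # what the attacker observes
glue, forged_mac = extend(mac, len(secret) + len(message), extension)
forged_message = message + glue + extension

# The server, which does know the secret, would accept the forged pair.
assert forged_mac == hashlib.sha256(secret + forged_message).digest()
print("forged MAC accepted:", forged_mac.hex())
```

The attacker's only inputs are the observed MAC, the message, and the secret's length, which in practice can be guessed by trying a small range of values.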

Why This Matters for Blockchains:

Scenario 1: Naive API Authentication
Bad design: signature = SHA-256(api_key || request)
Attack: Extend request with malicious parameters
Result: Valid signature for attacker's modified request

Scenario 2: Transaction Malleability
If transaction hash computed naively
Attacker might extend transaction data
Could create transaction with same authorization but different effect

- Transactions have fixed structure
- Signature covers specific fields
- No arbitrary extension possible
- Not vulnerable to length extension in practice

HMAC: Hash-based Message Authentication Code

Construction:
HMAC(K, M) = H((K ⊕ opad) || H((K ⊕ ipad) || M))

Where:
K = Secret key
M = Message
opad = Outer padding (0x5c repeated)
ipad = Inner padding (0x36 repeated)

Double hashing breaks length extension
Inner hash produces intermediate result
Outer hash consumes intermediate result
Cannot extend without knowing key

Used in key derivation
Secure message authentication
Standard cryptographic construction
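The construction above maps directly onto code. The sketch below builds HMAC-SHA-256 by hand, including the detail that the key is first padded with zero bytes (or hashed, if longer than a block) to SHA-256's 64-byte block size, and checks the result against Python's standard `hmac` module. The key and message are placeholder values.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 block size in bytes

def hmac_sha256_manual(key: bytes, message: bytes) -> bytes:
    """HMAC(K, M) = H((K xor opad) || H((K xor ipad) || M)), written out by hand."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()   # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")     # then padded to the block size
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()

key = b"demo-api-key"          # placeholder secret
message = b"amount=100"

manual_tag = hmac_sha256_manual(key, message)
library_tag = hmac.new(key, message, hashlib.sha256).digest()

# compare_digest avoids leaking, through timing, how many bytes matched.
print(hmac.compare_digest(manual_tag, library_tag))   # True
```

In real code, prefer `hmac.new` over a hand-rolled version; the manual function is here only to show where the two hash calls sit.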

SHA-3 (Keccak) uses different construction
Sponge function, not Merkle-Damgård
Inherently resistant to length extension
XRPL uses SHA-2 family (established before SHA-3)

Choosing Hash Functions:

For Transaction/Data IDs:
✓ SHA-256 (standard, well-analyzed)
✓ Direct application safe
✓ No secret key involved

For MACs/Authentication:
✓ HMAC-SHA-256 (or HMAC-SHA-512)
✗ NOT plain SHA-256(key || message)
✗ Vulnerable to length extension

For Password Hashing:
✗ NOT plain SHA-256 (too fast)
✓ bcrypt, scrypt, Argon2
✓ Designed to be slow

For Key Derivation:
✓ HKDF (HMAC-based Key Derivation Function)
✓ PBKDF2 (with high iteration count)
✗ NOT plain hash of seed
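For the password case, a minimal sketch using PBKDF2 from Python's standard library is shown below. The iteration count and salt size are assumptions chosen for illustration, not recommendations from the lesson; bcrypt, scrypt, and Argon2 follow the same store-salt-and-verify pattern through their own libraries.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000   # assumed work factor; tune to your hardware and threat model
SALT_BYTES = 16        # assumed salt size

def hash_password(password: str, salt: bytes = None):
    """Derive a slow, salted hash. Returns (salt, derived_key) for storage."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, derived

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))                   # False
```

The per-password salt defeats precomputed tables, and the iteration count multiplies the cost of every guess, which plain SHA-256 does not.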

- SHA-256 for IDs and commitments
- HMAC in key derivation
- Proper constructions throughout

---

Section 6: Hash Function Attacks and Defenses

Understanding historical attacks on hash functions contextualizes SHA-256's security.

MD5: Completely Broken

1992: Published
2004: First collision found
2008: Rogue CA certificate attack
Today: Collision in seconds on laptop

Status: NEVER use for security
Still seen: File checksums (integrity, not security)

Attack Reality (illustrative session, not literal tool output):
$ hashclash --attack collision md5
[Collision found in 2.1 seconds]

SHA-1: Practically Broken

1995: Published
2005: Theoretical attack published
2017: SHAttered - practical collision
Cost: ~$110,000 in GPU compute

Status: Deprecated, actively being phased out
Chrome, Firefox: Reject SHA-1 certificates

Two different PDFs with same SHA-1 hash
Demonstrated real-world exploitability
Took 6,500 years of CPU time + 110 years of GPU time
First practical SHA-1 collision

SHA-256 vs Broken Predecessors:

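Property        | MD5         | SHA-1       | SHA-256
----------------|-------------|-------------|----------------
Output bits     | 128         | 160         | 256
Collision bits  | 64 (theory) | 80 (theory) | 128 (theory)
Best attack     | <64         | <80         | N/A (no attack)
Status          | BROKEN      | BROKEN      | SECURE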

Why SHA-256 Remains Secure:

Larger State:
More Rounds:
Improved Operations:
20+ Years of Analysis:

Hypothetical SHA-256 Break:

If collisions become feasible:
- Could create two transactions with same hash
- Merkle tree proofs become ambiguous
- Impact: Significant but limited

If pre-images become feasible:
- Could find transaction matching target hash
- Could potentially forge transaction IDs
- Impact: Severe, would require protocol changes

If second pre-images become feasible:
- Could replace any transaction with malicious one
- Would break all integrity guarantees
- Impact: Catastrophic

Response:
1. Amendment process to upgrade hash function
2. SHA-3 or SHA-384 as replacement
3. Protocol migration similar to Y2K preparations
4. Cryptographic agility enables response

Risk assessment:
- Probability of SHA-256 break: Very low
- Consequence if broken: High
- Mitigation available: Yes (protocol upgrades)
- Action needed now: Monitor research, prepare contingency

---

Critical Analysis

✅ SHA-256 has withstood 20+ years of intensive cryptanalysis. Unlike MD5 and SHA-1 which were broken, SHA-256 shows no signs of weakness. The absence of any attack significantly better than brute force after two decades of well-funded research suggests genuine security.

✅ 128-bit collision resistance is adequate for all foreseeable classical computing scenarios. The computational resources required to find a SHA-256 collision exceed what's practically achievable. Even with optimistic projections of computing power growth, this margin remains comfortable for decades.

✅ Merkle trees provide mathematically sound proofs of inclusion and state. The security reduction is clear: if the hash function is collision-resistant, Merkle proofs are unforgeable. This isn't a heuristic—it's a provable security property.

⚠️ Quantum computers narrow hash security margins. Grover's algorithm halves pre-image security, from 256 to 128 bits. For collisions, the best known quantum algorithm (Brassard-Høyer-Tapp) lowers the theoretical cost from about 2^128 to about 2^85 queries, but it needs enormous quantum memory and its practical advantage over classical search is debated. The timeline for quantum computers capable of this is uncertain but likely 15+ years.

⚠️ New mathematical breakthroughs could change the landscape. Cryptographic history includes surprises. While SHA-256 appears solid, the impossibility of proving security means unknown attacks could exist.

⚠️ Implementation quality varies. The hash function itself is secure, but implementations can introduce timing side channels or other vulnerabilities. Each implementation should be independently verified.

🔴 Using broken hash functions for security is common despite warnings. MD5 and SHA-1 are still found in production systems. XRPL correctly uses SHA-256, but ecosystem tools and integrations may not be as careful.

🔴 Length extension attacks threaten naive constructions. Anyone building authentication or commitment schemes using SHA-256 must understand this vulnerability. Using HMAC instead of plain hashing prevents the attack.

🔴 Hash function output ≠ random. While hash outputs appear random, they're deterministic. Using hash outputs where true randomness is required (key generation) introduces predictability if inputs are predictable.

SHA-256 is an excellent choice for XRPL's integrity needs. After two decades of cryptanalysis by motivated attackers (breaking SHA-256 would bring fame and fortune), no practical attack exists. The 128-bit collision resistance provides margins far beyond any conceivable classical computing attack.

However, "secure hash function" doesn't mean "use however you want." Construction matters: HMAC for authentication, proper key derivation functions for keys, and appropriate algorithms for each use case. XRPL's design reflects this understanding, using hash functions correctly throughout its architecture.

Deliverable: XRPL Hash Function Map

Assignment: Create a comprehensive diagram and documentation mapping every use of hash functions in XRPL's architecture, explaining why each specific function was chosen for each purpose.

Requirements:

Part 1: Visual Map

Every location where hash functions are used in XRPL
Which specific hash function is used at each point
Data flow through hashing operations
Relationships between hashed values
Transaction hashing → Transaction ID
Address derivation (full path from private key)
Ledger chain hashing
Merkle tree construction for transactions
State tree for account data
Signature scheme internals (where applicable)

Part 2: Function Justification Table

Location in architecture
Hash function used
Input format
Output format/size
Security property relied upon
Why this specific function was chosen
Alternatives considered (if known)

Part 3: Security Analysis

What attack would be enabled if this hash were broken?
What's the current security margin?
What would the upgrade path be?

Part 4: Implementation Notes

Code libraries that implement these hashes in XRPL ecosystem
Known implementation considerations
Testing/verification approaches

Grading Criteria:

Completeness of map (30%)
Technical accuracy (30%)
Justification quality (20%)
Presentation clarity (20%)

Time Investment: 4-5 hours

Value: This map serves as a reference for understanding XRPL's integrity architecture and for evaluating the security implications of any proposed changes to hash function usage.

Further Reading & Sources

Rogaway & Shrimpton: "Cryptographic Hash-Function Basics" (foundational paper)
NIST FIPS 180-4: Secure Hash Standard (SHA-256 specification)
Merkle: "A Digital Signature Based on a Conventional Encryption Function"

Wang et al.: "Finding Collisions in the Full SHA-1" (breakthrough attack)
Stevens et al.: "The First Collision for Full SHA-1" (SHAttered attack)
Kelsey & Schneier: "Second Preimages on n-bit Hash Functions for Much Less than 2^n Work"

Bertoni et al.: "The Keccak Reference" (design document for the algorithm standardized as SHA-3 in NIST FIPS 202)
NIST SP 800-185: SHA-3 Derived Functions

For Next Lesson:
We'll examine digital signatures—how ECDSA and EdDSA transform hash functions and elliptic curves into unforgeable proofs of authorization. Understanding signature generation and verification reveals how your private key authorizes transactions without ever being revealed.

End of Lesson 3

Estimated completion time: 55 minutes reading + 4-5 hours for deliverable

Key Takeaways

Cryptographic hash functions provide three distinct security properties.

Pre-image resistance prevents finding inputs for given outputs. Second pre-image resistance prevents finding alternative inputs with matching hashes. Collision resistance prevents finding any two inputs that hash identically. SHA-256 provides 256-bit, 256-bit, and 128-bit security respectively for these properties.

XRPL uses multiple hash functions for specific purposes.

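256-bit hashes handle transaction identification, ledger chaining, and Merkle tree construction (rippled computes these as SHA-512Half, the first half of a SHA-512 digest), while SHA-256 itself feeds address derivation and checksums. SHA-512 supports Ed25519 signatures. RIPEMD-160 compresses public key hashes for shorter addresses. Each choice reflects specific requirements.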

Merkle trees transform hash functions into efficient verification structures.

With a single root hash, you can prove inclusion of any element using only O(log n) hashes. This enables light clients, state proofs, and scalable verification—critical for practical blockchain operation.

The avalanche effect makes hash outputs completely unpredictable.

Changing one bit of input changes approximately half the output bits in an unpredictable pattern. This property underpins all integrity guarantees: any modification is detectable because it produces a completely different hash.

Proper construction prevents hash function misuse.

Length extension attacks break naive MAC constructions like H(secret || message). Using HMAC or SHA-3 prevents this vulnerability. XRPL uses appropriate constructions throughout, but developers building on XRPL must understand these requirements for their own code.
