Protocol Risk Scoring - Building Your Evaluation Framework
Learning Objectives
- Apply a systematic scoring framework across five risk dimensions to any DeFi protocol
- Evaluate security indicators including audit quality, track record, and bug bounty programs
- Assess team risk by analyzing identity, track record, and incentive alignment
- Identify economic design vulnerabilities through sustainability and dependency analysis
- Create weighted composite scores that inform position sizing and portfolio decisions
When evaluating DeFi protocols, most investors rely on intuition: "This feels solid" or "I've heard good things." This approach is inadequate for several reasons:
INTUITION PROBLEMS
Inconsistency:
├── Same person rates differently on different days
├── Mood, recent news, and presentation affect judgment
├── No systematic basis for comparison
└── Can't explain or defend decisions
Incompleteness:
├── You focus on what you know
├── Important factors get overlooked
├── Confirmation bias reinforces initial impression
└── Unknown unknowns stay unknown
Incomparability:
├── How does Protocol A compare to Protocol B?
├── "Both seem good" doesn't help allocation
├── No basis for position sizing
└── Portfolio construction becomes arbitrary
A systematic scoring framework addresses these problems by forcing consistent evaluation across standardized criteria, ensuring completeness, and enabling meaningful comparison.
The goal isn't perfect scores. The goal is systematic thinking that reduces the probability of catastrophic oversights.
Our protocol risk scoring model evaluates five dimensions, each weighted by its importance to overall risk:
PROTOCOL RISK SCORING MODEL
Dimension         Weight   Range   Description
─────────────────────────────────────────────────────
Security           30%     0-10    Audit, track record, bug bounty
Team               15%     0-10    Identity, history, incentives
Economic Design    25%     0-10    Sustainability, dependencies
Technical          15%     0-10    Architecture, complexity
Governance         15%     0-10    Centralization, upgrade process
─────────────────────────────────────────────────────
Total             100%     0-10    Weighted average
INTERPRETATION:
8.0-10.0: Lower risk (relatively) - Core portfolio candidate
6.0-7.9: Moderate risk - Selective allocation
4.0-5.9: Higher risk - Small positions only
< 4.0: Extreme risk - Generally avoid
NOTE: Even 8-10 scores don't mean "safe"
DeFi carries inherent risk regardless of score
WEIGHT RATIONALE
Security (30%):
├── Direct cause of most DeFi losses
├── Smart contract exploits = instant total loss
├── Most actionable dimension
└── Highest weight justified
Economic Design (25%):
├── Caused largest single loss (Terra/Luna)
├── Often overlooked by investors
├── Can fail even with perfect code
└── Second-highest weight
Team (15%):
├── Affects all other dimensions
├── But team risk is harder to assess
├── Good teams can have bad outcomes
└── Moderate weight
Technical (15%):
├── Underlying architecture matters
├── But captured partially in Security
├── Complexity is risk factor
└── Moderate weight
Governance (15%):
├── Can override other factors (upgrades)
├── Centralization creates single points of failure
├── But many protocols function fine with centralization
└── Moderate weight
You can adjust weights based on your priorities.
These represent a balanced starting point.
Not all audits are equal. Evaluate audit quality systematically:
AUDIT QUALITY ASSESSMENT
Factor 1: Number of Audits
├── 0 audits: Score 0
├── 1 audit: Score 4
├── 2 audits: Score 6
├── 3+ audits: Score 8
└── Diminishing returns after 3
Factor 2: Auditor Reputation
Tier 1 auditors (+2 points if at least one):
├── Trail of Bits
├── OpenZeppelin
├── Consensys Diligence
├── ChainSecurity
├── Certik (mixed reputation but widely used)
├── Zellic
├── Spearbit
└── Tier 1 = teams with strong track records
Tier 2 auditors (+ 1 point):
├── Other established firms
├── Reputable independent auditors
└── Known in the ecosystem
Unknown auditors (+ 0 points):
├── Unknown firms
├── Self-audited
└── Audit quality uncertain
Factor 3: Audit Recency
├── < 6 months since last audit: +1
├── 6-12 months: +0
├── > 12 months: -1
├── Code changes since audit: -1 additional
└── Recent = relevant
Factor 4: Finding Resolution
├── All critical/high resolved: +1
├── Outstanding critical/high: -3
├── Transparent response: +1
├── No response published: -1
└── How team handles findings matters
AUDIT SCORE CALCULATION:
Base: Number of audits score (0-8)
+ Auditor tier adjustment
+ Recency adjustment
+ Finding resolution adjustment
= Audit sub-score (cap at 10)
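The audit sub-score calculation above can be sketched as a small function. This is a minimal Python sketch of the rubric, not a canonical implementation; the function and argument names are my own, and the boolean flags reduce Factors 2-4 to their headline adjustments.

```python
def audit_subscore(num_audits: int,
                   tier1: bool = False,
                   tier2: bool = False,
                   months_since_audit: float = 12.0,
                   code_changed_since_audit: bool = False,
                   criticals_resolved: bool = True,
                   response_published: bool = True) -> float:
    """Audit sub-score on a 0-10 scale, following the four factors above."""
    # Factor 1: number of audits (diminishing returns after 3)
    if num_audits == 0:
        base = 0
    elif num_audits == 1:
        base = 4
    elif num_audits == 2:
        base = 6
    else:
        base = 8

    score = base
    # Factor 2: auditor reputation (Tier 1 outranks Tier 2)
    if tier1:
        score += 2
    elif tier2:
        score += 1

    # Factor 3: recency
    if months_since_audit < 6:
        score += 1
    elif months_since_audit > 12:
        score -= 1
    if code_changed_since_audit:
        score -= 1

    # Factor 4: finding resolution
    score += 1 if criticals_resolved else -3
    score += 1 if response_published else -1

    return max(0.0, min(10.0, score))

# One Tier 2 audit, 10 months old, criticals resolved, no published response:
print(audit_subscore(1, tier2=True, months_since_audit=10,
                     response_published=False))  # → 5
```

The cap keeps additive adjustments from pushing the sub-score outside the 0-10 range.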
Time without incident is evidence (but not proof) of security:
TRACK RECORD ASSESSMENT
Factor 1: Time Operating
├── < 3 months: Score 1
├── 3-6 months: Score 3
├── 6-12 months: Score 5
├── 12-24 months: Score 7
├── > 24 months: Score 9
└── Time-tested is valuable
Factor 2: TVL History
├── Peak TVL > $100M without incident: +1
├── Peak TVL > $1B without incident: +2
├── Significant incident: -3 to -5
└── Higher TVL = bigger target survived
Factor 3: Previous Incidents
├── None: +0 (baseline)
├── Minor (< $1M, resolved): -1
├── Moderate ($1M-$10M): -3
├── Major (> $10M): -5
├── Multiple incidents: -7
└── History matters
Factor 4: Related Protocol History
├── Team's previous protocols safe: +1
├── Team's previous protocol exploited: -2
├── Fork of safe protocol: +0.5
├── Fork of exploited protocol: -1
└── Track record extends to team history
TRACK RECORD SCORE:
Time operating base score
+ TVL adjustment
+ Incident adjustment
+ Related history adjustment
= Track record sub-score (cap at 10)
Bug bounty programs incentivize responsible disclosure:
BUG BOUNTY ASSESSMENT
Factor 1: Program Existence
├── No program: Score 2
├── Private/limited program: Score 4
├── Public program: Score 6
├── Platform-hosted (Immunefi, etc.): Score 7
└── Public is better than private
Factor 2: Bounty Size
├── < $50K max: +0
├── $50K-$100K max: +1
├── $100K-$500K max: +2
├── > $500K max: +3
└── Larger bounties attract more scrutiny
Factor 3: Response History
├── Paid bounties (evidence of working): +1
├── Fast response to submissions: +0.5
├── Ignored or disputed valid findings: -2
└── How they handle findings matters
BUG BOUNTY SCORE:
Existence base + Size adjustment + Response adjustment
= Bug bounty sub-score (cap at 10)
SECURITY SCORE CALCULATION
Security Score = (
Audit Score × 0.40 +
Track Record Score × 0.40 +
Bug Bounty Score × 0.20
)
Weighted contribution to total: Security Score × 0.30
EXAMPLE:
├── Audit Score: 7
├── Track Record Score: 8
├── Bug Bounty Score: 6
├── Composite: (7×0.4 + 8×0.4 + 6×0.2) = 7.2
└── Weighted contribution: 7.2 × 0.30 = 2.16
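The composite is a plain weighted average. A one-line sketch (function name mine) reproduces the worked example:

```python
def security_score(audit: float, track_record: float, bounty: float) -> float:
    # Weighted average of the three security sub-scores (weights from the text)
    return audit * 0.40 + track_record * 0.40 + bounty * 0.20

composite = security_score(7, 8, 6)
print(round(composite, 2))         # → 7.2
print(round(composite * 0.30, 2))  # contribution to the total score → 2.16
```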
Who's behind the protocol matters:
TEAM IDENTITY ASSESSMENT
Factor 1: Team Identification
├── Anonymous: Score 2
├── Pseudonymous with reputation: Score 4
├── Partially doxxed (some members): Score 6
├── Fully doxxed (key members public): Score 8
├── Institutional backing (known company): Score 9
└── More transparency = more accountability
Factor 2: Professional Background
├── No verifiable background: Score 2
├── Some crypto experience: Score 4
├── Significant crypto experience: Score 6
├── Traditional finance + crypto: Score 7
├── Top-tier backgrounds (FAANG, major institutions): Score 8
└── Experience in relevant domains
Factor 3: Communication Quality
├── No regular updates: Score 2
├── Irregular updates: Score 4
├── Regular updates (monthly+): Score 6
├── Frequent, detailed updates: Score 8
├── Multiple channels, responsive: Score 9
└── Communication indicates professionalism
IDENTITY SCORE:
(Identification × 0.50 + Background × 0.30 + Communication × 0.20)
= Team identity sub-score
What has the team done before?
TEAM TRACK RECORD ASSESSMENT
Factor 1: Previous Projects
├── No previous projects: Score 3
├── Previous projects unknown: Score 4
├── Previous projects modest success: Score 6
├── Previous projects significant success: Score 8
├── Multiple successful projects: Score 9
└── Serial success indicates capability
Factor 2: Reputation
├── Unknown: Score 3
├── Mixed reputation: Score 4
├── Generally positive: Score 6
├── Strong positive reputation: Score 8
├── Industry-recognized: Score 9
└── What does the community think?
Factor 3: Controversy
├── No known controversies: +0 (baseline)
├── Minor controversies resolved: -1
├── Significant controversies: -3
├── Fraud or exit scam history: -10 (disqualifying)
└── Past behavior predicts future
TRACK RECORD SCORE:
(Projects × 0.40 + Reputation × 0.40 + Controversy adjustment)
= Team track record sub-score
Do the team's incentives align with users'?
INCENTIVE ALIGNMENT ASSESSMENT
Factor 1: Token Holdings
├── Unknown holdings: Score 3
├── Holdings disclosed, small: Score 5
├── Holdings disclosed, significant: Score 7
├── Large holdings with vesting: Score 9
└── Skin in the game matters
Factor 2: Vesting Schedule
├── No vesting / immediate unlock: Score 2
├── Short vesting (< 1 year): Score 4
├── Standard vesting (1-2 years): Score 6
├── Long vesting (2-4 years): Score 8
├── Very long vesting (4+ years): Score 9
└── Aligned long-term incentives
Factor 3: Revenue Alignment
├── Revenue from protocol success: +2
├── Revenue regardless of success: -1
├── Clear fee sharing with users: +1
└── Does team win when users win?
INCENTIVE SCORE:
(Holdings × 0.40 + Vesting × 0.40 + Revenue adjustment)
= Incentive alignment sub-score
TEAM SCORE CALCULATION
Team Score = (
Identity Score × 0.35 +
Track Record Score × 0.35 +
Incentive Score × 0.30
)
Weighted contribution to total: Team Score × 0.15
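As with security, the team composite is a weighted average of its sub-scores. A sketch with hypothetical inputs (the sub-score values below are illustrative, not from the text):

```python
def team_score(identity: float, track_record: float, incentive: float) -> float:
    # Weights from the text: 0.35 / 0.35 / 0.30
    return identity * 0.35 + track_record * 0.35 + incentive * 0.30

t = team_score(7, 6, 8)    # hypothetical sub-scores
print(round(t, 2))         # → 6.95
print(round(t * 0.15, 2))  # contribution to the total score → 1.04
```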
Where does the money come from?
YIELD SUSTAINABILITY ASSESSMENT
Factor 1: Yield Source Clarity
├── Unclear / can't explain: Score 1
├── Vague (staking rewards, etc.): Score 3
├── Clear but complex: Score 5
├── Clear and simple: Score 8
├── Audited / verified: Score 9
└── If you can't explain it, don't invest
Factor 2: Yield Source Type
├── Pure token emissions: Score 2
├── Mixed (emissions + fees): Score 5
├── Primarily organic (fees, interest): Score 8
├── Fully organic: Score 10
└── Sustainable sources > inflationary
Factor 3: Yield Reasonableness
├── > 100% APY: Score 1 (extreme risk)
├── 50-100% APY: Score 3 (high risk)
├── 20-50% APY: Score 5 (elevated risk)
├── 10-20% APY: Score 7 (reasonable)
├── < 10% APY: Score 9 (conservative)
└── Higher yields = higher risks
Factor 4: Historical Yield Stability
├── New (< 3 months): Score 3
├── Volatile yields: Score 4
├── Stable yields: Score 7
├── Stable yields for 12+ months: Score 9
└── Can it persist?
YIELD SUSTAINABILITY SCORE:
(Clarity × 0.25 + Type × 0.35 + Reasonableness × 0.25 + Stability × 0.15)
Does the system depend on itself?
CIRCULAR DEPENDENCY ASSESSMENT
Factor 1: Token Value Dependencies
├── Token value independent of protocol: Score 8
├── Token value linked but not critical: Score 6
├── Token value partially circular: Score 4
├── Token value highly circular: Score 2
├── Pure Ponzi structure: Score 0
└── Terra/Luna = highly circular
Factor 2: Collateral Dependencies
├── External collateral (BTC, ETH, fiat): Score 8
├── Mixed collateral: Score 5
├── Primarily native token collateral: Score 3
├── Fully circular collateral: Score 1
└── What backs the system?
Factor 3: Demand Dependencies
├── Organic demand (utility): Score 8
├── Mixed demand: Score 5
├── Primarily yield-driven demand: Score 3
├── Purely speculative demand: Score 2
└── Why do people use it?
DEPENDENCY SCORE:
(Token Dependencies × 0.40 + Collateral × 0.35 + Demand × 0.25)
WARNING SIGNS (automatic score reduction):
├── "Number go up" required for survival: -3
├── Reflexive mechanisms: -2
├── Death spiral potential identified: -3
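The dependency score adds warning-sign deductions on top of the weighted average. A sketch (names mine) applied to a Terra/Luna-style profile:

```python
def dependency_score(token: float, collateral: float, demand: float,
                     needs_price_appreciation: bool = False,
                     reflexive: bool = False,
                     death_spiral_risk: bool = False) -> float:
    # Weighted average of the three factor ladders above
    score = token * 0.40 + collateral * 0.35 + demand * 0.25
    # Warning-sign deductions from the text
    if needs_price_appreciation:
        score -= 3
    if reflexive:
        score -= 2
    if death_spiral_risk:
        score -= 3
    return max(0.0, score)  # floor at zero

# Highly circular token value, native-token collateral, yield-driven demand,
# and all three warning signs present:
print(dependency_score(2, 3, 3, True, True, True))  # → 0.0
```

With all three warning signs, the deductions overwhelm the weighted average, which is the intended behavior: circular designs bottom out the score.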
What happens when things go wrong?
STRESS RESILIENCE ASSESSMENT
Factor 1: Drawdown Survival
├── Survives 30% token price drop: Score 2
├── Survives 50% drop: Score 5
├── Survives 70% drop: Score 7
├── Survives 90% drop: Score 9
└── Model under stress conditions
Factor 2: Liquidity Crisis Response
├── No mechanism: Score 2
├── Circuit breakers exist: Score 5
├── Multiple safeguards: Score 7
├── Battle-tested mechanisms: Score 9
└── What happens in a bank run?
Factor 3: Recovery Mechanisms
├── No recovery path: Score 2
├── Difficult recovery: Score 4
├── Clear recovery mechanisms: Score 7
├── Proven recovery (has recovered): Score 9
└── Can it come back?
STRESS RESILIENCE SCORE:
(Drawdown × 0.40 + Liquidity × 0.35 + Recovery × 0.25)
ECONOMIC DESIGN SCORE CALCULATION
Economic Score = (
Yield Sustainability × 0.35 +
Dependency Score × 0.40 +
Stress Resilience × 0.25
)
Weighted contribution to total: Economic Score × 0.25
ARCHITECTURE ASSESSMENT
Factor 1: Design Quality
├── Novel / untested design: Score 3
├── Established design with modifications: Score 5
├── Battle-tested design: Score 7
├── Industry-standard patterns: Score 9
└── Novel ≠ better; novel = riskier
Factor 2: Complexity
├── Highly complex (many contracts, interactions): Score 3
├── Moderate complexity: Score 5
├── Straightforward architecture: Score 7
├── Minimal complexity: Score 9
└── Simpler = fewer attack vectors
Factor 3: Code Quality Indicators
├── Poor documentation: Score 3
├── Basic documentation: Score 5
├── Good documentation: Score 7
├── Excellent documentation + comments: Score 9
└── Documentation indicates professionalism
ARCHITECTURE SCORE:
(Design × 0.35 + Complexity × 0.40 + Code Quality × 0.25)
DEPENDENCY ASSESSMENT
Factor 1: External Protocol Dependencies
├── Multiple critical dependencies: Score 3
├── Few critical dependencies: Score 5
├── Single well-tested dependency: Score 7
├── Minimal dependencies: Score 9
└── Each dependency is a risk vector
Factor 2: Oracle Dependencies
├── Single centralized oracle: Score 2
├── Single decentralized oracle: Score 5
├── Multiple oracles / aggregation: Score 7
├── TWAP / manipulation resistant: Score 9
├── No oracle needed: Score 10
└── Oracle = common attack vector
Factor 3: Upgrade Dependencies
├── Immutable: Score 9 (can't be changed)
├── Upgradeable with time lock: Score 6
├── Upgradeable with multisig: Score 4
├── Upgradeable by single key: Score 2
└── Upgradeable = can be changed
TECHNICAL DEPENDENCY SCORE:
(Protocol Deps × 0.35 + Oracle × 0.40 + Upgrade × 0.25)
TECHNICAL SCORE CALCULATION
Technical Score = (
Architecture Score × 0.50 +
Dependency Score × 0.50
)
Weighted contribution to total: Technical Score × 0.15
CENTRALIZATION ASSESSMENT
Factor 1: Admin Key Risk
├── Single admin EOA: Score 1
├── Multisig (3/5 or similar): Score 5
├── Large multisig (5/9+): Score 7
├── On-chain governance only: Score 9
├── Fully immutable: Score 10
└── Fewer keys = more risk
Factor 2: Upgrade Authority
├── Instant upgrades possible: Score 2
├── 24-48 hour timelock: Score 5
├── 7+ day timelock: Score 7
├── 30+ day timelock: Score 9
├── No upgrades possible: Score 10
└── Time locks enable response
Factor 3: Emergency Powers
├── Unrestricted emergency powers: Score 2
├── Limited emergency powers: Score 5
├── Emergency powers with transparency: Score 7
├── No emergency powers: Score 8
└── Emergency powers = trust requirement
CENTRALIZATION SCORE:
(Admin Key × 0.40 + Upgrade × 0.35 + Emergency × 0.25)
GOVERNANCE PROCESS ASSESSMENT
Factor 1: Token Distribution
├── Highly concentrated (>50% top 10): Score 3
├── Moderately concentrated: Score 5
├── Distributed: Score 7
├── Widely distributed: Score 9
└── Concentration = vote manipulation risk
Factor 2: Proposal Process
├── No formal process: Score 3
├── Informal process: Score 5
├── Formal on-chain process: Score 7
├── Robust process with safeguards: Score 9
└── Process indicates maturity
Factor 3: Attack Resistance
├── Vulnerable to flash loan voting: Score 2
├── Some protections: Score 5
├── Robust protections: Score 8
└── Can governance be attacked?
GOVERNANCE PROCESS SCORE:
(Distribution × 0.35 + Process × 0.35 + Attack Resistance × 0.30)
GOVERNANCE SCORE CALCULATION
Governance Score = (
Centralization Score × 0.55 +
Process Score × 0.45
)
Weighted contribution to total: Governance Score × 0.15
TOTAL PROTOCOL SCORE
Total = (Security × 0.30) + (Team × 0.15) + (Economic × 0.25)
+ (Technical × 0.15) + (Governance × 0.15)
INTERPRETATION:
├── 8.0-10.0: Lower risk - Up to 15-25% allocation
├── 6.0-7.9: Moderate risk - Up to 10-15% allocation
├── 4.0-5.9: Higher risk - Up to 5% allocation
├── 2.0-3.9: Extreme risk - Up to 1-2% allocation
└── 0.0-1.9: Avoid
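Putting the five dimensions together, a sketch of the total score and its interpretation bands follows. The dimension scores used here are hypothetical; function and variable names are my own.

```python
# Dimension weights from the text
WEIGHTS = {"security": 0.30, "team": 0.15, "economic": 0.25,
           "technical": 0.15, "governance": 0.15}

def total_score(scores: dict) -> float:
    """Weighted sum of the five dimension scores (each on a 0-10 scale)."""
    return sum(scores[dim] * w for dim, w in WEIGHTS.items())

def interpret(total: float) -> str:
    """Map a total score to the interpretation bands above."""
    if total >= 8.0:
        return "Lower risk - up to 15-25% allocation"
    if total >= 6.0:
        return "Moderate risk - up to 10-15% allocation"
    if total >= 4.0:
        return "Higher risk - up to 5% allocation"
    if total >= 2.0:
        return "Extreme risk - up to 1-2% allocation"
    return "Avoid"

scores = {"security": 7.2, "team": 6.95, "economic": 6.5,
          "technical": 7.0, "governance": 5.5}  # hypothetical
t = total_score(scores)
print(round(t, 2), "-", interpret(t))  # a moderate-risk protocol
```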
SCORE-BASED POSITION SIZING
Formula:
Max Position % = Base × (Score / 10)²
Where Base = 25% (maximum any position)
Examples:
├── Score 9.0: 25% × 0.81 = 20.3%
├── Score 8.0: 25% × 0.64 = 16.0%
├── Score 7.0: 25% × 0.49 = 12.3%
├── Score 6.0: 25% × 0.36 = 9.0%
├── Score 5.0: 25% × 0.25 = 6.3%
├── Score 4.0: 25% × 0.16 = 4.0%
└── Score 3.0: 25% × 0.09 = 2.3%
Squaring penalizes lower scores appropriately.
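The sizing rule above can be sketched directly; BASE is the 25% single-position ceiling from the text. Percentages are printed to two decimals here — the table above rounds them to one.

```python
BASE = 25.0  # maximum % of portfolio for any single position

def max_position_pct(score: float) -> float:
    # BASE × (score/10)², written as BASE·score²/100 to keep the math exact
    return BASE * score * score / 100.0

for s in (9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0):
    print(f"Score {s}: {max_position_pct(s):.2f}%")
# Score 9.0: 20.25%, Score 6.0: 9.00%, Score 3.0: 2.25%, etc.
```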
✅ Systematic evaluation reduces oversight. Checklists and frameworks prevent common evaluation failures.
✅ Quantified scores enable comparison. Numbers allow meaningful portfolio construction.
✅ Multi-factor analysis captures diverse risks. No single factor predicts failure; combining factors improves assessment.
⚠️ Correct weighting is unknown. The weights are reasoned but not empirically validated.
⚠️ Scoring subjectivity remains. Different evaluators will score differently.
⚠️ Scores change over time. Regular updates required.
📌 Score worship. Scores are inputs to judgment, not replacements.
📌 False precision. 6.73 isn't meaningfully different from 6.68.
📌 Gaming. Protocols can optimize for criteria without reducing actual risk.
Protocol scoring transforms vague impressions into actionable numbers. But scores are estimates with significant uncertainty. Use them as one input among many, update frequently, and never forget that even high-scoring protocols can fail.
Assignment: Complete comprehensive risk scorecards for three DeFi protocols.
Part 1: Protocol Selection
- Select 3 protocols (include variety)
- Include at least one XRPL protocol if applicable
Part 2: Detailed Scoring
Complete full scorecard for each with scores and evidence notes.
Part 3: Position Sizing
Calculate maximum recommended position for each.
Part 4: Comparison Analysis
Rank protocols and explain allocation rationale.
Part 5: Framework Reflection
Note challenges, missing information, and potential weight adjustments.
Grading rubric:
- Scoring completeness (25%)
- Evidence quality (25%)
- Consistency (20%)
- Position sizing logic (15%)
- Reflection quality (15%)
Time investment: 3 hours
1. Weight Justification:
Why does Security receive the highest weight (30%)?
A) Security is easiest to evaluate
B) Smart contract exploits cause instant, total losses
C) Security is more important than economic design
D) Higher weights produce higher scores
Correct Answer: B
2. Audit Quality:
A protocol has one Tier 1 audit (8 months old), resolved critical findings, no bug bounty. Approximate audit quality score?
A) 8-9 (Excellent)
B) 5-6 (Moderate)
C) 2-3 (Poor)
D) 9-10 (Perfect)
Correct Answer: B
3. Economic Design Red Flag:
A protocol offers 80% APY from token emissions. Token price must rise for real USD returns. What's the main concern?
A) Counterparty risk
B) Circular dependency / reflexivity
C) Oracle manipulation
D) Governance attack
Correct Answer: B
4. Position Sizing:
Protocol scores 6.0. Using Max = 25% × (Score/10)², what's the maximum position?
A) 6.0%
B) 9.0%
C) 15.0%
D) 25.0%
Correct Answer: B
5. Score Limitations:
Protocol X scores 8.2 then gets exploited 6 months later. Most likely explanation?
A) Framework is useless
B) Scores are estimates; even well-scored protocols can fail
C) Calculation errors
D) Wrong weights
Correct Answer: B
- DeFi Llama (protocol data)
- DeFi Safety (protocol ratings)
- L2Beat (Layer 2 risk methodology)
- Audit firm reports
- Rekt News (exploit analysis)
For Next Lesson:
Lesson 3 deep-dives into smart contract risk—reading audit reports, understanding attack vectors, and assessing time-based risk decay.
End of Lesson 2
Total words: ~5,400
Estimated completion time: 60 minutes reading + 3 hours for deliverable
Key Takeaways
Five dimensions capture protocol risk comprehensively.
Security (30%), Economic Design (25%), Team (15%), Technical (15%), and Governance (15%).
Sub-factor breakdown ensures thorough evaluation.
Breaking dimensions into specific factors prevents oversight.
Scores enable position sizing.
A score of 8.0 warrants ~16% max allocation; score of 5.0 warrants ~6%.
Portfolio-level scoring reveals aggregate risk.
Weight protocol scores by position size.
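One way to sketch the portfolio-level aggregate: weight each protocol's score by its share of the portfolio. The names, shares, and scores below are hypothetical.

```python
# (name, portfolio share, protocol score) — shares sum to 1.0
positions = [("Protocol A", 0.40, 8.1),
             ("Protocol B", 0.35, 6.9),
             ("Protocol C", 0.25, 5.2)]

# Portfolio score = sum of each protocol's score weighted by its share
portfolio_score = sum(share * score for _, share, score in positions)
print(round(portfolio_score, 2))  # weighted-average risk score
```

A low aggregate flags a portfolio concentrated in higher-risk protocols even when no single position looks alarming.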
Scores are tools, not answers.
Use as input to judgment, maintain consistency, and update regularly.