Version 1.0 - May 2026
HyperDeteX is a decentralized platform for synthetic voice detection. Our production model was internally audited in April 2026 on a balanced 400-sample test set: it flags 100% of synthetic speech generated by ElevenLabs, OpenAI, Azure, Deepgram and Voxtral, and correctly classifies 99.48% of real voices from LibriSpeech and VoxPopuli. These results are strong but in-distribution only: they do not yet measure robustness to replay attacks, unseen TTS vendors, telephony audio, or non-English languages. The model is retrained continuously on community-contributed samples precisely to close that gap. A blockchain-based incentive layer on Base rewards contributors who expand and diversify this dataset.
To protect the authenticity of voice communication in the digital age by developing accessible and effective detection solutions, supported by an engaged community.
To become the global standard for synthetic voice detection, establishing a trust framework for digital voice communications.
In an era where ElevenLabs, OpenAI, Azure Neural HD, Deepgram and Mistral Voxtral produce voices indistinguishable from real ones, existing detection methods relying on hand-crafted spectral features fail to generalize. HyperDeteX uses a deep learning model that operates directly on raw waveforms and, per our April 2026 internal audit, flags 100% of digital-to-digital synthetic speech from the five providers above and classifies 99.48% of real voices correctly on LibriSpeech + VoxPopuli. The same model is then continuously retrained on community-contributed samples to extend coverage to replay attacks, unseen vendors, telephony audio and multiple languages.
The model uses a frozen self-supervised backbone (rich acoustic representations learned from 960 hours of speech) topped by a lightweight, task-specific classification head. This keeps the system data-efficient and fast to retrain as new TTS engines emerge — every new community sample feeds the next training cycle without architectural changes. A blockchain incentive layer on Base rewards the community contributors who keep the dataset current.
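The frozen-backbone pattern described above can be illustrated with a minimal sketch. Everything here is a stand-in, not the production model: a fixed random projection plays the role of the self-supervised features, the toy "real" and "synthetic" waveforms are synthetic noise, and only the small logistic head is trained — which is the point of the design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the frozen SSL backbone: a fixed projection from a
# 1-second raw waveform (16 kHz) to a 64-d embedding. Never updated.
W_backbone = rng.normal(size=(16000, 64)) / np.sqrt(16000)

def extract_features(waveform):
    """Frozen feature extraction: raw clip -> 64-d embedding."""
    return np.tanh(waveform @ W_backbone)

def train_head(X, y, lr=0.1, epochs=200):
    """Train only the lightweight logistic classification head."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid
        grad = p - y                            # dLoss/dlogit for BCE
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

# Toy data: "synthetic" clips differ from "real" by a small bias.
real = rng.normal(0.0, 1.0, size=(32, 16000))
fake = rng.normal(0.5, 1.0, size=(32, 16000))
X = extract_features(np.vstack([real, fake]))
y = np.concatenate([np.zeros(32), np.ones(32)])
w, b = train_head(X, y)
acc = (((X @ w + b) > 0) == y).mean()
```

Because the backbone never changes, a retraining cycle only refits the head on the grown dataset — cheap enough to run every time the community corpus expands.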
Modern TTS systems — ElevenLabs v3/Flash v2.5, OpenAI gpt-4o-mini-tts/tts-1-hd, Azure Neural HD, Deepgram Aura 2.0, Mistral Voxtral mini-2603 — now generate voices that fool human listeners and traditional MFCC/spectral detectors alike. Classifiers trained on one TTS engine generalize poorly to others, creating a perpetual cat-and-mouse dynamic. The absence of a shared, continuously updated dataset means every organization rebuilds detection from scratch.
$5B+
Annual losses from voice fraud
250%
Increase in deepfake incidents
85%
Companies seeking solutions
HyperDeteX leverages a proprietary deep learning architecture combining advanced neural networks optimized for voice authentication. Our multi-layer approach processes raw audio waveforms directly, extracting acoustic signatures that distinguish genuine human speech from AI-generated voices. The system employs a transfer learning strategy with selective fine-tuning, enabling rapid adaptation to emerging deepfake technologies while maintaining computational efficiency. Blockchain smart contracts handle dataset provenance, contributor rewards, and governance.
Fake-as-fake rate
100%
ElevenLabs · OpenAI · Azure · Deepgram · Voxtral
Real-as-real rate
99.48%
LibriSpeech + VoxPopuli
Inference
47 ms
GPU optimized · 1-3 s CPU
Scope disclosure. These numbers are in-distribution on the 5 TTS vendors listed above. They do not yet cover replay attacks, voice cloning, unseen TTS engines, telephony audio or non-English speech. Public benchmark results will be published as the model is retrained on community-contributed samples (see §9 Roadmap).
HyperDeteX runs a continuously trained detection model, audited at 100% fake-detection and 99.48% real-classification on in-distribution samples across 5 TTS vendors. Rather than swapping architectures, the model is retrained on every cycle with new community-contributed samples, expanding coverage to replay attacks, unseen TTS engines, telephony audio, and non-English speech. The architecture is designed for rapid adaptation as new TTS engines emerge.
Processing Approach:
• Direct raw audio analysis
• Multi-layer feature extraction
• Contextual pattern recognition
Optimization:
• Efficient transfer learning
• Minimal retraining requirements
• Fast inference — verdict within seconds
Our detection system employs a sophisticated multi-stage pipeline that analyzes raw audio signals through proprietary deep learning models. The approach combines modern neural architecture patterns with custom optimization techniques.
Direct waveform analysis without traditional feature extraction dependencies
Multi-layer neural networks extract acoustic signatures automatically
Advanced contextual analysis identifies synthetic voice characteristics
Binary decision output with confidence scoring
HyperDeteX Detection Pipeline
┌─────────────────────────────────────────────┐
│ RAW AUDIO INPUT │
│ Voice message or audio clip │
└─────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────┐
│ ACOUSTIC FEATURE EXTRACTION │
│ Multi-layer neural processing pipeline │
└─────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────┐
│ CONTEXTUAL PATTERN ANALYSIS │
│ Advanced deep learning architecture │
└─────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────┐
│ CLASSIFICATION LAYER │
│ Binary decision with confidence scoring │
└─────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────┐
│ OUTPUT │
│ REAL | FAKE + Confidence Score (%) │
└─────────────────────────────────────────────┘
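The classification layer at the bottom of the pipeline reduces to a simple mapping from the head's score to a verdict plus a confidence percentage. A minimal sketch (the logit source is a placeholder; the sign convention "positive = synthetic" is an assumption for illustration):

```python
import math

def verdict(logit: float, threshold: float = 0.5):
    """Map a raw model logit to (label, confidence %), as in the
    pipeline's output stage. Assumes positive logit means synthetic."""
    p_fake = 1.0 / (1.0 + math.exp(-logit))
    label = "FAKE" if p_fake >= threshold else "REAL"
    confidence = max(p_fake, 1.0 - p_fake) * 100.0
    return label, round(confidence, 1)

# Usage: a strongly positive logit yields a high-confidence FAKE verdict.
label, conf_pct = verdict(2.2)
```

Reporting `max(p, 1-p)` means the confidence score is always ≥ 50%, matching the REAL | FAKE + Confidence (%) output shown above.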
Confirmed: 100% of digital-to-digital TTS from the 5 audited vendors is detected; real speech is classified at 99.48% on LibriSpeech + VoxPopuli; no per-provider blind spots.
Not yet measured: replay attacks (microphone-played TTS), voice cloning (impersonation from short reference audio), unseen vendors, telephony audio, non-English speech, and public benchmarks. Results will be published as the model is retrained on community-contributed samples.
Validation Performance
Real-World Performance
Inference Speed
1–3 sec
Result delivered within seconds of submission
Efficiency
Optimized
Minimal computational overhead
Scalability
Enterprise
Supports high-volume processing
Transparency Note: As we scale from controlled datasets to community-contributed data, we anticipate performance variations that are normal in production ML systems. This section outlines expected challenges and our mitigation strategies.
v1 Baseline — In-Distribution Audit
v2 Target — Real-World, Voice Cloning & Telco
Risk: Community-contributed samples may contain labeling errors, affecting training quality.
Mitigation Strategy:
Risk: Deepfake technology evolves rapidly; TTS systems in 2026-2027 will be significantly more sophisticated than current models.
Mitigation Strategy:
DTX is an ERC-20 utility token deployed on Base L2 (Coinbase layer-2). It powers the HyperDeteX ecosystem by rewarding dataset contributors and enabling decentralized governance. A single-tier presale at $0.01/DTX precedes a Uniswap V2 launch at $0.015/DTX (+50% upside for presale buyers).
Up to 10% of total supply (15M DTX) is offered through a 5-day FOMO price ladder. Earlier days receive a better price; after Day 5 the price stabilizes until the presale closes.
| Day | Price | Upside vs launch |
|---|---|---|
| Day 1 | $0.010 | +50% |
| Day 2 | $0.011 | +36% |
| Day 3 | $0.012 | +25% |
| Day 4 | $0.013 | +15% |
| Day 5 → close | $0.014 | +7% |
| TGE / Launch | $0.015 | — |
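The upside column is simply the launch price over the tier price, minus one. A quick check of the ladder, rounded to whole percent:

```python
LAUNCH = 0.015  # $/DTX at TGE
ladder = {"Day 1": 0.010, "Day 2": 0.011, "Day 3": 0.012,
          "Day 4": 0.013, "Day 5": 0.014}

def upside_pct(price: float) -> int:
    """Percent gain if bought at `price` and valued at launch."""
    return round((LAUNCH / price - 1) * 100)

# e.g. Day 1: (0.015 / 0.010 - 1) * 100 = 50
upsides = {day: upside_pct(p) for day, p in ladder.items()}
```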
| Parameter | Value |
|---|---|
| Allocation | Up to 15,000,000 DTX (10% of supply) |
| Soft cap | $20,000 USDC |
| Hard cap | $125,000 USDC |
| Min / Max contribution | $100 / $2,500 per wallet |
| Payment | USDC on Base L2 |
TGE unlock: 15% immediate at launch + 85% released linearly, second by second, on-chain over 6 months from TGE. No cliff, no waiting period — buyers can claim at any moment.
Pool Launch is the maximum DTX reserved for LP seeding. The actual LP seed is sized to 30% of the raise at $0.015/DTX; any unused portion of the bucket is burned in full at launch(). The Reserve splits 10% (750K) to a Trading Wallet (liquid, market-making inventory) and 90% (6.75M) to 12-month linear vesting.
The presale raise is split three ways at finalize(). The LP is locked for 12 months and opens at the invariant price of $0.015 / DTX:
LP receiver: 30% of raise → Uniswap V2 pair
Buyback wallet: ~9% of raise → price support post-launch
Treasury: ~61% of raise → operations, dev, marketing
DTX_LP = LP_USDC / $0.015
Example (hardcap $125K): $37.5K USDC + 2.5M DTX in LP → spot $0.015. The Pool Launch bucket holds 3M DTX, so 500K DTX are burned at launch() even at hardcap. The lower the raise, the larger the burn: natural deflation proportional to how far the raise falls short of hardcap.
| Raise | LP USDC | DTX in LP | Burned at launch |
|---|---|---|---|
| $125K (hardcap) | $37,500 | 2,500,000 | 500,000 |
| $75K | $22,500 | 1,500,000 | 1,500,000 |
| $20K (softcap) | $6,000 | 400,000 | 2,600,000 |
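The table rows follow mechanically from the split described above. A sketch reproducing them (the constants are the figures stated in this section; the function name is illustrative, not the contract's):

```python
LAUNCH_PRICE = 0.015      # USD per DTX, LP invariant price
LP_SHARE = 0.30           # share of the raise sent to the LP
POOL_LAUNCH = 3_000_000   # DTX bucket reserved for LP seeding

def lp_seed(raise_usdc: float):
    """Return (lp_usdc, dtx_in_lp, dtx_burned) for a given raise.
    The unused part of the Pool Launch bucket is burned at launch()."""
    lp_usdc = raise_usdc * LP_SHARE
    dtx_in_lp = lp_usdc / LAUNCH_PRICE
    burned = POOL_LAUNCH - dtx_in_lp
    return lp_usdc, dtx_in_lp, burned
```

Note the invariant: whatever the raise, the LP always opens at exactly $0.015/DTX, because both sides of the pair scale with the same number.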
Buyback wallet ≈ tokensSold × 7.5% × $0.015. The market-making Trading Wallet is funded separately at TGE with 750K DTX (10% of the Reserve allocation, one-time, liquid).
All linear schedules start at TGE (not at deploy) and unlock continuously, second by second, on-chain. No cliffs, no end-of-month wait — claimable balance grows on every Base block (~2s). No tradable allocation can move before launch().
Presale (15M DTX)
15% at TGE + 85% linear continuous over 6 months
Development Fund (48.75M DTX)
1% at TGE + 95% linear continuous over 12 months
Marketing (28.50M DTX)
1% at TGE + 95% linear continuous over 12 months
Team & Advisors (19.88M DTX)
0% at TGE + 100% linear continuous over 12 months, no cliff
Reserve (7.50M DTX)
Split at TGE: 750K (10%) sent immediately to a liquid Trading Wallet (market-making inventory, one-time). Remaining 6.75M (90%) linear continuous over 12 months from TGE.
Community Rewards (27.38M DTX)
Distributed only via an on-chain oracle that signs reward attestations as users contribute valid voice samples. The reward formula and any discretionary exception rules (airdrops, giveaways, partner allocations) are disclosed before the presale opens — including how, how much, and the maximum cap.
Pool Launch (3.00M DTX)
Used at launch() to seed the Uniswap LP at $0.015. Sized to exactly fit the LP at a hardcap raise; any surplus is burned in full.
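Every schedule above reduces to the same claimable-balance formula: the TGE unlock plus a linear share of the vested portion, clamped at 100%. A sketch, with the TGE and linear percentages taken as listed per allocation (the 30-day-month approximation is ours; the contracts unlock per second on-chain):

```python
def claimable_fraction(t_sec: float, tge_pct: float,
                       linear_pct: float, months: int) -> float:
    """Fraction of an allocation claimable t_sec seconds after TGE.
    tge_pct unlocks immediately; linear_pct vests second by second
    over `months` (approximated here as 30-day months), no cliff."""
    duration = months * 30 * 24 * 3600
    progress = min(max(t_sec, 0.0) / duration, 1.0)
    return tge_pct + linear_pct * progress

# Presale: 15% at TGE + 85% over 6 months
# Team:     0% at TGE + 100% over 12 months
```

Because `progress` grows with every elapsed second, the claimable balance increases on every Base block (~2 s), with no end-of-month wait.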
Total Supply
150M
Fixed, non-inflationary, no mint
Presale Entry
$0.010
Day 1, 5-day ladder to $0.014
Launch Price
$0.015
+7% to +50% upside per tier
Blockchain
Base
Coinbase L2
The HyperDeteX contribution model is designed to continuously expand the dataset beyond the initial corpus the production model was trained on. Community submissions feed each retraining cycle, targeting the dimensions the model does not yet cover well: replay attacks, voice cloning, unseen TTS engines, telephony audio, and non-English speech. The production model evolves through continuous fine-tuning on the growing community dataset — moving from the wav2vec2 v1 baseline toward the XLS-R 300M + AASIST v2 target.
Submit short audio clips (real or AI-generated) via the Telegram bot — they feed the next retraining cycle
Samples the current model finds difficult are especially valuable — they directly improve detection accuracy after retraining
Participate in sample and model validation
Run nodes and maintain network infrastructure
Rewards are dynamic: each accepted contribution earns a fraction of the remaining rewards pool, adjusted by the contributor's tier, the contribution type, and the sample quality score. This guarantees long-term sustainability — the pool decreases gradually rather than draining.
Reward range per accepted sample:
Typical range: 2 – 30 DTX per contribution, depending on tier, type, and quality.
Full formula and per-sample breakdown are disclosed in the Telegram bot after each accepted contribution.
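To make the decaying-pool mechanic concrete, here is an illustrative model only: the reward is a small fraction of whatever remains in the pool, scaled by tier, type, and quality. The `BASE_RATE` constant and the exact shape are hypothetical; the disclosed formula in the Telegram bot is authoritative.

```python
BASE_RATE = 5e-7  # hypothetical fraction of the remaining pool per sample

def reward(pool_remaining: float, tier_mult: float,
           type_mult: float, quality: int) -> float:
    """Illustrative decaying-pool reward: a fraction of what is left,
    scaled by contributor tier, contribution type, and quality (1-10).
    Not the disclosed formula -- a sketch of the mechanic only."""
    r = pool_remaining * BASE_RATE * tier_mult * type_mult * (quality / 10)
    return min(r, pool_remaining)

pool = 27_380_000.0                                   # rewards pool, DTX
r1 = reward(pool, tier_mult=1.0, type_mult=1.5, quality=8)  # AI-Call
pool -= r1
```

Because each payout is proportional to the remaining pool, the balance decays geometrically: it shrinks with every contribution but never drains to zero.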
Contributors submit samples through three complementary channels, each producing a different kind of signal. The model gains the most from rarer, harder-to-collect signal — which is why live AI-Calls earn the largest multiplier.
1. Voice memo
User records a 2–15 s voice memo directly in the Telegram bot. Captured at device sample rate, transmitted via Telegram's Opus codec, server-side downsampled to 16 kHz.
Captures: real human voice · accents · consumer microphones
2. TTS upload
User generates a clip with any commercial TTS provider (ElevenLabs, OpenAI, Azure, Deepgram, Mistral, …) and uploads the digital file. Tracks the moving target of new synthetic engines.
Captures: emerging TTS vendors · pure digital synthesis
3. AI-Call (live) — NEW
User runs /call in the Telegram bot and joins a live WebRTC room with our AI agent (Realtime API). The full bidirectional conversation is streamed, transcoded server-side from Opus to µ-law 8 kHz to match G.711 telephony, then processed by the model.
Captures: telephony 8 kHz · replay-realistic channel · live voice-cloning attempts
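The telephony match in channel 3 rests on µ-law companding, the transform behind G.711. A minimal sketch of the continuous µ-law curve plus a crude decimation step (real G.711 uses a segmented 8-bit approximation, and a production resampler low-pass filters before decimating):

```python
import math

MU = 255.0  # mu-law parameter used by G.711 (North America / Japan)

def mu_law_compress(x: float) -> float:
    """Continuous mu-law companding of a sample in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y: float) -> float:
    """Inverse transform, restoring the linear sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def naive_downsample(samples, src_rate=16000, dst_rate=8000):
    """Crude decimation sketch: keep every Nth sample."""
    step = src_rate // dst_rate
    return samples[::step]
```

The companding curve boosts resolution for quiet samples at the expense of loud ones, which is exactly the channel distortion a telephony-robust detector has to learn.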
Different contribution types apply different multipliers. Rarer or more valuable types (live AI-Calls, bounty submissions) earn proportionally more from the pool:
| Contribution Type | Multiplier | What it captures |
|---|---|---|
| Voice memo | ×1.2 | Real human voice via Telegram |
| TTS upload | ×1.0 | Synthetic samples from any provider |
| AI-Call (live) | ×1.5 | Live conversation with our AI agent |
| Bounty | ×1.5 | Targeted samples requested by the team |
Each accepted sample receives a quality score (1–10) that scales the final reward.
Verdict latency
1–3s
From upload to result; up to 30 seconds for live AI-Calls
Reward settlement
on-chain
EIP-712 signed, paid in DTX
Rewards pool
27.4M
DTX reserved for contributors
Our continuously trained detection model (audited at 100% fake-detection / 99.48% real-classification on in-distribution samples, 1-3 s CPU inference) targets the growing problem space of synthetic-voice fraud and impersonation. As the model is retrained on community-contributed samples, coverage expands to telephony audio, replay-hardened detection, and multilingual speech — the dimensions where modern deepfake voice attacks are most damaging.
Our technical roadmap outlines the planned evolution of the HyperDeteX platform, focusing on continuous improvement of detection capabilities, scalability, and user experience.
Ensuring robust protection and system stability
Supporting growing network demands
Streamlining integration and usage
Exploring new AI architectures
Enhancing data protection
Improving system efficiency
HyperDeteX is led by a team of experts in artificial intelligence, blockchain technology, and cybersecurity. Our leadership combines deep technical expertise with extensive industry experience to drive innovation and sustainable growth.
HyperDeteX operates within a comprehensive regulatory framework designed to ensure compliance with international standards while protecting user privacy and data security. Our approach combines proactive regulatory engagement with robust internal controls.
As voice technology continues to evolve, HyperDeteX is positioned to lead the next wave of innovation in synthetic voice detection and verification. Our vision extends beyond current capabilities to shape the future of secure voice communication.
Market Size
$5.6B
By 2030
Samples collected
100K+
Target 2030
Partners
500+
Global reach
HyperDeteX is positioned to capitalize on the explosive growth of the voice biometrics and deepfake detection market, projected to reach $5.6 billion by 2030 with a CAGR of 47.6%. As the global AI market expands to $2 trillion and voice authentication becomes standard across financial services, healthcare, and government sectors, HyperDeteX will serve as the critical infrastructure protecting against synthetic voice fraud. Through our decentralized approach and community-driven development, we are building the foundation for trusted voice communication in an AI-dominated future.