Hardware Requirements
Everything you need to know about running Happy Horse 1.0.
TL;DR: You need an NVIDIA H100 or A100 with at least 48GB VRAM. Consumer GPUs (RTX 4090) cannot run the full model. FP8 quantization may enable A100 40GB usage with quality tradeoffs.
Minimum Requirements
| Component | Requirement |
|---|---|
| GPU | NVIDIA H100 80GB or A100 80GB |
| VRAM | ≥48GB (80GB recommended) |
| System RAM | ≥64GB |
| Storage | ~30GB for model weights |
| CUDA | 12.0 or later |
| Python | 3.10 or later |
| OS | Linux (Ubuntu 22.04+ recommended) |
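The requirements above can be checked with a small preflight script before downloading the weights. The following is a minimal sketch, not an official tool: the function names (`check_requirements`, `detect_vram_gb`, `detect_ram_gb`) are hypothetical, the thresholds are copied from the table, and the CUDA version check is left out. GPU memory is read through `nvidia-smi`, which is assumed to be on the `PATH` when an NVIDIA driver is installed.

```python
import os
import platform
import shutil
import subprocess
import sys

# Thresholds copied from the Minimum Requirements table.
MIN_PYTHON = (3, 10)
MIN_VRAM_GB = 48
MIN_RAM_GB = 64
MIN_FREE_DISK_GB = 30


def check_requirements(python_version, system, vram_gb, ram_gb, free_disk_gb):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python 3.10 or later is required")
    if system != "Linux":
        problems.append("Linux is required")
    if vram_gb is None:
        problems.append("No NVIDIA GPU detected")
    elif vram_gb < MIN_VRAM_GB:
        problems.append(f"GPU has {vram_gb:.0f}GB VRAM, need at least {MIN_VRAM_GB}GB")
    if ram_gb is not None and ram_gb < MIN_RAM_GB:
        problems.append(f"System RAM is {ram_gb:.0f}GB, need at least {MIN_RAM_GB}GB")
    if free_disk_gb < MIN_FREE_DISK_GB:
        problems.append(f"Only {free_disk_gb:.0f}GB free disk, need about {MIN_FREE_DISK_GB}GB")
    return problems


def detect_vram_gb():
    """Largest GPU memory in GB reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (subprocess.CalledProcessError, OSError):
        return None
    values = [int(line) for line in out.splitlines() if line.strip().isdigit()]
    return max(values) / 1024 if values else None  # nvidia-smi reports MiB


def detect_ram_gb():
    """Physical RAM in GB via POSIX sysconf, or None where that is unsupported."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
    except (AttributeError, ValueError, OSError):
        return None


if __name__ == "__main__":
    found = check_requirements(
        sys.version_info, platform.system(), detect_vram_gb(),
        detect_ram_gb(), shutil.disk_usage(".").free / 1024**3,
    )
    print("\n".join(found) if found else "All checks passed")
```

Reported capacities can be slightly below the nominal figure (a "64GB" machine may report a little less usable RAM), so a borderline failure is a prompt to check by hand rather than a hard verdict.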
GPU Comparison
| GPU | VRAM | Compatible? | Est. Speed | Notes |
|---|---|---|---|---|
| H100 80GB | 80GB | Yes | 38s (1080p) | Recommended — fastest inference |
| H100 NVL | 94GB | Yes | ~35s | Best performance, dual-GPU card |
| A100 80GB | 80GB | Yes | ~55s | Good performance, widely available |
| A100 40GB | 40GB | FP8 only | ~70s | Requires quantization, quality loss |
| L40S | 48GB | FP8 only | ~80s | Budget data center option |
| RTX 4090 | 24GB | No | — | Insufficient VRAM even with FP8 |
| RTX 3090 | 24GB | No | — | Insufficient VRAM |
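For scripted GPU selection, the table can be encoded as a lookup. This is an illustrative sketch only: `GPU_TABLE` and `compatible_gpus` are hypothetical names, and the values are transcribed from the table above rather than measured.

```python
# Rows transcribed from the GPU Comparison table:
# name -> (VRAM in GB, support level, estimated seconds per 1080p clip)
GPU_TABLE = {
    "H100 80GB": (80, "yes", 38),
    "H100 NVL": (94, "yes", 35),
    "A100 80GB": (80, "yes", 55),
    "A100 40GB": (40, "fp8", 70),
    "L40S": (48, "fp8", 80),
    "RTX 4090": (24, "no", None),
    "RTX 3090": (24, "no", None),
}


def compatible_gpus(allow_fp8=False):
    """Names of GPUs that can run the model, fastest first."""
    allowed = {"yes", "fp8"} if allow_fp8 else {"yes"}
    rows = [
        (seconds, name)
        for name, (_vram, support, seconds) in GPU_TABLE.items()
        if support in allowed
    ]
    return [name for _seconds, name in sorted(rows)]


if __name__ == "__main__":
    print("Full precision:", compatible_gpus())
    print("Including FP8: ", compatible_gpus(allow_fp8=True))
```

Passing `allow_fp8=True` adds the cards that only work with quantization, which is useful when quality loss is acceptable and availability matters more than speed.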
Cloud Cost Comparison
On-demand pricing for single H100/A100 instances (as of April 2026).
| Provider | GPU | $/hour | Cost per 1080p video* | $/1,000 videos |
|---|---|---|---|---|
| Lambda Cloud | H100 | $2.49 | $0.026 | $26 |
| AWS (p5.xlarge) | H100 | $3.50 | $0.037 | $37 |
| GCP | H100 | $3.70 | $0.039 | $39 |
| Azure | H100 | $3.60 | $0.038 | $38 |
| AWS (p4d.xlarge) | A100 | $2.20 | $0.034 | $34 |
* Cost per video assumes a 5-second 1080p clip and the estimated generation times from the GPU Comparison table: ~38s on an H100 and ~55s on an A100. Actual costs may vary with instance startup time and data transfer.
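The per-video figures follow from the hourly rate multiplied by the generation time as a fraction of an hour. The sketch below reproduces the table so the numbers can be recomputed when prices change. The function name `cost_per_video` is illustrative, and the A100 row is assumed to use the ~55s estimate from the GPU Comparison table, which is what reproduces its $0.034 figure.

```python
def cost_per_video(hourly_rate_usd, generation_seconds):
    """Compute cost of one clip: hourly rate times the fraction of an hour used."""
    return hourly_rate_usd * generation_seconds / 3600


# (provider, GPU, $/hour, estimated seconds per 5-second 1080p clip)
PROVIDERS = [
    ("Lambda Cloud", "H100", 2.49, 38),
    ("AWS", "H100", 3.50, 38),
    ("GCP", "H100", 3.70, 38),
    ("Azure", "H100", 3.60, 38),
    ("AWS", "A100", 2.20, 55),
]

if __name__ == "__main__":
    for provider, gpu, rate, seconds in PROVIDERS:
        per_video = cost_per_video(rate, seconds)
        print(f"{provider:<13} {gpu}  ${per_video:.3f}/video  ${1000 * per_video:.0f}/1000 videos")
```

This counts only the seconds spent generating. Idle time, startup, and data transfer are billed too, so real costs will be somewhat higher.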
Cost Optimization Tips
1. Use spot/preemptible instances
   Can save 60-70% on cloud costs. Suitable for batch generation where interruptions are acceptable.
2. Iterate at 256p, finalize at 1080p
   Use the fast 256p mode (~2s) to refine prompts, then generate the final version at full quality.
3. Consider FP8 quantization
   If slight quality loss is acceptable, FP8 lowers VRAM requirements enough to run on cheaper 40-48GB GPUs such as the A100 40GB or L40S. 24GB consumer cards remain insufficient.
4. Batch processing
   Generate multiple videos per session to amortize instance startup costs.
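To see how much these tips matter, the arithmetic can be sketched in a few lines. This is a rough model under stated assumptions: the function names are hypothetical, the 120-second startup time is an invented placeholder rather than a measured value, and the spot discount is taken from the 60-70% range quoted above.

```python
def session_cost(hourly_rate_usd, busy_seconds, startup_seconds=0.0, spot_discount=0.0):
    """Cost of one session; spot_discount is a fraction such as 0.65 for 65% off."""
    billed_seconds = busy_seconds + startup_seconds
    return hourly_rate_usd * (1 - spot_discount) * billed_seconds / 3600


def iteration_seconds(n_drafts, draft_seconds, final_seconds):
    """GPU time for n_drafts draft renders followed by one final render."""
    return n_drafts * draft_seconds + final_seconds


def billed_seconds_per_video(n_videos, seconds_each, startup_seconds):
    """Average billed time per video once startup is spread over the batch."""
    return (n_videos * seconds_each + startup_seconds) / n_videos


if __name__ == "__main__":
    rate = 2.49  # $/hour, H100 on-demand figure from the cost table

    # Nine drafts plus one final: everything at 1080p vs. drafts at 256p.
    all_1080p = iteration_seconds(9, 38, 38)
    drafts_256p = iteration_seconds(9, 2, 38)
    print(f"All 1080p:   {all_1080p}s  ${session_cost(rate, all_1080p):.3f}")
    print(f"256p drafts: {drafts_256p}s  ${session_cost(rate, drafts_256p):.3f}")

    # One hour of generation on demand vs. an assumed 65% spot discount.
    print(f"On-demand hour: ${session_cost(rate, 3600):.2f}")
    print(f"Spot hour:      ${session_cost(rate, 3600, spot_discount=0.65):.2f}")

    # Assumed 120s startup, amortized over 1 vs. 100 videos.
    print(f"1 video:    {billed_seconds_per_video(1, 38, 120):.1f}s billed per video")
    print(f"100 videos: {billed_seconds_per_video(100, 38, 120):.1f}s billed per video")
```

Under these assumptions, drafting at 256p cuts GPU time for a ten-render prompt session from 380s to 56s, and batching 100 videos brings the billed time per video close to the raw generation time.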