Parallelism Memory Explorer

Visualizes how model states (weights, gradients, and optimizer states) are sharded across the GPUs in a cluster.

Total GPUs: 4
Total Weights: 14 GB
VRAM / GPU: 0 GB
Sharding: ZeRO-3
Weights
Gradients
Optimizer
Residuals
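
The per-GPU figure such a view reports can be approximated by hand. The sketch below is a rough estimate and not the explorer's own calculation: it assumes mixed-precision Adam training, where gradients take as much memory as the fp16 weights and the optimizer states (fp32 master weights, momentum, variance) take about six times as much. The function name `per_gpu_model_states` and the parameters `grad_ratio` and `optim_ratio` are hypothetical, chosen for illustration. Residuals (activations, temporary buffers, fragmentation) are not modeled.

```python
from dataclasses import dataclass


@dataclass
class ModelStateMemory:
    """Per-GPU memory for the three model states, in GB."""
    weights_gb: float
    gradients_gb: float
    optimizer_gb: float

    @property
    def total_gb(self) -> float:
        return self.weights_gb + self.gradients_gb + self.optimizer_gb


def per_gpu_model_states(
    weights_gb: float,
    num_gpus: int,
    zero_stage: int,
    grad_ratio: float = 1.0,   # assumption: fp16 gradients match fp16 weights in size
    optim_ratio: float = 6.0,  # assumption: Adam fp32 states are ~6x the fp16 weights
) -> ModelStateMemory:
    """Estimate per-GPU model-state memory for ZeRO stages 0 to 3.

    Stage 0 replicates everything. Stage 1 shards optimizer states,
    stage 2 also shards gradients, stage 3 also shards weights.
    """
    if zero_stage not in (0, 1, 2, 3):
        raise ValueError("zero_stage must be 0, 1, 2, or 3")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    gradients_gb = weights_gb * grad_ratio
    optimizer_gb = weights_gb * optim_ratio

    # A state is divided across GPUs only once its stage is enabled.
    weights_div = num_gpus if zero_stage >= 3 else 1
    gradients_div = num_gpus if zero_stage >= 2 else 1
    optimizer_div = num_gpus if zero_stage >= 1 else 1

    return ModelStateMemory(
        weights_gb=weights_gb / weights_div,
        gradients_gb=gradients_gb / gradients_div,
        optimizer_gb=optimizer_gb / optimizer_div,
    )


if __name__ == "__main__":
    # Values taken from the panel above: 4 GPUs, 14 GB of weights, ZeRO-3.
    estimate = per_gpu_model_states(weights_gb=14, num_gpus=4, zero_stage=3)
    print(f"weights:   {estimate.weights_gb:.1f} GB")
    print(f"gradients: {estimate.gradients_gb:.1f} GB")
    print(f"optimizer: {estimate.optimizer_gb:.1f} GB")
    print(f"total:     {estimate.total_gb:.1f} GB")
```

Under these assumptions, the configuration above comes to about 28 GB of model states per GPU, compared with 112 GB per GPU when nothing is sharded. Changing `optim_ratio` adapts the estimate to other optimizers or precisions.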