WS5000 in mass production · live demo units ready

Make every GPU
earn its keep

ZK-DPU WS5000 is an all-flash accelerated storage appliance for AI. A disaggregated architecture and an end-to-end high-speed data path free your GPU cluster from waiting on data — lifting utilization and cutting total cost, with no changes to your framework.

Independently validated by Beijing Information Science and Technology University · median latency reduction 90.9% across 7 metrics

  • 300 GB/s aggregate bandwidth (vendor spec, S9)
  • 85.17× peak model-load speedup (validated, S38)
  • 50M random IOPS (vendor spec, S9)
  • 20 µs access latency (vendor spec, S9)
THE BOTTLENECK

You bought top-tier GPUs — and they wait on data

Stacking more GPUs yields diminishing returns. The real bottleneck is data supply: model loading, checkpoint I/O and KV-cache scheduling.

Compute throttled by storage

Average utilization at China’s AI data centers is below 60%; in I/O-bound cases effective GPU utilization is often just 30–50% (S11).

Storage is the hidden ceiling

Conventional NFS / centralized storage caps bandwidth, so GPUs idle waiting for data. The larger the model, the higher the toll.

Turn storage into an amplifier

ZK-DPU uses disaggregation to turn storage from a supporting role into a compute amplifier, lifting GPU utilization by 2–3× (S4).
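To see why shorter data waits can translate into a multiple of GPU utilization, here is a minimal sketch. It assumes a deliberately simple model in which utilization is compute time divided by compute time plus data-wait time. The function name and all three timings are hypothetical and chosen only for illustration; they are not measurements from the WS5000 or from any benchmark.

```python
def effective_utilization(compute_s: float, data_wait_s: float) -> float:
    """Fraction of wall-clock time the GPU spends computing (toy model)."""
    return compute_s / (compute_s + data_wait_s)


# Hypothetical timings for illustration only, not measured values.
COMPUTE_S = 40.0      # time the GPU spends on useful work
WAIT_BEFORE_S = 60.0  # time stalled on storage in an I/O-bound setup
WAIT_AFTER_S = 6.0    # assumes the data wait shrinks tenfold

before = effective_utilization(COMPUTE_S, WAIT_BEFORE_S)
after = effective_utilization(COMPUTE_S, WAIT_AFTER_S)
lift = after / before

print(f"utilization before: {before:.0%}")
print(f"utilization after:  {after:.0%}")
print(f"lift: {lift:.2f}x")
```

In this toy case a GPU that starts at 40% utilization lands a little above twice that once most of the wait is gone. The real lift depends on how much of your workload's wall-clock time is actually spent waiting on data.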

WS5000

The WS5000 all-flash storage appliance

A high-performance appliance for AI training and inference. Disaggregated storage plus an end-to-end fast data path raise effective utilization and slash total cost — without touching your framework.

  • 300 GB/s aggregate bandwidth, 50M random IOPS, 20 µs latency
  • 90%+ mainstream GPU coverage, deeply tuned for Huawei Ascend and domestic accelerators
  • Turnkey deployment in 48–72 hours; ~40% lower total cost
  • Four core technologies: NVMe-oF/RDMA, GPUDirect, all-flash EBOF, KV-cache scheduling


WS5000 · ALL-FLASH EBOF
INDEPENDENT VALIDATION

Reproducible third-party benchmarks

Beijing Information Science and Technology University ran an independent test on the Huawei Ascend Atlas 910B platform against an NFS baseline — leading on all 7 metrics.

  • 85.17× DeepSeek-32B model-load speedup: 563.85 s → 6.62 s (−98.83%)
  • 9.33× DeepSeek-70B inference service speedup (end-to-end service)
  • 5.3–12.5× training / checkpoint I/O speedup (weights and checkpoints)
  • +356.9% effective token throughput gain under high-frequency switching (40/day)
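The DeepSeek-32B headline can be checked directly from the two load times reported above. The short sketch below recomputes the speedup and the percentage change; the function and constant names are ours, introduced only for this illustration, and the two timings are the only inputs taken from the benchmark.

```python
def speedup(baseline_s: float, accelerated_s: float) -> float:
    """How many times faster the accelerated run is than the baseline."""
    return baseline_s / accelerated_s


def percent_change(baseline_s: float, accelerated_s: float) -> float:
    """Signed change in elapsed time, as a percentage of the baseline."""
    return (accelerated_s - baseline_s) / baseline_s * 100.0


# DeepSeek-32B model-load times reported above (NFS baseline vs. WS5000).
NFS_LOAD_S = 563.85
WS5000_LOAD_S = 6.62

print(f"speedup: {speedup(NFS_LOAD_S, WS5000_LOAD_S):.2f}x")
print(f"change in load time: {percent_change(NFS_LOAD_S, WS5000_LOAD_S):.2f}%")
```

Rounded to two decimals, this gives 85.17x and -98.83%, matching the published figures. The same two functions can be pointed at load times from your own PoC run.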


SOLUTIONS

Four scenarios, one disaggregated platform

From greenfield clusters to brownfield retrofits, from training to inference — across the full lifecycle of AI infrastructure.

Training clusters

Accelerate model loading and checkpoint I/O to shorten training iterations and cut idle GPU time.

Inference serving

Long-context and high-frequency multi-model switching — markedly higher effective GPU utilization.

AI centers / domestic stack

Disaggregation plus deep Ascend tuning for sovereign, self-controlled infrastructure.

Brownfield retrofit

No GPU swap, no downtime — revive idle compute assets in place.


ECOSYSTEM

Ecosystem & certainty

Validated · manufacturable · ecosystem-ready

  • Huawei Ascend
  • Luxshare Precision (foundry)
  • Beijing Information Science and Technology University
  • Domestic GPU 90%+
  • AMD (in testing)
  • xFusion (in testing)

Honesty discipline

We separate what is delivered from what is in progress: third-party validation and mass-production foundry are delivered; AMD and xFusion platform adaptations are in testing (subject to final reports).

Benchmark it on your own workload

Two live demo units are ready for immediate PoC. Let the data do the talking.