WS5000 in mass production · live demo units ready

Make every GPU
earn its keep

ZK-DPU WS5000 is an all-flash accelerated storage appliance for AI. A disaggregated architecture and an end-to-end high-speed data path free your GPU cluster from waiting on data — lifting utilization and cutting total cost, with no changes to your framework.

Request a PoC → See the benchmarks

Independently validated by Beijing Information Science and Technology University · median latency reduction 90.9% across 7 metrics

0 GB/s

Aggregate bandwidth

Vendor spec S9

0×

Peak model-load speedup

Validated S38

Random IOPS

Vendor spec S9

0 µs

Access latency

Vendor spec S9

THE BOTTLENECK

You bought top-tier GPUs — and they wait on data

Stacking more GPUs yields diminishing returns. The real bottleneck is data supply: model loading, checkpoint I/O and KV-cache scheduling.

Compute throttled by storage

Average utilization at China’s AI data centers is below 60%; in I/O-bound cases effective GPU utilization is often just 30–50%.^S11

Storage is the hidden ceiling

Conventional NFS / centralized storage caps bandwidth, so GPUs idle waiting for data. The larger the model, the higher the toll.

Turn storage into an amplifier

ZK-DPU disaggregates storage from a supporting role into a compute amplifier, lifting GPU utilization by 2–3×.^S4

WS5000

The WS5000 all-flash storage appliance

A high-performance appliance for AI training and inference. Disaggregated storage plus an end-to-end fast data path raise effective utilization and slash total cost — without touching your framework.

✓300 GB/s aggregate bandwidth, 50M random IOPS, 20 µs latency
✓90%+ mainstream GPU coverage, deeply tuned for Huawei Ascend and domestic accelerators
✓Turnkey deployment in 48-72 hours; ~40% lower total cost
✓Four core technologies: NVMe-oF/RDMA, GPUDirect, all-flash EBOF, KV-cache scheduling

Explore the product →

INDEPENDENT VALIDATION

Reproducible third-party benchmarks

Beijing Information Science and Technology University ran an independent test on the Huawei Ascend Atlas 910B platform against an NFS baseline — leading on all 7 metrics.

85.17×

DeepSeek-32B model-load speedup

563.85s → 6.62s (−98.83%)

9.33×

DeepSeek-70B inference service speedup

End-to-end service

5.3–12.5×

Training / checkpoint I/O speedup

Weights and checkpoints

+356.9%

Effective token throughput gain

High-frequency switching (40/day)

See the full benchmark →

SOLUTIONS

Four scenarios, one disaggregated platform

From greenfield clusters to brownfield retrofits, from training to inference — across the full lifecycle of AI infrastructure.

Training clusters

Accelerate model loading and checkpoint I/O to shorten training iterations and cut idle GPU time.

Inference serving

Long-context and high-frequency multi-model switching — markedly higher effective GPU utilization.

AI centers / domestic stack

Disaggregation plus deep Ascend tuning for sovereign, self-controlled infrastructure.

Brownfield retrofit

No GPU swap, no downtime — revive idle compute assets in place.

Explore solutions → AI Center × WS7000 →

ECOSYSTEM

Ecosystem & certainty

Validated · manufacturable · ecosystem-ready

Huawei AscendLuxshare Precision (foundry)Beijing Information Science and Technology UniversityDomestic GPU 90%+AMD (in testing)xFusion (in testing)

Honesty discipline

We separate what is delivered from what is in progress: third-party validation and mass-production foundry are delivered; AMD and xFusion platform adaptation are in testing (subject to final reports).

Benchmark it on your own workload

2 live demo units are ready for immediate PoC. Let the data do the talking.

Request a PoC → Contact us

Make every GPUearn its keep

You bought top-tier GPUs — and they wait on data

Compute throttled by storage

Storage is the hidden ceiling

Turn storage into an amplifier

The WS5000 all-flash storage appliance

Reproducible third-party benchmarks

Four scenarios, one disaggregated platform

Training clusters

Inference serving

AI centers / domestic stack

Brownfield retrofit

Ecosystem & certainty

Honesty discipline

Benchmark it on your own workload

Make every GPU
earn its keep