Keysight AI Data Center Validation Solution
Accelerate design and deployment of AI network infrastructure
Highlights
Keysight AI Data Center Validation Solution can:
- Emulate AI workloads without large GPU clusters, reducing test and validation costs by leveraging high-density traffic load appliances or software endpoints.
- Leverage high-density AI host emulation, supporting 800GE / 400GE capabilities to accurately mirror AI cluster behavior.
- Streamline benchmarking with the solution's Collective Benchmarks app, which validates AI network fabric performance and helps improve utilization.
- Automate AI fabric testing to assess network impact on job completion time, performance isolation, load balancing, and congestion control for optimized AI training performance.
Driving the Future of AI Networking: How Keysight Empowers Juniper
- Keysight helps Juniper validate its next-generation network fabric by emulating the collective communications workloads generated by large numbers of AI accelerators.
- Keysight provides comprehensive test scenarios that demonstrate the efficiency and performance of a lossless network fabric in load balancing and congestion mitigation.
Solving for AI Networking Challenges
Key industry trends and challenges in the AI / ML industry include:
- AI clusters are expected to surpass 100K nodes by 2026.
- AI accelerators can sit idle up to 50% of the time waiting for data exchange.
- Innovation in AI networking requires new measurement and benchmarking tools.
- Keysight offers an 800GE / 400GE test solution with a track record of lossless fabric validation. Compared with benchmarking on GPU-based systems, it is faster to deploy, provides deeper insights, and delivers provable fidelity of AI traffic emulation.
Accelerate AI Network Design
Define the future of AI / ML infrastructure. Unlock possibilities and shape tomorrow’s landscape.
Benchmark job completion time of AI collective communications
Navigate the complexities of AI workloads.
Achieve precision in network performance measurements
Make design decisions based on deeper AI communications insights.
Flexible what-if scenarios
Optimize AI collective performance by experimenting with AI traffic patterns to fine-tune fabric configuration.
Cost-effective high-density AI network testbeds
Scale experiments with AresONE-M 800GE and AresONE-S 400GE AI traffic emulation.
Transform AI Infrastructure Benchmarking
- Optimizing AI / ML system design with realistic emulation of high-scale AI workloads.
- Delivering insights into collective communications performance.
- Simplifying benchmarking and validation with pre-packaged methodologies delivered as applications.
- Emulating Remote Direct Memory Access (RDMA) over Converged Ethernet v2 (RoCEv2) endpoints by using high-density AresONE traffic load appliances with hundreds of 400GE or 800GE ports.
Simplify AI Infrastructure Validation with Collective Benchmarking
Keysight accelerates AI infrastructure validation by providing precision, scalability, and actionable insights. The Keysight AI Data Center Validation Solution simplifies performance evaluation with a collective benchmarks application coupled with pre-packaged test methodologies and high-fidelity instruments, enabling AI operators to optimize infrastructure design and network performance.
Key capabilities include:
- Evaluating collective communication efficiency by measuring job completion time, algorithm and bus bandwidth, and deviations from theoretical maximum performance.
- Using AresONE traffic load appliances to emulate RoCEv2 endpoints, analyzing Queue Pair (AI data flow) performance with drill-down capabilities.
- Validating RoCEv2 emulation fidelity by comparing AresONE hardware results with real AI system metrics.
By integrating AI collective benchmarking, Keysight AI Data Center Validation Solution enables AI operators and infrastructure vendors to gain deep insights into data movement efficiency, network congestion, and overall system performance.
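The collective metrics listed above can be illustrated with a short calculation. This is a minimal sketch, not the Collective Benchmarks app itself: it assumes the bus bandwidth convention used by the open-source nccl-tests suite for all-reduce (algorithm bandwidth scaled by 2(n-1)/n), and all function names and sample numbers are hypothetical.

```python
def algorithm_bandwidth_gbps(data_bytes: int, jct_seconds: float) -> float:
    """Algorithm bandwidth: collective data size divided by job completion time, in Gb/s."""
    return data_bytes * 8 / jct_seconds / 1e9


def allreduce_bus_bandwidth_gbps(data_bytes: int, jct_seconds: float, ranks: int) -> float:
    """Bus bandwidth for all-reduce, assuming the nccl-tests factor 2*(n-1)/n."""
    return algorithm_bandwidth_gbps(data_bytes, jct_seconds) * 2 * (ranks - 1) / ranks


def percent_of_ideal(bus_bw_gbps: float, link_gbps: float) -> float:
    """How close the measured bus bandwidth comes to the per-port line rate."""
    return 100.0 * bus_bw_gbps / link_gbps


if __name__ == "__main__":
    # Hypothetical run: 1 GB all-reduce across 8 ranks finishing in 50 ms on 400GE ports.
    bus_bw = allreduce_bus_bandwidth_gbps(1_000_000_000, 0.05, ranks=8)
    print(f"bus bandwidth: {bus_bw:.1f} Gb/s")
    print(f"percent of ideal: {percent_of_ideal(bus_bw, 400.0):.1f}%")
```

In this hypothetical run the collective reaches roughly 70% of line rate, so the remaining gap is the deviation from theoretical maximum that a fabric tuning exercise would try to close.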
RoCEv2 Endpoints Emulation and Stateful Validation
Beyond emulation, pioneering precision in RoCEv2 validation
RoCEv2 Support in IxNetwork / AresONE-S
IxNetwork / AresONE-S supports the RoCEv2 transport protocol with Data Center Quantized Congestion Notification (DCQCN) congestion control and Priority Flow Control (PFC). It provides a scalable and cost-effective way to validate the effectiveness of data plane traffic management in AI clusters and to optimize network fabric performance.
Speed and Scale
AresONE-S offers up to 16 x 400GE port capacity per device and can be combined into a multi-appliance configuration with 256+ ports in a single collective. Each port emulates a RoCEv2 endpoint and supports thousands of Queue Pairs with line-rate traffic. This scale is crucial for reproducing the network topologies of real AI clusters.
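As a rough sizing aid, the port figures above translate into appliance counts and aggregate bandwidth as follows. This is a back-of-the-envelope sketch: the constants come from the text, while the helper names are hypothetical.

```python
import math

PORTS_PER_APPLIANCE = 16  # from the text: up to 16 x 400GE per AresONE-S
PORT_SPEED_GBPS = 400


def appliances_needed(total_ports: int) -> int:
    """Smallest number of fully populated appliances that provides total_ports."""
    return math.ceil(total_ports / PORTS_PER_APPLIANCE)


def aggregate_tbps(total_ports: int) -> float:
    """Aggregate line-rate capacity across all ports, in Tb/s."""
    return total_ports * PORT_SPEED_GBPS / 1000


if __name__ == "__main__":
    ports = 256
    print(appliances_needed(ports), "appliances,", aggregate_tbps(ports), "Tb/s aggregate")
```

A 256-port collective therefore corresponds to 16 fully populated appliances.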
Traffic Flexibility
To match the realism of AI workload patterns and reproduce issues on smaller setups, AresONE RoCEv2 capabilities in the first release cover a range of traffic patterns, from in-cast to partial mesh to full all-to-all collectives. At the transport level, it supports sequences of RDMA verbs with configurable data sizes, burst rates, and intervals, all combined with DCQCN and PFC rate control mechanisms.
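The traffic patterns named above can be pictured as sets of (source, destination) endpoint pairs. The sketch below is illustrative only and does not use any IxNetwork API: the function names are hypothetical, and "partial mesh" is assumed here to mean a full mesh inside each group of endpoints with no flows between groups.

```python
from itertools import permutations


def incast(senders, receiver):
    """Many-to-one: every sender targets the same receiver."""
    return [(s, receiver) for s in senders if s != receiver]


def all_to_all(endpoints):
    """Full mesh: every endpoint sends to every other endpoint."""
    return list(permutations(endpoints, 2))


def partial_mesh(groups):
    """Assumed definition: full mesh within each group, nothing across groups."""
    return [pair for group in groups for pair in permutations(group, 2)]


if __name__ == "__main__":
    print(incast(range(4), receiver=0))
    print(len(all_to_all(range(4))), "flows in a 4-endpoint all-to-all")
    print(partial_mesh([[0, 1], [2, 3]]))
```

The flow count grows as n(n-1) for all-to-all, which is why port density matters when reproducing cluster-scale collectives.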
Per Queue Pair DCQCN Flow Control
Running DCQCN per Queue Pair enables precise congestion control: each flow reacts independently to Explicit Congestion Notification (ECN) marks and adjusts its own sending rate, which optimizes data flow and network fabric responsiveness.
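To make the per-Queue-Pair behavior concrete, the sketch below models the sender-side rate update from the published DCQCN algorithm: a multiplicative decrease when a Congestion Notification Packet (CNP) arrives, a decay of the congestion estimate when none arrives, and a fast-recovery step back toward the previous rate. It is a simplified illustration, not Keysight's implementation; the class name, the gain g = 1/256, and the omission of timers, byte counters, and the additive and hyper increase stages are assumptions made for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class QueuePairRate:
    """Simplified DCQCN reaction-point state for one Queue Pair (hypothetical model)."""
    line_rate: float           # Gb/s
    g: float = 1 / 256         # assumed gain for the alpha estimate
    alpha: float = 1.0         # congestion estimate
    current: float = field(init=False)
    target: float = field(init=False)

    def __post_init__(self):
        self.current = self.line_rate
        self.target = self.line_rate

    def on_cnp(self):
        """CNP received: remember the current rate, then cut it by alpha/2."""
        self.target = self.current
        self.current *= 1 - self.alpha / 2
        self.alpha = (1 - self.g) * self.alpha + self.g

    def on_alpha_timer(self):
        """No CNP during the timer window: decay the congestion estimate."""
        self.alpha *= 1 - self.g

    def fast_recovery(self):
        """Move halfway back toward the rate held before the last cut."""
        self.current = (self.target + self.current) / 2


if __name__ == "__main__":
    qp = QueuePairRate(line_rate=400.0)
    qp.on_cnp()
    print("after CNP:", qp.current)
    qp.fast_recovery()
    print("after one recovery step:", qp.current)
```

Because each Queue Pair carries its own state, a congested flow can slow down without throttling other flows that share the same port.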
Visit the GitHub repository for AI / ML testing methodologies.
How to Test AI Data Center Networks
Efficient network design is crucial for faster data movement and reduced latency. The AI Fabric Test Methodology aims to provide a consistent testing process with measurable metrics to optimize data center infrastructure for AI workloads. Follow this test methodology to benchmark job completion times, performance isolation, load balancing, and congestion control.
Benchmarking AI / ML clusters with realistic workloads requires costly investments in computing systems with GPUs and RDMA network interface controllers (NICs). Proper benchmarking involves configuring parameters such as cluster setup, congestion control, workload algorithms, job data size, traffic profile, and NIC performance.
AI Test Hardware
Keysight's data center load modules deliver high-density, high-performance Ethernet / IP test solutions, with industry-first support for 1G, 10G, 25G, 40G, 50G, 100G, 400G, and 800G speeds.