Hewlett Packard Enterprise announced the HPE Machine Learning Development System, which removes barriers that keep enterprises from building and training machine learning models at scale, so they can realize value faster. The new system, purpose-built for AI, is an end-to-end solution that integrates a machine learning software platform, compute, accelerators, and networking to develop and train more accurate AI models faster and at scale.
The HPE Machine Learning Development System builds on HPE’s strategic investment in acquiring Determined AI to combine its robust machine learning (ML) platform, now formally called the HPE Machine Learning Development Environment, with HPE’s world-leading AI and high performance computing (HPC) offerings. With the new HPE Machine Learning Development System, users can cut the typical time-to-value for building and training machine learning models from weeks or months to days.
Early adopter of HPE Machine Learning Development System launches training of giant multimodal AI model at record speed
HPE also announced that Aleph Alpha, a German AI startup, has adopted the HPE Machine Learning Development System to train its multimodal AI, which includes Natural Language Processing (NLP) and computer vision. By combining image and text processing in five languages with almost human-like context understanding, the models push the boundaries of modern AI for all kinds of language and image-based transformative use cases, such as AI assistants for creating complex texts, summaries that reflect higher-level understanding, search for highly specific information across hundreds of documents, and use of specialized knowledge in a conversational context.
By adopting the HPE Machine Learning Development System, Aleph Alpha had the system up and running immediately and began training efficiently in record time, combining and monitoring hundreds of GPUs.
“We are seeing astonishing efficiency and performance of more than 150 teraflops by using the HPE Machine Learning Development System. The system was quickly set up and we began training our models in hours instead of weeks. While running these massive workloads, combined with our ongoing research, being able to rely on an integrated solution for deployment and monitoring makes all the difference.” – Jonas Andrulis, Founder and CEO, Aleph Alpha
“Enterprises seek to incorporate AI and machine learning to differentiate their products and services, but are often confronted with complexity in setting up the infrastructure required to build and train accurate AI models at scale,” said Justin Hotard, executive vice president and general manager, HPC and AI, at HPE. “The HPE Machine Learning Development System combines our proven end-to-end HPC solutions for deep learning with our innovative machine learning software platform into one system, to provide a performant out-of-the box solution to accelerate time to value and outcomes with AI.”
Removing barriers to realize full potential of AI with complete machine learning solution
Organizations have yet to reach maturity in their AI infrastructure, which, according to IDC, is the most significant and costly investment required for enterprises that want to speed up their experimentation or prototyping phase to develop AI products and services. Typically, adopting AI infrastructure to support model development and training at scale requires a complex, multi-step process involving the purchase, setup, and management of a highly parallel software ecosystem and infrastructure spanning specialized compute, storage, interconnect, and accelerators.
The HPE Machine Learning Development System helps enterprises bypass the high complexity associated with adopting AI infrastructure by offering the only solution that combines software, specialized computing such as accelerators, networking, and services, allowing enterprises to immediately begin efficiently building and training optimized machine learning models at scale.
Gaining accurate models to unlock value faster with the HPE Machine Learning Development System
The system also helps improve model accuracy faster with state-of-the-art distributed training, automated hyperparameter optimization, and neural architecture search, which are key to machine learning algorithms.
The HPE Machine Learning Development System delivers optimized compute, accelerated compute, and interconnect, which are key performance drivers to scale models efficiently for a mix of workloads, starting at a small configuration of 32 NVIDIA GPUs, all the way to a larger configuration of 256 NVIDIA GPUs. On a small configuration of 32 NVIDIA GPUs, the HPE Machine Learning Development System delivers approximately 90% scaling efficiency for workloads such as Natural Language Processing (NLP) and computer vision. Additionally, based on internal testing, the HPE Machine Learning Development System with 32 GPUs delivers up to 5.7X faster throughput for an NLP workload compared to another offering containing 32 identical GPUs but with a sub-optimal interconnect.
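As a rough illustration of what a scaling-efficiency figure means, the sketch below uses one common definition: measured throughput divided by the throughput that perfect linear scaling from a smaller baseline would give. This definition, the function name, and the sample numbers are assumptions for illustration only; the announcement does not describe how HPE measured its figure.

```python
def scaling_efficiency(base_throughput: float, base_gpus: int,
                       scaled_throughput: float, scaled_gpus: int) -> float:
    """Return measured throughput as a fraction of ideal linear scaling.

    Assumed definition (not taken from the announcement):
    efficiency = scaled_throughput / (base_throughput * scaled_gpus / base_gpus)
    """
    if base_gpus <= 0 or scaled_gpus <= 0 or base_throughput <= 0:
        raise ValueError("GPU counts and baseline throughput must be positive")
    ideal_throughput = base_throughput * (scaled_gpus / base_gpus)
    return scaled_throughput / ideal_throughput


# Hypothetical numbers: 1,000 samples/s on 1 GPU, 28,800 samples/s on 32 GPUs.
efficiency = scaling_efficiency(1000.0, 1, 28800.0, 32)
print(f"Scaling efficiency: {efficiency:.0%}")
```

Under this definition, 32 GPUs running at 90% efficiency do the work of roughly 29 perfectly scaled GPUs.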
Speeding up POC to production with ready-to-use, AI model development and training solution
The HPE Machine Learning Development System is offered as one integrated solution that provides preconfigured, fully installed AI infrastructure for turnkey model development and training at scale. As part of the offering, HPE Pointnext Services will provide onsite installation and software setup, allowing users to immediately implement and train machine learning models for faster and more accurate insights from their data.
The HPE Machine Learning Development System is offered as a small building block, with options to scale up. The small configuration includes the following:
- Innovative machine learning platform with the HPE Machine Learning Development Environment to enable enterprises to rapidly develop, iterate, and scale high-quality models from POC to production
- Optimized AI infrastructure using the HPE Apollo 6500 Gen10 Plus system to provide massive, specialized computing capabilities to train and optimize AI models, with eight NVIDIA A100 80GB GPUs for accelerated compute
- Fine-grained centralized monitoring and management for optimal performance with the HPE Performance Cluster Management, a system management software solution
- Management stack to control and manage system components using HPE ProLiant DL325 servers and a 1Gb Ethernet Aruba CX 6300 switch
- High-performance compute and storage communications using the NVIDIA Quantum InfiniBand networking platform
The HPE Machine Learning Development System is available now worldwide.
HPE expands AI product portfolio to help customers improve insights and make better decisions
HPE is building on today’s news with additional AI offerings, including the launch of HPE Swarm Learning, the industry’s first privacy-preserving, decentralized machine learning framework for the edge or distributed sites. With HPE Swarm Learning, organizations in sectors such as healthcare, banking and financial services, and manufacturing can share learnings from their AI models with other organizations to improve insights, without sharing the actual data.
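The idea of sharing learnings without sharing data can be sketched generically: each site trains on its own data and contributes only model parameters, which are then merged. The example below is a minimal weighted-averaging sketch and is not the HPE Swarm Learning API; the function name, the site names, and the choice of averaging as the merge step are all assumptions made for illustration.

```python
from typing import List, Optional


def merge_parameters(site_params: List[List[float]],
                     sample_counts: Optional[List[int]] = None) -> List[float]:
    """Merge per-site model parameters by weighted averaging (hypothetical helper).

    Only parameter vectors are exchanged; the raw training data never leaves a site.
    """
    if not site_params:
        raise ValueError("at least one site is required")
    size = len(site_params[0])
    if any(len(params) != size for params in site_params):
        raise ValueError("all sites must share the same model shape")
    if sample_counts is None:
        sample_counts = [1] * len(site_params)  # equal weight per site
    if len(sample_counts) != len(site_params):
        raise ValueError("one sample count is required per site")
    total = sum(sample_counts)
    return [
        sum(weight * params[i] for weight, params in zip(sample_counts, site_params)) / total
        for i in range(size)
    ]


# Hypothetical parameter vectors learned locally at two sites.
hospital_a = [0.25, -1.0]
hospital_b = [0.75, 1.0]
merged = merge_parameters([hospital_a, hospital_b])
print(merged)
```

Passing `sample_counts` weights each site by how much data it trained on, so a site with more samples has a larger influence on the merged model.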
Additionally, HPE announced that it is building on its collaboration with Qualcomm Technologies, Inc. to deliver advanced inferencing offerings to support heterogeneous system architectures that provide AI inferencing at scale. HPE will offer the HPE Edgeline EL8000 Converged Edge systems, which are compact, ruggedized edge computing solutions optimized for harsh environments outside the datacenter, with the Qualcomm® Cloud AI 100 accelerator to deliver inferencing for datacenters and at the edge. The combined solution delivers high performance at low power for demanding AI inference workloads. The offering will be generally available in August 2022.