Through its Swarm Learning and Machine Learning Development System offerings, HPE wants to help companies realize the benefits of machine learning models faster.
Even when companies see the value of machine learning for certain tasks or processes, the complexity of putting models into production can put them off. HPE aims to lower that barrier with two new offerings: Swarm Learning and the Machine Learning Development System. The first is based on swarm learning, a decentralized approach designed to exploit the value of data generated at the edge or across distributed sites. The second is a hardware-based model training solution that spares companies from building their own specialized infrastructure.
The Machine Learning Development System builds on HPE's 2021 acquisition of Determined AI, whose open source platform accelerates machine learning model training and is now integrated with HPE hardware. Several configurations are possible, but HPE says a typical "small configuration" consists of an Apollo 6500 Gen10 server with enough compute for ML model training, HPE ProLiant DL325 servers, and Aruba CX 6300 switches. It also includes Nvidia's Quantum InfiniBand networking platform and specialized software from HPE: the Machine Learning Development Environment and Performance Cluster Manager.
Example of a complete setup for training machine learning models (Photo: HPE)
HPC for machine learning
According to Peter Rutten, vice president of research at IDC, the concept is essentially about bringing high-performance computing (HPC) capabilities to enterprise machine learning so that companies don't have to develop such systems themselves. "Now that AI has matured, companies are really asking for this type of system," he said.
"Having to develop your own system is the biggest hurdle to getting AI into business." For some companies, using the cloud might be an option, but the data needed to train AI models is often sensitive and business-critical, which rules the cloud out for some of them; regulatory restrictions in certain industries make it completely impossible for others.
Decentralized machine learning in a swarm
With its Swarm Learning product, HPE is trying to address the privacy concerns around machine learning data. Swarm Learning's decentralized framework uses containerization to achieve two goals. First, it allows machine learning tasks to run on edge systems without shuttling data back and forth to a central data center, delivering accurate insights faster than would otherwise be possible. Second, it allows like-minded companies to share AI model learning outcomes with each other without having to share the underlying data, which can benefit an entire industry. "Take the example of seven hospitals that are trying to solve problems with AI models: because they can't share their data, model training will be limited," Rutten explained. The resulting models will be inaccurate, with inherent potential for error, depending on patient demographics and a variety of other factors. "To solve this problem, Swarm Learning passes not the data but only the model's training results, and combines them into a single model that is effectively trained on all the data," Rutten said.
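The idea Rutten describes can be illustrated with a short sketch. This is not HPE's Swarm Learning API or protocol; the `merge_models` function and the hospital data below are hypothetical, showing only the general principle that each site shares model parameters (its training results) rather than raw records, and those parameters are merged into one global model — here by a simple weighted average.

```python
# Illustrative sketch only: each site trains locally and shares just its
# model parameters plus a sample count; raw data never leaves the site.

def merge_models(site_updates):
    """Combine per-site parameter lists into one model by weighted
    averaging, where each site's weight is its local sample count."""
    total_samples = sum(n for _, n in site_updates)
    num_params = len(site_updates[0][0])
    merged = [0.0] * num_params
    for params, n_samples in site_updates:
        weight = n_samples / total_samples
        for i, p in enumerate(params):
            merged[i] += p * weight
    return merged

# Three hypothetical hospitals contribute (parameters, sample count) pairs.
hospital_updates = [
    ([0.10, 0.50], 1000),
    ([0.30, 0.70], 3000),
    ([0.20, 0.60], 2000),
]
global_model = merge_models(hospital_updates)
print(global_model)
```

Weighting by sample count means a hospital with more patient records pulls the merged model closer to its local results, while no hospital ever exposes its data.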
He notes that this swarm training method is relatively new, which means widespread adoption may take some time. HPE's Machine Learning Development System, on the other hand, targets a current pain point directly, he said, and is perhaps the more interesting announcement of the two. "It's practically an aaS (as a service) offering for a company's data center," he said. "This is exactly what people are looking for when training AI models in their business," he added.