From May 6 to 9, I was fortunate to attend nGage’s Advanced Scale Forum in Austin, TX, to discuss how CB Technologies delivered a centralized HPC service. The Advanced Scale Forum is the convergence of two conferences, Leverage Big Data and Enterprise HPC, creating a power-forum designed to solve challenges in the build-out of scalable advanced enterprise computing solutions. This event was specifically focused on Artificial Intelligence, Big Data, Blockchain, Cloud, Deep Learning, HPC, Hybrid IT, IoT – Edge to Core Analytics, and Machine Learning.
Given CBT’s many years of success in the areas of HPC & Analytics, I was asked to share one of our case studies in a boardroom presentation. I focused on the growing difficulties of enterprise HPC sprawl/creep and how we worked to centralize multiple HPC environments at a large aerospace & defense company to gain greater efficiencies – lowering their total cost of ownership and providing a greater return on investment.
The sprawling and siloed HPC systems had evolved over time within our customer’s organization. There were 16 separate HPC sites spread across the continental US, managed under different business units, management structures and accounting rules. Within the company there was a multitude of vendor combinations for software, hardware and interconnects. The result was a distributed set of siloed systems that gave users an inconsistent experience and inconsistent performance results. Facilities and asset life cycle management was a nightmare, and each group carried the burden of facilities cost, capital forecasting and long-term depreciation expense for its own HPC environment.
The solution to these problems was to establish a centralized HPC service for the enterprise, shared among all users. This meant that an HPC Rated Service or Internal Private Cloud would be deployed for the enterprise to utilize. The operations team and workload management (scheduler) would be centralized. A chargeback system based on node hours used would be established, allowing a true consumption model for all internal organizations. To enhance security and address latency issues, the HPC service would utilize Virtual Desktop Infrastructure (VDI). The VDI service also helped centralize pre/post processing and its associated data. The final step was to establish a user council made up of representatives from each user community and to appoint a Chairman to provide oversight. This group became the decision makers for resource allocation, forecast management, capacity planning and capital funding, and it ensured that every user community was represented.
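To make the node-hour chargeback idea concrete, here is a minimal Python sketch. It is not the customer's actual billing system: the `JobRecord` fields, the organization names and the rate are all hypothetical, and a real deployment would pull these records from the scheduler's accounting data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    """One finished job (hypothetical fields, not a real scheduler schema)."""
    organization: str   # internal group that submitted the job
    nodes: int          # number of nodes the job occupied
    wall_hours: float   # elapsed wall-clock time in hours


def node_hours_by_org(jobs):
    """Sum node hours (nodes x wall-clock hours) per organization."""
    totals = defaultdict(float)
    for job in jobs:
        totals[job.organization] += job.nodes * job.wall_hours
    return dict(totals)


def chargeback(jobs, rate_per_node_hour):
    """Turn node hours into a charge per organization at a flat, assumed rate."""
    return {
        org: round(hours * rate_per_node_hour, 2)
        for org, hours in node_hours_by_org(jobs).items()
    }


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    jobs = [
        JobRecord("structures", nodes=16, wall_hours=2.5),
        JobRecord("aerodynamics", nodes=64, wall_hours=10.0),
        JobRecord("structures", nodes=4, wall_hours=1.0),
    ]
    for org, charge in sorted(chargeback(jobs, rate_per_node_hour=1.50).items()):
        print(f"{org}: {charge:.2f}")
```

A flat rate per node hour keeps the sketch simple. Separate rates for GPU nodes or large-memory nodes would follow the same pattern with one rate per node type.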
Today, HPC demand continues to grow within the enterprise. HPC and analytics computing resources share similar high-density computing characteristics. For example, with GPU nodes hosting multiple GPUs at 300 W per GPU, we are seeing rack-level power requirements grow from 5 kVA per rack to 50 kVA. High-speed fabrics such as InfiniBand and Omni-Path are used instead of just Ethernet, and software/firmware stacks require more and more management of critical interdependencies to keep it all running. As demand for analytics, big data, deep learning and machine learning resources grows, we are seeing the normal data center IT staff turn to the teams that manage the HPC environments to help develop these new environments.
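A rough back-of-the-envelope calculation shows how GPU density drives rack power toward the high end of that range. Only the 300 W per GPU figure comes from the discussion above. The GPU count per node, the non-GPU power per node, the nodes per rack and the power factor in this Python sketch are illustrative assumptions, not measurements from any specific system.

```python
def rack_kva(gpus_per_node, watts_per_gpu, other_watts_per_node,
             nodes_per_rack, power_factor):
    """Estimate apparent power for one rack in kVA.

    Real power (W) = nodes x (GPU watts + everything else in the node).
    Apparent power (VA) = real power / power factor.
    """
    watts_per_node = gpus_per_node * watts_per_gpu + other_watts_per_node
    rack_watts = nodes_per_rack * watts_per_node
    return rack_watts / power_factor / 1000.0


if __name__ == "__main__":
    # All values except watts_per_gpu are assumptions chosen for illustration.
    estimate = rack_kva(
        gpus_per_node=8,            # assumed dense GPU node
        watts_per_gpu=300,          # figure quoted in the text
        other_watts_per_node=1000,  # assumed CPUs, memory, fans, NICs
        nodes_per_rack=12,          # assumed rack fill
        power_factor=0.95,          # assumed power supply power factor
    )
    print(f"Estimated rack load: {estimate:.1f} kVA")
```

Under these assumptions the GPUs alone account for 2.4 kW per node, so a dozen such nodes puts a single rack well beyond what a traditional 5 kVA rack feed can supply.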
With all of these new challenges facing the enterprise, it will be important to take the lessons learned in centralizing HPC services and apply them to the realms of analytics, big data, deep learning, and machine learning. CB Technologies is proud to be on the leading edge of these technologies, and with our industry-leading partner ecosystem at our side, we know we can help many more customers reach their HPC & Analytics goals.
To see the full presentation, contact us at firstname.lastname@example.org or fill out the ‘Get Started’ form below.