4 crucial factors for scalable and secure high-performance computing in the cloud
High performance computing (HPC) in the cloud is an intelligent way to get the powerful computational capabilities needed for complex simulations and machine learning, while only paying for the resources you actually require at the time.
This has considerable benefits for key industries where the speed of innovation is in sharp focus; cloud HPC can give you unlimited computing capacity exactly when you need it without spending excess resources on maintaining physical (on-premise) machines.
It also allows you to wield greater control over how resources are used, so you can keep projects within budget and on track. However, for organizations accustomed to working with their own super secure, air-gapped on-premise processing power, this may seem like a bit of a leap.
After all, how can you ensure that your industry secrets stay that way as they travel across a scattered cloud infrastructure?
In this blog, we explore how 4 crucial elements can ensure your cloud HPC gives you the maximum value without risking the integrity of your project or valuable data.
First, where can cloud HPC have the greatest impact?
Cloud HPC can be used for a wide variety of intensive processing tasks, from AI-driven drug discovery to highly complex simulations involving digital twins or intricate physical processes. So, the application stretches across many industries and sectors.
Technology and engineering startups certainly have a lot to gain from cloud HPC, because the barrier to entry is so low: they get near-instant access to powerful processing capabilities without sinking money into hardware or a dedicated maintenance crew.
As a result, more resources can go towards innovation while still getting practically infinite computing power for machine learning, generative AI, or complex simulations.
When done right, your cloud HPC solution can adhere to the strictest security standards and be easier to manage than an on-premise solution. Startups benefit here too: they get the same level of security as larger companies at a relatively low cost.
#1 User management and access
You want to ensure that your intellectual property and other data don't leak as they move between your own systems and the cloud. User management is a key part of this. Research and Engineering Studio on AWS (RES) allows you to easily set up users and restrict access to specific resources and data.
AWS RES allows administrators to set up virtual desktop environments for each user on a per-project basis. This lets you specify which services, resources, and applications are available, and define access controls. As a result, users can access HPC capabilities remotely through highly secure, web-based interfaces. You still need a secure user management system on top of this, including IAM, MFA, and SSO where appropriate.
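As a minimal sketch of what "secure user management" can look like in practice, the snippet below builds an IAM policy document that denies all actions unless the request was authenticated with MFA. The condition key `aws:MultiFactorAuthPresent` is a standard AWS global condition key; the statement ID and the decision to deny everything are illustrative choices, not a prescription.

```python
import json

# Sketch: an IAM policy document that denies all actions unless the
# request was made with MFA. BoolIfExists also catches requests where
# the MFA key is absent entirely (e.g. long-term access keys).
deny_without_mfa = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyAllWithoutMFA",
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
            },
        }
    ],
}

print(json.dumps(deny_without_mfa, indent=2))
```

A document like this could then be attached to a user or group, for example with `iam.put_user_policy(UserName=..., PolicyName=..., PolicyDocument=json.dumps(deny_without_mfa))`.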
#2 Resource management
While the cloud allows for highly elastic capabilities and nearly limitless scalability, you still need to control how resources are used and gain visibility over usage. You can set a budget and enforce it with the settings available in AWS, which helps keep costs in check at all times.
Once you’ve activated the RES environment tags, you can specify a budget for each project in the AWS Billing console. This isn’t a ‘hard’ limit, and you can always increase allocations as needed – but it’s a valuable way to get some control over how resources are used. Alerts can keep you updated when resource usage increases unexpectedly and reaches a defined limit.
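To make this concrete, here is a hedged sketch of a per-project budget defined through the AWS Budgets API, scoped to a cost allocation tag and alerting at 80% of actual spend. The tag key `res:Project`, the 5,000 USD limit, and the email address are all illustrative; the `user:<tagkey>$<tagvalue>` filter format is the one the Budgets API expects for tag-based filters.

```python
# Sketch: a monthly cost budget for one tagged project, with an alert
# at 80% of actual spend. All names and amounts are illustrative.
budget = {
    "BudgetName": "hpc-project-alpha",
    "BudgetLimit": {"Amount": "5000", "Unit": "USD"},
    "TimeUnit": "MONTHLY",
    "BudgetType": "COST",
    # Filter to resources carrying the project's cost allocation tag.
    "CostFilters": {"TagKeyValue": ["user:res:Project$alpha"]},
}

notification = {
    "Notification": {
        "NotificationType": "ACTUAL",
        "ComparisonOperator": "GREATER_THAN",
        "Threshold": 80.0,
        "ThresholdType": "PERCENTAGE",
    },
    "Subscribers": [{"SubscriptionType": "EMAIL", "Address": "ops@example.com"}],
}

def create_project_budget(account_id: str) -> None:
    """Submit the budget to AWS; needs valid credentials to run."""
    import boto3
    boto3.client("budgets").create_budget(
        AccountId=account_id,
        Budget=budget,
        NotificationsWithSubscribers=[notification],
    )
```

As the article notes, this is an alerting mechanism rather than a hard cap: the budget warns you, and you decide whether to raise the allocation or rein in usage.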
#3 Security of high-performance computing in the cloud
There’s a lot at stake with this kind of workload, because it may include potentially compromising data that could enable a breach, industry secrets, or valuable intellectual property. The goal here is to replicate the security of an air-gapped on-premise computer, but without any of the annoying drawbacks (like expensive maintenance, limited computing power, and access issues).
First, you can employ several standard tactics, such as blocking the ability to copy data from a virtual workspace based on string length, enforcing read-only access, and using secure authentication methods. On top of this, you should make sure you're using the built-in security of managed solutions like AWS KMS. Encryption is enabled by default with RES, but you should also look at how your data moves outside RES itself. AWS Key Management Service (KMS) can strengthen security further by encrypting data with your own key.
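One common pattern for encrypting data under your own KMS key is envelope encryption: ask KMS for a fresh data key, encrypt locally with the plaintext copy, and store only the encrypted copy of the key next to the data. The sketch below shows the KMS side of that pattern; the encryption-context key names are illustrative, and the function needs AWS credentials to actually run.

```python
def get_data_key(key_id: str, context: dict) -> tuple:
    """Request a fresh data key from KMS (needs AWS credentials to run).
    Returns (plaintext_key, encrypted_key); use the plaintext key for
    local encryption, store only the encrypted copy, then discard the
    plaintext key from memory."""
    import boto3
    resp = boto3.client("kms").generate_data_key(
        KeyId=key_id, KeySpec="AES_256", EncryptionContext=context
    )
    return resp["Plaintext"], resp["CiphertextBlob"]

# Encryption context: non-secret key/value pairs that KMS cryptographically
# binds to the ciphertext; decryption fails unless the same context is
# supplied again. The names below are illustrative.
encryption_context = {"project": "alpha", "dataset": "simulation-output"}
```

The encryption context gives you an extra guardrail: even someone with access to the key cannot decrypt the data under a different, mismatched context, and the context is logged with every KMS call for auditing.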
To recreate this air-gapped setup, you need a network that is easy to configure and provides all the compartments you need. Amazon Virtual Private Cloud (VPC) gives you this granular control over your network, allowing you to separate components into distinct subnets and define access policies for each. Network ACLs and security groups govern traffic, and these can be reinforced with AWS Network Firewall. To top it off, Amazon GuardDuty can help detect potential threats.
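The compartmentalization step starts with address planning. As a minimal, runnable sketch (using only the Python standard library), the snippet below carves an illustrative VPC range into distinct /24 subnets, one per compartment; the compartment names and the 10.0.0.0/16 range are assumptions for the example.

```python
import ipaddress

# Sketch: split an illustrative VPC CIDR into one /24 subnet per
# compartment (e.g. login nodes, compute fleet, shared storage).
vpc_cidr = ipaddress.ip_network("10.0.0.0/16")  # assumption: your VPC range
compartments = ["login", "compute", "storage"]

subnet_plan = {
    name: str(subnet)
    for name, subnet in zip(compartments, vpc_cidr.subnets(new_prefix=24))
}

print(subnet_plan)
# {'login': '10.0.0.0/24', 'compute': '10.0.1.0/24', 'storage': '10.0.2.0/24'}
```

Each planned CIDR block would then become an actual subnet via a call like `ec2.create_subnet(VpcId=..., CidrBlock=...)`, with security groups and network ACLs controlling which compartments may talk to each other.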
When it comes to IAM, least-privilege permissions are the best way to limit the potential for accidental or malicious damage to vital data. Using the AWS IAM service also gives you a unified view.
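As an illustration of least privilege, the policy below allows a project role to read and write only its own prefix in a shared bucket, and nothing else. The bucket name and prefix are hypothetical; the point is that the `Resource` ARN is as narrow as the project's actual needs.

```python
import json

# Sketch: a least-privilege policy scoping a project role to its own
# S3 prefix. Bucket and prefix names are illustrative.
project_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ProjectDataOnly",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::hpc-data-bucket/project-alpha/*",
        }
    ],
}

print(json.dumps(project_policy, indent=2))
```

Because nothing outside the `project-alpha/` prefix is granted, a compromised or misconfigured project role cannot touch other projects' data; IAM's default-deny takes care of the rest.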
#4 Storage of intellectual property and secret data
Depending on which sector or industry you work in, the specifics of your data will vary a lot. It might be highly valuable intellectual property, or sensitive data from thousands of patients, for example. Whatever data you work with, it must remain secure while also being easy to access when needed.
All AWS data storage services can be encrypted with your own set of KMS keys. This can help avoid accidental exposure of identities and credentials, and other data that may be stored as plain text.
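For object storage, one way to guarantee this is to make SSE-KMS with a customer-managed key the bucket-wide default, so nothing can land unencrypted. The sketch below builds that configuration; the bucket name and key ARN are placeholders, and the function needs AWS credentials to actually run.

```python
# Sketch: default server-side encryption for a bucket, using a
# customer-managed KMS key. The key ARN below is a placeholder.
encryption_rules = {
    "Rules": [
        {
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": "arn:aws:kms:eu-west-1:111122223333:key/EXAMPLE",
            },
            # S3 Bucket Keys reduce the number of KMS requests (and cost).
            "BucketKeyEnabled": True,
        }
    ]
}

def enforce_bucket_encryption(bucket: str) -> None:
    """Apply the default-encryption rule (needs AWS credentials to run)."""
    import boto3
    boto3.client("s3").put_bucket_encryption(
        Bucket=bucket, ServerSideEncryptionConfiguration=encryption_rules
    )
```

With a default like this in place, every object written to the bucket is encrypted under your key even if the uploading client forgets to ask for encryption.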
Next, you need to consider the storage solution itself. Amazon FSx is a good option because it supports a variety of file systems and is fully managed. Your data is protected by built-in redundancy and replication options, helping ensure it stays intact and available. You can choose between SSD and HDD storage, and there are options for optimizing storage with deduplication and compression too.
Backups are also easy to automate and manage with Amazon FSx, and data is encrypted with KMS throughout.
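Pulling these storage points together, here is a hedged sketch of a `create_file_system` request for FSx for Lustre (a common choice for HPC workloads) with a customer-managed KMS key, daily automatic backups, and LZ4 compression enabled. The subnet ID, key ARN, capacity, and throughput values are all illustrative and would need tuning for a real workload.

```python
# Sketch: an FSx for Lustre file system request combining encryption
# under a customer-managed KMS key, daily backups, and compression.
# All identifiers and sizes below are illustrative.
fsx_request = {
    "FileSystemType": "LUSTRE",
    "StorageCapacity": 1200,              # GiB
    "StorageType": "SSD",
    "SubnetIds": ["subnet-0example"],
    "KmsKeyId": "arn:aws:kms:eu-west-1:111122223333:key/EXAMPLE",
    "LustreConfiguration": {
        "DeploymentType": "PERSISTENT_2",
        "PerUnitStorageThroughput": 250,  # MB/s per TiB
        "AutomaticBackupRetentionDays": 7,
        "DailyAutomaticBackupStartTime": "02:00",
        "DataCompressionType": "LZ4",
    },
}

def create_hpc_file_system() -> str:
    """Submit the request (needs AWS credentials to run); returns the ID."""
    import boto3
    resp = boto3.client("fsx").create_file_system(**fsx_request)
    return resp["FileSystem"]["FileSystemId"]
```

Note that automatic backups are available on the persistent deployment types; a scratch deployment trades that durability for lower cost, which may be fine for reproducible intermediate results.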
What’s next? How to get started with cloud high-performance computing
By paying attention to the 4 aspects we’ve outlined above, you can safeguard your project even when your team is scattered across the globe. But what about setting up your high-performance computing?
This can be a very complex process, and certainly one that benefits from some expertise from people who’ve already been there and done it successfully. And with a little advice at the earliest stages, you can accelerate your innovation and start changing people’s lives even sooner.
Want to talk about your project? Get in touch.