NASA's cloud computing odyssey: From Earth to Mars
- 13 December, 2012 16:22
Credit: NASA/JPL-Caltech/Malin Space Science Systems
When Curiosity landed on red soil
When Curiosity ended one phase of its mission by hitting red soil and started the main, most important part of its job, JPL witnessed firsthand how the elasticity of cloud can, in some cases, be a game changer. The Mars Science Laboratory website, which ran on AWS, was a "good foray" into cloud computing when it comes to Web hosting, Shams says, surviving the onslaught of massive amounts of traffic.
In the wake of its success (JPL's regular website went down during the Curiosity landing due to the volume of traffic, so it was redirected to the MSL site, which remained up), websites across NASA are beginning to be migrated to the cloud. Unlike a service such as Netflix (also a heavy user of Amazon's public cloud), which knows that it's going to have massive traffic on a daily basis, NASA sees traffic that tends to spike and ebb, depending on public excitement about different missions.
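The difference between steady and spiky demand can be made concrete with a small sketch. The traffic figures and per-server capacity below are hypothetical, chosen only for illustration; they are not NASA or AWS numbers. The sketch compares the server-hours consumed when capacity follows demand with those consumed when enough servers are provisioned for the peak the whole time.

```python
import math

def servers_needed(requests_per_sec: float, capacity_per_server: float, minimum: int = 1) -> int:
    """Smallest number of servers that can absorb the given load."""
    return max(minimum, math.ceil(requests_per_sec / capacity_per_server))

# Hypothetical hourly request rates around a landing event (requests/second).
hourly_traffic = [200, 200, 5000, 12000, 3000, 200]
CAPACITY = 500  # hypothetical requests/second one server can handle

# Elastic: resize every hour to match demand.
elastic = [servers_needed(r, CAPACITY) for r in hourly_traffic]
elastic_hours = sum(elastic)

# Fixed: provision for the peak and keep it running throughout.
fixed = max(elastic)
fixed_hours = fixed * len(hourly_traffic)

print(elastic, elastic_hours, fixed_hours)
```

Under these assumed numbers the elastic approach uses 43 server-hours against 144 for peak provisioning; the gap widens the spikier the traffic is.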
"We get a lot of attention and then it will die down, and then we land a rover, get a lot of attention, and it dies down," Shams says. "It's a very elastic environment that's basically built for cloud computing."
With a service like S3, NASA can store data and then not worry about going in and adding more services as interest ramps up, because it will scale automatically behind the scenes. Shams adds that the storage service also means that backups aren't a concern because data will be automatically replicated across multiple data centres, and daemons will regularly check the integrity of data to make sure nothing has been lost, re-replicating it as needed.
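The integrity checking and re-replication described above can be sketched in a few lines. This is a simplified illustration of the general idea, not how S3 is actually implemented; the replica names, the `scrub` function and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Fingerprint of an object's contents."""
    return hashlib.sha256(data).hexdigest()

def scrub(replicas: dict[str, bytes], expected: str) -> list[str]:
    """Compare every replica with the expected checksum and overwrite
    any damaged copy from an intact one. Returns the repaired names."""
    healthy = [name for name, data in replicas.items() if checksum(data) == expected]
    if not healthy:
        raise RuntimeError("no intact replica left to repair from")
    good_copy = replicas[healthy[0]]
    repaired = []
    for name in list(replicas):
        if name not in healthy:
            replicas[name] = good_copy  # re-replicate from a healthy copy
            repaired.append(name)
    return repaired

# Hypothetical object stored in three data centres; one copy is corrupted.
original = b"curiosity-landing-image"
expected = checksum(original)
replicas = {
    "dc-a": original,
    "dc-b": original,
    "dc-c": b"curiosity-landing-imagX",  # simulated bit rot
}

repaired = scrub(replicas, expected)
print(repaired)
```

A background process running a check like this on a schedule is what lets the customer stop thinking about backups: as long as one copy survives, the damaged ones are quietly replaced.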
Another advantage of using cloud for Web hosting is security: holes have to be opened in firewalls to allow page requests in and data out, which can create a vulnerability. If a machine is running on NASA's network, there's the risk that the compromising of a Web server might open up the rest of the organisation's network to attack. With cloud, you can put a Web server in an isolated environment, "so somebody penetrates your website — that's all they've gotten into."
Despite wariness over cloud computing, the security team at JPL has also found other advantages over on-premise hardware. Cloud computing can be used to combat uncontrolled IT sprawl and give security far more oversight. "Cloud computing is way more secure than me setting up a server at a desk under my cubicle," Shams says.
"We will see a major shift toward cloud computing for websites across NASA," Shams says. It's an "ongoing process" that's being endorsed within JPL. It's "being enabled by our Office of the CIO, and they're doing everything they can to make it happen as quickly as possible."
When Curiosity touched down on Mars, there were two successes, not one, Shams said during a presentation at AWS's re:Invent conference. The rover was landed successfully, and NASA was able to share the moment with the rest of the world. And while the magnitude of the latter feat may go unnoticed by some, its implications for IT are significant, to say the least.
"Mission accomplished," Shams told the conference.
Not always smooth sailing
Although Curiosity may have provided a highly visible success story, the path to the cloud has not always been smooth sailing for JPL. One of the early stress-inducing incidents encountered was an apparent attack on their cloud infrastructure.
JPL uses Amazon Web Services' Virtual Private Cloud (VPC) offering, which lets an enterprise cordon off a set of EC2 instances and connect them to its internal network over a VPN, treating them as extensions of internal infrastructure. JPL set up a VPC and started running instances in it, and one morning at 6am, Shams says, they got a phone call saying that a node in the VPC was under attack.
"We all panic and we're looking around to see what might be going on, who might be on to us and who's trying to compromise our system and if there's an internal breach... And three hours later we're still trying to figure out what actually happened," he recalls.
"It turns out that our IT security team, the same people who monitor the alarm that went off, also have a system that does penetration testing of all of our systems. And because our machines were in the VPC, they also went and said 'Oh hey, you're a Web server I'm going to start throwing all this traffic at you and see if you succumb to one of my SQL injections for instance'."
"So it was a self-inflicted alarm," Shams says. However, it was actually "really good", he adds, "because, one, it helped us ensure that our testing infrastructure that we have for internal resources is still working and, two, when things do go wrong in the VPC we still figure out, just like our infrastructure. It was an attestation to the fact that we're able to leverage our internal infrastructure to protect the resources in the cloud, just like they would the resources on-premise."
Another lesson learned the hard way, albeit one that involved fewer panicked phone calls, was the importance of collaboration with cloud vendors: approaching the relationship as more of a partnership than a pure vendor-customer relationship.
With cloud, "there are so many features that are coming in so fast," Shams says, and when JPL started its cloud journey circa 2008-2010, Shams was "in a very exciting development role". "I would be developing services to build on top of cloud capabilities and I'd develop a service, it would have two or three more bugs left, and we'd hear that Amazon was about to release this other service that's going to do everything that we've written, except a lot better. And at that point we would throw away our code and say, 'Okay we'll just use this'."
The lesson, Shams says, was to be open with the vendor about what you're trying to build and what's missing in their services, "so that we can actually stay on the same page as to what might be coming up and what we should build and what we shouldn't build. Kind of understand, principally, what the vendor is interested in building and what they're interested in letting others build on top of that."
Rohan Pearce travelled to Amazon re:Invent as a guest of Amazon.