Thought Leaders in the Cloud: Talking with Randy Bias, Cloud Computing Pioneer and Expert
Randy Bias is a cloud
computing pioneer and recognized expert in the field. He has driven
innovations in infrastructure, IT, Operations, and 24×7 service delivery since
1990. He was the technical visionary on the executive team of GoGrid, a major cloud
computing provider. Prior to GoGrid, he built the world's first multi-cloud,
multi-platform cloud management framework at CloudScale Networks, Inc.
In this interview, we
discuss:
- Cloud isn't all about elasticity. Internal datacenters run about 100
servers for each admin. The large cloud providers can manage 10,000
servers per admin.
- Users can procure cloud resources on an elastic basis, but like power
production, the underlying resource isn't elastic, it's just built above
demand.
- Just doing automation inside of your datacenter, and calling it private cloud,
isn't going to work in the long term.
- Laws and regulations are not keeping pace with cloud innovations.
- Startups aren't building datacenters. In the early days, companies built
their own power generation, but not any more. Buying compute instead
of building compute is evolving the same way.
- The benefit of cloud isn't in outsourcing the mess you have in your
datacenter. It's about using compute on-demand to do processing that
you're not doing today.
Robert Duffner: Could you take a minute to
introduce yourself and your experience with cloud computing, and then tell us
about Cloudscaling as well?
Randy Bias: I'm the CEO of
Cloudscaling. Before this, I was VP of Technology Strategy at GoGrid, which was
the second-to-market infrastructure-as-a-service provider in the United States.
Prior to that, I worked on a startup, building a cloud management system very
similar to RightScale's.
I was
interested very early in cloud technology and I also started blogging on cloud
early in 2007. Prior to cloud I had already amassed a lot of experience
building tier-one Internet service providers (ISPs), managed security service
providers (MSSPs), and even early pre-cloud technology solutions at Grand
Central Communications.
Cloudscaling
was started about a year and a half ago, after I left GoGrid. Our focus is on
helping telcos and service providers build infrastructure clouds along the same
model of the early cloud pioneers and thought leaders like Amazon, Google, Microsoft,
and Yahoo.
Robert: On your blog, you recently
stated that elasticity is not cloud
computing. Many
people see elasticity as the key feature that differentiates the cloud from
hosting. Can you elaborate on your notion that elasticity is really a side
effect of something else?
Randy: We look at cloud and cloud
computing as two different things, which is a different perspective from that
of most folks. I think cloud is the bigger megatrend toward a hyper-connected
"Internet of things." We think of cloud computing as the underlying
foundational technologies, approaches, architectures, and operational models
that allow us to actually build scalable clouds that can deliver utility cloud
services.
Cloud
computing is a new way of doing IT, much in the same way that enterprise
computing was a new way of doing IT compared to mainframe computing. There is a
clear progression from mainframe to enterprise computing and then from
enterprise computing to cloud computing. A lot of the technologies,
architectures, and operational approaches in cloud computing were pioneered by
Amazon, Microsoft, Google, and other folks that work at a very, very large
scale.
In order
to get to a scale where somebody like Google can manage 10,000 servers with a
single head count, they had to come up with whole new ways of thinking about
IT. In a typical enterprise data center, it's impossible to manage 10,000
servers with a single head count. There are a number of key reasons this
is so.
As one
example, a typical enterprise data center is heterogeneous. There are many
different vendors and technologies for storage, networking, and servers. If we
look at somebody like Google, they stated publicly that they have somewhere
around five hardware configurations for a million servers. You just can't get
any more homogeneous than that. So all of these big web operators have had to
really change the IT game.
This
highlights how we think of cloud computing as something fundamentally new. One
of the side effects of large cloud providers being able to run their
infrastructures on a very cost effective basis at large scale is that it
enables a true utility business model.
The cost
of storage, network, and computing will effectively be driven toward zero over
time. Consumers have the elastic capability to use the service on a metered
basis like phone or electric service, even though the actual underlying
infrastructure itself is not elastic.
It's just
like an electric utility. The electricity system isn't elastic, it's a fixed
load. There's only so much electricity in the power grid. That's why we
occasionally get brown-outs or even black-outs when the system becomes
overloaded. The system itself is not elastic; only the usage is.
Robert: That's actually a great
analogy, Randy. You mentioned that public cloud is at a tipping point. There are obvious reasons
for organizations wanting to go down a private cloud path first. Are you
sensing that many organizations will go to the public cloud first? And then
re-evaluate to see what makes sense to try internally?
Randy: In our experience, a
typical large enterprise is bifurcated. There is a centralized IT team focused
on building internal systems that you could call private cloud as an
alternative to the public cloud services. On the other side are app developers in
the various lines of business, who are trying to get going and accomplish
something today. Those two constituencies are taking different approaches.
The app
developers focus on how to get what they need now, which tends to push them
toward public services. The centralized IT departments see this competitive
pressure from public services and try to build their own solutions internally.
We should
remember that we're looking at a long term trend, and that it isn't a zero-sum
game. Both of these constituencies have needs that are real, and we've got to
figure out how to serve both of them.
We have a
nuanced position on this, in the sense that we are neither pro-public cloud nor
pro-private cloud. However, we generally take the stance that probably in the
long term, the majority of enterprise IT spending and capacity will move to the
public cloud. That might be on a 10 to 20 year time-frame.
If you're
going to build a private cloud that will be competitive, you're going to have
to take the same approach as Amazon, Google, Microsoft, Yahoo, or any of the
big web operators. If you just try to put an automation layer on top of your
current systems, you won't ultimately be successful.
We know
the history of trying to do large-scale automation inside our data centers over
the past 20 or 30 years. It's been messy, and there's no reason to think that's
going to change. You've got to buy into that idea of a whole new way of doing
IT. Just adding automation inside your data center and calling it a private
cloud won't get you there.
Robert: Some of the people that we've
spoken to have expressed the notion that clouds only work at sufficient scale. When we talk about Azure
and the cloud in the context of ideal workloads or ideal scenarios, we always
talk about this idea of on and off batch processing that requires intensive
compute or a site that's growing rapidly. And then of course your predictable
and unpredictable bursting scenarios. In your experience, is there some minimum
size that makes sense for cloud implementation?
Randy: For infrastructure clouds,
there probably is a minimum size, but I think it's a lot lower than most people
think. It's about really looking at the techniques that the public cloud
providers have pioneered.
I see a lot of people
saying, "Hey, we're going to provide virtual machines on demand. That is a
cloud," to which I respond, "No, that's virtual machines on demand." Part of
the cloud computing revolution is that providers like Amazon and Google do IT
differently, like running huge numbers of servers with much lower head count.
Inside
most enterprises, IT can currently manage around 100 servers per admin. So
when you move from 100:1 to, say, 1,000:1, labor opex moves from $75 a month
for a server to $7.50 per month. And when you get to ten thousand, it's a mere
$0.75 a month.
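The arithmetic behind those figures can be sketched in a few lines. This is only an illustration: the $7,500 monthly admin cost is an assumption inferred from the quoted $75 per server at 100 servers per admin, and the function and constant names are made up for this example.

```python
# Assumption: one admin costs $7,500 per month, inferred from the
# interview's $75 per server at a ratio of 100 servers per admin.
ADMIN_COST_PER_MONTH = 7500.0


def labor_opex_per_server(servers_per_admin: int,
                          admin_cost: float = ADMIN_COST_PER_MONTH) -> float:
    """Return the monthly labor cost attributed to a single server."""
    if servers_per_admin <= 0:
        raise ValueError("servers_per_admin must be positive")
    return admin_cost / servers_per_admin


if __name__ == "__main__":
    # The three ratios mentioned in the interview.
    for ratio in (100, 1_000, 10_000):
        cost = labor_opex_per_server(ratio)
        print(f"{ratio:>6}:1 -> ${cost:.2f} per server per month")
```

Each tenfold increase in servers per admin cuts the per-server labor cost by a factor of ten, which is the order-of-magnitude change described here.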
These are
order of magnitude changes in operational costs, or in capital expenditures, or
in the overall cost structure. Now what size do you have to be to get these
economies? The answer is ... not as big as you think.
When some
people consider economies of scale, they believe it means the ability to buy
server hardware cheaply enough. But that's not really very difficult. You
can go direct to Taiwanese manufacturers and get inexpensive commodity hardware
that is very reliable. This hardware has the same components as the
hardware you could get from IBM, Dell, or HP today and is built by the same
companies that build these enterprise vendors' hardware.
For
hardware manufacturers, especially the original Taiwanese vendors, there is
only so much of a discount they can provide, so Amazon doesn't have
significantly more buying power than anybody who's got a few million bucks in
their pocket.
Economies
of scale also come from more subtle places, such as the ability to
build a rock star cloud engineering team. For example, the Amazon Web
Services cloud engineering team iterates at a rapid pace, and they have designed
software so they can actually manage a very large data system efficiently at
scale.
You could
do that with a smaller team and fewer resources, but you've got to be really
committed to do that. Also, finding that kind of talent is very difficult.
Robert: You've also talked about how cloud is fundamentally different
from grid and HPC. How do
you see that evolving? Do you see them remaining very separate, for separate
uses and disciplines? Or do you see the lines blurring as time goes on?
Randy: I think those lines will
blur for certain. As I say in the blog post, I view cloud more as high
scalability computing than as high performance computing. That actually means
that the non-HPC use cases at the lower end of the grid market already make
sense on public clouds today. If you run the numbers and the cost economics
make sense, you should embrace cloud-based grid processing today.
Amazon is
building out workload-specific portions of their cloud for high performance
computing running on top of cloud. Still, at that very top of the current layer
of grid use cases that are HPC, the cost economics for cloud are probably never
going to make sense. For example, it may be the case for a large research
institution like CERN or some other large HPC consumer that really needs very
low latency infrastructure for MPI problems.
Robert: It seems that a lot of
issues around the cloud are less associated with technical challenges than they
are about law, policy, and even psychology. I'm thinking here about
issues of trust from the public sector, for example. Many end customers also
currently need to have the data center physically located in their country. How
do you see the legal and policy issues evolving to keep up with the technical
capabilities of the cloud?
Randy: It's always hard to
predict the future, but some of the laws really need to get updated as far as
how we think about data and data privacy. For example, there are regulatory
compliance issues that come up regularly when I talk to people in the EU. Every
single EU member country has different laws about protecting data and providing
data privacy for your users. Yet at the same time, some of that is largely
prescriptive rather than requirements-based, like stating that data can't
reside outside of a specific country.
I don't
know that that makes as much sense as specifying that you need to protect the
data in such a way that you never leave it on the disk or move it over the
network in such a way that it can be picked up by an unauthorized party. I
think the security, compliance, and regulatory laws really need to be updated
to reflect the reality of the cloud, and that will probably happen over time.
In the short term, I think we're stuck in a kind of fear, uncertainty, and
doubt cycle with cloud security.
Previously,
I spent about seven years as a full-time security person. What I found is
that there is always a fairly large disconnect between proper security measures
and compliance. Compliance is the codification in laws to try to enforce a
certain kind of security posture.
But because of the way that
data and IT are always changing and moving forward, while political systems
take years to formulate laws, there's always a gap between the best practices
in security and what the current compliance and regulatory environment is.
Robert: Now, you mentioned a big cloud project your company did in
South Korea. What
are some of the issues that are different for cloud computing with customers
outside the United States?
Randy: I think one of the first
things is that most folks outside the U.S. are really at the beginning of the
adoption cycle, whereas inside the U.S., folks are pretty far along, and
they've got more fully formulated strategies. And the second thing is that in
many of these markets, since the hype cycle hasn't picked up yet, there are
still a lot of questions around whether the business model actually works.
So for
example, in South Korea, the dedicated hosting and web hosting business is very
small, because most of the businesses there have preferred to purchase the
hardware. It's a culture where people want to own everything that they are
purchasing for their infrastructure. So will a public cloud catch on? Will
virtualization on demand catch on? I don't know.
I think
it'll be about cost economics, business drivers, and educating the market. So I
think you're going to find that similar kinds of issues play out in different
regions, depending on what the particulars are there. We're starting to work
with folks in Africa and the Middle East, and in many cases, hosting hasn't
caught on in any way in those regions.
At the
same time, the business models at Infrastructure as a Service providers in the
U.S. don't really work unless you run them at 70 to 80% capacity. It's not like
running an enterprise system where you can build up a bunch of extra capacity and
leave it there unused until somebody comes along to use it.
Robert: I almost liken it to when
the long-distance companies, because of the breakup of the Bells, started to
offer people long distance plans. You had to get your head around what your
call volume was going to look like. It was the same when cell phones came out.
You didn't know what you didn't know until you actually started generating some
usage.
Randy: I think the providers will
have options about how they do the pricing, but the reality is that when you
are a service provider in the market, you are relatively undifferentiated. And
one of the ways in which you try to achieve differentiation is through
packaging and pricing. You see this with telecommunications providers today.
So we're
going to see that play out over the next several years. There will be a lot of
attempts at packaging and pricing services to address consumers' usage
patterns. I liken it to that experience where you get that sticker shock
because you went over your wireless minutes for that month, and then you
realize that you need plan B or C, and then you start to use that.
Or when
you, as a business, realize that you need to get an all-you-can-eat plan for
all of your employees, or whatever else now works for your business
model. Then service providers will come up with a plethora of different content
pricing and packaging to try to service those folks and that will be more
successful.
Robert: In a recent interview I
did with New Zealand's Chris Auld, he said that cloud computing is a model for the procurement of
computing resources. In other words it's not a technological innovation as much
as a business innovation, in the sense that it changes how you procure
computing. What are your thoughts on his point?
Randy: I am adamantly opposed to
that viewpoint. Consider the national power grid; is it a business model or a
technology? The answer is that it's a technology. It's a business
infrastructure, and there happens to be a business model on top of it with a
utility billing model.
The utility
billing model can be applied to anything. We see it in telecommunications, we
see it in IT, we see it with all kinds of resources that are used by businesses
and consumers today.
We all want to know, what
is cloud computing? Is it something new? Is it something disruptive? Does it
change the game?
Yes, it's
something new. Yes, it's something disruptive. Yes, it's changed the game.
The
utility billing model itself has not changed the game. Neither has the
utility billing model as applied to IT, because that has been around for a long
time as well. People were talking about and delivering utility computing
services ten years ago, but it never went anywhere.
What has
changed the game is the way that Google, Amazon, Microsoft, and Yahoo use IT to
run large scale infrastructure. As a side effect, because we've figured
out how to do this very cost effectively at a massive scale, the utility
billing model and the utility model for delivering IT services now actually
works. Before, you couldn't actually deliver an on-demand IT service in a way
that was more cost effective than you could build inside your own enterprise.
Those
utility computing models didn't work before, but now we can operate at scale,
and we have ways to be extremely cost-efficient across the board. If we can
continue to build on that and improve it over time, we're obviously going to
provide a less expensive way to provide IT services over the long run.
It's
really not about the business model. It really is about enabling a new way of
doing IT and a new way of computing that allows us to do it at scale, and then
providing a utility billing model on top of this.
Robert: Clearly, we're seeing a
lot of immediate benefit to startups, for the obvious reason that they don't
need to procure all of that hardware. Are you seeing the same thing as well?
Randy: I've been more interested
in talking about enterprise usage of public services lately, but it seems that
the startups are well into the mature stage, where nobody ever goes out and
builds infrastructure anymore if they have a new startup. It just doesn't make
any sense.
When
folks were first starting to use electricity to automate manufacturing,
textiles, and so on, larger businesses were able either to build a power plant,
or to put their facility near some source of power, such as a hydroelectric
water mill. Smaller businesses couldn't.
Then when
we built a national power grid, suddenly everybody could get electricity for the
same cost, and it became very difficult to procure and use electricity for a
competitive advantage. We're seeing the same thing here, in the sense that
access to computing resources is leveling the playing field. Small businesses
and startups actually have access to the same kinds of resources that very
large businesses do now. I think that that really changes the game over the
long term.
You will
know we crossed a tipping point when two guys and their dog in a third world
country can build the infrastructure to support the next Facebook with a credit
card and a little bit of effort.
Robert: Those are all of the
prepared questions I had. Is there anything else you'd like to talk about?
Randy: There are a few things
that I'd like to add, since I have the opportunity. The first thing reaches
back to the point I made before, likening the way cloud is replacing enterprise
computing to the way client-server or enterprise computing replaced mainframes.
What drove the adoption of client-server (enterprise) computing?
It really
wasn't about moving or replacing mainframe applications, but about new
applications. And when you look at what's going on today, it's all new
applications. It's all things that you couldn't do before, because you didn't
have the ability to turn on 10,000 servers for an hour for $100 and use them
for something.
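To make that figure concrete, here is a rough sketch of the rate it implies. The numbers come straight from the example above (10,000 servers, one hour, $100) and are illustrative, not any provider's actual price; the function names are hypothetical.

```python
def implied_rate_per_server_hour(total_cost: float,
                                 servers: int,
                                 hours: float) -> float:
    """Return the per-server-hour price implied by a total burst cost."""
    if servers <= 0 or hours <= 0:
        raise ValueError("servers and hours must be positive")
    return total_cost / (servers * hours)


def burst_cost(servers: int, hours: float, rate: float) -> float:
    """Return the cost of running `servers` machines for `hours` at `rate`."""
    return servers * hours * rate


if __name__ == "__main__":
    # Illustrative figures from the interview: 10,000 servers, 1 hour, $100.
    rate = implied_rate_per_server_hour(100.0, 10_000, 1.0)
    print(f"Implied rate: ${rate:.2f} per server-hour")
    # At the same rate, a smaller but longer burst costs the same amount.
    print(f"1,000 servers for 10 hours: ${burst_cost(1_000, 10.0, rate):.2f}")
```

Under metered pricing, the cost depends only on total server-hours, so a short, very wide burst costs the same as a longer, narrower one. That is what makes this kind of workload practical to try.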
If you
look at the way that enterprises are using cloud today, you see use cases like
financial services businesses crunching end-of-day trading data, or
pharmaceutical companies doing very large sets of calculations overnight, where
they didn't have that capability before.
There's a weird fixation in
a lot of the cloud community on enterprise or private cloud systems. They're
trying to say that cloud computing is about outsourcing existing workloads and
capacity to somebody who maybe doesn't have the same kind of cost efficiencies
that Amazon or Google has.
If you
just outsource the mess in your data center to someone else who has the same
operational cost economics, it can't really benefit you from a cost
perspective. What has made Amazon and others wildly successful in this area is
the ability to leverage this new way of doing IT in ways that either level the
playing field or otherwise create new revenue opportunities. It's not about bottom
line cost optimization.
If we
just continue doing IT the way we already do it today, I think we're going to
miss the greater opportunity. On the other hand, suppose you ask your developers,
"What can you do for the business if I give you an infinite amount of
compute, storage, and network that you can turn on for as little as five
minutes at a time?" That's really the opportunity.
Robert: That's excellent, Randy. I
really appreciate your time.
Randy: Thanks, Robert.