Hybrid Cloud as Alternative to "Pure" Cloud Computing
We can define the "pure cloud" approach as one in which IT services are provided:
- via the public Internet or another WAN
- by an external company
From the historical standpoint, "pure cloud" providers represent a return to the mainframe era
on a new technological level. Older folks remember quite well how much IBM was hated in the 1960s and
1970s, and how much energy and money people spent trying to free themselves from this particular
"central provider of services" model.
They also remember to what extent the rapid adoption of the PC was driven by the desire to send that
"central service provider" to hell. This history raises a legitimate question about user resistance
to the "pure cloud" model. Security considerations, widespread NSA interception of traffic, and NSA
access to the webmail of major cloud providers have also become hotly debated issues.
Like any new service, along with solving old problems it creates new ones, often more complex and
less solvable. Proponents often exaggerate the positive features and underestimate the problems and
costs. The vision of an IT future based on a remote, centralized, outsourced datacenter that provides
services via the "cloud" over high-speed fiber links is utopian.
It is very true that the huge bandwidth of fiber optic lines made "in the cloud" computing more acceptable
and made possible some business models that were impossible in the past (Netflix). But that does not
mean that this technology is a "universal bottle opener". Bottlenecks remain. I would like to stress
that this is just another example of the distributed computing model (with a WAN used instead of a LAN)
and as such it has its limits. According to Wikipedia, "The Fallacies of Distributed Computing" are a
set of common but flawed assumptions made by programmers in the development of distributed applications.
They originated with Peter Deutsch, and the "eight classic fallacies" can be summarized as follows:
- The network is reliable.
- Latency is zero.
- Bandwidth is infinite.
- The network is secure.
- Topology doesn't change.
- There is one administrator.
- Transport cost is zero.
- The network is homogeneous.
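To make the first two fallacies concrete, here is a minimal sketch of what a client has to do once it stops assuming that the network is reliable and that latency is zero: every remote call is wrapped in a bounded retry loop with exponential backoff. The names `TransientNetworkError`, `call_with_retries` and `make_flaky_service` are hypothetical and exist only for this illustration; the "remote service" is simulated, so nothing here touches a real network.

```python
import random
import time


class TransientNetworkError(Exception):
    """Stands in for a timeout or dropped connection on a WAN link."""


def call_with_retries(operation, attempts=4, base_delay=0.1, sleep=time.sleep):
    """Run `operation`, retrying on TransientNetworkError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientNetworkError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            # Back off 1x, 2x, 4x ... the base delay, with jitter so that many
            # clients do not retry in lockstep after a shared outage.
            sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))


def make_flaky_service(failures_before_success):
    """Hypothetical remote call that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def service():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransientNetworkError("simulated WAN timeout")
        return "payload"

    return service, state
```

None of this machinery is needed when the same call is a local function call or a request over a healthy LAN; that is part of the hidden cost of moving a service across a WAN.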
Giants of cloud computing such as Amazon, Google and Yahoo push the idea of "pure" centralized "cloud-based"
services to the extreme. They are essentially trying to replicate the mainframe environment on a new
level, inheriting all the shortcomings of such an approach. Many companies in such industries as aerospace,
financial services, chemicals and pharmaceuticals are likely to avoid the "pure" cloud computing
model because of their trade secrets, patented formulas and processes, as well as compliance concerns.
Also, large companies usually run multiple datacenters, and it is natural for them to be integrated into
a private "gated" cloud. No rocket science here.
These private clouds use a dedicated private network that connects their members and helps to
avoid the problem of "bandwidth communism". At the same time a private cloud can consolidate the hardware
and software needs of multiple companies, and provide load balancing and better economies of scale.
There is also a complementary trend of creating remote autonomous datacenters controlled and
managed from a well-equipped central command and control center. Server remote control tools have now
reached the stage where you can manage a server hundreds of miles from your office almost as well as
if you could walk to the server room. Actually not quite, and here much depends on your ingenuity.
But having a central control center instead of a centralized server farm is a more flexible approach, as
you can provide computing power directly where it is needed at the cost of typical large server hosting.
A distributed scheme also requires much less WAN bandwidth and is more reliable: in case of problems
with WAN connectivity, or a local hurricane that cuts the electricity supply, local services remain functional.
IBM promotes this approach under the name of "autonomic computing".
The level of automation strongly influences how efficiently IT runs. This factor alone might
have a greater influence on the future shape of IT than all this "in the cloud" buzz. Vendors propose
multiple solutions known as Enterprise System Management (ESM), computer grid (Sun), etc. Examples
include such products as HP OpenView and BMC Software Patrol Service Impact Manager (formerly
MasterCell 3.0). The vision here is of autonomous computational entities, controlled from a common
control center. This approach can be combined with several other alternatives to "in the cloud"
computing, such as "datacenter in the box", virtual appliances, and application streaming.
One important advantage of the hybrid approach is the possibility of using autonomous servers. This is
a pretty new development that became possible due to advances in hardware (DRAC 7, blades) and software
(Virtual Software Appliances). By an autonomous server we mean a server installed in a location where
there is no IT staff. There can be local staff knowledgeable on the level of a typical PC user, but
that's it. Such servers can be managed centrally
from a specialized location which retains highly qualified staff. Such location can be even on a different
This type of infrastructure is pretty common for large chemical companies, banks (with multiple branches)
and many other companies with distributed infrastructure. Details of implementation of course differ.
Here we will concentrate on common features rather than differences. The key advantage of local autonomous
servers (and by extension small local autonomous datacenters) is that local services and local traffic
remain local. This provides several important advantages. Among them:
- You are less dependent on the reliability of the central infrastructure of your "cloud provider"
or company "master" datacenter (which can well be across the Atlantic). In the case of a "hybrid cloud"
you can provide bandwidth-intensive services over the local LAN via local real or virtual servers. Only
administration is centralized.
- Snooping of traffic is more difficult because processing is distributed: if, for example, user
mailboxes reside at local sites rather than at a central location, there is no single place where
all the information can be picked up in one sweep.
- The cost of such infrastructure is lower than that of central cloud infrastructure.
- Dependency on third parties is lower and you can control more aspects of it.
The attraction of a hybrid cloud environment is directly connected with advances in modern hardware
and software:
- Modern hardware provides reliable remote services with good remote management capabilities built in
(especially in a blade environment). There are also advances in virtual machine hypervisors, which
allow running virtual instances at a speed close to that of the "native" OS. A typical Dell or HP
server easily endures five years of service without a single minute of hardware-related downtime.
- Servers from modern manufacturers have built-in KVM (DRAC in Dell, iLO in HP) with remote
control of power and many bells and whistles. Dell DRAC is the standard in the industry as far as
capabilities are concerned (especially in a blade environment), with DRAC 7 outperforming HP iLO by
a wide margin.
- RAID configurations with an additional redundant drive (a mirrored or parity level rather than plain
RAID 0, which by itself has no redundancy) and SSD-based configurations are essentially
bullet-proof. Chances that they can cause downtime are very slim.
- Redundant power supplies usually last five to seven years without malfunctioning.
- Modern software now provides powerful tools for managing multiple similar servers from a central
location, dramatically lowering the cost of "reverse cloud" administration.
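As a rough illustration of the last point, central administration of many similar servers boils down to running the same probe or command against every host in parallel and looking only at the exceptions. The sketch below is an assumption-laden toy: `check_host` is a hypothetical probe that returns simulated data (a real one might query the management controller of each server), and the host names are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def check_host(host):
    """Hypothetical health probe; returns simulated data so the sketch needs no network."""
    return {"host": host, "power": "on", "healthy": not host.startswith("bad-")}


def survey(hosts, probe=check_host, max_workers=8):
    """Probe all hosts in parallel and split them into healthy and failing lists."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(probe, hosts))  # map preserves input order
    healthy = [r["host"] for r in results if r["healthy"]]
    failing = [r["host"] for r in results if not r["healthy"]]
    return healthy, failing
```

With a real probe plugged in, one administrator at the central location sees the short list of failing branch servers instead of logging in to each of them.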
Problems with "pure" cloud computing
Cloud computing in its "pure" form (100% in a remote datacenter) is a centralized mainframe-style solution
and it suffers from all the limitations inherent in centralization. All-powerful remote datacenters with
minimal local client functionality are actually pretty costly, and the savings cited in advertisements are
more often than not pure fiction. They are attractive to higher management as a "stealth" outsourcing
solution, but that's it. Like any outsourced solution it suffers from the loyalty problem. A close combination
of local and centralized capabilities is much closer to optimal. There are several important concerns with
a "pure" centralized solution (a cloud provider based on a single remote datacenter):
- Application performance concerns (WAN bandwidth issues, the capability of remote datacenters
to provide adequate response time, etc.).
- Bureaucratization problems: mainframe-style problems with external providers of IT services
are swept under the carpet (everything becomes a contract negotiation).
- Interoperability problems. Integration issues and the threat of balkanization in case of multiple
"in the cloud" vendors. Lock-in concerns, as well as a potential rip-off due to complicated pricing.
- Total cost of ownership concerns. Cloud providers are pretty greedy and the long-term
cost can't be assumed to be stable: it can rise substantially as soon as the vendor feels that the
customer is locked into the solution, evaporating all potential savings.
- Security concerns. Those are simply huge, from the danger of government interception
of all data to competitors stealing the crown jewels of technology.
- Application capabilities and application performance concerns. Even the most powerful
remote datacenter has difficulty competing with a modern laptop with a 3GHz CPU, 16GB of memory and a 400GB SSD.
- Poor customer service and increased risk of downtime. For the cloud provider you are just
one of many clients and in case of downtime you are left out in the cold.
Cloud computing as a technology is already more than ten years old ;-) and now we know quite a bit
about its limitations.
While it is an interesting reincarnation of mainframes, it is critically dependent on the availability
of high-speed fiber WAN networks. But this technology can be implemented in many flavors: it can be a pure
play (everything done on the servers of the cloud provider, for example Amazon or Google) or a mixed approach
with a significant part of the processing done at the client (the more common and more practical approach).
In no way does the "in the cloud" software services model presuppose the simplistic variant in which
everything must be done on the server while, for some strange reason, a rich, powerful client is considered
not kosher. That variant also ignores the low cost of modern servers and their excellent remote
administration capabilities.
What is more important is providing Software as a Service (SaaS). SaaS is a model
of software deployment where an application is hosted as a service provided to customers across the
network (local or Internet). By eliminating the need to install and run the application on the customer's
own computer, SaaS alleviates the customer's burden of software maintenance, ongoing operation, and
support. Various appliances (email, DNS, etc.) are one popular implementation of SaaS.
Webmail, web hosting and other new "groupware" technologies (blogs, wikis, web conferencing, etc.)
can be hosted on the local network more cheaply and more reliably than on Google or Amazon, and they enjoy
tremendously better bandwidth due to the use of the local network (which is now often 10 Gbit/s) instead
of the much slower (and, in the case of private networks, more expensive) WAN. Web hosting providers emerged as
a strong, growing industry that essentially pioneered the commercialization of the SaaS model and convincingly
proved its commercial viability. As Kishore Swaminathan aptly noted in a response
to Alastair McAulay’s article:
.... as presently constituted, cloud computing may not be a panacea for every organization.
The hardware, software and desktop clouds are mature enough for early adopters. Amazon, for
example, can already point to more than 10 billion objects on its storage cloud; Salesforce.com generated
more than $740 million in revenues for its 2008 fiscal year; and Google includes General Electric
and Procter & Gamble among the users of its desktop cloud.
However, several issues must still be addressed and these involve three critical matters:
where data resides, the security of software and data against malicious attacks, and performance
The Hillary Clinton private email server scandal and the Podesta email hack provide an additional
interesting view of the problems of cloud computing.
Actually the fact that General Electric and Procter & Gamble use Google raises strong suspicion about
the quality of top IT management in those organizations. IMHO this is too risky a gamble for any
competent IT architect. For a large company IT costs are already reduced to around 1% or less, so there
are no big savings in going further in the cost-cutting direction. But there are definitely huge risks, as
at some point the quantity of cost cutting turns into a real quality-of-service issue.
I would argue that the "in the cloud" paradise looks more like the software demo from a popular anecdote
about the difference between paradise and hell ;-). It turns out that for the last five years there
have been several competing technologies, such as virtual appliances and autonomous datacenters, or,
as they are sometimes called, "datacenter in the box".
Usage of a local network eliminates the main problem of keeping all your data "in the cloud": the possibility
of network outages and slowdowns. In that case all local services continue to function, while in
the "pure" cloud the services are dead. From the end user's perspective, it doesn’t make a difference whether a
server glitch is caused by a web host or a software service provider. Your data is in the cloud, and
not on your local PC. If the cloud evaporates you can’t get to your data. This is well known to Google
users. If the service or site goes down, you can’t get to your data and have nobody to contact. And
even if you have somebody to contact, this is a different organization, and they have their own priorities.
While software as a service might represent a licensing saving, there might be a better way to achieve
the middle ground and not lose all the advantages of local storage of data: for example, by using some
kind of automatically synchronized P2P infrastructure, so that you can view and edit data locally. Data should
not be held hostage until the user can get back to the cloud. In this sense Microsoft's Live
Mesh might be a step in the right direction, as it provides a useful middle ground by synchronizing data across
multiple computers belonging to a user (home/office or office1/office2, etc.).
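The "data should not be held hostage" idea can be sketched as a local-first store: every read and write is served from the local copy, and changes are queued and pushed to the remote replica only when the link happens to be up. This is a deliberately simplified, one-way illustration with no conflict resolution; the class `LocalFirstStore` and the dict standing in for the remote side are hypothetical and do not describe how Live Mesh or any other product actually works.

```python
class LocalFirstStore:
    """Keeps every document locally and queues changes for a remote replica."""

    def __init__(self, remote):
        self.local = {}        # always-available local copy
        self.pending = []      # keys changed since the last successful sync
        self.remote = remote   # hypothetical remote replica (any dict-like object)

    def write(self, key, value):
        self.local[key] = value          # the local write never waits for the network
        if key not in self.pending:
            self.pending.append(key)

    def read(self, key):
        return self.local[key]           # reads are served locally, online or not

    def sync(self, online):
        """Push queued changes if the link is up; return how many keys were pushed."""
        if not online:
            return 0
        pushed = 0
        while self.pending:
            key = self.pending.pop(0)
            self.remote[key] = self.local[key]
            pushed += 1
        return pushed
```

During an outage the user keeps working against `local`; the cloud copy simply catches up later, which is the opposite of the "pure" model where an outage means no data at all.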
But the key problem with "in the cloud" delivery of services is that only some services, for example
Web sites and email, as well as others with limited I/O (both ways), are suitable for this deployment
mode. Attempting to run a company-wide videoconference via the cloud is a very risky proposition.
That does not mean that the list of services that are OK for centralization and can benefit from
it is short. Among "limited I/O" services we can mention payroll and enterprise benefits services --
another area where "in the cloud" computing definitely makes sense. But most I/O-intensive enterprise
processing, like file sharing, is more efficiently done on a local level. That includes most desktop
Office-related tasks and ERP tasks, to name just a few. They are more difficult and more expensive to move into
the cloud, and the economic justification for such a move is open for review. So in a way it is a technology
complementary to the local datacenter, not an alternative to it. Moreover, the "in the cloud" approach can be
implemented on a local level over LAN instead of WAN ("mini cloud" or "cloud in the box").
Actually cloud-based services themselves are not that cheap, so the cost savings from switching to an "in the
cloud" service provider are typically exaggerated for large organizations and might even be illusory.
The top-rated providers such as Amazon are mainly interesting if you experience substantial, unpredictable
peaks in your workloads and/or bandwidth use. For stable, consistent workloads you usually end up paying
too much. The spectrum of available services is limited, and outside of running your own virtual servers it
is difficult to replace the services provided by a real datacenter, except perhaps for small startups. The most
commercially viable part is represented by Web hosting and rack hosting providers. For web hosting providers
the advantages quickly diminish with the complexity of the website, though. Among Amazon Web Services
only S3 storage can currently be called a successful, viable service. Microsoft Live Mesh
is mostly a data synchronization service. It presupposes the existence of
local computers and initially provides syncing of files between multiple local computers belonging
to the same user. This is an important service, but it is not a pure "in the cloud" provider model. It
is a more realistic "mixed" approach.
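The claim about stable versus peaky workloads is simple arithmetic, and a back-of-the-envelope sketch makes it visible. All numbers used with it below are hypothetical assumptions chosen for round results (an hourly rental rate, a server purchase price, a five-year service life, a monthly overhead for power, space and administration); they are not quotes from any provider.

```python
def monthly_on_demand_cost(hourly_rate, hours_used):
    """Cost of renting capacity only for the hours actually used."""
    return hourly_rate * hours_used


def monthly_owned_cost(purchase_price, lifetime_months, monthly_overhead):
    """Straight-line amortization of the server plus power, space and administration."""
    return purchase_price / lifetime_months + monthly_overhead


def break_even_hours(hourly_rate, purchase_price, lifetime_months, monthly_overhead):
    """Hours of use per month above which owning is cheaper than renting."""
    owned = monthly_owned_cost(purchase_price, lifetime_months, monthly_overhead)
    return owned / hourly_rate
```

With an assumed rate of 0.50 per hour, a 6000 server amortized over 60 months and 100 per month of overhead, the owned server costs 200 per month and the break-even point is 400 hours. A workload that runs around the clock (about 730 hours a month) is well above it, so renting costs more; a bursty workload that needs the capacity for 100 hours a month is well below it, which is exactly the case where the provider makes sense.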
Again, with the current prices of servers and network equipment, most existing services are not cheap
and become really expensive as you scale them up. We will discuss this issue later.
All in all, it is not very clear what market share "in the cloud" services deserve, but it is clear
that "in the cloud" service providers can't fit all enterprise needs. Currently,
there are a lot more places to run software than there used to be, thanks to the proliferation of powerful
laptops, mobile phones (iPhone and Android phones), tablets, etc. Actually smartphones represent
another interesting development that runs in the direction opposite to the move to "in the cloud"
computing. A smartphone is a powerful local device with wireless network connectivity. The key here is the
substantial processing power and storage capacity available on the device, which increase with each new model.
From the social standpoint cloud provider services are also problematic. Service providers mind their
own interests first. Also, a large service provider is the same bureaucratic monster as a typical datacenter
and shares with the latter the whole spectrum of Dilbertization problems. Large customers experience
"vendor lock-in" when working with service providers, as it involves significant effort to adapt on both sides.
So a walk-out is a less viable option than one might assume on purely theoretical grounds. It is also
possible that outsourcing to "software service providers", like any outsourcing, can be used as a negotiating
tool (aka "method of blackmail") to force wage concessions from workers, tax breaks from governments,
etc., even when the actual savings would be small, or even when they are negative and moving makes
no business sense whatsoever.
Promoters of the remote outsourced datacenter, such as Nicholas Carr, usually ignore the availability
and the cost of bandwidth. Think of Netflix and all the conflicts it is fighting with local cable Internet
providers. We can't assume that free Internet connectivity is adequate for all business purposes. Such
an assumption is correctly called "bandwidth communism". Yes, fiber optics changed the WAN landscape, making
remote services more viable and the Internet tremendously more powerful. But the devil is in the details. For
example, file sharing for a large company over WAN is still a bad idea, as public bandwidth is insufficient
and private bandwidth is costly. Also, any enterprise betting on 24x7 availability of public bandwidth for vital
corporate services looks slightly suicidal because of the "tragedy of the commons" problem, which has already
demonstrated itself in repressions against P2P enthusiasts by large ISPs. All in all, this "grand utility
computing" vision ("bandwidth communism") is problematic because somebody needs to pay for all this.
Fiber networks increased both Internet and LAN bandwidth substantially and revitalized distributed
computing. But there is a big difference between distributing over LAN and over WAN. The latter is a much
tougher case. With all the tremendous progress of the Internet, available bandwidth does not increase as
quickly as computing power. Nowhere close, and it never has. If anything, due to increased scrutiny
and "shaping" by ISPs (they are not a charity and need to be profitable), bandwidth "per user" might
recently have started decreasing, as such resource hogs as YouTube and video distribution services
(movies on demand) become more and more prominent. The ability of video streams and P2P services to
clog the Net at the most inopportune moment is now a well-established fact and is a real curse for university
For I/O-intensive tasks, unless you pay for quality of service, the "in the cloud" computing model
stands on very shaky ground. Reliable 24x7 bandwidth cannot be free for all users in all circumstances
and for all protocols. A substantial amount of traffic to and from a remote datacenter is not that easy to
transmit reliably and with minimal latency via public channels in rush hours. But buying private links
to remote datacenters can be extremely expensive: for mid-size companies it is usually as expensive
as keeping everything in house. For multinationals it is more expensive, so only "other considerations"
(like "jurisdiction over the data") can sustain the centralization wave toward large remote datacenters.
For many multinationals SOX was the last straw that made moving datacenters out of the USA desirable,
costs be damned. Now the shadow of the NSA keeps this scare alive and well. The cost of private
high-speed links limits the cost efficiency of the "in the cloud" approach to services where disruptions
or low bandwidth at certain times of the day cannot lead to substantial monetary losses. For critical
business services such as ERP a public data channel can be too brittle.
But it is fair to state that the situation is different for different services. For example, for SMTP
mail outsourcers like Google/Postini this problem is not relevant due to the properties of the SMTP
protocol: they can communicate via the regular Internet. The same is true of DNS service providers, webmail
and instant messaging. CRM is also pretty close. But for ERP, file sharing and WAN-based backup the
situation is very different: providing high-speed networking services over WAN is a very challenging
engineering task, to say the least. The cost of bandwidth also puts natural limits on the growth of service
providers, as local networks are usually much cheaper and much faster. Achieving 1 Gbit/s speed on a LAN is
extremely cheap (even laptops now have 1 Gbit adapters) while it is quite expensive over WAN. There are
also other limiting factors:
- The possibility of creating a local LAN backbone with speeds higher than 1 Gbit/s. 10 Gbit/s
backbones are becoming more and more common.
- Limited bandwidth at the point of connection of the provider to the Internet. Every provider
is connected to the Internet via a pipe and that pipe is only so big. For example, OC-1 and OC-3 have
upper limits of 51.84 Mbit/s and 155.52 Mbit/s respectively. Funnily enough, the upper speed of OC-3
(which is pretty expensive) is only slightly higher than the 100 Mbit/s that long ago became the lowest
common denominator for LANs. Large service providers typically use OC-48 with speeds up to 2488.32
Mbit/s, which is of the same order as gigabit Ethernet. 10 Gigabit Ethernet is the fastest commonly
available network standard for LANs. It is still an emerging technology with only 1 million ports shipped
in 2007, but it is quickly growing in importance. It might eventually be used in modified form for
WANs too. Anyway, as WAN bandwidth is limited and shared between multiple customers, a spike in the activity
of one customer might negatively affect others. Networking problems at the provider level affect
all its customers, and the recovery period might lead to additional spikes of activity.
- Local vs. remote storage of data. Recent enterprise-level hard drives (Cheetah
15K) have speeds up to 164 MB/sec (megabytes, not megabits). From the speed and cost point of
view the ability to keep data and programs local is a big technological advantage. For I/O-intensive
applications it might be that the only viable role for remote providers is synchronization with local
data ;-). An example of this approach is Microsoft's Live Mesh.
- Interdependence of customers on the transport level. This is just another incarnation of the
"tragedy of the commons" problem. Bandwidth hogs like game, P2P, music and video enthusiasts do not care
a dime about your SLA and can easily put a company that uses public links at a disadvantage at any time
of the day, if and when something new and exciting like a new HD movie is released. Also, providers
are not willing to sacrifice their revenue to accommodate "free-riders": as soon as usage of bandwidth
cuts into profits it is punishable, and no amount of rhetoric about "Internet freedom" and "Net neutrality"
can change that. That means that enterprise customers relying on public bandwidth can suffer from
the efforts of providers to manage free-riding. It also means that a corporation which moved services to
the cloud competes with various bandwidth hogs who do not want to sacrifice any ground and are ready
to go quite far to satisfy their real or perceived needs. My university experience suggests that corporate
users can suffer from Internet clogging in the form of sluggish download speeds, slow response times
and frustration with I/O-intensive services that become much less useful and/or enjoyable. See for
example Time Warner Cable Vs. The Horde.
- Competition for resources at the remote datacenter level. For any successful service, providing
all the necessary bandwidth is costly and cuts into margins. Recently Amazon faced a situation
in which the bandwidth required for its
Elastic Compute Cloud (EC2) proved to be higher than that used by all of Amazon.com’s global websites combined.
You can read between the lines how that affects profitability:
Adoption of Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3) continues
to grow. As an indicator of adoption, bandwidth utilized by these services in fourth quarter 2007
was even greater than bandwidth utilized in the same period by all of Amazon.com’s global websites
combined.
Providers which offer customers unlimited bandwidth are banking on the fact that the majority of
their customers will not use much of their bandwidth. This is essentially a marketing trick.
As soon as you exceed a fraction of what is promised they may well kick you out. People
who tried to implement software, mp3 or video sharing services on low-cost ISP accounts realized
that very soon. See for example the references that I collected under "Unlimited
bandwidth myth". Net neutrality does not mean that the tragedy of the commons is not applicable. As
Bernardo A. Huberman and Rajan M. Lukose noted:
Because the Internet is a public good and its numerous users are not charged in proportion
to their use, it appears rational for individuals to consume
bandwidth greedily while thinking that their actions have little effect on the overall
performance of the Internet. Because every individual can reason this
way, the whole Internet's performance can degrade considerably, which makes
everyone worse off. An analysis of the congestions created by such dilemmas predicts
that they are intermittent in nature with definite statistical properties leading to
short-lived spikes in congestion. Internet latencies were measured over a wide range
of conditions and locations and were found to confirm these predictions, thus
providing a possible microscopic mechanism for the observed intermittent congestions
of the Internet.
So a company which tries to implement Web-based streaming of, say, a corporate video conference
via the cloud is in for nasty surprises unless it pays "an arm and a leg" for dedicated lines to its headquarters
and other major locations (which makes the whole idea much less attractive in comparison with the
local datacenter). The ability to stream video of any considerable quality in real time between two
(or more!) arbitrary points in the network is not really something that can be easily done over the
The main point is that reliable WAN connectivity costs a lot of money and is difficult
to achieve. This problem is unavoidable if your major components are "in the cloud" (in the WAN). Also, on
the "free Internet" enterprises are starting to compete for bandwidth with streaming media (films over
the Internet). The latter proved to be a huge resource hog, and the quality of a typical Internet connection
now fluctuates widely during the day. That means that in order to achieve respectable quality of service
for bandwidth-intensive applications enterprises need to buy dedicated WAN connections. That is a very
expensive habit, to say the least. A typical move for multinationals, say, relocation of an SAP R/3 instance
from the USA to Europe (just from one continent to another), shows that achieving reasonable latency for
requests coming from the USA is not that easy and definitely not cheap. The cost of a high-bandwidth
transatlantic connection is the major part of the additional costs and eats all the savings from centralization.
The same is true of any WAN connection: reliable high-bandwidth WAN connections are expensive. Moreover,
reliability needs to be carefully monitored, and that also costs money, as anybody who has been responsible
for a company WAN SLA can attest.
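The LAN versus WAN argument can be reduced to two small formulas: the ideal time to push a given volume of data through a pipe, and the response time of a "chatty" application that needs many round trips per request. The sketch below uses the standard SONET line rates for OC-1, OC-3 and OC-48 alongside Ethernet rates; the round-trip times used with it (about 90 ms across the Atlantic, about 0.5 ms on a LAN) and the number of round trips are illustrative assumptions, not measurements.

```python
# Nominal line rates in Mbit/s (protocol overhead and sharing are ignored).
LINE_RATES_MBIT = {
    "OC-1": 51.84,
    "OC-3": 155.52,
    "OC-48": 2488.32,
    "1 Gbit/s Ethernet": 1000.0,
    "10 Gbit/s Ethernet": 10000.0,
}


def transfer_seconds(size_gigabytes, rate_mbit_per_s):
    """Ideal time to move the data at the full line rate."""
    size_megabits = size_gigabytes * 1000 * 8  # decimal gigabytes -> megabits
    return size_megabits / rate_mbit_per_s


def chatty_request_seconds(round_trips, rtt_ms, payload_megabytes, rate_mbit_per_s):
    """Rough response time: latency cost of the round trips plus time on the wire."""
    latency = round_trips * rtt_ms / 1000.0
    wire = payload_megabytes * 8 / rate_mbit_per_s
    return latency + wire
```

Moving 100 GB takes 800 seconds at a full 1 Gbit/s and roughly 86 minutes over an OC-3 that you have all to yourself. For a request that needs 50 round trips, the assumed 90 ms transatlantic round-trip time alone adds 4.5 seconds, and no amount of extra bandwidth buys that time back, while the same exchange on a LAN stays in the range of tens of milliseconds.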
Centralisation, or centralization, is the process by which the activities of an organization,
particularly those regarding decision-making, become concentrated within a particular
There is a shadow of mainframes over the whole idea of "in the cloud" software services providers.
Excessive centralization has its own perils, perils well known from the days of the IBM 360/370 and "glass
datacenters". While at the beginning the growth of "cloud services" providers increases profit margins and
may even increase the quality of the service, eventually each organization hits the "size" wall.
That's why businesses tend to perceive such providers as inflexible and, due to their natural preoccupation
with profit margins, hostile to their business needs: exactly as they perceived the "glass datacenter"
in the not so distant past. So much for the improvement of the business model in comparison with traditional
local datacenters, as problematic as they are (and I would be the last to deny that they have their
own set of problems).
As one reader of the Calculated Risk blog noted in his comments on the post
Popular Google Product Suffers Major Disruption:
They started employing more MBAs at Google? Seriously, any creative company that "grows up" and
starts taking advice from bean counters/ entrail readers and sociopaths is doomed.
Microsoft's greatest strength was that its founders never lost control.. so even though their
products were inferior (at least initially), their competitors took advice from MBAs and f**ked themselves
up, leaving MS as the only survivor.
It's very hard to scale a service-based tech firm and keep the mojo that made the startup successful
in the first place, especially via acquisitions or by employing 'professional managers' to operate the
company. Basically I think the story is simple -- Parkinson's law -- bureaucracies naturally grow without
limit. That includes the management of large "cloud-services" providers, including Google. Excessive
layers of middle managers and affiliated cronies ("shadow secretaries") count in the denominator of
labor productivity. Everyone knows that many middle managers (often called PHBs) are mainly "inventing"
work for themselves and other middle managers: writing memos, calling meetings and so on. Middle management
is all about sustaining internal information flow; technology makes good middle management more efficient
since it enables better information flow, but it makes bad middle managers even more harmful as they "invent"
useless information flow (spam) and block useful information in order to survive.
Issues of quality, loyalty and knowledge of the business automatically surface, and as a result
customers suffer. The mainframe-style utility model encourages excessive bureaucratization with rigid procedures,
stupid customer service and power concentrated at the top. That means that the issue of sociopathic
leadership and sycophantic managers replacing founders is even more acute than in regular IT firms. Corporate
predators prefer large companies. As a result demoralization sets in, and IT personnel, that is, cubicle
serfs, now spend a large fraction of their office time surfing the Internet.
Contrary to the simplistic descriptions and assumptions typical of writers like Carr, mainframe warts
are very difficult, if not impossible, to overcome. And they can be amplified by the low cost of free services
that come with no reliability guarantee. Disappearance of data is usually not covered. There is a danger of relying
too much on semi-free (advertisement-supported) services.
For example, anybody who has used a low-cost Web hosting provider can attest that the interests of providers
run contrary to the interests of advanced users. Providers often stifle adoption of advanced technologies,
even ones they nominally support, because such technologies might, for example, increase the load
on the server. Provision of adequate bandwidth for multiple users (and decent response times)
can also be a shaky area, especially during rush periods like 8-11 EST. Typically customer service is far
from responsive to any complaints.
Security is another important (and costly to resolve) challenge: a break-in at a large provider affects
all its customers. There were several high-profile break-ins at large Web hosting providers during
the last two years, so this is not a theoretical threat. Claiming that Web providers are a total solution
for any organization is like saying that just because the Big 4 accounting firms (with their army
of accountants, tax specialists and so on) exist, organizations can do away with internal accountants
altogether. Even for hybrid platforms such as Salesforce.com, application upgrades and quality of service
are still a major concern.
Relying on advertising is another minefield. Many people hesitate to send anything important to
a Gmail address, knowing that the mail will be scanned in transit by what is, after all, an advertising company.
There is also a lot of disappointment with this model, as exemplified by the following characterization
of "cloud-based" services and outsourcing:
"We thought that we were like a farmer shooting a fat rabbit, but it turned out that the rabbit
is shooting at us."
This quote suggests that providers of such services, like any outsourcers, might in the future have difficulty
shaking money loose from organizations, as customers discover that the two sides' interests are highly divergent.
IBM has already discovered that this is an inherent limit of its push toward the "service oriented organization".
Recently it even experienced a backlash.
Because we need to bridge the interests of two or more different organizations, there are several significant
problems in interaction with "in the cloud" providers. Among them:
- Problem of loyalty. It cuts both ways. First of all, if you use external providers,
loyalty of staff disappears, and for any complex service you face an "all or nothing" situation: if the service
works, everything is wonderful, but if it does not, troubleshooting is extremely difficult and fixing
the problem is almost impossible even if you understand what is happening, because the infrastructure belongs
to another organization. Anybody who has used Web hosting (the most successful example of such services)
can attest that this is a wonderful service as long as it works. But if you have a problem, you have
a big problem: the provider's staff has no loyalty to your particular organization, and competition does not
help, as another provider can be as bad or even worse, so switching brings you nothing but additional
headaches and losses. Even an elementary problem can take several months to resolve; I am not
joking, I have experienced it myself. Oversubscription, which leads to highly loaded servers and insufficient
network bandwidth, is another common problem. There are also problems related to the "race to the
bottom" in such services: the main differentiator becomes price, and to attract new customers Web
providers often make claims that are untrue (unlimited bandwidth is one typical example). As a result
naive customers who believe such claims get burned.
- Forced upgrades. In the case of external providers the platform is owned by the provider,
and if its financial or technological interests dictate that an upgrade is necessary, it will be done.
That means that customers have to adapt or leave. And an upgrade at a provider with a large number of
customers can be a huge mess that costs clients dearly. I experienced this situation myself and can
attest that the level of frustration involved is substantial and the mess can easily last several
- Compatibility problems. As the provider uses a specific technology,
the level of interoperability of this technology with other technologies important to the company
is not under the control of the user of such services. This can lead to significant costs due to lack
of interoperability. In the simplest example, lack of interoperability with Microsoft Office is a
serious problem for Sun, which uses Open Office (Star Office, to be exact).
- The loss of operational flexibility. When the switch to an "in the cloud" provider is made on cost grounds alone, it usually
creates new (and initially mostly unrecognized) dependencies that deprive the company of much of
its operational flexibility. Typically security departments are the direct beneficiaries, as security-related
spending tends to grow dramatically as a defensive reaction of the organism. The main consequence
of bureaucratization is the loss of flexibility, and sooner or later this lack of flexibility
can come back to haunt the company.
In a recent interview Bill Gates noted that "The IT systems are your brain. If you take your
brain and outsource it then any adaptability you want (becomes) a contract negotiation". After you have negotiated
the contract with the "in the cloud" provider, any flexibility you used to have is lost, as every change
explicitly or implicitly becomes a renegotiation of the contract. Moreover,
if the company has lost its key IT staff and key architectural issues are decided by outsourcers,
then the company essentially becomes a hostage of the outsourcers: it no longer has the brain
power to assess the options, and the architecture (and thus the costs) is by and large controlled
by forces outside its control. That is a much more serious risk than many PHBs assume: the line between
architectural decisions and implementation decisions is pretty fuzzy. There is also an associated brain-drain
risk: if you outsource important functions, you irrevocably erode the capabilities within your firm.
When problems arise, internal IT staff can often resolve them more quickly,
without the bureaucratic overhead inherent when two corporations are involved.
The first category of factors that lead to loss of flexibility is connected with the additional standards,
policies and procedures that are usually introduced when an external service provider is involved.
The other important aspect is the loss of the remnants of competitive advantage,
if any, as the service provider is now the place where the critical mass of know-how and the talent pool
reside. That somewhat reverses the "client-provider" relationship: it is the service provider who now can
dictate the rules of the game, who is the only party that understands the precise nature of the
tasks involved, and who can conceal this knowledge from the other party for his own advantage. That is usually
helped by the demoralization of the remaining staff in the company.
- Amplification of the management missteps.
The first and most important desire of a service provider is to keep the client, even if the client's
situation has changed and no longer lends itself easily to the services provided. That can lead to huge
architectural blunders. Such things are not limited to offshoring; they often happen with complex
projects like ERP where external consultants, especially big consulting firms,
are involved. Several large companies went out of business or were bought as a result. Among the examples
we can list FoxMeyer and AT&T Wireless. Several other companies were severely hurt (Hershey, etc). This list
might actually include Compaq.
As for Compaq, the story is more complex. The dot-com bust hurt "value added" companies like HP and Compaq
disproportionately and increased the appetite for mergers and acquisitions. The dogged determination
of Carly Fiorina to push the merger through (which served both for self-promotion and as a smokescreen for
her own problems at struggling HP) and the ability of the former Compaq boss, certified public accountant
Michael Capellas, to
understand the personal benefits of Carly Fiorina's proposition proved to be a winning combination. Capellas,
who proved to be a "serial CEO", was a big fan of SAP, and that probably was a contributing factor
in Compaq's troubles. When he was appointed, he promised to turn around struggling Compaq in 180 days.
Now we know that after just 84 he found a much better financial solution, at least for himself. A
shortcut for a turnaround ;-). It is interesting to note that in 2000, based on the iPAQ success, he
said [ComputerWeekly.com, Feb 2000]:
"There will be an explosion in devices, all attached to the Web. We
are going to simplify the PC radically." Capellas promises that future desktop computers will
be easier to manage than today's machines.
Large companies have become very adept at establishing remote teams in places like India, Russia,
etc. Due to their ability to offer higher wages and better environment they are able to attract local
talent and run challenging research and development projects. Often, though, these outposts become disconnected
from their parent companies because they fail to establish rapport with key participants in the central
office. Foreign prophets are usually ignored. There is something fundamental in this "tragedy of local
talent" and communication is still a problem even in the age of videoconferences. Without "warm-body"
personal contacts it is difficult to build long-term trust based relationships.
Many organizations that thought outsourcing IT was the key to success failed miserably. Those that
did not fail lost competitive advantage and experienced the demoralizing effect of vendor lock-in and
QOC hokey-pokey which simply made no sense. Some in-sourced IT back, some recreated it from
scratch, and some are still in denial.
That means that clients of service providers will be implicitly pushed to the lowest common denominator
and cannot utilize local expertise, even if it exists. They face "helpdesk level" people and, instead
of benefiting from a specialized provider, are often offered wrong solutions to misdiagnosed problems.
My experience with Web providers suggests that trivial problems like an error in a DNS record or wrong
permissions can become real production issues.
Service providers can evolve and upgrade software independently of wishes of some or all of the customers.
That means that customers who are not satisfied with the direction taken need either to adapt or abandon
The public Internet is unsuitable for handling a large volume of transactions with stringent performance
criteria. That means that it is dangerous to put databases at "in the cloud" providers: the more successful
"in the cloud" providers are (or if there are just too many P2P or multiplayer videogame enthusiasts
in the same subnet), the slower your WAN connection will be ("tragedy of the commons").
Moreover, in comparison with a LAN, WAN-based provision of software services is a more complex system
and as such is less reliable, especially at bottlenecks (which are the service provider "entry points" and
associated infrastructure: DNS, routers, switches, etc). With a WAN outage the situation can become a
lot worse than when spreadsheets or MS Word documents suddenly become inaccessible on the local
server due to a LAN outage. In the latter case you can still copy them to a USB stick directly from the server and
work with the local copy until network connectivity is restored, because your departmental file server
is just several dozen yards away and a friendly administrator can probably help you get to the
data. In the case of a WAN there is no way to plug a USB stick into the server or use another shortcut to avoid
the effects of network downtime: if the WAN connection is down, you are really down. Generally, not only can you
do nothing about the outage, its effects might be amplified by the fact that there are many customers
affected. All you will get is a message like this:
The service is experiencing technical difficulties. We apologize for the
inconvenience. Please try again later .
That means that in some cases the effect of an external outage on the organization might be such that the
head of the person who enthusiastically recommended that the company move "into the cloud" rolls, regardless
of his position, real or faked IQ and technical competence. Recently both Gmail and Amazon services
experienced multiple outages. As Brad Stone noted in NYT:
There is plenty of disappointment to go around these days. Such technology stalwarts as
Research in Motion
, the company behind the BlackBerry, have all suffered embarrassing
technical problems in the last few months.
About a month ago, a sudden surge of visitors
to Mr. Payne’s site began asking about the normally impervious
Amazon. That site was ultimately down for several hours over two business days, and
Amazon, by some estimates, lost more than a million dollars an hour in sales.
The Web, like any technology or medium, has always been susceptible to unforeseen hiccups.
Particularly in the early days of the Web, sites like
eBay and Schwab.com regularly went dark.
But since fewer people used the Internet back then, the stakes were much lower. Now the
Web is an irreplaceable part of daily life, and Internet companies have plans to make us
even more dependent on it.
Google want us to store not just e-mail online but also spreadsheets, photo albums,
sales data and nearly every other piece of personal and professional information. That data
is supposed to be more accessible than information tucked away in the office computer or
The problem is that this ideal requires Web services to be
available around the clock — and even the Internet’s biggest companies sometimes have trouble
making that happen.
Last holiday season, Yahoo’s system for Internet retailers, Yahoo Merchant Solutions,
went dark for 14 hours, taking down thousands of e-commerce companies on one of the busiest
shopping days of the year. In February, certain Amazon services that power the sites of
many Web start-up companies had a day of intermittent failures, knocking many of those companies
The causes of these problems range widely: it might be system upgrades with unintended
consequences, human error (oops, wrong button) or even just old-fashioned electrical failures.
Last month, an electrical explosion in a Houston data center of the Planet, a Web hosting
company, knocked thousands of Web businesses off the Internet for up to five days.
“It was prolonged torture,” said Grant Burhans, a Web entrepreneur
from Florida whose telecommunications- and real-estate-related Web sites were down for four
days, costing him thousands of dollars in lost business.
I was actually surprised how many posts each "in the cloud" service outage generates and how significant
the losses reported by some users were. In addition to the
Official Gmail Blog, one of the best places
to track Gmail outages proved to be
Twitter (there is also a site
which provides a free check for frustrated users of Gmail and other "in the cloud" services). Some reactions
are pretty funny:
- Thank you so much for coming back. I felt lonely
without your constant, vigilant presence.
- I was so relieved Gmail was down. I thought
my employer decided to block it on my berry.
- Gmail ruined my day. Damned cloud!
- Gmail is really pissing me off today,
which doesn't happen often. S'like having your first real fight in a relationship. Very disconcerting!
- Gmail's back, but now my IndyChristian
provider is down. And Facebook.com . Go figure.
- Thank God I never jumped aboard the evil gmail
- To all Gmail outage whiners: if you want
the right to bitch, don't use free email. Also, look up the definition of "BETA" sometime.
- Very comforting to know that I wasn't the only
one that had no Gmail for a bit there. I'm not alone in this world? Long but good day.
- Man, gmail goes down for one moment and
the whole world is up in arms :)
- How did I miss the gmail outage? I feel
so out of the loop!
- the weather is hot, the a/c is b0rked, gmail
was down this afternoon. expect to see four horsemen walk by any minute...
- Sorry, completely
missed the gmail outage because I was out biking and enjoying the amazing evening we're
- Having Gmail down about made me bang my
head on the table. Oy.
gmail and getting behind the olympics
- For Twitterers, Gmail going down is like
JFK being shot, innit?
- Ignorance IS bliss-so busy today I did not even
know about the great gmail outage of '08.
- The cloud of unreliability: It's not clear why
anyone should be surprised that Gmail, Amazon.com's cloud ..
- Was thinking about going to back to Gmail
from .Mac since MobileMe was down. Now I hear Gmail was down. Thinking about envelopes
- so mad at gmail for 502-ing her earlier
today. what did we do before google?
- With Gmail having gone down and me not
noticing, I am undiagnosing myself of Internet Addiction and OCD.
- Maybe gmail is down to express solidarity
As any self-respecting obsessive business e-mail checker could tell you, each outage is a real shock
and strikes at the most inopportune moment. In reality most email outages do not make users less productive,
they just deprive them of their favorite tool for procrastination and for wasting their own and other people's time ;-)
Here is a short but pretty sobering list for 2008. It is neither complete nor does it contain the most important
posts about each particular outage.
While there is nothing strange about such outages, as large distributed datacenters are notoriously difficult
to operate, please note that an organization that used both Amazon and Gmail experienced multiple
noticeable outages during the first half of 2008. It is impossible to deny that such
outages make the fragility of the "in the cloud" model visible even to the most "pro-cloud" observers
such as Mr. Carr, although I doubt that he'll reproduce the statistics listed here in his blog.
[Jan 28, 2008]
Gmail OUTAGE, relying on Google too much - TECH.BLORGE.com
Google services are usually rock solid reliable, but earlier today some Gmail users lost service
for a couple of hours. That begs the question, are we relying on Google too much?
the last few years
and every time it creates a flurry of activity as businesses and individuals realize how much
they rely on a single business and platform for all of their messaging needs.
The same concept is mirrored with search engine traffic, where many people don’t even realize
there are search engines out there other than Google.
[Feb 16, 2008] Amazon explains
its S3 outage Between the Lines ZDNet.com
Amazon has issued a statement that adds a little more clarity to its Web services outage on Friday.
Here’s Amazon’s explanation of the S3
outage, which wreaked
havoc on startups and other enterprises relying on
Early this morning, at 3:30am PST, we started seeing elevated levels of authenticated requests
from multiple users in one of our locations. While we carefully monitor our overall request
volumes and these remained within normal ranges, we had not been monitoring the proportion
of authenticated requests. Importantly, these cryptographic requests consume more resources
per call than other request types.
Shortly before 4:00am PST, we began to see several other users significantly increase their
volume of authenticated calls. The last of these pushed the authentication service over its
maximum capacity before we could complete putting new capacity in place. In addition to processing
authenticated requests, the authentication service also performs account validation on every
request Amazon S3 handles. This caused Amazon S3 to be unable to process any requests in that
location, beginning at 4:31am PST. By 6:48am PST, we had moved enough capacity online to resolve
As we said earlier today, though we’re proud of our uptime track record over the past two
years with this service, any amount of downtime is unacceptable. As part of the post mortem
for this event, we have identified a set of short-term actions as well as longer term improvements.
We are taking immediate action on the following: (a) improving our monitoring of the proportion
of authenticated requests; (b) further increasing our authentication service capacity; and
(c) adding additional defensive measures around the authenticated calls. Additionally, we’ve
begun work on a service health dashboard, and expect to release that shortly.
The Amazon Web Services Team
[Jun 9, 2008]
502 outages SKFox
mint to see live stats for the
traffic to skfox.com and Google Analytics for the long term statistics. I’ve seen a resurgence
today of traffic to my two blog posts (in
March and again in
May) about my gMail account being down. The most common for me is a “Temporary Error (502)”
and it would seem to be happening to other users too. I hate to be the bearer of bad news, but
there isn’t a whole lot you can do about it. On the short side, most of the outages only last
30 minutes or so with the longest being 3.5 hours. It can be terribly frustrating, but there are
a few things you can do to alleviate the pain.
- Use your email client like Outlook or Thunderbird to download your messages to your
local machine with POP3 while keeping your gmail account unchanged. That way, even if gMail
is inaccessible, you can get to your old email for reference. Of course, you have to do this
once your account is up and running again.
- Visit the
gMail Google group to at the least find others in your boat and get the latest updates.
Some users have reported success getting to their mail during outages by using any number of
alternative links to gmail. Your mileage may vary, but here they are:
Today’s outage is a known issue, and I hope for all of your coming here from search engines,
that it comes back up quickly for you.
[Jul 20, 2008]
Amazon's S3 Storage Outage Busts Web Sites, Crashes iPhone Apps
What happened? Sometime this morning, Amazon's (AMZN) S3 storage service went down. Or, according
to Amazon's official note, S3 is "experiencing
elevated error rates." A lot of companies -- Twitter, 37signals, etc. -- rely on S3 to host static
files like images, style sheets, etc. for their Web apps. So when S3 goes down, those sites lose
some/most/all functionality, depending on what the company uses S3 for.
So how'd Amazon's storage service bork my iPhone?
Tapulous relies on S3 for images like Twinkle user icons. And they must not have included
a "plan B" in their code to handle an image server outage. So when S3 hiccuped, and Twinkle couldn't
download images, the app crashed, taking me back to the iPhone home screen. (And, hours later,
it's still crashing.)
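The missing "plan B" mentioned in the excerpt above can be sketched in a few lines. This is only an illustration in Python, not Tapulous's actual code: the function name `fetch_icon`, the `PLACEHOLDER_ICON` constant and the injectable `fetch` parameter are all hypothetical. The idea is simply that a failed download from the storage provider degrades to a bundled default image instead of crashing the application.

```python
import urllib.request

# Hypothetical stand-in for the bytes of a default image shipped with the app.
PLACEHOLDER_ICON = b"placeholder-icon-bytes"


def fetch_icon(url, fetch=None, timeout=3.0):
    """Return icon bytes from url, or a bundled placeholder if the download fails."""

    def default_fetch(u):
        # Plain HTTP GET with a timeout, so a hung server cannot block forever.
        with urllib.request.urlopen(u, timeout=timeout) as resp:
            return resp.read()

    fetch = fetch or default_fetch
    try:
        return fetch(url)
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses:
        # the storage service is unreachable, so degrade instead of crashing.
        return PLACEHOLDER_ICON
```

The `fetch` parameter exists only so that the outage path can be exercised without a real network failure.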
[Aug 11, 2008]
TheStar.com Business Google
Gmail users having troubles
SEATTLE–Google Inc said Monday that many users of its Gmail service are having
trouble accessing their online e-mail accounts due to a temporary outage in a contacts system
used by Gmail.
The problems began at about 5 p.m., and a company spokesman said Google is starting
to implement a fix for the problem right now.
"We are starting to roll out a fix now and hope to have the problem resolved as
quickly as possible," said a Google spokesman.
[Aug 25, 2008]
FOXNews.com - Amazon's
Site Outage Raises Scalability Questions - Science News Science & Technology Technology News
Amazon.com has built a reputation for being an aggressive,
take-no-prisoners kind of company, but it showed its more cooperative side this week.
Late last week,
Internet traffic monitoring firm Keynote
Systems issued a report that said many of the largest e-commerce players will face major
load-handling challenges for the holidays if they don't make big changes by the fourth quarter.
However, after so many years of holiday shopping seasons, some in the industry
scoffed that the majors could be caught so short.
To help out, Amazon (AMZN) generously
knocked out its own
servers for almost 2 hours on Aug. 21.
OK, OK. Amazon probably didn't crash just to help Keynote make a point. But
its timing was impeccable nonetheless.
It's interesting to note that within a few days this month, three of the industry's
most powerful retail forces — Wal-Mart (WMT), Amazon
and Google (GOOG) — all suffered problems that,
one way or the other, can be classified as scalability-related.
But the most important difference between "in the cloud" services and local services is not the duration
or frequency of outages. The most important difference is the level and quality of information available about
the situation. In the case of local services all information about the situation is readily available and
thus some countermeasures can be taken. In the military world such a difference is of paramount importance,
and IT is no different from the military world in this respect.
In the case of "in the cloud" services the amount and quality of information about the outage are very low;
customers are essentially in the dark. Services just abruptly cease to exist and then magically
come back. This lack of information has its own costs.
Virtualization generally magnifies failures. In the physical world, a server failure typically would
mean that backup servers would step in to prevent downtime. In the virtual world, depending on the number
of virtual machines residing on a physical box, a hardware failure could impact multiple virtual servers
and the applications they host. As a result failures have a much larger impact, affecting multiple applications
and multiple users. Little fires turn into big fires. Virtualization might also increase two other problems:
- Performance-related problems. Companies ensure top performance of critical applications
using powerful dedicated servers, network and storage resources for those applications, segmenting
them from other traffic to ensure they get the resources they need. The "in the cloud" provider's model
for solving those problems is based on allocating additional resources on demand, as its goal is
to minimize the cost of infrastructure. That means that at any given time the performance of an application
could degrade, perhaps not to a failure but to a crawl, making users less productive or paralyzed.
- Troubleshooting complexity. Server problems in the past could be limited to one box, but
now the problem can be related to the hypervisor rather than the server per se, or can move with the application
instance from one virtual machine to another. If a problem existed on one server, disappears after the
application is moved to another and later reappears, the question arises: "Is the problem fixed?" Often the
temporary disappearance of the problem lulls the staff into a vicious cycle of activity and they start
to transfer the problem around from virtual server to virtual server instead of trying to solve it
All in all, the more a particular company relies on "in the cloud" computing, the more severe the
effect of outages will be. And in comparison with a traditional local LAN-based "physical server" deployment
there are several additional sources of them:
- In the cloud provider outages: upgrades, equipment replacements, DNS blunders (a very popular
type of blunder ;-), blunders of personnel, problems connected with the personnel incompetence, etc.
- Links from in the cloud provider to ISP
- Problems in ISP (again here there are issues of upgrades, blunders of personnel, physical damages,
DNS problems, etc)
- Problems in the link from ISP to the company
- Virtualization-related problems, if it is used by "in the cloud" provider.
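The list above describes a chain of components connected in series: the service is reachable only if every link in the chain is up at the same time, so (assuming independent failures) the individual availabilities multiply. A back-of-the-envelope sketch in Python follows. The 99.9% figure for each component is a hypothetical number chosen purely for illustration, not a measurement of any real provider or ISP.

```python
from math import prod

HOURS_PER_YEAR = 24 * 365


def chain_availability(availabilities):
    """Availability of components in series, assuming independent failures."""
    return prod(availabilities)


# Hypothetical per-component availabilities, for illustration only.
components = {
    "cloud provider": 0.999,
    "provider-to-ISP link": 0.999,
    "ISP": 0.999,
    "ISP-to-company link": 0.999,
    "virtualization layer": 0.999,
}

total = chain_availability(components.values())
downtime_hours = (1 - total) * HOURS_PER_YEAR

print(f"end-to-end availability: {total:.4f}")
print(f"expected downtime: {downtime_hours:.1f} hours/year")
```

Even when each component looks respectable on its own, the end-to-end figure is noticeably worse than any single one of them, which is the point of the list.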
The first thing you notice during a large outage at an "in the cloud" service provider is that customer
service is overloaded and you need to wait a long time to get to a human voice. There are usually
just too many angry and frustrated customers ahead of you. Equally bad, the nature of the problem
and the estimated time to resolution are usually not communicated, keeping you in the dark.
Moreover, even after you get to a support person the information you get will be sketchy. Here is one
example:
I was working on a client webinar using WebEx Event Center. I tried to log onto our site and got
a cryptic message saying the site was unavailable. Whammo. I got on the line to technical support
and waited my turn in queue, along with what must have been quite a few other panicked or angry customers.
My support person was surprisingly calm and good natured, given the call volume they must have been
processing. He confirmed that there were network problems on their side and said that while he didn't
know details of the problem, he was certain it would be all right soon. I had a little time before
our event, so I agreed to try again later.
Sure enough, our site came up, but on a backup network. This wasn't completely clean, as one of
our scheduled future events had disappeared from our registration page, meaning people couldn't sign
up. But at least we could carry on with today's show. Unfortunately, performance was quite a bit
below normal standards, with slides and annotations taking inconsistent and sometimes very long time
lags to appear on participants' computers.
After the event, strange behavior continued, with an inability to access post-event reports through
the WebEx administrator interface. I called technical support again and got agreement that it certainly
was strange behavior and we should all hope it would correct itself once the system was back to normal
again. Whenever that might be.
Now, I'm not singling out WebEx for any particular acrimonious treatment here. I happened
to be using them when this problem occurred. Any provider can have a similar problem. At
least WebEx had a backup network plan in place and we were technically able to carry on with our
scheduled event. But it's worth noting that there is a frustrating sense of powerlessness while a
problem like this is going on. You can't prod anybody for more details, because you don't have access
to the people trying to fix the problem as you might if a similar situation occurred with your own
business network. You can't get much in the way of progress updates. And you can't put your own backup
plan into effect.
One interesting difference in "horror stories" between local and "in the cloud" datacenters was aptly
outlined in the following:
Just after midnight on Jan. 22, 2006, Con Edison began telling the Internet that it was Martha
Stewart. That is, Con Edison erroneously began sending out routing information to redirect Internet
traffic intended for Martha Stewart Living Omnimedia to its own servers.
The publishing company was a Con Edison customer. So were other businesses and Internet providers
whose routing information Con Edison hijacked at the same time. The result was a mess that wasn't
completely cleared up for 18 hours — and some businesses were offline for most of that time.
But not Martha Stewart, whose CIO, Mike Plonski, wrote to me to clarify what happened at his company.
Plonski's secret sauce? No big secret — just network monitoring and
Plonski said: "While one of the Internet connections at our corporate offices was impacted by
the ConEd issue you describe, we, as a company, are smart enough to have employed redundancy, both
by location and carrier, for our network operations. As a result, during this time frame, we simply
flipped all of our Internet traffic to run over our secondary line until ConEd resolved their issue."
OK, it was a little more complicated than that. Plonski said his staff spotted the problem through
routine network monitoring. There was clearly something wrong with network traffic coming to the
corporate office over the Con Edison line. Thanks to the monitoring,
the company knew about the problem about 30 minutes after it started.
Because of the type of outage, an IT staffer had to connect and manually switch over to a redundant
line. That took another hour.
Total time for the outage: about 90 minutes in the wee hours of a Sunday morning. Total impact:
An outage? Yes. A knockout? No way. And handling the problem didn't
require rocket science — just monitoring, redundancy and sharp IT staff work.
Repeat after me: [in case of local datacenters] "handling the problem didn't require rocket
science — just monitoring, redundancy and sharp IT staff work."
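The "monitoring plus redundancy" recipe from this account can be sketched in a few lines. This is only an illustration under my own assumptions, not a description of what Plonski's staff actually ran: the link names, the probe history and the three-failed-probes threshold are all hypothetical. A link is treated as down once its most recent probes all fail, and traffic goes to the first link in priority order that is still up. The last lines simply add up the timeline reported above (about 30 minutes to detect, about an hour to switch over manually).

```python
from typing import Dict, List, Optional


def link_is_down(probes: List[bool], threshold: int = 3) -> bool:
    """A link counts as down once its last `threshold` probes all failed."""
    recent = probes[-threshold:]
    return len(recent) == threshold and not any(recent)


def pick_active_link(history: Dict[str, List[bool]], threshold: int = 3) -> Optional[str]:
    """Return the first link (in priority order) that is not down, or None."""
    for name, probes in history.items():
        if not link_is_down(probes, threshold):
            return name
    return None


# Hypothetical probe results, oldest first (True = probe succeeded).
history = {
    "primary-carrier-a": [True, True, False, False, False],
    "secondary-carrier-b": [True, True, True, True, True],
}
active = pick_active_link(history)
print(active)

# Timeline from the account: detection time plus manual switch-over time, in minutes.
detect_min, switch_min = 30, 60
total_outage_min = detect_min + switch_min
print(total_outage_min)
```

The point of the sketch is that both the detection rule and the switch-over decision stay in the hands of local staff, which is exactly what a customer of an external provider cannot do.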
While an individual "in the cloud" service provider can do a decent job of providing the required functionality,
problems arise when you need to use several such providers and they have to interoperate.
The lack of interoperability between different SaaS applications is one of the
most well known and, at the same time, largest problems with SaaS. In a way, companies that use
several different providers are spending way too much money, with the side effect of creating a problematic,
cumbersome IT architecture that lacks flexibility and efficiency and as such is inferior to the integrated
datacenter architecture. The focus of vendors is on the adaptation
of the SaaS model to enterprise requirements (enterprise-level integration), but there is a growing
need for vendor-to-vendor integration. How and when those needs will be addressed remains to be seen.
In a way, SaaS applications overemphasize the GUI portion of application functionality
at the expense of the ability to smoothly exchange data with other, often similar, applications.
The emergence of several classes of enterprise SaaS (for email, CRM, supply chain management, benefits
and other HR-related applications, etc.) creates the problem of similar or identical data existing in various
formats at different providers, along with the need for synchronization and timely updates. If providers do not share
a common "hardware cloud" we have a huge problem, as not only the protocols of data exchange but also reliability
and security issues are pretty complex.
Also, existing interoperability can be broken at any time by a software update at any one of the
"in the cloud" providers involved.
Other competing technologies
The pure cloud approach is just one instance of Service Oriented Architecture (SOA). The latter emphasizes
the grouping of software functionality around business processes and packaging software as network
services. SOA also presupposes network usage as a channel for the exchange of data between applications.
It also relies on standard protocols that provide loose coupling of services with a particular
OS and implementation language. Here is how Wikipedia defines SOA:
SOA separates functions into distinct units, or services,
which are made accessible over a network in order that they can be combined and reused in the production
of business applications. These
services communicate with each other by passing data from one service to another, or by coordinating
an activity between two or more services. SOA concepts are often seen
as built upon, and evolving from older concepts of distributed computing and modular programming.
I would also say that the SOA concept is connected with the concepts of Unix, especially pipes and coroutines.
SaaS represents just one limited architecture for implementing SOA, and while "in the cloud" services
providers are good for certain production tasks, they are bad and/or too costly for others. Webmail providers
are the most prominent examples of successful SaaS applications; mail filtering (for example Google Postini)
and teleconferences are two other examples of successful delivery of functionality over the Internet.
But they are not so good for complex enterprise applications (for example SAP/R3) and they are questionable
for file and document sharing.
In many cases existing "legacy" IT solutions work very well and are more cost efficient than
pure cloud. It can be a system that was installed five years ago, or even a system that
was installed 15 years ago (some publishing companies are still using PDP-based systems to run the business),
but such systems by and large run "by themselves" without need for intervention for months if not years.
All the painful deployment problems were ironed out a long time ago, users know the software well and
everything is as smooth as it should be. Why bother to rock the boat and pay greedy service providers
per-user fees for a solution that you already have and have paid for? It would be interesting if Carr
tried to explain why this can be cost efficient for the enterprise...
Appliances are specialized computers with a stripped-down OS specifically tuned to perform only the
functions required for a particular application (or set of applications). Support is remote and performed
by the vendor via WAN. In a sense they represent an approach which can be called "local cloud".
The most popular types of appliances are firewalls, proxy servers and IDS sensors, but mail server appliances
are also popular. They are real and battle tested. Some versions of Microsoft OS (Windows Server, Small
Business Edition) can be considered to be appliances in disguise.
In the case of application streaming, applications are located on a remote computer and are accessed "as
needed" using network protocols ("on the fly" installation). The key advantage of application streaming
is that you use local computing power for running the application, not a remote server. That removes
the problem of latency in transmitting the video stream generated by the GUI on the remote server
(where the application is running) to the client.
Also, modern laptops have tremendous computing power that is difficult (and very expensive) to match
in a remote server park. Once you launch the application on the client (from a shortcut), the remote server
streams (like streaming video or audio) the necessary application files to your PC and the application
launches. This is done just once. After that the application works as if it were local. Also, only required
files are sent (so if you are launching Excel you do NOT get those libraries that are shared with MS
Word if it is already installed). But each time you launch an application, the current
version is verified, and in case of upgrades or patches the launch of the new version is transparent.
In a way the Microsoft patching system in automatic mode can be considered a rudimentary application
streaming framework, so this approach is not as exotic as it sounds. It implements some neat tricks:
downloading the new version in the background while you are working with the old version, and upgrading during
the next reboot.
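The "send only what is required, verify the version on every launch" behavior can be illustrated with a small sketch. It is not how any particular streaming product works; I simply assume that the client and the server each keep a manifest mapping file names to content digests, and all file names and digests below are hypothetical. The client asks only for files that are missing locally or whose digest differs from the server's copy.

```python
from typing import Dict, List


def files_to_stream(local: Dict[str, str], server: Dict[str, str]) -> List[str]:
    """Return names of files that are missing locally or differ from the server's copy."""
    return sorted(name for name, digest in server.items() if local.get(name) != digest)


# Hypothetical manifests: file name -> content digest (shortened for readability).
server_manifest = {"excel.exe": "a1", "shared_office.dll": "b2", "charts.dll": "c3"}
# The shared library is already present because Word was installed earlier.
local_files = {"shared_office.dll": "b2"}

first_launch = files_to_stream(local_files, server_manifest)
print(first_launch)  # ['charts.dll', 'excel.exe']

# After the first launch everything is cached; a later patch changes one file.
local_files.update(server_manifest)
server_manifest["excel.exe"] = "a9"
after_patch = files_to_stream(local_files, server_manifest)
print(after_patch)  # ['excel.exe']
```

On the first launch the shared library is skipped, and after a patch only the changed file travels over the network, which is why the verification step at launch is cheap.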
Virtualization promises more agile and more efficient local datacenters and undercuts the "in the cloud"
software services model in several ways. First of all, it permits packaging a set of key enterprise applications
as "virtual appliances". The latter, like streamed applications, run locally, store data locally, are
cheaper, have better response time and are more maintainable. This looks to me like a more promising technical
approach for complex sets of applications with intensive I/O requirements. For example, you can deliver
a LAMP stack virtual appliance (Linux-Apache-MySQL-PHP) and use it on a local server for running your
LAMP applications (for example a helpdesk), enjoying the same level of quality and sophistication of packaging
and tuning as in the case of remote software providers. But you do not depend on the WAN, as users connect to
it over the LAN, which guarantees fast response time. And your data are stored locally (but if you wish
they can be backed up remotely to Amazon or to another remote storage provider).
With the virtual appliances model the underlying infrastructure is still owned by the end-user business,
but it pays for complete delivery of the actual application in a highly compartmentalized fashion, which
is a much simpler and more efficient way of working. It is not hard then to understand where the demand
for virtual appliances comes from.
You might ask why virtual appliances have been deployed widely only during the last three years or so.
The answer is that virtualization started to become mature and commoditized only with the development
of Intel 5160 CPUs. At this point even mid-range servers became able to consolidate several old servers
without breaking much of a sweat. Virtual appliances can be very quickly provisioned to meet customer
demand, and automation tools can be used to reduce the management headache and simplify the process.
Multiple vendors give businesses the possibility to select the offering which provides real value.
The other trend is the emergence of a higher level of standardization of datacenters (the "Cloud in a
Box" trend). It permits cheap prepackaged local datacenters to be installed everywhere. Among examples
of this trend are standard shipping-container-based datacenters, which are now sold by Sun and soon will
be sold by Microsoft. They already contain typical services like DNS, mail, file sharing, etc. preconfigured.
For a fixed cost an organization gets a set of servers capable of serving a mid-size branch or plant. In
this case the organization can avoid paying monthly "per user" fees, a typical cost recovery model
of software service providers. It can also be combined with the previous two models: it is easy to stream
both applications and virtual appliances to the local datacenter from a central location. For a small
organization such a datacenter can now be preconfigured in a couple of servers using Xen or VMware
plus the necessary routers and switches and shipped in a small rack. This spring IBM started offering BladeCenter
servers with Power and x86 processors, and service management software.
Liquid cooling, once used in mainframes and supercomputers, may be returning to data centers as an
alternative to air conditioning. Solutions include modular liquid-cooling units placed between racks
of servers; a new door at the back of a server rack with tubes in which chilled water is circulating;
and server racks with integrated power supply (three-phase converters to DC, DC distribution to servers)
and liquid cooling. This permits a significant increase in the density of servers, be they blades or regular multicore servers.
I would like to stress that the power and versatility of the modern laptop is a factor that should not
be underestimated. It completely invalidates Carr's cloudy dream of users voluntarily switching to the network
terminal model inherent in centralized software service provision (BTW mainframe terminals and, especially,
"glass wall datacenters" were passionately hated by users). Such a solution can have mass appeal only
in very limited cases (webmail). I think that users will fight tooth and nail for the preservation of
the level of autonomy provided by modern laptops. Moreover, in no way will users agree to the sub-standard
response time and limited feature set of "in the cloud" applications, as problems with Google apps adoption show.
While Google apps is an interesting project, it can serve as a litmus test for the difficulties of
replacing "installed" applications with "in the cloud" applications. First of all, functionality is really,
really limited. At the same time Google has spent a lot of money and effort creating them but never
got any significant traction and/or sizable return on investment. After several years of existence this
product does not even match the functionality of Open Office. To increase penetration Google recently
started licensing them to Salesforce and other firms. That means that the whole idea might be flawed,
because even such an extremely powerful organization as Google, with its highly qualified staff and the huge
server power of its datacenters, cannot create an application suite that can compete with applications preinstalled on
a laptop, which means it cannot compete with the convenience and speed of running applications
locally on a modern laptop.
In the case of corporate editions the price is also an issue, and Google apps in comparison with Office
Professional ($50 per user per year vs. $220 for Microsoft Office Professional) does not look like
a bargain if we assume a five-year life span for MS Office. The same situation exists for home users:
price-wise Microsoft Office can now be classified as shareware (in Microsoft Office Home and Student
2007, which includes Excel, PowerPoint, Word, and OneNote, the cost is $25 per application). So for home
users Google needs to provide Google apps for free, which, taking into account the amount of design effort
and the complexity of achieving compatibility, is not a very good way of investing available cash. Please
note that Microsoft can at any time add the ability to stream Office applications to laptops and put
"in the cloud" Office-alternative software service providers in a really difficult position: remote servers
need to provide the same quality of interface and amount of computing power per user as the user enjoys
on a modern laptop. That also suggests the existence of some fundamental limitations of the "in the cloud" approach
for this particular application domain. And this is not a unique case. SAP has problems with moving SAP/R3
to the cloud too and recently decided to scale back its efforts in this direction.
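The corporate price comparison above is easy to check with a few lines of arithmetic. The sketch below uses only the figures quoted in the text ($50 per user per year, $220 one-time, five-year life span) and deliberately ignores upgrades, support contracts and discounting, which are simplifying assumptions on my part.

```python
def subscription_cost(per_user_per_year: float, years: int) -> float:
    """Total per-user cost of a yearly subscription over the given number of years."""
    return per_user_per_year * years


SUBSCRIPTION_PER_YEAR = 50.0   # Google apps, per user per year (figure from the text)
LICENSE_ONE_TIME = 220.0       # Microsoft Office Professional (figure from the text)
LIFE_SPAN_YEARS = 5            # assumed life span of the Office license (from the text)

subscription_total = subscription_cost(SUBSCRIPTION_PER_YEAR, LIFE_SPAN_YEARS)
break_even_years = LICENSE_ONE_TIME / SUBSCRIPTION_PER_YEAR

print(subscription_total)   # 250.0
print(break_even_years)     # 4.4
```

Over five years the subscription comes to $250 per user against $220 for the license, so under these assumptions the one-time license becomes the cheaper option after about 4.4 years.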
All in all, the computing power of a modern dual-core 2-3GHz laptop with 2-4G of memory and a 100G-200G hard
drive represents a serious challenge for "in the cloud" software services providers. This makes it
difficult for them to attract individual users' money outside advertising-based or other indirect models.
It's even more difficult for them "to shake corporate money loose": corporate users value the independence
of applications locally installed on a laptop and the ability to store data locally. Not everybody wants
to share with Google their latest business plans.
- [Jun 09, 2017] Amazon's S3 web-based storage service is experiencing widespread issues on Feb 28 2017 ( Jun 09, 2017 | techcrunch.com )
- [Apr 01, 2017] Amazon Web Services outage causes widespread internet problems ( Apr 01, 2017 | www.cbsnews.com )
- [Apr 01, 2017] After Amazon outage, HealthExpense worries about cloud lock-in by Maria Korolov ( Apr 01, 2017 | www.networkworld.com )
- [Mar 03, 2017] Do You Replace Your Server Or Go To The Cloud The Answer May Surprise You ( Mar 03, 2017 | www.forbes.com )
- [Jan 12, 2017] Digitopoly Congestion on the Last Mile ( Jan 12, 2017 | www.digitopoly.org )
- [Nov 09, 2015] Thoughts on the Amazon outage ( nickgeoghegan.net )
Amazon's S3 web-based storage service is
experiencing widespread issues, leading to service that's either
partially or fully broken on websites, apps and devices upon which it
relies. The AWS offering provides hosting for images for a lot of sites,
and also hosts entire websites, and app backends including Nest.
The S3 outage is due to "high error rates with S3 in US-EAST-1," according to
Amazon's AWS service health dashboard, which is where the company
also says it's working on "remediating the issue," without initially
revealing any further details.
Affected websites and services include Quora, newsletter provider
Sailthru, Business Insider, Giphy, image hosting at a number of publisher
websites, filesharing in Slack, and many more. Connected lightbulbs,
thermostats and other IoT hardware is also being impacted, with many
unable to control these devices as a result of the outage.
Amazon S3 is used by around 148,213 websites, and 121,761 unique
domains, according to data tracked by
, and its popularity as a content host concentrates
specifically in the U.S. It's used by 0.8 percent of the top 1 million
websites, which is actually quite a bit smaller than CloudFlare, which is
used by 6.2 percent of the top 1 million websites globally – and yet it's
still having this much of an effect.
Amazingly, even the status indicators on the AWS service status page
rely on S3 for storage of its health marker graphics, hence why the site
is still showing all services green despite obvious evidence to the contrary.
- Update (11:40 AM PT):
AWS has fixed the issues with its
own dashboard at least – it'll now
accurately reflect service status as it continues to attempt to fix the problem.
- Update (11:57 AM PT):
AWS says it
believes they now "understand root cause" of the S3 issues, and are
"working hard at repairing." It has not shared specifics of that cause.
- Update (12:15 PM PT):
Network intelligence software
notes that all the packet loss for the ongoing issue
appears to be happening in the Ashburn, VA area. Amazon has an AWS data
center in Ashburn, whose
exact location was revealed in a news story last year
due to a fire
during its construction.
- Update (12:54 PM PT):
AWS says it's seeing "recovery
for S3 object retrievals, listing and deletions" which means you're
probably seeing avatars and other visual assets come back in some spots.
The company also says it expects further improvements to error rates
within the next hour.
- Update (1:20 PM PT):
S3 is now fully recovered in
terms of the retrieval, listing and deletion of existing objects,
according to the AWS status page, and it's now working on restoring
normal operation for the addition of new items to S3-based storage.
- Update (2:10 PM PT):
AWS says that it's now fully
recovered in terms of resolving the error rates it was seeing, and S3
service is now "operating normally."
Feb 28, 2017 6:03 PM EST
NEW YORK --
Amazon's cloud-computing service,
Amazon Web Services, experienced an outage in its eastern U.S.
region Tuesday afternoon, causing unprecedented and widespread
problems for thousands of websites and apps.
Amazon is the largest provider of cloud computing services in
the U.S. Beginning around midday Tuesday on the East Coast, one
region of its "S3" service based in Virginia began to experience
what Amazon, on its service site, called "increased error rates."
In a statement, Amazon said as of 4 p.m. E.T. it was still
experiencing "high error rates" that were "impacting various AWS services."
"We are working hard at repairing S3, believe we understand root
cause, and are working on implementing what we believe will
remediate the issue," the company said.
But less than an hour later, an update offered good news: "As of
1:49 PM PST, we are fully recovered for operations for adding new
objects in S3, which was our last operation showing a high error
rate. The Amazon S3 service is operating normally," the company said.
Amazon's Simple Storage Service, or S3, stores files and data
for companies on remote servers. It's used for everything from
building websites and apps to storing images, customer data and
"Anything you can think about storing in the most cost-effective
way possible," is how Rich Mogull, CEO of data security firm
Securosis, puts it.
Amazon has a strong track record of stability with its cloud
senior editor Dan Ackerman told CBS News.
"AWS... is known for having really good 'up time,'" he said,
using industry language.
Over time, cloud computing has become a major part of Amazon's
"Very few people host their own web servers anymore, it's all
been outsourced to these
, and Amazon is one of the major ones," Ackerman said.
The problem Tuesday affected both "front-end" operations --
meaning the websites and apps that users see -- and back-end data
processing that takes place out of sight. Some smaller online
services, such as Trello, Scribd and IFTTT, appeared to be down for
a while, although all have since recovered.
Some affected websites had fun with the crash, treating it like
a snow day:
"... "From a sustainability and availability standpoint, we definitely need to look at our strategy to not be vendor specific, including with Amazon," said Lee. "That's something that we're aware of and are working towards." ..."
"... "Elastic load balances and other services make it easy to set up. However, it's a double-edged sword, because these types of services will also make it harder to be vendor-agnostic. When other cloud platform don't offer the same services, how do you wean yourself off of them?" ..."
"... Multi-year commitments are another trap, he said. And sometimes there's an extra unpleasant twist -- minimum usage requirements that go up in the later years, like balloon payments on a mortgage. ..."
The Amazon outage reminds companies that having all their eggs in one cloud basket might
be a risky strategy
"That is the elephant in the room these days," said Lee. "More and more companies are starting
to move their services to the cloud providers. I see attackers trying to compromise the cloud provider
to get to the information."
If attackers can get into the cloud systems, that's a lot of data they could have access to. But
attackers can also go after availability.
"The DDoS attacks are getting larger in scale, and with more IoT systems coming online and being
very hackable, a lot of attackers can utilize that as a way to do additional attacks," he said.
And, of course, there's always the possibility of a cloud service outage for other reasons.
The 11-hour outage that Amazon suffered in late February was due to a typo, and affected Netflix,
Reddit, Adobe and Imgur, among other sites.
"From a sustainability and availability standpoint, we definitely need to look at our strategy
to not be vendor specific, including with Amazon," said Lee. "That's something that we're aware of
and are working towards."
The problem is that Amazon offers some very appealing features.
"Amazon has been very good at providing a lot of services that reduce the investment that needs
to be made to build the infrastructure," he said. "Elastic load balances and other services make
it easy to set up. However, it's a double-edged sword, because these types of services will also
make it harder to be vendor-agnostic. When other cloud platform don't offer the same services, how
do you wean yourself off of them?"
... ... ...
"If you have a containerized approach, you can run in Amazon's container services, or on Azure,"
said Tim Beerman, CTO at Ensono , a managed
services provider that runs its own cloud data center, manages on-premises environments for customers,
and also helps clients run in the public cloud.
"That gives you more portability, you can pick something up and move it," he said.
But that, too, requires advance planning.
"It starts with the application," he said. "And you have to write it a certain way."
But the biggest contributing factor to cloud lock-in is data, he said.
"They make it really easy to put the data in, and they're not as friendly about taking that data
out," he said.
The lack of friendliness often shows up in the pricing details.
"Usually the price is lower for data transfers coming into a cloud service provider versus the
price to move data out," said Thales' Radford.
Multi-year commitments are another trap, he said. And sometimes there's an extra unpleasant twist
-- minimum usage requirements that go up in the later years, like balloon payments on a mortgage.
Is your server or servers getting old? Have you pushed it
to the end of its lifespan? Have you reached that stage
where it's time to do something about it? Join the
crowd. You're now at that decision point that so many
other business people are finding themselves this year.
And the decision is this: do you replace that old server
with a new server or do you go to: the cloud.
Everyone's talking about the cloud nowadays so you've got to consider
it, right? This could be a great new thing for your
company! You've been told that the cloud enables companies
like yours to be more flexible and save on their IT
costs. It allows free and easy access to data for
employees from wherever they are, using whatever devices
they want to use. Maybe you've seen the
by accounting software maker MYOB that found
that small businesses that adopt cloud technologies enjoy
higher revenues. Or perhaps you've stumbled on
that said that small businesses are losing
money as a result of ineffective IT management that could
be much improved by the use of cloud based services. Or
of more than 1,200 small businesses by technology
which discovered that
" cloud users cite cost savings, increased efficiency and
greater innovation as key benefits" and that " across all
industries, storage and conferencing and collaboration are
the top cloud services and applications."
So it's time to chuck that old piece of junk and take
your company to the cloud, right? Well just hold on.
There's no question that if you're a startup or a very
small company or a company that is virtual or whose
employees are distributed around the world, a cloud based
environment is the way to go. Or maybe you've got high
internal IT costs or require more computing power. But
maybe that's not you. Maybe your company sells
pharmaceutical supplies, provides landscaping services,
fixes roofs, ships industrial cleaning agents,
manufactures packaging materials or distributes gaskets.
You are not featured in
and you have
not been invited to present at the next Disrupt
conference. But you know you represent the very core of
small business in America. I know this too. You are just
like one of my company's 600 clients. And what are these
companies doing this year when it comes time to replace
These very smart owners and managers of small and
medium sized businesses who have existing applications
running on old servers are not going to the cloud.
Instead, they've been buying new servers.
Wait, buying new servers? What about the cloud?
At no less than six of my clients in the past 90 days
it was time to replace servers. They had all waited as
long as possible, conserving cash in a slow economy,
hoping to get the most out of their existing machines.
Sound familiar? But the servers were showing their age,
applications were running slower and now as the companies
found themselves growing their infrastructure their old
machines were reaching their limit. Things were getting
to a breaking point, and all six of my clients decided it
was time for a change. So they all moved to cloud, right?
Nope. None of them did. None of them chose the cloud.
Why? Because all six of these small business owners and
managers came to the same conclusion: it was just too
expensive. Sorry media. Sorry tech world. But this is
the truth. This is what's happening in the world of
Consider the options. All of my clients evaluated
cloud based hosting services from
also interviewed a handful of cloud based IT management
firms who promised to move their existing applications
(Office, accounting, CRM, databases) to their servers and
manage them offsite. All of these popular options are
viable and make sense, as evidenced by their growth in
recent years. But when all the smoke cleared, all of
these services came in at about the same price:
approximately $100 per month per user. This is what it
costs for an existing company to move their existing
infrastructure to a cloud based infrastructure in 2013.
We've got the proposals and we've done the analysis.
You're going through the same thought process, so now
put yourself in their shoes. Suppose you have maybe 20
people in your company who need computer access. Suppose
you are satisfied with your existing applications and
don't want to go through the agony and enormous expense of
migrating to a new cloud based application. Suppose you
don't employ a full time IT guy, but have a service
contract with a reliable local IT firm.
Now do the numbers: $100 per month x 20 users is
$2,000 per month or $24,000 PER YEAR for a cloud based
service. How many servers can you buy for that amount?
Imagine putting that proposal out to an experienced,
battle-hardened, profit generating small business owner
who, like all the smart business owners I know, look hard
at the return on investment decision before parting with
For all six of these clients the decision was a
no-brainer: they all bought new servers and had their IT
guy install them. But can't the cloud bring down their IT
costs? All six of these guys use their IT guy for maybe
half a day a month to support their servers (sure he could
be doing more, but small business owners always try to get
away with the minimum). His rate is $150 per hour.
That's still way below using a cloud service.
No one could make the numbers work. No one could
justify the return on investment. The cloud, at least for
established businesses who don't want to change their
existing applications, is still just too expensive.
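The numbers in this excerpt can be laid out as a short calculation. All figures come from the article ($100 per user per month, 20 users, an IT contractor at $150 per hour for about half a day a month), except that "half a day" is taken here to mean 4 hours, which is my assumption. The article gives no server purchase price, so the sketch only shows how much of the yearly cloud bill would be left over for hardware.

```python
USERS = 20
CLOUD_PER_USER_PER_MONTH = 100   # dollars, from the article
IT_RATE_PER_HOUR = 150           # dollars, from the article
IT_HOURS_PER_MONTH = 4           # assumption: "half a day a month" taken as 4 hours

cloud_per_year = USERS * CLOUD_PER_USER_PER_MONTH * 12
support_per_year = IT_RATE_PER_HOUR * IT_HOURS_PER_MONTH * 12
left_for_hardware = cloud_per_year - support_per_year

print(cloud_per_year)      # 24000
print(support_per_year)    # 7200
print(left_for_hardware)   # 16800
```

Under these assumptions local support costs about $7,200 a year against $24,000 for the cloud service, which leaves roughly $16,800 a year that could go toward new servers before the cloud option breaks even.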
Please know that these companies are, in fact, using
some cloud-based applications. They all have virtual
private networks setup and their people access their
systems over the cloud using remote desktop technologies.
Like the respondents in the above surveys, they subscribe
to online backup services, share files on DropBox and
file storage, make their calls over Skype, take advantage
of Gmail and use collaboration tools like
Docs or Box. Many of their employees have iPhones and
Droids and like to use mobile apps which rely on cloud
data to make them more productive. These applications
didn't exist a few years ago and their growth and benefits
cannot be denied.
Paul-Henri Ferrand, President of
North America, doesn't see this trend continuing. "Many
smaller but growing businesses are looking and/or moving
to the cloud," he told me. "There will be some (small
businesses) that will continue to buy hardware but I see
the trend is clearly toward the cloud. As more business
applications become more available for the cloud, the more
likely the trend will continue."
"... By Shane Greenstein On Jan 11, 2017 · Add Comment · In Broadband , communication , Esssay , Net Neutrality ..."
"... The bottom line: evenings require far greater capacity than other times of the day. If capacity is not adequate, it can manifest as a bottleneck at many different points in a network-in its backbone, in its interconnection points, or in its last mile nodes. ..."
"... The use of tiers tends to grab attention in public discussion. ISPs segment their users. Higher tiers bring more bandwidth to a household. All else equal, households with higher tiers experience less congestion at peak moments. ..."
"... such firms (typically) find clever ways to pile on fees, and know how to stymie user complaints with a different type of phone tree that makes calls last 45 minutes. Even when users like the quality, the aggressive pricing practices tend to be quite irritating. ..."
"... Some observers have alleged that the biggest ISPs have created congestion issues at interconnection points for purposes of gaining negotiating leverage. These are serious charges, and a certain amount of skepticism is warranted for any broad charge that lacks specifics. ..."
"... Congestion is inevitable in a network with interlocking interests. When one part of the network has congestion, the rest of it catches a cold. ..."
"... More to the point, growth in demand for data should continue to stress network capacity into the foreseeable future. Since not all ISPs will invest aggressively in the presence of congestion, some amount of congestion is inevitable. So, too, is a certain amount of irritation. ..."
Congestion on the Last Mile
Jan 11, 2017
It has long been recognized that networked services
contain weak-link vulnerabilities. That is, the
performance of any frontier device depends on the
performance of every contributing component and
service. This column focuses on one such
phenomenon, which goes by the label "congestion."
No, this is not a new type of allergy, but, as
with a bacteria, many users want to avoid it,
especially advanced users of frontier network
services. Congestion arises when network
capacity does not provide adequate service during
heavy use. Congestion slows down data delivery
and erodes application performance, especially
for time-sensitive apps such as movies, online
videos, and interactive gaming.
Concerns about congestion are pervasive.
Embarrassing reports about broadband networks
with slow speeds highlight the role of
congestion. Regulatory disputes about data caps
and pricing tiers question whether these programs
limit the use of data in a useful way. Investment
analysts focus on the frequency of congestion as
a measure of a broadband network's quality.
What economic factors produce congestion?
Let's examine the root economic causes.
Congestion arises when demand for data exceeds
supply in a very specific sense.
Start with demand. To make this digestible,
let's confine our attention to US households in
an urban or suburban area, which produces the
majority of data traffic.
No simple generalization can characterize all
users and uses. The typical household today uses
data for a wide variety of purposes: email, video,
passive browsing, music videos, streaming of
movies, and e-commerce. Networks also interact
with a wide variety of end devices: PCs, tablets,
smartphones on local Wi-Fi, streaming to
television, home video alarm systems, remote
temperature control systems, and plenty more.
It is complicated, but two facts should be
foremost in this discussion. First, a high
fraction of traffic is video, anywhere from 60 to
80 percent, depending on the estimate. Second,
demand peaks at night. Most users want to do more
things after dinner, far more than any other time
during the day.
Every network operator knows that demand for
data will peak (predictably) between
approximately 7 p.m. and 11 p.m. Yes, it is
predictable. Every day of the week looks like
every other, albeit with steady growth over time
and with some occasional fluctuations for
holidays and weather. The weekends don't look any
different, by the way, except that the daytime
has a bit more demand than during the week.
The bottom line: evenings require far greater
capacity than other times of the day. If capacity
is not adequate, it can manifest as a bottleneck
at many different points in a network: in its
backbone, in its interconnection points, or in
its last mile nodes.
This is where engineering and economics can
become tricky to explain (and to manage).
Consider this metaphor (with apologies to network
engineers): Metaphorically speaking, network
congestion can resemble a bathtub backed up with
water. The water might fail to drain because
something is interfering with the mouth of the
drain or there is a clog far down the pipes. So,
too, congestion in a data network can arise from
inadequate capacity close to the household or
inadequate capacity somewhere in the
infrastructure supporting delivery of data.
Numerous features inside a network can be
responsible for congestion, and that shapes which
set of households experience congestion most
severely. Accordingly, numerous different
investments can alleviate the congestion in
specific places. A network could require a
"splitting of nodes" or a "larger pipe" to
support a content delivery network (CDN) or could
require "more ports at the point of
interconnection" between a particular backbone
provider and the network.
As it turns out, despite that complexity, we
live in an era in which bottlenecks arise most
often in the last mile, which ISPs build and
operate. That simplifies the economics: Once an
ISP builds and optimizes a network to meet
maximum local demand at peak hours, then that
same capacity will be able to meet lower demand
the rest of the day. Similarly, high capacity can
also address lower levels of peak demand on any
other day.
Think of the economics this way. An awesome
network, with extraordinary capacity optimized to
its users, will alleviate congestion at most
households on virtually every day of the week,
except the most extraordinary. Accordingly, as
the network becomes less than awesome with less
capacity, it will generate a number of
(predictable) days of peak demand with severe
congestion throughout the entire peak time period
at more households. The logic carries through:
the less awesome the network, the greater the
number of households who experience those moments
of severe congestion, and the greater the
That provides a way to translate many network
engineering benchmarks-such as the percentage of
packet loss. More packet loss correlates with
more congestion, and that corresponds with a
larger number of moments when some household
experiences poor service.
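The peak-hour argument above can be made concrete with a minimal sketch. The hourly demand figures below are invented purely for illustration (they are not from the essay), and the function names are hypothetical; the only point is that capacity sized for the evening peak covers every other hour, while anything less produces congestion concentrated in the evening.

```python
# Hypothetical hourly demand for one last-mile node, in Mbit/s.
# These numbers are illustrative assumptions, not measurements.
HOURLY_DEMAND = [
    120, 100, 90, 80, 80, 90,        # 00:00-05:59
    150, 200, 220, 230, 240, 250,    # 06:00-11:59
    260, 260, 270, 280, 320, 400,    # 12:00-17:59
    520, 800, 950, 1000, 900, 400,   # 18:00-23:59
]


def congested_hours(demand, capacity):
    """Return the hours (0-23) in which demand exceeds capacity."""
    return [hour for hour, load in enumerate(demand) if load > capacity]


def peak_to_average(demand):
    """Ratio of the busiest hour to the daily mean load."""
    return max(demand) / (sum(demand) / len(demand))


if __name__ == "__main__":
    # An "awesome" network (sized for the peak) versus two cheaper ones.
    for capacity in (1000, 850, 500):
        print(capacity, congested_hours(HOURLY_DEMAND, capacity))
    print("peak/average ratio:", round(peak_to_average(HOURLY_DEMAND), 2))
```

With these assumed numbers, lowering capacity first creates congestion in the late evening and then widens the congested window, which is the pattern the essay describes for a "less awesome" network.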
Tradeoffs and Externalities
Not all market participants react to
congestion in the same way. Let's first focus on
the gazillion Web firms that supply the content.
They watch this situation with a wary eye, and
it's no wonder. Many third-party services, such
as those streaming video, deliver a
higher-quality experience to users whose network
suffers less congestion.
Content providers invest to alleviate congestion.
Some invest in compression software and superior
webpage design, which loads in ways that speeds
up the user experience. Some buy CDN services to
speed delivery of their data. Some of the largest
content firms, such as YouTube, Google, Netflix,
and Facebook, build their own CDN services to
Next, focus on ISPs. They react with various
investment and pricing strategies. At one
extreme, some ISPs have chosen to save money by
investing conservatively, and they suffer the
complaints of users. At the other extreme, some
ISPs build a premium network, then charge premium
prices for the best services.
There are two good reasons for that variety.
First, ISPs differ in their rates of capital
investment. Partly this is due to investment
costs, which vary greatly with density,
topography, and local government relations. Rates
of investment tend to be inherited from long
histories, sometimes as a product of decisions
made many years ago, which accumulated over time.
These commitments can change, but generally
don't, because investors watch capital
commitments and react strongly to any departure
The second reason is more subtle. ISPs take
different approaches to raising revenue per
household, and this results in (effectively)
different relationships with banks and
stockholders, and, de facto, different budgets
for investment. Where does the difference in
revenue come from? For one, competitive
conditions and market power differ across
neighborhoods. In addition, ISPs use different
pricing strategies, taking substantially
different approaches to discounts, tiered pricing
structures, data cap policies, bundled contract
offerings, and nuisance fees.
The use of tiers tends to grab attention
in public discussion. ISPs segment their users.
Higher tiers bring more bandwidth to a household.
All else equal, households with higher tiers
experience less congestion at peak moments.
Investors like tiers because they
don't obligate ISPs to offer unlimited service
and, in the long run, raise revenue without
Users have a more mixed
reaction. Light users like the lower prices of
lower tiers, and appreciate saving money for
doing little other than email and static
browsing.
In contrast, heavy users perceive that they
pay extra to receive the bandwidth that the ISP
used to supply as a default.
ISPs cannot win for losing. The archetypical
conservative ISP invests adequately to relieve
congestion some of the time, but not all of the
time. Its management then must face the
occasional phone calls of its users, which they
stymie with phone trees that make service calls
last 45 minutes. Even if users like the low
prices, they find the service and reliability
The archetypical aggressive ISP, in contrast,
achieves a high-quality network, which relieves
severe congestion much of the time. Yet, such
firms (typically) find clever ways to pile on
fees, and know how to stymie user complaints with
a different type of phone tree that makes calls
last 45 minutes. Even when users like the
quality, the aggressive pricing practices tend to
be quite irritating.
One last note: It is a complicated situation
where ISPs interconnect with content providers.
Multiple parties must invest, and the situations
involve many supplier interests and strategic
Some observers have alleged that the biggest
ISPs have created congestion issues at
interconnection points for purposes of gaining
negotiating leverage. These are serious charges,
and a certain amount of skepticism is warranted
for any broad charge that lacks specifics.
Somebody ought to do a sober and detailed
investigation to confront those theories with
evidence. (I am just saying.)
What does basic economics tell us about
congestion? Congestion is inevitable in a network
with interlocking interests. When one part of the
network has congestion, the rest of it catches a
cold.
More to the point, growth in demand for data
should continue to stress network capacity into
the foreseeable future. Since not all ISPs will
invest aggressively in the presence of
congestion, some amount of congestion is
inevitable. So, too, is a certain amount of
irritation.
Copyright held by IEEE.
"... The 'Cloud' isn't magic, the 'Cloud' isn't fail-proof, the 'Cloud' requires hardware, software, networking, security, support and execution – just like anything else. ..."
"... Putting all of your eggs in one cloud, so to speak, no matter how much redundancy they say they have seems to be short-sighted in my opinion. ..."
"... you need to assume that all vendors will eventually have an issue like this that affects your overall uptime, brand and churn rate. ..."
"... Amazon's downtime is stratospherically high, and their prices are spectacularly inflated. Their ping times are terrible and they offer little that anyone else doesn't offer. Anyone holding them up as a good solution without an explanation has no idea what they're talking about. ..."
"... Nobody who has even a rudimentary best-practice hosting setup has been affected by the Amazon outage in any way other than a speed hit as their resources shift to a secondary center. ..."
"... Stop following the new-media goons around. They don't know what they're doing. There's a reason they're down twice a month and making excuses. ..."
"... Personally, I do not use a server for "mission critical" applications that I cannot physically kick. Failing that, a knowledgeable SysAdmin that I can kick. ..."
Disaster Recovery needs to be a primary objective when planning and implementing any IT project,
outsourced or not. The 'Cloud' isn't magic, the 'Cloud' isn't fail-proof, the 'Cloud' requires
hardware, software, networking, security, support and execution – just like anything else.
All the fancy marketing speak, recommendations and free trials can't replace the need to do obsessive
due diligence before trusting any provider, no matter how big and awesome they may seem or what their
marketing department promises.
- Why do Data Centers have UPS and Diesel Generators on-site? They know electricity can
and does fail.
- Why do we buy servers with dual power supplies? We know they can and do fail.
- Why do we implement RAID? We know hard drives can and do fail.
Prepare for the worst, period.
Putting all of your eggs in one cloud, so to speak, no matter how much redundancy they say
they have, seems short-sighted in my opinion. If you rely on an MSP, HSP, CSP, IaaS,
SaaS, PaaS, et al. to attract, increase, or fulfill a large percentage (or all) of your
revenue, as many companies do nowadays, then you need to assume that all vendors will
eventually have an issue like this that affects your overall uptime, brand and churn rate. A
blip here and there is tolerable.
Amazon's downtime is stratospherically high, and their prices are spectacularly inflated.
Their ping times are terrible and they offer little that anyone else doesn't offer. Anyone holding
them up as a good solution without an explanation has no idea what they're talking about.
The same hosting platform, as always, is preferred: dedicated boxes at geographically disparate
and redundant locations, managed by different companies. That way when host 1 shits the bed, hosts
2 and 3 keep churning.
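The reasoning behind "hosts 2 and 3 keep churning" can be sketched as a simple availability calculation. This is a rough illustration, assuming that host failures are statistically independent (which is what separate locations and separate companies are meant to approximate, and which rarely holds perfectly in practice). The availability figures and function names are hypothetical.

```python
from math import prod


def combined_availability(availabilities):
    """Probability that at least one host is up, assuming independent failures."""
    return 1.0 - prod(1.0 - a for a in availabilities)


def expected_downtime_hours(availability, hours_per_year=8760):
    """Expected hours per year with no host available."""
    return (1.0 - availability) * hours_per_year


if __name__ == "__main__":
    single = 0.99                       # assumed availability of one host
    trio = combined_availability([single] * 3)
    print("one host   :", expected_downtime_hours(single), "hours/year down")
    print("three hosts:", expected_downtime_hours(trio), "hours/year down")
```

The sketch ignores failover time, shared dependencies such as DNS, and correlated outages, so real-world figures will be worse than what it prints.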
Nobody who has even a rudimentary best-practice hosting setup has been affected by the Amazon
outage in any way other than a speed hit as their resources shift to a secondary center.
Stop following the new-media goons around. They don't know what they're doing. There's a reason
they're down twice a month and making excuses.
Personally, I do not use a server for "mission critical" applications that I cannot physically
kick. Failing that, a knowledgeable SysAdmin that I can kick.
MAT is an easy-to-use, network-enabled UNIX configuration and monitoring tool. It provides
an integrated tool for many common system administration tasks, including backups and replication.
It includes a warning system for potential system problems, and graphing of many common system parameters.
Coming soon in version 0.24: an embedded interpreter that will let you monitor any parameter
you can write a script to capture. It will also make it possible to have
OS-specific configuration tools.
- Master System is
a public-domain Unix systems configuration tool written in Perl. The system is architecture and operating
system independent, but it can handle architecture and operating system dependent configuration.
It is designed to control the configuration of large groups of systems that are configured in the
same style, but are not necessarily identical. From a group at Rutgers University.
- Webmin is a free web-based admin interface
for Unix systems. Via a web browser, you can configure DNS, Apache, Samba, filesystems, startup scripts,
inetd, crontabs and more. Written in Perl5 and easily extendable. Supports several Linux versions
FAIR USE NOTICE This site contains
copyrighted material the use of which has not always been specifically
authorized by the copyright owner. We are making such material available
in our efforts to advance understanding of environmental, political,
human rights, economic, democracy, scientific, and social justice
issues, etc. We believe this constitutes a 'fair use' of any such
copyrighted material as provided for in section 107 of the US Copyright
Law. In accordance with Title 17 U.S.C. Section 107, the material on
this site is distributed without profit exclusively for research and educational purposes. If you wish to use
copyrighted material from this site for purposes of your own that go
beyond 'fair use', you must obtain permission from the copyright owner.
ABUSE: IPs or network segments from which we detect a stream of probes might be blocked for no
less than 90 days. Multiple types of probes increase this period.
Copyright © 1996-2016 by Dr. Nikolai Bezroukov. www.softpanorama.org
was created as a service to the UN Sustainable Development Networking Programme (SDNP)
in the author's free time. This document is an industrial compilation designed and created exclusively
for educational use and is distributed under the Softpanorama Content License.
Original materials copyright belong
to respective owners. Quotes are made for educational purposes only
in compliance with the fair use doctrine.
This is a Spartan WHYFF (We Help You For Free)
site written by people for whom English is not a native language. Grammar and spelling errors should
be expected. The site contains some broken links as it develops like a living tree...
The statements, views and opinions presented on this web page are those of the author (or
referenced source) and are
not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness
of the information provided or its fitness for any purpose.
Last modified: April 02, 2017