Posts tagged ‘Scalability’

June 13, 2011

Symantec.Cloud Review

I recently had the opportunity to attend a Symantec conference (Enterprise 2011). What attracted me most was the 45-minute session on cloud computing, which came towards the end of the half-day conference.

I waited with eager anticipation, as I was unaware of any of Symantec’s cloud offerings. Being an ex-Veritas File System developer with a long association with Symantec, I was naturally curious.

When the hour came, the presenter unveiled a set of Symantec products offered via the SaaS model (mostly concerned with data security and availability – full list here http://www.symantec.com/business/theme.jsp?themeid=symantec-cloud).

I was disappointed, to say the least, at this attempt to pass off SaaS as cloud computing. I was hoping to learn of some cool new storage/compute virtualization story (Symantec and VMware are buddies) tied together with utility computing, with security thrown into the mix. Alas, no such thing.

Something did not add up in my own mind after that presentation. I was not sure why I should be disappointed that “SaaS is not the same as Cloud Computing” (a similar debate is going on concerning Apple’s iCloud).

SaaS – Cloud Computing? It is and it isn’t.

It is difficult to find a precise and widely accepted definition of what cloud computing is and what it isn’t. However, there are some cloud computing guarantees (well, promises) that are generally well accepted. Some of them are –

  1. Availability – promises on service availability (e.g. five-nines availability)
  2. Scalability – promises on how well the service scales (horizontally and vertically)
  3. Utility Computing Based Billing – pay as you go

When someone mentions cloud computing, what comes to my mind is the cloud computing infrastructure: the hardware and operating system software, sans the (business logic) applications. Think of Google App Engine, Amazon EC2 or Microsoft’s Azure.

Applications written to run on such infrastructure are expected to exhibit availability and scalability properties.

It is interesting to note that Symantec promises 100% service availability and a guaranteed latency (from which you can draw inferences about scalability). I suppose that, from a service consumer’s point of view, this is all that matters.

It is anybody’s guess what kind of infrastructure powers Symantec.Cloud, and consumers should not be unduly concerned. They should instead focus on the service availability and response times, which are very good indeed.

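To put the availability promises mentioned above into perspective, here is a rough sketch that converts an availability percentage into a yearly downtime budget. The function name and the choice of a 365-day year are my own assumptions for illustration; they do not come from any Symantec SLA.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year (525,600 minutes)


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime (in minutes) an availability promise allows."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (100.0 - availability_percent) / 100.0


if __name__ == "__main__":
    # Compare a few common promises, ending with the 100% case.
    for promise in (99.0, 99.9, 99.999, 100.0):
        budget = downtime_minutes_per_year(promise)
        print(f"{promise}% availability allows {budget:.2f} minutes of downtime per year")
```

Five nines works out to a little over five minutes of downtime a year, while a 100% promise leaves no downtime budget at all.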
April 12, 2011

Looking at code performance

In any system whose performance is under consideration, there are three important timescales.

  1. The timescale at which the CPU works
  2. The timescale at which the Memory can be accessed
  3. The timescale at which data can be stored on disk

These three differ widely in a traditional computer setup. The CPU works at nanosecond resolution, main memory responds in tens to hundreds of nanoseconds, and disks respond in milliseconds.
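A back-of-the-envelope sketch makes the gap concrete. The latencies below are assumed round numbers chosen for illustration (a 1 ns CPU cycle, a 100 ns memory access, a 10 ms disk access), not measurements from any particular machine.

```python
# Assumed round-number latencies in nanoseconds (illustrative, not measured).
LATENCY_NS = {
    "cpu_cycle": 1,              # one cycle of a roughly 1 GHz core
    "memory_access": 100,        # one main-memory access
    "disk_access": 10_000_000,   # one 10 ms access on a rotating disk
}


def cycles_spent_waiting(operation: str) -> int:
    """CPU cycles that elapse while one operation of the given kind completes."""
    return LATENCY_NS[operation] // LATENCY_NS["cpu_cycle"]


if __name__ == "__main__":
    for operation in ("memory_access", "disk_access"):
        print(f"{operation}: {cycles_spent_waiting(operation):,} CPU cycles")
```

Under these assumptions a single disk access costs as much time as ten million CPU cycles, which is why a handful of IOs can swamp whatever the code itself is doing.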

This makes it difficult to estimate the true efficiency of code, because the rate at which instructions can be processed also depends on the rate at which memory is accessed and the disk is utilized. Hence, to truly measure code efficiency, one needs to make sure that disk response times do not play a major role in the flow of instructions. In other words, one should take care to remove the IO bottlenecks before analyzing the performance of code.

In the presence of a strong influence from disk IOs, the measurements will be biased by the characteristics of the storage system. The orders-of-magnitude differences between the response times of disk and memory, and of memory and CPU, make it even more difficult to remove the IO bottleneck. While CPUs have become faster in the last few years, and there are now CPUs with multiple cores, memory and disk access speeds haven’t kept pace. As a result, the contrast between the response times of CPU, memory and storage has widened.
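One simple way to check whether IO is dominating a measurement is to compare wall-clock time with CPU time for the same piece of code. The sketch below is only an illustration: `measure`, `waiting_fraction` and `simulated_io` are hypothetical helper names, and the `time.sleep` call merely stands in for a blocking disk read.

```python
import time


def measure(fn, *args, **kwargs):
    """Run fn and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall clock, includes time spent waiting
    cpu_start = time.process_time()    # CPU time of this process, excludes sleep
    result = fn(*args, **kwargs)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


def waiting_fraction(wall_seconds: float, cpu_seconds: float) -> float:
    """Fraction of wall-clock time during which the process was not on the CPU."""
    if wall_seconds <= 0:
        return 0.0
    return max(0.0, 1.0 - cpu_seconds / wall_seconds)


def simulated_io():
    """Hypothetical stand-in for a blocking disk read."""
    time.sleep(0.05)


if __name__ == "__main__":
    _, wall, cpu = measure(simulated_io)
    print(f"wall={wall:.3f}s cpu={cpu:.3f}s waiting={waiting_fraction(wall, cpu):.0%}")
```

If the waiting fraction is high, the numbers say more about the storage system than about the code, and the IO bottleneck should be dealt with first.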

April 9, 2011

3 simple steps to adopt cloud computing

Cloud computing is now synonymous with Flexible Provisioning and Scale. Find out below if you are taking full advantage of cloud computing.

The As Is deployment – lowest adoption cost, reasonable benefits:

Move the server application “as is” to a cloud server. This is nothing but a co-located server, at Amazon for example. The provisioning and maintenance of the application is still a self-driven task.

The win is in the dynamic, on-demand provisioning. It is easy to compute the ROI here. Let us say that your application needs to be available all year round, but caters to seasonal demand. Say it costs $400 per month to host your application with enough capacity for peak demand. You would end up paying 12*400 = $4800 per annum to keep your application up, and most of the time it would be underutilized. Cloud computing has made changing your compute capacity as easy as setting a reminder in your Outlook calendar. With Amazon or Google, you could just log into the admin panel and say that you need additional resources only on certain dates. At the end of the month you get billed for the amount of resources you actually consumed.
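The ROI arithmetic can be sketched as follows. Only the $400 peak-capacity figure comes from the example above; the off-peak cost of $100 per month and the three-month busy season are hypothetical numbers I have assumed to complete the comparison.

```python
PEAK_MONTHLY_COST = 400       # dollars, from the example above
OFF_PEAK_MONTHLY_COST = 100   # assumption: hypothetical cost of baseline capacity
PEAK_MONTHS = 3               # assumption: hypothetical length of the busy season


def fixed_annual_cost(peak_cost: int) -> int:
    """Cost of provisioning for peak demand all year round."""
    return 12 * peak_cost


def on_demand_annual_cost(peak_cost: int, off_peak_cost: int, peak_months: int) -> int:
    """Cost when peak capacity is paid for only during the busy months."""
    if not 0 <= peak_months <= 12:
        raise ValueError("peak_months must be between 0 and 12")
    return peak_months * peak_cost + (12 - peak_months) * off_peak_cost


if __name__ == "__main__":
    fixed = fixed_annual_cost(PEAK_MONTHLY_COST)
    on_demand = on_demand_annual_cost(PEAK_MONTHLY_COST, OFF_PEAK_MONTHLY_COST, PEAK_MONTHS)
    print(f"fixed: ${fixed}, on demand: ${on_demand}, saved: ${fixed - on_demand}")
```

With these assumed figures the on-demand bill is $2100 against $4800 for year-round peak capacity. The real saving depends entirely on how seasonal your load is and on your provider’s actual prices.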

The Managed RDBMS deployment – reasonably low adoption cost, reasonable benefits:

A lot of work has to be done to ensure that the application is available, i.e. you need a replication strategy and policy to keep the database available. This is still a lot of effort and money. The alternative is a managed RDBMS, where the provider (Amazon or Google) manages the database and worries about keeping the data safe from loss. It is much harder to compute the ROI here, as the time saved on managing the database has to be weighed against opportunity costs. Note that some code restructuring (not a lot) would be needed to get this going. An example of this is Amazon RDS for MySQL. At the time of writing, Google is yet to announce the availability of its hosted SQL service.

The Application Rewrite – highest adoption cost, highest benefits (arguably):

If your goal is to write an application which scales very well, then you should consider a complete application rewrite to take advantage of the storage APIs. A hosted RDBMS is still a single machine (or a cluster) running a database server, with bottlenecks, be it memory, CPU, network or disk.

Cloud computing offers storage APIs to access and manage data, which differ from the traditional methods of file or RDBMS storage. Because of the underlying architectural differences, a cloud datastore offers better scalability – http://labs.google.com/papers/bigtable.html.
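To give a feel for why a key-based storage API scales out more easily than a single database server, here is a toy sketch that spreads keys over several in-memory shards. It is a deliberate simplification and not how any real cloud datastore is implemented: `ShardedStore` is a hypothetical class, and it uses hash partitioning, whereas Bigtable partitions data by row ranges.

```python
import hashlib


class ShardedStore:
    """Toy key-value store that spreads keys over several in-memory shards."""

    def __init__(self, shard_count: int = 4):
        if shard_count < 1:
            raise ValueError("shard_count must be at least 1")
        # Each dict stands in for a separate storage node.
        self._shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key: str) -> dict:
        # A stable hash of the key decides which shard owns it.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._shards)
        return self._shards[index]

    def put(self, key: str, value) -> None:
        self._shard_for(key)[key] = value

    def get(self, key: str, default=None):
        return self._shard_for(key).get(key, default)

    def shard_sizes(self) -> list:
        return [len(shard) for shard in self._shards]


if __name__ == "__main__":
    store = ShardedStore(shard_count=4)
    for i in range(1000):
        store.put(f"user:{i}", i)
    print(store.get("user:42"), store.shard_sizes())
```

Each get or put touches exactly one shard, so adding shards adds capacity. The price is that queries spanning many keys, which an RDBMS handles with joins, have to be designed around the key layout.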