Entries in scalability (11)

Saturday
Jul092011

Scale Planning and the AKF Scale Cube

There are a lot of ways to draw diagrams for availability and scalability.  I use different ones for different purposes all the time.  However, when I was reading the Art of Scalability by AKF partners I ran across a nice compound diagram they call the AKF Scale Cube which helps simplify the explanation of the multi-dimensional nature of scalability issues in complex web application scenarios.  

I’ve been using this visualization model to help me explain how things fit together in both technical and business discussions.  It comes in very handy I must say.

Most recent I’ve been using it to describe a gnarly distributed application I’m working on for a client.  What follows is a generalized version of a functional use and some discussion of a compound view of ones I have created for clients of mine.
 


Some base-line definitions are in order if you haven’t read the book mentioned above.

X-Axis - Horizontal Scalability, Clones, Scale Out.  These are terms often associated with the X-Axis.  In the case of this graph, day you build a data processor then make 10 copies of it.  Well, that’s scaling to 10 units on the x-axis in this graph.  Depending on your application, this can help you increase your capacity; but not always!

Y-Axis - I’ve called this axis functional decomposition for a long time.  It can be thought of as breaking the application down from a monolithic single instance into discreet stand-alone parts.  I have some examples that are a bit of mix from various projects I’ve worked on in the past here.

Z-Axis - This is the tricky one for most folks.  This is what people might call sharding, partitioning, etc.  Keep in mind, I’m not only talking about a database here.  I’m talking about an entire complex multi-faceted distributed highly-available web application.

0,0,0 - The intersection of all three axis or 0,0,0.  This is what some would call an all-in-one server.  It’s often used for proof of concept for for launching without a care in the world for future growth needs.  There’s nothing wrong with it as long as you understand the limitations and technical debt associated with the approach.

Z-Axis Item Explanations

Client - Assuming this is a multi-tentant application you may want to shard your application by client such that each client or group of clients is assigned somehow to a specific cluster of nodes.

Geography - Assuming you’d like to have built in DRBC and your applciation is capable of surviving being split up into many pieces then you could end up sharding your application by data center and broader geographies such as city, state, country.

External Cloud - Using IaaS and PaaS resources outside of your own data centers.  For a refresher on IaaS and PaaS see the article I wrote, “Cloud Computing:  Get Your Head in the Clouds,” in 2008 that was heavily read over the years.

Internal Cloud - Using your own infrastructure resources BEHIND your own firewall to do whatever it is that your application does.  This doesn’t always have to be a cloud.  If you want to know what I think it takes to be a cloud then read the several articles I wrote over the years related to that topic.  I do set the bar pretty high though I’ve learned.

Purpose - You might want to simply partition along the Z-Axis by any generic purpose for various reasons.  I think of this a little bit like saying I want to put all widgets in data node 1 and all waggles in data node 2.  They’ll both fit in a single node but maybe I want to spread my risk around.  This one is a little nebulous but it can engender fun conversations about why things need to exist at all.

Shard Key - We see this all the time in traditional RDBMS style deployments and even in some of the newer tools in the NOSQL world.  It’s basically just some index of what node you put things one somewhere.  For those of you that had to deal with libraries before the internet you’ll remember the lovely card catalog.  It’s was nicely set up to help you figure out with shard of the library your book was close to.  Then, when you got there, good old dewey decimal system kicked in to take you the rest of the way.

Y-Axis Item Explanations

Data Processing - this could be some application that transforms data from one state to another.  For example, it might simply remove all the spaces in a document and replace them with dashes.  That’s a bit of a silly example, but just to make the point.

Data Aggregator - I’ve had to build project after project that needed one form or another of data aggregation.  So, just think of this as something that might consume and RSS feed and stick it in a database of some kind.

Distributed Calculation - I’ve been doing work and research with Map-Reduce, Actor Models, the Bulk Synchronous Parallel Model and more exotic instruments from past, present, and future.  This is simply something that does some kind of math or calculation of some sort.  For example, counting all the uses of the word onomatopoeia in 50TB of English essays by high school students across 100’s of of compute nodes.

Processor App - This is just a generic discreet application that processes something, like an API request for example.

Web App - This is an application, in my case, written in a modern MVC framework that has the job of interacting with web users and getting things done in the back-ground in various ways with various services.

Base Installation - I think of this as just shared code.  One of the developers I have been working with recently suggested that we extract a number of commonly used components from various application pieces on the Y-Axis and build a library of sorts.  Great suggestion in this case, so I stuck in on my general diagram to remind me in the future.

What’s interesting about all these conceptual applications is that if you create them correctly and with the correct architectural models that each item that lives on the X Axis will also be able to scale on the X and Z axis.  For example, you could have your web application running 5 X-Axis copies in 4 Y-Axis partitions; say external cloud, by client, purpose, and by geography per client.  So, you’d end up use four AWS Availability zones in 2 AWS locations running 96 application nodes in total.  Of course, your application has to be built correctly to take advantage of all of this distribution at every level.  But, that’s a topic for a later date I suppose.

In summary, this post was just to share some of my thinking around the use of a very nice visualization tool by the fine folks at AKF Partners.  So, a shout out to them for the nice tool and hopefully this helps people a bit understand how it can be used / thought of in a variety of ways.
Just remember, it's not one-size-fits all.  Your use, labels, and needs for such things will vary greatly depending on what you are trying to architect, develop, and deploy.
Tuesday
Feb012011

Being Cloud

What it takes to be cloud has and continues to change rapidly. Today, in my opinion, the following “features” are required for something to really be cloud. Each of these can be (and has been) written about at length all over the internet. But, it’s putting them together that creates the magic of the cloud.

Click to read more ...

Sunday
Jan302011

Scaling websockets Based Services - Getting Started

It looks like mid-2011 is the time that websockets should be adopted by all of the important browsers and available for common use.  You can track this here for yourself.

Bold statements like, “HTML5 Web Sockets provides a true standard that you can use to build scalable, real-time web applications.” are being flung around.  That was from an article called “HTML5 Web Sockets: A Quantum Leap in Scalability for the Web.“

That is a bold statement.  Like all things scalability related it’s certainly not automatic or free.  In order to take advantage of the scalability possibilities inherent in websockets one must do things correctly from the start and architect a system to do what is needed and even better, at scale.

So, I set out on a little journey this morning to figure out exactly what I would need to do if I wanted to build a scalable websockets service for the purpose of having lots of websocket clients connected doing meaningful things.

Even though there are amazing HTML5 and websocket demos abounding through the ‘net there is very little data about the server side to be found and we certainly can’t forget and that is the server-side.  What will serve all those websocket requests that start coming from browsers soon?  There has to be something on the other end that knows what to do with a websocket connection.

The current state of server support for websockets is pretty dismal.  It seems that clients our outstripping the servers at the moment.  Here is what I’ve found for some of the servers I regularly deal with:

Apache - Extend with pywebsockets to add websockets support to the server - http://code.google.com/p/pywebsocket/

nginx -

Tornado - This server can support websockets with the websockets module - https://github.com/finiteloop/tornado/blob/master/tornado/websocket.py

lighttpd - Extend w/ mod_websocket - https://github.com/nori0428/mod_websocket

Jetty - Jetty has websocket support since 7.01 - http://blogs.webtide.com/gregw/entry/jetty_websocket_server

node.js as a websocket server - https://github.com/ncr/node.ws.js

Kaazing - Claims to have a gateway/bridge server that will allow the use of websockets now.

No doubt I have missed things in this post.  This was just a getting started kind of post with some initial digging into what it’s going to take to be ready for a real-time, asynchronous, distributed web world otherwise knows as, “The Cloud."  More soon...

Wednesday
Nov102010

Scalability as a Discipline

Just a very quick post to point out a brief blog entry regarding the concept of scalability as a discipline and the scalability architect as an formal role.

AKF Partners posted a blog entry discussing this issue that is near and dear to my heart.  Of particular impact I thought was the concept that the Scalability Architect is also a teacher and has the purview to educate others about scalability concepts.  I couldn't agree more!

Take a moment and check out their post.

http://akfpartners.com/techblog/2010/11/09/scalability-as-a-discipline

 

Monday
Oct042010

Accountability vs. Responsibility in the Context of Organizational Scalability

I recently read a phrase in a book that has been clanging around in my head since I read it.  Since it was sticky I thought I would share here.  It said,

"You can delegate anything you would like, but you can never delegate the accountability for results." - Art of Scalability1

In my opinion, this means that with regards to scalability, if you are designing and managing an organization for growth it is crucial that the lines of not only responsibility are clear but also that the lines of accountability are clear.  The phrase, "The buck stops here," springs to mind.  Wherever the buck stops is where the accountability starts; a point often looked over by the delegators of responsibility.

Sources:

1 - "The Art of Scalability", Michael T Fischer

Tuesday
Sep282010

Building for the Cloud is Building for Scalability

Even today with all the hoorah about cloud computing there are very few applications that can really call themselves cloud computing applications outside of the sciences and even there, I suspect there aren't that many.  So, I began trying to piece together a list of the features that I think a cloud computing application must have.  Otherwise, it's not really a cloud computing application at all but perhaps on its way to being one or not at all.  What I realized quickly was that being a cloud application is often about three things. 

  • Scalability
  • Availability 
  • Costs 

Part of what prompted me to write this is that I have installed and scaled countless websites and applications over the years.  I've done Drupal, Wordpress, Clearspace, ModX, Expression Engine, Custom CMS applications, X Cart, Magento, Django, and every language you can think of from ruby to PHP to java to python.  Thing is, they all have a very serious problem and it baffels me that it doesn't appear to be addressed well yet.  Every single one of these applications was NOT built for the cloud.  They just do not scale well without serious work and complexity to make them scale.  I did not say they do not scale as in most cases, one way or another, they almost certainly can be made to scale.  For example, I've built Drupal sites that could handle 1.5 to 2.0 billion page views per month for example.  But, it was forced and very complex all in all.  

But, what is important to understand is that very, very few applications can natively and easily scale to what we call cloud scale or web scale as they are out of the box.  They simply cannot.  But, if you do want you application to be able to utilize all the benefits of cloud computing such as:

  • Automated provisioning and scale - Using tools like Chef and Puppet to code your infrastructure
  • Rapid Deployment - You should be able to deploy fast and often
  • Lower Cost to Market - You should be able to get up and running for less money in infrastructure than ever before (less capital expenditures on your balance sheet)
  • High server to admin ratios and reduced operational costs over time - You can do more with less and it's all about quality over quantity
  • Client satisfaction - Happy clients are clients that never even know your service grew 10x last quarter, they just know it works
  • Shareholder/Partner Satisfaction - sleep better, make more money, run a better business

Here are some items that I think are must-haves if you are going to build a true cloud computing application. They are in no particular order but all are important.  The take away from all this for me is that building something to scale is very much about building something for the cloud.

  • Be Distributed - It must be able to do whatever it does over a dispersed network of nodes(servers/services) nearly as easily as it does on a single server
  • Be Asynchronous - Think message based event driven frameworks
  • Be Monitorable (aka Be Instrumented) - Just like your developers should implement unit tests they should implement monitoring instrumentation at the application level
  • Be Self healing and Resiliant - The system should be able to handle faults and route around it when needed. A degraded state is often just fine so long as transaction can keep flowing to some degree.
  • Be multi-provider - Do not put all your eggs in one basket (see my post on configuration management and deployment automation w/ Chef for example)
  • Be multi-site - your site should be active-active to some degree across mutiple geographic locations.  There is no reason any longer not to do this.
  • Be able to Scale Out - Designing a distributed, stateless and asynchronous application will go a long way here
  • Be Stateless - This one is not always possible for many reasons it is worth doing or at least minimizing in clever ways, think hard about this one and don't abuse state.
  • Use the Cloud for your Infrasturucture - (note that I did not say virtualize only) - 'Nuff Said.
  • Be able to rollback anything - If you deploy it, you should be able to un-deploy it just as easy.  Oh database schema changes how I love thee.
  • Test, test, test - Make sure your test process is solid.
  • Be highly available (N+1) - If you need 5 servers to provide the capacity you need then you should have at least 6 servers at any given time. This is a simpler version of more complex capacity planning that you can do on a per server/service basis that's very, very important
  • Use the right technology for the job - If you need a relational database by all means use one. If you only need a key/value store then by all means do NOT use a relational database.  If you need data persistence to disk and complete data durablility then don't use memcached (not picking on memcached of course.. great tech there)  Use other technologies properly configured to give you what you need.  Put in non-technical terms, don't use a spoon to dig the foundation for a house.  That's just silly.
  • Build in failsafes (aka circuit breakers) - When something breaks like access from the application server to the database then the application should be able to handle it gracefully.  Build this in!

I'm sure this list can be tuned and improved.  I'm sure I will do so.  What I'm after is laying down the rules for building truly cloud native applications.  There has been some phenomenal work in this area over the years but it's only just beginning to be better understood and move more into the mainstream.  This is both frustrating and exciting!