There are very, very different ways of architecting cloud infrastructure so that it can scale. Before deciding on a cloud strategy you should ask one big question: why does your infrastructure need to scale?
About five or six years ago scalability migrated from an architectural property to a widely discussed buzzword. I would occasionally be in a meeting where someone would ask if my application was "scalable." At times I was met with an incredulous stare when I asked them to elaborate on what they meant by "scalable." It should just scale! Scales! Like... er... fish skin?
Scalability is critical when designing your infrastructure and your applications, no doubt, but it is not a single property. What if I build an amazingly scalable Web application that can handle 10 million concurrent requests from users and could still double that amount in the blink of an eye... but my MySQL database quickly fills up with 1 billion user records? I may have amazing throughput and pass my JMeter load tests with flying colors, but if I don't prepare for the massive amounts of data my application might need to dig through in order to serve a query, all the HTTP connections in the world will just serve to pour salt into the wound.
There are options for scaling your data layer and vastly reducing query time, as mentioned earlier in our discussions about non-relational databases in the Cloud. You can even expand to 10,000-node clusters like Google has done with Dremel. However, unless you are ready to build a league of data centers filling the Atlantic Ocean, you may want to consider what your upper limits of scaling really are. Don't automatically assume that a 10,000-node cluster of servers means more power, and definitely be careful about adopting technology that your application may have difficulty using properly. Non-relational databases won't fit well if your application already relies on ORM tools for populating data, and sometimes you need to completely re-think how data is represented (as in Dremel) and move from simple rows of delimited data to nested columnar storage. It can make one's head explode. Mine already has twice this morning.
Cloud Computing infrastructure gives you a quintillion servers at your fingertips, but consider your diagonal scalability strategy first. Are we talking about syndicating or publishing content that is modified maybe 10-20 times a day but viewed thousands of times? Creating a series of read-only MySQL clones will work fabulously, even more so if your application caches data aggressively. Are you going to deal with a ton of transactions that need to be written to your database immediately? Architect your database layer using a sharding strategy that allows you to distribute the transactional load across multiple databases, even relational databases. Are you going to deal with massive amounts of data that will need immediate retrieval? Perhaps a non-relational database is a good idea, allowing you to distribute your data across multiple nodes and retrieve it by key. Need to perform full-text search? You may instead want to index your data in something like Lucene.
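To make the first two options a little more concrete, here is a minimal Python sketch of the routing decision an application might make: reads rotate across read-only clones, while writes go to a shard picked by hashing a key. Everything in it is an assumption made for illustration: the `QueryRouter` class, the shard and replica names, and the choice of MD5-modulo hashing are all hypothetical, and no real database connection is opened.

```python
import hashlib
import itertools


class QueryRouter:
    """Hypothetical router: picks a shard for writes and a replica for reads."""

    def __init__(self, shards, replicas):
        # Names only; a real application would hold connections or pools here.
        self.shards = list(shards)
        self._replica_cycle = itertools.cycle(list(replicas))

    def shard_for(self, key):
        """Map a key to one shard using a stable hash of the key."""
        # hashlib gives the same digest on every run and every machine,
        # unlike Python's built-in hash(), which is salted per process.
        digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def replica_for_read(self):
        """Return the next read-only clone in round-robin order."""
        return next(self._replica_cycle)


router = QueryRouter(
    shards=["shard0", "shard1", "shard2"],
    replicas=["replica0", "replica1"],
)

# The same key always lands on the same shard.
write_target = router.shard_for("user:42")

# Reads alternate between the clones.
read_targets = [router.replica_for_read() for _ in range(4)]
```

One caveat with this simple modulo scheme: changing the number of shards remaps most keys, so if you expect to add shards often, something like consistent hashing or a lookup table may be a better fit.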
I can appreciate the need to standardize an organization's database tools, especially in an enterprise, but it is always good to be open to new solutions for unique problems. When it comes to scaling an application no single technology fits all situations... you need to predict why scaling might need to occur, then open up a path to remove potential bottlenecks before everything grinds to a halt.