Entries in aws (12)

Monday
Sep052011

Slides from DevLink 2011

I had the privilege of speaking at DevLink 2011 a few weeks ago in Downtown Chattanooga, TN. I have been a bit OBE (overcome by events) since I left the conference and have been unable to post my slides until now. I hope to get the videos and other materials up in the coming week or so. If you came to one of these sessions – thanks – the attendance at both was great and I appreciated the questions from the audience.

 

Source code for GPGPU Talk

 

Source code for AWS Guest Book Demo

Source code for Azure Guest Book Demo

Monday
Jun062011

A Comparison of AWS and Azure

This past weekend at CodeStock, I gave a double-length session that was a side-by-side comparison of Amazon Web Services and Microsoft Windows Azure. The objective was to introduce the products, walk through the similarities and differences, and discuss where each offering fits a given need better than the other (or doesn’t).

The sessions were fairly well attended and we had some good conversations. The slides from both sessions are provided below and, if you attended, I’d appreciate it if you’d also take a minute to rate the sessions (button below) and provide feedback as to how they might be improved for the next time.

 

At the end of the second session, we walked through some code that demonstrated a guestbook on both the Amazon and Azure platforms. The source code bundles are available here:

Monday
Jun062011

Cloud Futures 2011–Scaling Document Clustering

I was honored to be able to give a talk at the Microsoft Research Cloud Futures 2011 conference this past week. I joined a number of other researchers and academics from around the world and discussed what folks were doing with the cloud, where issues remained, and where progress was being made. Having been in attendance at last year’s event, I was quite pleased to see the advancements both in logistics and content… the level of material was definitely stronger this year.

The talk I gave was on some early work we are doing in scaling a document clustering algorithm (Piranha) using cloud primitives. The slide deck is below and, if you attended the talk, I’d appreciate it if you’d take a minute to rate the talk using the button below.

Friday
May062011

Hands On with Amazon Web Services

[updated 6/1/2011 with embedded video]

I have the opportunity to talk at StirTrek today and wanted to make the slides available from today's session. I'll update this post a bit more following the session.

Friday
Feb112011

Moving Applications To The Cloud with Windows Azure

I just finished reading a book from the Microsoft Patterns & Practices group called Moving Applications to the Cloud on the Microsoft Windows Azure Platform. I’ve had the book for a few months, and when I first received it, I read the first chapter or two, decided it wasn’t worth the read, and set it aside.

Lately, however, I picked it up again, finished it, and am glad I did. Don’t get me wrong, it didn’t magically morph into a superb spectacle of literary greatness, but I did find that as I read further, the authors moved further from the very basics of the Windows Azure platform and the content became increasingly interesting.

If you are new (or relatively so) to the Windows Azure platform and contemplating the moving of existing applications to the cloud, this is a worthwhile discussion of a fictitious scenario that did just that. The scenario is slightly on the cheesy side, but realistic enough to help you think through issues you may be facing in your business.

If you are well experienced with the platform, you will likely find this a bit dry – especially the first portions. You’ll also likely be distracted or bothered by the not-so-covert marketing that takes place. That said, the book covers some more complex topics such as multiple tasks/threads sharing the same physical worker role, various optimization topics, and more. In the end, I’m glad I read it and feel that I learned some things from the book.

My last thought has nothing to do specifically with the book, but rather a growing frustration of mine with the Windows Azure platform – the design of the table storage platform. Upon reading books such as this I’m reminded (they stress it *many* times) how important your partition key/row key strategy is, and how literally hosed you are if you get it wrong. This compares with my recent experiences with Amazon’s SimpleDB product, and the delta couldn’t be more striking. Both platforms solve essentially the same problem, but in the case of SDB, it is effortless (at least by comparison). I don’t have to think of partition keys, or be overly concerned with how the underlying storage platform works… I just put data in it. Additionally, *every* column is indexed and performs reasonably under queries. I can’t shake the feeling that the Azure team is missing it here – there has to be a way to get a well-designed, horizontally scaling table structure without placing such a design burden on the users.
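To make the contrast concrete, here is a minimal in-memory sketch in Python. It is not the Azure table storage or SimpleDB API. The class names `PartitionedTable` and `IndexedDomain`, the month-based partition keys, and the guest-book fields are all hypothetical, chosen only for illustration. The sketch shows why a lookup on the key pair is cheap while a query on any other property has to examine every row, and what indexing every attribute buys you.

```python
from collections import defaultdict


class PartitionedTable:
    """Toy model of a table addressed by (partition key, row key)."""

    def __init__(self):
        # partition key -> {row key -> entity}
        self.partitions = defaultdict(dict)

    def insert(self, partition_key, row_key, entity):
        self.partitions[partition_key][row_key] = entity

    def get(self, partition_key, row_key):
        # Point lookup: goes straight to one partition, one row.
        return self.partitions[partition_key][row_key]

    def query(self, prop, value):
        # Non-key query: every row in every partition is examined.
        scanned, hits = 0, []
        for rows in self.partitions.values():
            for entity in rows.values():
                scanned += 1
                if entity.get(prop) == value:
                    hits.append(entity)
        return hits, scanned


class IndexedDomain:
    """Toy model of a store that indexes every attribute on write."""

    def __init__(self):
        self.items = {}
        # (attribute, value) -> set of item names
        self.index = defaultdict(set)

    def put(self, name, attributes):
        # Simplification: overwriting an item does not clean up old index entries.
        self.items[name] = attributes
        for attr, value in attributes.items():
            self.index[(attr, value)].add(name)

    def query(self, attr, value):
        # Any attribute can be looked up through the index, no scan needed.
        return [self.items[n] for n in sorted(self.index[(attr, value)])]


# Hypothetical guest-book entries: (partition key, row key, entity).
entries = [
    ("2011-02", "001", {"author": "alice", "city": "Knoxville"}),
    ("2011-02", "002", {"author": "bob", "city": "Chattanooga"}),
    ("2011-03", "001", {"author": "carol", "city": "Knoxville"}),
]

table = PartitionedTable()
domain = IndexedDomain()
for pk, rk, entity in entries:
    table.insert(pk, rk, entity)
    domain.put(f"{pk}/{rk}", entity)

by_key = table.get("2011-02", "002")
table_hits, rows_scanned = table.query("city", "Knoxville")
domain_hits = domain.query("city", "Knoxville")
```

In the partitioned model, the choice of keys decides which questions are cheap to ask; in the indexed model, that decision never has to be made up front. The real services differ in many details this toy ignores (limits, consistency, pricing), so treat it as a picture of the design burden rather than a benchmark.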

Thursday
Jan062011

Book Review: Host Your Web Site In The Cloud

Over the holiday break I spent some time getting ready for the cloud computing precompiler at CodeMash and, as part of that effort, I read Jeff Barr’s Host Your Web Site In The Cloud: Amazon Web Services Made Easy. This book is one of the few physical paper books I’ve gotten recently, and is unique to me in that it is the only book I have that is signed by the author.

That aside, I’d like to recommend this book to anyone who is looking at Amazon Web Services, or would consider themselves a beginner with AWS. I found the writing style to be very easy to read and, while I’m not a PHP developer, the code samples and walkthroughs were clear and simple to follow.

AWS is a fast-moving target and, even though Jeff is on the team, I’m certain it was difficult to get a book to market that wasn’t completely outdated by the time it hit the shelves. Still, I think he does a good job of addressing the basics and providing a foundation on which you can build your knowledge, and he even slips in a few notes regarding late-breaking updates (as of press time) such as EC2 instances being bootable from EBS.

In my mind, this book is similar to the Windows Azure Training Kit in that it gives you most everything you need to get your feet wet and get rolling with the technology, and provides you with a framework by which you can add to your skills.

Tuesday
Jan042011

Speaking at the CodeMash Precompiler

I’m thrilled to be speaking at the CodeMash Precompiler next week. I’m going to be joined by Mike Wood and helped by Brian Prince and Michael Collier. Together, we’ll have nearly 8 hours of instruction and hands-on labs covering both the Amazon and Microsoft cloud computing platforms. Below I’ve listed the abstracts for each of the sessions as well as the prerequisites for those planning on joining us. If you are going to be in Sandusky next Wednesday, be sure to drop by.

An Introduction to Amazon Web Services (half-day, afternoon)

AWS has been in the cloud computing space longer than most anyone, and they are the de facto standard when it comes to Infrastructure as a Service. While most developers are comfortable with the notion of virtual machines, reviewing the AWS offering can sometimes look like alphabet soup (EC2, S3, SNS, SDB, SQS). Join us to learn the power behind these acronyms and the tools that they can provide your next project. We'll discuss the major components, some of the trade-offs between different implementation choices (e.g. boot from S3 versus boot from EBS) and provide you with the opportunity to work through some labs, deploy some code, and begin to experience the Amazon cloud for yourself.

Examples are in .NET, but fundamental concepts apply to all platforms.

 

An Introduction to Windows Azure (half-day, morning)

Steve Ballmer has made it very clear that Microsoft is "all in" when it comes to the cloud and by now most have heard about Microsoft's Windows Azure platform... but what does that mean for you? Whether you are an experienced .NET developer who is wondering what all this cloud stuff means for how you write code, or a traditional *nix developer looking to understand how to integrate your existing code with the Microsoft version of the cloud, join us for an in-depth discussion on what Platform as a Service is, how Microsoft has implemented it, what scenarios it best addresses, and a collection of hands-on labs to get you started.

Examples are in .NET, but fundamental concepts apply to all platforms.

 

Prerequisites

The sessions will be part presentation, part hands-on labs. While you aren't required to bring a laptop, you'll get much more out of the sessions if you have one to work through the labs on (though there may be some people willing to pair as well!). Please make sure to bring your power cord!

Here are the prerequisites to have loaded:

An Introduction to Windows Azure

· Operating Systems Supported: Windows 7 (Ultimate, Professional, and Enterprise Editions); Windows Server 2008; Windows Server 2008 R2; Windows Vista (Ultimate, Business, and Enterprise Editions) with either Service Pack 1 or Service Pack 2

· Microsoft Visual Studio 2010 (full version or the free trial).

· SQL Server 2005 Express Edition (or above) (this is usually installed with Visual Studio)

· Install the Windows Azure Tools for Microsoft Visual Studio (and some hotfixes)

· Install the AppFabric SDK

· Install the Windows Azure Platform Training Kit

An Introduction to Amazon Web Services

· Amazon AWS SDK for .NET

· Requires Microsoft .NET Framework 2.0 or later.

· Use the AWS SDK for .NET with any of the following Visual Studio editions:

o Microsoft Visual Studio 2008 Professional Edition or later

o Microsoft Visual C# 2008 Express Edition (free!)

o Microsoft Visual Web Developer 2008 Express Edition (free!)

You might be thinking, "Hey, wait a second! This is CodeMash, and you just listed all Microsoft tools there!" Just like CodeMash, both Windows Azure and Amazon AWS are happy to mix in multiple development stacks. Our labs and demos will be shown using Visual Studio, but don't let that stop you from following along or trying out the cloud platforms from your Mac, or using Java, PHP and Ruby on Windows. Below are links to other SDKs for each cloud platform. Please feel free to explore your options and load these SDKs or libraries up if you prefer them.

For Windows Azure

· Windows Azure SDK For Java

o AppFabric: http://www.jdotnetservices.com/

· Windows Azure SDK for PHP

o AppFabric: http://dotnetservicesphp.codeplex.com/

o and tools http://azurephptools.codeplex.com/

o and Companion http://www.interoperabilitybridges.com/projects/windows-azure-companion

o Oh, and some love for Eclipse via a plug in: http://www.windowsazure4e.org/

· Windows Azure AppFabric SDK For Ruby

For Amazon AWS

· AWS Java Developer Center

· AWS PHP Developer Center

· AWS Python Developer Center

· AWS Ruby Developer Center

Wednesday
Nov172010

Does Amazon’s Cluster Compute Platform Still Represent Cloud Computing?

I’m sitting at the airport in New Orleans, after having attended the first half of the ACM/IEEE 2010 Super Computing conference. This was the first time I have attended this conference, and it was certainly interesting to participate.

During the workshop I participated in on Sunday (Petascale Data Analytics on Clouds: Trends, Challenges, and Opportunities), there arose a conversation regarding the Amazon EC2 “cluster compute instances” and their having reached a spot on the Top 500 list. What surprised me, however, was not that they were mentioned (I actually expected them to receive more attention than they did), but that they were described as not being “real” cloud computing. The point was made that they represented some sort of special configuration that was done just for the tests and that the offering was somehow significantly different from what the general populace could acquire. The two primary individuals involved in the exchange have significant history in classic HPC and have at least a degree of “anti-cloud” bias, but I am responsible for helping influence the viewpoint of one of these folks, so I’ve been thinking a bit over the past few days about how to properly articulate the inaccuracies of the argument… and wondering if it really matters anyway.

Commodity Hardware – by this I mean that the platforms being utilized could be purchased/deployed by anyone… and, by “anyone”, I am thinking of a moderately skilled computer hobbyist. I’m referring, particularly, to the chip architectures, availability of the motherboards, etc. A quick glance at the specs for a given machine shows that anyone (with enough money) could easily assemble a similarly configured machine. It is simply a quad-core Intel box with 24 GB of RAM and roughly 2 TB of disk. One might argue that the newly announced Cluster GPU Instance is specialized hardware, but then again, anyone with an extra $2,700 to spare could add one of these GPUs to their machine. The point is that machines in this class are in the $5K range, not the $50K or $500K range.

Commodity Networking – now to some of you, 10GBE non-blocking networks might seem specialized or exotic, but, at least in the HPC realm, they aren’t. Most serious HPC platforms utilize a network technology called InfiniBand (usually QDR) or something fancier and more expensive, such as an IBM custom interconnect or Cray’s Gemini. A quick search shows one could purchase 10GBE switches starting in the $2-3K range and going up from there, whereas IB QDR switches are at least double that.

Broad Availability – this point gets a little stickier. The point is that anyone can get access to CCI nodes at any point, simply by using a credit card and visiting the AWS website. However, getting access to 880 of them (the number used in the Top 500 run) is likely to be more difficult. The reason is not an unwillingness on Amazon’s part to provide this (I’m sure, given the proper commitment, this would not be impossible), but rather a question of economics and scale. Their more “general” nodes have a large demand and use case… the scale of demand for CCI nodes is yet to be established, although I’d imagine the sweet spot for these customers is in the 16-64 node range… folks who could really use a cluster some of the time, but certainly don’t need it all of the time. As such, I (and I have no inside knowledge of their supply/demand change) don’t imagine that the demand is currently so large that, beyond the currently active nodes, they have ~1000 nodes of this instance type sitting around just waiting for you to request them (this will likely change as demand grows).

Inexpensive + Utility-style Pricing – This is one area where this instance type represents all of the goodness we have become accustomed to in the cloud computing world. These nodes (remember I listed the above as starting around $5K) are available at $1.60/hour ($2.10/hour for the GPU-enabled nodes). This makes a significant computing platform available to almost anyone. For just over $100/hour, you can have a reasonably well-powered 64-node cluster on which to run your experiments… that is disruptive in my opinion. The best part is that this price is the worst-case scenario, meaning this is the price with no special arrangement, or reservation, or volume discount, or anything. It represents no long-term commitment… nothing beyond a commitment for the current hour.
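The arithmetic behind those figures is easy to check. The short Python sketch below uses only the numbers quoted in this post ($1.60 and $2.10 per node-hour, and the rough $5K purchase price per node). The helper names are mine, and the break-even figure is a deliberate simplification that ignores power, cooling, networking, and administration on the purchase side.

```python
# Figures quoted in the post (late 2010 on-demand prices).
CCI_RATE = 1.60        # USD per node-hour, cluster compute instance
GPU_RATE = 2.10        # USD per node-hour, cluster GPU instance
NODE_PURCHASE = 5000   # USD, the rough "5K" purchase price used above


def cluster_cost(nodes, hours, rate):
    """On-demand cost of running `nodes` instances for `hours` hours."""
    return nodes * hours * rate


def break_even_hours(purchase_price, rate):
    """Rental hours that cost as much as buying one node outright."""
    return purchase_price / rate


cci_hourly = cluster_cost(64, 1, CCI_RATE)
gpu_hourly = cluster_cost(64, 1, GPU_RATE)
hours_to_match = break_even_hours(NODE_PURCHASE, CCI_RATE)

print(f"64 CCI nodes: ${cci_hourly:.2f}/hour")
print(f"64 GPU nodes: ${gpu_hourly:.2f}/hour")
print(f"Renting one CCI node matches a ${NODE_PURCHASE} purchase "
      f"after {hours_to_match:.0f} hours (~{hours_to_match / 24:.0f} days)")
```

A 64-node cluster comes to $102.40/hour at the standard rate and $134.40/hour with GPUs, and a single rented node only catches up with the hardware price after 3,125 hours, or about 130 days of round-the-clock use. For someone who needs a cluster some of the time, that gap is the whole argument.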

So… what is different? – I have spent the majority of this post explaining how I think that these instance types are similar in many ways to other IaaS offerings and thereby deserve categorization as “regular” cloud computing, but that raises the question: what is unique about these nodes that would cause Amazon to promote them as better for HPC workloads? What facts formed the foundation for these rather experienced HPC experts to classify them as different? In my mind, there are really only two or three things here. The first is the networking: rather than being connected to a shared 1GBE network, you are given access to a 10GBE network, and guaranteed full bisection bandwidth node-to-node. It is this fact alone that makes the platform so interesting to the HPC folks, as it makes it actually viable for network-heavy applications (think traditional MPI apps). Secondly, you have clear visibility into the hardware. Amazon tells you exactly what type of processors you are running on, allowing you to optimize your codes for that particular processor (somewhat common in the HPC realm). Tightly coupled with this fact is that you can’t get a “part” of this instance type. You get the entire node (less the hypervisor) and, as such, are not contending with any other customers for node-local resources (RAM, ephemeral disks, network, etc.). Finally, the fact that you can get nodes that have specialized hardware (NVIDIA GPUs) is unique… there are very few cloud providers currently offering this sort of feature set.

In the end, I think the Amazon offerings are very much representative of the “cloud” and, particularly, of where the cloud is going. I think we will continue to see a broad level of homogeneity (basic hardware abstractions) with comparatively small pockets of broad-domain specific assets. The key being that for a large number of researchers, the offerings announced by Amazon this summer (and additionally this week) make the decision as to whether or not to buy that new departmental cluster much more difficult – especially when a true TCO analysis is performed. These are similar to the arguments and justifications for “normal” cloud compute scenarios and as such, should be considered one and the same.

Wednesday
Jul072010

Amazon Web Services for the .NET Developer

I spoke at CodeStock (http://codestock.org) a few weeks ago and one of my talks was focused on AWS from the perspective of the .NET developer. The slides are available here:

 

And the video of the session is available here:

Tuesday
May182010

You Still Have to Plan and Understand Your Toolset

I just finished reading an article (http://searchcloudcomputing.techtarget.com/news/article/0,289142,sid201_gci1512394,00.html) discussing some of the power issues and related outages at one of Amazon’s (http://aws.amazon.com) data centers last week. While much of the article was fine and factual, I take a bit of issue with the way the article wraps up:

“Users may not like being told they should fend for themselves on disaster preparedness, but that appears to be part of the price for getting everything else AWS offers.”

This highlights a sentiment that is unfortunately pervasive within the community of those evaluating or adopting cloud computing – that of believing that cloud computing is a panacea for all scale and datacenter problems.

What the users of these platforms need to understand is that they are toolkits. While the various cloud computing vendors provide important services and features, the consumer of said platforms must do their homework to understand the technical tradeoffs of various decisions so that they can appropriately reap the benefits of the selected platform. Simply uploading your code/application and expecting it to be always available is unrealistic. The consumer must understand what high availability features are offered by their particular cloud vendor and exploit those features to ensure that their app has the appropriate availability. In the case of the Amazon outage(s), if users had followed the high-availability guidelines provided by Amazon, they would not have experienced any outage at all. Cloud providers such as Microsoft, Amazon, and others provide the notion of availability zones, or regions, and – much like you would if you were hosting the app yourself – you need to distribute your application across such to ensure that a failure in one location doesn’t mean a complete outage for your application.
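As a rough illustration of why spreading across zones matters, here is a small Python sketch. It calls no cloud API. The zone names, the instance count, and the round-robin placement are assumptions made for the example, and real deployments also have to consider load balancing, data replication, and spare capacity in the surviving zones.

```python
from collections import Counter
from itertools import cycle


def place_instances(count, zones):
    """Assign `count` instances to zones in round-robin order."""
    zone_cycle = cycle(zones)
    return [next(zone_cycle) for _ in range(count)]


def surviving_capacity(placement, failed_zone):
    """Fraction of instances still running if one zone goes dark."""
    alive = sum(1 for zone in placement if zone != failed_zone)
    return alive / len(placement)


# Hypothetical zone names, for illustration only.
zones = ["zone-a", "zone-b", "zone-c"]

single_zone = place_instances(6, zones[:1])   # everything in zone-a
spread = place_instances(6, zones)            # two instances per zone

per_zone = Counter(spread)
single_after_failure = surviving_capacity(single_zone, "zone-a")
spread_after_failure = surviving_capacity(spread, "zone-a")

print(f"single-zone deployment keeps {single_after_failure:.0%} of capacity")
print(f"three-zone deployment keeps {spread_after_failure:.0%} of capacity")
```

With everything in one zone, losing that zone takes the application down entirely; spread evenly over three, the same failure leaves two thirds of the instances running. The platform offers the zones, but the placement decision is yours.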

Rather than a magic wand that solves all scaling and availability issues, cloud computing provides a democratized toolset that informed consumers can use to develop a highly available, scalable, and fault-tolerant application. The key word here is “democratized”, meaning these features are available to anyone, at a fraction of the cost of doing it yourself. I experience similar frustration when reading complaints from folks about the pricing of Windows Azure (e.g. “Why can’t I host my simple website there for $10/month?”). The question illustrates that the inquirer doesn’t understand the fundamental architecture of the platform (both how it works, and what its primary use cases are). Neither Amazon’s EC2 nor Windows Azure is designed to compete with a low-cost web hoster… rather, they are designed to provide the tools for a company that needs features not available from a low-cost hoster, but doesn’t have (or wish to spend) the capital to build those features itself.

They are great platforms that provide you the ability to build a very solid offering, but you have to understand how to properly utilize those features. Cloud computing should not be approached with ignorance, or with any less planning than you would apply if you were building out the infrastructure yourself (of course, the level of detail will differ).