Perl Catalyst and Cloud Computing
Submitted by editor on Tue, 07/22/2008 - 17:14.
Eight years ago I was sitting in a focus group sponsored by IBM, which was interested in seeing the reaction to a potential product offering it called "Utility Computing". The pitch was a solution to a classic problem faced by all administrators of IT infrastructure: finding the correct balance between anticipating growth and making good use of existing equipment when planning hardware procurement. In the classic data center model, a company would manage expensive stacks of hardware, much of which would be underutilized because it costs less to have extra equipment than to run the risk of not being able to scale. However, few companies have the unlimited budget required to plan for every possible IT contingency. IBM's basic idea was that your computing resources could be provided to you from an external utility service, in the same way as electricity and water, and if you needed more you could simply "turn the tap". This could save a company a lot of money, since you'd only pay for what you used. Additionally, you'd reduce the risk associated with failing to plan correctly for increases in a company's IT requirements.
The concept of Utility Computing was well received by our group. Most of us could see the benefits. Objections included some fear over allowing important company assets, such as proprietary information stored in our data warehouses, to reside outside company walls on the equipment of a third party, possibly even on hardware shared by competitors. For some of us in regulated industries there were legal concerns about data security. In the end, the consensus around the table was that such a service could find a home in our respective infrastructures, but it wouldn't replace them.
The Utility Computing concept IBM showed me was primarily targeted at large companies that were trying to reduce the risk and costs associated with maintaining large data centers. In contrast, Cloud Computing services, such as Amazon's EC2 (http://aws.amazon.com/ec2), are accessible even to small companies or individuals operating on a shoestring budget. Anyone can create an account from their web browser. This opens up all sorts of possibilities. It can allow a startup to scale quickly without having to incur the expense of purchasing or leasing expensive equipment. It can also allow an established company to purchase extra computing power when it needs a place to experiment but doesn't have available internal resources. Lastly, the availability of prebuilt 'images' for most Cloud Computing providers can allow you to very quickly roll out a new service to your clients, as long as a server image for your need exists.
Recently I had the chance to speak with Frank Speiser, who is leading several Cloud Computing projects that should be of interest to the Perl Community and to Catalyst developers in particular.
Q. What is Cloud Computing and why is it important to Perl?
Cloud computing is on-demand, virtualized computing. You don't need to own the hardware, or even see it, in order to run your software.
Most cloud computing providers offer the ability to pay as you go, and to buy computing power as you need it. In many cases, you configure a "machine image", which is a base OS install plus all of the stuff you'd like to run on it, and then freeze it for use whenever you spawn an instance of that machine image.
This is the natural outgrowth of "commodity computing". As a software owner, or software builder, what you really care about is access to computing power that is scalable and available. Cloud computing solves a lot of problems that have been a real issue for growing businesses or businesses with peak load times. You used to have to plan to carry enough hardware to cover your peak, plus some additional room for insurance. Now, you just have to be able to balance and scale your application, provided you have architected it in such a way as to allow your app to scale horizontally.
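The difference between carrying hardware for the peak and scaling with demand can be put into numbers. The following is a minimal sketch in Python, not anything from the interview: the load profile, the per-instance capacity and the 20% insurance margin are all made-up assumptions, and the function names are hypothetical.

```python
import math


def instances_needed(requests_per_sec, capacity_per_instance):
    """Smallest number of identical instances that can serve the load (at least 1)."""
    return max(1, math.ceil(requests_per_sec / capacity_per_instance))


def instance_hours(hourly_load, capacity_per_instance, headroom_percent=0):
    """Compare fixed peak provisioning with hour-by-hour scaling.

    Returns (fixed_hours, on_demand_hours) for the given load profile.
    """
    peak = max(hourly_load)
    # Classic model: size for the peak plus some insurance, and run it all day.
    padded_peak = peak * (100 + headroom_percent) / 100
    fixed_count = instances_needed(padded_peak, capacity_per_instance)
    fixed_hours = fixed_count * len(hourly_load)
    # Cloud model: run only what each hour's load requires.
    on_demand_hours = sum(
        instances_needed(load, capacity_per_instance) for load in hourly_load
    )
    return fixed_hours, on_demand_hours


# Hypothetical 24-hour load profile (requests/sec) with a four-hour peak.
load = [50] * 8 + [200] * 8 + [900] * 4 + [200] * 4
fixed, on_demand = instance_hours(load, capacity_per_instance=100, headroom_percent=20)
print(fixed, on_demand)  # 264 68
```

With these invented numbers, sizing for the peak costs 264 instance-hours per day against 68 when capacity follows demand. The real ratio depends entirely on how spiky the load is and on whether the application can actually be spread across identical instances.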
If you properly design your applications so that you can scale them by balancing / routing requests, you can build machine images or application profiles of what you need, and as demand for them increases, you can scale horizontally. There are Facebook-app-specific cloud computing instances out there, since those apps are self-contained in scope by nature, which lends itself very well to provisioning only when demand warrants it, but large organizations can take advantage of this type of thing too. For instance, why would you run a machine that costs $3-8 per hour with power and maintenance factored in, when you can run a machine for 1/10th to 1/20th of that -- so long as the stability for such a thing has been demonstrated? If there's a way to mitigate that risk, it's almost your duty as an IT Manager to make sure those cost savings are realized. Companies like GoGrid, Joyent, Amazon and Terremark all operate in this space, in slightly different ways. The opportunity to cut your costs is out there, though.
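The figures quoted above can be worked through directly. This is a small Python sketch of the arithmetic only: the $3-8 per hour and the 1/10 to 1/20 ratio come from the answer above, while the assumption that the machine runs around the clock for a full year is added here purely for illustration.

```python
from fractions import Fraction

# Figures quoted in the interview: an owned machine costs $3-8 per hour
# (power and maintenance included); a cloud machine costs 1/10 to 1/20 of that.
OWNED_HOURLY = (Fraction(3), Fraction(8))
CLOUD_FRACTION = (Fraction(1, 20), Fraction(1, 10))


def cloud_hourly_range():
    """Return the (lowest, highest) implied cloud cost per hour in dollars."""
    return OWNED_HOURLY[0] * CLOUD_FRACTION[0], OWNED_HOURLY[1] * CLOUD_FRACTION[1]


def yearly_savings_range(hours_per_year=24 * 365):
    """Return (smallest, largest) savings per machine-year in dollars.

    Assumption (not from the interview): the machine runs every hour of the year.
    """
    smallest = OWNED_HOURLY[0] * (1 - CLOUD_FRACTION[1]) * hours_per_year
    largest = OWNED_HOURLY[1] * (1 - CLOUD_FRACTION[0]) * hours_per_year
    return smallest, largest


low, high = cloud_hourly_range()
print(f"cloud cost per hour: ${float(low):.2f} to ${float(high):.2f}")
# cloud cost per hour: $0.15 to $0.80
```

Under the always-on assumption, `yearly_savings_range()` works out to between $23,652 and $66,576 per machine per year. A machine that only runs during peak hours would save proportionally less in absolute terms.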
Q. What Cloud Computing related projects exist for Perl developers?
Currently, there are only two Perl projects relating specifically to cloud computing of which I am aware. There's Jeff Kim's great work on Net::Amazon::EC2 (http://search.cpan.org/dist/Net-Amazon-EC2), which allows for the management of Amazon's virtualized services through a Perl API, and then there's the Net::Cloud (link forthcoming) project. I registered that namespace about two years ago, but back then there was only really one viable cloud computing option, and it was a closed Beta, which I was fortunate enough to get into with a recommendation from Ruv Cohen over at Enomaly (a provider of cloud management software). The project is due for its initial release on CPAN within the next few weeks. I'm working to roll in as many initial cloud provider services as is feasible. Some of my work extends Jeff Kim’s Net::Amazon::EC2 modules and I have to give him a lot of credit because he was the first one to do a lot of this stuff, so when I needed to settle on a standard, I went with what Jeff had done.
Q. What is the best way for a Perl programmer to get started developing for the Cloud?
There's already a big resurgence in Perl around the Catalyst (http://www.catalystframework.org/) MVC framework, headed up by Matt Trout, and Moose (http://search.cpan.org/dist/Moose) by Stevan Little. I think people should give those a look, because they get you developing with a proper design pattern that is well suited for projects that organizations want completed RIGHT NOW. To that end, I went and set up two machine images over at Amazon's EC2 that have everything you need to get up and running with an MVC framework, in the time it takes you to register an account and read through the instructions.
The machine ids at Amazon are:
ami-bdbe5ad4 developer-tools/Debian-Etch_Catalyst_DBIC_TT.manifest.xml
ami-9fbe5af6 developer-tools/Fedora8-Catalyst_DBIC_TT.manifest.xml
You have one instance with Debian (Etch) and one with Fedora8. Take yer pick. The Debian instance includes svk and git, but aside from that, they’re pretty similar in terms of what you can do. I set them up in such a way that if you actually do something that gains traction, there’s source control there for you to call in the cavalry. Just remember to save your work somewhere, because if a cloud instance is terminated for any reason, you lose your unsaved work.
The effort to get up and running ON the cloud is minimal -- and it should be. Architecture and scaling, which has been like a black art for years, shouldn't be so difficult. There's plenty of room to be clever, so smart folks will always be employed in this field. There are better problems to solve. Porting existing apps and building Perl (http://www.perl.org/ ) extensions to make that possible are a great way to get paid while you use cloud services. There’s also a real need and opportunity for Perl development in many organizations to simplify architectural designs that grew out of dive-from-the-hand-grenade necessities of scaling to meet demand. In this economy, saving people money is not going to go out of style.
I remember the first time I saw Brad Fitzpatrick's slides on how he grew LiveJournal (http://www.livejournal.com/), and I remember thinking, "Wow, that is brilliant, but it is also crazy." He has a slide in there specifically intended for people not to faint, because that's how complicated it grew to be, managing all those devices. I mean, all that hardware was just amazing… AND IT WORKED! The fact that LiveJournal and Danga produced all that great stuff is a testament to the folks that worked there, and they were doing their thing well in advance of this whole emergence of cloud computing. But the fact is, Brad Fitzpatrick and his team aren't available everywhere, and the risk associated with trying to duplicate that would prohibit most places (in fact all but a few) from ever replicating that success.
Nowadays, you have cloud computing, Hadoop filesystem (http://hadoop.apache.org/), perlbal (http://danga.com/perlbal/) and memcached (http://www.danga.com/memcached/) (thanks Danga!), and either squid or commercial reverse proxy systems. A lot of these can both be moved to the cloud and be made more service-like. This is right up Perl’s alley. The more you can decouple the service from the hardware, the better off you’ll be in your organization. Unless your organization makes its money by selling hardware solutions coupled with software. In which case, grab a life raft now.
Having these images is a big time saver for people wanting to get started using Perl and Catalyst. Catalyst can take some time to install, particularly if you are limited to an older version of Perl. Here's a recent article with a step-by-step plan for getting started with EC2: http://www.onlamp.com/pub/a/onlamp/2008/05/13/creating-applications-with-amazon-ec2-and-s3.html
Additionally, in the future, look for documentation on getting started with the Catalyst EC2 images to be integrated into the core Catalyst manual. Or see the online documentation at Amazon: http://aws.amazon.com/ec2
Come back soon, when we will have a screencast video showing step by step how to get started using the EC2 Catalyst development machine images.
Q. What are the next steps for your Cloud Computing plans related to Perl? Where would you like to see Perl in the Cloud Computing space over the next 18 to 24 months?
The goal of this project is to allow for the basics, and build from there. You should be able to start, stop, load balance and cache across one or multiple clouds: think "DBI for cloud computing", with the various "condensers" for each cloud provider or virtualization API being analogous to the DBD effort that has proven a very successful and economical Perl implementation in many organizations. Perl is great at integration, and holding things together. As such, it makes an excellent choice for a transfer layer among computing services. Now that the basics of the project are in place and we’re about to roll out the initial release, there will be quite a few opportunities for Perl developers to get involved in contributing to cloud computing. There are a lot of brilliant Perl developers out there, and I think that they only get amped up about new and improving technologies. The potential of virtualized cloud computing is enough to do that, I’d hope. I am inviting anyone that wants to contribute to the project. There’s room enough for everyone, and this type of project is going to have a few different outgrowths. There’s a lot of room for involvement.
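The "DBI for cloud computing" idea is essentially a driver pattern: one uniform front end, with a small provider-specific backend behind it for each cloud. Here is a minimal sketch of that shape in Python. It is not the Net::Cloud API, which had not been released at the time of the interview; every class, method and provider name below is hypothetical, and the "provider" is an in-memory stand-in that never contacts a real service.

```python
from abc import ABC, abstractmethod


class CloudDriver(ABC):
    """Provider-specific backend, playing the role a DBD plays for DBI."""

    @abstractmethod
    def start(self, image_id):
        """Start an instance from a machine image and return its id."""

    @abstractmethod
    def stop(self, instance_id):
        """Stop a running instance."""


class InMemoryDriver(CloudDriver):
    """Stand-in provider that only tracks instances in a dict."""

    def __init__(self, name):
        self.name = name
        self.running = {}
        self._counter = 0

    def start(self, image_id):
        self._counter += 1
        instance_id = f"{self.name}-{self._counter}"
        self.running[instance_id] = image_id
        return instance_id

    def stop(self, instance_id):
        del self.running[instance_id]


class Cloud:
    """Uniform front end, playing the role DBI plays for database code."""

    _drivers = {}

    @classmethod
    def register(cls, scheme, factory):
        cls._drivers[scheme] = factory

    @classmethod
    def connect(cls, scheme):
        if scheme not in cls._drivers:
            raise ValueError(f"no driver registered for {scheme!r}")
        return cls._drivers[scheme]()


# Two hypothetical providers behind the same interface.
Cloud.register("provider-a", lambda: InMemoryDriver("a"))
Cloud.register("provider-b", lambda: InMemoryDriver("b"))

handles = [Cloud.connect(scheme) for scheme in ("provider-a", "provider-b")]
ids = [handle.start("example-image") for handle in handles]
print(ids)  # ['a-1', 'b-1']
```

The calling code never mentions a provider beyond the name passed to `connect`, which is the property that makes running across "one or multiple clouds" a matter of registering another driver.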
The first step is to make a very stable and portable interface to all cloud and virtualized environments. We want to enable people to get up and running on a cloud and demonstrate that it is real, it works, and is viable. The way to do this is to develop the gateway that allows people to make this happen, and make it as easy to use as possible. I think with this release of Net::Cloud, I have that covered.
From there, we will split the project into providing architectural support and portability, and then allowing access to a growing number of specialized instances across all the clouds. There will be caching-specific instances, genomic computation specific instances, and possibly neural networks, but it doesn’t need to be so academic, either. The idea is to provide a uniform interface layer for people to issue publicly available software as a service (if I can borrow a buzz phrase), and make that easy for developers to tie it all together. Although I plan to build a gateway REST-based service for access to one or multiple clouds, I don’t intend to keep it specifically Perl-centric. The idea here is to provide the bridge and standardize, hopefully letting the good languages (more specifically good developers) rise to the top.
Looking longer-term, I would like to see a lot of IPC stuff move to allow for slower but more scalable transport across the cloud, or multiple clouds for redundancy. I am not talking about simple "virtualization" in the sense that comes built into machines rolling out of the systems manufacturers. I mean actual "virtual process communication". Also, databases should be segmented to make full use of the cloud. Clustering, replication and anything else you can replace in a database using Hadoop or GFS will probably happen sooner or later, depending on how easy and reliable we developers make it to take the leap. I think that stuff is maybe two years out, but it’s coming. It won’t be long before that type of talk is not taboo among large organizations. My advice would be to start pitching the seeds of these ideas now, and build up acceptance in your organization for them.
For the non-Perl developers, the mentioned modules 'DBI' and 'DBD' refer to the database access framework used universally in the Perl community. DBI is similar to ODBC or JDBC, while DBDs are database-specific drivers. For more information see the documentation at: http://search.cpan.org/dist/DBI
Q. Can Perl's Cloud Computing projects do anything to assist people who are using other programming languages?
Absolutely. Like I alluded to in the last question, I’m looking at this as a way to extend many existing computational limitations. If you have an existing project, setting up cloud images and using a REST-based call to a server running Net::Cloud should help a lot of people doing things that don’t have much to do with Perl at all. The idea is to make it so that someone doesn’t need to master the ins and outs of Perl, Catalyst or Net::Cloud to use it.
I would love the help of other front-endian developers doing Ajax and UI work to help build some slick tools to go on top of this stuff, that we would ship with existing cloud management instances, or deploy on your local machine to get up and running, a
Q. What can the community do to help? What are the current communication channels (mailing list, IRC, etc.)?
When this release hits, register for some cloud services, and try it out. Start a few machine instances, stop them, and then pick one or two services which you’d want to test scale for. Maybe it’s encoding video formats, or managing sessions. If you make a plugin that works, abstract it and I’ll gladly include it in Net::Cloud and mention you in the release notes as a contributor.
Aside from that, I want to do something here that the Perl community has traditionally overlooked. I want to put a polished package on this, one that includes an easy-to-install, easy-to-use interface. I don’t care if that is “trivial”. I think the missing piece for wider co-operation and adoption here is just a cloud solution that looks good and is priced right. The fact that this release is free should satisfy at least one of those requirements.
Currently the best place for discussion is IRC: #net-cloud at irc.perl.org
Q. A little about yourself, your interest in Perl and what inspired you to start with Cloud computing. Also, some details about your role related to some of the discussed projects.
I have been interested in computing (and generally taking things apart to see what they do) since an early age. Sometimes I have even been able to put those things back together. This tendency sort of led me out of the business program in college, and into building web sites and doing commerce-based stuff. My first non-static website was built with hand-rolled Perl CGI, then I had a go with Java, then back to Perl. I’ve worked with PHP, Python, Ruby, and now back to Perl. I keep coming back to Perl because it seems to create resourceful bridges amongst diverse systems. This is a reflection of the people in the community as well as the language itself. I have never considered myself painted into a corner with Perl. It’s improvisational enough to do new things, but has voluntarily enforced conventions enough to make it work at scale. Sort of like a reflection of a healthy society, in a way. I do not think this was an accident on Larry Wall’s part, but that’s a topic for another discussion.
People say that bad Perl code is unreadable, but I’d say that code that doesn’t run is worse than code that doesn’t read well. There are varying degrees of cleverness, readability and elegance in all languages and it’s a balancing act. I think Perl and the community around it are full of intelligent and motivated people that are necessarily skeptical and inquisitive. Some code is great and some isn’t. There’s a guy, a legend basically, Paul Lindner at Hi5 right now, and he has a knack for taking a complex problem and solving it in a simple, solid-state way, and being pretty accommodating to anyone that has to read the code. That’s the kind of Perl we should all be shooting for. There’s another guy I met while in Boston.pm, named Ronald J. Kimball, and he writes some impressively well presented code as well. Some of these guys have moved up and moved on, because their skill set is so valuable, but I think the culture we have in place in the Perl community still breeds that type of meritocracy. Stevan Little and the Moose guys are an example of that. They're doing great stuff. The team at Shadowcat is doing great stuff as well, and the Moose, Catalyst and DBIx::Class communication channels are very active and helpful.
My involvement with this cloud initiative is that I am the author / maintainer of the Net::Cloud modules and namespace, and I have been using cloud computing services since Amazon made them available in 2006. I had a semi-successful Beowulf cluster cobbled together with some buddies back in maybe 2001 or so, so it has always been interesting to me to use and share excess computing power. I even ran that SETI app when it first came out, but I stopped when it produced no publicly reported aliens ;) .
Recently, I see how it has become impractical to have to sustain a static top-end of your computational ability. It’s expensive, and impractical, especially if you ever start trying to solve bigger problems or scale. To that end, I have started this project and I’m working with some great folks on getting it off the ground. I’m looking forward to building on this, because I think it is the next frontier in computing and connecting systems. There’s already a bunch of bright people involved with it, and there’s always something new to learn or to try while paying the bills, which is why most of us got into this profession to begin with.
Some additional resources:
· http://en.wikipedia.org/wiki/Cloud_computing
· http://www.utilitycomputing.com
Note: I followed up with IBM regarding their OnDemand services and how they were positioned against Cloud Computing services such as Amazon's EC2. I received the following email in response:
"Thank you for your feedback to the On Demand Community site. We are the volunteer site for IBM employees and retirees not the On Demand Business site. I did search on www.ibm.com and the link below provides information about On Demand Business. I hope this provides the information you need. "
http://www.ibm.com/Search/?q=ondemand&v=16&en=utf&lang=en&cc=us&Search=Search
However, I found that the search page that link returned was not very useful. I sent a request for additional information and will relate anything I receive in response.

