Building a Large-scale E-commerce Site with Apache and mod_perl

by Perrin Harkins
October 17, 2001

Table of Contents

  • Roll Your Own Application Server
  • Case Study: eToys.com
  • Apache::PerlRun to the Rescue
  • Planning the New Architecture
  • Surviving Christmas 2000
  • The Architecture
  • Proxy Servers
  • Application Servers
  • Search Servers
  • Load Balancing and Failover
  • Code Structure
  • Caching
  • Session Tracking
  • Security
  • Exception Handling
  • Templates
  • Controller Example
  • Performance Tuning
  • Trap: Nested Exceptions
  • Berkeley DB
  • Valuable Tools
  • An Open-Source Success Story

Common Myths

When it comes to building a large e-commerce Web site, everyone is full of advice. Developers will tell you that only a site built in C++ or Java (depending on which they prefer) can scale up to handle heavy traffic. Application server vendors will insist that you need a packaged all-in-one solution for the software. Hardware vendors will tell you that you need the top-of-the-line mega-machines to run a large site. This is a story about how we built a large e-commerce site using mainly open-source software and commodity hardware. We did it, and you can do it, too.

Perl Saves

Perl has long been the preferred language for developing CGI scripts. It combines supreme flexibility with rapid development. Programming Perl is still O'Reilly's top-selling technical book, and community support abounds. Lately, though, Perl has come under attack from certain quarters. Detractors claim that it's too slow for serious development work and that code written in Perl is too hard to maintain.

The mod_perl Apache module changes the whole performance picture for Perl. Embedding a Perl interpreter inside Apache provides performance equivalent to that of Java servlets and makes Perl an excellent choice for building large sites. With Perl's object-oriented features and some basic coding rules, you can build a codebase that is a pleasure to maintain, or at least no worse than one written in another language.

Roll Your Own Application Server

When you combine Apache, mod_perl and open-source code available from CPAN (the Comprehensive Perl Archive Network), you get a set of features equivalent to a commercial application server:

  • Session handling
  • Load balancing
  • Persistent database connections
  • Advanced HTML templating
  • Security

You also get some things you won't get from a commercial product, such as a direct line to the core development team through the appropriate mailing list and the ability to fix problems yourself instead of waiting for a patch. Moreover, each part of the system is under your control, so you are limited only by your team's abilities.

Case Study: eToys.com

When we first arrived at eToys in 1999, we found a situation that is probably familiar to many who have joined a growing Internet startup. The system was based on CGI scripts talking to a MySQL database. Static file serving and dynamic content generation shared resources on the same machines. The CGI code was largely written in a Perl4-ish style and was not as modular as it could have been, which was not surprising since most of it had been built as quickly as possible by a small team.

Our major task was to figure out how to get this system to scale large enough to handle the expected Christmas traffic. The toy business is all about seasonality, and the difference between the peak selling season and the rest of the year is enormous. The site had barely survived the previous Christmas, and the MySQL database didn't look like it could scale much further.

The call had already been made to switch to Oracle, and a DBA team was in place. We didn't have enough time to do a redesign of the software, so we had to scramble to put in place whatever performance improvements we could finish by Christmas.

Apache::PerlRun to the Rescue

Apache::PerlRun is a module that exists to smooth the transition between basic CGI and mod_perl. It emulates a CGI environment and provides some (but not all) of the performance benefits associated with code written for mod_perl. Using this module and the persistent database connections provided by Apache::DBI, we were able to do a basic port to mod_perl and Oracle in time for Christmas. Combined with some new hardware, this left us ready to face the Christmas rush.
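As a rough illustration of what such a port involves, the fragment below shows a typical mod_perl 1.x setup for running existing CGI scripts under Apache::PerlRun with Apache::DBI loaded. It is a minimal sketch, not the eToys configuration: the /cgi-bin/ URL and the filesystem path are assumptions chosen for the example.

```apache
# httpd.conf fragment for mod_perl 1.x (illustrative; paths are hypothetical)

# Load Apache::DBI before any module that uses DBI, so that
# DBI->connect calls are transparently served from a per-process
# pool of persistent connections.
PerlModule Apache::DBI
PerlModule Apache::PerlRun

Alias /cgi-bin/ /usr/local/apache/cgi-bin/

<Location /cgi-bin>
    # Hand requests to the embedded Perl interpreter instead of forking
    # a new process per request, as mod_cgi would.
    SetHandler perl-script
    PerlHandler Apache::PerlRun
    Options +ExecCGI
    # Let scripts keep printing their own CGI-style headers.
    PerlSendHeader On
</Location>
```

With a setup along these lines, the scripts themselves need few changes: each request still sees a CGI-like environment, but it no longer pays for starting a Perl interpreter or opening a new database connection.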

The peak traffic lasted for eight weeks, most of which were spent frantically fixing things or nervously waiting for something else to break. Nevertheless, we made it through. During that time, we collected the following statistics:


  • 60,000 to 70,000 sessions/hour
  • 800,000 page views/hour
  • 7,000 orders/hour

According to Media Metrix, we were the third most heavily trafficked e-commerce site, behind eBay and Amazon.

Planning the New Architecture

It was clear that we would need to do a redesign for 2000. We had reached the limits of the current system and needed to tackle some of the harder problems that we had been putting off.

Goals for the new system included moving away from offline page generation. The old system had been building HTML pages for each product and product category on the site in a batch job and dumping them out as static files. This was effective when we had a small database of products, since the static files gave such good performance. However, we had recently added a children's bookstore to the site, which increased the size of our product database by an order of magnitude and made the time required to generate every page prohibitive. We needed a strategy that would only require us to build pages that customers were actually interested in and would still provide solid performance.
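The general idea of building a page only when someone asks for it, and then reusing the result, can be sketched in a few lines of Perl. This is an illustration of the concept rather than the eToys code: the names build_product_page and get_product_page, the in-memory hash, and the five-minute lifetime are all assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical in-memory page cache: product id => { html, expires }.
my %page_cache;
my $builds = 0;    # counts how often a page is actually generated

# Stand-in for the expensive work (database queries plus templating).
sub build_product_page {
    my ($product_id) = @_;
    $builds++;
    return "<html><body>Product $product_id</body></html>";
}

# Return the page for a product, generating it only on the first
# request or after the cached copy has expired.
sub get_product_page {
    my ($product_id, $ttl_seconds) = @_;
    my $now   = time;
    my $entry = $page_cache{$product_id};

    if ($entry && $entry->{expires} > $now) {
        return $entry->{html};    # cache hit: no generation cost
    }

    my $html = build_product_page($product_id);
    $page_cache{$product_id} = {
        html    => $html,
        expires => $now + $ttl_seconds,
    };
    return $html;
}

# Three requests for the same product trigger a single build.
get_product_page(42, 300) for 1 .. 3;
print "pages built: $builds\n";    # prints "pages built: 1"
```

Products that nobody requests are never rendered, so the cost of page generation tracks customer interest instead of catalog size. The expiry time bounds how stale a cached page can become.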

We also wanted to redo the database schema for more flexibility and to structure the code in a more modular way, so that a team could share the development work without stepping on one another's toes. We knew that the new codebase would have to be flexible enough to support a continuously evolving set of features.

Not all of the team had significant experience with object-oriented Perl, so we brought in Randal Schwartz and Damian Conway to do training sessions with us. We created a set of coding standards, drafted a design and built our system.
