With darcs, it's easy to push a limited number of patches, instead of everything in the branch. There are several ways to do this:

The default is interactively:

$ darcs push 

Tue Jan  1 22:33:33 EST 2008  Mark Stosberg <mark@stosberg.com>
  * RT#1234: new subsystem
Shall I push this patch? (1/1)  [ynWsfvpxdaqjk], or ? for help:     

Even if you've never used darcs before, it's easy to figure out what to do. You might press "?" for help:

How to use push:
y: push this patch
n: don't push it
w: wait and decide later, defaulting to no

s: don't push the rest of the changes to this file
f: push the rest of the changes to this file

v: view this patch in full
p: view this patch in full with pager
x: view a summary of this patch

d: push selected patches, skipping all the remaining patches
a: push all the remaining patches
q: cancel push

j: skip to next patch
k: back up to previous patch

?: show this help

I like to use "x" to review which files I'm about to push, or "p" to review the patch in a pager. I haven't yet found this kind of review built into git's push workflow. With darcs, it's also easy to push just one patch:

$ darcs push -p 'new subsystem'

It's also very useful to push all the patches related to a specific ticket or project:

$ darcs push -p 'RT#1234'

In darcs we call these spontaneous branches, because you get some of the benefits of a branch without actually doing anything to create one.

The related user experience with git push is worse in a couple of significant ways. First, I don't see a way to use the human-readable patch name. A separate step is required: reviewing 'git log' to find the SHA1 hash, which must then be copied and pasted, because a hash can't be remembered the way a human-friendly word or phrase can.
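That lookup step looks something like the following (a sketch; the --grep pattern assumes the ticket number appears somewhere in the commit message):

$ git log --pretty=oneline --grep='RT#1234'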

Armed with that, you're ready to start the four-step process that git requires to push specific patches: creating a new branch, 'cherry-picking' to it, doing the push, and then deleting the new branch:

$ git checkout --track -b <tmp local branch> origin/<remote branch>
$ git cherry-pick -x <sha1 refspec of commit from other (local or remote) branch>
$ git push origin <tmp local branch>
$ git branch -D <tmp local branch>

It's because of a number of annoyances like this that I continue to use darcs whenever I can, and git when I have to.

There's a lot of trash talk among professional web programmers about using vanilla CGI, like Stevan Little's recent comment "There is no excuse to still use vanilla CGI as it is simply just a waste of resources".

As an experienced professional website developer myself, I find that CGI has its place. First, let's recap what we're talking about.

In "vanilla CGI", a Perl interpreter is loaded each time a request is made, the code is compiled and run on the fly, the content is delivered, and the whole process shuts down again.

In FastCGI or mod_perl, the interpreter along with some compiled application code can remain persistent in memory between requests, so just the run-time effort needs to be done for each request.

For anything that's particularly high traffic, or for an application that's complex to load, persistence will be a win. Performance is dramatically improved because the interpreter is loaded once and the compile-time effort is not repeated for every request.
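To make the contrast concrete, here's a minimal sketch of the persistent model using the FCGI module (an illustrative example, not a production deployment): everything above the loop is loaded and compiled once, and only the loop body runs for each request.

use strict;
use warnings;
use FCGI;

# Loaded and compiled once; the process then stays resident in memory.
my $request = FCGI::Request();

# Each iteration of the loop handles one incoming request.
while ( $request->Accept() >= 0 ) {
    print "Content-Type: text/plain\r\n\r\n";
    print "Hello World\n";
}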

However, there are other considerations besides performance that still make vanilla CGI the better option for some applications, particularly on low to moderate traffic applications.

In case there's a lack of imagination about applications that may always be small or low traffic, here are some examples:

  • A company provides a few dozen sales reps a private area to access dynamically generated spreadsheets. As a private application with limited users, traffic is always low. Beyond this, the company has a tiny web presence.
  • A college has a customized online application to an off-campus program. Less than 100 people submit the application each year.
  • A small business needs a customized contact form and a custom event registration form. Each is expected to be used 10 times per day or less.
  • A university needs a public search engine for a database on a niche topic.

I have helped build and deploy applications like each of the above in vanilla CGI, with reasonable performance. There are many website needs out there which are unique enough to merit custom programming but are neither large, nor destined to become high traffic attractions.

This brings us to some specific benefits of vanilla CGI:

Hosting Availability. Googling for "CGI Hosting" today I get 166,000 results, but searching for "FastCGI hosting" I get just 225. Plain CGI hosting is far more widely available. If you just need to add one lightly used dynamic form to your site, do you want to switch hosts just for FastCGI support? For low traffic applications, a FastCGI site could also chew up memory while the persistent process sits idle, while the equivalent CGI system would free that memory completely between requests. A host can support more plain CGI customers (again, assuming low to moderate traffic), because their memory is constantly being freed up for other customers. A persistent FastCGI process has an ongoing memory cost for the host.

Persistence-specific bugs. Generally, well written code will run in a persistent environment with no problems. Still, there are a number of hang-ups you have to be aware of which aren't a concern in CGI scripts. Just look at the extensive amount of detail documented for people moving from vanilla CGI to mod_perl. It's not that it has to be hard or time consuming to write Perl for persistent environments, but there are clearly extra considerations.
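A classic example of such a hang-up is accidentally keeping state between requests. This hypothetical snippet is harmless as a CGI script, but buggy once the process is persistent:

# Recreated on every request under vanilla CGI, but kept alive across
# requests when the interpreter stays resident under mod_perl or FastCGI.
my @messages;

sub add_message {
    my ($text) = @_;
    push @messages, $text;   # silently accumulates data from earlier requests
}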

You can easily scale up, but not down. Using CGI::Application with its built-in support for CGI.pm and HTML::Template, you've got a great framework of lightweight components for building applications that will perform well in vanilla CGI. They will generally also work in a persistent environment without any changes. This is what I do myself. When a mod_perl project comes along, it's easy to scale up using Titanium with additional plugins to provide more features, with the same familiar framework of CGI::Application underneath. On the other hand, there's the choice to develop by default with a framework designed for a persistent environment, like Catalyst. I'm not aware of any of these heavier frameworks that have an option to scale down in a way that performs well in plain CGI while still retaining the same feel and features. Such a path is a commitment either to deploy smaller, low-traffic applications in a persistent environment as well, because the heavy framework requires it for decent performance, or to learn a second framework just for small apps. I'd rather use one framework that performs well in both CGI and persistent environments.
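For example, an application in this stack stays about this small, and the same code runs unchanged whether it's deployed as a CGI script or in a persistent environment (a sketch in the style of the benchmark scripts below; 'greeting.tmpl' is a hypothetical template file):

package Greeter;
use base 'CGI::Application';

sub setup {
    my $self = shift;
    $self->start_mode('greeting');
    $self->run_modes([qw/greeting/]);
}

sub greeting {
    my $self = shift;
    # load_tmpl() is CGI::Application's built-in HTML::Template loader.
    my $t = $self->load_tmpl('greeting.tmpl');   # hypothetical template file
    $t->param( name => scalar $self->query->param('name') || 'World' );
    return $t->output;
}

Greeter->new->run;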

In summary: Simplicity. Vanilla CGI is the simplest to code for and deploy. If performance is good enough, why not use the simpler option?

Update: it was pointed out that my examples of "low traffic" were all very low traffic. A more interesting question is what the upper limit for traffic is before vanilla CGI performance degrades. My recent benchmarks inform this. A "Hello World" CGI::Application project starts up in about 0.2 seconds, so, being very generous, it's reasonable to assume that a complete CGI script written with this framework could finish and shut down in under a second. That makes a rough upper bound about 1 CGI request per second. Busy websites regularly exceed this request rate, but I think it illustrates that vanilla CGI is a capable performer for many uses.


Many of the key organizations I deal with in my daily life now run Linux on the desktop. First, let's take it as given that I run it at home and at work, and my wife runs it, too. Many other organizations in Richmond, Indiana have switched over to Linux on the desktop as well:

  • My church has three computers: one for the pastor, one for the office manager, and one for the hardware recycling program. They all independently chose to run Linux. It's a popular choice in the congregation as well, with more than a dozen systems in use by members ranging from 4 years old to past 64.
  • My doctor, Kurt Ritchie, runs his business exclusively on Linux
  • My lawyer, Thomas Kemp, runs his law practice primarily on a Linux-based groupware solution now, and travels with a Linux laptop
  • My grocery store, The Clear Creek Coop, runs exclusively Linux on the desktop. They bought a Dell laptop with Ubuntu pre-installed.
  • My bike shop, Ike's Bikes, now runs exclusively Linux on the desktop.
  • A local high school, North Eastern, runs primarily Linux on the desktop, as part of a trend of over 20,000 Indiana students running Linux.
  • A local college, Earlham, features Linux labs
  • Local graduate schools, Earlham School of Religion and Bethany Seminary, also use and promote Linux on the desktop
  • A local computer store, System Solutions, has had a stack of Linux install disks to give out, and plans to support Linux more in the future, citing frustrations with Windows Vista and Windows malware problems in general.

Those are the commercial desktop Linux uses I can think of off the top of my head. Among home users, I've found that a number of people are installing Linux themselves now, from farmers to bloggers.

Microsoft may still have majority share on the desktop here, but in my world they are losing ground fast to the benefits of open source software.

Who has switched in your world?

What's the minimum time it could take to serve a web page with a given Perl-based solution under CGI? What's the minimum amount of memory it would take?

To check the relative differences between several options listed below, I made the simplest possible "Hello World" page, and benchmarked the time and memory used to serve it. To create a base line, I also measured the results for an empty Perl script that just prints "Hello World".

The result summary is after the jump.

If it's not already clear, these benchmarks are not meant to represent real world applications or web servers. Of course, physics dictates it would be difficult for any application to perform faster than these times using the same systems and hardware.

System               Version     Startup*  Memory**
Empty Script         Perl 5.8.8  0.01s     4.7M
CGI::Simple          1.103       0.08s     5.5M
CGI.pm               3.42        0.11s     5.7M
CGI::Application     4.20        0.21s     6.0M
Mojo                 0.8.9       0.29s     8.3M
Mojolicious          0.8.9       0.31s     8.8M
Titanium             1.0         0.14s     6.0M
Titanium w/Dispatch  1.0         0.39s     10.1M
Catalyst             5.8000_03   0.80s     13.5M
HTTP::Engine         0.0.18      1.50s     14.9M
HTTP::Engine/Shika   0.0.99_01   0.19s     6.6M
  • The "real" time in seconds as timed with time -p perl foo.pl on a 1.1 Ghz ThinkPad T23 laptop with Ubuntu 8.04 Linux. ** The virtual memory size when the script is about to exit, as timed by printing the output of this at the end of the script run: ps -o pid,user,rss,vsize,ucomm -C perl;

Let's go through some of the numbers.

CGI::Simple, a lighter-weight CGI.pm alternative, turns in the best time and lowest memory consumption. Before considering it, note that it has a different interface in a few places, like file upload handling, and there are still some cases CGI.pm can handle that it can't, like some kinds of file uploads.
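For comparison, a minimal CGI::Simple version of the "Hello World" test would look something like this (a sketch, not one of the original benchmark scripts):

use CGI::Simple;

my $q = CGI::Simple->new;
print $q->header;                 # default text/html header
print "Hello from CGI::Simple\n";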

Next there's much-maligned CGI.pm, still appearing here with a respectable second-fastest start time and low memory consumption. Both CGI::Simple and CGI.pm help process an HTTP request and prepare an HTTP response, which is functionality targeted by most of the options listed here.

CGI::Application is next on the list. It explicitly doesn't provide any of the functionality of CGI.pm itself, and I don't think CGI.pm would even be loaded for this bare-bones "Hello World" case. CGI::Application provides basic application structure, control flow, and a callback system for plugins. As a long-time user, I'm not surprised to confirm that CGI::Application is a very lightweight solution.

Mojo is next. Its functionality includes supporting several server backends and abstracting the HTTP request and response. This is a bit similar to what CGI.pm does, since CGI.pm also supports mod_perl and FastCGI as well as helping with the HTTP request and response. (Of course the Mojo and CGI approaches couldn't be more different!) Mojo also provides a similar amount of functionality to HTTP::Engine. Mojo's result shows that it can be a rather lightweight solution itself, perhaps suitable for running in a CGI environment. Overall, the performance of Mojo is impressive here.

Next up we have Mojolicious and Titanium, which had very similar speed benchmarks. Mojolicious adds dispatching, a simple file-extension based templating system, and the ability to serve static pages to Mojo. It covers perhaps a similar amount of functionality to CGI::Application combined with CGI.pm and HTML::Template, but again the approach is very different.

Of the group of systems tested here, I would say Titanium covers the largest scope of functionality, since it goes beyond usual framework offerings, including additional methods for managing database handles, config files, sessions, form validation and form filling. I often run Titanium-style apps under CGI with good performance. In a persistent environment, the smaller memory footprint would be a more significant win here, allowing more application processes to be run concurrently.

Second to last on the list we have Catalyst, the Perl framework which needs no introduction. Its slower startup time is not considered a particular concern here, as it is not targeted at being deployed in CGI environments. That it uses a bit more memory does show some additional overhead, although for many deployments a bit of extra memory per process won't matter.

Finally, taking far more time and still more memory, we have HTTP::Engine. Considering its scope of functionality competes with Mojo, this performance and memory overhead is disappointing. HTTP::Engine explicitly has a CGI interface, but taking 1.5 seconds and almost 15 megs of memory just to print "Hello World" is dismal. Update 12/1/2008: HTTP::Engine replaced "Moose" with "Shika" and saw massive speed and memory gains, apparently without any reduction in functionality. The run time of 0.19 seconds and memory usage of 6.6 megs is rather respectable.

I can already anticipate the feedback to this article saying that the startup times don't matter because much of this is compile-time work, which doesn't apply to persistent environments, where compilation happens only once at server start-up time and not per request. And there's the perception I expect to hear echoed back again: no one uses straight CGI anymore, right?

As a professional website developer with over a decade of experience, nearly all the web applications I've written have run in CGI, and have performed just fine this way. Why do anything more complicated, when the simple solution works?

A web framework is a bit like an operating system: it's there to prop up the layer where the real value-added work happens. For an OS, that's the application. For a web app, that's the custom application logic. If your web application is going to be slow and hog memory, it should be because your custom application does that, not because the framework itself takes almost two seconds to boot up and uses 15 megabytes of memory before your application really gets started.

If your application is not expected to be high traffic, several of the solutions mentioned here are candidates for running under CGI, saving the complexity and overhead of deploying with FastCGI or mod_perl. If you create your applications with best practices like avoiding global variables, it will likely be easy to convert them to run under mod_perl or FastCGI later.

As a disclosure, some of the testing scripts I used are below. For Catalyst, Mojo, and Mojolicious, I used the default CGI scripts that came bundled with them.

CGI.pm

use CGI;
my $q = CGI->new;
print $q->header;
print "Hello from CGI\n"; 

CGI::Application

package HelloWorld;
use base 'CGI::Application';

sub setup {
    my $self = shift;
    $self->start_mode('hello_world');
    $self->run_modes([qw/hello_world/]);

}

sub hello_world {
    my $self = shift;
    return "Hello World\n".
}

HelloWorld->new->run;


Titanium with Dispatching, dispatch.pl

use CGI::Application::Dispatch;
CGI::Application::Dispatch->dispatch(
    default => 'hello-world/hello_world',
);

Titanium with Dispatching, HelloWorld.pm

package HelloWorld;
use base 'Titanium';

sub setup {
    my $self = shift;
    $self->start_mode('hello_world');
    $self->run_modes([qw/hello_world/]);

}

sub hello_world {
    my $self = shift;
    return "Hello World\n";     
}
1;


HTTP::Engine

use HTTP::Engine;
my $engine = HTTP::Engine->new(
      interface => {
          module => 'CGI',
          request_handler => 'main::handle_request',
      },
  );
$engine->run;

sub handle_request {
      my $req = shift;
      HTTP::Engine::Response->new( body => "Hello from HTTP::Engine\n\n");
}

Empty.pl

print "Hello World\n";

Both darcs and git support a --dry-run option for push, but git 1.6 provides no details about what it's going to push. Take a look:

$ git push -v --dry-run git@github.com:markstos/foo.git
Pushing to git@github.com:markstos/foo.git
To git@github.com:markstos/foo.git
   b233f62..1eb8888  master -> master

Yes, git managed to repeat back to me the name of the remote repo not once but twice, yet didn't tell me a single thing about the patches that might be pushed. Now compare that with "darcs push --dry-run":

$ darcs push --dry-run --summary
Would push to "/home/mark/tmp/tmp2"...
Would push the following changes:
Thu Nov 13 22:24:40 EST 2008  m
  * My new patch name.

    M ./one.txt -2 +1

Making no changes:  this is a dry run.

darcs reports what's going on in plain English, including the patch names and file change details. Even better, if I had just used darcs push, I would have gotten an interactive prompt by default, with a built-in help system:

$ darcs push
Pushing to "/home/mark/tmp/tmp2"...
Thu Nov 13 22:24:40 EST 2008  m
  * My new patch name.
Shall I push this patch? (1/1)  [ynWsfvpxdaqjk], or ? for help:    

Using that, I can navigate between the patches, cherry pick which ones I want to push, and inspect each patch by reviewing just the file changes involved, or directly viewing the actual diffs.

The lack of an interactive "push" command for git is just one of the annoyances that cause me to continue to use darcs whenever I can, and git when I have to.