Sunday, April 3, 2011

Perl, Procrastination, and the CPAN

I just uploaded my first CPAN module today.

If you’re a technogeek like myself, that may mean something to you.  If you’re not, you’ll wonder what the hell I’m on about for the remainder of this post, so let me see if I can explain it to you.  (Some of this some of you will undoubtedly know, so please don’t get offended if I overexplain.)

So I’m a computer programmer, right?  And, when you program, you use a certain language: just as we humans speak different languages depending on when and where we’re raised, computers and programmers speak different languages too, and that often has to do with when and where they’re raised.  My first computer was a Commodore 64, and it only spoke BASIC, so I learned BASIC so we could communicate.  And then I learned that BASIC wasn’t the only language it spoke: in fact, BASIC wasn’t even its native tongue.  And the reason for all the misunderstandings we were having trying to talk to each other, and the reason it took my poor computer forever to figure out what I was trying to tell it to do, was that it actually spoke 6502 Assembly (and, yes, I know the processor was actually a 6510, but the Assembly dialect was the same, that’s the important bit).  So I taught myself that.  This was around high school time.

Later, in between college and college (I took an extended break from college at one point), I ended up getting my first programming job, because one of my neighbors told my new boss that I was a “computer whiz.” She thought this, of course, because she too had a Commodore 64 which spoke BASIC, and her computer and I were on good terms.  My new boss was a salesman, and he had people to talk to computers for him, so he had no idea that being able to talk to one computer doesn’t mean you can talk to all computers, and he hired me.  Now here I was presented with an IBM 8088 PC, and I was told I had to speak to it in a new language: C.

So I taught myself C.  And then, later, I taught myself C++, which is kind of like going from Spanish to Portuguese: many of the words are spelt the same, but it sure doesn’t sound anything like it.  And I loved C++.  I loved the expressiveness of it, its elegance ... writing C++ was like writing a technical paper where you get to use all sorts of really big, impressive-sounding words that you typically don’t use because normal human beings have no idea what they mean, but somehow when you’re writing for PhDs none of that matters.  You know, those five-syllable words that express a concept that otherwise would have taken a paragraph to explain.

But the problem is that writing that stuff is tedious.  Writing technical papers can be partially fun, but there’s also lots of boring bits: you have to define all your terms at the beginning, and you have to provide a detailed bibliography at the end, and in the middle there’s footnotes, footnotes, footnotes.  After a while, you wonder if you spend more time writing references than writing text.  In C++, it’s libraries instead of references, but the concept is very much the same.

Then I discovered Perl.  And the thing about Perl is, it was designed by a linguistics student, and linguistics students understand something that most people who write computer languages don’t: humans like to have different ways to say the same thing.  Most computer languages want everything to be very cut and dried, with exactly one way to say everything you can say.  This is convenient if you’re writing a compiler (that is, the computer program which changes a given computer language into the computer’s native tongue), but not so much when you’re a programmer.  In other words, most computer languages were created by translators, not speakers.

Translators hate ambiguity.  They hate connotation vs denotation, and abstract idioms, and subtleties of context, and nuances of conjugation, and all the other things that make it hard to say “this Mandarin phrase means exactly this in Italian”.  But speakers love ambiguity.  In English, I can give you a “gift,” or I can give you a “present.” What’s the difference?  Some people would say nothing.  Some people would find tons of very subtle differences.  If I know you, and I know that you know me, I could conceivably communicate worlds to you with my choice of words that have “identical” meanings.  And, sometimes, you’ll spend weeks worrying over what I meant by choosing this word instead of that one even when I didn’t mean for there to be any difference.  This is just part of the joy of language.

And Perl really gets that.  It’s a language with not only verbs and nouns, but adjectives and adverbs, with indirect objects, even, and, most importantly, with context.  How do you know the difference between “bat” and “bat”?  One’s a small flying mammal and one’s a hunk of wood you hit a baseball with, but how can you tell which one I’m talking about?  They sound the same.  They’re even spelled the same.  But, 99% of the time when I say “bat,” you’re going to know exactly which one I meant.  Because of context.  Perl’s got that, in spades.  In fact, many programmers hate it for that very reason.  Many programmers have fled to computers precisely because they didn’t like having to talk to people, where language was messy and easily misunderstood.  Reproducing that for the computer seems horrific to them.

But I’m an English major; I’m a wannabe writer.  Words and language are everything to me; their subtlety and beauty fascinate me.  So while I may be a technogeek programmer and I may have just as much desire as the next technogeek programmer to have my computer understand what I’m saying in a very precise manner, I also appreciate a computer language with context, a computer language where There’s More Than One Way To Do It (which, as it happens, is one of the unofficial mottos of Perl).

So I loved Perl from the moment I learned of it, but the real reason I switched to it from C++ is because all those footnotes and references and bibliographies that I’d spent ever so much time writing in C++ ... they were all written for me in Perl.  Oh, sure, they might not be written exactly as I’d have done them, but they’re close enough for a quick copy and paste.  If I need to tweak them a bit, I can do that: after all, I have the full text of the references right here.  And there’s a full set of references, formatted any way I can imagine, combined with other sets in any way I can imagine, set out according to as many different style guides as I can imagine.  Remember that in my analogy, “references” are libraries, and, in Perl, libraries almost all come from one place: the Comprehensive Perl Archive Network, or CPAN.

But how do they get there?  You can certainly imagine how wonderful writing research papers would be if practically every possible set of references you might need were precompiled for you and put into a giant collection for you to pore through and pick out the one that was perfect for you, but wouldn’t you wonder who had actually put them there?  Sure you would.

To really beat this analogy to death, let’s imagine that you’re a new graduate student.  You’re in the campus library, working on a research paper.  With you is your good friend who’s been a graduate student for years now, so she’s an old hand at writing research papers.  Imagine a conversation like this:

“Man, I don’t mind writing research papers, but doing all these bibliographies and junk is boring.”

“Why don’t you just use the RPRAC?”

“The what-prack??”

“The RPRAC.  You know, the Research Paper Reference Archive Collection.”

“WTF is an RPRAC?”

“It’s the ... look, I’ll just show you.  Here, look at this.”

“Hunh.  This is ... oh, that one’s nice.  Yeah, this is ... wow, I would’ve spent days ... holy crap!  This thing is awesome!!”

“Duh.”

“This’ll save me shitloads of time.  Where does all this stuff come from, anyway?”

“People like us.”

“Henh?  Whatchoo mean, ‘people like us’?”

“I mean, people like us, of course.  Anyone who wants to can add references in there.  Original ones, or ones derived from others in there.  I’ve got a couple in there myself; nothing too exciting, but they’re useful from time to time.”

“But ... well, if anyone can put stuff in, there must be lots of really crappy ones in here ...”

“Oh, sure, some of ’em are.  But not as many as you might think.  Look right there: your name’s on it.  So, if it’s crappy, everyone knows that you put some crap in the collection.  Poof! there goes your reputation.”

“Yeah, but some people won’t care about that.  Some people are just dicks.”

“True.  But most people do care.  And the ones who are dicks wouldn’t take the time and effort to add to the collection anyway.”

“Yeah, that’s another thing: why do people spend the time and effort?  I mean, these people are all graduate students, like us, right?  They’ve all got lives, and other papers to write, and families and shit ... why spend your free time putting crap in a book just so other people can save time?”

“Because some people just like being helpful.  Some people like to know their name’s in the book.  Some people like to show off their stuff in the book when they’re applying for grants, to show how good they are.  Some people just figure, hey, I spent all this time putting this together, seems a shame if other people have to start from scratch.  I think all those reasons come down to one thing, though: pride of ownership.  Here’s something you can point to and say: see, I contributed.  I gave something back.  And this is mine, and I’m proud of the work I did.  And I want everyone to see it.”

“Wow.  Maybe someday I’ll add something to the collection.”

“I’m sure you will.  Someday.”

So, do you see?  That’s what CPAN is to Perl programmers.  And I’ve been a Perl programmer for nearly 15 years, and I have used dozens—nay, hundreds—of CPAN modules to help me write my own programs faster and more easily and more efficiently.  And never once have I taken anything I’ve written and put it on the CPAN.  Until today.

In some ways, this is almost a rite of passage for a Perl programmer.  In many ways, I just now “became a man” in the great Perl tribe.  And, like so many rites of manhood, the first time you do this one is pretty terrifying.  And, after that, it’s no big deal.

I’m pretty sure the reason it took me so long to get something up there was just plain fear.  As I indicated in my imaginary conversation, there is a certain amount of useless crap on CPAN, but not nearly as much as you’d expect.  And part of the reason for that is your name gets forever associated with whatever you upload.  Like any facet of the Internet, whatever you put up there is there forever—even after it disappears from CPAN, it’ll still be on the BackPAN.  Any time you apply for a Perl job, they will inevitably want to know what you’ve contributed, if anything.  And your fellow members of the Perl tribe, they shall know you by your CPAN modules.  Any person you meet online in a Perl capacity is going to judge you by what you have on CPAN, and how well written it is, and how well documented it is, and how popular it is, and how many stupid mistakes it contains.  They may not even mean to, really ... but they will.

So it’s definitely not the case that having something—anything—on CPAN is better than having nothing.  Having something stupid could well be worse than nothing, by a long shot.  It’s also not necessarily easy to get something onto CPAN.  Like any system run by volunteers, there are many different ways to do it.  If you want help figuring out how to do it, there are hundreds of guides out there ... and they all give you slightly different approaches.  You have to apply for an account, and you need to format your stuff properly or other people won’t be able to use it, and you may need to register a namespace, and you’re supposed to talk about the name of your module first so that everyone can tell you what a stupid name it is and what a much better one would be (I freely admit I skipped that step).  And then after you submit something, there is a huge network of people who (completely automatically) download your contribution and test it out, on different operating systems, using different versions of Perl, with different configurations, and, if anything doesn’t work right, they post it up on a web page so you (and, of course, everyone else in the world) can see it.  And, assuming you survive all that, there’s a rating system, so someone can still come along and tell you that you suck.

All in all, it’s rather daunting.

And I think I let my fear of not having it “just right” cripple me.  Supposedly Meg Whitman (former CEO of eBay, one of whose subsidiaries is thoughtful enough to provide my biweekly paycheck) was fond of saying “perfect is the enemy of good enough.” I’m not sure I agree with her 100%—I think often “good enough” is the enemy of “not going to collapse on you when you least expect it”—but in this case she really had me pegged.  As part of my CPAN upload, I have to provide a “Changes” file, which documents all the versions of the module on CPAN.  But, since I was making a big point in my module’s documentation that I’d been using this thing in production code for over 10 years now, I decided that I should provide the change history of my module even before it finally got to CPAN.  That meant scratching around in 2 or 3 different places, digging up historical data, and actually putting dates to all the changes I’d made in the past 10+ years.  And I also figured out the date when I first started the version that exists today, the version that I specifically built from the ground up to be my first CPAN release.  It was August 7th, 2008.

I’ll pause while you check the date of this post and do the math.

Yes, that’s right, it took me almost three years from the point I decided I would create a CPAN module to the point where I actually uploaded the first, neotenous, cautiously designated as a developer release, version.  That’s just insane.  And, sure, I can make other, perfectly good excuses: I have two beautiful children that I love spending time with; things were going on at work, including the launch of two major initiatives during that time that I was responsible for leading on the tech side; I was buying my first house, which was certainly something that demanded a lot of attention ... but you know, all that comes down to “I didn’t have time,” and the only thing I didn’t have time for was getting things perfect.  Getting things good enough, I could have done months and months ago.

But now it’s done, and I have my first CPAN module.  And, you know: it wasn’t that hard.  Hopefully my next one will take much less time.
