
How to write really good documentation: Semi-definite rules for the indefinite article

by Brian Forte

Here’s an easy way for a writer to start an argument with me, an editor. Include the following in your copy:

We’ve been looking for an hotel for hours: this place is open; the clerk isn’t armed or psychotic; and I’m stuffed. It’ll do.

This will start an argument because I will instantly edit the copy thus:

We’ve been looking for a hotel for hours: this place is open; the clerk isn’t armed or psychotic; and I’m stuffed. It’ll do.

And the ‘“a” or “an” in front of words beginning with h’ argument is on.

Except I’m right and–since I’m the editor in this little scenario–I’m going to win anyway.

That would make for a rather short column, however. So let’s look at this in a bit more depth.

The word ‘an’ dates to Old English (ie 1150 CE at the latest) and it started out meaning ‘one’. So ‘an apple’ meant ‘one apple’.

By 1300 the word had become proclitic, which is linguist-speak for a word that precedes another but is pronounced as if it is part of that following word.

It also stopped being the English word for ‘one’ and became the indefinite article.

(And, before anyone asks, yes ‘a’ or ‘an’ do occasionally get used to mean ‘one’. For example ‘make a choice’, meaning one choice out of however many are on offer. This use of the word isn’t directly related to the indefinite article’s origins as the Old English word for ‘one’, however.)

An indefinite article is a modifier to a noun. It turns the noun from a general label into a reference to a particular but indefinite example of the object.

‘A cat’ is any cat, but not all cats and not a particular cat.

‘The cat’ is that feline over there, torturing the shiny object hanging from the piece of string. (Which is why ‘the’ is called the definite article.)

Now, in 1066 William of Normandy invaded and conquered England. William and his army spoke Norman, a langue d’oïl or Old French language. A language that didn’t pronounce ‘h’ at the beginning of any word.

Status-conscious English-speakers, looking to get ahead in the new regime, rushed to learn Norman and began dropping their ‘aitches’ as often as was practical (and sometimes more often than that).

In 1714, George Louis, a German-speaking Duke from Hanover, became George I, King of England. This was mostly thanks to the Act of Settlement 1701 and the Act of Union 1707: the former put him in line for the throne by barring the Catholic James Francis Edward Stuart from succession; and the latter enabled him to jump from 52nd in the Protestant line to first.

George wasn’t a popular king, and his son, George II, wasn’t much better liked. And George III is best remembered in the United States as the king they rebelled against in 1776, although in England he’s remembered best as the recurrently insane monarch (until his permanent relapse in 1811).

The Hanoverians did have an impact on the English language, however: they made German and things Germanic a sign of social status.

And the Germans always pronounce an initial ‘h’.

Which is why prescriptive grammarians of the 18th and 19th centuries (who were mostly concerned with educating upper-class English boys) ignored years of real-world English usage and started teaching complex rules about ‘an’ being the correct indefinite article if it preceded a word beginning with ‘h’ that came to English from German, regardless of how it was pronounced.

Make the rule complex enough and it doesn’t look like you’re actually teaching absurd and affected pronunciation designed to do nothing more than make class differences obvious as soon as people open their mouths.

Of course, they didn’t always get where words came from right. Hotel comes to English from 17th Century French, for example.

But that’s neither here nor there.

The core point remains. In English, the indefinite article has two forms, ‘a’ and ‘an’, because we English speakers don’t routinely pronounce two vowel sounds in a row.

Consequently, if the indefinite article precedes a word beginning with a vowel sound it is spelled and pronounced ‘an’. If the indefinite article precedes a word beginning with a consonant sound, it is spelled and pronounced ‘a’.

And any exception to or formalization of that rule is a pointless effort to maintain a speech-based class distinction that those promulgating the rule aren’t party to and that hasn’t had much meaning for 100 years in any event.

Which means, getting back to the sentence which started all this: if you’re trying to tell me (your editor) that the speaker has an accent that routinely drops the initial ‘h’ (eg a Cockney or East London accent), show me:

We’ve been looking for an ’otel for ’ours: this place is open; the clerk isn’t armed or psychotic; and I’m stuffed. It’ll do.

I’ll leave that alone (except to suggest ‘stuffed’ as a colloquialism for ‘tired’ is more Australian usage than it is British).

Bringing this to technical writing, things get trickier.

How do you pronounce an initialism like HTML?

I was taught English in public Australian schools of the 1970s. So I was taught aitch rather than haitch. Which means I pronounce ‘HTML’ with an initial vowel sound and I write ‘an HTML page.’

If I’d gone to a private Irish Catholic school, however, I would have been taught haitch and would, naturally enough, think ‘a HTML page’ is correct.

More generally, haitch is standard in Hiberno-English and is a way for disputing Protestant and Catholic Northern Irelanders to distinguish themselves from each other.

So, if I insist on ‘an HTML page’ I’m telling 4.5 million English speakers their way of writing and speaking is wrong, or non-standard at the very least. And I can’t reveal accent by writing ‘an ’TML page’ because it’s a technical document, not a novel.

And what about FAQs?

I pronounce this as an initialism and write or talk about ‘an FAQ’. I’ve heard people pronounce it as a word, however, and they write or talk about ‘a FAQ’ (“a fack”).
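The cases so far (hotel, HTML, FAQ) all follow the same sound-based rule, and it can help to see that rule written out as a procedure. What follows is a minimal Python sketch, not a finished tool: the function name, the deliberately tiny exception lists and the `h_is_haitch` and `spelled_out` flags are all assumptions made for this illustration. The point it makes is that spelling alone never settles the question; someone has to supply the pronunciation.

```python
# Letters whose spoken names begin with a vowel sound ("eff", "aitch", "em"...).
# "H" is included on the assumption that the reader says "aitch"; pass
# h_is_haitch=True for readers who say "haitch".
VOWEL_SOUND_LETTERS = set("AEFHILMNORSX")

# Hypothetical, deliberately tiny exception lists: words whose first sound
# does not match what their first letter suggests.
SILENT_H_WORDS = {"hour", "honest", "honour", "honor", "heir"}
CONSONANT_SOUND_WORDS = {"one", "unit", "user", "unix", "european"}


def indefinite_article(word: str, h_is_haitch: bool = False,
                       spelled_out: bool = False) -> str:
    """Return 'a' or 'an' from the first sound of a non-empty `word`."""
    lower = word.lower()
    if spelled_out:
        # Initialism read letter by letter: the first letter's name decides.
        first = word[0].upper()
        if first == "H" and h_is_haitch:
            return "a"
        return "an" if first in VOWEL_SOUND_LETTERS else "a"
    if lower in SILENT_H_WORDS:
        return "an"
    if lower in CONSONANT_SOUND_WORDS:
        return "a"
    # Fallback guess from spelling; wrong for any word missing from the lists.
    return "an" if lower[0] in "aeiou" else "a"
```

Under these assumptions, `indefinite_article("hotel")` gives `a`, `indefinite_article("hour")` gives `an`, and `indefinite_article("HTML", spelled_out=True)` gives `an` unless `h_is_haitch=True` is passed, in which case it gives `a`. Treating ‘FAQ’ as a word rather than an initialism (`spelled_out=False`) likewise flips the answer from `an` to `a`.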

Or, consider the following, from the forthcoming Red Hat Enterprise Linux Deployment Guide:

When the startx command is executed, it searches for an .xinitrc file in the user’s home directory to define the desktop environment and possibly other X client applications to run.

As written, it requires the word ‘.xinitrc’ to be pronounced with a silent dot (so to speak). Is that the accepted pronunciation?

For myself, I always pronounce the initial full stop of a ‘dot file’, mostly because so many dot files are user-specific versions of a general or default configuration file found elsewhere.

The .xinitrc file, for example, lives in ~/. If it doesn’t exist, startx defaults to using the settings found in the file xinitrc in /etc/X11/xinit/. Pronouncing the initial ‘dot’ makes it clear–at least to me–which version of the file I’m referring to.
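The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the fallback as the paragraph states it, not a description of how startx itself is implemented; the function name `resolve_xinitrc` and its parameters are hypothetical.

```python
from pathlib import Path


def resolve_xinitrc(home: Path, system_dir: Path = Path("/etc/X11/xinit")) -> Path:
    """Return the per-user dot file if it exists, else the system-wide default."""
    user_file = home / ".xinitrc"   # "dot xinitrc": the user's own copy
    if user_file.is_file():
        return user_file
    return system_dir / "xinitrc"   # "xinitrc": the default, no leading dot
```

The two return values differ only by directory and by that leading dot, which is the distinction that pronouncing the ‘dot’ preserves in speech.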

In all these cases, it’s not a simple matter of applying a style guide.

Technical writing–perhaps more than any other sort of writing–gets read and used by people from every corner of the Anglophonic world. And people don’t get less sensitive to perceived slights or the appearance of cultural insensitivity because it’s a manual or help page.

If anything, they’re more sensitive in such a circumstance. If you’re reading a manual to learn how to use a piece of software you don’t really care about or that is throwing errors you don’t understand, a hint that the manual was written by some insensitive clod who can’t be bothered to get the English ‘right’ is just another reason to be ticked off at the company that sold you the package.

There aren’t easy answers to these questions, unfortunately. If a pronunciation is clearly regional (like the haitch of Irish English), putting resources into a regionally specific version of your documentation is the best (albeit time-consuming) solution.

In other cases (for example the question of how to pronounce ‘FAQ’, a word which came into English via a text-only medium), I can only suggest being consistent throughout your public documentation and keeping a close ear out for changes or trends in real world usage.
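Consistency, at least, is something a script can check. The sketch below is one possible approach, assuming you maintain your own list of contested terms; the function names and the sample terms are hypothetical. It reports which articles appear directly before each term and flags any term that turns up with both.

```python
import re
from collections import defaultdict


def article_usage(text: str, terms: list[str]) -> dict[str, set[str]]:
    """Map each term to the set of indefinite articles found directly before it."""
    found: dict[str, set[str]] = defaultdict(set)
    for term in terms:
        # Match "a" or "an", whitespace, then the term as a whole word.
        pattern = re.compile(rf"\b(an?)\s+{re.escape(term)}\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            found[term].add(match.group(1).lower())
    return dict(found)


def inconsistent_terms(text: str, terms: list[str]) -> list[str]:
    """Return, in sorted order, the terms that appear with both 'a' and 'an'."""
    usage = article_usage(text, terms)
    return sorted(term for term, articles in usage.items() if len(articles) > 1)
```

Run over text containing both ‘an HTML page’ and ‘a HTML page’, `inconsistent_terms(text, ["HTML", "FAQ"])` would list `HTML`. The script cannot say which form is right, only that the documentation has not made up its mind.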

32 responses to “How to write really good documentation: Semi-definite rules for the indefinite article”

  1. Paul W. Frields says:

    Great article! I can tell you what our standard has been for Fedora Documentation: When using an acronym, the use of the indefinite article is dictated by the expanded form of the acronym. So in our writings, you would find “a HTML page” since that expands to “a Hypertext Markup Language page.” This avoids pesky arguments about whether “FAQ” is pronounced “fack” or “eff-ay-kyoo.”

  2. Stephen Smoogen says:

    Very interesting.. I got a split education on English. One from my parents and grandparents who were from England, and the other from a school system that taught that the proper contraction for “I am not” is “I ain’t”. I have given up trying to work out any rules for spelling and grammar as I have too many conflicting rules in my head.. [after 10 years of getting D’s in English for spelling colour wrong.]

  3. Nathan Oyler says:

    Enjoyable read.

  4. Antoine says:

    I really enjoyed reading your article. I never considered that the lowly indefinite article might be the cause of perceived cultural insensitivity. I’m going to stick with “an html” though as it’s by far the most standard usage.

  5. Luke Meyer says:

    Great read - wasn’t aware a/an had been used for class distinctions. Nice how it ties into tech doc as well…

  6. Ben says:

    I’m a little confused. You explained that the Normans didn’t pronounce the ‘h’, so status-conscious English speakers began to drop the ‘h’. And since that makes hotel start with a vowel sound, they would say “an ‘otel”.

    But then you explained that the Hanoverian dynasty caused status-seeking speakers to pronounce every ‘h’, since that is the German way. So then hotel would start with a consonant sound. It would therefore follow for them to say “a hotel”, but as you explain, instead they created a rule that ‘an’ should precede a word beginning with ‘h’ if it is of German origin, regardless of the pronunciation. But you give no reason for the sudden disregard for pronunciation. Is that more German, to have more double consonants, or were the grammarians just evil?

  7. James Quirk says:

    While I thoroughly enjoyed your article, I have
    to say that I also found it quite depressing.
    I have no beef with the gist of your argument,
    although given my surname, I naturally defer to
    Sir Randolph Quirk in grammatical matters. Nor do
    I particularly mind your historical inaccuracies;
    I grew up in the city of Chester which did not
    fall to William the Conqueror until 1069, a full
    three years after the battle of Hastings. And
    while I come from good Catholic stock,
    with four Irish grandparents, I am not unduly
    perturbed to find that I should be using “haitch” instead of
    “aitch.” No. The root of my depression is
    that your article completely fails to recognize that
    “really good documentation” is not just
    a question of English grammar.

    The notion that software is written in one corner and
    documented in another, was challenged by Knuth with
    his introduction of literate-programming.
    Admittedly WEB was too klunky for mainstream use, but if you
    take a look at:

    http://www.amrita-ebook.org/doc/fold::markup

    you will see just how far accepted documentation
    practices fall short of what is possible
    with an electronic format such as PDF. Moreover,
    with Adobe soon to unleash Apollo, it should soon
    become clear to the proverbial “blind man on
    a galloping horse” that the time has come for
    the software community to reassess what
    it considers to be good documentation. But given
    that old, bad habits die hard, I am probably being
    a tad optimistic on that score.

  8. Nigel Colhoun says:

    Interesting stuff.
    An hotel really bugs me and so does Haitch.
    In old English “What” was spelt “Hwæt”, it rhymed with cat
    the Normans mucked up that.

  9. Robin Broomfield says:

    Superb. Didn’t know there was an American(?) that cared so much about the English language!

  10. John Wilson says:

    Best comment comes, I think from George Bernard Shaw (who you recall was Irish). I can’t remember his exact words, but they were approximately “The English hate their language, and will not teach their children to speak it correctly.”

    ‘Best’ in this context does not necessarily mean ‘most accurate’, of course.

  11. Rebecca Fernandez says:

    Robin,
    If memory serves me well, Brian is an Aussie.

  12. Nora says:

    At long last, I’m validated. When referring to the letter ‘h’, I’ve always said ‘haitch’, though raised in Canada, where everyone says ‘aitch’. My rationale: that many times, as this article points out, words that begin with ‘h’ have a breathy consonant sound, not just a vowel sound. Like hotel. Or house. Or happy.

  13. Lancelot says:

    That’s why we need translators >.^
    Great detailed article though

  14. Dave Hunt says:

    English speakers are very adept at accommodating outrageous dissimilarities in pronunciation for identical spellings. Pronouncing the ‘h’ in ‘house’ versus not pronouncing the ‘h’ in ‘honor/honour’ is just not an issue for English-first-language speakers.

    We (English speakers) certainly have no issue with ‘one’ (pronounced ‘won’) versus ‘onerous’ (pronounced ‘on-er-us’), despite both words beginning with the same three letters.

    Perhaps the most egregious of the neuroses of English pronunciation are these 6 different pronunciations of the four-letter series ‘ough’:

    bough (rhymes with ‘now’)
    cough (rhymes with ‘off’)
    rough (rhymes with ‘cuff’)
    though (rhymes with ‘toe’)
    through (rhymes with ‘too’)
    thought (rhymes with ‘hot’)

    What can I say?

  15. Bill Cernansky says:

    Nitpick: “thought” doesn’t always rhyme with “hot” outside of the U.S.

  16. Marvin says:

    Next topic idea: Ending a sentence with a preposition.

  17. Douglas Pollock says:

    I’m guessing others pointed out the following error:

    So, if I insist on ‘an HTML page’ I’m telling 4.5 million English speakers their way of writing and speaking ARE wrong, or non-standard at the very least.

    Oops?

  18. Alan Thomas says:

    Very amusing, I wish someone would make this point to the BBC who still stick with “an hotel” despite Sir Ernest Arthur Gowers describing this usage as archaic, right back in the 1930s.

  19. Mike Weber says:

    Interesting article, thanks!

    Aren’t the apostrophes before “an ‘hotel for ‘hours” supposed to point the other way, same as in “o’clock?” Ever since it was introduced, that silly “smart quotes” feature has been messing up the apostrophe of initial omission in “class of ‘80” and the like.

    I’m also surprised your initial sentence has semicolons instead of commas. It would be interesting to see you do a column on punctuation. :-)

  20. Brian Forte says:

    Robin Broomfield wrote:

    Didn’t know there was an American(?) that cared so much about the English language!

    And Rebecca Fernandez replied:

    If memory serves me well, Brian is an Aussie.

    Just to confirm Rebecca’s memory is serving her well (and to reiterate the passing point above about 1970s Australian public schools): I am, indeed, an Aussie.

  21. Brian Forte says:

    Ben wrote:

    I’m a little confused. You explained that the Normans
    didn’t pronounce the ‘h’, so status-conscious English
    speakers began to drop the ‘h’. And since that makes
    hotel start with a vowel sound, they would say “an
    ‘otel”.

    But then you explained that the Hanoverian dynasty caused
    status-seeking speakers to pronounce every ‘h’, since
    that is the German way. So then hotel would start with a
    consonant sound. It would therefore follow for them to
    say “a hotel”, but as you explain, instead they created a
    rule that ‘an’ should precede a word beginning with ‘h’
    if it is of German origin, regardless of the
    pronunciation. But you give no reason for the sudden
    disregard for pronunciation. Is that more German, to have
    more double consonants, or were the grammarians just evil?

    Not evil, just clueless.

    The rules they sought to impose regarding things like the indefinite article were mostly an effort to impose formal grammatical rules on English not dissimilar to the formal rules supposedly intrinsic to Latin.

    Ignoring the key point that English is weakly inflected where Latin is strongly inflected, Latin is no more able to be formally described by rules than any natural language.

    This didn’t stop grammarians from trying to impose Latinesque rules on Modern English. (Roughly speaking, Modern English is the English spoken since 1700 or thereabouts.)

    (The chief culprit in all this, BTW, was Robert Lowth, whose 1762 book ‘A Short Introduction to English Grammar’ is a continuing bane of editorial life today.)

    It’s from these efforts that we get all sorts of pointless rules — eg don’t split infinitives; don’t end sentences with prepositions; and don’t use ‘they’ as the singular third-person pronoun — that produce pointless argument and equally pointless anxiety hundreds of years after they were promulgated for no especially good reason in the first place.

    The effort to formalise the ‘“a” or “an” before a noun’ usage into a rule rather than simply describe it was just another example of this.

  22. Brian Forte says:

    Mike Weber wrote:

    Aren’t the apostrophes before “an ‘hotel for ‘hours” supposed to point the other way, same as in “o’clock?”

    Yes.

    Ever since it was introduced, that silly “smart quotes” feature has been messing up the apostrophe of initial omission in “class of ‘80” and the like.

    I’m not sure how far back you’re thinking here. So far as I’m aware, the first ‘smart quotes’ algorithm was David Dunham’s, implemented in his miniWRITER desk accessory application for the Macintosh back in 1986: http://poppyware.com/dunham/smartQuotes.html

    The first application I personally used with a Smart Quotes capability was WriteNow, written by John Anderson and Bill Tschumy and an example of Dunham’s algorithm in use.

    One of the things I liked about Dunham’s algorithm — at least as implemented in old Macintosh apps like WriteNow — was the ease with which I could over-ride it and put a closing quote at the beginning of a string if necessary (eg “’otel”).

    Smart quote algorithms for web-based writing, however, are more problematic, mostly because they have no way of providing a writer a simple way of over-riding the algorithm while writing.

    (And, no, being able to insert entities in your text while writing doesn’t count as simple. Seeing a string like “’otel” in the midst of a paragraph is more likely to produce an error than it is anything else.)

    I’ve no direct knowledge of what’s being used here at Red Hat Magazine (this could be considered a hint to anyone on the editorial team to chime in here) but I suspect it’s the Texturize engine built into the WordPress publishing system which is running the site.

    Like every smart quotes engine I’m aware of, Texturize doesn’t know how to distinguish between an opening quote and an apostrophe denoting a leading contraction (the same character as a closing quote).

  23. Brian Forte says:

    Douglas Pollock wrote:

    I’m guessing others pointed out the following error:

    So, if I insist on ‘an HTML page’ I’m telling 4.5 million English speakers their way of writing and speaking ARE wrong, or non-standard at the very least.

    Oops?

    Oops indeed. That’s an example of a still-visible edit: at some stage I re-wrote the sentence, and I’ve not properly edited the old version out of existence.

    Thanks for noting the error. I’ll see if I can get it corrected.

  24. Terry says:

    Nitpick: “thought” doesn’t always rhyme with “hot” outside of the U.S.

    I suspect that “thought” doesn’t rhyme with “hot” in most places in the US either — at least not exactly. I think “hot” was used as a closer approximation. I know I can’t think of any word that does not end with “-ought” that rhymes with thought. (It might be interesting to ask English speaking folks around the Web how they pronounce some of those words …)

    I think the point is taken though. :-)

  25. Hugh says:

    I agree with Terry. I have traveled all over the US and met folks from all over the US. Never have I heard “thought” spoken to rhyme with “hot”; however, “thought” does rhyme with “aught” and “nought”.

    Cheers,

  26. Rebecca Fernandez says:

    Hugh,
    It’s highly regional. I grew up in Western Pennsylvania, and “hot” does indeed rhyme with “thought” there. And “pool” and “pull” are homonyms in that area, believe it or not.

    If you travel just a few hours to Eastern Pennsylvania, however, “hot” and “thought” have distinctly different vowel sounds.

    Thus is the beauty of linguistic variation.

  27. Majortom says:

    I too insist on “an HTML document”. It is proper as I learned English in Cleveland, OH in the 1970-early 80s.
    Today too many English classes have been dumbed down to use the current slang of the day.
    I am taking a college course (again) and run into various managers and teachers. I am surprised at the lack of English skills, writing ability, and just plain old being able to put together a coherent thought.
    So when you write “an HTML page”, I doubt if anyone will notice at all.
    -T-

  28. Dave says:

    Can I just point out that with an East London accent it’s far more likely to be:

    “We’ve been looking for an ’otel for ’ours: this place is open; the clerk ain’t armed or psyco; and I’m knackered. It’ll do.”

    :-D

  29. Walter Kruse says:

    I’m also looking forward to future articles in this series !

    I had a similar problem: “an SQL query” or “a SQL query”. Some people say “es-q-el”, in which case it is “an”. Others (notably people working with Microsoft’s RDBMS) say “sequel”, in which case it should be “a”.

  30. Brian Forte says:

    James Quirk wrote:

    While I thoroughly enjoyed your article, I have to say that I
    also found it quite depressing. I have no beef with the gist
    of your argument, although given my surname, I naturally
    defer to Sir Randolph Quirk in grammatical matters.

    Sir Randolph is a favourite of mine as well. And, it’s worth noting his advocacy of a descriptive approach to English usage doesn’t make him an enemy of formal or Standard English.

    When he got involved in developing England’s national English curriculum in the late 1980s he introduced more serious attention to vocabulary, in the course of exposing the misplaced disdain for Standard English affected by many in the educational establishment.

    Nor do I particularly mind your historical inaccuracies; I
    grew up in the city of Chester which did not fall to William
    the Conqueror until 1069, a full three years after the battle
    of Hastings.

    I could argue the invader’s perspective here. The Battle of Hastings, fought on 14th October 1066, saw the defeat and death of Harold, making William the king by conquest.

    From then on, all resistance to the new order was seen as the mediaeval equivalent to insurgency.

    But, it’s more accurate for me to cop being a bit fast with dates. After the Battle of Hastings, the south of England fell into Norman hands fairly quickly, even given the Saxon resistance at the Bridge of Southwark.

    Further north things weren’t so simple, with resistance and full-blown uprising continuing for some years (eg Hereward the Wake’s uprising in 1070).

    And while I come from good Catholic stock, with four Irish
    grandparents, I am not unduly perturbed to find that I should
    be using “haitch” instead of “aitch”.

    Anglo-Irish, perchance?

    No. The root of my depression is that your article completely
    fails to recognize that “really good documentation” is not
    just a question of English grammar.

    I agree entirely: good documentation is not ‘just a question of English grammar’. I think my previous column, Four Rules and an Axiom, makes it clear there are other things to consider.

    That said, all else being equal, clear, accessible English makes for better documentation.

    The notion that software is written in one corner and
    documented in another, was challenged by Knuth with his
    introduction of literate-programming.

    And I disagree with Knuth, rather strongly, as it turns out.

  31. Oisin Feeley says:

    A wonderful article. I found it amusing because I’m originally from Dublin, where I said “haitch” which my Canadian wife (from the very Scots/Irish maritime provinces) found odd. My family listened to too much BBC Radio 4.

    I moved eventually to Quebec where francophones have obviously made an unconscious note that one of the oddities of English is the presence of a distinct “h” sound. As a result it’s not uncommon to hear someone saying “hevery” instead of “every”.

    Fowler’s _Modern English Usage_ (which I like because it’s usually very level-headed) advises:
    “[…] _an_ was formerly usual before an unaccented syllable beginning with h and is still often seen and heard (_an historian, an hotel, an hysterical scene, an hereditary title, an habitual offender_). But now that the h in such words is pronounced the distinction has become anomalous and will no doubt disappear in time. Meantime speakers who like to say _an_ should not try to have it both ways by aspirating the h.”

  32. Kobe chan says:

    Just a note about your definition of ‘definite’ and ‘indefinite’ articles.

    The definite article is used when the referent is known, familiar or identifiable to both the speaker and hearer, e.g.

    A: John bought the car
    B: Great, it ran well when we tested it!

    The indefinite article is used when the speaker assumes that the hearer does not know, is not familiar with, or cannot identify the referent, e.g.

    A: John bought a car
    B: Oh really, what model?

    You see in both cases a particular reference is being made, after all if John bought a car, then surely the speaker is referring only to one particular car (and not to the class or set of cars). Definiteness is a complex issue across languages and there has been much debate on this topic (see Lyons (1999) ‘On Definiteness’ for a good summary on it).
