Thursday, December 18, 2008

scope problems

Recently Disaronno has been running commercials for different cocktails you can make with their liqueur. The two I've seen recently are "Disaronno on the rocks with milk" and "Disaronno on the rocks with ginger ale". These strike me as very odd. It's not that you can't have a mixed drink "on the rocks". While I most readily associate the phrase "on the rocks" with straight liquor (viz., scotch), it's quite common to order a margarita on the rocks, or a manhattan on the rocks. The problem for me lies with scope.

"Scope" refers to how much of a given sentence or phrase a word modifies. For instance, the phrase "dirty blond hair" could mean either someone with blond hair which is rather darker than "blond" hair ([[dirty blond]hair]), or someone with blond hair who hasn't showered in a while ([dirty [blond hair]]). In the Disaronno commericals, I think my problem is that to me a drink is on the rocks or it isn't. You can have a scotch on the rocks with a twist, but not a scotch on the rocks with soda. The latter would be a scotch and soda on the rocks. Likewise, "Disaronno on the rocks with milk" annoys me, because I feel like "on the rocks" should have scope over the entire drink, not just the liqueur. "Disaronno with milk on the rocks" is fine, but the way they phrase it clashes with my usage.

P.S. I'll be out of town for winter break for the next two weeks, so the next new post will be 1/5.

Monday, December 15, 2008

the weakness of h

I'm in the midst of writing a paper on stop aspiration in Navajo, which is never regular glottal aspiration like we have in English, but instead palatalized, velarized, and/or labialized. One of the issues I'm grappling with is why the glottal fricative h is a fairly weak phoneme. I have the intuition that it is, both as a speaker of a language that has /h/, and by looking cross-linguistically at languages that have an orthographic h (and thus an historical h) but not a pronounced h, Spanish in particular.

Spanish has the letter h in its orthography, but it is never pronounced, and Spanish speakers are often unable to correctly pronounce the glottal h of English, instead substituting the velar fricative [x] found in Spanish (represented orthographically by j and sometimes g). British English has h-less dialects and h-dropping. I'm a bit puzzled by the prescriptive rule for using the article "an" before a word beginning with h, since I've been told h-dropping in British English is fairly low-class, and thus I fail to see how this pronunciation was immortalized in our rules for good writing.

These facts, combined with the relative rarity of h cross-linguistically when compared to stops or other fricatives, have given me the impression that h is indeed weakly represented somehow, most likely because it is more difficult to perceive clearly than other fricatives. However, I've so far been unable to find any good literature on the matter. I suppose I can always posit the weakness of h myself, but it's always preferable to back up one's own opinions with citations.

Thursday, December 11, 2008

VP ellipsis gone wrong

I've posted about VP ellipsis before, which is where we leave out the verb phrase in a series of wider phrases when it can be recovered from context (or in fact from any context). Wikipedia supplies the example of I always tell Mary to do the dishes, but she never does, where the elliptical phrase is "do the dishes", i.e., she never does [do the dishes]. The type of example I'm specifically referring to is something like I can and will pass this exam, where we have two coordinated IP's headed by can and will, and the VP is left out of the first for the sake of not being redundant. It would sound strange to say I can pass this exam and I will pass this exam.

Sometimes this can go rather wrong, as it did on one of the recent applications I was filling out for Ph.D. programs. The question was asking what outside fellowships I had applied to, or was planning on applying to. The exact wording was "fellowships you have or will apply to". The reason this fails is because the VP's in this case aren't the same: have applied to vs. will apply to. So normally we would be hesitant to leave out the full VP's, because otherwise the immediate interpretation is "have apply to or will apply to" which is thoroughly ungrammatical. Of course the meaning can be recovered, but it's still quite odd.

Friday, December 5, 2008

syllabic /s/ in Blackfoot

Recently I've begun reworking a paper I presented at the Algonquian conference this year so that it'll be in decent shape when the time comes to submit it for the proceedings in January. The paper is all about analyzing the status of the phoneme /s/ in Blackfoot, mostly in Optimality Theory. So in this post I thought I would present some of the evidence I use to claim that Blackfoot has a syllabic /s/.

This claim was (as far as I know) first seriously taken up by Donald Derrick a few years ago (though Don Frantz mentions that he has always assumed Blackfoot to have a syllabic /s/). I recap much of his evidence in my paper, because I find it very telling. Among the data he presents is the use of [ss] as a clapping unit by some speakers (I say some because this has not been reported by all investigations of Blackfoot phonology). For instance, if I asked you to divide Minnesota into "units" of some type, you would most likely clap out Min-ne-so-ta. Likewise, if you ask a Blackfoot speaker to clap out a word like moapsspi, they would most likely clap out mo-a-pss-pi. The idea that non-vocalic syllable nuclei are pronounceable is pretty foreign to English speakers, even though we do it all the time: shhhhhh!

Derrick also points out (and I've backed this up with my own analysis) that the Blackfoot syllable is maximally simple if we assume syllabic /s/. This is desirable because, without syllabic /s/, a language with as few sounds as Blackfoot would need an exceedingly complex syllable structure, which would be odd. Once we treat [ss] as a syllable nucleus, however, that complexity disappears.

In my paper I also point to the fact that Blackfoot does not allow onset geminates (i.e., long consonants are divided between 2 syllables), yet [ss] appears in many places where it cannot be ambisyllabic. I need to look into this more, since until recently I was unaware that onset geminates had even been posited for certain languages (I assumed they were a phonological impossibility, and this may change some of my analysis).

The final small piece of evidence is the fact that /s/ acts weird in many other contexts, so why not syllable nuclei? It's the only phoneme that can form complex onsets, and Blackfoot has several Cs affricates (at least /ts/ and /ks/, and possibly also /ps/). So /s/ clearly has a special status in Blackfoot even without the claim of syllabicity.

Wednesday, November 26, 2008


Just to be clear, the apostrophe in the title has no semantic content; I'm pluralizing BFFL (according to MLA style, which is what I usually use). The reference, for those who aren't aware, is to a recent commercial that I keep seeing everywhere concerning mothers and daughters reminiscing about Barbie dolls. At one point the daughter in the commercial describes two dolls as "BFFL's -- best friend for lifes". This strikes me as odd, as I'm guessing it does many people. The oddness here comes from the scope of plural morphology. While possessive morphology attaches to a NP ("a friend of mine's cat", not "a friend's of mine cat"), plural morphology attaches to the head N ("friends of mine", not "friend of mines"). There are two possible reasons for this NP pluralization: (1) reanalysis of the plural morpheme as attaching to a NP rather than a N, or (2) reanalysis of "best friend for life" as a single noun rather than a NP. My bet would be on (2), since the acronym BFFL is a single noun for all intents and purposes, which could in turn lead to an analysis of the entire phrase "best friend for life" as a single noun. Something similar has happened with "passer-by" and "mother-in-law"; many people would pluralize these as "passer-bys" and "mother-in-laws" because these set phrases have been reanalyzed as simple nouns. Others (myself included) still treat these as noun phrases, and thus pluralize them as "passers-by" and "mothers-in-law".

Friday, October 31, 2008

Still busy

Mostly I just wanted to post to say I still won't be posting much for the next month or so. This past month in addition to my teaching and taking classes I was working on presentations for two conferences, an abstract submission for a conference in March, and my application for the Javits fellowship, as well as trying to get funding for said conferences. In the next few weeks I'll be finishing up my NSF fellowship application as well as two other conference presentations. Once I'm finished with those I get to turn my attention to my thesis prospectus, my five Ph.D. program applications, and two papers I'm submitting for conference proceedings. So it'll probably be a while before I'm posting regularly again.

Since I don't have time for a detailed thought-out post, I'll simply pose a question today:

Does the existence of a phonemic tonal system in a language rule out the possibility of a prosodic stress system in that language?

Thursday, September 18, 2008

papers for conference proceedings

Though I think so few people read this that I have little need of apologies, I thought I would make available to the public the reason there was no post on Monday: I was finishing up two papers which were due for conference proceedings that day. I won't get into detail about them, but I thought I would offer a short description of each.

Neologisms in Indigenous Languages of North America

A neologism is a new word created to name a new concept, often using productive morphology. For instance, "computer" is a neologism in English. In English we borrow --a LOT--, and most of our technical terms come essentially wholesale from Latin or Greek. However, Native American languages are different. The reason I chose this topic was that I kept noticing that their names for new (esp. European) concepts weren't borrowings, but rather descriptive words or phrases (e.g., the Blackfoot word for 'car' means "it starts moving without apparent cause"). So I decided to investigate further and hopefully prove what I had a hunch was true: American languages coin new words much more often than they borrow words or expand the semantic scope of existing words. In the end, this did indeed turn out to be true. I also discovered an interesting exception: for animals, the trend didn't hold. In that category words were slightly more likely to be borrowed (though it wasn't a statistically significant difference).

Irrealis in Blackfoot (with Leora Bar-el)

The term "irrealis" is used to describe sentences that refer to the world other than how it is. The most typical irrealis contexts are conditionals and counterfactuals (e.g., "If I had a million dollars... [but I don't]"), but they can also include imperatives, future, negation, and several other situations. Our goal was to investigate whether it makes sense to say that Blackfoot has irrealis as a grammatical category. Some languages clearly do. In Caddo, a Caddoan language spoken in Oklahoma, they use a different set of person prefixes depending on whether the context is realis or irrealis. English does not seem to have irrealis as a grammatical category, because we treat many different irrealis contexts in different ways (compare imperatives, negation, questions, conditionals, and counterfactuals -- you won't find any striking morphological or syntactic similarities as we do in Caddo). Our conclusion was that Blackfoot indeed lacks a grammatical category irrealis because no irrealis contexts are marked in a similar manner except for yes/no questions and negative statements. Mithun (1999) claims that minimally we would expect conditionals and counterfactuals to pattern together if irrealis has any real status in a language. Since this isn't true in Blackfoot (and for several other reasons), we concluded that Blackfoot lacks irrealis as a true grammatical category.

Thursday, September 11, 2008

the four-way morphological typology of languages

I'm taking a typology class this semester, so I thought I would post on the four-way morphological typology typically employed when discussing languages. Typology is concerned with the limited number of patterns that languages use, and how the use of these patterns (especially which are most common) tells us about things which are universal in language. One way of categorizing languages is according to morphology (how languages put together words). Two indices are typically used: the index of synthesis, which refers to how many morphemes are contained in a typical word, and the index of fusion, which refers to how segmentable morphemes are and how transparent the morphophonological changes are.

On the index of synthesis, we have two poles: isolating and polysynthetic. An isolating language typically has one morpheme per word (i.e., there is a separate word for every grammatical function, e.g., Chinese or Vietnamese). A polysynthetic language typically has many morphemes per word, and entire sentences/complete thoughts are a single word (e.g., Blackfoot). As an example, the Blackfoot word kitakitamatsinopoao(a), which is used as "goodbye", literally translates as "You (pl.) and I will see each other again". Sometimes this classification is broken down further, either into synthetic (1-3 morphemes per word) vs. polysynthetic (4 or more morphemes per word), or into synthetic/polysynthetic (many morphemes, but only one lexical root) vs. incorporating (words have multiple lexical roots, e.g., Chukchi).

The index of fusion also has two poles: agglutinative and fusional (or inflectional). Agglutinative languages have many morphemes in a word, but each morpheme contributes only one grammatical meaning, and each morpheme is clearly segmented, e.g., Turkish. English, when it uses multiple morphemes in a word, is usually agglutinative. "Wonderfully" is easily segmented into wonder-ful-ly, and each morpheme contributes a single meaning. Fusional languages, on the other hand, tend to use fewer morphemes per word because each morpheme contributes multiple grammatical meanings, e.g., Russian or Spanish. In Spanish, the -o in "hablo" contributes the meanings "1st person", "singular", "present", and "indicative mood". It's a single sound, so it's not possible to segment it at all; it simply has all those meanings rolled into one sound.
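To make the index of synthesis a bit more concrete, here is a minimal Python sketch that averages morphemes per word over a hand-segmented list. The hyphen-as-boundary convention, the function name, and the sample words other than "wonder-ful-ly" are assumptions for illustration only; real segmentation is far messier than splitting on hyphens.

```python
def morphemes_per_word(segmented_words):
    """Average number of hyphen-separated morphemes per word."""
    counts = [len(word.split("-")) for word in segmented_words]
    return sum(counts) / len(counts)

# Hypothetical hand-segmented sample; only "wonder-ful-ly" comes from the post.
sample = ["wonder-ful-ly", "cat-s", "the"]
index_of_synthesis = morphemes_per_word(sample)
print(index_of_synthesis)
```

A value near 1 would point toward the isolating end of the scale, and a higher value toward the synthetic or polysynthetic end.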

Now, of course there are essentially no languages that fit neatly into one category or another (including the languages I cited as examples in each category), which is why we organize the four traits into sliding scales rather than leaving them as strict categories. Some languages are more analytic, some are more synthetic. Some languages are more agglutinative, while some are more fusional.

Monday, September 8, 2008

footnotes v. endnotes

I'm in the process of preparing a paper for submission to the International Journal of American Linguistics (or IJAL as we affectionately call it), and one of their more annoying requirements is the use of endnotes rather than footnotes. As an author, I don't much care (though footnotes are a little easier to deal with because I don't have to scroll back to where I was to begin with). However, as a reader, I hate endnotes. If I want to know what's being referenced, I have to either flip to the last page every time I see one, or else leave that page out (an annoying requirement as I typically don't bind articles, and put the page I've just read behind the stack of paper). I'm sure many people don't read them at all. As a reader, I say that's fine, but I'm of the persuasion that if the author thought it necessary to include, it's probably valuable to read. As an author, it worries me, since I put information that's sometimes vital to the interpretation of my text in notes. Footnoting a sentence or two doesn't mean it's of no value; it simply means that the note doesn't flow correctly in that spot of the text. Thus the two possible solutions, (1) putting the information in the main body of the text or (2) leaving it out as unnecessary, are often not options at all.

I suppose it's not very linguistics-related, but I had to get it off my chest. Oh, and they have to be double-spaced as well. I understand requiring that for the typesetter, but for initial manuscript submissions before the paper's even been accepted? Seems unnecessary. What was a twenty-page paper with normal margins, notes, and spacing is quickly becoming a forty-page paper, dangerously close to IJAL's upper limit of fifty pages.

Thursday, September 4, 2008

Is future a tense?

I'm currently taking a semantics seminar on tense and aspect, and yesterday we briefly touched on the sticky topic of whether future is truly a tense. The idea is that the boundary between future and irrealis (roughly, counterfactual and potential events) is fuzzy at best, and many claim nonexistent. This is because unlike the past and present, the future isn't set in stone, and thus there is the question of whether or not a statement in the future tense can ever have any kind of truth value. The main questions here are:

(1) Do statements about the future have truth values?

(2) Do irrealis statements have truth values?

Obviously, if the answer to (1) is yes and the answer to (2) is no, then future cannot logically be irrealis. However, there is good reason to say that future statements cannot have a truth value (or that they have a conditional truth value).

At the heart of the matter is what we mean when we say "John will arrive at 3:00 tomorrow". Is that truly a future tense statement? Is it the same as "John arrived at 3:00 yesterday"? One answer is no, that people mean "I believe John will arrive at 3:00 tomorrow", or "John is scheduled to arrive at 3:00 tomorrow". However, I firmly believe that in some statements from some people, there truly is a future tense. After all, we can negate the future: "It's not going to rain tomorrow". Conservatively, we can reduce that negative statement to a negative statement of belief, but I'm not convinced that's how we practically use the future (note that I'm not just talking about English here, but all languages that have future marking that differs from irrealis marking). Logically, future is irrealis, but people don't speak logically. So taking a logical standpoint that when someone says "It's going to rain tomorrow" they cannot logically know the fact, that it is a statement of belief or prediction, is not necessarily valid or relevant.

If we compare the truth values for various irrealis contexts, we find that they differ significantly from the future. Conditional statements are evaluated by A --> B (I'm using --> to mean "then"), i.e., a conditional statement is true if A & B (& being a logical symbol for "and") or ~A (~ being the logical symbol for "not"). Counterfactuals have a similar truth value, but with the added given that A is not true, e.g., "If I had a million dollars, I'd be rich (but I don't have a million dollars)". Imperatives have no truth value: you can't say "that's not true" if I tell you to shut up (though you can respond that you are not talking, since imperatives presuppose that whatever state is demanded is not currently in existence, in this case, that you are not shutting up). Interrogatives I take to have a (vacuous) truth value, because if I say "Is it raining?" I am asserting that either it is raining or it is not, yielding a logical entailment of A v ~A (where v is the logical operator for "or"), which is always true.
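The truth conditions above can be checked mechanically. The following Python sketch (function names are my own, not standard terminology) enumerates every combination of A and B, compares the material conditional with the "A and B, or not A" paraphrase, and confirms that "A or not A" comes out true in every case. It only models classical two-valued logic, which is an assumption in itself, given that the post questions how well logic fits natural language.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material conditional: false only when a is true and b is false."""
    return not (a and not b)

def paraphrase(a: bool, b: bool) -> bool:
    """The paraphrase from the text: (A and B) or (not A)."""
    return (a and b) or (not a)

# Build the full truth table for the two formulations.
rows = [(a, b, implies(a, b), paraphrase(a, b))
        for a, b in product([True, False], repeat=2)]

for a, b, cond, para in rows:
    print(a, b, cond, para)

# The two formulations agree on every row.
conditional_matches = all(cond == para for _, _, cond, para in rows)

# "A or not A" holds whatever A is, which is why it is called vacuous.
excluded_middle = all(a or not a for a in (True, False))
```

A plain future assertion, by contrast, would just be the single value A, with no table to build.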

On the other hand (to me, at least) future statements simply offer a single simple assertion: A, e.g., "It will rain tomorrow", and can easily be evaluated, even if not at the present moment. Likewise, past and present statements also give the simple assertion A, without any conditions or complex interactions. Since language and logic are so often not intertwined in any meaningful way, I haven't yet decided if this kind of analysis is at all useful, but it's a start.

Thursday, August 28, 2008

2nd person plural

The second person plural is one of those funny things in English that doesn't exist even though we want it to. Virtually every regional dialect has a 2pl. form (y'all, youns) even though standard English lacks it (except perhaps for the periphrastic "you guys", which is what I use). And that's because, gosh darnit, it's useful. Sure, context can often disambiguate whether one is addressing an individual or a group, but redundancy is a part of language, and in this case it's not even always redundant. More interesting is the emergence of what could be described as a tripartite plural pattern in Southern English: you (2sg.), y'all (2pl., small group), all y'all (larger group). It's not really a singular/dual/plural distinction, because I don't think anyone restricts the use of y'all to only two people, but there are people for whom there is a small plural/larger plural distinction between "y'all" and "all y'all". It just goes to show that if a language doesn't work in the way speakers want it to work, they make it work. After all, communication is the reason behind language, and we want to be able to accurately communicate what we want to.

Monday, August 25, 2008

The difficulty of working with a language one hasn't researched

Some of the best advice I got from one of my professors was to submit abstracts to conferences even if you don't have a paper written on the topic. If it's accepted, you can write the paper, and if not, save all that work for another time. It's essentially the same idea as getting a record deal based on a demo, instead of spending all that time and money recording an entire album that may or may not get picked up.

Of course, there are difficulties to this, the most notable being that it's pretty easy to get in over your head. I was recently working on an abstract about stress in Navajo for submission to the High Desert Linguistics Society conference in November. My basic idea was that stop aspiration in Navajo was dependent on stress, and I was going to figure out how. The problem was that I'd never done any theoretical work on Navajo before (with the exception of an abstract on stop aspiration in Navajo). So each minuscule aspect of the language I had to research. While I've studied Navajo a little from a textbook, I don't speak the language at all, and since it was a language textbook, it didn't use any theoretical or linguistic terminology. Lacking any background knowledge, whenever I had a question about a certain rule or pattern, I'd have to go look it up. And since Navajo isn't Indo-European, many times I'd simply have to do the research myself, however cursorily.

I highly recommend submitting abstracts even when the paper isn't written. It's one thing when one is doing an outside research project and writes up findings for that. But for most grad students, we're just trying to get ourselves into research and publishing, and generally don't have mountains of self-produced data to wade through for paper topics. So this is about the best we can do, and I don't think that's so bad.

(Bonus trivia: I saw a copy of the book "Black Like Me" with an odd font that was squished together and I believe lacked uppercase; I'm so used to seeing the -eme ending in words like "phoneme", "sememe", etc., that I immediately interpreted the title as "Black Likeme".)

Thursday, August 14, 2008

another reduction in post frequency

Things are starting to pick up as the school year begins, and rather than lapse into an unpredictable and erratic posting schedule (viz., I was working so late yesterday that I forgot to post here), I'm going to go to twice weekly updates on Monday and Thursday. So the next new post will be Monday 8/18.

Thanks to all those who are reading and commenting.

Monday, August 11, 2008


The subjunctive is just one of many historical aspects of English that are falling by the wayside. An example most people will recognize (and probably the only instance where the subjunctive would be commonly used) is saying "if I were" (subjunctive) rather than "if I was" (indicative).

Disclaimer: I prefer to use the subjunctive. I always use it where it is appropriate. It annoys me when people don't use the subjunctive. However, there is nothing "wrong" with using the indicative rather than the subjunctive. Many people did not acquire the use of the subjunctive when they acquired English. It is baseless and ridiculous to call the non-use of the subjunctive "improper English".

The subjunctive generally indicates situations that are counter-factual (contrary to fact, e.g., "I wish I were a millionaire"), conditional ("...whether it be/be it Communism, Capitalism, or some other economic structure..."), or, in certain rote phrases, future or nonaffirmative ("'til death do us part"). It was this last usage that caught me off guard, because I've never consciously analyzed that phrase, so familiar from wedding ceremonies. Upon reflection, I realized it must be the subjunctive (which in this case is signified by the bare form of the verb "do"), even though I can't think of a single productive instance (cf. *until he go to the store).

Bonus trivia: "If he were in the room" is counter-factual, and implies that he is not, while "If he be in the room" is a true conditional, implying that it is uncertain whether or not he is in the room (not that I've ever heard this used).

Saturday, August 2, 2008

next new post 8/11

After spending all day finishing an abstract for the LSA conference in January, I'm taking a bit of a much-needed break. Next new post will be Monday 8/11.

Wednesday, July 30, 2008

odd stress placement

When I was on hold recently I was constantly told to "Please remain on the line. One of our customer care representatives will be with you momentarily." Almost immediately I noticed that the sentence was stressed very strangely (and this was an actual recorded voice, not a computer generated message). I would put a pause between those two sentences, making them two separate utterances for prosodic purposes. For the first I would put a primary accent on "please" and a secondary accent on "line". For the second I would put a primary accent on "care" and a secondary accent on "with". However, this was not at all the case in this recording.

Instead, the speaker did not have a pause or any kind of intonational reset where the orthographic period is. She seemed to parse it into "Please remain on the line one of our customer. Care representatives will be with you momentarily." (At least, that's how I would represent orthographically the prosodic pattern she used.) In the first "sentence", primary accent was on "please" (no surprise there), but the secondary accent was on "our". In the second "sentence", the primary accent was on the third syllable of "representatives", while the secondary accent was on "with". Adding further to the oddity was the fact that the first intonational phrase fit perfectly into 3/4 time, complete with minor accents on the first beat, with "please" taking two beats: "Please -- re-/ main on the / line one of /our customer", and then of course it started to break down. But I found it exceedingly odd that, like myself, the speaker parsed the utterance into two intonational phrases, but that her phrases did not correlate in any way with meaning or clausal structure. There has to be some sort of flagrant alignment violation here, and I don't like it one bit. Luckily I'll never have to deal with it again, since I was on hold to cancel my account.

Monday, July 28, 2008

another Jay Leno headline

Another humorous "headline" from the Tonight Show recently was the misspelling "hors devours" instead of "hors d'oeuvres". This is what is referred to as a Cupertino (another linguistic term generated by Language Log), or a computer generated incorrection. Many people use spell checkers on their work, and some have automated spell checkers to replace misspelled words. Problems arise when the words are not actually misspelled, but rather unknown to the spell checking dictionary. If this were a human based error, we'd expect something closer to the original. Certainly no one, no matter how confused by spelling, would think "hors devours" is the correct spelling, especially since they got "hors" right despite its two silent graphemes. And let's be honest, who can remember how to spell "hors d'oeuvre"? The only reason I can manage to do so is because I'm aware of the French grapheme "oe", which keeps me from mixing up the order of the vowels, and because the metathesis (switching of segment order) of v and r in French loans is not uncommon (cf. Brett Favre).

(I'll refrain from discussing the addition of English plural morphology on a word that is already plural in the original language.)

Friday, July 25, 2008

Low humor

A lot of humorous signage and announcements arise from our preference for low attachment (attaching a NP to the lowest possible node, or in a more linear sense, having it complement the most recent verb, preposition, etc.). I heard a nice one a little while ago on Jay Leno's Headlines segment on the Tonight Show. It was a wedding announcement that mentioned the couple's traditional Hawai'ian wedding, complete with "the blowing of the conch shell and Hawai'ian minister". The intended reading is [[the blowing [of [the conch shell]]] and [Hawai'ian minister]], with two separate NP's joined by the conjunction "and". The humorous reading results from the fact that we don't want to attach "Hawai'ian minister" high up on the tree where it is sister to "the blowing of the conch shell". We want to attach it way down low to "blowing", giving us something like [[the blowing [of [the conch shell] and [Hawai'ian minister]]]].

Wednesday, July 23, 2008

new ECM verbs

Prescriptivists seem to be troubled by the advent of new and innovative ECM verbs. First, a little background. ECM stands for Exceptional Case Marking, and applies to verbs like "believe" or "expect" which take a CP complement (CP = Complementizer Phrase; in standard prescriptivist grammar essentially any complete clause) but assign accusative case to the subject of that clause. To quote my syntax teacher's example, "Max expects Maria to word letters carefully." In this example we have what is essentially a complete clause as the complement of "expects" (with the exception of the infinitive verb "to word", but that's outside the scope of this post). "Maria" is clearly the subject. Yet if we replace "Maria" with a pronoun, we're going to choose "her", not "she". This means that the NP in subject position of the complement clause is being assigned accusative case. How the heck is this possible? "Expect" is an ECM verb! It can assign case across a CP boundary (something verbs generally aren't supposed to be able to do).

Nowadays, though, it seems that lowly prepositions are taking CP complements (or has this always been the case?). Many of us have heard or uttered something like "I was surprised by them winning the race". Prescriptively, of course, this is "wrong". It should be "I was surprised by their winning the race", where "their winning the race" is a NP versus the CP of "them winning the race". Clearly there's something going on here, though, because plenty of people say things with this structure. My wager would probably be on the analysis of "be surprised by s.t." as a single verb, and then giving that verb ECM marking. Try as I might I can't think of very many good examples of this construction, even though I hear it all the time, so I may post a follow up later.

Monday, July 21, 2008

The logicians have lost.

As some people interested in English history know, there was a movement a few hundred years ago to make English a logical language, in the sense of obeying the rules of predicate logic. One example of this is the attempted eradication of double negation. In symbolic logic, ~p means "it is not the case that p is true" regardless of what p is. It could be a whole series of statements, and the one negation negates them all. This was not true in English until the logicians made it so. Many people still use double negation, but now it's a marked variant, a non-standard dialect. Another example is the use of the nominative case for verbs of being, i.e., "It is I". Logically, the logicians said, the copula there ("is") represents "=", and thus the word preceding and following it should have the same case. It seems, however, that ultimately the logicians have lost.

The token that really brought this home to me is the Godspeed You! Black Emperor song "motherfucker = redeemer". I was puzzling over what the song title could mean, and realized that one reading that was certainly not possible was that every motherfucker is a redeemer and that every redeemer is a motherfucker, which in logic and math is what the symbol "=" is used for. 2+4=6 is a truth, no matter how you look at it. However, using the equal sign in English generally means that the left hand item is equivalent to the right hand item, but not vice versa. The copula works the same way. If I say "Computational linguists are jerks", I don't mean that every jerk is a computational linguist, but I probably mean that every computational linguist is a jerk (I didn't say it was an accurate statement; it's just an example). Sorry, logic. Natural languages don't really like you.
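
The asymmetry between the copula and "=" can be sketched with sets, on the assumption (mine, for illustration only) that each noun stands for the set of individuals it describes. The names below are invented.

```python
# Hypothetical individuals, invented purely for illustration.
computational_linguists = {"Ann", "Raj"}
jerks = {"Ann", "Raj", "Bob"}

# Copula reading: every computational linguist is a jerk (subset).
copula_reading = computational_linguists <= jerks

# Logician's "=" reading: the two sets are identical.
equals_reading = computational_linguists == jerks

print(copula_reading)  # True
print(equals_reading)  # False
```

The subset test holds while the equality test fails, which is roughly the difference between what the sentence says and what a strict "=" would require.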

Friday, July 18, 2008

The scope of "next"

When I say "next Tuesday" I mean the Tuesday of next week. So if today's Monday, I don't mean tomorrow. Similarly, if I say "next Saturday", I mean the Saturday of next week. On Monday, I don't mean the day five days from now, I mean the day twelve days from now. I've found that this is not true of everyone, and that there's quite a split in how people perceive this usage. It seems the two main interpretations are "the next X that occurs" and "the X of next week". So to some people, saying "next Saturday" means "the next Saturday that occurs" which may often be the Saturday of this week. On the other hand, to people like me, "next Saturday" always means the Saturday of next week; using the word "next" cannot refer to any day this week. Needless to say, this causes problems.

The OED gives us, under the entry for "next":

Applied (without preceding the) to days of the week, with either the current day or (in later use; orig. Sc.) the current week as the implicit point of reference.
Thus (for example) next Friday may mean ‘the soonest Friday after today’ or ‘the Friday of the coming week’. The latter may be indicated contextually, e.g. by contrast with this, but it is not always clear which meaning is intended.

So the key question here is what kind of scope "next" has (here I don't really mean semantic scope so much as temporal scope). For some people, the frame of reference is the day, for some the week. It seems the key distinction is that last sentence from the OED quote: people who distinguish "this Friday" from "next Friday" are going to use the week as the frame of reference, whereas someone who doesn't use "this X" for days of the week isn't going to have any kind of week association with the word "next"; it will mean what it means in ordinary speech, i.e., the next X that occurs, without any intervening time.
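
The two readings can be written out as date arithmetic. This is only a sketch: the function names are my own, and it assumes a week that runs Monday to Sunday, which is itself something speakers may disagree about.

```python
from datetime import date, timedelta

SATURDAY = 5  # date.weekday() counts Monday as 0 and Sunday as 6


def next_occurrence(today: date, weekday: int) -> date:
    """'Next X' with the day as frame of reference: the soonest X strictly after today."""
    days_ahead = (weekday - today.weekday() - 1) % 7 + 1
    return today + timedelta(days=days_ahead)


def of_next_week(today: date, weekday: int) -> date:
    """'Next X' with the week as frame of reference (assumes weeks start on Monday)."""
    start_of_next_week = today + timedelta(days=7 - today.weekday())
    return start_of_next_week + timedelta(days=weekday)


monday = date(2008, 7, 14)  # a Monday
print((next_occurrence(monday, SATURDAY) - monday).days)  # 5
print((of_next_week(monday, SATURDAY) - monday).days)     # 12
```

The two functions agree only when the soonest X already falls in the following week, which is why the ambiguity goes unnoticed some of the time.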

Thursday, July 17, 2008

When good coarticulations go bad

In English we have two /l/ sounds, commonly called the "light" or "clear" l and the "dark" l. They exist in complementary distribution, with the light l in onsets and the dark l in codas (and syllable nuclei, though in these cases the l is underlyingly in coda position). The clear l is a regular lateral approximant, with the tip of the tongue resting on the teeth or alveolar ridge, depending on pronunciation, and the dark l is identical in apical (tip) placement, but with co-occurring velar constriction by the dorsum (the back of the tongue).

However, Tom Brokaw for some reason just doesn't like clear l's. Not only does he pronounce all his l's (including those in onset position) dark, he doesn't even articulate the apical feature of the sound, instead using only the back of his tongue for the velar articulation, resulting in what can sound at times like a French "r" or Arabic "gh".

On a side note, my understanding of the two different l's in English was an extremely important step in my pronunciation of Spanish, which only has clear l's. Try it yourself: say "lamp" and "awl". The former is a clear l, the only l in most languages. The latter is a dark l.

Wednesday, July 16, 2008

British h-dropping

I don't know a whole lot about British pronunciation, but my impression was that h-dropping (i.e., failing to pronounce the h at the beginning of a word) is considered "low class", and is not present in RP (Received Pronunciation, "the Queen's English"). It is also my understanding that RP is the standard dialect used in broadcasting, public speaking, etc., or at least it was until recently (I seem to remember John Wells blogging about Estuary English overtaking RP in public settings in the past 10-20 years). However, I noticed in a news clip from the 70's or 80's that the newscaster failed to pronounce his h's. There was one particular example that caught my ear. The clip was about Gary Glitter, a pop star from a few decades ago, who was arrested. The newscaster said, "Gary Glitter 'as been arrested." In rapid speech I probably wouldn't even have noticed the h-dropping (after all, even in American English we would probably drop the h in that situation), except that in the newscaster's non-rhotic dialect the "r" at the end of "Glitter" jumped out at me. Since he pronounced the "r", I had to surmise it was in onset position, which means there couldn't have been an h.

So I guess my question is for any British English speakers, or anyone else who knows: what's the deal with h-dropping? Is it common in broadcast speech?

Tuesday, July 15, 2008

More on alignment constraints

It seems that "jurisdiction" can be syllabified in two ways: juris.diction or juri.sdiction (leaving aside the other two syllable boundaries). The first is what I thought to be my own pronunciation (as it turns out that's only the way I perceive my own pronunciation because I perceive the morpheme boundary between "juris" and "diction"), while the second is probably the common way people pronounce the word. The NoCoda constraint strikes again! Alignment constraints want us to align morpheme boundaries with syllable boundaries, and since "jurisdiction" comes from a combination of Latin juris (the genitive of jus, 'law') and dictio (from dicere, 'say, speak'), theoretically the syllable boundary should be between the "s" and the "d". However, the NoCoda constraint is ranked high enough in English that we would rather sacrifice alignment than have a coda in the preceding syllable. We also have a lower ranking Ident-IO(vc) (input and output segments should have the same value for [voice]) constraint, since we would rather say jur.i.stic.tion than attempt the unwieldy jur.i.sdic.tion.

Monday, July 14, 2008


I've been keeping up with a show called "The Next Food Network Star", in which a number of cooks compete for their own show on the Food Network. In one of the episodes the contestants had to create and market their own pre-packaged food product. One of the contestants chose to make a chocolate sauce containing cherries and cognac, and marketed it as "Cherri-gac", pronounced ˈtʃeɹiˌjæk. What I thought was interesting about the spelling is the perception of the "g" as /j/, even though it comes before the "n". The original French pronunciation would be koɲɒk, with a probable American phonemicization of kɒnjæk, unless the speaker really has a palatal nasal in their idiolect.

Thus the "gn" sequence in "cognac" is interpreted as a phonetic [nj] sequence, and apparently it didn't bother this contestant that the [j] sound comes after the "n" while the orthographic "g" comes before the "n". I thought this was rather strange because (if I can try to remember back before I started being interested in orthography and pronunciation) I think my original interpretation of "gn" sequences in French and Italian was that the "g" was silent, and the palatalization of the "n" was just a quirk of those words in those languages. Clearly this is not the only way people view that digraph. Since he associated the "g" with [j], it made sense to him to spell his product as he did.

Friday, July 11, 2008

Another reason for universal linguistics education

To the phonetician, nothing is more amusing than seeing a band that has decided to bring umlauts into its name. It makes sense. Linguistics is such an understudied field that the very name of the field is (at least on one occasion in my own experience) confused with a type of pasta. To the layperson, umlauts, accents, and other diacritics are merely decorations. Sure, they know that in some arcane science they have some meaning, but nothing that they need to pay attention to. I think we should all go around pronouncing these ridiculous band names as they should be pronounced according to the orthography. It might turn some heads to pronounce Mötley Crüe as møtli kɹy (or perhaps as møtɬi crye), or Mötörhead as møɾøɹhɛd or møtørhead.

Thursday, July 10, 2008

Sszark the Burning

As anyone who has played the video game Diablo II can attest, the best part is the names of the unique characters. These include the venerable Frozenstein, Puke Pus the Sharp, and one that hadn't caught my eye until recently: Sszark the Burning. The real question here is, how do the game designers want us to pronounce this? Is that a long "s" at the beginning, followed by a [z]? Maybe the "ss" is a syllable nucleus, as can be the case in Blackfoot: ss.zark. Or is it an "s" followed by the Hungarian "sz" for /s/, giving a long s cluster at the beginning: ssaɹk? It's a mystery to me, but either way it's a great name.

Wednesday, July 9, 2008

Why we need linguistics education for all

In last month's issue of Cigar Aficionado, while reading an article on cachaca (Brazil's national liquor of choice, distilled from sugar cane but by a different method than rum), this gem of pronunciation advice caught my eye, concerning the three different c's in the word:

The first is hard, the second is the soft blend of "ch" as in "chagrin," and the third is a combination of "s" and "z" like the "c" in "facade": ka-SHAH-sa.

The first description is well-known to any speaker of English; we often talk about "hard" c's (/k/) and "soft" c's (/s/). The second description is a little confusing ("blend" of what? "c" and "h"?), but fairly transparent with the example word. The third description, however, really threw me for a loop. A combination of "s" and "z"? They seem to have picked the one feature of phonetics that truly is on or off, without any gradations (yes, there are several types of voicing, but it all comes down to either the vocal folds are vibrating or they aren't). It's clear enough what sound they mean: /s/, as evidenced by the sample word "facade." But what the heck were they trying to describe by saying that this normal "s" sound is somewhere between /s/ and /z/? Maybe unbeknownst to me everyone else pronounces "facade" with a breathy voiced "z".

Tuesday, July 8, 2008

I hate lazy spammers

If you're going to phish for important information, at least get your syntax right. I was told recently in an email, "Verify your account now to avoid it closed." The intent of the message is obvious, and skitters just on the edge of being grammatical (maybe "avoid it being closed"?), but clearly this is an inexpert English user rather than a typing-impaired spammer. Come on guys, have pride in what you're doing, even if that's trying to trick people into giving up their credit card information.

Any guesses on the writer's native language?

Monday, July 7, 2008

literally, technically

Semantic bleaching is something that happens all the time. In fact, I just used it: I don't mean that at every infinitesimal moment in time semantic bleaching is occurring; I merely mean that it is not uncommon to see its effects. Essentially semantic bleaching is the lessening of the force of a word. For instance, even "extremely" these days doesn't mean much. People have an interesting way of dealing with this. One of the strategies I hear most often is to use "literally," as in, "I could literally eat a horse I'm so hungry." Clearly this person does not mean that they desire to sit down and consume an entire horse. What they mean is that we use hyperbole so much in everyday speech that even saying one could eat a horse does not express the extreme hunger that person is experiencing. Hence the use of literally. My professor Tony Mattina proposed that people use "literally" essentially as the opposite of what it actually means: to mean "metaphorically". However, I think the difference is more subtle. Not all the usages I hear are strictly metaphorical. Rather, I think people are using the word "literally" not to mean literally, but simply as an intensifier.

Another (and I believe related) difference in word usage is with the word "technically". On a linguistics forum I frequent a new member posed a question about "technically". He recently voted for candidate A because he did not want candidate B to win. Thus in his mind he was technically voting against candidate B, rather than for candidate A. However, his friend argued with him, saying that since technically it is not possible to cast a vote against a candidate, he technically voted for candidate A. My take on this is that the friend is the correct one, at least in the strictest usage of the word. To me, "technically" refers to procedure, i.e., what is objectively happening at any given moment. Thus if John shoots Bill, technically all he's doing is pulling the trigger of a gun. Of course, if you believe in the slippery slope argument, where do we draw the line? Perhaps I should really say John's brain is firing electrical impulses that cause his index finger to contract.

Thursday, July 3, 2008

4th of July weekend

No new posts until Monday because of the 4th of July weekend. I will be busy smoking a butt of pork and drinking mint juleps, but fear not, I'm sure I'll still be thinking about Optimality Theory.

Wednesday, July 2, 2008

Any X is X who X

I was struck by an odd (to my ear) phrasing by Richard Gere in a preview for what looks to be a terrible romantic comedy/drama: "Any man is a fool who doesn't appreciate you." I'm sure I've heard this construction before, but it definitely rubs me the wrong way. The problem is the isolation of the relative clause from the NP it modifies. I would always phrase such a thought as "Any X who X is X." Searching for "any * is * who *" on Google returns several hits of the same phrasing, so clearly it's not a rare construction. After careful consideration, I'm unable to make heads or tails of it syntactically. The incorrect interpretation would be the following:

With "any" the interpretation doesn't make much sense, but replace it with "every" and there could be real ambiguity, at least in print. The only way I could see generating this kind of construction syntactically is some sort of movement, where the CP starts out under the NP "any man" and then moves lower down. This is also a good example of the cognitive preference for low attachment, i.e., we want the CP "who does not appreciate you" to be attached to the lower NP, not the higher one the speaker wants us to attach it to.

Tuesday, July 1, 2008

Noun Incorporation

My wife brought to my attention a spectacular example of noun incorporation in pop culture: twinfanticipating. Apparently it was used to describe Angelina Jolie, who is currently pregnant with twins. It is, of course, a blend of three words: twin, infant, and anticipating. We couldn't agree on how to pronounce it. I opted for ˌtwɪnfənˈtɪsɪpejtɪŋ, preserving the stress of twin, infant, and anticipating, but using the vowel quality in infant for the second vowel, rather than the vowel quality of anticipating. My wife, on the other hand, insists that it should be ˌtwɪnfænˈtɪsɪpejtɪŋ, with the same conservation of stress, but using the vowel quality of anticipating rather than infant. My argument is that anticipating already has four syllables to itself, so infant should at least get that second syllable, even if it has to share both of them.

As for the word itself, the meaning is fairly obvious: anticipating twin infants. It is also a prime example of true noun incorporation. One of the hallmarks of true noun incorporation is that it decreases the valency of the verb. While non-verbal noun incorporation is debatable (where's the line between true incorporation and compounding, or are they the same?), verbal noun incorporation always includes a decrease in verbal valency, i.e., the number of arguments the verb takes decreases. In this case, the original phrase "anticipating twin infants" has a valency of 2: the anticipator and the twins. However, "twinfanticipating" has a valency of 1: only the anticipator is involved. We see this kind of incorporation all the time in Salishan and Wakashan languages, in which, for example, you could have a verb that refers to buying meat, so that one would say something like "I did some meat-buying this weekend." "Meat" ceases to be a separate argument of the verb and instead becomes incorporated inside the verb as a bound morpheme that contributes meaning to the specificity of the verb. Recently I've heard many English speakers do this as well: "online-shop" instead of "shop online." More interestingly, I was told by a waitress to "overlook" their list of daily specials when I was on my way to a conference a couple months ago (she meant for me to look over the list, not overlook it).

Monday, June 30, 2008

Looking out after someone

From the fertile linguistic field of sitcoms comes another expression that struck my ear as odd: "They're just looking out after you" (from an episode of Friends). A quick check of Google turns up the following results:

  • "look after": 15,500,000
  • "look out after": 282,000 (11 out of the first 30 results are in the sense of "look after", or 37%)
  • "looking after": 7,620,000
  • "looking out after": 157,000 (25 out of the first 30 results are in the sense of "looking after", or 83%)

So what does this mean? It definitely means the line I heard was not an isolated occurrence or malapropism. Clearly the writers or the actor intended that phrase. What puzzles me more than the phrase itself is the large discrepancy between the results using "look" and "looking". However, I'm guessing the discrepancy is not actually that high, but rather it's easier to find results of "...look-out, after" and "...look out, after" than it is to find the corresponding phrases with the progressive form. Still, to me "looking out after" sounds decidedly weird. Perhaps it's a mixture of "look out for" and "look after". As much as a quick search of Shakespeare can tell us, the expression seems to be new rather than original, since "look after" turns up 6 hits in Shakespeare, whereas "look out after" returns none.
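
To get a rough sense of how large the discrepancy really is, the sampled proportions can be scaled up to the total hit counts. This is a back-of-the-envelope sketch only: it assumes the first 30 results are representative of all the hits, which is a very shaky assumption, and the function name is my own.

```python
def estimate_relevant(total_hits: int, relevant_in_sample: int, sample_size: int = 30) -> float:
    """Scale the sampled proportion up to the reported total (assumes a representative sample)."""
    return total_hits * relevant_in_sample / sample_size


# Figures copied from the list above.
look = estimate_relevant(282_000, 11)
looking = estimate_relevant(157_000, 25)

print(f"look out after: about {look:,.0f} hits in the 'look after' sense")
print(f"looking out after: about {looking:,.0f} hits in the 'look after' sense")
```

On these figures the two estimates come out at roughly 103,000 and 131,000, so the gap between the two forms looks much smaller in absolute terms than the raw 37% and 83% suggest.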

Friday, June 27, 2008

Nasal weakening

Languages seem not to like nasals. They're usually forced to assimilate in place of articulation to a following stop, sometimes even a fricative, and often they're dropped entirely (which leads to nasalized vowels as separate phonemes, e.g., French or Brazilian Portuguese sãõ). In some languages they are absent altogether (an areal feature of a group of 5 languages from the Chimakuan, Wakashan, and Salishan families of the northwest coast of North America). Even in languages with strong nasal functional load, nasal consonants are often weakened in some way. In American English the flapping rule applies not only to /t/ (bʌɾəɹ for "butter"), but also to /n/ (ɛɾ͂i for "any"). In Portuguese (at least Brazilian Portuguese), the palatal nasal (represented orthographically by the digraph "nh") is often reduced to a nasalized vowel + palatal glide, so that a word like "minhas" comes out as mĩjas.

If we think of this in terms of Optimality Theory (which, honestly, is what I'm doing always, even in non-linguistic things like traffic patterns and evolutionary biology), we can talk about two constraints: Exp(ressivity) (a sort of catch-all constraint I've been using half-seriously, with a definition something like "language should be able to express a speaker's thought accurately") and *Obs(truction) (something like "sounds should obstruct the vocal tract as little as possible"). The desire for expressivity is clearly what drives the robust distinctions between consonants after thousands of years. Though some will dispute it, there is an undeniable tendency for languages to simplify, especially phonologically. This is how "want to" becomes "wanna," how "going to" becomes "gonna," and further how "I'm going to" becomes "I'ma." What holds back this march of simplicity is the Exp constraint. Simplify things too much, and people won't be able to express themselves properly, at least not without long strings of the same consonants/vowels.

So what languages seem to do is reduce nasals whenever possible. The reason this doesn't usually turn into nasal deletion or any other radical change is the necessity to use language as an expressive tool.

Thursday, June 26, 2008

Pre-glottalization in British English

As an American, one of the most striking phonetic features of British English for me is the pre-glottalization of voiceless stops, often with concurrent aspiration of a word-final stop, as in the word "caught": kɒʔtʰ. In some dialects pre-glottalization becomes replacement, as in the classic example of Cockney "bottle" as bɒʔəl. Since complete glottal closure is part of the articulation of a stop, whether or not a stop is pre-glottalized can be difficult to tell, especially in rapid speech. Teasing out the glottal closure with the articulation of the stop itself to make it a separate phonetic segment takes time and articulatory energy, and thus it is much reduced in rapid speech. Listening to a paragraph at the Speech Accent Archive (my discussion will be a lot clearer if you click on that first link and take a look at the paragraph yourself), I felt I had to disagree with the transcriber's decisions in transcribing "thick," "snack," and "meet," which to me have a hint of pre-glottalization.

Much has been written of pre-glottalization, most of it by Frederik Kortlandt, but there seems to have been little attention paid to what this does to syllable structure. There seems to be a tendency to interpret a glottal stop as syllable-final, and any stops after it get thrown into the onset of the following syllable. For instance, in the paragraph mentioned above, "meet her" comes out more like miʔ.tʰɜ than mi(ʔ)t.hɜ. We can see a clear difference with the word "snake," which has no pre-glottalization. In the sequence "snake and," we end up with snejk.ʔæn(d), rather than snej(ʔ).kæn(d). An argument I made in a presentation this year was that it may be possible for word-final stops to be syllabified separately, especially after a glottal stop (I took my data from Blackfoot, but English does this quite commonly as well). As my colleague Tim Henry explained to me, the reason stops have so much force at the end of words is that without aspiration or a following vowel, it is exceedingly difficult to perceive the stop's place of articulation.

Wednesday, June 25, 2008

talk balk

I ran across an example of, for lack of a better term, a typographic echo a while ago: "talk balk" for "talk back." Certainly this could merely be a typographic error. Under the influence of "talk," the typist all but duplicated the word by changing the "c" in "back" to an "l." However, I do not think this is the case. In fact, it was difficult for me to even type the phrase "talk balk" (I ended up with "baclk" at first), because it isn't a phrase that's ever used, as opposed to "talk back." More likely this is the kind of error someone would make in speech, and since (at least for me) phonology strongly influences typing errors in people who are fairly fluent typists, it was translated to the screen without the speaker/typist noticing.

There is clearly a cross-linguistic tendency for sounds to want to assimilate to their neighbors. Almost every language has a nasal assimilation rule that prevents clusters like nk, instead turning these into ŋk clusters, even across morphological boundaries (ɪŋkəmplit for "incomplete") and sometimes even word boundaries (ɪŋ kamən for "in common"). These preferences can also skip over segments, as in vowel harmony or the famous tongue twister "She sells seashells by the seashore." I think that may be the most likely explanation for the error "talk balk."
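
The nasal assimilation rule mentioned above can be sketched as a simple rewrite over a string of segments. This is a toy version under my own assumptions: the segment inventory is tiny, only stops trigger the rule, and the `across_words` switch is a hypothetical way of modelling the "sometimes even word boundaries" case.

```python
# Place of articulation for a handful of consonants (a deliberately tiny inventory).
PLACE = {"p": "labial", "b": "labial", "m": "labial",
         "t": "alveolar", "d": "alveolar", "n": "alveolar",
         "k": "velar", "g": "velar", "ŋ": "velar"}
NASALS = {"m", "n", "ŋ"}
NASAL_AT = {"labial": "m", "alveolar": "n", "velar": "ŋ"}


def assimilate_nasals(segments: str, across_words: bool = False) -> str:
    """Give each nasal the place of articulation of an immediately following stop."""
    out = list(segments)
    for i, seg in enumerate(out):
        if seg not in NASALS:
            continue
        j = i + 1
        # Optionally look past a single word boundary (written as a space).
        if across_words and j < len(out) and out[j] == " ":
            j += 1
        if j < len(out) and out[j] in PLACE and out[j] not in NASALS:
            out[i] = NASAL_AT[PLACE[out[j]]]
    return "".join(out)


print(assimilate_nasals("ɪnkəmplit"))                    # ɪŋkəmplit
print(assimilate_nasals("ɪn kamən", across_words=True))  # ɪŋ kamən
```

Note that this only handles adjacent segments; the long-distance effects (vowel harmony, tongue twisters, and presumably "talk balk") would need a rule that can look across intervening material.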

Tuesday, June 24, 2008

Verbing in American English

If there's one thing we love to do in English, it's verbing. In case my usage isn't wholly transparent, I mean creating verbs out of nouns simply by reclassifying them, as opposed to applying any derivational morphology. The title of this post refers to American English because that's my dialect, but I expect it's widespread in the English language, and for all I know many others. Recent examples that come to mind specifically have to do with web sites: to google something, to mapquest something, to youtube something. We love to simply take a noun (especially a proper noun) and just use it as a verb with no special morphology. Case in point: just now one of the how to's of the day on my iGoogle page was "How to network." In fact, it's a little difficult to think of network not being a verb, but it resulted from the noun. Thus the verb is "to make something a network." The OED records the first instance of network as a noun in 1530, with the computer usage coming about in 1962. The verb network, on the other hand, first appeared in 1845, with the computer usage coming about in 1982. In both cases the noun appeared, followed later by the verb, even though in the case of the computer usage, the use of network as a verb was already well-attested.

The example that prompted this post was my use of JSTOR (an online repository of scholarly articles) a few weeks ago. I was searching for something which returned no results, and as a helpful tip JSTOR told me that I may have gotten no results because my search "may have been ANDed instead of ORed." Here we have even lowly conjunctions being used as verbs; JSTOR was telling me that the AND operator was probably used instead of the OR operator, resulting in far fewer hits.
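
For anyone unsure why ANDing returns fewer hits than ORing, here is a minimal sketch using sets. The index, the search terms, and the article numbers are all invented, and this says nothing about how JSTOR actually implements its search.

```python
# Hypothetical mini-index: search term -> set of matching article IDs.
index = {
    "navajo": {1, 2, 3, 4},
    "aspiration": {3, 4, 5},
    "palatalization": {4, 6},
}


def search(terms, operator):
    """Combine per-term results with AND (intersection) or OR (union)."""
    results = [index.get(term, set()) for term in terms]
    if not results:
        return set()
    if operator == "AND":
        return set.intersection(*results)
    return set.union(*results)


terms = ["navajo", "aspiration", "palatalization"]
print(search(terms, "AND"))  # {4}
print(search(terms, "OR"))   # {1, 2, 3, 4, 5, 6}
```

An ANDed search keeps only the articles that match every term, so each added term can only shrink the result, while an ORed search can only grow it.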

The ease with which we do this sort of thing raises the question of how different nouns and verbs really are at the underlying level in the lexicon. There are plenty of papers on the noun/verb debate in Salishan and Wakashan languages of the American Northwest, in which many words can be used as either noun or verb merely by applying noun morphology or verb morphology (NB: I do not mean adding derivational morphology to derive a noun or a verb, I mean simply adding tense/aspect/mood inflection or person/number inflection). Of course, in English we don't even deal with morphology. The difference is indicated entirely in the syntax and semantics.

Monday, June 23, 2008

putting the light verb phrase to work

I ran across a construction recently that struck me as decidedly odd, in reference to some new medication: "it can cause you side effects." Granted, this wasn't in an advertisement or official medical literature, but it clearly made sense to the person typing it. For me this is a decidedly marked construction, even though we use it all the time in other contexts, e.g., "I gave her the book." I'm not at all versed in modern syntax, but my understanding is that it is the so-called light verb phrase that is at work here. The idea is that IP immediately dominates this vP, the bar-level category of which immediately dominates a regular VP. The verb is then base-generated in the V node, but moves up to the v node, as shown in the following (generated using RSyntaxTree by Yoichiro Hasebe):

Of course, knowing all this still doesn't tell us why the construction is odd. I think the reason is probably because the verb "cause" simply doesn't usually have a valency this high. Often it takes a clausal complement ("I caused him to drop the ball"), and when it doesn't it usually has a single internal argument ("This drug can cause side effects"). When I try to rephrase "cause you side effects" I don't really get anywhere: cause side effects to/in/for you. Any of the prepositions still sound marked to me. So I think this is simply a case where the verb manages to have a higher valency for someone else than it does for me. If my idiolect allowed "cause" to have a third argument, I don't think I'd find the double-object construction odd. But since it doesn't, I do.

Sunday, June 22, 2008

the NoCoda constraint

As I mentioned last week, there is a cross-linguistic tendency to assign as many consonants as possible to an onset when dealing with a consonant cluster (review: syllables have three parts - the onset, which consists of the initial consonant or consonant cluster, the nucleus, which consists of the vowel or diphthong in the syllable, and the coda, which is the final consonant or consonant cluster; NB: this is a simplified description). The reason for this is two complementary cross-linguistic trends: syllables like onsets, and they don't like codas. These trends are formalized in Optimality Theory as the constraints Onset (syllables should have an onset) and NoCoda (syllables should be open).

If these were the only two relevant constraints, there wouldn't be any syllables with codas, because all of the consonants would get piled on the onset of the following syllable. However, other preferences override these constraints in certain situations. Take the word "constraint," for example. The word has the massive "nstr" cluster in the middle. If all we cared about was getting rid of codas, we'd shove all those consonants onto the second syllable, giving a syllabification of co.nstraint. Go ahead, try pronouncing that. It's not very fun. The reason is that there is another constraint at work here, called sonority sequencing. This refers to the general tendency of onsets to increase in sonority and for codas to decrease in sonority (I'll leave an explanation of sonority for another post, but generally speaking, in order from least to most sonorous, sounds are classified this way: stops < fricatives < resonants < vowels, with voiced sounds being more sonorous than voiceless sounds within each category). In OT this constraint is Son-Seq (sounds increase in sonority moving from syllable margin to syllable nucleus). In the "nstr" sequence, the n is higher in sonority than the s, which means putting them together in an onset violates sonority sequencing. Since in English Son-Seq is ranked higher than NoCoda, we would rather syllabify the word as con.straint than co.nstraint.
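
The ranking argument can be sketched as a tiny OT evaluator. Everything here is a simplification of my own: syllables are (onset, nucleus, coda) triples, the sonority scale ignores voicing, Son-Seq is counted as one violation per adjacent pair that goes the wrong way, and only the two candidates discussed above are compared. Under this crude count the "st" of "str" also earns a violation, so a fuller analysis would need more candidates and more constraints.

```python
# Rough sonority scale from the post: stops < fricatives < resonants (voicing ignored).
SONORITY = {}
for seg in "ptkbdg":
    SONORITY[seg] = 0
for seg in "fvsz":
    SONORITY[seg] = 1
for seg in "mnlr":
    SONORITY[seg] = 2


def no_coda(candidate):
    """NoCoda: one violation per syllable that has a coda."""
    return sum(1 for onset, nucleus, coda in candidate if coda)


def son_seq(candidate):
    """Son-Seq: one violation per pair that fails to rise in an onset or fall in a coda."""
    count = 0
    for onset, nucleus, coda in candidate:
        count += sum(1 for a, b in zip(onset, onset[1:]) if SONORITY[a] >= SONORITY[b])
        count += sum(1 for a, b in zip(coda, coda[1:]) if SONORITY[a] <= SONORITY[b])
    return count


def winner(candidates, ranking):
    """Strict domination: compare violations constraint by constraint, highest-ranked first."""
    return min(candidates, key=lambda name: [c(candidates[name]) for c in ranking])


candidates = {
    "con.straint": [("k", "ə", "n"), ("str", "ej", "nt")],
    "co.nstraint": [("k", "ə", ""), ("nstr", "ej", "nt")],
}

print(winner(candidates, [son_seq, no_coda]))  # con.straint
print(winner(candidates, [no_coda, son_seq]))  # co.nstraint
```

Swapping the order of the two constraints flips the winner, which is the whole point of ranking: the same constraints with a different ranking give a different grammar.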

The reason I'm talking about this at all is from having set our DVR to record the show Good Eats on the Food Network. The host of the show, Alton Brown (who is also the announcer/narrator for Iron Chef America) has a formidable NoCoda ranking, and almost always assigns many more consonants to onsets than most of us would ever want to. An example of this is his syllabification of the word "fifteen." I say fɪf.tʰin, with aspiration on the t because it begins a stressed syllable. However, Mr. Brown says fɪ.ftin, a syllabification distinguishable from my own pronunciation by the lack of aspiration on the t, which signifies that it is not syllable-initial (compare pʰɪt with spɪt). This unusual syllabification violates sonority sequencing (remember that stops are less sonorous than fricatives), but he prefers it because it results in only one NoCoda violation instead of two.

Another example of an extremely high-ranking NoCoda constraint is the syllabification of a word-final coda with the following word, even in careful speech. I wish I could remember the token Alton produced to make me think of this, but any random example will do. Usually in rapid speech, with a phrase like "farm aid," we say far.maid, because of the constraint *Complex (syllable margins should be simple). Our Align-Morph-R constraints are being violated here, since the first morpheme is being split between two syllables, but in rapid speech we generally prefer violating that to violating *Complex, which is an articulatory constraint as opposed to a semantic one. However, I noticed that even in careful speech, Alton Brown continually violates Align-Morph-R, because he would always rather fulfill phonetic and articulatory constraints than more abstract semantic ones.

Friday, June 20, 2008


The word "eggcorn" was coined by the good folks at Language Log (by Geoff Pullum, to be specific), and refers to the substitution of an analyzable word or phrase for one which is obscure or archaic. The eponymous example is "eggcorn" for "acorn." Since "acorn" is monomorphemic (and apparently obscure in someone's dialect), a woman insisted that the word was "eggcorn," presumably because acorns resemble small eggs. As much as Wikipedia insists that eggcorns and folk etymologies are completely separate occurrences, I think there are obvious similarities. Technically, folk etymologies are found throughout a society, as opposed to in a single idiolect, but the idea behind them is the same: making sense of an otherwise nonsensical word or phrase. An oft-cited example is "sparrow grass" for "asparagus" (though personally I've never heard someone say it).

There is a whole rash of words that arose via the misparsing of articles and the words they attach to. A different process perhaps, but again, one that stems from a common source: the belief of speakers that they understand the origins of a word or phrase. Examples include "an apron" for historical "a napron," "an orange" for "a norange," "an asp" for "a nasp," etc. One common example that I think we all run into from time to time is "a whole nother." People almost never write this, opting instead for "a whole other," because they know that the word is "other," not "nother." The astute observer may notice that in all the above examples the /n/ is moving from the article to the noun, and never the other way around. This is due to the tendency of syllables to have onsets. Phonetically, when presented with the sequence VCV, people will almost exclusively parse it as V.CV, even if morphologically it is [VC][V]. This is formalized in OT as the constraint Onset (syllables should have onsets).

One final discussion: spelling pronunciations and pronunciation spellings. Spelling pronunciations occur quite often, and are simply pronunciations based on how a word is represented orthographically as opposed to what the historical or more common pronunciation is. A great example of this is the [r] in Burma. The name of the country was more accurately pronounced ba:ma, and in non-rhotic British English, the way to signify vowel length was to add an "r". For them, of course, it would yield something approximating the correct pronunciation, but for us rhotic speakers it creates a non-underlying [r] sound.

A great example of a pronunciation spelling I ran across recently is "pubity" for "puberty." Presumably the person lives in New England, in which case "puberty" and the hypothetical "pubity" would be pronounced the same: pʲubəɾi.

Wednesday, June 18, 2008

There's as an existential quantifier

Recently (or maybe this is an example of the recency illusion) I've noticed more and more people using "there's" with plural items. This is not to say I just heard it yesterday, but I don't remember ever having heard it before, say, 5-10 years ago. Clearly it was around earlier, because I overheard it in an episode of Friends from their first season, which I believe aired in 1994. A quick Google search turns up a movie called "For Every Man, There's Two Women", which dates from 1984. And, in fact, a search of Shakespeare turns up 3 hits (e.g., "There's two or three of us have seen strange sights" Julius Caesar, I, iii). I was, however, unable to find an example in Chaucer, so it could be a mere 500-600 years old. Clearly this is an example of the recency illusion, the tendency of people to believe things they have just heard (or more often just noticed) are new to the world.

I believe what's going on here is not merely laziness. Prescriptivists love to jump all over linguistic innovations, pointing out how they are vague, lazy, or just downright immoral. In many instances, these critics couldn't be any more wrong. Most often linguistic innovations arise because people have a desire to express themselves, and want a better and more succinct (and often LESS vague) way to say what they're thinking. A good example of this is "like," as in "John saw Steve hit Mary and was like 'What the hell?!'" Critics would probably have this utterance rephrased as "...and said, 'What the hell,'" or "...and thought, 'What the hell.'" The problem occurs when John neither said nor thought this. The use of the word "like" conveys an emotion via a descriptive phrase, and there is simply no other way to do this in the English language. I'm a very conservative "like" user, because to me it is marked and sometimes, when used in excess, the subject of contempt. However, I do use "like" in all situations like the above (of course this is in addition to the "normal" uses of "like" as just demonstrated), because it's the best way of expressing myself.

So why would someone say "There's two pencils on the table"? I would wager not because they're stupid or lazy. I think the most likely possibility is that the contraction "there's" has ceased to be a true contraction, and instead has become a sort of existential quantifier that signifies "There exists some x," where x is a state. In the above example, the state is "two pencils are on the table." Many more people would say "There's two pencils on the table" than would say "Two pencils is on the table," so clearly there is a perceptual difference. People aren't looking at "there's" as a verb, but rather as a mathematical or logical operator. Now, this works fine in a predicate logic framework, but I wouldn't want to try to explain it in current syntactic theory. I'm sure someone could, though, so if you're so inclined, please share. Also, if someone can antedate "there's" with a plural argument to before Shakespeare, that would be interesting to see.

Tuesday, June 17, 2008

more morphological reanalysis

To refresh, morphological reanalysis is the treatment of a given phrase as a single set lexical item, specifically for the purposes of stress assignment and prosody. The token that got me thinking about this again was Conan O'Brien talking about Krispy Kreme a couple weeks ago. He pronounces it ˈKrispy ˌKreme, with the accent on the first word, whereas I have the more conservative ˌKrispy ˈKreme, which is essentially what I would say if I were talking about cream that had somehow become crispy.

However, I'm usually pretty liberal when it comes to morphological reanalysis (i.e., I usually treat a set phrase as a single prosodic word). One example I run into often is my ˈgreen ˌbeans, versus my wife's more conservative ˌgreen ˈbeans, the same thing anyone would say when confronted by a random bean which was green in color.

Occasionally this process trips me up when I'm reading, because I assign the wrong stress to a lexical item. For instance, I was shocked to see how high the deˈfault APR was on a credit card offer I received in the mail, until I realized that it was actually the ˈdefault APR (the APR you receive if you default on a payment). One other instance was a dictionary entry that caught my eye when I was looking up something the other day: safety orange. I assumed this was ˈsafety orange, some delicious variety of the fruit with which I was unfamiliar. It is, of course, ˌsafety ˈorange, the color of traffic cones and hunting vests.

So, which pronunciations do you guys use?

Monday, June 16, 2008

Optimality Theory

The question: why does Ben Barnes (who plays Prince Caspian in the new Chronicles of Narnia movie) fail to aspirate the /k/ in Prince Caspian?
The answer: He ranks MOP above Align-Morph-R.

Don't worry, all will be explained. This will be the first of several posts dealing with the most recent theory in phonology: Optimality Theory, or OT for short. The idea behind OT is a simple and universal one, based on constraints. We all are familiar with constraints from our everyday lives. Would you rather spend that $20 on a movie or dinner? That's exactly what OT does for linguistics, except instead of figuring out how to spend money it's trying to figure out (in phonological applications) what the surface realization of an underlying form will be.

OT states its constraints as simple requirements, such as Onset (which states that syllables should have onsets) and *Comp (which states that syllable margins should be simple, e.g., "string" would violate this because of the complex "str" cluster at the beginning). The problem, of course, is that we can't get everything we want all the time. If you only have $20, you can't spend $20 on dinner and then $20 on a movie. You have to pick one. Here's an example from OT.

One constraint is the Maximal Onset Principle, or MOP (it's questionable whether this is a necessary constraint; most likely NoCoda and S(onority)S(equencing) render it irrelevant in all cases, but we won't deal with that now). This states that if there's a question whether to assign a segment to the coda of the preceding syllable or the onset of the following syllable, you should do the latter. In the word "instance," we could syllabify it as ins.tance or in.stance, and MOP says we should pick the latter. Another common constraint is Align-X, where X can be L for left or R for right. It states (in the most vague definition possible) that things should be aligned with other things. This can apply at any level of abstraction. We are most concerned with Align-Morph-R, which I will define as "The right edge of a morpheme should be at the right edge of a syllable."

In the example I gave at the beginning of the post, these are the two relevant constraints. So, why do I aspirate the /k/ in Caspian while Ben Barnes doesn't? The MOP constraint wants us to assign as many segments as possible to the onset of the second syllable of Prince Caspian: prin.skaspian (clearly I'm not using IPA, I'm not going to venture into Unicode yet; that's for later this week). However, the morphemes are [prins] and [kaspian], so this syllabification violates Align-Morph-R, because the right edge of the morpheme [prins] is in the onset of the following syllable. If we syllabify the phrase as prins.kaspian (I'm ignoring syllable boundaries in Caspian), we're violating MOP, because "sk" is a perfectly valid onset in English (school, escape, etc.). We can't fulfill both constraints, so we have to choose one. I choose [prins.][kaspian], because for me Align-Morph-R is ranked higher than MOP, i.e., I would rather have my morphemes lined up with syllable boundaries than assign as many segments as possible to onsets. Ben Barnes is the opposite; MOP is very important to him, so he chooses to violate Align-Morph-R: [prin.s][kaspian].
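The choice between the two syllabifications can be laid out as a tiny tableau. This is a minimal Python sketch, assuming the violation counts are simply filled in by hand from the discussion above; the names TABLEAU, winner, and k_is_aspirated are invented for the illustration and are not part of OT itself.

```python
# Violations assigned by hand, following the prose: each candidate
# satisfies one constraint and violates the other exactly once.
TABLEAU = {
    "prins.kaspian": {"MOP": 1, "Align-Morph-R": 0},
    "prin.skaspian": {"MOP": 0, "Align-Morph-R": 1},
}

def winner(tableau, ranking):
    """Return the candidate with the fewest violations, comparing
    constraints in order from highest-ranked to lowest-ranked."""
    return min(tableau,
               key=lambda cand: [tableau[cand][c] for c in ranking])

def k_is_aspirated(candidate):
    """The /k/ is aspirated only when it is syllable-initial, i.e.
    when the syllable boundary falls immediately before it."""
    return ".k" in candidate

my_ranking = ["Align-Morph-R", "MOP"]      # my grammar
barnes_ranking = ["MOP", "Align-Morph-R"]  # Ben Barnes's grammar

for label, ranking in [("me", my_ranking), ("Ben Barnes", barnes_ranking)]:
    choice = winner(TABLEAU, ranking)
    print(label, choice, "aspirated /k/:", k_is_aspirated(choice))
```

The two speakers share the same constraints and the same candidates; the only thing that differs between the two runs is the order of the ranking list.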