Saturday, March 3, 2012

new Collins dictionary site

I had my attention directed to the new Collins dictionary site this week. For the most part it's your standard dictionary site -- definitions, usage examples, etc. The IPA transcriptions are solid, though they don't include syllable boundaries, and I have yet to find a word improperly transcribed. The search function is predictably fine, though it doesn't have autosuggestions. What got me excited, though, was the information about the relative frequency of the word in each entry. In the top right of each entry is a "commonness" bar, which indicates how common a word is by shading between 1 and 5 circles. Additional mouseover text mentions the size of that category, e.g., "X is one of the 30,000 most commonly used words". Though there are obviously questions about what corpus is being used for these calculations, it's still a neat feature. Another cool graphic sits in the bottom right of each entry: an adjustable chart showing how common the word has been over a specified time period (from the last 500 years to the last 10 years). It's perhaps not as detailed as COCA's interface, but still a great addition to a dictionary web site. And let's not forget the "translations" section, which gives translations of the word into other languages. Okay, Blackfoot isn't included, but you can instantly see translations into the most widely spoken languages. Overall, it's pretty cool; I'll probably be using it as my go-to dictionary site from now on, mostly because of the frequency statistics.

Saturday, February 25, 2012

another crash blossom

As regular Language Log readers know, a crash blossom is a news headline that leads us to an incorrect parsing of its meaning. In many ways crash blossoms are similar to garden path sentences, the classic one being "The horse raced past the barn fell", which leads us down a metaphorical garden path by presenting information that can be parsed easily into a certain structure, only to ruin our structural hypothesis later on. In the case of "The horse raced past the barn fell", our hypothesis is that "raced" is the main verb of the sentence, rather than part of a reduced relative clause describing the horse (i.e., "the horse that was raced past the barn"). Thus we parse "The horse raced past the barn", and then have to completely redo our structure when we get to "fell".

The crash blossom that caught my eye the other day is similar: "Stepmom charged with murder has baby". The main headline on the full story is no better: "Alabama stepmom charged with girl's murder gives birth". Again we have a relative clause with the relative pronoun omitted, leading us to think that what is in fact part of a relative clause is the main verb of the sentence. So in the full headline, we think the story is about a stepmother in Alabama being charged with murder, when in fact the story is about the woman in question giving birth.

Saturday, January 21, 2012

language as technology

My friend and erstwhile colleague Josh Birchall posted a link on Facebook to an interesting TED talk by Mark Pagel entitled How Language Transformed Humanity, on the development of language as a communicative tool and how it gave humans a huge evolutionary advantage over non-linguistic species. It isn't difficult to see how language, an infinitely productive system capable of expressing ideas that are not tied to a specific time and location, confers a greater benefit than other forms of communication. Language can be used to transfer abstract ideas and share a much wider range of information and technology compared to, say, the system of pheromones that ants use to communicate. It is for that reason that many archaeologists assume that the rise of abstract expression (viz., art) and the exponential proliferation of tool development coincide with the rise of language.

At any rate, it's an interesting talk, even if it's not perfect, and I think well worth watching.

Saturday, January 14, 2012

allophones and marginal phonemes

One of the most basic concerns of phonology is determining what phonemes constitute the phonological inventory of a given language, i.e., which sounds are used contrastively. Sounds are used contrastively if switching one pronunciation for another could result in a change in meaning (it may not be the case that a new word thus formed actually exists, but I think the point still holds for pairs like brick~blick). I can pronounce the word "atom" with an alveolar flap or with an alveolar stop, but the choice does not produce a difference in meaning. At worst I sound British if I use an alveolar stop rather than a flap. One of the ways phonologists distinguish contrastive from non-contrastive sounds is by looking at their distribution. Contrastive sounds will typically have overlapping distributions. In English, for instance, /t/ and /d/ are contrastive, and we can find them in the same environments. Each one can occur in the onset or coda of a syllable, including before sonorants in a complex onset or after sonorants in a complex coda. As mentioned before, alveolar flaps and stops are non-contrastive in English, and are viewed as variant pronunciations. Another example is aspirated and unaspirated voiceless stops -- we perceive the /p/ in "pit" and "spit" to be somehow the same, even though one is aspirated and one is not.
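The contrast test described above can be sketched as a search for minimal pairs, i.e., words whose transcriptions differ in exactly one segment. This is only a rough illustration: the lexicon is a hypothetical handful of simplified broad transcriptions, and the function name `minimal_pairs` is made up for the example.

```python
from itertools import combinations

# Hypothetical toy lexicon: simplified broad transcriptions, one symbol per segment.
LEXICON = {
    "tip": ("t", "ɪ", "p"),
    "dip": ("d", "ɪ", "p"),
    "bat": ("b", "æ", "t"),
    "bad": ("b", "æ", "d"),
    "pit": ("p", "ɪ", "t"),
    "sing": ("s", "ɪ", "ŋ"),
}

def minimal_pairs(lexicon, a, b):
    """Return word pairs that differ only in having `a` where the other has `b`."""
    pairs = []
    # Sort so the result order does not depend on dictionary order.
    for (w1, s1), (w2, s2) in combinations(sorted(lexicon.items()), 2):
        if len(s1) != len(s2):
            continue
        # Collect the positions where the two transcriptions disagree.
        diffs = [(x, y) for x, y in zip(s1, s2) if x != y]
        if len(diffs) == 1 and set(diffs[0]) == {a, b}:
            pairs.append((w1, w2))
    return pairs

print(minimal_pairs(LEXICON, "t", "d"))
```

With this toy data, /t/ and /d/ turn up minimal pairs such as bad~bat, which is the kind of evidence that they are contrastive. An empty result for some pair of sounds proves little by itself, of course, since a small lexicon may simply lack the relevant words.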

Problems arise when sounds are in complementary distribution but don't in any real sense seem to be allophones of one phoneme. An example from English is the case of /h/ and /ŋ/. /h/ occurs only in onset position, and /ŋ/ only in coda position. We might have to do a little bit of hand-waving, but as long as we don't insist on following our rubrics with machine-like strictness, we can point to the fact that we don't have any external evidence that these two sounds belong to the same phoneme, e.g., alternations of the kind we do find in "atom" (with a flap) versus "atomic" (with a stop). We also have evidence from historical English sources (including modern orthography) that explains why /ŋ/ only shows up in coda position -- it arose from place assimilation when /n/ appeared before /g/, which is why we spell /ŋ/ as ng, as in "sing".

Another difficulty appears when sounds contrast only in some environments. The loss of a contrast in a particular environment is called neutralization. For instance, in Blackfoot, /t/ and /ts/ contrast in most environments. Both can appear in onsets and codas. However, only /ts/ appears before /i/. Thus we can't simply say that /t/ and /ts/ are or aren't contrastive; we have to specify the environments in which they are contrastive.
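To make the /h/ and /ŋ/ problem concrete, here is a rough sketch of the sort of "machine-like" distribution check that would lump them together. Everything in it is an assumption made for illustration: the data are a hypothetical set of English monosyllables split into onset, nucleus, and coda, and the only environments tracked are "onset" and "coda".

```python
from collections import defaultdict

# Hypothetical toy data: monosyllables as (onset, nucleus, coda) tuples of segments.
SYLLABLES = [
    (("h",), "æ", ("t",)),  # hat
    (("h",), "ɪ", ("d",)),  # hid
    (("s",), "ɪ", ("ŋ",)),  # sing
    (("l",), "ɔ", ("ŋ",)),  # long
    (("t",), "ɪ", ("p",)),  # tip
    (("d",), "ɪ", ("p",)),  # dip
    (("b",), "æ", ("d",)),  # bad
]

def positions(syllables):
    """Map each consonant to the set of syllable positions it is attested in."""
    env = defaultdict(set)
    for onset, _nucleus, coda in syllables:
        for seg in onset:
            env[seg].add("onset")
        for seg in coda:
            env[seg].add("coda")
    return env

def distribution(env, a, b):
    """Label two sounds by whether their attested positions overlap."""
    shared = env.get(a, set()) & env.get(b, set())
    return "overlapping" if shared else "complementary"

env = positions(SYLLABLES)
print(distribution(env, "t", "d"))
print(distribution(env, "h", "ŋ"))
```

On this data the check labels /t/ and /d/ as overlapping and /h/ and /ŋ/ as complementary. That second label is exactly why distribution alone can't settle the question: the check has no way of knowing that there are no alternations or other evidence linking /h/ and /ŋ/.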