I haven't had much time for blogging recently, since it's crunch time on my thesis. I was working even on the brief holiday (that's 'vacation' for all you Americans) I took to the US for my brother's wedding.
It's always a fabulous trip, visiting Silver Lake in the summer. This year was no exception. Despite a few rainy days, the weather was accommodating enough to allow for some good lake time. With nieces and nephews multiplying it makes for some fun times!
And the wedding was great. We even got to set off Chinese wish lanterns. But my favorite part was being able to spend time with family and friends - what a great trip! Now here's to writing!
In my last post I discussed the concept of deixis and illustrated deictic demonstratives in Pnar. Pnar's deictic demonstrative system combines gender clitics with largely distance-based deictic morphemes, so there are potentially twenty demonstrative forms that identify nominal distance in relation to the deictic center, which can be either Speaker or Addressee, depending on the context. I'm still trying to figure out which for some forms.
My HLS (Himalayan Languages Symposium) talk related the Pnar demonstrative system to the demonstrative systems of other neighboring and related languages. Pnar is a bit unusual in that it is a clearly Austroasiatic language (like other Khasian varieties), yet it is geographically separated from most other Austroasiatic languages by Tibeto-Burman and Indo-Aryan speakers. Identifying similar concepts and forms in these neighboring languages could provide evidence for language contact. Here is a map from the Ethnologue (which I made adjustments to) that shows some of the languages I decided to use to compile a table. Unfortunately not all the neighboring languages listed in black have adequate descriptions. For all you budding linguists out there, go describe them!
Because discussion of deixis is not uniform, I decided to limit myself only to forms which served a nominal demonstrative function: those words which point out the distance of a noun from a deictic center, or exist in a paradigm with words that do. This excludes a lot of Tibeto-Burman verbal affixes which encode direction (uphill, downhill, across, etc.), and was necessary to keep the talk short enough for the conference. I worked from existing descriptions of neighboring and related languages. Unfortunately some of the languages have little to no description, some have descriptions that make little to no reference to demonstratives, and for others I couldn't get hold of the reference in time. So here is a table with some of my findings:
I think a few things are worth noting here. First, all languages in my sample have at least a proximal/distal distinction. Second, Tibeto-Burman languages in my sample vary widely in terms of how many distinctions they encode in demonstratives. Third, Austroasiatic languages have the largest number of distinctions in their demonstratives, having at least four in each language. However, they are not uniform in terms of which distinctions they make - Pnar, while it has five distance-based demonstratives, has no 'up' or 'down' (in some descriptions 'upstream' or 'downstream') that is part of the demonstrative paradigm.
In terms of language contact, it is interesting to consider that Ao, Karbi, and Garo (which only have a two-way distinction according to the descriptions I read) have had considerable contact with Indo-Aryan languages which have only a two-way distinction. It is worth noting that Ao also has a non-visible/anaphoric marker that was not considered by Coupe (2007) to be a demonstrative, but may well be (personal communication).
There are many more interesting thoughts that could be drawn from this brief look at deictic demonstrative systems, and I hope these posts have helped you think about the system in your own language or languages you work on. Feel free to leave thoughts, suggestions, corrections, and general comments below!
Baclawski Jr., Kenneth. 2013. Deictics and related phenomena in Kuki-Chin. Dartmouth College, Hanover, NH: ICSTLL 46.
Benjamin, Geoffrey. 1976. An outline of Temiar grammar. Oceanic Linguistics Special Publications 129–187.
Brown, Nathan. 1848. Grammatical notices of the Asamese language. Sibsagor: American Baptist Mission Press.
Burenhult, Niclas. 2008. Spatial coordinate systems in demonstrative meaning. Linguistic Typology 12:99–142.
Burling, Robbins. 2004. The language of the Modhupur Mandi (Garo), volume 1: Grammar. New Delhi: Bibliophile South Asia, in association with Promilla and Co., Publishers.
Coupe, A. R. 2008. A Grammar of Mongsen Ao. Berlin, Boston: De Gruyter Mouton.
Dasgupta, Probal. 2003. Bangla. In The Indo-Aryan languages, ed. George Cardona and Dhanesh Jain, 351–390. New York: Routledge.
Diffloth, Gérard. 1976. Jah-Hut: An Austroasiatic language of Malaysia. In Southeast Asian linguistic studies 2, ed. Nguyen Dang Liem, Pacific Linguistics C-42, 73–118. Canberra: Australian National University.
Ghosh, Arun. 2008. Santali. In The Munda Languages, ed. Gregory D. S. Anderson, 11–98. New York: Routledge.
Henderson, Eugénie J. A. 1965. Tiddim Chin: A descriptive analysis of two texts. Oxford: Oxford University Press.
Imai, Shingo. 2003. Spatial deixis. Doctoral Dissertation, SUNY Buffalo.
Konnerth, Linda Anna. 2014. A grammar of Karbi. Doctoral Dissertation, University of Oregon.
Kruspe, Nicole. 2004. A grammar of Semelai. Cambridge University Press.
Matisoff, James A. 1973. A Grammar of Lahu. University of California Publications in Linguistics, 75. Berkeley: University of California Press.
Nagaraja, K. S. 1985. Khasi, a descriptive analysis. Doctoral Dissertation, Deccan College, Pune.
Osada, Toshiki. 2008. Mundari. In The Munda Languages, ed. Gregory D. S. Anderson, 99–164. New York: Routledge.
Ring, Hiram. Forthcoming. Khasic: Pnar. In Handbook of the Austroasiatic languages, ed. Matthias Jenny and Paul Sidwell, Chapter B: 21, ~30p. Brill.
At the Himalayan Languages Symposium last week I gave a talk about deixis. This grammatical feature is essentially 'pointing', and words or morphemes in language can point to various things, so grammarians often talk about person or distance-based deixis, social deixis, and temporal deixis.
Distance-based deixis is often encoded in words called 'demonstratives', social deixis in 'honorifics' like "sir", "ma'am" etc., and temporal deixis in tense markers. Deixis is actually more complex, though, as deictic morphemes can really point to any point in the communication space, as illustrated in the diagram on the right from Gerner (2009).
Since deixis is such a large topic, my 20-minute talk focused on the way distance-based deixis is encoded in Pnar and in related languages through demonstratives, specifically words that identify the location of nouns in space, relative to a deictic center. Most languages have at least a 2-way contrast (like English "this" and "that"), and rarely more than three. I began to be interested in this feature since in Pnar there is a 5-way contrast in demonstratives and some of the forms resemble similar words in neighboring Tibeto-Burman languages (a completely different language family). Just to illustrate, below on the left are the spatial deictic morphemes in Pnar (the black circle in the middle represents the 'deictic center', which in this case is the person who is speaking), and on the right are the words in some examples of noun phrases in Pnar. You will notice that demonstratives in Pnar are a combination of deictic markers with gender proclitics that identify the noun that the demonstratives are pointing to.
At this point there are a lot of other things I could discuss, but the post is getting a bit long. So I think I'll pause here and my next post will be about the features of demonstratives in neighboring languages. At least now you have a better idea of what deixis is, and how languages can differ significantly in terms of what they can encode in a spatial deictic system.
Gerner, Matthias. 2009. Deictic features of demonstratives: a typological survey with special reference to the Miao group. The Canadian Journal of Linguistics/La revue canadienne de linguistique 54:43–90.
I mentioned that Dr. Anvita Abbi gave a great talk at the Himalayan Languages Symposium on her work on Great Andaman in the Indian Ocean. Here's a map just to show you where that is. [Image credit: Barefoot Holidays]
It's a pretty remote area. In fact, the Nicobar Islands to the south are completely closed to outsiders. When you consider that the speakers of Great Andaman are down to a single location and the community is switching to Hindi and English as a means of communication, the closed nature of the Nicobar Islands seems somewhat justified. Great Andamanese is actually 10 languages, of which 4 were documented by Dr. Abbi and are spoken by only a handful of speakers. I'll let you check out more about that on this site.
One of the reasons the talk was fascinating was the highly developed gender system, which is based on a conceptualization of the world in relation to the human body. Generally, the kind of gender in languages that people are familiar with is that found in Romance languages, where nouns are marked as masculine or feminine, and verbs agree with nouns so that you know which noun is 'controlling' the action (it's more complicated than that of course, but this is just to illustrate a point).
However, gender is simply a noun class system, and nouns can have as many classes as a language (or speakers of a language) find(s) useful. So German has three noun classes (masculine, feminine, neuter), and Bantu languages have a ton (help me out Bantu language experts), and other languages have noun classes based on living things, non-living things, plants, humans, tools, certain kinds of animals, etc.
What is interesting about Great Andamanese is that the same class markers are used on both nouns and verbs in a highly productive way (meaning that they seem to apply in all sorts of ways to both verbs and nouns). These noun class markers identify actions (such as going and coming) as related to one of 7 or so body part prefixes (which also classify nouns) depending on whether the action is conceptualized as relating to the mouth (being ingested, digested, etc., i.e. thinking or being beautiful), or moving in a certain manner (feet), and there are conceptualizations related to all sorts of body parts. Unfortunately I don't have all my notes with me, as I just flew to the US for my brother's wedding, but it's really interesting to think of how this language connects (or doesn't connect) to languages in Southeast Asia and Africa. Read up more on this fascinating system here, and check out Dr. Abbi's new grammar of Great Andamanese, recently published by Brill.
Yesterday afternoon I gave a talk at the Himalayan Languages Symposium, which was held this year at NTU. It's the 20th meeting, and has generally focused on languages of the Himalayan region, which is a pretty broad area when you consider that the Himalayan range stretches from Pakistan to Burma. That's a heck of a lot of languages.
It was a really great conference, thanks to clear papers and engagement on a variety of topics. Phonetics and phonology of individual languages, historical reconstruction, ancient Tibetan, theoretical implications of marking patterns, field reports, typological surveys, Nepali Sign language, child language acquisition, and sociolinguistic studies were only some of the areas covered in the talks. One of the most interesting to me was a report by Anvita Abbi on the languages of Great Andaman, an island in the Andaman-Nicobar chain. These languages are an isolated group that remain unclassified and are in danger of extinction. I'll have to write a separate blog post to explain my fascination.
My talk was on deictic demonstratives in Pnar and the neighboring languages of northeast India. Look for a follow-up post in the next couple days that explains a bit more. For now, I'll just say that it was a great conference and it's back to the thesis in the coming week.
Image Credit: ICIMOD
Last week I had a Eureka!* moment. I love these moments - when you've been trying to figure out a problem (could be big, could be small) and it is frustrating you to no end, and then finally you break through and find the solution! It's pretty amazing.
This Eureka! moment had to do with the linguistic examples I wrote about earlier. They weren't formatting properly, and because of this some of the examples were splitting across pages. Pretty early on in my attempts, I posted on a forum devoted to LyX/LaTeX/TeX, the typesetting program I use. Forums are pretty nifty ways to aggregate knowledge, and I've learned a ton about LaTeX through this particular forum. If you have a specialized industry or tool and you haven't found a forum where people can help each other out, find one quick or make one yourself. It is totally worth it.
Unfortunately, with this particular issue no one was able to help. So I kept troubleshooting, trial and error. Eventually one of the things I tried worked! So satisfying. I imagine this is what I'll feel once I finally submit my PhD thesis... though people tell me a grammatical description is never complete, even if it's over 1,000 pages.
*As I remember, and according to Wikipedia, "Eureka" comes from the Ancient Greek word εὕρηκα heúrēka, meaning "I have found (it)" and is attributed to Archimedes, who discovered how the volume of objects could be measured by water displacement.
I realize that some of my posts haven't been as clear as they could be. Specifically, I talked a lot about interlinearized texts, but what does that actually mean? Well, the thing about language is that when you are talking about specific aspects of language, it's helpful if the reader actually knows what you're talking about. Thus, examples are useful. When you're discussing an unwritten language, this has to be taken to a whole new level.
When I'm discussing examples in Pnar, I need four levels of representation, as in the example below. On the left the numbered lines represent the local orthography (line 1), the phonetic/phonemic representation using IPA (2), the word-for-word translation or English gloss (3), and the free translation that actually tells you the English meaning (4).
So on the left we have the four levels of representation, but you notice that the items on each line don't quite match up. This can be confusing, particularly if you're dealing with long examples. Interlinearization allows each element to correspond to one in the following line.
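The alignment idea can be sketched in a few lines of Python. This is only an illustration, not the tool used for the examples in this post: the function name `align_interlinear` is made up, and the tiers are placeholder strings rather than real Pnar data. It pads every column to the width of its longest item, so the items on each line stack directly under one another.

```python
def align_interlinear(*tiers):
    """Pad each column so the i-th item on every tier starts at the same offset."""
    rows = [tier.split() for tier in tiers]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("every tier needs the same number of items")
    # Width of a column = length of its longest item across all tiers.
    widths = [max(len(item) for item in column) for column in zip(*rows)]
    return [
        "  ".join(item.ljust(width) for item, width in zip(row, widths)).rstrip()
        for row in rows
    ]


# Placeholder tiers (NOT real Pnar): orthography, phonemic line, gloss.
lines = align_interlinear(
    "aa bbbbb c",
    "a b ccc",
    "one ALL three",
)
for line in lines:
    print(line)
```

The free translation is left out on purpose, since it belongs to the whole example rather than to individual words. One caveat: `len()` counts code points, so IPA with combining diacritics may need a smarter width measure.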
One way linguists do this is by creating tables, which have to be individually edited for each example. This is what you have to do in MSWord, unfortunately. Another way is using a typesetting program called LaTeX - this is how I produced the nicely formatted example on the right. Another convention is to have the local writing system be italicized and non-interlinearized.
Notice that the glosses on the third line are not always translation equivalents; sometimes they are grammatical abbreviations for function words. Here, 'ALL' is an abbreviation for 'allative', which is a traditional term for a marker on nouns that indicates the noun to be a 'goal' or what another noun is moving towards.
Hopefully that clears things up a bit. To read more about interlinearized linguistic examples, this Wikipedia page should help.
So my team did not make it through to the next round. They played a good game against Portugal last night, but it just wasn't good enough. Too much ball control, not enough finishing power. A ton of missed chances. Disappointing, but such it is sometimes. Hopefully next World Cup will be a different story.
Now just to clarify, Ghana is the country I was born in. One of my Chinese friends was shocked that I wasn't supporting the USA in the World Cup - why would I even consider supporting a country other than the country of my passport? I explained that in the USA we like to support the 'underdog', a concept that took a bit of explaining. We actually spent 5 minutes with me trying to explain why Americans like to support teams that meet the requirements of 'the little guy'.
Now Ghana is definitely not completely the little guy when it comes to soccer (football for all the non-Americans), but as a nation when compared to the USA, which is pretty much the biggest on the block, they are. And as someone born and raised in the African nation, I guess I have more call than most to support them on the international stage.
But this definitely does not mean that I dislike the USA - not at all. In fact, I am really glad to be a citizen and am really glad they made it into the World Cup, and hope at some point they get to hoist the trophy. But given a sports matchup between them and Ghana, I'd have to support Ghana all the way. Though now that the US is through, I still have a team to support...
As I've been working with code to try and do some programming to get the computer to format my text properly, I've run into some issues. It's got me thinking... You know how computers think... wait, you do?! No you don't! Computers don't think, unfortunately, that's the problem. Computers aren't good at connecting the dots or making inferences like humans are. All they can do is connect the dots that a human tells them to. There's the rub. The computer is only as smart as you are.
Fortunately, when I'm writing a program to go through my 80,000+ words of text (times 6, since there's 4 lines of interlinearization plus one of free translation = 480,000) which it parses in an instant, the computer tells me when it fails. Or rather, since I'm writing the code, when I FAIL. You know exactly where you stand with a computer, because there's only one right way for a code to run, and that's if all the processes are logical and well-formed according to the rules of the code's architecture.
I must say I'm glad that life isn't that way. Yes, there are principles that can be recognized and lived. You generally receive from life based on what you put into relationships, study, work, etc... But there's no single perfect way to run. It's not like the world is a giant piece of code architecture and your life is a logical process from one thing to another. Life is dynamic. It can change and be changed by a small movement in one direction or another. And failure is just the beginning of a new direction.
On the way back to the office from dinner the other night (see how much time this coding takes if I go back to the office after dinner!) I was talking with one of my friends about job prospects and how life changes. There's a lot of uncertainty, but I said that one thing I've learned is to figure out what is important to you and make it part of your life. I guess I'm still figuring...
This past week I've been attending a workshop on the linguistic notion of Affectedness that my co-supervisor Frantisek Kratochvil organized. It has really helped me think about possible ways this feature could be at work in Pnar verbal constructions. And if you didn't understand that sentence feel free to ask and I'll try to explain it better. My brain has been fried most days this week.
While at the conference and in the evenings I've been working on organizing my linguistic database. A few weeks ago my friend Matt showed me how to use Python scripting to format and search the texts that output from my Toolbox database.
Toolbox allows for interlinearization of linguistic data, which is the standard for examples in linguistic papers and allows people who don't understand anything about the language to see the grammatical structure. It usually includes a local orthographic line of text, followed by IPA (International Phonetic Alphabet) representation, a line of word for word glosses (translations), and a free translation. Glosses and free translation are usually in English.
The script Matt wrote (with my input) allows for regex (regular expression) searching and output. So in my corpus I can find all the verbs followed by nouns, for example, or all the verbs preceded by the form 'ka', and output their context.
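To give a rough idea of what that kind of search looks like, here is a minimal sketch. It is not the actual script. It assumes the corpus has been flattened to one string per sentence with each token written as form/TAG. The tags V, N and X, and every form except 'ka', are invented placeholders.

```python
import re

# Hypothetical flattened corpus: one sentence per string, tokens as form/TAG.
# Only the form 'ka' comes from the post; everything else is a placeholder.
corpus = [
    "ka/X verb1/V noun1/N",
    "verb2/V noun2/N ka/X noun1/N",
    "noun2/N ka/X verb2/V",
]

# A verb immediately followed by a noun.
VERB_THEN_NOUN = re.compile(r"(?<!\S)\S+/V \S+/N(?!\S)")
# The form 'ka' (whatever its tag) immediately followed by a verb.
KA_THEN_VERB = re.compile(r"(?<!\S)ka/\S+ \S+/V(?!\S)")


def search(pattern, sentences):
    """Return (sentence_number, matched_text, full_sentence) for every hit."""
    hits = []
    for number, sentence in enumerate(sentences, start=1):
        for match in pattern.finditer(sentence):
            hits.append((number, match.group(0), sentence))
    return hits


for number, matched, context in search(KA_THEN_VERB, corpus):
    print(number, matched, "|", context)
```

The lookarounds `(?<!\S)` and `(?!\S)` pin each pattern to whole tokens, so a tag like V cannot accidentally match the start of a longer tag.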
The script I wrote this week (with his input) takes the whole Toolbox corpus (or a portion thereof) and reformats it so that I can read it with a typesetting program called LyX, a front end GUI of the popular but obtuse typesetter LaTeX. I still have a bit of work to do, but basically it allows me to turn my corpus database of 90,000+ words into a nice corpus, typeset with interlinearization, as a PDF.
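The reformatting step can be illustrated with a small sketch, again not the real script. The field markers (`\tx`, `\ph`, `\ge`, `\ft`) are assumptions about how a record might be laid out, and the output uses the `\glll` and `\glt` commands of the gb4e LaTeX package, which is only one of several interlinear packages and may not be the one behind the PDF described here. The record content is placeholder text, not Pnar.

```python
def parse_record(text):
    """Collect the fields of one backslash-marked record into a dict keyed by marker."""
    fields = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # skip anything that is not a marker line
        marker, _, value = line[1:].partition(" ")
        fields[marker] = value.strip()
    return fields


def to_gb4e(fields):
    """Render one record as a gb4e example with three aligned tiers.

    Assumes the markers tx, ph, ge and ft are present; LaTeX special
    characters (%, &, _ ...) are not escaped in this sketch.
    """
    return "\n".join([
        r"\begin{exe}",
        r"\ex",
        r"\glll " + fields["tx"] + r" \\",
        "      " + fields["ph"] + r" \\",
        "      " + fields["ge"] + r" \\",
        r"\glt `" + fields["ft"] + "'",
        r"\end{exe}",
    ])


# Placeholder record with assumed markers (NOT real Pnar data).
record = r"""
\ref demo.001
\tx aa bbbbb c
\ph a b ccc
\ge one ALL three
\ft placeholder free translation
"""

latex = to_gb4e(parse_record(record))
print(latex)
```

Looping this over every record and writing the results into one file gives a document that LaTeX can typeset. The word-by-word alignment is then handled by the package rather than by hand-built tables.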
After excluding about 2 hrs of data from my corpus because of parsing issues in Toolbox, my resulting PDF file was over 700 pages of just interlinearized examples, with no other formatting. I don't think I'll be including it all in the dissertation I plan to submit in August, but it's amazing to have such a simple tool for outputting my data in a readable format.
I love technology...
I'm a linguist and singer-songwriter. I write about life, travel, language and technology.