Sociolinguistic Variation in Online Social Media

Sunday, 15 February 2015: 3:00 PM-4:30 PM
Room LL20C (San Jose Convention Center)
Jacob Eisenstein, Georgia Institute of Technology, Atlanta, GA
While social media language is sometimes described as a dialect in itself, it in fact displays a remarkable amount of internal variation, which aligns with social factors such as geography and ethnicity. This variation can be revealed by computational statistical methods that search large corpora of unannotated text for patterns of association between language and social variables, simultaneously identifying stylistically coherent sets of linguistic variables and the social groups that use them. Computational analysis can also shed light on the relationship between social media writing and vernacular spoken language: it is well known that online writing contains phonetically inspired spellings, but it is perhaps more surprising that these spellings reproduce some of the systematic context-sensitivity of the spoken language variables that they transcribe. Finally, I will present new research on the social properties of online dialect variation, which offers evidence that authors modulate their use of social media variables depending on both the context and their audience.
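
As a rough illustration of what "patterns of association between language and social variables" can mean in practice, the sketch below scores each word by the log ratio of its smoothed relative frequency within one region to its smoothed relative frequency across all posts. This is a deliberately simple stand-in and not the method used in the talk; the function name `regional_log_ratios`, the whitespace tokenization, the add-one smoothing, and the toy posts are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def regional_log_ratios(posts, smoothing=1.0):
    """Score words by how strongly they are associated with each region.

    posts: iterable of (region, text) pairs (hypothetical input format).
    Returns {region: {word: log(p(word | region) / p(word))}} using
    additive smoothing, so unseen words get a finite negative score.
    """
    region_counts = defaultdict(Counter)
    total_counts = Counter()
    for region, text in posts:
        tokens = text.lower().split()  # naive whitespace tokenization (assumption)
        region_counts[region].update(tokens)
        total_counts.update(tokens)

    vocab_size = len(total_counts)
    total_tokens = sum(total_counts.values())

    scores = {}
    for region, counts in region_counts.items():
        region_tokens = sum(counts.values())
        scores[region] = {}
        for word in total_counts:
            p_region = (counts[word] + smoothing) / (region_tokens + smoothing * vocab_size)
            p_overall = (total_counts[word] + smoothing) / (total_tokens + smoothing * vocab_size)
            scores[region][word] = math.log(p_region / p_overall)
    return scores


# Toy, invented data: two regions that differ in one intensifier.
posts = [
    ("north", "hella good food"),
    ("north", "hella tired"),
    ("south", "very good food"),
    ("south", "very tired"),
]
scores = regional_log_ratios(posts)
top_north = max(scores["north"], key=scores["north"].get)
```

Positive scores mark words that are over-represented in a region and negative scores mark under-represented ones. A real analysis would need far more data, proper tokenization, and a model that accounts for author-level and topic-level confounds.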