Language Standardization Research Paper

Language standardization, the process of creating the (usually written) form of a language favored among its speakers, is profoundly intertwined with world history. For example, the language given status in a region has the potential to marginalize some people and privilege others. Conquering peoples often denigrate the language of the conquered, while that same language can become a rallying point for nationalistic movements.

The standardization of languages is the often consciously initiated and sometimes elaborately planned process of creating a form of a language (usually a written one) that serves as the accepted and favored means of communication among the members of a speaker group, who normally use more or less divergent variants of this standard language in their daily lives. Typically, the agents of language standardization operate within the boundaries of, and with the support of, a political community, usually a state, but cases of conscious language standardization without such support, even against the political will of state authorities, or transcending state boundaries, are also known.

The Nature of Language Variation

One of the fundamental constants of human language is its variation. Languages change constantly over time, and languages used by large numbers of speakers often display considerable degrees of differentiation, which may at times impede effective communication between speakers of the same language living in geographically separated locations. It is probably safe to say that no two individuals use exactly the same language. Sociolinguists (students of language in its social context) speak of different idiolects, that is, languages as used by a single individual. For a language (or a variant, more neutrally called a linguistic code) to be an efficient tool of communication, a large number of individuals must of course use it, or be able to use it, with a reasonably high degree of uniformity. That uniformity enables speakers to communicate with other speakers effectively, ideally well enough that speakers meeting one another for the first time can communicate without problems. A group of individuals who use such a linguistic variant (acquired as any group member’s first or native language) may be called a speech community, and their linguistic variant may be loosely referred to as a language. Languages that are used by large speech communities spread over a considerable territory show, more often than not, some degree of regional variation. Regional variants (which may be characterized by linguistic differences in all linguistic subsystems, including, for example, phonology, morphology, syntax, and lexicon) are commonly referred to as dialects. Most speech communities that are sufficiently large to know some degree of social differentiation or stratification develop linguistic variants divided by social boundaries; these are called sociolects. The average member of a larger speech community will typically be a native speaker of a dialect and, where applicable, may use a sociolect as well. Normally, many individuals will be able to use, or at least to understand, other dialectal or sociolectal variants of the larger speech community, especially individuals with a high degree of geographical or social mobility (or both).

It is not always easy to differentiate between a dialect and a language, and the determination is often made according to nonlinguistic criteria. For example, there is certainly no problem in determining the boundaries of the Icelandic language. It is the only indigenous language of the population of Iceland; its internal dialectal differentiation is minimal, and no indigenous population outside the island uses it. On the other hand, some languages of continental Europe are not so straightforwardly definable. A case in point may be the language pair formed by Low German and Dutch: These variants are spread over a large territory in northwestern Europe from Flanders to northeastern Germany, and speakers from opposite ends of this area will have little or no ability to communicate with one another in their vernaculars; however, the actual linguistic differences largely disappear as one moves from either end to the border zones of the Netherlands and Germany, where local variants spoken by individuals on both sides of the border are close enough to allow unimpeded communication. Such zones, in which linguistic differences increase gradually with geographic distance, are commonly called dialect continua. The fact that variants spoken in one country are referred to as dialects of Dutch, while those on the other side of the border are known as variants of (Low) German, is not justifiable on linguistic grounds, but rather is a consequence of political considerations and language standardization. Cases of sometimes very large dialect continua being differentiated into languages for historical and political reasons are widespread in Europe and elsewhere; examples in Europe include the West Romance dialect continuum (or DC), comprising French, Provençal, Italian, Catalan, Spanish, and Portuguese; the Scandinavian DC, comprising Norwegian, Swedish, and Danish; and the South Slavic DC, comprising Slovene, Serbo-Croat, Macedonian, and Bulgarian. Outside Europe, national languages such as Hindi and Urdu, Thai and Lao, Turkish and Azerbaijani, and Zulu and Xhosa may be mentioned. The fact that all these variants are generally perceived as and officially referred to as languages is due to the existence of nationally recognized standard languages, which may be close to, but often differ considerably from, the variants (or dialects) acquired as first languages by many speakers.

The Importance of Writing

Whereas a certain amount of dialect leveling may sometimes be observed in unwritten languages (initiated, for example, by a commonly felt pressure to imitate a prestige variant, such as that of a politically or culturally dominant subgroup, a ruling clan, or maybe a center of trade and commerce), language standardization in the narrower sense of the word can only begin with the introduction of writing. Of course, the formation of discernible individual languages out of large dialect continua often precedes any purposeful standardizing intervention; thus, when people in northeastern France began to write in their vernacular (rather than in Latin) in 842 (with the Oaths of Strasbourg, which united the Western and Eastern Franks), it marked the beginning of French as a written language, just as the so-called “Riddle of Verona” (a short text, which is possibly the first attestation of colloquial Italian, dating from the ninth century CE) marked that of written Italian. But the early phases of most written languages are, often for centuries, characterized by the parallel use of several regional (and, sometimes, social) variants; thus, premodern phases of written English (especially in the Middle English period) comprise texts in Northumbrian, Southumbrian, Mercian, Kentish, and other variants. The long road to what is nowadays known as Standard (British) English is marked, among other things, by the rise of London and its aristocracy, the works of Chaucer and the Bible translation of Wycliffe and Hereford in the fourteenth century (which set a linguistic standard much imitated and considered authoritative in following times), the codification of English orthography in Samuel Johnson’s dictionary (1755), and other important events.

At the beginning of a literary tradition, scribes and writers most often try to write down their particular vernacular, frequently on the basis of existing orthographical norms of a different language (so, for example, the orthographic norms of Latin were adopted by those writing in Romance language vernaculars). The need for a unified standard may arise for a variety of reasons: For example, rulers or governments may wish to possess a unified written medium for their realm, which will allow a centralized control of administrative affairs. In modern times, the wish to propagate the idea of a unified nation may drive the efforts to create a national standard language, which is viewed as emblematic of a (possibly new) nation-state. Literate people and learned bodies may wish to define rules regarding how literature should be produced, very often on the basis of the usage of “classical” authors, who are taken as models for all future literature, both in terms of content and aesthetic form. And, of course, regionalist and nationalist movements operating within the boundaries of a larger state may use language and its standardization to propagate the public acceptance of a regional group as a population possessing the right to separate nationhood. In some cases, standard languages that transcend state boundaries are recognized and actively used: all German-speaking countries recognize the standard codified by the dictionaries of the Duden publishing house based in Mannheim, Germany, and in all Arabic-speaking countries the morphological and syntactical (and, for public broadcasting purposes, also the phonological) model of classical Arabic, as codified in the Qur’an, is regarded as the unquestioned norm for any public use of the language. Only in Christian Malta, where Islam plays no role but the vernacular language is a variety of Arabic, has an autochthonous written standard, based on the Latin alphabet, been developed.

Reflecting this roster of motives, language standardization may be driven by a variety of agents, among them individuals (such as Antonio de Nebrija, the author of a descriptive grammar of Spanish in 1492), autonomous societies (such as the Gaelic League, or Conradh na Gaeilge, in Ireland, active since 1893), semiofficial normative organizations (such as the Norwegian Language Council, or Norsk språknemnd, founded in 1951), or official governmental bodies (as was the case for the standardization of the welter of official regional languages in the former Soviet Union). Especially for the latter two cases, in which a government purposefully interferes with the norms of the official language(s) of a state or nation, the term language planning is widely used.

Reforms of Orthography

The most widespread and well-known instances of language planning or language standardization are reforms of orthography. They may be motivated by the urge to simplify existing orthographies, which are felt to be too complicated (as has been the case with the simplification of Chinese characters in the People’s Republic of China since the 1950s, the Danish orthography reform of 1948, and the simplification of Russian orthography in 1917). Another possible motive is the political wish to differentiate a language (sometimes quite artificially) from a closely related one spoken in a neighboring nation. Sometimes a different script is used to serve this purpose: a case in point is that of the “Moldavian” language, a variant of Romanian, linguistically barely different from the national language of Romania, but spoken in the territory of the former Soviet Republic of Moldova and written in Cyrillic script until the disintegration of the Soviet Union in 1991 (after which Moldovans not only reverted to use of the Latin script, but even abandoned the name Moldovan in favor of Romanian). The case of Tajik is similar: Linguistically a variant of Persian, Tajik was written in the Cyrillic script, rather than in Arabic script (as Persian is), when the area was part of the Soviet Union, and Cyrillic continues to be used in present-day Tajikistan. Finally, the case of Serbian and Croatian may be mentioned. These two are mainly differentiated by the use of the Cyrillic and Latin alphabets, respectively (linguistic differences between these variants of Serbo-Croat do exist, but their boundaries do not follow the cultural boundaries between Catholic Christian Croats and Orthodox Christian Serbs).

Standardization activities are, however, not limited to orthography: All linguistic subsystems may be the focus of planning activities. When it comes to the morphology and syntax of languages, however, the possibilities of introducing highly artificial norms that have no basis in any of the actually spoken varieties are naturally limited. Those norms may be based on older spoken and written varieties of the languages, and they may be artificially conserved in a written norm, at times expanding certain usages beyond the scope actually found in the spoken languages. Thus, the use of certain past-tense forms is artificially preserved in written Standard German, while those forms have practically vanished from most spoken variants.

Reforming the Lexicon

Next to orthography, the lexicon of a standard language is arguably the area most often targeted by standardization activities. Two different tendencies may be singled out as most typical here. First, agents of language planning or standardization may feel that the language uses an unwelcome number of foreign elements (loanwords) from a different language, which should be replaced by “native” words; such attitudes and activities are usually referred to as linguistic purism. The donor language may be a formerly politically dominant language, the influence of which planners may seek to reduce after, say, political independence (as was the case with Russian elements in Latvian and other languages of former Soviet republics). It may be the language of a neighboring culture or state or nation, which is viewed as hostile by certain nationalist activists, or as the target of ethnic hatred (which explains the movements against French elements in pre–World War I Germany). Finally, in some language communities a general consensus may prevail, at least among the cultural elite, that the language should be more or less kept free of any foreign elements. Well-known cases of institutional purism include the normative activities of the Académie française for French, or those of the University of Reykjavik for the Icelandic language, which was and continues to be extremely successful in coining and propagating new words for modern concepts based on the language material of the older Icelandic literature.

The second tendency when it comes to standardizing the lexicon is the planned expansion of the lexicon of a language. This is often felt necessary when a language is adopted as the standard written medium of a newly independent nation, but language politics for minority languages with official status in the former Soviet Union were also characterized by this tendency, as are language politics for official minority languages in the People’s Republic of China. In some cases, the adopted language may not have been used widely in writing before (or only for limited purposes), and widely known and usable lexical items for numerous purposes (technical, legal, scientific, political) may be lacking; often in such cases, language planners introduce loan elements from another language; if nationalistic ideology does not intervene, the languages of former colonial powers may remain sources of the expansion of such vocabularies (for example, the role of English as a source for modern terminology in most written languages of India, or that of French in many languages of western Africa).

Sometimes language standardization or planning activities may lead to rather unusual results: In this context, the phenomenon of diglossia should be mentioned. Diglossia is the presence of two different written norms for one language in one society. One example is modern Greek, in which well into the twentieth century two written norms competed for general acceptance, one based upon ancient Greek and considered more “pure,” and the other based on contemporary vernacular Greek (which gained much ground as the twentieth century waned). A second well-known example is that of Norwegian, in which both a city-based written norm that resembles Danish and a more countryside-based, “reformed” (i.e., largely purged of Danish characteristics) norm continue to be used side by side.

Language planners may choose as the national standard a language that is not spoken by any sizable population in their country at all, as was the case with the adoption of Urdu as the national language of Pakistan. Urdu was the (written) variety of Hindustani used in large parts of India, including the city of Old Delhi. Felt to be emblematic of the new Islamic nation of Pakistan, it was adopted as its national language, although only in a relatively small pocket around Karachi was it used as a native language (Sindhi, Punjabi, Baluchi, and Pashto being more widespread as first languages in the country).

Finally, there is one case in which a language that actually died out was raised to the status of a national standard through the combined effort of a community with a strong commitment to its national identity and heritage: Israeli Hebrew. Centuries after it had ceased to be anybody’s first language, it was consciously revived by Jewish settlers in the Middle East in the early twentieth century, was declared the national language of Israel in 1948, and now boasts around three million native speakers.

Language Standardization and World History

As the examples mentioned throughout this article indicate, language and language standardization are profoundly intertwined with world history. What languages are accorded status in a region affects the lives of peoples in that region, potentially marginalizing some while securing the fortunes of others. It is not surprising that conquering peoples try to stamp out or denigrate use of the language of the conquered, nor is it surprising that language becomes a rallying point for nationalistic revolutionary movements. In nation-states in which multiple primary languages are spoken, the degree to which language issues can be amicably resolved is often a strong indicator of the stability of the country. As a political tool, language standardization has the potential both to create unity and to sow the seeds of dissent.


