ISO 639-2

From Wikipedia, the free encyclopedia

ISO 639-2:1998, Codes for the representation of names of languages — Part 2: Alpha-3 code, is the second part of the ISO 639 standard, which lists codes for the representation of the names of languages. The three-letter codes given for each language in this part of the standard are referred to as "Alpha-3" codes. There are 487 entries in the list of ISO 639-2 codes.

The US Library of Congress (LOC) is the registration authority for ISO 639-2 (referred to as ISO 639-2/RA). As registration authority, the LOC receives and reviews proposed changes; it is also represented on the ISO 639-RA Joint Advisory Committee, which is responsible for maintaining the ISO 639 code tables.

History and relationship to other ISO 639 standards

Work on the ISO 639-2 standard began in 1989, because the ISO 639-1 standard, which uses only two-letter codes for languages, cannot accommodate a sufficient number of languages. The ISO 639-2 standard was first released in 1998.

In practice, ISO 639-2 has largely been superseded by ISO 639-3 (2007), which includes codes for all the individual languages in ISO 639-2 plus many more. It also includes the special and reserved codes, and is designed not to conflict with ISO 639-2. ISO 639-3, however, does not include any of the collective languages in ISO 639-2; most of these are included in ISO 639-5.

B and T codes

While most languages are given one code by the standard, twenty of the languages described have two three-letter codes. The "bibliographic" code (ISO 639-2/B) is derived from the English name for the language and was a necessary legacy feature. The "terminological" code (ISO 639-2/T) is derived from the native name for the language and resembles the language's two-letter code in ISO 639-1. There were originally 22 B codes; scc and scr are now deprecated.

In general the T codes are favored; ISO 639-3 uses ISO 639-2/T.
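
As an illustration of how the two code sets relate, B codes can be normalized to T codes with a simple lookup. The sketch below is not part of the standard's text: the table of twenty pairs is assumed from the registration authority's published code list, and the function name `to_terminological` is hypothetical.

```python
# Current ISO 639-2 B -> T pairs (assumed from the ISO 639-2/RA code list).
B_TO_T = {
    "alb": "sqi", "arm": "hye", "baq": "eus", "bur": "mya", "chi": "zho",
    "cze": "ces", "dut": "nld", "fre": "fra", "geo": "kat", "ger": "deu",
    "gre": "ell", "ice": "isl", "mac": "mkd", "mao": "mri", "may": "msa",
    "per": "fas", "rum": "ron", "slo": "slk", "tib": "bod", "wel": "cym",
}


def to_terminological(code: str) -> str:
    """Return the T code for a B code; any other code passes through unchanged."""
    code = code.lower()
    return B_TO_T.get(code, code)
```

Codes with only one form, such as eng, are returned as given, so the function can be applied to any ISO 639-2 code before comparing it with ISO 639-3 identifiers.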

Scopes and types

The codes in ISO 639-2 have a variety of "scopes of denotation", or types of meaning and use, some of which are described in more detail below.

For a definition of macrolanguages and collective languages, see ISO 639-3/RA: Scope of denotation for language identifiers.

Individual languages are further classified as to type:

  • Living languages
  • Extinct languages
  • Ancient languages
  • Historic languages
  • Constructed languages

Collections of languages

Some commonly used ISO 639-2 codes represent neither a single language nor a set of closely related languages (as macrolanguages do). They are regarded as collective language codes and are excluded from ISO 639-3.

The collective language codes in ISO 639-2 are listed below. Some language groups are noted to be remainder groups, that is, groups that exclude languages with their own codes, while others are not. Remainder groups are afa, alg, art, ath, bat, ber, bnt, cai, cau, cel, crp, cus, dra, fiu, gem, inc, ine, ira, khi, kro, map, mis, mkh, mun, nai, nic, paa, roa, sai, sem, sio, sit, sla, ssa, tai and tut, while inclusive groups are apa, arn, arw, aus, bad, bai, bih, cad, car, chb, cmc, cpe, cpf, cpp, dua, hmn, iro, mno, mul, myn, nub, oto, phi, sgn, wak, wen, ypk and znd.[1]
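
The distinction between remainder and inclusive groups can be sketched as two sets and a lookup. The code lists are copied from the paragraph above; the names `REMAINDER_GROUPS`, `INCLUSIVE_GROUPS` and `collective_kind` are hypothetical and chosen only for this illustration.

```python
from typing import Optional

# Collective codes that exclude languages having their own codes.
REMAINDER_GROUPS = frozenset(
    "afa alg art ath bat ber bnt cai cau cel crp cus dra fiu gem inc ine ira "
    "khi kro map mis mkh mun nai nic paa roa sai sem sio sit sla ssa tai tut".split()
)

# Collective codes that cover every language in the group.
INCLUSIVE_GROUPS = frozenset(
    "apa arn arw aus bad bai bih cad car chb cmc cpe cpf cpp dua hmn iro mno "
    "mul myn nub oto phi sgn wak wen ypk znd".split()
)


def collective_kind(code: str) -> Optional[str]:
    """Return 'remainder', 'inclusive', or None if the code is not in either list."""
    code = code.lower()
    if code in REMAINDER_GROUPS:
        return "remainder"
    if code in INCLUSIVE_GROUPS:
        return "inclusive"
    return None
```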

The following code is identified as a collective code in ISO 639-2 but is (at present) missing from ISO 639-5:

Codes registered for 639-2 that are listed as collective codes in ISO 639-5 (and collective codes by name in ISO 639-2):

Reserved for local use

The interval from qaa to qtz is "reserved for local use" and is used in neither ISO 639-2 nor ISO 639-3. These codes are typically used privately for languages not (yet) in either standard. Microsoft Windows uses the qps language code for pseudo-locales generated automatically from English strings, designed for testing software localization.[2]
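
A membership test for the local-use interval might look like the following minimal sketch. It assumes codes are written as three lowercase ASCII letters; the function name `is_local_use` is hypothetical.

```python
import re

# qaa..qtz: first letter q, second letter a through t, third letter a through z.
_LOCAL_USE = re.compile(r"q[a-t][a-z]")


def is_local_use(code: str) -> bool:
    """Return True if the code falls in the reserved interval qaa to qtz."""
    return _LOCAL_USE.fullmatch(code) is not None
```

Under this reading the interval holds 20 × 26 = 520 codes, and qps, the code used for Windows pseudo-locales, falls inside it.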

Special situations

There are four generic codes for special situations:

  • mis is listed as "uncoded languages" (originally an abbreviation for "miscellaneous")
  • mul (for "multiple languages") is applied when several languages are used and it is not practical to specify all the appropriate language codes
  • und (for "undetermined") is used when a language or languages must be indicated but cannot be identified
  • zxx is listed in the code list as "no linguistic content", e.g. animal sounds (code added on 11 January 2006)

These four codes are also used in ISO 639-3.
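
Software that processes language codes often needs to treat these four values separately from codes that name an actual language. A minimal sketch, with the descriptions taken from the list above and the names `SPECIAL_CODES` and `special_meaning` being hypothetical:

```python
from typing import Optional

# The four generic codes for special situations and their listed meanings.
SPECIAL_CODES = {
    "mis": "uncoded languages",
    "mul": "multiple languages",
    "und": "undetermined",
    "zxx": "no linguistic content",
}


def special_meaning(code: str) -> Optional[str]:
    """Return the meaning of a special-situation code, or None for any other code."""
    return SPECIAL_CODES.get(code.lower())
```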

References

  1. ^ "ISO 639-2 Language Code List - Codes for the representation of names of languages". Library of Congress.
  2. ^ "Pseudo-Locales - Win32 apps". Microsoft Learn. 7 January 2021. Retrieved 31 August 2023.