| language {tau} | R Documentation |
Extract language, script, region and variant subtags from IETF language tags.
parse_IETF_language_tag(x)
| x | a character vector with IETF language tags. |
Internet Engineering Task Force (IETF) language tags are defined by IETF BCP 47, which currently consists of RFC 4646 (http://tools.ietf.org/html/rfc4646) and RFC 4647 (http://tools.ietf.org/html/rfc4647), and are used in a number of modern standards.
Each language tag is composed of one or more “subtags” separated by hyphens. For the basic format currently supported, the subtags occur in a fixed order: a language subtag, optionally followed by script, region, and variant subtags.
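The fixed order of subtags (language, then optionally script, region, and variant) can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the code used by parse_IETF_language_tag(): the helper name parse_basic_tag is hypothetical, and the subtag shapes it checks (2 to 3 letters for language, 4 letters for script, 2 letters or 3 digits for region, 5 to 8 alphanumerics or a digit plus 3 alphanumerics for variant) are a simplified reading of the RFC 4646 grammar.

```r
## Hypothetical helper for illustration only; not part of package tau.
parse_basic_tag <- function(tag) {
    ## Split the tag into its hyphen-separated subtags.
    parts <- strsplit(tag, "-", fixed = TRUE)[[1L]]
    out <- c(Language = NA_character_, Script = NA_character_,
             Region = NA_character_, Variant = NA_character_)
    ## Assumed subtag shapes (simplified from the RFC 4646 grammar).
    shapes <- c(Language = "^[A-Za-z]{2,3}$",
                Script   = "^[A-Za-z]{4}$",
                Region   = "^([A-Za-z]{2}|[0-9]{3})$",
                Variant  = "^([A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3})$")
    ## Walk the slots in order; consume a subtag only if it fits the slot.
    i <- 1L
    for (slot in names(shapes)) {
        if (i <= length(parts) && grepl(shapes[[slot]], parts[i])) {
            out[slot] <- parts[i]
            i <- i + 1L
        }
    }
    ## A language subtag is mandatory, and nothing may be left over.
    if (is.na(out["Language"]) || i <= length(parts))
        stop("unsupported language tag: ", tag)
    out
}

parse_basic_tag("sr-Latn-CS")
```

Because each slot is tried in turn and skipped when the next subtag does not fit, a tag such as "es-419" fills only the language and region slots. Anything that does not fit this basic shape raises an error in the sketch.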
Language subtags are mainly derived from ISO 639-1 and ISO 639-2, script subtags from ISO 15924, and region subtags from ISO 3166-1 alpha-2 and UN M.49. (See package ISOcodes for more information about these standards.) Variant subtags are not derived from any standard. The Language Subtag Registry (http://www.iana.org/assignments/language-subtag-registry), maintained by the Internet Assigned Numbers Authority (IANA), lists the currently valid public subtags.
See http://en.wikipedia.org/wiki/IETF_language_tag for more information.
Note in particular that so-called grandfathered and private-use tags are currently not supported.
A character matrix with 4 columns named "Language",
"Script", "Region", and "Variant", giving the
corresponding subtags (or NA if these were missing from the
language tag).
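As a rough illustration of working with a matrix of this layout, the sketch below builds one by hand with the documented column names and then selects rows by missing or numeric subtags. The matrix is a hand-built stand-in, not captured output of parse_IETF_language_tag(), and the object names m, no_script, and numeric_region are chosen here for illustration.

```r
## Hand-built stand-in with the documented column names; not captured output.
m <- matrix(c("de", NA,     "CH",  NA,
              "sr", "Latn", "CS",  NA,
              "es", NA,     "419", NA),
            ncol = 4L, byrow = TRUE,
            dimnames = list(NULL,
                            c("Language", "Script", "Region", "Variant")))

## Rows whose tag carried no script subtag (NA marks a missing subtag).
no_script <- is.na(m[, "Script"])

## Rows whose region is a three-digit UN M.49 code rather than a
## two-letter ISO 3166-1 code.
numeric_region <- grepl("^[0-9]{3}$", m[, "Region"])

m[no_script, "Language"]
```

Testing for NA with is.na() rather than comparing against an empty string matches the documented convention that missing subtags are reported as NA.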
## German as used in Switzerland:
parse_IETF_language_tag("de-CH")
## Serbian written using Latin script as used in Serbia and Montenegro:
parse_IETF_language_tag("sr-Latn-CS")
## Spanish appropriate to the UN Latin American and Caribbean region:
parse_IETF_language_tag("es-419")
## All in one:
parse_IETF_language_tag(c("de-CH", "sr-Latn-CS", "es-419"))