IETF Standards Summary from VeriSign, Inc.

IETF Standards Summary


The Domain Name System (DNS) recognizes only the ASCII characters A-Z, 0-9 and '-'. This limits the characters available for building domain names to 37 of the more than 40,000 characters identified within Unicode. To create domain names from the wider range of Unicode characters, a standardized character-encoding scheme must uniquely map Unicode code points to an ASCII representation.
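The count of 37 can be checked with a short Python sketch. The names LDH_CHARS and is_ldh_label are hypothetical, chosen only for this illustration, and the check ignores other rules on labels (such as length limits).

```python
import string

# The 37 characters counted in the text: A-Z (case is not significant),
# 0-9, and the hyphen.
LDH_CHARS = string.ascii_uppercase + string.digits + "-"


def is_ldh_label(label: str) -> bool:
    """Hypothetical helper: True if every character is a letter, digit or '-'."""
    allowed = set(LDH_CHARS + string.ascii_lowercase)
    return len(label) > 0 and all(ch in allowed for ch in label)


print(len(LDH_CHARS))                 # 37
print(is_ldh_label("IBM"))            # True
print(is_ldh_label("b\u00fccher"))    # False: "bücher" contains a non-ASCII letter
```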

The Internet Engineering Task Force (IETF) has led the effort in standardizing the way that non-ASCII characters are to be represented within and handled by DNS. The IETF published three standards related to Internationalized Domain Names (IDN):

  • Encoding scheme for IDNs
  • Name preparation
  • IDNs in applications

Encoding Scheme

The encoding scheme for IDNs is an ASCII Compatible Encoding (ACE), which encodes the local-language characters of an IDN into ASCII characters so that DNS can accurately answer a request for an address record. There are several types of ACE. In selecting one as the standard, the IETF had to balance compression against ease of implementation: the preferred ACE would represent the greatest number of characters (code points) without being difficult to deploy. The IETF chose an ACE known as Punycode as the standard.
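As a rough illustration of what Punycode does to a single label, the sketch below uses the "punycode" codec that ships with Python's standard library. The sample label is an assumption chosen for the example and does not come from the IETF documents.

```python
# A label containing one non-ASCII character: "bücher".
label = "b\u00fccher"

# Encode the Unicode label into its ASCII-only Punycode form.
ace = label.encode("punycode")
print(ace)  # b'bcher-kva'

# Decoding recovers the original Unicode label, so the mapping is reversible.
restored = ace.decode("punycode")
print(restored == label)  # True
```

The ASCII letters pass through unchanged, and the position and identity of the non-ASCII character are packed into the suffix after the last hyphen.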

Name Preparation

The name preparation standard provides the rules that ensure uniqueness when registering names built from Unicode code points. The rules define how a set of non-ASCII characters is refined so that there is no ambiguity among the registrations in a given name space. The rules are Mapping, Normalization and Prohibition.

  • Mapping 
    Characters may be mapped to nothing, to a single character or to multiple characters, depending on whether they are useful only in running text or differ only in case. An example of the former: the soft hyphen (U+00AD) is discretionary, is useful only within text, and is otherwise invisible or ignored, so it is mapped to nothing. The more common example is the mapping of a capital letter to a small letter, such as 'B' (U+0042) to 'b' (U+0062). This ensures that a registration such as ibm.com does not conflict with another registration such as IBM.com or iBm.com.
     
    There are cases where a single character will map to multiple characters. The small letter sharp s or 'ß' (U+00DF) has an upper case representation of 'SS' (U+0053, U+0053). This is also the same upper case representation for 'ss' (U+0073, U+0073). Therefore, 'ß' maps to 'ss'.
  • Normalization 
    Once a set of characters has been mapped, the set is normalized. Some input method editors (IMEs) enter characters that look exactly like other characters but have different code points. For example, '１' is a fullwidth digit one (U+FF11) and will normalize into a digit one '1' (U+0031). Normalization also ensures predictable results through ordering where characters have a number of combining diacritics.
  • Prohibition 
    After normalization, the mapped and normalized set of characters is checked against a table of prohibited characters. Characters are prohibited for a variety of reasons; the most common prohibited characters are spaces, which could lead to confusion, and control characters, which cannot be displayed.
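The three steps above can be sketched in Python using the standard library's stringprep and unicodedata modules. This is a simplified illustration, not the full name preparation profile: the function name prepare_label is hypothetical, only the space and control-character prohibition tables are checked, and other requirements (such as bidirectional text checks) are left out.

```python
import stringprep
import unicodedata


def prepare_label(label: str) -> str:
    """Hypothetical helper: simplified mapping, normalization, prohibition."""
    # 1. Mapping: drop characters that map to nothing (table B.1, which
    #    includes the soft hyphen U+00AD) and case-fold the rest (table B.2).
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in label
        if not stringprep.in_table_b1(ch)
    )

    # 2. Normalization: NFKC folds compatibility characters such as the
    #    fullwidth digit one (U+FF11) into their plain equivalents.
    normalized = unicodedata.normalize("NFKC", mapped)

    # 3. Prohibition: reject spaces and control characters. Only a subset
    #    of the prohibition tables is checked here, for illustration.
    for ch in normalized:
        if stringprep.in_table_c11_c12(ch) or stringprep.in_table_c21_c22(ch):
            raise ValueError(f"prohibited character U+{ord(ch):04X}")
    return normalized


print(prepare_label("I\u00adBM"))     # soft hyphen dropped, capitals folded
print(prepare_label("Stra\u00dfe"))   # sharp s mapped to 'ss'
print(prepare_label("\uff11"))        # fullwidth digit one normalized
```

With this sketch, "IBM", "iBm" and "ibm" all prepare to the same string, which is the property that prevents conflicting registrations.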

IDNs in Applications

The IDN in applications standard focuses on the location where the Unicode to ASCII mapping takes place. The IETF's approach places this work in the applications that send traffic to and receive traffic from DNS (browsers, e-mail clients, etc.), which encode and decode the Unicode characters.
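A minimal sketch of the application-side conversion, assuming Python's built-in "idna" codec as a stand-in for what a browser or e-mail client would do before and after talking to DNS. The function names to_dns and to_display and the host name are hypothetical, chosen for illustration.

```python
def to_dns(hostname: str) -> bytes:
    """Hypothetical helper: convert a Unicode host name to its ASCII form."""
    # The codec prepares and encodes each label, adding the "xn--" prefix
    # to labels that contained non-ASCII characters.
    return hostname.encode("idna")


def to_display(ascii_name: bytes) -> str:
    """Hypothetical helper: convert an ASCII-encoded name back for display."""
    return ascii_name.decode("idna")


wire_name = to_dns("b\u00fccher.example")   # "bücher.example"
print(wire_name)                            # b'xn--bcher-kva.example'
print(to_display(wire_name))                # bücher.example
```

Because the conversion happens in the application, the name that reaches DNS contains only ASCII characters, and the user can still be shown the Unicode form.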

The Bottom Line

All of these issues are currently outlined in the IETF Requests for Comments (RFCs).

In summary, enhancing the current DNS to include more than just English characters is not a simple undertaking.

VeriSign is committed to following the IETF standards and supporting rapid deployment of this new technology.




Contact Us

Phone: (703) 925-6999
info@verisign-grs.com