Understanding Character Variants - VeriSign's Character Variant Solution - Blocking from VeriSign, Inc.


Understanding Character Variants



VeriSign's Character Variant Solution - Blocking

In Phase I, VeriSign implemented two new processes:

  • Legacy registrations 
    For existing IDN registrations, VeriSign will generate the appropriate domain names containing character variants and prohibit them from being registered. If one of the generated domain names matches an existing IDN registration, that existing registration will continue to exist.
  • New registrations 
    For new IDN registrations, VeriSign will generate the appropriate domain names containing character variants and prohibit them from being registered. If one of the generated domain names matches an existing IDN registration or its blocked character variants, the new IDN registration will not be accepted.

The base IDN registration and its associated blocked character variants will act as a package. For example, if the base registration is deleted, the associated blocked character variants will be unblocked and become available for registration. This behavior will continue until additional functionality, such as activation of blocked character variants, is added.
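
The registration rules and the package behavior described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not VeriSign's implementation: the class name `VariantRegistry`, its methods, and the `generate_variants` callback are hypothetical, and the variant generator is supplied by the caller.

```python
from typing import Callable, Dict, Iterable, Set


class VariantRegistry:
    """Toy model of the Phase I blocking rules (hypothetical, for illustration)."""

    def __init__(self, generate_variants: Callable[[str], Iterable[str]]) -> None:
        self._generate = generate_variants
        # Maps each base registration to the variants blocked on its behalf.
        self._blocked_by: Dict[str, Set[str]] = {}

    def _all_blocked(self) -> Set[str]:
        blocked: Set[str] = set()
        for variants in self._blocked_by.values():
            blocked |= variants
        return blocked

    def load_legacy(self, names: Iterable[str]) -> None:
        """Legacy rule: existing registrations stay, even if they are variants of each other."""
        names = list(names)
        for name in names:
            self._blocked_by.setdefault(name, set())
        for name in names:
            # Block generated variants, except names that are already registered.
            self._blocked_by[name] = set(self._generate(name)) - set(self._blocked_by)

    def register(self, name: str) -> bool:
        """New rule: reject if the name or any of its variants is registered or blocked."""
        variants = set(self._generate(name)) - {name}
        taken = set(self._blocked_by) | self._all_blocked()
        if name in taken or variants & taken:
            return False
        self._blocked_by[name] = variants
        return True

    def delete(self, name: str) -> None:
        """Deleting the base registration releases its blocked variants (the package)."""
        self._blocked_by.pop(name, None)

    def is_available(self, name: str) -> bool:
        return name not in self._blocked_by and name not in self._all_blocked()
```

With a toy generator in which `a1` and `a2` are variants of each other, registering `a1` blocks `a2`, and deleting `a1` makes `a2` available again.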

By blocking the appropriate character variants, VeriSign hopes to limit the number of new IDN registrations that are adversely affected by character variants. In Phase I, VeriSign will use mapping tables developed by TWNIC to generate the character variants. The TWNIC mapping table will be replaced with the CDNC mapping table when it is available.

Phase I was implemented entirely by VeriSign.

Example

The following is an example of character variants and how they will be handled. To simplify the example, a combination of shapes has been used to represent the Unicode points that represent Traditional and Simplified Chinese characters.

A Chinese character variant can fall into two classes:

  • Class A: a variant where the string consists entirely of characters used in Simplified Chinese, or entirely of characters used in Traditional Chinese. The VeriSign Character Variant Solution will block Class A variants.
  • Class B: a variant where the string mixes characters unique to Simplified Chinese with characters unique to Traditional Chinese. The VeriSign Character Variant Solution will not block Class B variants.

Desired Traditional Chinese registration (base registration): image image image image

Mapping:

Traditional Chinese    Simplified Chinese
image                  image
image                  image
image                  image
image                  image

In the above Traditional Chinese registration, all of the Unicode code points (image, image, image, image) are contained within the character set used for Traditional Chinese. However, image and image are unique to Traditional Chinese but can be mapped to the characters image and image, respectively, in Simplified Chinese.

The following table shows the base registration and the character variants that were generated for the Traditional Chinese registration above:

Base Registration    image image image image    Registered
Class A variant      image image image image    Blocked
Class B variant 1    image image image image    Not blocked
Class B variant 2    image image image image    Not blocked

Only the Class A variant(s) would be blocked. A Class A variant is commonly referred to as the "mirror" of the original registration; in practice, there may be multiple mirrors. The process of blocking character variants will only be applied to IDN registrations that are composed entirely of the Simplified or Traditional Chinese scripts.
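
The example can be made concrete with a short sketch that generates the variants of a base string and sorts them into Class A and Class B. The two-entry mapping below is an assumption chosen for illustration only; it is not the TWNIC or CDNC table, and the function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

# Illustrative sample only; NOT the TWNIC or CDNC mapping table.
TRAD_TO_SIMP: Dict[str, str] = {"國": "国", "學": "学"}
SIMP_TO_TRAD: Dict[str, str] = {s: t for t, s in TRAD_TO_SIMP.items()}


def classify(label: str) -> str:
    """Return "B" if the label mixes Traditional-only and Simplified-only characters, else "A"."""
    has_trad_only = any(ch in TRAD_TO_SIMP for ch in label)
    has_simp_only = any(ch in SIMP_TO_TRAD for ch in label)
    return "B" if has_trad_only and has_simp_only else "A"


def variants(base: str) -> List[Tuple[str, str]]:
    """List every variant of `base` (excluding `base` itself) with its class."""
    choices = []
    for ch in base:
        alt = TRAD_TO_SIMP.get(ch) or SIMP_TO_TRAD.get(ch)
        choices.append((ch, alt) if alt else (ch,))
    result = []
    for combo in product(*choices):
        label = "".join(combo)
        if label != base:
            result.append((label, classify(label)))
    return result


def blocked_variants(base: str) -> List[str]:
    """Only Class A variants (the "mirrors") are blocked."""
    return [label for label, cls in variants(base) if cls == "A"]
```

For the Traditional Chinese base string 中國文學, this sketch yields two Class B variants (中國文学 and 中国文學) and one Class A variant (中国文学), which is the only one blocked. That matches the shape of the table above: one mirror and two unblocked mixed-script variants.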

The language tables deployed in the VeriSign Character Variant Solution include (as of April 24, 2004):

  • Chinese
  • Japanese
  • Polish (Only the Latin characters)
  • Greek: Unicode Code Points U+002D, U+0030 through U+0039,  
    U+0370 through U+03FF
  • Russian: Unicode Code Points U+002D, U+0030 through U+0039, 
    U+0400 through U+04FF, U+0500 through U+052F
  • Belarusian: Unicode Code Points U+002D, U+0030 through U+0039, 
    U+0400 through U+04FF, U+0500 through U+052F
  • Ukrainian: Unicode Code Points U+002D, U+0030 through U+0039, 
    U+0400 through U+04FF, U+0500 through U+052F
  • Serbian: Unicode Code Points U+002D, U+0030 through U+0039, 
    U+0400 through U+04FF, U+0500 through U+052F
  • Macedonian: Unicode Code Points U+002D, U+0030 through U+0039,  
    U+0400 through U+04FF, U+0500 through U+052F
  • Bulgarian: Unicode Code Points U+002D, U+0030 through U+0039,  
    U+0400 through U+04FF, U+0500 through U+052F
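
As a rough illustration of how such code point ranges might be applied, the sketch below checks whether every character of a label falls inside the ranges listed for a language. The ranges are copied from the list above; the names `LANGUAGE_TABLES` and `in_table` are hypothetical, and the Chinese, Japanese, and Polish tables are omitted because their code points are not enumerated here.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # inclusive (low, high) code point bounds

# Hyphen and digits appear in every enumerated table above.
COMMON: List[Range] = [(0x002D, 0x002D), (0x0030, 0x0039)]
CYRILLIC: List[Range] = COMMON + [(0x0400, 0x04FF), (0x0500, 0x052F)]

LANGUAGE_TABLES: Dict[str, List[Range]] = {
    "greek": COMMON + [(0x0370, 0x03FF)],
    "russian": CYRILLIC,
    "belarusian": CYRILLIC,
    "ukrainian": CYRILLIC,
    "serbian": CYRILLIC,
    "macedonian": CYRILLIC,
    "bulgarian": CYRILLIC,
}


def in_table(label: str, language: str) -> bool:
    """True if the label is non-empty and every code point is in the language's ranges."""
    ranges = LANGUAGE_TABLES[language]
    return bool(label) and all(
        any(low <= ord(ch) <= high for low, high in ranges) for ch in label
    )
```

Under these ranges, `in_table("пример-1", "russian")` holds, while a Latin-script label such as `"example"` fails the Russian check because a through z are not in the listed ranges.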



Contact Us

Phone: (703) 925-6999
info@verisign-grs.com