Understanding Character Variants
VeriSign's Character Variant Solution - Blocking
In Phase I, VeriSign implemented two new processes:
- Legacy registrations
For existing IDN registrations, VeriSign will generate the appropriate
domain names containing character variants and prohibit them from being
registered. If one of the generated domain names matches an existing
IDN registration, that existing registration will continue to exist.
- New registrations
For new IDN registrations, VeriSign will generate the appropriate
domain names containing character variants and prohibit them from being
registered. If one of the generated domain names matches an existing
IDN registration or its blocked character variants, the new IDN registration
will not be accepted.
The base IDN registration
and its associated blocked character variants will act as a package.
For example, if the base registration is deleted, the associated blocked
character variants will be unblocked and become available for registration.
This behavior will continue until additional functionality, such as
activation of blocked character variants, is added.
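The package behavior described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not VeriSign's implementation: the class name `VariantRegistry`, its methods, and the `generate_variants` callable are all hypothetical, and the toy variant generator simply swaps letter case to stand in for a real mapping table.

```python
class VariantRegistry:
    """Toy model of variant blocking (hypothetical, for illustration only)."""

    def __init__(self, generate_variants):
        # generate_variants: assumed callable that returns the variant
        # names to block for a given base name.
        self._generate_variants = generate_variants
        self._registered = {}  # base name -> set of variants it blocks
        self._blocked = {}     # blocked variant -> base name blocking it

    def register(self, name):
        """Return True if the registration is accepted, False otherwise."""
        if name in self._registered or name in self._blocked:
            return False
        variants = set(self._generate_variants(name)) - {name}
        # A new registration is refused if any generated variant matches
        # an existing registration or an already blocked variant.
        if any(v in self._registered or v in self._blocked for v in variants):
            return False
        self._registered[name] = variants
        for v in variants:
            self._blocked[v] = name
        return True

    def delete(self, name):
        """Delete a base registration and unblock its variants."""
        for v in self._registered.pop(name, set()):
            self._blocked.pop(v, None)


# Toy generator: the "variant" of a name is its case-swapped form.
registry = VariantRegistry(lambda name: {name.swapcase()})
first = registry.register("ABCD")         # accepted; "abcd" is now blocked
second = registry.register("abcd")        # refused while "ABCD" exists
registry.delete("ABCD")                   # package removed, "abcd" unblocked
third = registry.register("abcd")         # accepted after the deletion
```

Keeping the blocked variants keyed by their base registration is what lets a deletion release the whole package in one step.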
By blocking the appropriate character variants,
VeriSign hopes to limit the number of new IDN registrations that are
adversely affected by character variants. In Phase I, VeriSign will
use mapping tables developed by TWNIC to generate the character variants.
The TWNIC mapping table will be replaced with the CDNC mapping table
when it is available.
Phase I was implemented entirely by VeriSign.
Example
The following is an example of character variants
and how they will be handled. To simplify the example, a combination
of shapes has been used to represent the Unicode code points that represent
Traditional and Simplified Chinese characters.
A Chinese character variant falls into one of two
classes:
- Class A: a variant whose string consists entirely of Simplified
Chinese characters or entirely of Traditional Chinese characters. The
VeriSign Character Variant Solution will block Class A variants.
- Class B: a variant whose string mixes characters unique to Simplified
Chinese with characters unique to Traditional Chinese. The VeriSign
Character Variant Solution will not block Class B variants.
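The two classes can be told apart with a simple membership test. The sketch below is an assumption-laden illustration: `classify_variant` is a hypothetical helper, and ASCII placeholders stand in for real characters (uppercase letters for characters unique to Traditional Chinese, lowercase letters for characters unique to Simplified Chinese, digits for characters shared by both).

```python
def classify_variant(label, simplified_only, traditional_only):
    """Return "A" or "B" for a variant string.

    simplified_only / traditional_only are assumed sets of characters
    that exist in only one of the two character sets.
    """
    has_simplified = any(ch in simplified_only for ch in label)
    has_traditional = any(ch in traditional_only for ch in label)
    # Class B mixes characters unique to each set; everything else is
    # entirely Simplified or entirely Traditional, hence Class A.
    return "B" if has_simplified and has_traditional else "A"


# Placeholder character sets (hypothetical stand-ins for real tables).
SIMPLIFIED_ONLY = {"x", "y"}
TRADITIONAL_ONLY = {"X", "Y"}

all_traditional = classify_variant("1X2Y", SIMPLIFIED_ONLY, TRADITIONAL_ONLY)
all_simplified = classify_variant("1x2y", SIMPLIFIED_ONLY, TRADITIONAL_ONLY)
mixed = classify_variant("1X2y", SIMPLIFIED_ONLY, TRADITIONAL_ONLY)
```

Under these placeholder sets, the first two strings are Class A and the mixed string is Class B.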
Desired Traditional Chinese registration (base registration): [figure not reproduced]
Mapping: [figure not reproduced]
In the above Traditional Chinese registration, all four Unicode code
points are contained within the character set used for Traditional Chinese.
However, two of them are unique to Traditional Chinese and can each be
mapped to a corresponding character in Simplified Chinese.
The following table shows the base registration
and the character variants that were generated for the Traditional Chinese
registration above:
Domain name | Status
Base registration | Registered
Class A variant | Blocked
Class B variant 1 | Not blocked
Class B variant 2 | Not blocked
Only the Class A variant(s) would be blocked. A Class A variant is
commonly referred to as the "mirror" of the original registration. In
practice, there may be multiple mirrors. The process of blocking character
variants will only be applied to IDN registrations that are composed
entirely of Simplified Chinese characters or entirely of Traditional
Chinese characters.
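The example above, with one base registration, one blocked mirror, and two unblocked mixed variants, can be reproduced with a short enumeration. This is a minimal sketch that assumes a one-to-one mapping from Traditional-only characters to Simplified counterparts; real mapping tables may be one-to-many, which is one way multiple mirrors can arise. The function names and the placeholder characters (uppercase for Traditional-only, lowercase for Simplified, digits for shared) are hypothetical.

```python
from itertools import product


def variants_of(base, to_simplified):
    """List every variant of base, excluding base itself."""
    # Each mappable character offers two choices: keep it or map it.
    options = [
        (ch, to_simplified[ch]) if ch in to_simplified else (ch,)
        for ch in base
    ]
    names = ("".join(combo) for combo in product(*options))
    return [name for name in names if name != base]


def mirror_of(base, to_simplified):
    """Map every mappable character, giving the Class A variant."""
    return "".join(to_simplified.get(ch, ch) for ch in base)


# Placeholder mapping: two of the four characters are Traditional-only.
TO_SIMPLIFIED = {"X": "x", "Y": "y"}
base = "1X2Y"

mirror = mirror_of(base, TO_SIMPLIFIED)
variants = variants_of(base, TO_SIMPLIFIED)
blocked = [v for v in variants if v == mirror]
not_blocked = [v for v in variants if v != mirror]
```

With two mappable characters there are 2 x 2 = 4 combinations: the base itself, the fully mapped mirror (blocked), and two mixed strings (not blocked), matching the four rows of the table.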
The language tables deployed in the VeriSign Character Variant Solution
include (as of April 24, 2004):
- Chinese
- Japanese
- Polish (only the Latin characters)
- Greek: Unicode code points U+002D, U+0030 through U+0039, U+0370 through U+03FF
- Russian: Unicode code points U+002D, U+0030 through U+0039, U+0400 through U+04FF, U+0500 through U+052F
- Belarusian: Unicode code points U+002D, U+0030 through U+0039, U+0400 through U+04FF, U+0500 through U+052F
- Ukrainian: Unicode code points U+002D, U+0030 through U+0039, U+0400 through U+04FF, U+0500 through U+052F
- Serbian: Unicode code points U+002D, U+0030 through U+0039, U+0400 through U+04FF, U+0500 through U+052F
- Macedonian: Unicode code points U+002D, U+0030 through U+0039, U+0400 through U+04FF, U+0500 through U+052F
- Bulgarian: Unicode code points U+002D, U+0030 through U+0039, U+0400 through U+04FF, U+0500 through U+052F
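The code point ranges listed above lend themselves to a simple range check. The sketch below assumes, as an illustration only, that a language table is used to test whether every character of a label falls inside the listed ranges; the source does not describe how the tables are applied, and the names `ALLOWED_RANGES` and `label_allowed` are hypothetical. Only Greek and Russian are shown; the other Cyrillic-based tables list the same ranges as Russian.

```python
# Inclusive code point ranges copied from the list above.
ALLOWED_RANGES = {
    "Greek": [(0x002D, 0x002D), (0x0030, 0x0039), (0x0370, 0x03FF)],
    "Russian": [
        (0x002D, 0x002D),
        (0x0030, 0x0039),
        (0x0400, 0x04FF),
        (0x0500, 0x052F),
    ],
}


def label_allowed(label, language):
    """Return True if every character of label is in the language's ranges."""
    ranges = ALLOWED_RANGES[language]
    return all(
        any(low <= ord(ch) <= high for low, high in ranges)
        for ch in label
    )


russian_ok = label_allowed("пример-1", "Russian")   # Cyrillic, hyphen, digit
greek_ok = label_allowed("αβγ-2004", "Greek")       # Greek, hyphen, digits
wrong_table = label_allowed("пример", "Greek")      # Cyrillic is outside Greek
latin = label_allowed("abc", "Russian")             # Latin letters not listed
```

Storing the ranges as inclusive pairs keeps the table readable next to the published list and makes it easy to add the remaining languages.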