Homoglyphs

One thing to watch out for when sending crypto to an ENS name, or when buying and trading names, is homoglyphs: names that look identical or near-identical to a well-known name but are composed of different characters.

For example, vita‍lik.eth is not the same name as vitalik.eth.

Confusing? Let's take a look at what's going on behind the scenes!

Analyzing Unicode

In situations like this, it's a good idea to use a Unicode analyzer to take a closer look at what the string of text actually contains.

When analyzing vita‍lik.eth with a unicode analyzer we get this:

Can you spot the difference?

Real vitalik.eth

Fake vitalik.eth

Analyzing both ENS names shows that one of them is fake: it contains an invisible character, a so-called Zero-Width Joiner (ZWJ).
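A Unicode analyzer essentially lists the code points that make up a string. The following is a minimal Python sketch of that idea using only the standard library. The helper name `describe` is illustrative, and the fake name is written with an explicit `\u200d` escape so that the invisible character is visible in the code:

```python
import unicodedata

def describe(name: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) for every character in the string."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in name]

real = "vitalik.eth"
fake = "vita\u200dlik.eth"  # Zero-Width Joiner written as an escape

print(real == fake)          # False: the strings differ even if they look alike
print(len(real), len(fake))  # 11 12: the fake name is one character longer

for code, label in describe(fake):
    print(code, label)       # one line per character, including U+200D
```

Comparing the lengths of the two strings is often the quickest hint that something invisible is hiding in a name.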

Zero-Width Joiners

For more information, view the Emojipedia article on Zero-Width Joiners

Zero-Width Joiners are invisible characters intended to be used to glue emoji together to alter their properties. For example, by analyzing the ❤️‍🔥 burning heart emoji with a unicode analyzer we can find out what characters it consists of:

As you can see the ❤️‍🔥 emoji actually consists of several different characters glued together by the use of a Zero-Width Joiner and a Variation Selector.

Variation Selectors

Variation selectors are used in unicode to display variants of an emoji.

By default some emoji are displayed with black & white representations, such as the heavy black heart we talked about in the previous example.

The full sequence looks like this:

  1. ❤ Heavy Black Heart

When analyzing the emoji in a unicode analyzer we see that it's quite simply a one character emoji:

Adding a Variation Selector to the sequence displays a colourful version of the emoji instead.

The full sequence looks like this:

  1. ❤ Heavy Black Heart

  2. Variation Selector-16

Emoji keyboards such as those found on phones usually produce emojis containing variation selectors for that purpose.

By adding a Zero-Width Joiner and a Fire Emoji we get a Heart on Fire emoji.

The full sequence looks like this:

  1. ❤ Heavy Black Heart

  2. Variation Selector-16

  3. Zero Width Joiner

  4. 🔥 Fire

As you can see the Heart on Fire emoji isn't a single character, but a combination of emoji and special characters.
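The sequence above can be rebuilt code point by code point. This is a small Python sketch, assuming a terminal and font that can render emoji; the constant names are illustrative:

```python
import unicodedata

HEART = "\u2764"      # Heavy Black Heart
VS16 = "\ufe0f"       # Variation Selector-16
ZWJ = "\u200d"        # Zero Width Joiner
FIRE = "\U0001f525"   # Fire

heart_on_fire = HEART + VS16 + ZWJ + FIRE

print(heart_on_fire)        # drawn as a single emoji where the font supports it
print(len(heart_on_fire))   # 4: the "single" emoji is four code points

for ch in heart_on_fire:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```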

In ENS names, variation selectors are not allowed. ENS normalization strips them out when a name is typed or pasted into the official ENS Manager App. However, when interacting directly with the contracts, or when using third-party services where that normalization step is missing, it is possible to register names with emoji sequences whose variation selectors have not been stripped out. Such a name is invalid in ENS systems.

Variation Selector-16 is an invisible code point which specifies that the preceding character should be displayed with emoji presentation. It is only required if the preceding character defaults to text presentation.

It is often used in Emoji ZWJ Sequences, where one or more characters in the sequence have both text and emoji presentation but otherwise default to text (black and white) display.

ENS name normalization strips the Variation Selector out of the name, so any ENS name that still contains a Variation Selector is invalid.
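A quick way to check a name for the invisible characters discussed so far is to scan it for their code points. The sketch below is only an illustration: the `INVISIBLE` table and the `find_invisible` helper are hypothetical, cover just the two characters mentioned in this article, and are not the ENS normalization algorithm.

```python
# Illustrative subset only; real ENS normalization covers far more cases.
INVISIBLE = {
    "\u200d": "ZERO WIDTH JOINER",
    "\ufe0f": "VARIATION SELECTOR-16",
}

def find_invisible(name: str) -> list[tuple[int, str]]:
    """Return (index, description) for each listed invisible character in the name."""
    return [(i, INVISIBLE[ch]) for i, ch in enumerate(name) if ch in INVISIBLE]

print(find_invisible("vitalik.eth"))        # []
print(find_invisible("vita\u200dlik.eth"))  # [(4, 'ZERO WIDTH JOINER')]
print(find_invisible("\u2764\ufe0f.eth"))   # [(1, 'VARIATION SELECTOR-16')]
```

A hit is a reason to look closer rather than proof of fraud, since a Zero-Width Joiner also appears inside emoji sequences such as the Heart on Fire.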

Confusables

Confusables are characters which can be easily confused with other characters either by appearing exactly the same, or by appearing very similar to another character.

For example, take a look at the following characters to get an idea of how similar some characters can be:

Take ens.eth and еns.eth: they are two different names that use two e's from different alphabets. Can you spot the difference?

Real ens.eth

Fake ens.eth

As you can see, the two names appear nearly identical, even though one uses a Cyrillic е instead of a Latin e:

Unicode Character    Description
е                    Cyrillic e
e                    Latin e

This can be hard to spot, so employing a unicode analyzer is recommended as it will clearly show you what characters are in the name.
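One rough way to catch this kind of substitution in code is to check whether a name mixes letters from more than one script. Python's standard library has no script lookup, so the sketch below guesses the script from the first word of each letter's Unicode name. This is a heuristic chosen for illustration, and `scripts_used` is a hypothetical helper:

```python
import unicodedata

def scripts_used(name: str) -> set[str]:
    """Rough script guess: the first word of each letter's Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in name
        if ch.isalpha()
    }

real = "ens.eth"
fake = "\u0435ns.eth"  # first letter is CYRILLIC SMALL LETTER IE

print(sorted(scripts_used(real)))  # ['LATIN']
print(sorted(scripts_used(fake)))  # ['CYRILLIC', 'LATIN']
```

A name that reports more than one script is worth inspecting character by character. A name written entirely in one non-Latin script will not be flagged by this check.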

So far we've reviewed a few confusable characters, but unicode contains many more. These are results only for the letters e n and s:

Try searching for a name in the Unicode confusable utility below to get an idea of just how many confusable characters (and character combinations) exist!

Useful Links

Raffy's confusables tool can also be used to gain a better insight into confusable characters:

Small Capitals

In ENS name normalization upper case letters are normalized to their lower case counterparts. Any ENS name containing an upper case character is invalid.

Small capital characters are often confused with their regular capital (upper case) counterparts. For example:

  • Regular capitals

  • Small capitals

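The difference shows up as soon as the characters are inspected. In this Python sketch the three small capital code points are picked as an illustration for the letters E, N and S:

```python
import unicodedata

regular = "ENS"
small_caps = "\u1d07\u0274\ua731"  # small capital E, N and S

for ch in regular + small_caps:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Regular capitals fold to plain lower case letters; small capitals do not.
print(regular.lower() == "ens")     # True
print(small_caps.lower() == "ens")  # False
```

Because small capitals are separate characters rather than upper case forms, lower-casing a name does not turn them into ordinary letters.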

Arabic numerals

One common issue we've encountered in support tickets is confusables among Arabic and Persian digits. Several Arabic and Persian keyboard digits appear identical, which leads to confusing scenarios when Arabic and Persian digits are mixed in one ENS name.

Arabic digits

Persian digits
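In Unicode, Arabic digits are named "Arabic-Indic" digits and Persian digits are named "Extended Arabic-Indic" digits, so a mixed name can be detected by grouping digits by that name prefix. The following Python sketch shows one way to do it; `digit_families` is a hypothetical helper, not part of any ENS tooling:

```python
import unicodedata

def digit_families(label: str) -> set[str]:
    """Group each decimal digit by the prefix of its Unicode name."""
    families = set()
    for ch in label:
        if ch.isdecimal():
            name = unicodedata.name(ch)
            # "ARABIC-INDIC DIGIT ONE" -> "ARABIC-INDIC"; plain "DIGIT ONE" -> "ASCII"
            families.add(name.rsplit(" DIGIT ", 1)[0] if " DIGIT " in name else "ASCII")
    return families

arabic = "\u0661\u0662\u0663"    # Arabic-Indic 1 2 3
persian = "\u06f1\u06f2\u06f3"   # Extended Arabic-Indic (Persian) 1 2 3
mixed = "\u0661\u06f2\u0663"     # a Persian 2 between two Arabic digits

print(arabic == persian)             # False: different code points
print(int(arabic) == int(persian))   # True: same numeric value
print(sorted(digit_families(mixed)))
```

A label that reports more than one digit family mixes digit sets and deserves a closer look.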

Warning signs

Fortunately, there are several warning signs you can look for. Services will often have indicators for names containing unusual characters, and fake homoglyph names will often be priced unusually low for the names they pretend to be.

Service indicators

In spite of these characters being invisible, many services will warn or provide some kind of indicator that the name contains uncommon symbols.

OpenSea

OpenSea shows a warning triangle next to any name that contains Unicode symbols:

Note: This doesn't mean that the name is fraudulent, simply that it contains unicode symbols. Use a Unicode Analyzer to analyze the name to make sure.

Etherscan

Etherscan shows an asterisk (*) before any name that contains characters other than a-Z0-9.

Note: This doesn't mean that the name is fraudulent, simply that it contains unicode symbols. Use a Unicode Analyzer to analyze the name to make sure.
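A similar indicator is easy to approximate for your own checks. The sketch below is not Etherscan's implementation. It assumes that "basic" means ASCII letters and digits, and that the dots separating labels are allowed; `needs_closer_look` is a hypothetical helper:

```python
import re

# Assumption: only ASCII letters, digits and the dots between labels count as basic.
BASIC_NAME = re.compile(r"[A-Za-z0-9.]+")

def needs_closer_look(name: str) -> bool:
    """True if the name contains anything other than ASCII letters, digits or dots."""
    return BASIC_NAME.fullmatch(name) is None

print(needs_closer_look("vitalik.eth"))        # False
print(needs_closer_look("vita\u200dlik.eth"))  # True
print(needs_closer_look("\u0435ns.eth"))       # True
```

As with the service indicators, a positive result only means the name contains unusual characters, not that it is fraudulent.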

ENS.Vision

ENS.Vision also shows a warning triangle next to any name that contains Unicode symbols:

Note: This doesn't mean that the name is fraudulent, simply that it contains unicode symbols. Use a Unicode Analyzer to analyze the name to make sure.

X2Y2

X2Y2 also shows a warning triangle next to any name that contains Unicode symbols:

Note: This doesn't mean that the name is fraudulent, simply that it contains unicode symbols. Use a Unicode Analyzer to analyze the name to make sure.

LooksRare

LooksRare does not offer any indicator or warning for names containing unicode symbols.

Rarible

Rarible does not offer any indicator or warning for names containing unicode symbols.

Unusual prices

The names most often faked with homoglyphs are ones that would fetch a high price on secondary markets. It's common sense, but it's worth repeating: if it seems too good to be true, it usually is.

If you see an ENS name on a secondary market that's very attractive but the price is unusually low, don't rush to buy it. Take some time to copy and paste the name into a Unicode Analyzer and verify that the name is what it seems to be.
