Chapter 6

The anatomy of a link – What makes a good (and bad) link

Not all links are created equal.

One part of the Google algorithm is the number of links pointing at your website, but it
would be foolish to make this a raw number and not take into account the quality of
those links. Otherwise, it would just be a free for all, and everyone would be trying to
get as many links as they can with no regard for the quality of those links.

As mentioned earlier, back in the early days of search engine optimization, it pretty
much was a free for all because the search engines were not as good at determining the
quality of a link. As search engines have become more advanced, they have been able to
expand the link-related signals they can use beyond raw numbers.

Search engines can look at a number of factors in combination to give them an indicator
of quality. More to the point, search engines can attempt to understand whether the
link is likely to be a genuine, editorially-given link, or a spammy link. These factors
are outlined in more detail below.

There is something important to remember here, though: to a certain degree, it isn’t
really the link itself you should care about. What matters right now is the page and
the domain the link comes from. Knowing what these factors are helps set the scene
for the types of links you should (and shouldn’t) be getting for your own
website.

Before diving into the finer details of links and linking pages, let’s look more broadly
at what makes a good link. To me, there are three broad elements of a link:

  • Trust
  • Diversity
  • Relevance

If you can get a link that ticks off all three of these, you’re onto a winner! You
shouldn’t obsess over doing so, but you should always have it in the back of your
mind.

Elements of a link that affect its quality

We must also consider which elements of a link itself the search engines can use to
assess its quality and relevance. They can then decide how much link equity to pass
across that link.

Again, many of these elements are part of the “reasonable surfer” model and may include
things such as:

  • The position of the link on the page (e.g., in the body, footer, sidebar, etc.)
  • Font size / color of the link
  • If the link is within a list, and the position within that list
  • If the link is text or an image and, if it is an image, how big that image is
  • Number of words used as the anchor text

We’ll look at more of these elements later.

For now, here’s the basic anatomy of a link:

URL

The most important part of a link is the URL that is contained within it. If the URL
is one that points to your website, then you’ve built a link. At first glance, you
may not realize that the URL can affect the quality and trust that Google puts into
that link, but it can have quite a big effect.

For example, the link may be pointing to a URL that:

  • Goes through lots of redirects
  • Is blocked by a robots.txt file
  • Is a spammy page (e.g., keyword stuffed, sells links, machine generated)
  • Contains viruses or malware
  • Contains characters that Google can’t / won’t crawl
  • Contains extra tracking parameters at the end of the URL

All of these things can alter the way that Google handles that link. It could
choose not to follow the link or it could follow the link, but choose not to
pass any PageRank across it.

In extreme cases, such as linking to spammy pages or malware, Google may even choose
to penalize the page containing the link to protect its users. Google does not
want its users to visit pages that link to spam and malware, so it may decide to
take those pages out of its index or make them hard to find.

How this affects your work

In general, you probably don’t need to worry too much on a daily basis about this
stuff, but it is certainly something you need to be aware of. For example, you
really need to make sure that any page you link to from your own site is good
quality. This is common sense, really, but SEO professionals tend to take it a lot
more seriously when they realize that they could receive a penalty if they don’t pay
attention!

In terms of getting links, you can do a few things to make your links as clean as
possible:

  • Avoid getting links to pages that redirect to others. In particular, avoid
    links to a page that has a 302 redirect, because Google may not pass PageRank
    across these.
  • Avoid getting links to URLs that have tracking parameters on the end.
    Sometimes Google will index two copies of the same page and the link equity
    will be split. If you absolutely can’t avoid this, you can use a
    rel=canonical tag to tell Google which URL is the canonical one so that it
    passes the link equity across to that version.
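
If you want to spot these URLs in a list of links, a short script can help. The sketch below is only an illustration: the parameter names in TRACKING_PREFIXES and TRACKING_NAMES are my own assumptions about what counts as a tracking parameter, so adjust them for the analytics tools you actually use.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed examples of tracking parameters; this list is illustrative, not exhaustive.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"gclid", "fbclid"}


def is_tracking_param(name):
    """True if a query parameter name looks like a tracking parameter."""
    lowered = name.lower()
    return lowered.startswith(TRACKING_PREFIXES) or lowered in TRACKING_NAMES


def has_tracking(url):
    """True if the URL carries at least one tracking parameter."""
    query = urlsplit(url).query
    return any(is_tracking_param(name)
               for name, _ in parse_qsl(query, keep_blank_values=True))


def strip_tracking(url):
    """Return the URL with tracking parameters removed and the rest kept."""
    parts = urlsplit(url)
    kept = [(name, value)
            for name, value in parse_qsl(parts.query, keep_blank_values=True)
            if not is_tracking_param(name)]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(strip_tracking("https://example.com/page?utm_source=news&id=7"))
```

The cleaned URL is the one you would ideally ask people to link to, and the one a rel=canonical tag would normally point at.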

Position of the link on a page

As a user, you are probably more likely to click on links in the middle of the page
than in the footer. Google understands this and in 2004 it filed a patent, which was
covered well by Bill Slawski. The patent outlined the “reasonable surfer” model,
which included the following:

Systems and methods consistent with the principles of the invention may
provide a reasonable surfer model that indicates that when a surfer accesses a
document with a set of links, the surfer will follow some of the links with higher
probability than others.

This reasonable surfer model reflects the fact that not all of the links
associated with a document are equally likely to be followed. Examples of
unlikely followed links may include “Terms of Service” links, banner
advertisements, and links unrelated to the document.

Source: http://www.seobythesea.com/2010/05/googles-reasonable-surfer-how-the-value-of-a-link-may-differ-based-upon-link-and-document-features-and-user-data/

The following diagram, courtesy of Moz, helps explain this a bit more:

With crawling technology improving, the search engines are able to find the position
of a link on a page as a user would see it and, therefore, treat it
appropriately.

If you’re a blogger and you want to share a really good resource with your users, you
are unlikely to put the link in the footer, where few readers will actually see it.
Instead, you’re likely to place it front and center of your blog so that as many
people see it and click on it as possible.

Now compare this to a link in your footer to your privacy policy page. It seems a
little unfair to pass the same amount of link equity to both pages, right? You’d
want to pass more to the genuinely good resource rather than a standard page that
users may not actually read.
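
To make the idea concrete, here is a toy sketch of splitting a page's link equity in proportion to how likely each link is to be clicked. The weights in POSITION_WEIGHTS are numbers I have made up purely for illustration; Google has not published the signals or values it uses, so treat this as a way of picturing the model rather than a description of the real algorithm.

```python
# Hypothetical click-likelihood weights by link position (invented for illustration).
POSITION_WEIGHTS = {"body": 1.0, "sidebar": 0.4, "footer": 0.1}


def distribute_equity(links, page_equity=1.0):
    """Split page_equity across links in proportion to their position weight.

    `links` is a list of (url, position) tuples with unique URLs.
    Unknown positions fall back to the lowest weight.
    """
    weights = [POSITION_WEIGHTS.get(position, 0.1) for _, position in links]
    total = sum(weights)
    if total == 0:
        return {}
    return {url: page_equity * weight / total
            for (url, _), weight in zip(links, weights)}


if __name__ == "__main__":
    shares = distribute_equity([("/great-resource", "body"),
                                ("/privacy-policy", "footer")])
    for url, share in shares.items():
        print(url, round(share, 3))
```

With these made-up weights, the link in the body receives far more of the page's equity than the privacy policy link in the footer, which is the behavior the reasonable surfer model describes.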

Anchor text

For SEO professionals, this is probably second in importance to the URL, particularly
as Google puts so much weight on it as a ranking signal, even today when, arguably,
it isn’t as strong a signal as it used to be.

Historically, SEOs have worked very hard to make anchor text of incoming links the
same as the keywords which they want to rank for in the organic search results. So,
if you wanted to rank for “car insurance” you’d try to get a link that has “car
insurance” as the anchor text.

However, after the rollout of Penguin, SEOs started taking a more cautious approach
to anchor text. Many SEO pros reported that a high proportion of unnatural anchor
text in a link profile led to a drop in organic traffic after Google Penguin
launched.

The truth is that the average blogger, webmaster, or Internet user will NOT link to
you using your exact match keywords. It’s even less likely that lots of them will!
Google finally picked up on this trick and hit websites that over-optimized their
anchor text targeting.

Ultimately, you want the anchor text in your link profile to be a varied mix of
words. Some of it should be keyword focused, some of it focused on the brand, and
some of it not focused on anything at all. This helps reduce the chance of you being put on
Google’s radar for having unnatural links. The truth is that a lot of the time, you
also can’t control this and it’s therefore something you shouldn’t worry too much
about.
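
If you do want a rough picture of how varied your anchor text is, a few lines of code can summarize it. This is a minimal sketch under my own assumptions: the three labels ("exact", "brand", "other") and the matching rules are simplifications I have chosen, not categories that Google defines.

```python
from collections import Counter


def classify_anchor(anchor, keywords, brand_terms):
    """Label an anchor as 'exact', 'brand', or 'other' (simplified rules)."""
    text = anchor.strip().lower()
    if text in keywords:
        return "exact"
    if any(term in text for term in brand_terms):
        return "brand"
    return "other"


def anchor_profile(anchors, keywords, brand_terms):
    """Return the share of each anchor category in a list of anchor texts."""
    keywords = {keyword.lower() for keyword in keywords}
    brand_terms = {term.lower() for term in brand_terms}
    counts = Counter(classify_anchor(anchor, keywords, brand_terms)
                     for anchor in anchors)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total
            for label in ("exact", "brand", "other")}


if __name__ == "__main__":
    # "Acme" is a made-up brand name used only for this example.
    anchors = ["car insurance", "Acme Insurance", "click here", "acme.com"]
    print(anchor_profile(anchors, {"car insurance"}, {"acme"}))
```

A profile dominated by the "exact" label is the pattern described above as looking unnatural. There is no published threshold, so use the numbers as a prompt for a closer look rather than a pass or fail test.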

Nofollowed vs. Followed Links

The nofollow attribute, in the context of link profile analysis, will be discussed a
little later. For now, here are some of the basics you need to know.

The nofollow attribute was adopted in 2005 by Yahoo, Google, and MSN (now Bing). It
was intended to tell the search engines when a webmaster didn’t trust the website
they were linking to. It was also intended to be a way of declaring paid links
(i.e., advertising).

In terms of the quality of a link, if the nofollow attribute is applied, it shouldn’t
pass any PageRank. This effectively means that nofollow links are not counted by
Google and shouldn’t make any difference when it comes to organic search
results.

Therefore, when building links, you should always try to get links that are followed,
which means they should help you rank better. Having said that, having a few
nofollow links in your profile is natural.

You should also think of the other benefit of a link – traffic. If a link is
nofollow, but you get lots of targeted traffic through it, then it is worth building.
There can also be a secondary benefit to nofollow links: if you get a nofollow
link from a high quality website which has lots of traffic, you may get other links
from people who see it and also link to you from their own websites.
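
To check whether a link you have earned is followed or nofollowed, you can look for the rel attribute in the page's HTML. The sketch below uses Python's standard html.parser module. The class and function names are my own, and the check is deliberately simple: it only looks for the nofollow token in rel and ignores other ways a page might be excluded, such as a robots meta tag.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, is_followed) for every <a href> in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel can hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" not in rel_tokens))


def collect_links(html):
    """Return a list of (href, is_followed) tuples found in the HTML."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    page = ('<a href="https://example.com/a">A</a>'
            '<a rel="nofollow noopener" href="https://example.com/b">B</a>')
    for href, followed in collect_links(page):
        print(href, "followed" if followed else "nofollow")
```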

Link title

Check out this page for some examples and explanations of the link title
attribute.

The intention here is to help provide more context about the link, particularly for
accessibility, as it provides people with more information if they
need it. If you hover over a link without clicking it, most modern browsers should
display the link title as a small tooltip.
Note that the link title is not meant to be a duplication of anchor text; it is an
aid to help describe the link.

In terms of SEO, the link title doesn’t appear to carry much weight for ranking. In
fact, Google appeared to confirm that they do not use it at PubCon in 2005,
according to this forum thread. My testing has confirmed this as
well.

Text link vs. Image link

So far, this section has discussed text-based links, by which we mean links whose
anchor text contains standard letters or numbers. It is also possible to get
links directly from images. The HTML for this looks slightly different:

Notice the addition of the img tag, whose src attribute points to the image file.
Also note that there is no anchor text as we’d usually find with a text link.
Instead, the ALT text (in this example, the words “Example Alt Text”) is used.

My limited testing on this has shown that the ALT text acts in a similar way to
anchor text but doesn’t appear to be as powerful.
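
As an illustration of how a crawler might read both kinds of link, the sketch below pulls out the text for each link and falls back to the image's ALT text when a link contains no text of its own. The HTML in the example is hypothetical (the image path and URLs are placeholders I have made up), and this is my own simplified reading of the behavior, not a description of how Google actually processes images.

```python
from html.parser import HTMLParser


class AnchorTextParser(HTMLParser):
    """Collect (href, text) pairs, using img alt text when a link has no text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the link currently open, if any
        self._text = []     # text fragments seen inside the open link
        self._alt = []      # alt attributes of images inside the open link

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self._href = attributes["href"]
            self._text, self._alt = [], []
        elif tag == "img" and self._href is not None:
            self._alt.append(attributes.get("alt") or "")

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            alt = " ".join(" ".join(self._alt).split())
            self.links.append((self._href, text or alt))
            self._href = None


def extract_anchor_text(html):
    """Return (href, anchor text or alt text) for each link in the HTML."""
    parser = AnchorTextParser()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    page = ('<a href="https://example.com/">'
            '<img src="/images/logo.png" alt="Example Alt Text" /></a> '
            '<a href="https://example.com/about">About us</a>')
    print(extract_anchor_text(page))
```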

Link contained within JavaScript / AJAX / Flash

In the past, the search engines have struggled with crawling certain web technologies
such as JavaScript, Flash and AJAX. They simply didn’t have the resources or
technology to crawl through these relatively advanced pieces of code. These pieces
of code were mainly designed for users with full browsers which were capable of
rendering them.

For a single user who can interact with a webpage, it is pretty straightforward for
them to execute things like JavaScript and Flash. However, a search engine crawler
isn’t like a standard web browser and doesn’t interact with a page the way a user
does.

This meant that if a link to your website was contained within a piece of JavaScript
code, it was possible that the search engines would never see it – therefore your
link would not be counted. Believe it or not, this actually used to be a good way of
intentionally stopping search engines from crawling certain links.

However, the search engines and the technology they use have developed quite a bit and
they are now, at least, more capable of understanding things like JavaScript. They
can sometimes execute it and find what happens next, such as links and content being
loaded.

In May 2014, Google published a blog post explicitly stating that
they were trying harder to get better at understanding websites that used
JavaScript. They also released a new feature, fetch and render, in Google Webmaster
Tools (now called the Search Console) so that we could better see when Google has
problems with JavaScript.

More recently, in 2017, Google updated their guidelines on the
use of AJAX and have been pretty explicit about their ability to correctly
understand JavaScript. It’s clear that they still have a long way to go, but they are
much better than ever at understanding JavaScript.

But this doesn’t mean we don’t have to worry about things. You still must make links
as clean as possible and make it easy for search engines to find your links. This
means building them in simple HTML whenever possible.

How this affects your work

You should also know how to check if a search engine can find your link. This is
pretty straightforward. Here are a few methods:

  • Disabling Flash, JavaScript and AJAX in your browser using a tool like the Web
    Developer Toolbar for Chrome and Firefox
  • Checking the cache of your page
  • Looking at the source code and seeing if the linked-to page is there and easy to
    understand
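
The last check can be roughly automated. The sketch below looks for a plain HTML link to a given URL in a page's raw source, without running any JavaScript, so a link that only appears after a script executes will not be found. The function name and the example pages are my own, and the result only tells you what is in the raw HTML, not what a particular search engine will or won't see.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href of every <a> tag present in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.add(href.strip())


def link_in_source(html, target_url):
    """True if the raw HTML contains a plain <a href> pointing at target_url."""
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return target_url in parser.hrefs


if __name__ == "__main__":
    plain = '<p>See <a href="https://example.com/">Example</a></p>'
    scripted = ("<script>document.write("
                "'<a href=\"https://example.com/\">Example</a>');</script>")
    print(link_in_source(plain, "https://example.com/"))
    print(link_in_source(scripted, "https://example.com/"))
```

The parser treats everything inside a script tag as text rather than markup, so the second page is reported as not containing the link. That is the situation to avoid when you want a link to be easy to find.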

Text surrounding the link

There was some hot debate around this topic toward the end of 2012, mainly fueled by
this Whiteboard Friday on Moz, where Rand Fishkin predicted that
anchor text, as a signal, would become weaker. In the video, Fishkin gave a number
of examples of what appeared to be strange rankings for certain websites that didn’t
have any exact match anchor text for the keyword being searched. In Fishkin’s
opinion, a possible reason for this could be the co-occurrence of related keywords,
which Google uses as another signal.

It was followed by a post by Bill Slawski, which gave a number of
alternative reasons why these apparently strange rankings may be happening. It was
then followed by another great post by Joshua Giardino, which dove
into the topic in a lot of detail. You should read both of these excellent posts.

Having said all of that, there is some belief that Google could certainly use the
text surrounding a link to infer some relevance, particularly in cases when anchor
text such as “click here” (which isn’t very descriptive) is used.

If you’re building links, you may not always have control of the anchor text, let
alone the content surrounding it. But if you do, then think about how you can
surround the link with related keywords and possibly tone down the use of exact
keywords in the anchor text itself.
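
If you want to review the text around a link you have already earned, the sketch below pulls out a few words on either side of it. The window size and the function name are my own choices, and nothing here reflects how much text, if any, Google actually considers. For simplicity it also counts the contents of script and style tags as words, so it works best on clean article HTML.

```python
from html.parser import HTMLParser


class ContextParser(HTMLParser):
    """Record every word in the page and where links to target_url sit."""

    def __init__(self, target_url):
        super().__init__()
        self.target_url = target_url
        self.words = []          # all words in document order
        self.anchor_spans = []   # (start, end) word indexes of matching links
        self._start = None

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href") == self.target_url:
            self._start = len(self.words)

    def handle_data(self, data):
        self.words.extend(data.split())

    def handle_endtag(self, tag):
        if tag == "a" and self._start is not None:
            self.anchor_spans.append((self._start, len(self.words)))
            self._start = None


def surrounding_text(html, target_url, window=5):
    """Return (words before, anchor text, words after) for each matching link."""
    parser = ContextParser(target_url)
    parser.feed(html)
    parser.close()
    results = []
    for start, end in parser.anchor_spans:
        before = parser.words[max(0, start - window):start]
        after = parser.words[end:end + window]
        results.append((" ".join(before),
                        " ".join(parser.words[start:end]),
                        " ".join(after)))
    return results


if __name__ == "__main__":
    page = ('<p>Compare cheap car insurance quotes before you renew, '
            '<a href="https://example.com/">click here</a> '
            'for our full guide.</p>')
    print(surrounding_text(page, "https://example.com/", window=3))
```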