Chapter 6

The anatomy of a link – What makes a good (and bad) link

Not all links are created equal.

One part of the Google algorithm is the number of links pointing at your website, but it
would be foolish to make this a raw number and not take into account the quality of
those links. Otherwise, it would just be a free for all, and everyone would be trying to
get as many links as they can with no regard for the quality of those links.

As mentioned earlier, back in the early days of search engine optimization, it pretty
much was a free for all because the search engines were not as good at determining the
quality of a link. As search engines have become more advanced, they have been able to
expand the link-related signals they can use beyond raw numbers.

Search engines can look at a number of factors in combination to give them an indicator
of quality. More to the point, search engines can attempt to understand whether the
link is likely to be a genuine, editorially-given link, or a spammy link. These factors
are outlined in more detail below.

There is something important to remember here, though: to a certain degree, it isn’t
the link itself you care about. It is the page and the domain the link comes from that
matter right now. Knowing what these factors are helps set the scene for the types of
links you should (and shouldn’t) be getting for your own website.

Before diving into the finer details of links and linking pages, let’s look more broadly
at what makes a good link. To me, there are three broad elements of a link:

  • Trust
  • Diversity
  • Relevance

If you can get a link that ticks off all three of these, you’re onto a winner! You
shouldn’t obsess over doing so, but you should always have it in the back of your
mind.

Elements of a link that affect its quality

We must also consider which elements of a link itself the search engines can use to assess its quality and relevance. They can then decide how much link equity (and other signals such as relevance) to pass across that link.

Again, many of these elements are part of the “reasonable surfer” model and may include things such as:

  • The position of the link on the page (e.g., in the body, footer, sidebar, etc.)
  • Font size / color of the link
  • If the link is within a list, and the position within that list
  • Whether the link is text or an image and, if it is an image, how big that image is
  • Number of words used as the anchor text

We’ll look at more of these elements later.

For now, here’s the basic anatomy of a link:

URL

The most important part of a link is the URL that is contained within it. If the URL is one that points to a page on your website, then you’ve built a link. At first glance, you may not realize that the URL can affect the quality and trust that Google puts into that link, but it can have quite a big effect.

For example, the link may be pointing to a URL that:

  • Goes through lots of redirects (or just one)
  • Is blocked to search engine crawlers by a robots.txt file
  • Is a spammy page (e.g., keyword-stuffed, sells links, machine-generated)
  • Contains viruses or malware
  • Contains characters that Google can’t / won’t crawl
  • Contains extra tracking parameters at the end of the URL

All of these things can alter the way that Google handles that link. It could choose not to follow the link or it could follow the link, but choose not to pass any PageRank across it.
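
One item in the list above, a URL blocked by a robots.txt file, is easy to check for yourself. The sketch below is only an illustration: the robots.txt content and the example.com URLs are made up, and it relies on Python's standard urllib.robotparser module rather than anything Google publishes, so treat the answer as an approximation of how a crawler might read the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site (not taken from the text).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(url, robots_txt, user_agent="Googlebot"):
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


open_page = is_crawlable("https://www.example.com/blog/post", ROBOTS_TXT)
blocked_page = is_crawlable("https://www.example.com/private/page", ROBOTS_TXT)
print(open_page, blocked_page)
```

If the page you are hoping to get a link to comes back as blocked, it is worth fixing that before you spend time on outreach.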

In extreme cases, such as linking to spammy pages or malware, Google may even choose to penalize the page containing the link to protect its users. Google does not want its users to visit pages that link to spam and malware, so it may decide to take those pages out of its index or make them hard to find.

How this affects your work

In general, you probably don’t need to worry too much on a daily basis about this stuff, but it is certainly something you need to be aware of. For example, you really need to make sure that any page you link to from your own site is good quality. This is common sense, really, but SEO professionals tend to take it a lot more seriously when they realize that they could receive a penalty if they don’t pay attention!

In terms of getting links, you can do a few things to make your links as clean as possible:

  • Avoid getting links to pages that redirect to others. Certainly avoid links to a page that has a 302 redirect, because Google does not tend to pass PageRank across these.
  • Avoid links to pages that have tracking parameters on the end. Sometimes Google will index two copies of the same page and the link equity will be split. If you absolutely can’t avoid this, then you can use a rel=canonical tag to tell Google which URL is the canonical version so that it passes the link equity across to that version.
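
As a rough illustration of the tracking-parameter point, here is a minimal sketch that strips common tracking parameters from a URL before you hand it to someone who is going to link to you. The list of parameter names is my assumption (the familiar utm_ set), and the URLs are made up, so adjust both to whatever your own analytics setup uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed set of tracking parameters; extend it to match your own analytics tools.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"}


def strip_tracking(url):
    """Return url with the assumed tracking parameters removed from the query string."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


clean = strip_tracking(
    "https://www.example.com/guide?utm_source=newsletter&utm_medium=email&page=2"
)
print(clean)
```

Parameters that are not in the tracking set (page=2 in this example) are left alone, because they may change what the page actually shows.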

Position of the link on a page

As a user, you are probably more likely to click on links in the middle of the page than in the footer. Google understands this and in 2004 it filed a patent, which was covered well by Bill Slawski. The patent outlined the “reasonable surfer” model, which included the following:

Systems and methods consistent with the principles of the invention may provide a reasonable surfer model that indicates that when a surfer accesses a document with a set of links, the surfer will follow some of the links with higher probability than others.

This reasonable surfer model reflects the fact that not all of the links associated with a document are equally likely to be followed. Examples of unlikely followed links may include “Terms of Service” links, banner advertisements, and links unrelated to the document.

Source: http://www.seobythesea.com/2010/05/googles-reasonable-surfer-how-the-value-of-a-link-may-differ-based-upon-link-and-document-features-and-user-data/

The following diagram, courtesy of Moz, helps explain this a bit more:

With crawling technology improving, the search engines are able to find the position of a link on a page as a user would see it and, therefore, treat it appropriately.

If you’re a blogger and you want to share a really good resource with your users, you are unlikely to put the link in the footer, where few readers will actually see it. Instead, you’re likely to place it front and center of your blog so that as many people as possible see it and click on it.

Now compare this to a link in your footer to your privacy policy page. It seems a little unfair to pass the same amount of link equity to both pages, right? You’d want to pass more to the genuinely good resource rather than a standard page that users may not actually read.
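
To make the idea concrete, here is a toy sketch of splitting a page's link equity by how likely each link is to be clicked, instead of sharing it out equally. The weights are numbers I have made up purely for illustration; they do not come from the patent or from Google, and the real model is certainly more involved than this.

```python
# Made-up click-likelihood weights per link position; purely illustrative,
# not values from Google's patent or any published source.
ASSUMED_WEIGHTS = {"body": 5.0, "sidebar": 2.0, "footer": 1.0}


def split_equity(links, total=1.0):
    """links maps URL -> position on the page; returns URL -> share of total."""
    weights = {url: ASSUMED_WEIGHTS[position] for url, position in links.items()}
    weight_sum = sum(weights.values())
    return {url: total * weight / weight_sum for url, weight in weights.items()}


# Hypothetical page with three outgoing links.
page_links = {
    "/great-resource": "body",
    "/related-posts": "sidebar",
    "/privacy-policy": "footer",
}
shares = split_equity(page_links)
for url, share in shares.items():
    print(url, round(share, 3))
```

With equal treatment each link would get a third of the total. Under these assumed weights, the link in the body gets the largest share and the footer link the smallest, which is the intuition behind the model.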

Anchor text

For SEO professionals, this is probably second in importance to the URL, particularly as Google puts so much weight on it as a ranking signal. That is true even today, when it arguably isn’t as strong a signal as it used to be.

Historically, SEOs have worked very hard to make anchor text of incoming links the same as the keywords which they want to rank for in the organic search results. So, if you wanted to rank for “car insurance” you’d try to get a link that has “car insurance” as the anchor text.

However, after the rollout of the Penguin update in April 2012, SEOs started taking a more cautious approach to anchor text. Many SEO pros reported that a high proportion of unnatural anchor text in a link profile led to a drop in organic traffic after the Penguin update was released.

The truth is that the average blogger, webmaster, or Internet user will NOT link to you using your exact match keywords. It’s even less likely that lots of them will! Google finally picked up on this trick and hit websites that over-optimized their anchor text targeting.

Ultimately, you want the anchor text in your link profile to be a varied mix of words: some of it keyword-focused, some of it focused on the brand, and some of it not focused on anything at all. This helps reduce the chance of you being put on Google’s radar for having unnatural links. The truth is that a lot of the time you can’t control this anyway, so it isn’t something you should worry too much about.
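
If you want a quick feel for how varied your anchor text is, a simple count goes a long way. The sketch below assumes you have already exported the anchor text of your incoming links from whichever backlink tool you use; the sample anchors are invented, and I am not suggesting any particular percentage is safe or unsafe.

```python
from collections import Counter


def anchor_text_shares(anchors):
    """Return each distinct anchor text (case-insensitive) with its share of all links."""
    counts = Counter(anchor.strip().lower() for anchor in anchors)
    total = sum(counts.values())
    return {text: count / total for text, count in counts.most_common()}


# Hypothetical anchor texts from an exported link profile.
anchors = [
    "Acme Insurance",
    "car insurance",
    "click here",
    "acme insurance",
    "www.acme-insurance.example",
    "car insurance",
    "Acme Insurance",
    "this guide",
]
profile = anchor_text_shares(anchors)
for text, share in profile.items():
    print(f"{share:.1%}  {text}")
```

A profile dominated by one exact-match keyword phrase is the pattern to look out for. A mix of brand, keyword, and generic anchors like the one above looks far more natural.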

Nofollowed / UGC / Sponsored link attributes vs. Followed Links

The nofollow attribute, in the context of link profile analysis, will be discussed a little later. For now, here are some of the basics you need to know.

The nofollow attribute was adopted in 2005 by Yahoo, Google, and MSN (now Bing). It was intended to tell the search engines when a webmaster didn’t trust the website they were linking to. It was also intended to be a way of declaring paid links (i.e., advertising).

In terms of the quality of a link, if the nofollow attribute is applied, it shouldn’t pass any PageRank. This effectively means that nofollow links are not counted by Google as part of their link graph and shouldn’t make any difference when it comes to organic search results.

Over the years, the nofollow attribute has developed and can be used in a variety of situations. In 2019, Google introduced two new attributes designed to allow for more granularity and flexibility for webmasters to mark their links in certain ways. They also confirmed that they may change how they decide to interpret all three attributes and implied that, in some cases, they may count links carrying these attributes as part of their link graph.

This confirmed something that some SEOs had long suspected, that Google may selectively decide when to pass link equity across a nofollow link or not. 
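
To see which of these attributes a link carries, you can look at the rel value in the HTML. The sketch below uses Python's standard html.parser module to list each link on a page along with any of the three values (nofollow, ugc, sponsored) it finds. The sample HTML and URLs are made up, and this only tells you how a link is marked up, not how Google will choose to treat it.

```python
from html.parser import HTMLParser

# The three rel values discussed in this section.
HINT_VALUES = {"nofollow", "ugc", "sponsored"}


class RelCollector(HTMLParser):
    """Collects (href, hints) for every <a href=...> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        rel_tokens = (attributes.get("rel") or "").lower().split()
        hints = sorted(set(rel_tokens) & HINT_VALUES)
        self.links.append((href, hints))


def classify_links(html):
    collector = RelCollector()
    collector.feed(html)
    return collector.links


# Hypothetical page fragment.
SAMPLE_HTML = """
<p><a href="https://example.com/a">editorial link</a>
<a href="https://example.com/b" rel="nofollow">untrusted link</a>
<a href="https://example.com/c" rel="ugc nofollow">comment link</a>
<a href="https://example.com/d" rel="sponsored noopener">advert</a></p>
"""
results = classify_links(SAMPLE_HTML)
for href, hints in results:
    print(href, hints or "followed")
```

A link with an empty list of hints is a plain followed link. Other rel values, such as noopener, are ignored here because they have nothing to do with link equity.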

In terms of your work, when building links, you should always try to get links that are followed, because these are the ones that should help you rank better. Having said that, having a few nofollow links in your profile is natural and sometimes can’t be helped. As mentioned above, they may actually help.

You should also think of the other benefit of a link: traffic. If a link is nofollow, but you get lots of targeted traffic through it, then it is worth building. There can also be a secondary benefit to nofollow links: if you get a nofollow link from a high-quality website which has lots of traffic, you may get other links from people who see it and also link to you from their own websites. This happens quite a lot with top-tier newspaper websites, where it is quite common to use the nofollow attribute. Getting a link from one of them could lead to other publications linking to you as well, and some of these links may be followed.

Link title

Check out this page for some examples and explanations of the link title attribute.

The intention here is to help provide more context about the link, particularly for accessibility, as it provides people with more information if they need it. If you hover over a link without clicking it, most modern browsers should display the link title, much in the same way they’d show the ALT text of an image. Note that the link title is not meant to be a duplication of anchor text; it is an aid to help describe the link.

In terms of SEO, the link title doesn’t appear to carry much weight for ranking. In fact, Google appeared to confirm that they do not use it at PubCon in 2005, according to this forum thread. My testing has confirmed this as well.

Text link vs. Image link

So far, this section has been discussing text-based links, by which we mean links whose anchor text contains standard letters or numbers. It is also possible to get links directly from images. The HTML for this looks slightly different:

Notice the addition of the img tag, whose src attribute points to the image itself. Also note how there is no anchor text as we’d usually find with a text link. Instead, the ALT text (in this example, the words “Example Alt Text”) is used.

My limited testing on this has shown that the ALT text acts in a similar way to anchor text but doesn’t appear to be as powerful.
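
To make the difference between the two link types concrete, here is a small sketch that reads a text link and an image link and pulls out what each one offers in place of anchor text. The HTML, URLs, and image path are hypothetical (only the ALT text wording follows the example above), and the parser is Python's standard html.parser, not anything a search engine uses.

```python
from html.parser import HTMLParser


class LinkTextExtractor(HTMLParser):
    """Records the href, visible anchor text, and image ALT text of each link."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # the link we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and "href" in attributes:
            self._current = {"href": attributes["href"], "text": "", "alt": ""}
        elif tag == "img" and self._current is not None:
            self._current["alt"] = attributes.get("alt") or ""

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.links.append(self._current)
            self._current = None


# Hypothetical markup: one text link and one image link.
SAMPLE_HTML = (
    '<a href="https://www.example.com/">Example anchor text</a>'
    '<a href="https://www.example.com/"><img src="/images/example.png" '
    'alt="Example Alt Text"></a>'
)
extractor = LinkTextExtractor()
extractor.feed(SAMPLE_HTML)
for link in extractor.links:
    print(link)
```

The text link has anchor text and no ALT text, while the image link is the other way round. If an image link has no ALT text at all, there is nothing descriptive for a search engine to read from the link itself.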

Link contained within JavaScript / AJAX / Flash

In the past, the search engines have struggled with crawling certain web technologies such as JavaScript, Flash and AJAX. They simply didn’t have the resources or technology to crawl through these relatively advanced pieces of code. These pieces of code were mainly designed for users with full browsers which were capable of rendering them.

For a single user who can interact with a webpage, it is pretty straightforward for them to execute things like JavaScript and Flash. However, a search engine crawler isn’t like a standard web browser and doesn’t interact with a page the way a user does.

This meant that if a link to your website was contained within a piece of JavaScript code, it was possible that the search engines would never see it – therefore your link would not be counted. Believe it or not, this actually used to be a good way of intentionally stopping search engines from crawling certain links.

However, the search engines and the technology they use have developed quite a bit and they are now, at least, more capable of understanding things like JavaScript. They can sometimes execute it and find what happens next, such as links and content being loaded.

In May 2014, Google published a blog post explicitly stating that they were trying harder to get better at understanding websites that used JavaScript. They also released a new feature, fetch and render, in Google Webmaster Tools (now called the Search Console) so that we could better see when Google has problems with JavaScript.

More recently, in 2017, Google updated its guidelines on the use of AJAX and was pretty explicit about its ability to correctly understand JavaScript. It’s clear that Google still has a long way to go, but it is much better than ever at understanding JavaScript.

But this doesn’t mean we don’t have to worry about things. You still must make links as clean as possible and make it easy for search engines to find your links. This means building them in simple HTML whenever possible.

How this affects your work

You should also know how to check if a search engine can find your link. This is pretty straightforward. Here are a few methods:

  • Disabling Flash, JavaScript and AJAX in your browser using a tool like the Web
    Developer Toolbar for Chrome and Firefox
  • Checking the cache of your page
  • Looking at the source code and seeing if the linked-to page is there and easy to
    understand
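
The last method in the list, looking at the source code, can be roughly automated. The sketch below checks whether a link to your page exists as a plain HTML anchor in the raw source, which is what a crawler sees if it does not execute JavaScript. The page content and URLs are made up, and you would need to fetch the real HTML yourself; this is a simplified check, not a reproduction of how any search engine crawls.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class HrefCollector(HTMLParser):
    """Collects the absolute URL of every plain <a href=...> in the raw HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.hrefs = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.add(urljoin(self.base_url, href))


def link_in_source(html, base_url, target_url):
    """Return True if target_url appears as a plain HTML link in the source."""
    collector = HrefCollector(base_url)
    collector.feed(html)
    return target_url in collector.hrefs


# Hypothetical linking page: one plain HTML link, one link added only by JavaScript.
PAGE_HTML = """
<p>Read <a href="https://www.example.com/landing">this guide</a>.</p>
<script>
  var link = document.createElement("a");
  link.href = "https://www.example.com/js-only";
  document.body.appendChild(link);
</script>
"""
PAGE_URL = "https://blog.example.org/post"
plain_found = link_in_source(PAGE_HTML, PAGE_URL, "https://www.example.com/landing")
js_found = link_in_source(PAGE_HTML, PAGE_URL, "https://www.example.com/js-only")
print(plain_found, js_found)
```

The link that only exists after the script runs is not found, which is exactly the situation you want to avoid for links you care about.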

Text surrounding the link

There was some hot debate around this topic toward the end of 2012, mainly fueled by this Whiteboard Friday on Moz, where Rand Fishkin predicted that anchor text, as a signal, would become weaker. In the video, Fishkin gave a number of examples of what appeared to be strange rankings for certain websites that didn’t have any exact match anchor text for the keyword being searched. In Fishkin’s opinion, a possible reason for this could be co-occurrence of related keywords which are used by Google as another signal.

It was followed by a post by Bill Slawski, which gave a number of alternative reasons why these apparently strange rankings may be happening. It was then followed by another great post by Joshua Giardino, which dove into the topic in a lot of detail. You should read both of these excellent posts.

Having said all of that, there is some belief that Google could certainly use the text surrounding a link to infer some relevance, particularly in cases when anchor text such as “click here” (which isn’t very descriptive) is used.

If you’re building links, you may not always have control of the anchor text, let alone the content surrounding it. But if you do, then think about how you can surround the link with related keywords and possibly tone down the use of exact keywords in the anchor text itself.