Arthur C. Clarke once wrote that “any sufficiently advanced technology is indistinguishable from magic” an insight that sheds a great deal of light on why our historical predecessors, without access to much of the knowledge we take for granted today, believed some of what they did. But it also applies to contemporary technologies, some of which we depend upon greatly yet understand only in part (or perhaps not at all).
The evolution of the meaning and use of the word “Google”—from proper noun to verb—corresponds with the increasing disconnect between web users and search technology. Ten years ago, searching for content on the web was a difficult process, but today one has only to enter a few words into Google’s search bar, and Presto! (magical incantation intended) instant and accurate results. As much as this might seem like magic, it’s a thoroughly mundane—albeit ingenious—technology at work. But if search engine technology is indistinguishable from magic, the process of optimizing web content for search engines will seem just as mysterious. Unfortunately, it’s difficult to trust what we don’t understand, and mistrust breeds the very kind of problems that are rampant in the search engine optimization industry: myths, abuses, and profit for those that would rather be seen as magicians than marketers.
Fortunately, we know enough about how search engines work to optimize our content with words, not wands. While there is some value in examining the myths and abuses of SEO, I think it makes sense to first explore how it works. I’ll start with a brief explanation of how search engines (I’ll focus on Google) work, then explain how web content can be optimized for them. Knowing how search engine optimization, in it’s most basic form, really works will shed some light on the misunderstandings that often get in the way of doing it well…
How Google Works
Ultimately, Google’s purpose is to index and rank web content in order to help searchers find what they are looking for. While this is done, in part, by organizing pages on the basis of authority, the goal of Google’s increasingly sophisticated algorithm is to understand the particular queries users submit—which are more likely to be specific than general, like “synthetic insulation shell” rather than “coat”—in order to direct them to the best source for the information they need. I like the way Alexis Madrigal put it in a recent Atlantic Monthly article. While he was writing primarily about online matchmaking, I think he gets right at the heart of what Google is all about without being too technical:
If only you could Google your way to The One. The search engine, in its own profane way, is a kadosh generator. Its primary goal is to find the perfect Web page for you out of all the Web pages in the world, to elevate it to No. 1.”
So how does Google know which pages are the most authoritative? Actually, Google outsources some of this work to us. Google’s PageRank algorithm (named for cofounder Larry Page) took an entirely new approach in ranking pages purely on the basis of incoming links, rather than calculating the frequency of keywords within a page’s content in order to discern which web pages were authoritative on any given subject. What this means is that the more important a website is—the more incoming links it has—the more influential its outgoing links will be. So a link from the New York Times website, which has a PageRank of 9/10, will have a greater influence over the PageRank of the site being linked to than one from a local news source, like wral.com, which has a PageRank of 7.
PageRank ranks web pages based upon the number and influence of incoming links.
But PageRank is only one piece of the authority puzzle. Because it is primarily concerned with scoring a website based upon the volume of its incoming links, PageRank isn’t as much an indicator of authority over a particular subject as it is authority in general, so let’s call that “influence” instead. And this differentiation is really for the best. After all, even though the New York Times is a nationally trusted news source, you probably wouldn’t expect them to be a better source for information on SEO than, say, this website, even though Newfangled.com’s PageRank is 6. (Go ahead and search for “how to do SEO.” There we are, the 5th result on the first page, but the New York Times is nowhere to be seen.) By balancing PageRank with its constantly changing index of the web’s content, Google can provide search results that are representative of the most influential and authoritative sources even as those sources shift in either aspect. So, a site with a lower PageRank, or less overall influence on the web, could have a much greater authority over a particular subject. This insight is what Chris Anderson and Clay Shirky had in mind when they popularized the idea of the long tail.
It is also this differentiation that makes search engine optimization possible. Being in control of “on page” factors—those that frame a page’s content using metadata, heading specifications, friendly links, etc.—enables you to compete in the marketplace of authority.
How to Do Your Own “On Page” Optimization
Assuming you use a content management system that enables you to control the on-page factors I mentioned above, optimizing your content for search engines is actually a fairly easy process. The difficulty isn’t in the implementation so much as it is in choices you make. This should become more clear as I review the various items you’ll need to consider as you optimize your web pages.
The title tag, which appears at the top of your browser, is different from the title a page might display at the beginning of its content. For instance, this page’s title (also it’s H1, but more on that later) is “Understanding Search Engines and Optimizing Content for Them,” which you can read right above the first paragraph. But the title tag for this page is slightly different; right now, it’s, “How Search Engine Optimization Works.” Because the title tag is one of the primary pieces of information that Google analyzes when indexing web pages, it’s important that it be an accurate description of what the page’s content is actually about while also corresponding to phrases that searchers are likely to use—something our founder Eric Holter goes into much more detail about in a video on how to do SEO that is well worth your time.
With that in mind, look back at the differences between the page title and the title tag for this page. The page title is longer than I’d want the title tag to be (though not too long—anything under 70 characters will be technically suitable for Google), but it also works more from an editorial perspective than from what people are likely to use as a search query for information on SEO. Search queries don’t need to be grammatically correct sentences; they can be one word or several that in combination identify the idea you’re looking for. I think that’s pretty intuitive when it comes to searching, but anticipating the search queries that people might use to find content like yours isn’t so easy. You can use Google Trends to evaluate search terms you’re thinking of using in your title tags, but it’s also probably going to take some trial and error. That’s why I was careful to note above what this page’s title tag is right now. I might very well decide to tweak it after I have some data to show how well it’s performing.
Unlike the meta title, a page’s meta description is not visible to users, that is, unless Google displays it in its search results. Let me explain: The meta description is another way to identify the subject of a page’s content. However, the content of the meta description will be indexed and used to populate the text of the snippet displayed when that page appears in a list of search results if it is the most relevant match for the query used. If the description is duplicate content, empty, or otherwise deemed irrelevant, Google will extract content from the page itself to populate the search result snippet. But remember, Google controls whether the description appears. If it doesn’t appear, there may be nothing you can do—that makes sense for your page’s content, anyway—to change that.
Since there isn’t a character limit to meta descriptions, you can craft something more grammatically correct than your meta title, but you still want to make sure that it contains keywords relevant to your page’s subject and is as succinct as possible.
The heading tags—H1 through H6—allow you to organize a page’s content in a similar way as you might an outline. The H1, or largest heading, would be the title of the outline, which also means it can only appear once. Earlier I noted that this page’s title, “Understanding Search Engines and Optimizing Content for Them,” is also its H1. This is because we’ve built our Content Management System to automatically display the title a user creates for a page as its H1. That ensures that there’s no confusion around what the largest heading should be and, more importantly, that there is not more than one. As for the rest of the headings, there can be multiple of each. In fact, this page has several H2’s—each of the bold, blue headings above the paragraphs I’ve written are wrapped in H2 tags.
Link Text and Friendly URLs
Remember how I noted that Google’s PageRank algorithm was primarily concerned with the influence of a page? Well, one way that Google evaluates this is to look at the text used when linking to a page. The more descriptive it is of that page’s content, the better the search engine can understand the value of its incoming links. So, if I were to link to our homepage by writing, “click here to see our homepage,” I’m telling Google nothing about where I’m directing users. But if I were to link to it by writing, “Newfangled is a web development company,” I’m providing Google—and readers—with a clearer idea of the nature of the content I’m linking to.
This same principle applies to the file names of web pages, which are often called “Friendly URLs.” A URL that is more indicative of the database technology being used—something like, “https://www.newfangled.com/contentmgr/showdetails.php/id/182″—doesn’t do much to help Google interpret what it’s about, not to mention users who need something easier to remember. If you’re using an up-to-date Content Management System, it should include a rewrite engine that enables you to provide a Friendly URL for each of your pages.
There really is no magic to search engine optimization. In fact, control over and thoughtful implementation of these four on-page factors—the meta title, meta description, heading tags, and friendly links—is all you need to properly optimize your web content. And just so you’re assured I’m not over simplifying it, they are all we use to optimize the content of our website. Of course, search engine optimization is not a one-time procedure. It’s an ongoing process. The more often you add indexable, properly-optimized content to your website, the more likely you are to see significant gains in valuable traffic to your site.
There’s plenty of SEO hype out there, but most of it isn’t true.
So, What About Abuses of SEO?
Now that we understand how search engines work and how to optimize our content for them, we can return to those abuses I mentioned at the beginning of this article. Are there holes in this system that allow some people to take advantage of it? Sure. But none of them challenge the validity of the core principles we’ve learned about so far. Most are actually a matter of business ethics in general. Here’s just one example:
In an investigative piece focusing on an online eyeglasses retailer, DecorMyEyes, New York Times author David Segal tells a story of how customer complaints can actually end up benefitting an online retailer’s business because of the way search engines assess the value of incoming links. In this particular case, the volume of complaints against DecorMyEyes was shocking enough, but the way the owner participated in complaint threads—aggressively, threateningly, and always encouraging the controversy—was even more so. Why would a business owner encourage and celebrate public customer complaints? Segal’s article sheds some light on the mystery, concluding that DecorMyEyes is profiting by exploiting a vulnerability of Google’s system: not qualitatively evaluating content. While Google representatives have not officially confirmed whether its algorithms include “sentiment analysis”—which would discern between customer complaints and commendations—the DecorMyEyes story seems to confirm that it does not. Without sentiment analysis, every new complaint customers publish online that links back to the DecorMyEyes website is web content that increases their PageRank. Even though the complaints are meant, justifiably, to damage the online influence of a shady business, they’re actually doing the opposite.
But it’s also not clear that sentiment analysis is the best solution to the problem. While sentiment analysis might seem helpful from a consumer’s point of view, imagine how it might affect other kinds of searches. After all, public opinion isn’t always rational or correct. In an interview with Segal, Danny Sullivan, editor-in-chief of the blog Search Engine Land, points out that you might have a hard time finding legitimate information about a politician if Google evaluated web pages on the basis of public sentiment. However, he also suggests that Google could increase the presence of consumer reviews associated with particular sites when they appear in search results. Not a bad idea, especially because it seems that only critical mass and the determination of a few of DecorMyEyes’ extra-dissatisfied former customers is just now beginning to derail its long run of abuse. In the meantime, Google’s intention to keep qualitative evaluation to a minimum underscores that the root of this particular problem is a flaw in people, not the system itself. No matter what system we have to work with, there will always be ways to game it.
It’s All About People-Friendly Content
A little over a year ago, I wrote, “Robots don’t read, people do,” to remind our readers that, while it’s easy to become accustomed to creating search-engine-friendly content, the real point is to create content that is people-friendly and then optimize it for search engines. But bearing in mind how important search engines are to our ability to navigate the web, perhaps a minor revision to that statement might be helpful: Robots don’t read, but they help people who do.