NEWSLETTERS  |  DECEMBER, 2010

Understanding Search Engines and Optimizing Content for Them


How Google Works

Ultimately, Google's purpose is to index and rank web content in order to help searchers find what they are looking for. While this is done, in part, by organizing pages on the basis of authority, the goal of Google's increasingly sophisticated algorithm is to understand the particular queries users submit—which are more likely to be specific than general, like "synthetic insulation shell" rather than "coat"—in order to direct them to the best source for the information they need. I like the way Alexis Madrigal put it in a recent Atlantic Monthly article. While he was writing primarily about online matchmaking, I think he gets right at the heart of what Google is all about without being too technical:

"If only you could Google your way to The One. The search engine, in its own profane way, is a kadosh generator. Its primary goal is to find the perfect Web page for you out of all the Web pages in the world, to elevate it to No. 1."

So how does Google know which pages are the most authoritative? Actually, Google outsources some of this work to us. Google's PageRank algorithm (named for cofounder Larry Page) took an entirely new approach in ranking pages purely on the basis of incoming links, rather than calculating the frequency of keywords within a page's content in order to discern which web pages were authoritative on any given subject. What this means is that the more important a website is (that is, the more incoming links it has), the more influential its outgoing links will be. So a link from the New York Times website, which has a PageRank of 9 out of 10, will have a greater influence over the PageRank of the site being linked to than one from a local news source, like wral.com, which has a PageRank of 7.


          PageRank ranks web pages based upon the number and influence of incoming links.
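
To make the idea of "influence flowing through links" concrete, here is a minimal sketch of the iterative calculation usually described for PageRank. It is an illustration only, not Google's production algorithm: the damping factor of 0.85, the iteration count, and the four page names are assumptions chosen for the example, and the scores it produces sum to 1 rather than mapping to the 0 to 10 figure quoted above.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank-style scores for a small link graph.

    links maps each page name to the list of pages it links to.
    damping and iterations are illustrative defaults, not Google's settings.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical four-page web. "big-news" collects the most incoming links,
# so the single page it links to benefits more than a page linked only
# from "small-site".
example_links = {
    "big-news": ["niche-blog"],
    "small-site": ["other-blog", "big-news"],
    "niche-blog": ["big-news"],
    "other-blog": ["big-news"],
}
example_scores = pagerank(example_links)

for page, score in sorted(example_scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph, "niche-blog" and "other-blog" each have exactly one incoming link, but the one pointing at "niche-blog" comes from the most linked-to page, so it ends up with the higher score.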

But PageRank is only one piece of the authority puzzle. Because it is primarily concerned with scoring a website based upon the volume of its incoming links, PageRank isn't as much an indicator of authority over a particular subject as it is authority in general, so let's call that "influence" instead. And this differentiation is really for the best. After all, even though the New York Times is a nationally trusted news source, you probably wouldn't expect them to be a better source for information on SEO than, say, this website, even though Newfangled.com's PageRank is 6. (Go ahead and search for "how to do SEO." There we are, the 5th result on the first page, but the New York Times is nowhere to be seen.) By balancing PageRank with its constantly changing index of the web's content, Google can provide search results that are representative of the most influential and authoritative sources even as those sources shift in either aspect. So, a site with a lower PageRank, or less overall influence on the web, could have a much greater authority over a particular subject. This insight is what Chris Anderson and Clay Shirky had in mind when they popularized the idea of the long tail.

It is also this differentiation that makes search engine optimization possible. Being in control of "on page" factors—those that frame a page's content using metadata, heading specifications, friendly links, etc.—enables you to compete in the marketplace of authority.

How to Do Your Own "On Page" Optimization

Assuming you use a content management system that enables you to control the on-page factors I mentioned above, optimizing your content for search engines is actually a fairly easy process. The difficulty isn't in the implementation so much as it is in choices you make. This should become more clear as I review the various items you'll need to consider as you optimize your web pages.

Title Tag
The title tag, which appears at the top of your browser, is different from the title a page might display at the beginning of its content. For instance, this page's title (also its H1, but more on that later) is "Understanding Search Engines and Optimizing Content for Them," which you can read right above the first paragraph. But the title tag for this page is slightly different; right now, it's "How Search Engine Optimization Works." Because the title tag is one of the primary pieces of information that Google analyzes when indexing web pages, it's important that it be an accurate description of what the page's content is actually about while also corresponding to phrases that searchers are likely to use, something our founder Eric Holter goes into much more detail about in a video on how to do SEO that is well worth your time.

With that in mind, look back at the differences between the page title and the title tag for this page. The page title is longer than I'd want the title tag to be (though not too long—anything under 70 characters will be technically suitable for Google), but it also works more from an editorial perspective than from what people are likely to use as a search query for information on SEO. Search queries don't need to be grammatically correct sentences; they can be one word or several that in combination identify the idea you're looking for. I think that's pretty intuitive when it comes to searching, but anticipating the search queries that people might use to find content like yours isn't so easy. You can use Google Trends to evaluate search terms you're thinking of using in your title tags, but it's also probably going to take some trial and error. That's why I was careful to note above what this page's title tag is right now. I might very well decide to tweak it after I have some data to show how well it's performing.
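
The two checks described above (keep the title tag short, and make sure it contains the words people are likely to search for) are easy to script. The following is a rough sketch, not a tool we use: the function names are hypothetical, the 70-character threshold is simply the rule of thumb mentioned above, and the sample query is an assumption.

```python
import re

MAX_TITLE_LENGTH = 70  # rule of thumb from the text above, not an official limit


def check_title_tag(title_tag, max_length=MAX_TITLE_LENGTH):
    """Return a list of problems found with a title tag (empty list if none)."""
    cleaned = " ".join(title_tag.split())
    problems = []
    if not cleaned:
        problems.append("title tag is empty")
    if len(cleaned) >= max_length:
        problems.append(
            f"title tag is {len(cleaned)} characters; aim for under {max_length}"
        )
    return problems


def missing_query_words(title_tag, query):
    """Return the words of a candidate search query that the title tag lacks."""
    title_words = set(re.findall(r"[a-z0-9]+", title_tag.lower()))
    return [
        word
        for word in re.findall(r"[a-z0-9]+", query.lower())
        if word not in title_words
    ]


current_title_tag = "How Search Engine Optimization Works"
print(check_title_tag(current_title_tag))
print(missing_query_words(current_title_tag, "how seo works"))
```

A word-for-word comparison like this is deliberately crude. It would flag that the abbreviation "seo" is absent from the current title tag, which is exactly the kind of observation that might prompt a tweak once there is some traffic data to go on.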

Meta Description
Unlike the meta title, a page's meta description is not visible to users unless Google displays it in its search results. Let me explain: The meta description is another way to identify the subject of a page's content. Its content will be indexed and, if it is the most relevant match for the query used, it will populate the text of the snippet displayed when that page appears in a list of search results. If the description is duplicate content, empty, or otherwise deemed irrelevant, Google will extract content from the page itself to populate the search result snippet. But remember, Google controls whether the description appears. If it doesn't appear, there may be nothing you can do (nothing that makes sense for your page's content, anyway) to change that.

Since there isn't a character limit to meta descriptions, you can craft something more grammatically correct than your meta title, but you still want to make sure that it contains keywords relevant to your page's subject and is as succinct as possible.
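
Since empty and duplicate descriptions are the two problems named above, a small audit script can catch both before Google does. This is a sketch using only Python's standard library; the class and function names, the sample URLs, and the sample markup are all made up for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Collects the content of <meta name="description"> if present."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "description":
            self.description = (attributes.get("content") or "").strip()


def extract_description(html):
    parser = MetaDescriptionParser()
    parser.feed(html)
    parser.close()
    return parser.description


def audit_descriptions(pages):
    """pages maps a URL to its HTML. Returns {url: problem} for flagged pages."""
    problems = {}
    seen = defaultdict(list)
    for url, html in pages.items():
        description = extract_description(html)
        if not description:
            problems[url] = "missing or empty meta description"
        else:
            seen[description.lower()].append(url)
    for urls in seen.values():
        if len(urls) > 1:
            for url in urls:
                problems[url] = "duplicate meta description"
    return problems


# Hypothetical pages for illustration.
sample_pages = {
    "/a": '<head><meta name="description" content="How search engines rank pages."></head>',
    "/b": '<head><meta name="description" content="How search engines rank pages."></head>',
    "/c": "<head><title>No description here</title></head>",
    "/d": '<head><meta name="description" content="A guide to on-page optimization."></head>',
}
print(audit_descriptions(sample_pages))
```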

Heading Tags
The heading tags, H1 through H6, allow you to organize a page's content much as you might an outline. The H1, or largest heading, would be the title of the outline, which also means it can only appear once. Earlier I noted that this page's title, "Understanding Search Engines and Optimizing Content for Them," is also its H1. This is because we've built our Content Management System to automatically display the title a user creates for a page as its H1. That ensures that there's no confusion around what the largest heading should be and, more importantly, that there is not more than one. As for the rest of the headings, there can be multiple of each. In fact, this page has several H2s: each of the bold, blue headings above the paragraphs I've written is wrapped in H2 tags.
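
If your CMS does not enforce the single-H1 rule for you, the outline can be checked with a few lines of code. The sketch below uses Python's standard HTML parser; the class and function names are hypothetical, and the sample markup is an assumed approximation of this page rather than its actual source.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutlineParser(HTMLParser):
    """Records each heading on a page as a (level, text) pair, in order."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._current_level = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._current_level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._current_level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._current_level == int(tag[1]):
            self.outline.append((self._current_level, "".join(self._text).strip()))
            self._current_level = None


def heading_outline(html):
    parser = HeadingOutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


def count_h1(outline):
    return sum(1 for level, _ in outline if level == 1)


# Assumed markup, loosely modeled on this page.
sample_html = (
    "<h1>Understanding Search Engines and Optimizing Content for Them</h1>"
    "<h2>How Google Works</h2><p>Some text.</p>"
    "<h2>Title Tag</h2><p>More text.</p>"
)
outline = heading_outline(sample_html)
for level, text in outline:
    print("  " * (level - 1) + text)
print("H1 count:", count_h1(outline))
```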

Link Text and Friendly URLs
Remember how I noted that Google's PageRank algorithm was primarily concerned with the influence of a page? Well, one way that Google evaluates this is to look at the text used when linking to a page. The more descriptive it is of that page's content, the better the search engine can understand the value of its incoming links. So, if I were to link to our homepage by writing, "click here to see our homepage," I'm telling Google nothing about where I'm directing users. But if I were to link to it by writing, "Newfangled is a web development company," I'm providing Google—and readers—with a clearer idea of the nature of the content I'm linking to.
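
Vague link text is simple to spot automatically. Here is a small sketch that lists links whose text says nothing about their destination; the set of "generic" phrases is an assumption for illustration, not a list published by Google, and the names are hypothetical.

```python
from html.parser import HTMLParser

# Illustrative phrases only; extend to suit your own content.
GENERIC_LINK_TEXT = {"click here", "here", "read more", "more", "this page", "link"}


class LinkTextParser(HTMLParser):
    """Collects (href, link text) pairs for every <a href="..."> on a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def vague_links(html):
    """Return links whose text is empty or one of the generic phrases."""
    parser = LinkTextParser()
    parser.feed(html)
    parser.close()
    return [
        (href, text)
        for href, text in parser.links
        if not text or text.lower() in GENERIC_LINK_TEXT
    ]


sample_html = (
    '<p><a href="http://www.newfangled.com/">Click here</a> to see our homepage. '
    '<a href="http://www.newfangled.com/">Newfangled is a web development company</a>.</p>'
)
print(vague_links(sample_html))
```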

This same principle applies to the file names of web pages, which are often called "Friendly URLs." A URL that is more indicative of the database technology being used—something like, "http://www.newfangled.com/contentmgr/showdetails.php/id/182"—doesn't do much to help Google interpret what it's about, not to mention users who need something easier to remember. If you're using an up-to-date Content Management System, it should include a rewrite engine that enables you to provide a Friendly URL for each of your pages.
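
To show what a rewrite engine typically starts from, here is a minimal sketch of turning a page title into a Friendly URL segment. It is not how our CMS does it; the function name and the eight-word cap are assumptions made for the example.

```python
import re
import unicodedata


def friendly_slug(title, max_words=8):
    """Turn a page title into a lowercase, hyphen-separated URL segment."""
    # Drop accents and other non-ASCII marks so the URL stays easy to type.
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Remove apostrophes so "Google's" becomes "googles" rather than "google-s".
    ascii_title = ascii_title.replace("'", "")
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    return "-".join(words[:max_words])


print(friendly_slug("How to Do SEO!"))
print(friendly_slug("Understanding Search Engines and Optimizing Content for Them"))
```

Capping the number of words keeps the URL memorable, which is the point of making it "friendly" in the first place.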

There really is no magic to search engine optimization. In fact, control over and thoughtful implementation of these four on-page factors (the meta title, meta description, heading tags, and friendly links) is all you need to properly optimize your web content. And just so you're assured I'm not oversimplifying it, they are all we use to optimize the content of our website. Of course, search engine optimization is not a one-time procedure. It's an ongoing process. The more often you add indexable, properly optimized content to your website, the more likely you are to see significant gains in valuable traffic to your site.


          There's plenty of SEO hype out there, but most of it isn't true.

So, What About Abuses of SEO?

Now that we understand how search engines work and how to optimize our content for them, we can return to those abuses I mentioned at the beginning of this article. Are there holes in this system that allow some people to take advantage of it? Sure. But none of them challenge the validity of the core principles we've learned about so far. Most are actually a matter of business ethics in general. Here's just one example:

In an investigative piece focusing on an online eyeglasses retailer, DecorMyEyes, New York Times author David Segal tells a story of how customer complaints can actually end up benefitting an online retailer's business because of the way search engines assess the value of incoming links. In this particular case, the volume of complaints against DecorMyEyes was shocking enough, but the way the owner participated in complaint threads—aggressively, threateningly, and always encouraging the controversy—was even more so. Why would a business owner encourage and celebrate public customer complaints? Segal's article sheds some light on the mystery, concluding that DecorMyEyes is profiting by exploiting a vulnerability of Google's system: not qualitatively evaluating content. While Google representatives have not officially confirmed whether its algorithms include "sentiment analysis"—which would discern between customer complaints and commendations—the DecorMyEyes story seems to confirm that it does not. Without sentiment analysis, every new complaint customers publish online that links back to the DecorMyEyes website is web content that increases their PageRank. Even though the complaints are meant, justifiably, to damage the online influence of a shady business, they're actually doing the opposite.

But it's also not clear that sentiment analysis is the best solution to the problem. While sentiment analysis might seem helpful from a consumer's point of view, imagine how it might affect other kinds of searches. After all, public opinion isn't always rational or correct. In an interview with Segal, Danny Sullivan, editor-in-chief of the blog Search Engine Land, points out that you might have a hard time finding legitimate information about a politician if Google evaluated web pages on the basis of public sentiment. However, he also suggests that Google could increase the presence of consumer reviews associated with particular sites when they appear in search results. Not a bad idea, especially because it seems that critical mass and the determination of a few of DecorMyEyes' extra-dissatisfied former customers are only now beginning to derail its long run of abuse. In the meantime, Google's intention to keep qualitative evaluation to a minimum underscores that the root of this particular problem is a flaw in people, not the system itself. No matter what system we have to work with, there will always be ways to game it.

It's All About People-Friendly Content

A little over a year ago, I wrote, "Robots don't read, people do," to remind our readers that, while it's easy to become accustomed to creating search-engine-friendly content, the real point is to create content that is people-friendly and then optimize it for search engines. But bearing in mind how important search engines are to our ability to navigate the web, perhaps a minor revision to that statement might be helpful: Robots don't read, but they help people who do.

Comments
jackWeb | January 3, 2011 10:27 PM

thanks for a very helpful treatment. i have a long list now of to-dos for my website. one question: how important is it to have friendly URLs? my site doesn't have the rewrite engine you wrote about, so i'm not sure how to do this part. also, how do you find out what your page rank is?
Jenn | January 4, 2011 11:32 AM

This article on Google's blog, which is actually 2 years old, states that the search engine does not recommend rewriting URLs:

http://googlewebmastercentral.blogspot.com/2008/09/dynamic-urls-vs-static-urls.html

Would you disagree with their assessment?
Chris Butler | January 4, 2011 12:06 PM

jackWeb: In my opinion, I don't think it's essential. A website with well optimized title tags, headings, meta descriptions, and good, recurring content that does not use "friendly" URLs could do far better with organic search traffic than a website that has them but falls short in other areas. That's why I mentioned it last, even after the idea of optimized link text.

To check the PageRank of any website, you can use www.prchecker.info.

Jenn: Good question. In short, no, I don't disagree with their assessment, but that is because I don't think their assessment is that all URL rewriting is bad. Let me explain.

At the beginning of the article, the author, John Mueller, writes:
"While static URLs might have a slight advantage in terms of clickthrough rates because users can easily read the urls, the decision to use database-driven websites does not imply a significant disadvantage in terms of indexing and ranking. Providing search engines with dynamic URLs should be favored over hiding parameters to make them look static."
What he's trying to say is that while webmasters are motivated to create more "friendly" URLs for the purpose of SEO (those that contain relevant keywords), it can be done wrong and create more problems than benefits, especially if parameters indicative of important processing functions are hidden in the process. If you read through the long string of comments, you probably noticed that the article created quite a bit of controversy, mostly because it seems that it was widely misread. But within that string, I think some clarity emerged. An early comment from Ryan Williams made some sense of it:
"This article is just meant to highlight some potential problems that can arise from developers turning dynamic URLs into static ones incorrectly, rather than be some general guideline for any and all purposes. Or to put it another way, it's saying that Google bot can potentially screw up if a developer isn't savvy with how he generates his rewritten URLs, whereas if he just leaves them as their default dynamic selves Google bot will generally get it right every time."
Then, a bit later, this exchange between the author and a user going by "lordscarlet" helps to clarify his perspective:
lordscarlet: "Your replies are very different from the blog post. You are saying that responsible URL Rewriting is good. The post, however, tells you to absolutely avoid using URL rewriting."

John Mueller: "The devil is in the details :). If you look at the web on a whole, you would have to say that most sites do a bad job at rewriting URLs and that it would be easier for search engines in general to see the real URLs. However, there are exceptions and situations where a properly rewritten URL, that does not contain any irrelevant elements, can be an advantage.

If a webmaster is unsure whether or not a chosen URL scheme is perfect (and implemented perfectly), then I would recommend leaving the dynamic URLs instead of providing something which can make it hard for us to crawl a site properly and completely.

We did not significantly change our crawling and indexing system for this blog post. We have only noticed that the myth that "any rewritten URL is better than a dynamic one" is very wide-spread and is just plain wrong."
So, it seems that Google's point of view on the matter is primarily to exercise caution. I completely agree with that.

You'll notice that I placed this last in the list of on-site factors. I did this intentionally because I believe it's the least important among the others (as I mentioned to the commenter above), or, in other words, has the least impact on a website's search engine optimization compared with proper optimization of the others.

That said, we do have a re-write engine built into our CMS that handles the action properly and causes no issues with Google. The way it works is that when a "friendly" URL is specified by the content author, the CMS creates a 301 Redirect (you can read more about URL Redirection in a helpful Wikipedia entry), which tells Google to index the static URL rather than the dynamic one. A 301 Redirect is the proper method for informing Google of permanent URL changes, so no variables are hidden in this case.
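
For readers curious what a 301 Redirect looks like in practice, here is a minimal sketch written as a Python WSGI application. It is not our CMS's code: the friendly path in the mapping is hypothetical, and a real rewrite engine would look the mapping up in its database rather than in a hard-coded dictionary.

```python
# Maps a dynamic path to the friendly path chosen by the content author.
# The friendly path below is a made-up example.
FRIENDLY_URLS = {
    "/contentmgr/showdetails.php/id/182": "/newsletters/how-seo-works",
}


def redirect_app(environ, start_response):
    """Answer requests for a dynamic URL with a permanent redirect."""
    path = environ.get("PATH_INFO", "")
    target = FRIENDLY_URLS.get(path)
    if target is not None:
        # 301 tells browsers and crawlers that the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"page content would be rendered here\n"]


if __name__ == "__main__":
    # Serve locally for experimentation; stop with Ctrl+C.
    from wsgiref.simple_server import make_server

    make_server("127.0.0.1", 8000, redirect_app).serve_forever()
```

Because the dynamic address answers with a redirect instead of a second copy of the page, there is only one URL for a crawler to index.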
Desmond Williams | January 4, 2011 1:15 PM

Great article. SEO optimization can seem like a daunting task, but your list of simple steps really puts Google indexing in perspective. Informative as always.
Jonathan Hinshaw | January 4, 2011 5:13 PM

Great Article and Fantastic information. SEO is such a tough subject as there is probably more misinformation out there than any other website related topic. Thanks for keeping it simple and making the article usable in the real world - keep up the good work!
pj | January 4, 2011 6:20 PM

i saw this on twitter and was surprised after reading that there's no mention of social media. i wonder what the relevance of seo is today in the age of facebook and twitter?
Alex | January 4, 2011 9:08 PM

A nicely written article as usual, Chris, but I'm with @pj. At this point, I see very little SEO imperative given what social media has offered the average person. What's the point of searching with Google when most of the information I want or need I can either find or is brought to me other ways?

The eyeglass scoundrel you mention from the NYTimes piece is probably not such an anomaly. I'd find it hard to believe that those who know enough about how SEO works *aren't* tempted to be dangerous, if you know what I mean. Whereas social media seems to flow on the goodwill of personal relationships, this whole SEO bit seems like a casino to me.
Chris Butler | January 5, 2011 7:28 AM

Desmond: You're right, it really can seem daunting at times, especially if you have a large amount of content that needs to be optimized. But once it becomes part of the content creation process, doing SEO well will be far less of a chore.

Jonathan: I'm glad the article met you where you are. This is a lesson I'm still learning: how to take topics that we talk about often (like this one) and continue to produce resources that are engaging and actionable to our readers. Our approach to SEO hasn't changed radically in years, but it's still a very meaningful issue that I want to make sure we continue to discuss regularly.

pj and Alex: I intentionally left social media out of this piece, mostly because I wanted to focus on how search engines work and basic on-site content optimization. However, I agree with both of you that social media is quite relevant to SEO as it contributes heavily to off-site link and awareness building.

The other issue, which was implicit in pj's comment and the main thrust of Alex's, is whether SEO is relevant today given the ubiquity of social media. In short, I would emphatically say yes; as long as people continue to search for information, I believe search engine optimization will remain very relevant to what we do. But your points are well taken that social media have greatly impacted what SEO means. Certainly for the average person, there's no shortage of content available, and given the socialization of search, the content you're likely to find interesting is often just as likely to find its way to you via your social ties. With a system like that in place, there's little need for proactive pursuit of content. On the other hand, the availability of an individual's content to family and friends is going to be much more important to them than whether it's findable by a stranger. But for business, organic search traffic is critical. One thing we often remind our clients is that they need to frame their on-page factors for the customers that are looking for them but just don't know their name yet, because those people (certainly thousands of them) are out there.

Alex, your other comment about social media carrying more inherent goodwill versus the feeling that only the house wins with SEO is understandable given the DecorMyEyes story. I like this quote, attributed to Saint Augustine: "Never judge a philosophy by its abuse." (I'm publishing a blog post along these lines this morning, btw.) Sure, DecorMyEyes has abused the system, but that begs the question, is the system unethical, or is the abuser? Given the possibilities for disappointment and further frustration that more qualitative approaches could create, I can understand why Google has avoided sentiment analysis to this point. But perhaps that will have to change due to the overall sentiment web users have about search itself. We'll have to wait and see.

Thanks, everyone, for reading and commenting!
Mark O'Brien | January 5, 2011 1:40 PM

I'd like to comment on the point pj and Alex brought up as well. When questions like "what is the relevance of seo in the age of facebook and twitter" come up, I usually head right to Google Analytics to see what the data tells us.

Over the past 30 days Newfangled.com received 10,000 unique visitors from search engines, and well under 500 from all social media outlets combined.

We put roughly the same amount of work into optimizing a newsletter for search engines as we do distributing it through our preferred social media channels, and SEO seems to give us a 200% better return in terms of traffic.

Now, our goal conversion rate for Google is only 1.17% over this period, while Twitter's conversion rate is an impressive 4.52%. An argument about quality of traffic can be made, but the overall numbers are still in favor of the search engines.

That being said, I think SEO and social media are both very important, and neither should be dismissed when trying to increase traffic to a marketing site through a thought leadership-based content strategy.
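
The figures quoted in this comment can be combined into a rough estimate of conversions per channel. This is back-of-the-envelope arithmetic under stated assumptions, not a report from Google Analytics: it applies Google's rate to all search visitors, applies Twitter's rate to all social visitors, and treats 500 as an upper bound because social traffic was described as "well under 500".

```python
def estimated_conversions(visitors, conversion_rate_percent):
    """Visitors multiplied by a conversion rate given as a percentage."""
    return visitors * conversion_rate_percent / 100


search_conversions = estimated_conversions(10_000, 1.17)
social_conversions_upper_bound = estimated_conversions(500, 4.52)

print(f"search: about {search_conversions:.0f} conversions")
print(f"social: at most about {social_conversions_upper_bound:.0f} conversions")
print(f"ratio: roughly {search_conversions / social_conversions_upper_bound:.1f} to 1")
```

Under these assumptions, search yields roughly 117 conversions against fewer than 23 from social media, which is why the higher-quality Twitter traffic still does not outweigh the volume from search.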
Chris Butler | January 5, 2011 1:46 PM

Mark's point is right on: the traffic is clearly indicative of the number of people who are searching for information using engines like Google. Incidentally, I put together a chart of this exact comparison back in my newsletter from October of 2009 to show readers how much conversions matter. I've added the chart itself below—you can see that organic search traffic was, by far, our greatest source at that point in 2009, and as Mark pointed out, that hasn't changed:


JT | January 5, 2011 7:47 PM

I may not have said it the way that @pj or @alex did, but questioning the relevance of SEO in light of social media isn't as naïve as you might think. @Chris, I think you're hinting at this when you point out the differences in perspective that an individual might have from a business. This is really where clear lines remain within the "socialtopia" right now: individuals and corporate entities. Both have flocked to social media with very different agendas. I remember back in that social media guide you posted that there was some reader controversy over how businesses should use social media and that's clearly still a big issue. Look on any college campus and the divide is even clearer. Students get it so much that they don't think about it. Administrators don't get it so much that they're thinking about it all the time.

It seems that the natural mediation point here is that @pj and @alex are right to question the need for SEO. Perhaps it does not now or never will matter to them. But @Chris and @Mark are right to dig in with SEO from a corporate marketing perspective (esp as social media for that purpose hasn't found its legs yet). The data is nice, had the question been asked from a business perspective, but I suspect it was not.

Good conversation.
jennifer | January 6, 2011 7:41 PM

This is so helpful, thanks for the straightforward explanation of how this all works! I think I'm reading the conversation right, and I agree with JT that getting Google isn't exactly on the top of most people's priority list, but it will jump up there right away as soon as they want to get noticed on the internet, like having an Etsy shop or something. My parents have one so this information is something that they are asking about now. Now that I've found your article, I'm definitely going to share it with them.
Chris Butler | January 26, 2011 10:47 AM

JT: Very much agreed. Our approach to SEO makes sense given a legitimate content-driven online strategy, which means not content for the sake of robot food, but content that people will actually benefit from reading/viewing/etc. The balance with SEO should always prioritize people, which doesn't mean that SEO is entirely on the robot side either--keep in mind that people are the ones using search engines and expecting them to be effective research tools.

All that said, there are plenty of companies/brands that just should not expect a content-driven online strategy to be effective. Plenty of people have chimed in on this over the years (I first saw it in blog form from Chris Brogan), but the general idea is that my complete lack of interest in reading a Pop Tarts blog, watching Pop Tarts videos, friending or liking Pop Tarts on Facebook or following Pop Tarts on Twitter says nothing about the likelihood that I will buy and enjoy a Pop Tart from time to time. Tim Malbon of Made by Many just posted on this and calls out the absurdity of most consumer brand engagement like this for what it really is. Check it out.

Jennifer: Thanks for reading! I'm glad it was helpful.