A recent Wall Street Journal investigation showed that the top 50 websites in the United States install an average of 64 individual trackers on visitors' computers with little to no disclosure. Some of those sites even exceeded 100 trackers. While the word "installation" may suggest a visible, mechanical process, or conjure images of slow progress bars, the actual process of placing trackers onto visitors' computers is fast and invisible. The fact that it goes unnoticed by the majority of web users is what makes it so effective. But the more insight people have into what exactly is happening when they access sites like Dictionary.com, for example, the more likely they are to reconsider using them at all. This is certainly reinforced by the poll the Wall Street Journal has running in the sidebar of its privacy investigation report. As of this writing, 58% of readers who answered it reported that they are "very alarmed" about advertisers and other companies tracking their web use. I would have been interested in seeing a similar poll asking about their level of concern before reading the article. My suspicion is that the details of the article, in particular how tracking works and the different types of tracking that web users are likely to have already been subject to, directly increase readers' concern.
So in that spirit, let's get up to speed with how this all works. There's quite a bit of detail to cover, so get comfortable. I promise that you won't regret taking the time to read through it all...
How Tracking Works
There are two basic types of trackers. "First-party" trackers transfer benign text files to users' computers so that websites can remember user information, such as items placed in a shopping cart or data for form fields that have already been filled out. "Third-party" trackers also transfer files to users' computers, but they do so in order to gather data about much more than the user's session on the site where the tracker originated. While first-party trackers are useful to users, and limited in scope to very particular data sets relevant to the site of origin, third-party trackers, unlimited in reach, are controversial and push the boundaries of privacy and ethics online.
Let's go back to Dictionary.com, one of the sites mentioned in the Wall Street Journal tracking exposé. Of the 234 trackers it installs on users' machines, only 11 are first-party trackers. That leaves 223 third-party trackers, all with unique agendas. When a computer downloads a third-party tracker, the tracker assigns that machine a unique identification number (something like "4c812db2922...") stored inside a cookie associated with the web browser being used. So, if you've visited Dictionary.com recently, it's likely that those trackers are still on your machine right now, allowing data to be gathered about your web browsing: not just the pages you view on Dictionary.com, but pages on any other site you visit with that browser that works with the same tracking companies. If the tracker is using a technology called a "beacon," it can even record your keystrokes.
Think about that for a moment. If a tracker can record every site you visit and the things you type—search queries, instant messages or emails using web-based systems, comments, etc.—it can quickly assemble a very thorough and accurate profile for you, one that is quite valuable to advertisers.
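The identification step described above can be sketched in a few lines of Python. This is only a minimal illustration, not any real tracker's code: the cookie name tracker_id, the one-year lifetime, and the function name get_or_assign_id are all assumptions I've made for the example.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Optional, Tuple

COOKIE_NAME = "tracker_id"  # hypothetical cookie name, for illustration only


def get_or_assign_id(cookie_header: str) -> Tuple[str, Optional[str]]:
    """Return (visitor_id, set_cookie_value).

    cookie_header is the raw Cookie header the browser sent.
    set_cookie_value is None when the browser already carries an ID.
    """
    jar = SimpleCookie()
    jar.load(cookie_header)
    if COOKIE_NAME in jar:
        # Returning visitor: reuse the ID stored in the browser.
        return jar[COOKIE_NAME].value, None

    # New visitor: mint a random ID and ask the browser to keep it.
    visitor_id = uuid.uuid4().hex
    out = SimpleCookie()
    out[COOKIE_NAME] = visitor_id
    out[COOKIE_NAME]["path"] = "/"
    out[COOKIE_NAME]["max-age"] = 60 * 60 * 24 * 365  # assumed one-year lifetime
    return visitor_id, out[COOKIE_NAME].OutputString()


if __name__ == "__main__":
    vid, header = get_or_assign_id("")
    print("Set-Cookie:", header)
    # On the next request the browser sends the ID back and nothing new is set.
    print(get_or_assign_id(f"{COOKIE_NAME}={vid}"))
```

The important detail is that the ID is random and meaningless on its own. It only becomes valuable once activity is recorded against it.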
In fact, third-party trackers using beacon technology can match the data they collect about you in real time with other databases containing geolocation, financial, and medical information in order to expand your profile and predict your age, gender, zip code, income, marital status, parenthood, and home ownership, as well as your unique interests. If you visit other websites that work with the same ad networks that control the trackers you picked up on Dictionary.com, the original tracking file will take note of the connections. As this continues, your profile will become more and more specific to your interests, and will be enriched by any information you willingly share with the participating websites. So, if you're seeing more and more advertisements that seem oddly tailored to you, it's probably not a coincidence.
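Here is a rough Python sketch of how visits to different participating sites could be tied to one profile through a shared ID. It is a toy model under my own assumptions: the class AdNetworkLog, the example domains, and the SITE_INTERESTS mapping are all hypothetical, and a real ad network would work from far richer data.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical mapping from participating sites to interest segments.
SITE_INTERESTS = {
    "recipes.example": "foodie",
    "films.example": "film buff",
}


class AdNetworkLog:
    """Toy model of one ad network's server-side visit log."""

    def __init__(self):
        # visitor ID -> set of hostnames where that ID was seen
        self.sites_by_visitor = defaultdict(set)

    def record(self, visitor_id, page_url):
        """Note that this visitor ID was seen on the page's site."""
        host = urlparse(page_url).hostname
        self.sites_by_visitor[visitor_id].add(host)

    def profile(self, visitor_id):
        """Summarize everything linked to one ID so far."""
        sites = sorted(self.sites_by_visitor.get(visitor_id, set()))
        interests = sorted(
            {SITE_INTERESTS[s] for s in sites if s in SITE_INTERESTS}
        )
        return {"sites": sites, "interests": interests}


if __name__ == "__main__":
    log = AdNetworkLog()
    log.record("4c812db2", "https://dictionary.example/browse/cookie")
    log.record("4c812db2", "https://recipes.example/pasta")
    log.record("4c812db2", "https://films.example/reviews/1")
    print(log.profile("4c812db2"))
```

Each participating site sees only its own visitors, but the network that sits behind all of them sees the same ID everywhere, and that is what lets the profile grow.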
If you'd like to learn more about how this all works, I found the video produced as part of The Wall Street Journal's privacy investigation to be very helpful.
Why Tracking Exists
One of the companies that create third-party trackers is Lotame Solutions, an ad network that uses beacons to capture billions of data points to assemble consumer profiles. While these profiles don't contain specific names (i.e., there is no particular profile for Chris Butler), they can get extraordinarily close to identifying very particular things about individuals, to the point that a profile is effectively unique. Lotame also groups profiles by more general categories, like foodies or film buffs, or in customized segments, all, of course, for sale. That's the point of all this, the "why." The richer the consumer data, the more predictive it can be. The more predictive the profile, the more valuable it is to advertisers and marketers. And I'm not talking about the advertisers and marketers that are likely reading this article, those that represent mid-size B2B and B2C businesses. They rarely feature ad-network advertising on their sites and have little reason for third-party tracking. I'm talking about the worldwide mega-brands that are in a dominant enough position to see the world as data points rather than people. This is a distinction I'll come back to later...
Why Unlimited Tracking is Evil
What can you do about it? Well, if I've made you drastically paranoid, you can disable cookies completely in your browser's settings and avoid the sites listed in the Wall Street Journal investigation materials. If you just want a clean slate, you can clear out your existing cookies. But here's the really bad news: Some of the third-party trackers (including those from Dictionary.com) use a newer technology called a "Flash cookie," which was initially created to enable media players to remember unique user settings, like your preferred volume. Trackers use it to regenerate a tracking cookie after you've deleted it. You just can't win with those. It's this kind of tracking practice, above all others, which has truly crossed the line from questionable to straight-up wrong.
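To make the regeneration trick concrete, here is a small Python simulation. It does not touch a real browser or Flash at all: the two dictionaries simply stand in for the browser's cookie jar and Flash's separate storage, and the function name load_tracker_id and the key tracker_id are hypothetical.

```python
import uuid


def load_tracker_id(browser_cookies, flash_storage, key="tracker_id"):
    """Return the visitor's ID, restoring it from the backup copy if needed.

    browser_cookies and flash_storage are plain dicts standing in for the
    browser's cookie jar and Flash's separate local storage.
    """
    if key in browser_cookies:
        # Normal case: keep the backup copy in sync with the cookie.
        flash_storage[key] = browser_cookies[key]
    elif key in flash_storage:
        # The user cleared cookies, but the backup survived: respawn the ID.
        browser_cookies[key] = flash_storage[key]
    else:
        # Genuinely new visitor: mint an ID and store it in both places.
        new_id = uuid.uuid4().hex
        browser_cookies[key] = new_id
        flash_storage[key] = new_id
    return browser_cookies[key]


if __name__ == "__main__":
    cookies, flash = {}, {}
    first = load_tracker_id(cookies, flash)
    cookies.clear()  # the user "clears cookies"
    second = load_tracker_id(cookies, flash)
    print(first == second)  # the same ID comes back
```

Clearing cookies only empties one of the two stores, so the same ID comes right back on the next visit. Only clearing both would produce a genuinely fresh start.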
As if the intrusion of Flash cookies weren't problematic enough, more particular and troubling privacy issues are beginning to emerge, such as what trackers do with information gathered while a user is viewing or searching for information about sensitive health topics. Healthline Networks, Inc., another third-party ad network that uses beacon trackers, says it does not allow advertisers to track users who have viewed information on conditions like HIV/AIDS, STDs, eating disorders, or impotence, but does let them track users who have viewed information on other health-related topics. The concern that web users could be discriminated against, in a variety of ways, due to the availability of information linking them with certain medical conditions, is very real.
The same concern exists with other kinds of potential discrimination based upon race, religion, nationality, gender, income, marital status, creditworthiness, and the like. While redlining and other forms of financial discrimination are already illegal, the law is very fuzzy when it comes to discrimination based upon web-browsing history, or the sequestering of demographically determined groups into unique price lists. That's just the tip of the iceberg of concerns that privacy advocates have about unlimited tracking. Unfortunately, we're going to have to wait for legal experts to sort things out before official policy catches up with what is intuitively ethical. In the meantime, I believe that a mid-sized business has very little to gain from engaging with ad networks that collect data far beyond the scope of user activity specific to their website. However, some limited tracking could be helpful to users and marketers alike.
Why Limited Tracking is Harmless and Helpful
Let's say you operate a website for a business services firm, and as part of your marketing you send out a monthly email newsletter dealing in depth with issues related to your firm's core expertise. After the introductory portion of the newsletter, you have a link that readers can follow to read the rest of the article on your website. Knowing the number of readers who click that link is the best way to begin measuring the success of your campaign. Most email newsletter tools make those links trackable, so the clickthrough data is easily gathered and reliable. That is limited tracking in its most basic form, and is quite acceptable, provided that the recipients of the newsletter have opted in. An additional step in sophistication would be to begin tracking readers who click through from the email to your site, so that you can match their session with other goals you might be interested in measuring, such as completion of contact forms, registration for events, or downloading of assets on your website. Being able to segment those goal completions by unique sessions enables you to more accurately measure the success of your web content strategy, and it can all be done using very basic, cookie-based, limited tracking.
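As a rough illustration of what a trackable newsletter link involves, here is a short Python sketch. The parameter names campaign and sid, the example domain, and the function names are my own assumptions. Your newsletter tool will have its own conventions and will normally do this for you.

```python
from collections import defaultdict
from urllib.parse import parse_qs, parse_qsl, urlencode, urlparse, urlunparse


def tracked_link(article_url, campaign, subscriber_id):
    """Append campaign and subscriber parameters to a newsletter link."""
    parts = urlparse(article_url)
    params = parse_qsl(parts.query)  # keep any parameters already present
    params += [("campaign", campaign), ("sid", subscriber_id)]
    return urlunparse(parts._replace(query=urlencode(params)))


def unique_clickers(requested_urls):
    """Count distinct subscribers who clicked through, per campaign."""
    readers = defaultdict(set)
    for url in requested_urls:
        params = parse_qs(urlparse(url).query)
        if "campaign" in params and "sid" in params:
            readers[params["campaign"][0]].add(params["sid"][0])
    return {campaign: len(sids) for campaign, sids in readers.items()}


if __name__ == "__main__":
    base = "https://firm.example/articles/pricing"
    requests_seen = [
        tracked_link(base, "august-newsletter", "r17"),
        tracked_link(base, "august-newsletter", "r17"),  # same reader again
        tracked_link(base, "august-newsletter", "r18"),
        "https://firm.example/about",  # ordinary visit, not from the email
    ]
    print(unique_clickers(requests_seen))
```

Note that everything recorded here concerns a link you sent and a page on your own site. Nothing follows the reader anywhere else.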
Another example of this type of tracking can help you match prospects with the search engine queries that began their sessions on your site. A tracker that initiates when a user comes to your site from, say, a Google search results page (this could be done for any search engine) would assign that user a unique identification number within a cookie that stores the query that led them to click the link from Google to your page. From there, the cookie can track their session on your website (and only your website), noting which pages they view and how long they remain on those pages. If the user eventually fills out a form, their form data is matched with their initial identification number. "a4d8h622dfnb7" becomes "firstname.lastname@example.org." At best, having rich session data can enable you to better serve that prospect if they enter into your lead cycle. At something a little less than best, it can help you to evaluate whether individual pages and calls to action are doing their job effectively.
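The flow just described can be sketched in Python as follows. This is a simplified model rather than a production tracker. It assumes the referring URL carries the search terms in a q parameter, which may no longer hold for modern search engines, and the Session class and helper names are hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional, Tuple
from urllib.parse import parse_qs, urlparse


@dataclass
class Session:
    visitor_id: str
    search_query: Optional[str]
    page_views: List[Tuple[str, int]] = field(default_factory=list)
    email: Optional[str] = None


def query_from_referrer(referrer):
    """Extract the search terms from a Google referrer URL, if present."""
    parts = urlparse(referrer)
    host = parts.hostname or ""
    if host == "google.com" or host.endswith(".google.com"):
        values = parse_qs(parts.query).get("q")  # assumes a 'q' parameter
        return values[0] if values else None
    return None


def start_session(referrer):
    """Assign an anonymous ID and remember the query that led here."""
    return Session(uuid.uuid4().hex, query_from_referrer(referrer))


def record_view(session, path, seconds):
    """Note a page on this site and how long it was viewed."""
    session.page_views.append((path, seconds))


def submit_form(session, email):
    """Match the form data to the ID that was anonymous until now."""
    session.email = email


if __name__ == "__main__":
    s = start_session("https://www.google.com/search?q=b2b+marketing+strategy")
    record_view(s, "/services", 42)
    submit_form(s, "firstname.lastname@example.org")
    print(s)
```

The session only ever contains pages on your own site, which is what keeps this on the "limited" side of the line.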
What question are you trying to answer?
Those two examples, in addition to the few I mentioned at the beginning of the article (tracking that enables websites to remember user information, auto-complete form fields, or preserve items added to shopping carts), are exactly the types of limited tracking we've offered to our clients for years now, to their great satisfaction and with clear consciences all around. But if the distinction between limited and unlimited tracking is still unclear, ask yourself this: What question are you trying to answer? Do you need to know more about your site's visitors, or do you need to know more about how your visitors are using your site? Your answer to that question will determine whether you can preserve the trust of your site's visitors. Ask them only what you need to know to serve them better and no more. And above all, don't follow them out the door!