Newfangled works with independent agencies to create lead development web platforms for their clients.

Unlimited vs. Limited Web Tracking



A recent Wall Street Journal investigation showed that the top 50 websites in the United States install an average of 64 individual trackers on visitors' computers with little to no disclosure. Some of those sites exceeded 100 trackers. While the word "installation" may suggest a visible, mechanical process, complete with slow progress bars, the actual process of placing trackers onto visitors' computers is fast and invisible. The fact that it goes unnoticed by the majority of web users is what makes it so effective. But the more insight people have into exactly what happens when they access sites like Dictionary.com, for example, the more likely they are to reconsider using them at all. This is reinforced by the poll the Wall Street Journal is running in the sidebar of its privacy investigation report. As of this writing, 58% of readers who answered it reported that they are "very alarmed" about advertisers and other companies tracking their web use. I would have been interested in seeing a similar poll asking about their level of concern before reading the article. My suspicion is that the details of the article (in particular, how tracking works and the different types of tracking that web users have probably already been subject to) directly increase readers' concern.

So in that spirit, let's get up to speed with how this all works. There's quite a bit of detail to cover, so get comfortable. I promise that you won't regret taking the time to read through it all...

How Tracking Works

There are two basic types of trackers. "First-party" trackers transfer benign text files to users' computers so that websites can remember user information, such as items placed in a shopping cart or data already entered into form fields. "Third-party" trackers also transfer files to users' computers, but they gather data about much more than the user's session on the site where the tracker was picked up. While first-party trackers are useful to users and limited in scope to particular data sets relevant to the site of origin, third-party trackers, unlimited in reach, are controversial and push the boundaries of privacy and ethics online.
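To make the first-party versus third-party distinction concrete, here is a minimal Python sketch that labels a tracker by comparing its host with the host of the page being viewed. The function names and example URLs are hypothetical, and the "last two labels" rule is a deliberate simplification (it would misjudge domains such as example.co.uk), so treat it as an illustration rather than how any browser actually decides.

```python
from urllib.parse import urlparse


def registrable_domain(host: str) -> str:
    """Naive approximation: keep only the last two labels of a hostname."""
    return ".".join(host.lower().split(".")[-2:])


def classify_tracker(page_url: str, tracker_url: str) -> str:
    """Label a tracker relative to the page that loaded it."""
    page = registrable_domain(urlparse(page_url).hostname or "")
    tracker = registrable_domain(urlparse(tracker_url).hostname or "")
    return "first-party" if page == tracker else "third-party"


# Hypothetical example URLs, for illustration only.
page = "http://dictionary.com/browse/privacy"
print(classify_tracker(page, "http://static.dictionary.com/cart.js"))
print(classify_tracker(page, "http://ads.example-network.com/pixel.gif"))
```

The first call reports a first-party tracker because both hosts share the same site; the second reports a third-party tracker because the file comes from a different domain.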

Let's go back to Dictionary.com, one of the sites mentioned in the Wall Street Journal tracking exposé. Out of the 234 trackers it installs onto users' machines, only 11 are first-party trackers. That leaves 223 third-party trackers, all with unique agendas. When a third-party tracker is downloaded by a computer, it assigns that machine a unique identification number (something like "4c812db2922...") stored inside a cookie associated with the web browser being used. So, if you've visited Dictionary.com recently, it's likely that those trackers are running on your machine right now, gathering data about you by observing your web browsing session: not just the pages you view on Dictionary.com, but all sites you visit with that browser. If the tracker is using a technology called a "beacon," it can even record your keystrokes.

Think about that for a moment. If a tracker can record every site you visit and the things you type—search queries, instant messages or emails using web-based systems, comments, etc.—it can quickly assemble a very thorough and accurate profile for you, one that is quite valuable to advertisers.

In fact, third-party trackers using beacon technology can match the data they collect about you in real time with other databases containing geolocation, financial, and medical information in order to expand your profile to predict your age, gender, zip code, income, marital status, parenthood, home ownership, as well as unique interests. If you visit other websites that work with the same ad networks that control the trackers you picked up on Dictionary.com, the original tracking file will take note of the connections. As this continues, your profile will become more and more specific to your interests as well as enriched by any information you willingly share with the participating websites. So, if you're seeing more and more advertisements that seem oddly tailored to you, it's probably not a coincidence.
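The mechanism described above, one identification number following a browser from site to site, can be sketched in a few lines of Python. This is an illustrative model only: the class, the "tracker_id" cookie name, and the site names are assumptions, and a plain dictionary stands in for the browser's cookie store.

```python
import uuid
from collections import defaultdict


class ThirdPartyTracker:
    """Toy model of an ad network whose tracker is embedded on many sites."""

    def __init__(self):
        # Maps each tracker ID to the list of (site, page) visits seen so far.
        self.profiles = defaultdict(list)

    def observe(self, cookie_jar: dict, site: str, page: str) -> str:
        """Record a page view, assigning an ID on first contact."""
        tracker_id = cookie_jar.get("tracker_id")
        if tracker_id is None:
            tracker_id = uuid.uuid4().hex
            cookie_jar["tracker_id"] = tracker_id  # stored in the browser
        self.profiles[tracker_id].append((site, page))
        return tracker_id


browser = {}  # stands in for one browser's cookies
network = ThirdPartyTracker()
first = network.observe(browser, "dictionary.com", "/browse/privacy")
second = network.observe(browser, "news.example", "/health/article")
print(first == second, network.profiles[first])
```

Because the same cookie is read on both sites, the two visits land in a single profile, which is what lets the network connect activity across unrelated websites.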

If you'd like to learn more about how this all works, I found this video, produced as part of The Wall Street Journal's privacy investigation, to be very helpful:

Why Tracking Exists

One of the companies that creates third-party trackers is Lotame Solutions, an ad network that uses beacons to capture billions of data points to assemble consumer profiles. While these profiles don't contain specific names (i.e. there is no particular profile for Chris Butler), they can get extraordinarily close to identifying very particular things about individuals to the point of being unique. Lotame also groups profiles by more general categories, like foodies or film buffs, or in customized segments, all, of course, for sale. That's the point of all this, the "why." The richer the consumer data, the more predictive it can be. The more predictive the profile, the more valuable it is to advertisers and marketers. And I'm not talking about the advertisers and marketers that are likely reading this article—those that represent mid-size B2B and B2C businesses. They rarely feature ad-network advertising on their sites and have little reason for third-party tracking. I'm talking about the worldwide mega-brands that are in a dominant enough position to see the world as data points rather than people. This is a distinction I'll come back to later...

Why Unlimited Tracking is Evil

What can you do about it? Well, if I've made you drastically paranoid, you can disable cookies completely in your browser's settings and avoid the sites listed in the Wall Street Journal investigation materials. If you just want a clean slate, you can clear out your existing cookies. But here's the really bad news: some of the third-party trackers (including those from Dictionary.com) use a newer technology called a "Flash cookie" to regenerate a tracker after you've deleted it. Flash cookies were initially created to enable media players to remember unique user settings, like your preferred volume. You just can't win with those. It's this kind of tracking practice, above all others, which has truly crossed the line from questionable to straight up wrong.
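To see why clearing your cookies doesn't help in that case, consider this simplified Python model of the regeneration trick. It is a sketch under stated assumptions: two dictionaries stand in for the browser's cookie store and Flash's separate local storage, and the function and key names are hypothetical.

```python
def load_tracker_id(browser_cookies: dict, flash_storage: dict, fresh_id: str) -> str:
    """Return the visitor's ID, restoring it from whichever copy survives."""
    tracker_id = (
        browser_cookies.get("tracker_id")
        or flash_storage.get("tracker_id")
        or fresh_id  # only used when neither copy exists
    )
    # Write the ID back to both places so either one can restore the other.
    browser_cookies["tracker_id"] = tracker_id
    flash_storage["tracker_id"] = tracker_id
    return tracker_id


cookies, flash = {}, {}
original = load_tracker_id(cookies, flash, "4c812db2922")

cookies.clear()  # the user deletes cookies using the browser's own tool
restored = load_tracker_id(cookies, flash, "some-new-id")
print(original == restored)
```

The ID survives because the browser's "delete cookies" tool only empties the first store. In this model, the user gets a genuinely new ID only when both stores are cleared.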

If the intrusion of Flash cookies weren't problematic enough, more particular and troubling privacy issues (like what trackers do with information gathered while a user is viewing or searching for information about sensitive health topics) are beginning to emerge. Healthline Networks, Inc., another third-party ad network that uses beacon trackers, says it does not allow advertisers to track users who have viewed information on conditions like HIV/AIDS, STDs, eating disorders, or impotence, but does let them track users who have viewed information on other health-related topics. The concern that web users could be discriminated against, in a variety of ways, due to the availability of information linking them with certain medical conditions, is very real.

The same concern exists with other kinds of potential discrimination based upon race, religion, nationality, gender, income, marital status, creditworthiness, and the like. While redlining and other financial discrimination is already illegal, the law is very fuzzy when it comes to discrimination based upon web-browsing history, or the sequestering of demographically-determined groups into unique price lists. That's just the tip of the iceberg of concerns that privacy advocates have about unlimited tracking. Unfortunately, we're going to have to wait for legal experts to sort things out before official policy catches up with what is intuitively ethical. In the meantime, I believe that a mid-sized business has very little to gain from engaging with ad networks that collect data far beyond the scope of user activity specific to their website. However, some limited tracking could be helpful to users and marketers alike.

Why Limited Tracking is Harmless and Helpful

Let's say you operate a website for a business services firm, and as part of your marketing you send out a monthly email newsletter dealing in depth with issues related to your firm's core expertise. After the introductory portion of the newsletter, you have a link that readers can follow to read the rest of the article on your website. Knowing the number of readers that click that link is the best way that you can begin to measure the success of your campaign. Most email newsletter tools make those links trackable, so the clickthrough data is easily gathered and reliable. That is limited tracking in its basic form, and is quite acceptable, provided that the recipients of the newsletter have opted in. An additional step in sophistication would be to begin tracking readers that click through from the email to your site, so that you can match their session with other goals you might be interested in measuring, such as completion of contact forms, registration for events, or downloading of assets on your website. Being able to segment those goal completions by unique sessions enables you to more accurately measure the success of your web content strategy and can all be done using very basic, cookie-based, limited tracking.
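For readers who want to picture what a trackable newsletter link looks like, here is a minimal Python sketch. It assumes a hypothetical "campaign" query parameter and an example.com address; real email tools use their own parameter names and usually route clicks through their own servers.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def tracked_link(article_url: str, campaign: str) -> str:
    """Append a campaign label to the link placed in the newsletter."""
    parts = urlparse(article_url)
    extra = urlencode({"campaign": campaign})
    query = f"{parts.query}&{extra}" if parts.query else extra
    return urlunparse(parts._replace(query=query))


def campaign_from_request(url: str) -> Optional[str]:
    """On the website, read the label back from the requested URL."""
    values = parse_qs(urlparse(url).query).get("campaign")
    return values[0] if values else None


link = tracked_link("https://www.example.com/newsletter/article", "oct-2010")
print(link)
print(campaign_from_request(link))
```

Counting the requests that arrive carrying a given campaign label gives you the clickthrough number for that newsletter, and nothing about what the reader does elsewhere.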

Another example of this type of tracking can help you to match prospects with their initial search engine queries that began their sessions on your site. A tracker that initiates when a user comes to your site from, say, a Google search results page (this could be done for any search engine), would assign that user a unique identification number within a cookie that stores the query they submitted that led them to click the link from Google to your page. From there, the cookie can track their session on your website (and only your website), noting which pages they view and how long they remain on those pages. If the user eventually fills out a form, their form data is matched with their initial identification number. "a4d8h622dfnb7" becomes "chris@newfangled.com." At best, having rich session data can enable you to better serve that prospect if they enter into your lead cycle. At something a little less than best, it can help you to evaluate whether individual pages and calls to action are doing their job effectively.
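The flow just described can be sketched as a small Python model. Everything here is illustrative: the class and method names are hypothetical, an in-memory dictionary stands in for the cookie and the site's database, and it assumes the referring URL carries the search in a "q" parameter, which a search engine may not provide.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import parse_qs, urlparse


@dataclass
class Session:
    visitor_id: str
    search_query: Optional[str] = None
    pages: list = field(default_factory=list)  # (path, seconds on page)
    email: Optional[str] = None


def query_from_referrer(referrer: str) -> Optional[str]:
    """Pull the search terms out of a Google results URL, if present."""
    parsed = urlparse(referrer)
    if "google." not in (parsed.hostname or ""):
        return None
    values = parse_qs(parsed.query).get("q")
    return values[0] if values else None


class LimitedTracker:
    """Tracks sessions on one website only; nothing is shared elsewhere."""

    def __init__(self):
        self.sessions = {}

    def start(self, referrer: str) -> str:
        visitor_id = uuid.uuid4().hex[:13]  # the value kept in the cookie
        self.sessions[visitor_id] = Session(visitor_id, query_from_referrer(referrer))
        return visitor_id

    def page_view(self, visitor_id: str, path: str, seconds: int) -> None:
        self.sessions[visitor_id].pages.append((path, seconds))

    def form_submitted(self, visitor_id: str, email: str) -> None:
        # The anonymous ID is now matched with a known contact.
        self.sessions[visitor_id].email = email


tracker = LimitedTracker()
visitor = tracker.start("http://www.google.com/search?q=web+tracking+privacy")
tracker.page_view(visitor, "/newsletter/web-tracking", 95)
tracker.form_submitted(visitor, "chris@newfangled.com")
print(tracker.sessions[visitor])
```

Note what the model never does: it records no activity on any other website, and it has no method for handing session data to a third party.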

What question are you trying to answer?

Those two examples, in addition to the few I mentioned at the beginning of the article (tracking that enables websites to remember user information, auto-complete form fields, or preserve items added to shopping carts) are exactly the types of limited tracking we've offered to our clients for years now, to their great satisfaction and with clear consciences all around. But if the distinction between limited and unlimited tracking is still unclear, ask yourself this: What question are you trying to answer? Do you need to know more about your site's visitors, or do you need to know more about how your visitors are using your site? Your answer to that question will determine whether you can preserve the trust of your site's visitors. Ask them only what you need to know to serve them better and no more. And above all, don't follow them out the door!





Comments

Russ | October 5, 2010 2:22 PM
Chris, Nice article. I am always trying to perfect tracking on my sites. This allows me to really identify who my users are and how they find the site. I currently couple Google Webmaster Tools with Google Analytics, but have also used Tracking 202 to help me with this cause. Having more info about my users allows me to customize the content for them. I do believe there are so many people out there that don't track because they really are not analytical or technical... or simply don't understand the importance of it.
Alexander James | October 5, 2010 2:41 PM
Hey Chris,

This is a great article to educate the average web-user. I work for a company called Cocoon that provides increased online privacy as well as malware protection.

When logged into Cocoon, no information touches your hard drive, it is all stored on our servers. That's right, cookies, browsing history, all of it. In addition users' IP addresses become anonymous the second they log in. Cocoon also provides users with an ad-remover, and scans for drive-by downloads and other malware before loading web pages onto users' computers.

Our goal is to provide users with a better browsing experience where they are not bombarded with annoying ads or spam, and there is no risk in browsing the web freely.

Right now Cocoon is in beta so it's free as an add-on for Firefox. Check it out: https://getcocoon.com/

Also, here is a video explaining how exactly Cocoon works: http://www.youtube.com/watch?v=oRM4aWiCwxk

Thanks, and have a nice day!

Alex
Alex | October 5, 2010 4:29 PM
Informative piece. But I'm left feeling pretty discouraged. Your explanation of the kind of tracking that is commonplace today left me speechless and frustrated. Why has no one spoken out about it? Why aren't there laws preventing it? You can't just hang a sign on someone else's house advertising your stuff, so why can you sneak on to somebody's computer?
anonymous | October 5, 2010 4:52 PM
@Alex, people are speaking out about this and have been for years. Check out the Electronic Frontier Foundation (link provided).

@Russ, What did you read? Most of this was about the problems with tracking, not finding nifty new ways to track.

On that note, @Chris, this will be informative to your readers but I wish you would have taken a clearer stand against tracking and used this platform to win them to the cause.
Chris Butler | October 5, 2010 5:12 PM
Russ: Thanks for reading. I think your comment, especially in light of the later comment referring to the EFF (which I'll address in a moment), comes from a similar understanding of the distinction between unlimited and limited tracking. As far as limited tracking goes, I think it is appropriate and should be motivated by the desire to improve user experience, service to clients, and the effectiveness of your site in achieving its purpose—whether that is purely informational, lead-generative, or transactional.

Alexander: Thanks for providing information about Cocoon. I'll definitely take a look at it.

Alex: Again, in reference to the EFF comment, it's true, people have been speaking out about this for years, many from the EFF. Those who have been speaking out about web privacy are very mindful of the work needed to work out issues of legality as far as tracking technologies and procedures are concerned, and, as I mentioned in this article, that entanglement will likely continue for a long while before bearing any fruit.

One minor point, though. No, you can't just solicit for business on someone else's property. But drawing that analogy in this case is a bit incomplete. The third-party tracking that I described in this article is done within a relationship between a business and an advertising network. In exchange for access to that network, a company will allow the network to run its tracking tools on their website. You, the consumer affected by all this, see the solicitation on their website, not your house. However, the mechanics of it are where the analogy is correct: the tracking gets done by an infiltration of your computer. In most cases, those are clearly described in privacy policies, or thwarted by your ability to block cookies. But with Flash cookies, they're essentially sneaking a Trojan Horse onto your machine.

Anonymous: I've appreciated the goals of the EFF for as long as I've known of them. My goal here was to provide an overview of how tracking works and the various technologies being used today, while also providing my opinion on the ethics of tracking. I thought I made my position pretty clear, especially with the heading that reads, "Why Unlimited Tracking is Evil." If not though, here's my point: Limited tracking respects the user and therefore limits the scope of tracking to only general session data on the site of origin. It does not allow anyone to share session data with any other parties, period. Unlimited tracking, insofar as it tracks anything beyond the scope of session data on the site of origin, is not acceptable to me, regardless of how commonplace it is.

By the way, I've tried to consistently mention privacy issues on our site, whether in this newsletter or on our blog. If you dig deep, you'll find a post from April, 2009 on cloud computing and privacy that even includes a video of Brad Templeton, the EFF chairman, presenting at the 2009 BIL Conference.
Jennifer Merrel | October 5, 2010 6:33 PM
A few of my freelance clients have asked me about doing tracking on their sites (I manage them for them on server space I pay for). I've tried to avoid having to figure out how to do it by bringing up the privacy issues, but honestly, most of them don't care about that and my lack of know-how ends up being obvious.

I think this info will help me punt, but what if they wanted to just do the limited kind of tracking? How can I help them do that? Are there any tools you would recommend?
R.J. | October 6, 2010 4:57 PM
Really helpful information. Thanks for the writeup!
Mark O'Brien | October 7, 2010 12:20 PM
Jennifer,

Custom limited on-site tracking that goes beyond what Google Analytics is able to do is of course possible, and is getting more popular as time goes on, so it is not surprising that you're getting requests for this.

One important thing to keep in mind is that the programming required to do this sort of thing is very complex and therefore time consuming and expensive. Installing Google Analytics on a site is easy. Custom programming the tools you'd need to individually track people is not. This type of functionality is along the same lines of complexity as building custom modules for E-Commerce, complex intranets, and multi-lingual capabilities.

It might help you to frame the conversations you have with your clients about custom, individualized, and limited on-site tracking with this point on complexity and cost.
John Kuefler | October 7, 2010 3:33 PM
Great post, Chris. I found it interesting that you cavalierly categorized third-party tracking as evil, and first-party tracking as good (because it is "harmless and helpful"). I imagine most third-party tracking businesses would characterize their services as "harmless and helpful" also. It would be interesting to know if the average web user considers the scenario you describe for limited tracking (in which the user's search engine query is later matched with their individual identity if they fill out a form) to be any more or less "evil" than the scenarios you describe relative to third-party tracking. I'm sure it depends on the sensibilities of the user.

At our agency we discuss these issues frequently and wonder if people are becoming desensitized into accepting these things as facts-of-life, or if people's privacy concerns will eventually create enough uproar to cause privacy laws to change. It will be interesting to see how it all plays out. Thanks again for an enlightening piece.
Chris Butler | October 7, 2010 3:53 PM
Jennifer: What Mark said... plus, it's ok for you to decide where your capabilities end and whether it's a good business decision to extend them.

R. J.: Glad you enjoyed it. Thanks for reading!

John: It was rather cavalier, I'll admit that. The heading you mentioned was intentionally inflammatory—not in that I don't actually feel that way, but in that the language was probably stronger than it needed to be—in order to hopefully incite some conversation around the issue. That seems to have worked.

But, to the point: The distinction between limited and unlimited tracking is an important one. Limited tracking, compared to what people either know or suspect is already being done online, hardly even feels like "tracking" any more, at least not in the pejorative sense. I find it akin to "watching the store." As a consumer, you wouldn't expect a retail chain to be able to improve the customer experience without drawing from surveillance footage recorded on the premises. After all, the store is their property; they have the right to record what happens there. I don't find anything unethical about that practice, provided that the information captured with that method isn't sold to anyone else, or used for anything other than improving the business.

Unlimited tracking is where limited tracking methods cross the line in multiple ways. The ethics become questionable when the specific things being tracked extend beyond what is happening "in the store," as if the surveillance camera at The Gap hopped down off the wall and followed you out, recording the rest of your time at the mall, including that embarrassing stop in to Cinnabon. If this happens some day and The Gap says they're just trying to anticipate what your waist size will be for next time, don't believe it! That's really the point: Limited tracking helps the business and the customer. Unlimited tracking really just helps the advertising networks, while giving them fairly lame ways to claim that they are helping the consumer (i.e. "If we know what you like, we can show you the ads for the stuff you want anywhere you go."). Thanks, but no thanks. What if my preferences change? What if my preferences are different in various circumstances? What if I consider some of my preferences private and others public? So, while evil is probably overkill, this kind of thing is a very slippery slope. The portion of the article devoted to exploring some of the more concerning areas of privacy policy surrounding these practices makes that clear.

I'm glad to hear that you all at Callahan Creek are discussing this. I think you're right that the lack of concern around this issue is probably more indicative of unfamiliarity, not apathy. Hopefully, as awareness increases, people will seize the opportunity to make intentional decisions about what they feel is and is not OK.

Thanks for reading and commenting!
Justin Kerr | October 7, 2010 4:56 PM
Chris,

A very thought-provoking post, as always.

I find it interesting, from a sociological perspective, that anyone would consider web tracking an invasion of their privacy but, at the same time, post intimate details of their daily life on Facebook, Twitter or a personal blog.

Just sayin'.
Chris Butler | October 7, 2010 5:04 PM
Justin: There is a fairly large difference there. I, for one, am not prone to sharing much personal information online, whether on Facebook or anywhere else. But, people have the choice of what level of personal detail they are comfortable sharing in those venues, whereas if they visit Dictionary.com, they lose control of what information they share. Remember, some of those third-party trackers can record keystrokes in general, which could include messages you send "in private" to a friend over web-based chat or searches you enter in Google. I suppose the user can regain control by not using those websites that allow third-party tracking, but that defeats the purpose for those sites being there in the first place.
brian | October 7, 2010 8:16 PM
I'm with @Justin. People who are indignant about online privacy seem to forget that the internet is _not_ private. You know what the internet is, so act accordingly!
Greg | October 9, 2010 2:24 PM
This is really not worth all the worry. There is no privacy anymore. Privacy today is an illusion. We need something like a modern Dr. Strangelove - How I Learned to Stop Worrying and Love the Modern Police State.
Chris Butler | October 9, 2010 4:35 PM
Justin, Brian, Greg, and everyone else, I'm sensing that this conversation is touching on something deeper and more foundational to human experience than simply whether we can continue to surf the web without feeling watched or followed. Of course, I'm glad for this! Believe me, I had to restrain myself a bit in not introducing more philosophical considerations into the article itself...

Right now, a firm called Numenta is working on trying to update computing technology based upon the human neocortex. Founder Donna Dubinsky has pointed out that even though a 3-year-old child can instantly identify a dog by seeing an incomplete picture of it (say, its legs only), a computer cannot do this. Today's image search technology, for example, is based upon indexing metadata that human beings have associated with image files, but not based upon the computer's ability to actually identify what is pictured. Numenta, in constructing a computing model that can learn this kind of thing, envisions all kinds of applications, from more sophisticated pattern recognition, financial fraud detection, and instant pathological results to, in case you were wondering about the connection with this article, even better ad technology based upon web user behavior prediction algorithms.

So here's my point: when I first learned about Numenta's ambitions, the only one that stirred me emotionally was the advertising stuff. I thought to myself, sure, perhaps the ads will be more targeted to the things I want, but what if I want to seek those things out on my own? What if I want that kind of free will? Well, every other application that they hope to create challenges my free will, too. In fact, the goal of enabling machines to do what we cannot do (or in some cases, what we can do but much, much faster and more accurately) could be understood as the final boundary of human potential, beyond which what we've created will surpass our own abilities. Now that is a sobering thought.

When we are emotionally stirred by the introduction of new technologies—particularly those which involve issues of privacy—I think it's really because it prompts a deeper question: What kind of world do we want to live in? Do we want to live in a world where we have more leisure, better health, and greater wealth than ever before at the cost of acknowledging our own limits and handing over control to machines, or do we want to live in a world where we continue to be the greatest organism it has ever seen, perhaps still having to work hard and suffer? I'm not sure what the answer is. I think that in today's culture, most people probably do want to preserve the preeminence of humanity, yet would have a hard time admitting it. Surely, something about losing our place at the top of the food chain must stir us to our very core...

Your thoughts?
Carolyn | October 11, 2010 1:29 PM
So, I'm curious to know what other sites use "flash based cookies" which regenerate after you clear them. Can you speak more to that? And, is there anything we consumers can do to avoid them? I'd love to hear a comment by dictionary.com or any other sites using this type of cookie to hear their reasoning.
Chris Butler | October 11, 2010 6:21 PM
Carolyn: I'd recommend looking through the data that the Wall Street Journal has collected on the top 50 most popular sites on the web, which has been visually organized based upon the level of exposure a user will have to tracking technology. The "Very High" and "High" categories include dictionary.com, merriam-webster.com, comcast.net, careerbuilder.com, photobucket.com, and msn.com, which all install between 118 and 234 individual trackers on users' computers. Just so I don't dwell too much on dictionary.com, take a look at msn.com, which installs 131 cookies, 23 beacons, and 53 first-party trackers, or careerbuilder.com, which installs 93 cookies, 1 Flash cookie, 20 beacons, and 4 first-party trackers. That this stuff is insidious would be a considerable understatement for those who care about web privacy policy.

What can you do? Again, the Wall Street Journal offers a pretty comprehensive step-by-step guide. Most of the article pages from their series also include a Tracker Scan tool (scroll down and find it on the left side of most pages) that allows you to enter any website's URL to see its privacy policy. You can also download a tool called TrackerScan that will show you the ad trackers on any webpage.

I haven't been able to track down any on-the-record comments from Dictionary.com regarding their use of tracking technology, but I'm keeping my eye out. However, I think their privacy policy, albeit subtly, says it all:
"In the course of serving ads on our sites, these companies may place or recognize a cookie on your computer or use other technologies such as pixel tags to track you across various sites where they display ads and record your activities to show you targeted ads. These cookies contain a unique identifying number that is anonymous, and are not linked to any personally identifiable information that you voluntarily give to us. We do not share your personally identifiable information with advertisers and you remain anonymous to the advertisers. Some of our advertisers and advertising networks are members of the Network Advertising Initiative. They enable you to opt out of being tracked by their cookies by clicking on the "Consumer Opt-out" button on the page at http://www.networkadvertising.org/."
The interesting connection here (for me) is that they're participating in a system which profits off of users' passive participation, and appears to be flexible to their preferences, yet requires them to seek out a relatively obscure means of opting out of that system. Remind you of anything? This is the almost exact same setup that created all kinds of controversy for Facebook last year. But we see nothing comparable in terms of user push-back when it comes to third-party advertising networks. Interesting.

But wait, there's more! Later on in the privacy policy statement, a direct mention to Flash cookies is made. Get a load of this:
"Some advertising service companies also use flash cookies to serve advertisements that use flash media technology. Adobe offers a web tool that runs locally on a user’s computer, and allows users to delete flash cookies, as well as set permissions for sites to drop and store cookies on the user’s computer. The tool can be found at http://www.macromedia.com/support/documentation/en/flashplayer/help/settings_manager06.html and http://www.macromedia.com/support/documentation/en/flashplayer/help/settings_manager07.html"
Another opt-out requirement! In this case, if a user wants to prevent Flash cookies from resurrecting themselves—or, in other words, if a user wants to really delete them, which is what they think they're doing when they use the tool their browser provides to delete cookies—they have to find another tool and install it independently on their computer in order to do so. This appears helpful, but it's really an intentional obstacle between users and their ability to actually control how they engage with information online. I hope we quickly get to a place where this sort of thing is not just uncool, but illegal as well.
Justin Kerr | October 18, 2010 12:05 PM
Chris,

Just saw this article on Bloomberg News about an effort to alert users when they're being tracked and allowing them to opt out of receiving targeted ads based on their usage data.

http://www.bloomberg.com/news/2010-10-04/advertising-groups-offer-program-to-let-u-s-consumers-block-web-tracking.html
