It’s always fun to learn whole new layers of technology. What I’m
posting about here is probably known by a lot of people, but my recent
involvement in two new start-up companies has really made me
think about the breadth and depth of data mining occurring on the
Internet involving personal behavior and habits. And one of the
largest harvesters of all of that personal information is Google.
There are already others who cover this much better than I … Google Watch is one … however I still wanted to blog about this.
Two of the four start-ups that I am now involved in are working on web
applications – hosted services – that want to provide new levels of
social and affiliate networks. With one start-up we are creating
a new form of video advertising on the net, with a full affiliate
marketing network behind it. So it becomes important to track
when affiliates (bloggers or web sites that host the ads) cause sales
to occur. When that happens they get paid a commission.
With the other start-up we are creating a new interactive media type
that can be spread virally through web sites, e-mail and IM. With
this solution we want to be able to track and map the viral spread to
acknowledge and reward the people who are able to cause the most spread.
As my teams and I began to build both of these solutions we began to
examine how other vendors are accomplishing the same things. We
have now looked at dozens of implementations, and then created our own
solutions that we believe will give us what we are after. While
doing this I began to see a pattern: an amazing wealth of
personal information that Internet users are giving away about
themselves … about who they REALLY are. And one of the largest
consumers of all of this personal behavioral information is
Google. It’s really the scale of their ability to gather this
data that caused me to pause and think.
It all starts with a cookie
In doing some research into how to track consumers, I was surprised to
find that most people agree that 99%+ of web browsers operate using the
default settings when it comes to cookies. Cookies are the small
pieces of data that a web site can pass down to your web browser, and
from then on – until the cookie expires – that data is passed back to
the web site every time that you access it. Cookies can be
defined to last for a very short amount of time – just that particular
session – or a very long amount of time … decades, or even hundreds of years.
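To make the session-versus-long-lived difference concrete, here is a minimal sketch using Python's standard http.cookies module. The cookie names and values (sid, visitor_id, 7f3c9a) are made up for illustration, and this is not meant to show what any real site actually sends.

```python
from http.cookies import SimpleCookie

# Hypothetical cookie names and values, for illustration only.
session = SimpleCookie()
session["sid"] = "abc123"      # no Expires/Max-Age, so it ends with the browser session
session["sid"]["path"] = "/"

persistent = SimpleCookie()
persistent["visitor_id"] = "7f3c9a"
persistent["visitor_id"]["path"] = "/"
persistent["visitor_id"]["max-age"] = 60 * 60 * 24 * 365 * 30  # roughly 30 years, in seconds


def set_cookie_header(cookie):
    """Return the Set-Cookie header line a server would send down."""
    return cookie.output()


def cookie_header_sent_back(cookie):
    """Return the Cookie header a browser would send on later requests."""
    return "Cookie: " + "; ".join(f"{name}={m.value}" for name, m in cookie.items())


print(set_cookie_header(session))
print(set_cookie_header(persistent))
print(cookie_header_sent_back(persistent))
```

The only real difference between the two is the Max-Age attribute; everything else about how the value travels back and forth is the same.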
So when you first visited Google … the very first time … you got
your first Google cookie. And this is a good starting point …
when did YOU lose your Google-virginity? When exactly was that
first time? Google knows. Even if you have changed
computers, browsers, upgraded, etc. there is a chance that Google still
knows. They know the year, day, hour, minute, and second.
You were given the mark of Google. Ok … big deal … so what.
Tracking what you search
The first thing they are now able to do is track every single search
that you perform on Google. Lots of people know about this, and
understand this is the case. Google also knows the time of day, day
of the week, phase of the moon, weather conditions, popular news, and
even the popularity of that particular search when you did it! So
what searches do you tend to do late at night during a full moon?
Ask Google … they know!
In my opinion, it’s not really the details of what you searched that
have the real value … it is when you did them, and in what sequences,
and what other patterns emerge about you. This is where your true
identity begins to emerge. What? You were on-line searching
on a Friday night? Not out with friends?
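Here is a rough sketch of what I mean by timing patterns, assuming nothing more than a log of (cookie ID, timestamp, query) rows. The log contents and the function name timing_profile are invented for this example; I have no idea how Google actually stores or analyzes any of this.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical log rows: (cookie_id, ISO timestamp, query). All values are made up.
SEARCH_LOG = [
    ("7f3c9a", "2005-11-18T23:41:07", "cheap flights"),
    ("7f3c9a", "2005-11-18T23:52:30", "hotels in lisbon"),
    ("7f3c9a", "2005-11-21T09:05:12", "python cookie library"),
    ("b81e02", "2005-11-18T12:00:00", "lunch ideas"),
]


def timing_profile(log, cookie_id):
    """Count one visitor's searches by (weekday name, hour of day)."""
    counts = Counter()
    for cid, stamp, _query in log:
        if cid == cookie_id:
            when = datetime.fromisoformat(stamp)
            counts[(DAYS[when.weekday()], when.hour)] += 1
    return counts


profile = timing_profile(SEARCH_LOG, "7f3c9a")
print(profile.most_common(1))
```

Notice that the function never looks at the query text at all. The "when" alone is enough to start sketching a habit.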
Proliferation of AdWords
Ok … now this next part is where I started to really think.
While working on how to dynamically inject video advertising into a web
page, I found that Google is using a very interesting technique for
AdWords and Google Analytics. Again … it’s very simple and
easy, and many people know this … however many people do not.
And the implications are very interesting.
If you have a web site, and you choose to place AdWords on your web
site, Google will give you a nice little bit of HTML to embed in your
page. That HTML includes a script tag that will fetch a snippet that
causes the AdWords ads to be rendered within your web page. It’s
actually pretty impressive that when I browse to your website, without
being told a thing, my browser will automatically load your page and go
and load the script from Google’s servers. Clean …
transparent. Ok yeah … and when it did that … the Google
cookie went with that request. Remember the Google cookie?
Yes … now it’s not just the searches that you do on Google’s web
site that are being tracked, but also every single web page that you
visit that contains Google AdWords!
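The mechanics can be sketched in a few lines. This is a simplified model of what a browser does when a page embeds a script from another domain: the hosts (ads.example, someblog.example) and the cookie are hypothetical, and real browsers match cookies by domain and path rules rather than the exact-host lookup I use here.

```python
from urllib.parse import urlparse

# Hypothetical cookie jar: the browser keeps cookies per domain that set them.
COOKIE_JAR = {"ads.example": {"visitor_id": "7f3c9a"}}


def build_script_request(page_url, script_url, jar):
    """Return the headers a browser would send when a page embeds a remote script."""
    host = urlparse(script_url).hostname
    # The Referer header tells the script's server which page triggered the request.
    headers = {"Host": host, "Referer": page_url}
    cookies = jar.get(host, {})
    if cookies:
        headers["Cookie"] = "; ".join(f"{k}={v}" for k, v in cookies.items())
    return headers


request = build_script_request(
    "http://someblog.example/post/42",
    "http://ads.example/show_ads.js",
    COOKIE_JAR,
)
print(request)
```

The cookie identifies the visitor and the Referer identifies the page, and both arrive at the ad server without the visitor ever going to the ad server's own site.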
Tracking what web sites you visit
Google is now notified by your browser any time that you visit a web
site that hosts Google AdWords … and it only gets better.
Google recently announced Google Analytics. This is a service
that allows web site owners to get detailed analysis of the traffic to
their web site, and about the visitors to their web site. Any web
site owner who wants this impressive reporting can simply request that
Google give them an account. When approved, Google will provide
access to the Google Analytics web site, and there you get … another
little bit of HTML to put into your web pages. The little snippet
again requests a script from Google, and of course passes along your Google cookie.
So now Google knows what you search, and what sites you visit that have
AdWords, and now any site that uses Google Analytics. I’m digging
to find figures to understand just how much of the Internet now falls
into this category, but it is a large number of sites. And just
like the searches, Google not only knows what web sites you have
visited, but at what time, in what order. Combined with their
broad indexes of Internet content, they have the ability to categorize
those sites. Combined with all other types of data they can
really begin to get an idea of just who you are, what you do and when,
on the Internet. I really begin to wonder what some of the
patterns must look like.
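As a thought experiment, here is how little code it takes to turn those script requests into an ordered, categorized history per cookie. The hit records, the category table, and the name build_timelines are all assumptions of mine; the category table simply stands in for whatever a real content index could provide.

```python
from collections import defaultdict

# Hypothetical site categories, standing in for what a content index could supply.
SITE_CATEGORY = {
    "someblog.example": "travel",
    "shop.example": "shopping",
    "news.example": "news",
}

# Hypothetical hits: (cookie_id, ISO timestamp, host of the page that embedded the script).
HITS = [
    ("7f3c9a", "2005-11-18T23:58:10", "shop.example"),
    ("7f3c9a", "2005-11-18T23:41:07", "someblog.example"),
    ("b81e02", "2005-11-18T12:00:00", "news.example"),
    ("7f3c9a", "2005-11-19T08:15:00", "someblog.example"),
]


def build_timelines(hits, categories):
    """Group hits by cookie and order each visitor's visits chronologically."""
    timelines = defaultdict(list)
    for cid, stamp, host in hits:
        timelines[cid].append((stamp, host, categories.get(host, "unknown")))
    for visits in timelines.values():
        visits.sort()  # ISO timestamps sort chronologically as plain strings
    return dict(timelines)


timelines = build_timelines(HITS, SITE_CATEGORY)
for stamp, host, category in timelines["7f3c9a"]:
    print(stamp, host, category)
```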
If Google knows your real identity also …
Now … they know you by your cookie, but do they really know who you
are? Well, if you choose to use any number of Google services –
Gmail, AdWords, AdSense, etc. – then the answer is yes! In most
cases, you join these services and begin to disclose personal
information that just might be a solid connection to the real
you. And remember, each time you use these services that nice
little Google cookie ensures that they know it’s you. Closing the
loop. Connecting the dots.
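Closing the loop is conceptually just a join on the cookie ID. The sketch below assumes two made-up tables, one recording which cookie was present at a login and one holding otherwise anonymous activity; every ID, address, and field name is invented, and nothing here reflects how any real service is built.

```python
# Hypothetical records; every ID, address, and field name is invented.
LOGINS = [
    {"cookie_id": "7f3c9a", "account": "alice@example.com"},
]

ANONYMOUS_ACTIVITY = [
    {"cookie_id": "7f3c9a", "event": "search: cheap flights"},
    {"cookie_id": "b81e02", "event": "search: lunch ideas"},
    {"cookie_id": "7f3c9a", "event": "visit: someblog.example"},
]


def attach_identity(activity, logins):
    """Label each activity row with the account seen on the same cookie, if any."""
    owner = {row["cookie_id"]: row["account"] for row in logins}
    return [{**row, "account": owner.get(row["cookie_id"])} for row in activity]


labelled = attach_identity(ANONYMOUS_ACTIVITY, LOGINS)
for row in labelled:
    print(row)
```

One login is enough to put a name on everything that cookie ever did, including activity recorded before the login happened.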
Lastly … your friends? Well … Google now knows via Gmail who
you communicate with, and at what intervals and times. They now
know the type of people that your friends and contacts hang out
with. Google knows that YOU are the type of person that all of
these people communicate with. From their e-mail address they
might even draw the direct connection to yet another person who they
have collected all of the data about … from their Google
cookie. I haven’t really spent too much time thinking about how
much deeper all of this goes … however it makes sense why Google
wants all the storage and bandwidth they are building out. It’s
not about providing search to you … it’s about owning a perspective
of you that no one else on the planet could recreate right now.
Google knows you like no one else. Google knows more about you
and me than we know about ourselves. Google will use this to
provide us what we really want … right? Google will do no evil
… right? Google would never use this data to use us … to
manipulate our undistinguished behaviors … right? The Internet
is here, and some things appear to be inevitable …
Google knows who you REALLY are.