Google says it is going to reduce the length of time it keeps personally identifiable information about users from infinite to merely obscene. There are some positives in this announcement but it mostly emphasizes how bad things are and will continue to be. Google aired its plans in a blog entry on March 14.
The company has been under pressure for a long time over its assumption that it was fine to keep a lifelong record of every search query made by each of its users, along with the IP address the query was executed from and a cookie ID to link together queries from a user's computer even if the IP address changes. Google is not alone in this belief. To one degree or another all of the search engine companies say they save the same basic information -- although AOL says it does not keep IP addresses. Google does not exactly say why it thinks it needs to keep a record of all of your queries. (Its log retention FAQ says vaguely: "We use this information to improve the quality of our services and for other business purposes. For example, we use this information for fraud detection and prevention purposes, to identify system problems and to combat denial of service attacks.")
But it is reasonable to assume the main reason Google keeps the logs is to get in our heads and see how we think so it can feed us ads that we will respond to. Google has done quite well convincing advertisers that it knows how to do this and the logs make this possible. But it's hard to see that Google needs years' worth of logs in which individual searchers can be easily identified. Under its new policy Google will maintain logs forever but will make some simple tweaks to the data after 18 to 24 months to make it a little harder to identify the individual searcher. These tweaks are not likely to be all that effective in hiding people's identities, as AOL found out when it released a pile of similar data.
I would think that the most reliable information Google needs to know about me in order to target ads comes from the last few months -- it's not often that I'll still be interested in a topic I was looking at four years ago.
Google says that the 18-to-24-month duration was chosen to be compatible with possible future data retention laws in various parts of the world. But the company acknowledges in its FAQ that the laws could wind up calling for a retention period as short as six months. Why not base the Google retention period on the laws where the hardware is located? Maybe Google wants an excuse for long retention because it has not yet thought of all the ways it can exploit the information it has about us.
Google has come very late to the realization that some people are worried about the information it stores about them. This is a good first step but it would be far better for Google to make its information anonymous in a few days or weeks rather than years.
Disclaimer: Harvard does not forget easily, at least not its former students, because they are a revenue source. But the university has not expressed an opinion on others remembering activities, so the above is my own opinion.