Google, Yahoo and other search engine companies are in the data gathering business. The fact that they offer you and me the service of locating things on the Internet is a means to an end, and that end is data about what you and I do online. They are like the folks who hoard string: the more string they gather, the better they feel, even if there is little or no actual use for most of what they gather.
Left to their own devices, the search engine companies would keep the data they collect about you forever, but they are slowly waking up to the fact that the Orwellian aspects of their business are beginning to get to people. Recently Google decided to reduce the time it keeps data about your searches to 18 months, an improvement but not a cure for the problem.
Europe feels differently about privacy than the United States does. In the United States there are almost no controls on the information that companies collect about you. There are a few controls on what data government can collect, but it looks like the government can get around the restrictions by buying access to the same data from the private sector. In Europe they have the quaint idea that people have a right to some level of privacy, and the government enforces it in law. The best that the US government is willing to do is establish voluntary guidelines.
The EU has now fired a warning shot across the bows of the search engine companies. A draft report on the relationship between search engine business models and European privacy laws has found plenty to worry about. The report concludes that search engine companies have not shown that they actually need to retain information collected about us for more than six months. The search engine companies put forth a number of reasons why they want to keep the data longer than that, but the report (in Section 5.2) effectively demolishes them. For what it's worth, I find it very hard to imagine how data older than some small number of months (six sounds fine) can contribute meaningfully to predicting what I'm interested in so that I can be shown ads that might attract my attention (which, after all, is the reason to collect the data in the first place).
Google posted a reaction to the EU draft report in which the company repeated some of the arguments that the draft report, in my opinion, properly dismissed.
The draft report also concluded that IP addresses are personal information because they can help identify a person. The Google response referred to a previous posting on the topic. But this posting, given the best possible spin, is deeply misleading. It is patently absurd to ignore the very large number of IP addresses that are not dynamic and to ignore the fact that even dynamic addresses can be stable for months at a time (as mine was when I had cable-based Internet service). There is no question in any reasonable person's mind that many IP addresses do identify a person. It is just this kind of clever but purposefully misleading "information posting" that has destroyed the credibility of the search engine companies when it comes to privacy-related topics.
Disclaimer: Harvard is required by law to protect the privacy of its students and has not offered an opinion that this is not a good idea. In any case, the above review is my own.