At least one Washington legislator is seizing on the internal mistake that led AOL to release detailed keyword search data for roughly 658,000 of its users as a chance to inject new interest into a consumer privacy bill before Congress.
Massachusetts Rep. Edward J. Markey, the senior Democrat on the Telecommunications and Internet Subcommittee of the House Energy and Commerce Committee, is using the AOL incident to renew his call for Congress to pass legislation that aims to limit the amount of personal data that can be retained by companies' Web sites.
Markey is the author of the Eliminate Warehousing of Consumer Internet Data Act of 2006, which seeks to bolster consumers' Internet privacy by preventing online businesses from storing personal information for indefinite periods of time.
The congressman, who also wrote the Social Security Number Protection Act pending before the House, contends that the AOL miscue serves as further proof of the inherent dangers of allowing companies to retain large amounts of sensitive information about their customers.
"In this digital information age, the personal data we hand over to dozens of Web sites are the keys which unlock the personal lives and valuable possessions of millions of Americans," Markey said in a statement.
"Internet companies are often able to glean personal information through a computer user's surfing and searching of Internet sites; this stored-up data about consumers' Internet use should not be needlessly kept in perpetuity, inviting data thieves or fraudsters, or disclosure through judicial fishing expeditions."
During the last week in July, AOL published information from roughly 20 million search queries on its research site before abruptly pulling the information down after privacy watchdogs criticized the move.
The company has said that it issued the data only for academic reasons, without realizing how easily someone might match the search information with the names of specific users, a feat that has already been achieved.
The data, which has been mirrored on multiple Web sites, represents a random selection of searches conducted over a three-month period (March to May 2006) and includes a numbered user ID, the actual query, the time of the search and the destination domain visited. In some cases the data includes personal names, addresses and Social Security numbers.