AOL And Google Release Word Data

AOL have released the logs of all searches made by 500,000 of their users over the course of three months (thx Adam). Privacy issues aside, keyword researchers in the SEM community are sure to find them rather useful.

Google have released…erm…something cryptic-sounding:

“Here at Google Research we have been using word n-gram models for a variety of R&D projects… We processed 1,011,582,453,213 words of running text and are publishing the counts for all 1,146,580,664 five-word sequences that appear at least 40 times. There are 13,653,070 unique words, after discarding words that appear less than 200 times”.

The dataset will be delivered on six DVDs.
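
For anyone wondering what "counts for five-word sequences" actually means, here is a rough Python sketch of the idea on a toy scale. It is not Google's pipeline: the function name ngram_counts, the "<UNK>" placeholder and the whitespace tokenising are all my own assumptions, and the quote doesn't say what happens to the discarded rare words (swapping them for a placeholder is just one plausible choice).

```python
from collections import Counter


def ngram_counts(tokens, n=5, min_word_count=200, min_ngram_count=40, unk="<UNK>"):
    """Count n-word sequences, keeping only the frequent ones.

    Defaults mirror the thresholds in the quote (five-word sequences seen
    at least 40 times, words seen at least 200 times). Replacing rare
    words with a placeholder is an assumption of this sketch.
    """
    word_freq = Counter(tokens)
    # Swap words below the word threshold for the placeholder token.
    cleaned = [t if word_freq[t] >= min_word_count else unk for t in tokens]
    # Slide a window of n words across the text and count each sequence.
    grams = Counter(tuple(cleaned[i:i + n]) for i in range(len(cleaned) - n + 1))
    # Keep only sequences that meet the sequence threshold.
    return {g: c for g, c in grams.items() if c >= min_ngram_count}


# Toy run with tiny thresholds so something survives the filtering.
toy = "a b c a b c a b d".split()
counts = ngram_counts(toy, n=2, min_word_count=2, min_ngram_count=2)
print(counts)
```

With real search-log or web text you would swap in a proper tokeniser and, at a trillion words, something far more distributed than a single Counter.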
