Institute for Software Research
School of Computer Science, Carnegie Mellon University
Automatic Categorization of Privacy Policies:
Waleed Ammar*, Shomir Wilson**, Norman Sadeh**, Noah A. Smith*
This report also appears as Language Technologies Institute
Privacy policies are a nearly ubiquitous feature of websites and online services, and the contents of such policies are legally binding for users. However, the abstruse language and sheer length of most privacy policies tend to discourage users from reading them. We describe a pilot experiment that uses automatic text categorization to answer simple categorical questions about privacy policies, as a first step toward developing automated or semi-automated methods to retrieve salient features from these policies. Our results tentatively demonstrate the feasibility of this approach for answering selected questions about privacy policies, suggesting that further work toward user-oriented analysis of these policies could be fruitful.