In an article in The Atlantic (h/t Bruce Schneier), Woodrow Hartzog and Evan Selinger argue for using the concept of obscurity in place of privacy when discussing the degree to which data is easily accessible:
Obscurity is the idea that when information is hard to obtain or understand, it is, to some degree, safe. Safety, here, doesn’t mean inaccessible. Competent and determined data hunters armed with the right tools can always find a way to get it. Less committed folks, however, experience great effort as a deterrent.
Online, obscurity is created through a combination of factors. Being invisible to search engines increases obscurity. So does using privacy settings and pseudonyms. Disclosing information in coded ways that only a limited audience will grasp enhances obscurity, too. Since few online disclosures are truly confidential or highly publicized, the lion’s share of communication on the social web falls along the expansive continuum of obscurity: a range that runs from completely hidden to totally obvious.
This is a great framing, because it captures nuance that is lost when data is discussed in terms of privacy, which is often treated as a binary concept. However, the article doesn’t touch on the fact that data falling at any given point along the obscurity continuum carries both pros and cons.
Consider, for example, whois records, which provide contact information for registrants of domain names. These live somewhere in the middle of the obscurity spectrum. Registrars are supposed to publish the information via the whois service, so the records are not completely private, though some people do conceal their information behind a privacy proxy. (A privacy proxy completely obscures the who, though it does not obscure the means to contact the registrant, as the proxy service is supposed to provide a pass-through email address.) Those who don’t use proxies have their contact information published in plain text. However, automatically grabbing and parsing the information is non-trivial, because whois servers are scattered with no uniform structure, the data format is not standardized, and registrars impose rate limits.
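To make the format problem concrete, here is a minimal Python sketch of what a parser has to cope with. The two sample responses, the field labels, and the `FIELD_ALIASES` table are made up for illustration and are not taken from any particular registrar. Real whois output varies far more than this, and the sketch ignores server discovery and rate limiting entirely.

```python
import re

# Hypothetical sample responses; real registrars use many more layouts.
SAMPLE_A = """Domain Name: EXAMPLE.COM
Registrant Name: Jane Doe
Registrant Email: jane@example.com
"""

SAMPLE_B = """domain:      example.org
owner-name:  John Roe
e-mail:      john@example.org
"""

# Each canonical field maps to the label variants we happen to know about.
# This table is an assumption and would need constant upkeep in practice.
FIELD_ALIASES = {
    "registrant_name": ("registrant name", "owner-name"),
    "registrant_email": ("registrant email", "e-mail"),
}

# Matches "label: value" lines, tolerating padding around the colon.
LINE_RE = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")


def parse_whois(text):
    """Return the canonical fields found in one raw whois response."""
    found = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        label, value = match.group(1).lower(), match.group(2)
        for field, aliases in FIELD_ALIASES.items():
            # Keep the first occurrence of each field and ignore repeats.
            if label in aliases and field not in found:
                found[field] = value
    return found


if __name__ == "__main__":
    for sample in (SAMPLE_A, SAMPLE_B):
        print(parse_whois(sample))
```

Every new layout means another entry in the alias table, which is exactly the kind of effort that keeps casual harvesters away and slows down legitimate automation alike.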
If you worry, as many people do, about the harvesting of whois records en masse for use by spammers or other criminals, this partial obscurity is a blessing. It makes it more difficult or “expensive” for criminals to do their work. For those of us working to identify malicious actors and correlate badware domains, or trying to automate the process of reporting compromised websites, though, the same obscurity is a curse. The same trade-off arises with most changes in data obscurity, including the introduction of Facebook Graph Search, which was used as an example in the article.
Hartzog and Selinger end their essay with the following call to action:
Obscurity is a protective state that can further a number of goals, such as autonomy, self-fulfillment, socialization, and relative freedom from the abuse of power. A major task ahead is for society to determine how much obscurity citizens need to thrive.
Taking into account the negative aspects of obscurity (or, put another way, the benefits of transparency), and the fact that there’s no one-size-fits-all solution, I’d amend their conclusion as follows:
Obscurity and transparency can each, in its own way, further a number of goals, such as autonomy, self-fulfillment, socialization, and relative freedom from the abuse of power. A major task ahead is for society to determine the balance of transparency and obscurity that citizens need in various aspects of their lives to thrive.