18 Nov
OkCupid Study Reveals the Perils of Big-Data Science
On May 8, a group of Danish researchers publicly released a dataset of nearly 70,000 users of the online dating site OkCupid, including usernames, age, gender, location, what kind of relationship (or sex) they’re interested in, personality traits, and answers to thousands of profiling questions used by the site.
When asked whether the researchers had attempted to anonymize the dataset, Aarhus University graduate student Emil O. W. Kirkegaard, who was the lead on the work, replied bluntly: “No. Data is already public.” This sentiment is repeated in the accompanying draft paper, “The OKCupid dataset: A very large public dataset of dating site users,” posted to the online peer-review forums of Open Differential Psychology, an open-access online journal also run by Kirkegaard:
Some may object to the ethics of gathering and releasing this data. However, all the data found in the dataset are or were already publicly available, so releasing this dataset merely presents it in a more useful form.
For those concerned about privacy, research ethics, and the growing practice of publicly releasing large data sets, this logic of “but the data is already public” is an all-too-familiar refrain used to gloss over thorny ethical concerns.