Web scraping as a topic before the US Supreme Court - GESIS publication is cited



A GESIS publication is cited in a case currently before the US Supreme Court to illustrate the social and scientific benefits of web scraping.

Several legal disputes are currently underway in the USA whose outcomes could have far-reaching implications for the method of web scraping and thus for the entire field of computational social science. One example is the case "Nathan Van Buren v. United States", which concerns a former police officer who used the technical infrastructure of his police station to search for a specific person. According to the charges, he thereby violated the Computer Fraud and Abuse Act (CFAA), a law that dates back to 1986, long before widespread digitalisation, and that has since become extremely controversial.

Under its broad reading, which is hardly in keeping with the times, the CFAA has the potential to criminalize web scraping and seriously affect an entire field of research:

“Section 1030(a)(2)(C) of the Computer Fraud and Abuse Act (‘CFAA’) proscribes ‘exceeding authorization’ when accessing a computer and thereby obtaining information. This statute is both criminal and civil. Moreover, it is capable of both a broad reading, proscribing any use of any kind of electronic device in any manner and for any purpose not expressly permitted by the device’s owner— essentially using the CFAA to furnish civil, contractual prohibitions with criminal penalties; or a narrow reading, interpreting the Section as akin to a ‘data theft statute’— thereby restricting this provision of the statute to the proscription actually set forth in its text.” (https://www.acm.org/binaries/content/assets/public-policy/ustpc-amicus-brief-vanburen-v-us.pdf)

The Association for Computing Machinery (ACM), the worldwide association of computer scientists, has responded by submitting an amicus curiae brief to the US Supreme Court. The brief argues that web scraping of publicly available web content is an important basis for research. It cites 20 scientific publications that illustrate the importance of web scraping for research and society.

One of the cited publications comes from GESIS: Aniko Hannák wrote it with colleagues from ETH Zurich, Northeastern University and GESIS during a visiting stay at the GESIS site in Cologne.

Hannák, A., Wagner, C., Garcia, D., Mislove, A., Strohmaier, M., & Wilson, C. (2017, February). Bias in online freelance marketplaces: Evidence from TaskRabbit and Fiverr. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (pp. 1914-1933). https://doi.org/10.1145/2998181.2998327