Workshop on Data and Algorithmic Transparency

Workshop on Data and Algorithmic Transparency (DAT'16)

November 19, 2016, New York University Law School

Co-located with the Data Transparency Lab Conference and the FATML'16 Workshop

The Workshop on Data and Algorithmic Transparency (DAT'16) is being organized as a forum for academics, industry practitioners, regulators, and policy makers to come together and discuss issues related to the increasing role that "big data" algorithms play in our society. Our goal is to provide a venue for fruitful discussions and high-quality academic research papers focused on increasing understanding and transparency of large-scale data collection and the systems and algorithms that it powers. The workshop is co-located with two other highly related venues: the Data Transparency Lab Conference and the FATML'16 Workshop, and we encourage attendees to consider attending these other events as well.

Updates

November 3, 2016: All of the DAT'16 papers are now available on the Program Page.
October 21, 2016: Registration for DAT'16 is now open — more details are available on the Venue Page.
October 20, 2016: We have posted the schedule for DAT'16!
October 17, 2016: We're excited to announce that DAT'16 will be hosted at NYU Law School's Lipton Hall (a short subway ride from DTL'16, at Columbia)!
October 7, 2016: We are pleased to announce that DAT'16 accepted 15 papers!
August 31, 2016: Due to a number of requests, we've decided to push back the paper submission deadline one week; it's now September 16, 2016 at 11:59:59pm Hawaii time.
August 10, 2016: We're aiming for a very interactive workshop, and are soliciting submissions of both unpublished and previously-published work.
July 24, 2016: We're thrilled to announce a terrific program committee of 18 researchers!
June 30, 2016: The webpage for DAT'16 is now live!

Transparency and oversight of the algorithmic world: A new role for computer science research

The pervasiveness of data and algorithmic systems in society has generated a new class of research questions that the public is intensely interested in: Are my smart devices surreptitiously recording audio? Does my search history allow inferring intimate details that I haven’t explicitly searched for? Is the algorithm that decides my loan application biased? Do I see different prices online based on my browsing and purchase history? Are there dangerous instabilities or feedback loops in algorithmic systems ranging from finance to road traffic prediction?

Answering these questions requires empirical investigation of computer systems in the wild, with the goal of bringing transparency to these systems. Computer scientists are uniquely poised to carry out this research. The nascent literature on these topics makes clear that a combination of skills is called for: building systems to support large-scale, automated measurements; instrumenting devices to record and reverse-engineer network traffic; analyzing direct (leakage-based) and indirect (inference-based) privacy vulnerabilities; experimenting on black-box and white-box algorithmic systems; simulating and modeling these systems; machine learning; and crowdsourcing.

Computer science research today is largely siloed into disciplinary communities, none of which is well suited to tackle these interdisciplinary challenges. We call for the emergence of a new research community aimed at providing transparency and ethical oversight of digital technologies through empirical measurement and analysis. We envision this research feeding into a broader effort that would include law, policy, enforcement, the press, privacy advocates and civil-liberties activists.

This new field is complementary to many existing disciplines. It draws techniques from measurement research, but investigates systems “from the outside” and is concerned with societal effects of systems rather than performance characteristics. It is also similar to security research, but the systems being studied do not have specifications of correct behavior. Finally, transparency research informs areas such as privacy-by-design and discrimination-aware data mining in creating systems that respect privacy and minimize bias. (Note that we are co-located with the FATML'16 workshop.)

Investigating computer systems to identify effects of societal concern has the effect of holding companies’ feet to the fire. This will result in new ethical challenges. For example, is it a conflict of interest for a transparency researcher to accept industry funding? Regardless, we are confident that as the new community comes together to solve technical challenges, it will evolve an ethos and a set of norms to navigate these dilemmas as well.

Thanks to...

Many thanks to the Information Law Institute and the Technology Law and Policy Clinic, both at NYU, for agreeing to help host DAT'16!

Call for Papers

Today, our lives are increasingly being influenced by systems and algorithms built using large-scale data collected from end users. These systems are used for tasks as diverse as recommendation of online content (Facebook), targeting of advertisements (DoubleClick), allocation of labor (Uber), medical diagnoses, and prediction of criminal activity (Northpointe). However, external observers (including researchers, lawmakers, regulators, privacy advocates, and the press) typically have only limited visibility into these systems, as both the algorithm itself and the data that powers it are hidden. As a result, the increasing ubiquity of these systems has raised significant concerns about their fairness, data privacy, potential for discrimination, and vulnerability to manipulation. Addressing these concerns and properly understanding the impact these systems are having on both end users and society as a whole is a highly multidisciplinary challenge.

The Workshop on Data and Algorithmic Transparency (DAT'16) is organized with these challenges in mind. We aim to bring together researchers and industry practitioners from a variety of areas, including networked systems, privacy, security, economics, and human-computer interaction. Our goal is to provide a venue for fruitful discussions and high-quality academic research papers focused on increasing understanding and transparency of large-scale data collection and the systems and algorithms that it powers.

The workshop solicits submissions for talks for both previously published and unpublished work. For unpublished work, authors can submit original, unpublished ideas in the form of completed work, position papers, and/or work-in-progress papers of up to 5 pages in length (excluding references). For previously published work, authors can submit a talk abstract with a pointer to the previously published paper. We particularly encourage papers that propose new research directions, have practical impact on real-world users, or could generate lively debate at the workshop.

Accepted papers will be made available on the workshop website; however, the workshop's proceedings can be considered non-archival, meaning contributors are free to publish their work in archival journals or conferences.

Topics

Topics of interest include, but are not limited to, the following:

Measurements
- Real-world measurements of user tracking (web-based tracking, mobile device tracking, location tracking, etc.)
- Measurements and analyses of real-world "big data" algorithms and systems
- Tools, techniques, and methodologies for measuring large-scale algorithmic systems
Privacy/security
- Analysis of the flow of personal data in online systems and algorithms
- Adversarial manipulation of algorithmic systems
- Privacy issues with large-scale data collection and algorithms
- Privacy-preserving systems for users to avoid data collection / algorithmic harms
Systems
- Approaches and techniques to increase algorithmic transparency
- Studies of long-term evolvability and "feedback loops"
- Approaches for increasing interpretability of algorithms and their outputs
- New interfaces for exposing data collection, usage, and algorithmic uncertainty to users
Economics
- Incentives for system providers to increase transparency
- Advertising-friendly privacy-preserving approaches to systems design
Multidisciplinary
- Defining "fairness" and "discrimination"
- Research to support legal/policy interventions

Format

Similar to other non-archival workshops, we envision the first year of DAT to be a paper workshop. Thus, there will be no proceedings, and authors should feel free to re-submit work appearing in DAT'16 to other venues. Accordingly, the format of DAT'16 will be slightly different from that of a typical CS workshop: we envision each presented paper being discussed by both the author and a panel (with the author playing a minority role). Multiple attendees will have read each paper before the workshop in order to facilitate this format. More details on the format will be provided as the workshop approaches.

Important Dates

Complete Paper Submissions Due: September 16, 2016, 11:59:59PM HST (extended from September 9, 2016)
Notification to Authors: October 7, 2016
Camera-Ready Papers Due: November 1, 2016

Submission Instructions

Please submit your papers using this site.

To be considered for the workshop, papers must be received by the paper submission deadline mentioned above. Submissions of previously unpublished work must not be under submission to any other venue.

Submissions should be formatted using the sig-alternate-10pt LaTeX style file (or an equivalent Microsoft Word template). Submissions of unpublished work must be no longer than 5 pages including figures and tables, plus as many pages as needed for references; submissions for previously published work need only consist of an abstract with a URL to the previously published paper. The title, author names, affiliations, and an abstract should appear on the first page. Pages should be numbered. Camera-ready versions of all accepted papers will be available online to registered attendees before the workshop. After the workshop, accepted papers will be made available on the conference web site.

Organization

Program Committee Chairs

Alan Mislove, Northeastern University
Arvind Narayanan, Princeton University

Program Committee

Alan Mislove, Northeastern University
Anupam Datta, Carnegie Mellon University
Arvind Narayanan, Princeton University
Aylin Caliskan-Islam, Princeton University
Christian Sandvig, University of Michigan
Christo Wilson, Northeastern University
Claude Castelluccia, INRIA
Krishna Gummadi, MPI-SWS
Frank Pasquale, University of Maryland
Joseph Calandrino, FTC OTech
John Byers, Boston University
Michael Tschantz, ICSI
Nick Diakopoulos, University of Maryland
Nick Feamster, Princeton University
Nikos Laoutaris, Telefonica
Phillipa Gill, University of Massachusetts at Amherst
Roxana Geambasu, Columbia University
Saikat Guha, Microsoft Research

Local Arrangements Chair

Augustin Chaintreau, Columbia University

Organizing Committee

Alan Mislove, Northeastern University
Arvind Narayanan, Princeton University
Krishna Gummadi, MPI-SWS
Nikos Laoutaris, Telefonica

Program

The DAT'16 workshop features a strong lineup of 15 papers, a mix of previously published work and original work submitted to the workshop.

Each session will start with short presentations followed by a panel with authors of all papers in the session. The panel discussion will be led by one or two commenters. Audience members are welcome to participate in the discussion. To facilitate in-depth, productive discussion, workshop participants are encouraged to read or skim the papers beforehand (papers will be made available on this page by November 1).

Finally, a note about the original papers appearing at DAT: we sought exploratory papers with the goal of spurring discussions at the workshop. Readers should be aware that these previously unpublished papers do not necessarily represent completed research, and that the program committee was not tasked with rigorously vetting their accuracy.

8:45 - 9:00	Welcome, overview
Alan Mislove (Northeastern University), Arvind Narayanan (Princeton University)

9:00 - 10:10	Session 1: Measurements of tracking and data collection
Commenter: Joseph Calandrino (FTC OTech)
- Who Knows What About Me? A Survey of Behind the Scenes Personal Data Sharing to Third Parties by Mobile Apps (Previously published)
  Jinyan Zang (Harvard University), Krysta Dummit (Massachusetts Institute of Technology), James Graves (Carnegie Mellon University)
- Online Tracking: A 1-million-site Measurement and Analysis (Previously published)
  Steven Englehardt (Princeton University), Dillon Reisman (Princeton University), Arvind Narayanan (Princeton University)
- Tracking the Trackers: Towards Understanding the Mobile Advertising and Tracking Ecosystem (Not previously published)
  Narseo Vallina-Rodriguez (ICSI/IMDEA Networks), Srikanth Sundaresan (Samsara), Abbas Razaghpanah (Stony Brook University), Rishab Nithyanand (Stony Brook University), Mark Allman (ICSI), Christian Kreibich (ICSI/Lastline), Phillipa Gill (UMass)

10:10 - 10:40	Coffee Break

10:40 - 11:50	Session 2: Transparency, accountability, and ethics
Commenter: danah boyd (Microsoft Research/Data & Society)
- Accountable Algorithms (Previously published)
  Joshua A. Kroll (CloudFlare, Inc and Princeton University Center for Information Technology Policy), Joanna Huey (Princeton University Center for Information Technology Policy), Solon Barocas (Microsoft Research and Cornell University), Edward W. Felten (Princeton University), Joel R. Reidenberg (Fordham University Law School), David Robinson (Upturn and Yale Law School Information Society Project), Harlan Yu (Upturn and Stanford Law School Center for Internet and Society)
- Industry needs to embrace data ethics: here’s how it could be done (Not previously published)
  Mark Van Hollebeke (Data & Society), Bethan Cantrell (Microsoft), Javier Salido (Microsoft)
- Algorithmic Transparency in the News Media (Previously published)
  Nicholas Diakopoulos (University of Maryland, College Park), Michael Koliska (Auburn University)

11:50 - 1:20	Lunch (on your own)

1:20 - 2:30	Session 3: Investigations of specific platforms
Commenters: Aylin Caliskan (Princeton University)
- Bias in Online Freelance Marketplaces (Not previously published)
  Aniko Hannak (Northeastern University), Claudia Wagner (GESIS), David Garcia (ETH), Markus Strohmaier (GESIS), Christo Wilson (Northeastern University)
- Stereotypes in Search Engine Answers: Local or Global? (Not previously published)
  Gabriel Magno (DCC-UFMG-Brazil), Camila Souza Araújo (DCC-UFMG-Brazil), Wagner Meira Jr. (DCC-UFMG-Brazil), Virgilio Almeida (Berkman Klein Center - Harvard University)
- Algorithmic Labor and Information Asymmetries: A Case Study of Uber's Drivers (Previously published)
  Alex Rosenblat (Data & Society Research Institute), Luke Stark (Dartmouth University)

2:30 - 3:00	Coffee Break

3:00 - 4:10	Session 4: Research methods and ethics
Commenters: Rachel Goodman (ACLU), Esha Bhandari (ACLU)
- Quantifying Search Bias: Investigating Sources of Bias for Political Searches in Social Media (Previously published)
  Juhi Kulshrestha (MPI-SWS), Motahhare Eslami (University of Illinois at Urbana-Champaign), Johnnatan Messias (MPI-SWS), Muhammad Bilal Zafar (MPI-SWS), Saptarshi Ghosh (IIEST Shibpur), Krishna Gummadi (MPI-SWS), Karrie Karahalios (University of Illinois at Urbana-Champaign)
- Auditing Search Engines for Demographic Bias in Performance (Not previously published)
  Rishabh Mehrotra (University College London), Ashton Anderson (Microsoft Research), Fernando Diaz (Microsoft Research), Amit Sharma (Microsoft Research), Hanna Wallach (Microsoft Research), Emine Yilmaz (University College London)
- Defending against Sybil Devices in Crowdsourced Mapping Services (Previously published)
  Gang Wang (Virginia Tech/UC Santa Barbara), Bolun Wang (UC Santa Barbara), Tianyi Wang (UC Santa Barbara/Tsinghua University), Ana Nika (UC Santa Barbara), Haitao Zheng (UC Santa Barbara), Ben Y. Zhao (UC Santa Barbara)

4:10 - 4:40	Coffee Break

4:40 - 5:50	Session 5: Privacy
Commenter: Krishna Gummadi (MPI-SWS)
- Should You Use the App for That? Comparing the Privacy Implications of App- and Web-based Online Services (Previously published)
  Christophe Leung (Northeastern University), Jingjing Ren (Northeastern University), David Choffnes (Northeastern University), Christo Wilson (Northeastern University)
- Keeping Internet Users in the Know or in the Dark? The Data Privacy Transparency of Canadian Internet Carriers (Previously published)
  Andrew Clement (University of Toronto), Jonathan Obar (York University)
- A Smart Home is No Castle: Privacy Vulnerabilities of Encrypted IoT Traffic (Not previously published)
  Noah Apthorpe (Princeton University), Dillon Reisman (Princeton University), Nick Feamster (Princeton University)

5:50 - 6:00	Wrap up
Alan Mislove (Northeastern University), Arvind Narayanan (Princeton University)

Information for attendees

The workshop will be held on Saturday, November 19, 2016 in Lipton Hall, located within D’Agostino Hall at New York University's Law School. The full address of the venue is:

D’Agostino Hall, New York University School of Law
108 West 3rd Street [between MacDougal & Sullivan Streets]
Lipton Hall
New York, NY 10012

Or, if you prefer, you can view the location in Google Maps.

Registration

The DAT'16 workshop is open to academics, industry practitioners, regulators, policy makers, and the press. The registration fee is $35, primarily to help underwrite the cost of the venue. If this registration fee presents a hardship, potential attendees can email the PC chairs to ask for a waiver of the registration fee. Tickets are being made available on a first-come, first-served basis, so register soon!

You will need to register via the DTL Registration Site, which covers all three co-located events. You are encouraged to also register for and attend the 2016 DTL Conference and the 2016 FatML Workshop, both described in more detail below.

Accommodations

NYU Law School is conveniently located in Manhattan, close to a large number of hotels and mass transit options. An overview of the options for hotels, and the different tradeoffs that different neighborhoods offer, is available from NYU's visitor page.

Co-located events

The workshop will be co-located with two related events that we believe will be of interest to attendees of DAT'16:

The 2016 Data Transparency Lab Conference, hosted by the Data Transparency Lab, brings together world-class researchers, industry leaders, policymakers, developers and communicators who are leading the development of an Internet that is more respectful to personal data online. The 2016 DTL Conference will be held on November 16 and November 17 at Columbia University.
The 2016 Workshop on Fairness, Accountability, and Transparency in Machine Learning is an interdisciplinary workshop that considers issues of fairness, accountability, and transparency in machine learning. It will address growing anxieties about the role that machine learning plays in consequential decision-making in such areas as commerce, employment, healthcare, education, and policing. The 2016 FatML workshop will be held on November 18 at New York University Law School.
