The Workshop on Data and Algorithmic Transparency (DAT'16) is being organized as a forum for academics, industry practitioners, regulators, and policy makers to come together and discuss issues related to the increasing role that "big data" algorithms play in our society. Our goal is to provide a venue for fruitful discussions and high-quality academic research papers focused on increasing understanding and transparency of large-scale data collection and the systems and algorithms that it powers. The workshop is co-located with two other highly related venues: the Data Transparency Lab Conference and the FATML'16 Workshop, and we encourage attendees to consider attending these other events as well.

Updates

Transparency and oversight of the algorithmic world: A new role for computer science research

The pervasiveness of data and algorithmic systems in society has generated a new class of research questions that the public is intensely interested in: Are my smart devices surreptitiously recording audio? Does my search history allow inferring intimate details that I haven’t explicitly searched for? Is the algorithm that decides my loan application biased? Do I see different prices online based on my browsing and purchase history? Are there dangerous instabilities or feedback loops in algorithmic systems ranging from finance to road traffic prediction?

Answering these questions requires empirical investigation of computer systems in the wild, with the goal of bringing transparency to these systems. Computer scientists are uniquely poised to carry out this research. The nascent literature on these topics makes clear that a combination of skills is called for: building systems to support large-scale, automated measurements; instrumenting devices to record and reverse-engineer network traffic; analyzing direct (leakage-based) and indirect (inference-based) privacy vulnerabilities; experimenting on black-box and white-box algorithmic systems; simulating and modeling these systems; machine learning; and crowdsourcing.

Computer science research today is largely siloed into disciplinary communities, none of which is well suited to tackle these interdisciplinary challenges. We call for the emergence of a new research community aimed at providing transparency and ethical oversight of digital technologies through empirical measurement and analysis. We envision this research feeding into a broader effort that would include law, policy, enforcement, the press, privacy advocates and civil-liberties activists.

This new field is complementary to many existing disciplines. It draws techniques from measurement research, but investigates systems “from the outside” and is concerned with societal effects of systems rather than performance characteristics. It is also similar to security research, but the systems being studied do not have specifications of correct behavior. Finally, transparency research informs areas such as privacy-by-design and discrimination-aware data mining in creating systems that respect privacy and minimize bias. (Note that we are co-located with the FATML'16 workshop.)

Investigating computer systems to identify effects of societal concern has the effect of holding companies’ feet to the fire. This will result in new ethical challenges. For example, is it a conflict of interest for a transparency researcher to accept industry funding? Regardless, we are confident that as the new community comes together to solve technical challenges, it will evolve an ethos and a set of norms to navigate these dilemmas as well.

Thanks to...

Many thanks to the Information Law Institute and the Technology Law and Policy Clinic, both at NYU, for agreeing to help host DAT'16!

Sponsors

Workshop on Data and Algorithmic Transparency (DAT'16)

Call for Papers

Today, our lives are increasingly being influenced by systems and algorithms built using large-scale data collected from end users. These systems are used for tasks as diverse as recommendation of online content (Facebook), targeting of advertisements (DoubleClick), allocation of labor (Uber), medical diagnoses, and prediction of criminal activity (Northpointe). However, external observers (including researchers, lawmakers, regulators, privacy advocates, and the press) typically have only limited visibility into these systems, as both the algorithm itself and the data that powers it are hidden. As a result, the increasing ubiquity of these systems has raised significant concerns about their fairness, data privacy, potential for discrimination, and vulnerability to manipulation. Addressing these concerns and properly understanding the impact these systems are having on both end users and society as a whole is a highly multidisciplinary challenge.

The Workshop on Data and Algorithmic Transparency (DAT'16) is organized with these challenges in mind. We aim to bring together researchers and industry practitioners from a variety of areas, including networked systems, privacy, security, economics, and human-computer interaction. Our goal is to provide a venue for fruitful discussions and high-quality academic research papers focused on increasing understanding and transparency of large-scale data collection and the systems and algorithms that it powers.

The workshop solicits submissions for talks for both previously published and unpublished work. For unpublished work, authors can submit original, unpublished ideas in the form of completed work, position papers, and/or work-in-progress papers of up to 5 pages in length (excluding references). For previously published work, authors can submit a talk abstract with a pointer to the previously published paper. We particularly encourage papers that propose new research directions, have practical impact on real-world users, or could generate lively debate at the workshop.

Accepted papers will be made available on the workshop website; however, the workshop's proceedings can be considered non-archival, meaning contributors are free to publish their work in archival journals or conferences.

Topics

Topics of interest include, but are not limited to, the following:

  • Measurements
    • Real-world measurements of user tracking (web-based tracking, mobile device tracking, location tracking, etc.)
    • Measurements and analyses of real-world "big data" algorithms and systems
    • Tools, techniques, and methodologies for measuring large-scale algorithmic systems
  • Privacy/security
    • Analysis of the flow of personal data in online systems and algorithms
    • Adversarial manipulation of algorithmic systems
    • Privacy issues with large-scale data collection and algorithms
    • Privacy-preserving systems for users to avoid data collection / algorithmic harms
  • Systems
    • Approaches and techniques to increase algorithmic transparency
    • Studies of long-term evolvability and "feedback loops"
  • HCI
    • Approaches for increasing interpretability of algorithms and their outputs
    • New interfaces for exposing data collection, usage, and algorithmic uncertainty to users
  • Economics
    • Incentives for system providers to increase transparency
    • Advertising-friendly privacy-preserving approaches to systems design
  • Multidisciplinary
    • Defining "fairness" and "discrimination"
    • Research to support legal/policy interventions

Format

Similar to other non-archival workshops, we envision the first year of DAT to be a paper workshop. Thus, there will be no proceedings, and authors should feel free to re-submit work appearing in DAT'16 to other venues. Accordingly, the format of DAT'16 will be slightly different from that of a typical CS workshop: we envision each presented paper being discussed by both its author and a panel (with the author playing a minority role). Multiple attendees will have read each paper before the workshop in order to facilitate this format. More details on the format will be provided as the workshop approaches.

Important Dates

Complete Paper Submissions Due: September 16, 2016, 11:59:59 PM HST (extended from September 9, 2016)
Notification to Authors: October 7, 2016
Camera-Ready Papers Due: November 1, 2016

Submission Instructions

Please submit your papers using this site.

To be considered for the workshop, papers must be received by the paper submission deadline mentioned above. Submissions of previously unpublished work must not be under submission to any other venue.

Submissions should be formatted using the sig-alternate-10pt LaTeX style file (or an equivalent Microsoft Word template). Submissions of unpublished work must be no longer than 5 pages including figures and tables, plus as many pages as needed for references; submissions for previously published work need only consist of an abstract with a URL to the previously published paper. The title, author names, affiliations, and an abstract should appear on the first page. Pages should be numbered. Camera-ready versions of all accepted papers will be available online to registered attendees before the workshop. After the workshop, accepted papers will be made available on the conference web site.

Organization

Program Committee Chairs

Program Committee

Local Arrangements Chair

Organizing Committee

Program

The DAT'16 workshop features a strong lineup of 15 papers, a mix of previously published work and original work submitted to the workshop.

Each session will start with short presentations followed by a panel with authors of all papers in the session. The panel discussion will be led by one or two commenters. Audience members are welcome to participate in the discussion. To facilitate in-depth, productive discussion, workshop participants are encouraged to read or skim the papers beforehand (papers will be made available on this page by November 1).

Finally, a note about the original papers appearing at DAT: we sought exploratory papers with the goal of spurring discussions at the workshop. Readers should be aware that these previously unpublished papers do not necessarily represent completed research, and that the program committee was not tasked with rigorously vetting their accuracy.

8:45 - 9:00 Welcome, overview
Alan Mislove (Northeastern University), Arvind Narayanan (Princeton University)
9:00 - 10:10

Session 1: Measurements of tracking and data collection
Commenter: Joseph Calandrino (FTC OTech)

Who Knows What About Me? A Survey of Behind the Scenes Personal Data Sharing to Third Parties by Mobile Apps (Previously published)
Jinyan Zang (Harvard University), Krysta Dummit (Massachusetts Institute of Technology), James Graves (Carnegie Mellon University)

Online Tracking: A 1-million-site Measurement and Analysis (Previously published)
Steven Englehardt (Princeton University), Dillon Reisman (Princeton University), Arvind Narayanan (Princeton University)

Tracking the Trackers: Towards Understanding the Mobile Advertising and Tracking Ecosystem (Not previously published)
Narseo Vallina-Rodriguez (ICSI/IMDEA Networks), Srikanth Sundaresan (Samsara), Abbas Razaghpanah (Stony Brook University), Rishab Nithyanand (Stony Brook University), Mark Allman (ICSI), Christian Kreibich (ICSI/Lastline), Phillipa Gill (UMass)

10:10 - 10:40 Coffee Break
10:40 - 11:50

Session 2: Transparency, accountability, and ethics
Commenter: danah boyd (Microsoft Research/Data & Society)

Accountable Algorithms (Previously published)
Joshua A. Kroll (CloudFlare, Inc and Princeton University Center for Information Technology Policy), Joanna Huey (Princeton University Center for Information Technology Policy), Solon Barocas (Microsoft Research and Cornell University), Edward W. Felten (Princeton University), Joel R. Reidenberg (Fordham University Law School), David Robinson (Upturn and Yale Law School Information Society Project), Harlan Yu (Upturn and Stanford Law School Center for Internet and Society)

Industry needs to embrace data ethics: here’s how it could be done (Not previously published)
Mark Van Hollebeke (Data & Society), Bethan Cantrell (Microsoft), Javier Salido (Microsoft)

Algorithmic Transparency in the News Media (Previously published)
Nicholas Diakopoulos (University of Maryland, College Park), Michael Koliska (Auburn University)

11:50 - 1:20 Lunch
On your own
1:20 - 2:30

Session 3: Investigations of specific platforms
Commenter: Aylin Caliskan (Princeton University)

Bias in Online Freelance Marketplaces (Not previously published)
Aniko Hannak (Northeastern University), Claudia Wagner (GESIS), David Garcia (ETH), Markus Strohmaier (GESIS), Christo Wilson (Northeastern University)

Stereotypes in Search Engine Answers: Local or Global? (Not previously published)
Gabriel Magno (DCC-UFMG-Brazil), Camila Souza Araújo (DCC-UFMG-Brazil), Wagner Meira Jr. (DCC-UFMG-Brazil), Virgilio Almeida (Berkman Klein Center - Harvard University)

Algorithmic Labor and Information Asymmetries: A Case Study of Uber's Drivers (Previously published)
Alex Rosenblat (Data & Society Research Institute), Luke Stark (Dartmouth College)

2:30 - 3:00 Coffee Break
3:00 - 4:10

Session 4: Research methods and ethics
Commenters: Rachel Goodman (ACLU), Esha Bhandari (ACLU)

Quantifying Search Bias: Investigating Sources of Bias for Political Searches in Social Media (Previously published)
Juhi Kulshrestha (MPI-SWS), Motahhare Eslami (University of Illinois at Urbana-Champaign), Johnnatan Messias (MPI-SWS), Muhammad Bilal Zafar (MPI-SWS), Saptarshi Ghosh (IIEST Shibpur), Krishna Gummadi (MPI-SWS), Karrie Karahalios (University of Illinois at Urbana-Champaign)

Auditing Search Engines for Demographic Bias in Performance (Not previously published)
Rishabh Mehrotra (University College London), Ashton Anderson (Microsoft Research), Fernando Diaz (Microsoft Research), Amit Sharma (Microsoft Research), Hanna Wallach (Microsoft Research), Emine Yilmaz (University College London)

Defending against Sybil Devices in Crowdsourced Mapping Services (Previously published)
Gang Wang (Virginia Tech/UC Santa Barbara), Bolun Wang (UC Santa Barbara), Tianyi Wang (UC Santa Barbara/Tsinghua University), Ana Nika (UC Santa Barbara), Haitao Zheng (UC Santa Barbara), Ben Y. Zhao (UC Santa Barbara)

4:10 - 4:40 Coffee Break
4:40 - 5:50

Session 5: Privacy
Commenter: Krishna Gummadi (MPI-SWS)

Should You Use the App for That? Comparing the Privacy Implications of App- and Web-based Online Services (Previously published)
Christophe Leung (Northeastern University), Jingjing Ren (Northeastern University), David Choffnes (Northeastern University), Christo Wilson (Northeastern University)

Keeping Internet Users in the Know or in the Dark? The Data Privacy Transparency of Canadian Internet Carriers (Previously published)
Andrew Clement (University of Toronto), Jonathan Obar (York University)

A Smart Home is No Castle: Privacy Vulnerabilities of Encrypted IoT Traffic (Not previously published)
Noah Apthorpe (Princeton University), Dillon Reisman (Princeton University), Nick Feamster (Princeton University)

5:50 - 6:00 Wrap up
Alan Mislove (Northeastern University), Arvind Narayanan (Princeton University)

Information for attendees

The workshop will be held on Saturday, November 19, 2016 in Lipton Hall, located within D’Agostino Hall at New York University's Law School. The full address of the venue is:

D’Agostino Hall, New York University School of Law
108 West 3rd Street [between MacDougal & Sullivan Streets]
Lipton Hall
New York, NY 10012

Or, if you prefer, you can view the location in Google Maps.

Registration

The DAT'16 workshop is open to academics, industry practitioners, regulators, policy makers, and the press. The registration fee is $35, primarily to help underwrite the cost of the venue. If this registration fee presents a hardship, potential attendees can email the PC chairs to ask for a waiver of the registration fee. Tickets are being made available on a first-come, first-served basis, so register soon!

You will need to register via the DTL Registration Site, which covers all three co-located events. You are encouraged to also register for and attend the 2016 DTL Conference and the 2016 FatML Workshop, both described in more detail below.

Accommodations

NYU Law School is conveniently located in Manhattan, close to a large number of hotels and mass transit options. An overview of the options for hotels, and the different tradeoffs that different neighborhoods offer, is available from NYU's visitor page.

Co-located events

The workshop will be co-located with two related events that we believe will be of interest to attendees of DAT'16:

  • The 2016 Data Transparency Lab Conference, hosted by the Data Transparency Lab, brings together world-class researchers, industry leaders, policymakers, developers and communicators who are leading the development of an Internet that is more respectful to personal data online. The 2016 DTL Conference will be held on November 16 and November 17 at Columbia University.
  • The 2016 Workshop on Fairness, Accountability, and Transparency in Machine Learning is an interdisciplinary workshop that considers issues of fairness, accountability, and transparency in machine learning. It will address growing anxieties about the role that machine learning plays in consequential decision-making in such areas as commerce, employment, healthcare, education, and policing. The 2016 FatML workshop will be held on November 18 at New York University Law School.