Aditya Parameswaran

Starting July 1, 2019, I will be an assistant professor in the School of Information (I School) and Electrical Engineering and Computer Sciences (EECS) at the University of California, Berkeley.

I am presently an assistant professor of Computer Science at the University of Illinois (UIUC).

My research interests are broadly in building tools for simplifying data analytics, i.e., empowering individuals and teams to leverage and make sense of their datasets more easily, efficiently, and effectively.

Biographical Sketch

Aditya Parameswaran is a soon-to-be (starting July 1) Assistant Professor in the School of Information (I School) and Electrical Engineering and Computer Sciences (EECS) at the University of California, Berkeley. He is currently an Assistant Professor in Computer Science at the University of Illinois, Urbana-Champaign. He spent a year as a PostDoc at MIT CSAIL following his PhD at Stanford University, before starting at Illinois in August 2014. He develops systems and algorithms for "human-in-the-loop" data analytics, synthesizing techniques from database systems, data mining, and human computation.

Click here for a longer bio.

  • —, 2019: *BIG NEWS!* I am moving to UC Berkeley, starting early Summer, with a faculty appointment at the School of Information and the EECS Department. Berkeley has been making some exciting moves in the broad data and information space, including a new Division of Data Science and Information, a very popular Masters in Information and Data Science program, and a new and increasingly popular data science major. I'm looking forward to being a part of this journey, and to moving back to the Bay Area! Illinois has been a wonderful home for the past 5-odd years with terrific students, a collegial research environment, and Midwestern charm (plus baby goats!); I'm going to miss my colleagues, the university, and the town terribly.
  • March 20, 2019: Silu Huang accepted her offer to be a researcher at the DMX group at Microsoft Research! I interned in the DMX group back in 2010 and it's a fantastic place to work.
  • March 15, 2019: DataSpread's asynch computation framework was accepted at SIGMOD'19. Congrats to Mangesh, Tana et al.!
  • March 1, 2019: Yihan Gao defended his thesis on data compression and data extraction, titled "Extracting and Utilizing Hidden Structures in Large Datasets". Yihan will return to his undergraduate alma mater to be an assistant professor at Tsinghua University. Congratulations Yihan!
  • February 10, 2019: Our demo paper on DataSpread's navigation, formula, and relational capabilities was accepted at ICDE'19. Congrats Mangesh, Tana, Sajjadur, and gang!
  • February 1, 2019: Mangesh Bendre is leaving the group to start as a research scientist at Visa Research. Congratulations Mangesh! Mangesh's thesis on DataSpread is up at this link.
  • December 15, 2018: Congratulations to Doris and gang for our IUI 2019 paper (link here) identifying a new fallacy in data exploration, the drill-down fallacy, and developing techniques to work around it.
  • December 1, 2018: Yay! My student Mangesh Bendre (coadvised with Kevin Chang) defended his thesis on DataSpread. Mangesh has spearheaded the development of DataSpread, and was instrumental in many of the key innovations so far: the hybrid data model, positional indexing, and asynchronous formula computation.
  • November 15, 2018: Congrats to Doris and the rest of the Helix team for the VLDB 2019 paper on the design of Helix, our human-in-the-loop machine learning system.
  • August 13, 2018: Congratulations to Silu, Liqi et al. (w/ Aaron Elmore) for a "best of conference" nomination for our VLDB 2017 paper on the design and implementation of Orpheus, our versioned database system!
  • August 13, 2018: Doris's IEEE D.E. Bulletin paper articulating our vision for a visual discovery assistant, called VIDA, will be out soon. Thanks to Alexandra for inviting us.
  • August 10, 2018: Shreya's paper on incorporating constraints for more accurate crowd-powered sorting was accepted as a short paper at CIKM 2018.
  • July 11, 2018: Thrilled to receive the Army Research Office Young Investigator Program Award for our work on decoupling perspectives in crowdsourcing. Thanks to the CS Department for the generous article!
  • June 30, 2018: Our SIGMOD blog post on why visual data exploration introduces a number of new data management challenges is up. Thank you to Georgia Koutrika for inviting me!
  • June 15, 2018: Our project page for Helix is up!
  • June 15, 2018: Short papers accepted at IDEA on iteration in machine learning workflows, and at HCOMP on quality evaluation methods for crowdsourced segmentation.
  • May 27, 2018: I am serving on the steering committee of the DSIA workshop @ VIS 2018. Consider submitting your latest and greatest work here!
  • May 2, 2018: Demos on Helix, our human-in-the-loop machine learning tool, and ShapeSearch, our flexible shape-based trend-line querying tool, were accepted at VLDB 2018.
  • April 25, 2018: Our paper on Needletail, an efficient sampling engine for browsing, was accepted at the HILDA Workshop at SIGMOD 2018. We've used Needletail in a number of papers on scalable approximate visualization generation, so we're glad to have this finally out there!
  • April 15, 2018: Our paper on accelerating human-in-the-loop ML, a vision paper for the Helix project, was accepted at the DEEM Workshop at SIGMOD 2018. Lots more to come on Helix in the near future. Congrats Doris!
  • April 1, 2018: Thrilled to receive the 2018 Dean's Excellence in Research Award from the University of Illinois, given to assistant professors with an outstanding research profile + impact. Delighted to be able to celebrate with the group (photo on the right)!
  • March 1, 2018: Happy to be recognized with a spot on the "List of teachers rated as Excellent" for the second year in a row!
  • February 15, 2018: Our demo paper (w/ folks at UChicago) on generating succinct diffs between data versions was accepted at SIGMOD 2018.
  • February 10, 2018: Mangesh's paper on data models and indexes for scalable spreadsheets has been accepted to ICDE 2018! This paper lays out the groundwork for our DataSpread project, many years in the making.
        Click here for more news.

Synergistic Activities

I am an Associate Editor for SIGMOD Record, focusing on vision articles. Please consider sending us your most controversial and/or interesting papers!

I serve on the steering committees of HILDA (Human-in-the-loop Data Analytics) at SIGMOD and DSIA (Data Systems for Interactive Analysis) at VIS. Lots of excitement around this nascent area at the intersection of databases, data mining, and visualization/HCI - join us!

This cycle, I am serving as an Area/Associate Chair for HCOMP 2020, VLDB 2020, and SIGMOD 2020, and as a Program Committee member for VLDB Demo 2019 and HILDA 2019 (phew!). I've served on the program committees of VLDB, KDD, SIGMOD, WSDM, WWW, SOCC, HCOMP, ICDE, and EDBT, many of them multiple times.

I co-organized HILDA 2017. I was the SIGMOD 2016 Undergraduate Research Chair.

Selected Projects


Zenvisage: A visualization recommendation system

Zenvisage is a tool for effortlessly visualizing insights from very large data sets. It automates finding the right visualization for a query, significantly simplifying the laborious task of identifying appropriate visualizations.

Project page here. Try it live here!


Helix: An Accelerated Human-in-the-loop Machine Learning System

Helix accelerates the iterative development of machine learning pipelines with a human developer "in the loop" via intelligent caching and reuse.

Project page here.


DataSpread: A Spreadsheet-Database Hybrid

DataSpread is a tool that marries the best of databases and spreadsheets.

Project page: here


Orpheus: Relational Dataset Version Management at Scale

DataHub (or "GitHub for Data") is a system that enables collaborative data science by keeping track of large numbers of versions and their dependencies compactly, and allowing users to progressively clean, integrate and visualize their datasets. OrpheusDB is a component of DataHub focused on using a relational database for versioning.

Project page: here


Populace: A Suite of Crowd-Powered Algorithms

Our work has developed a number of algorithms for gathering, processing, and understanding data obtained from humans (or crowds), while minimizing cost, latency, and error. Since 2014, our focus has been on optimizing open-ended crowdsourcing: an understudied and challenging class.

Project page: here