Aditya Parameswaran

I am an assistant professor of Computer Science at the University of Illinois (UIUC). My research interests are broadly in simplifying and improving data analytics, i.e., helping users make better use of their data.

My work involves building real data analytics systems with principled foundations, designing algorithms (with formal guarantees) for the systems, as well as mining data obtained from such systems.

Biographical Sketch

Aditya Parameswaran is an Assistant Professor in Computer Science at the University of Illinois (UIUC), with affiliate appointments at the Institute for Genomic Biology and the Beckman Institute for Advanced Science and Technology. He spent a year as a PostDoc at MIT CSAIL following his PhD at Stanford University, before starting at Illinois in August 2014. He develops systems and algorithms for "human-in-the-loop" data analytics, synthesizing techniques from database systems, data mining, and human computation.

He has received the NSF CAREER Award (2017), the TCDE Early Career Award (2017), the Dean's Excellence in Research Award (2018) and the C. W. Gear Junior Faculty Award from the University of Illinois (2017), multiple "best" Doctoral Dissertation Awards (from SIGMOD, SIGKDD, and Stanford in 2014), an "Excellent" Lecturer award from Illinois (2016), a Google Faculty award (2015), the Key Scientific Challenges award from Yahoo!, five best-of-conference citations (VLDB 2010, KDD 2012, ICDE 2014, ICDE 2016, AISTATS 2017), and a best demo honorable mention (SIGMOD 2017). He is an associate editor of SIGMOD Record, serves on the steering committee of the HILDA (Human-in-the-loop Data Analytics) Workshop, and has served on program committees of various database, data mining, web, systems, and crowdsourcing conferences. His research group is supported with funding from the NSF (CAREER, Medium, AITF, BigData), the NIH (2X), Adobe, the Siebel Energy Institute, and Google.

Quick Project Links



  • February 10, 2018: Mangesh's paper on data models and indexes for scalable spreadsheets has been accepted to ICDE 2018!
  • February 1, 2018: Thrilled to receive the 2018 Dean's Excellence in Research Award from the University of Illinois!
  • December 12, 2017: New paper on quickly identifying a succinct difference (or "diff") between two relational datasets here. We characterize the complexity of this problem, based on varying the classes of operators and types of attributes.
  • December 10, 2017: More Kelly news! Kelly received the Snap Research Scholarship and an honorable mention for the CRA Undergraduate Research Award. Woohoo!
  • December 1, 2017: Interested in trying out our latest version of Zenvisage? Here's the link: More at our Medium blog post.
  • November 18, 2017: Paper studying scalability issues in Microsoft Excel accepted at CHI 2018. Congrats Kelly Mack (an amazing achievement for an undergrad)! In other news, Kelly was also nominated for the CRA undergraduate research award.
  • November 10, 2017: Paper on our automatic data lake extraction tool accepted at SIGMOD 2018. Congrats Yihan and Silu!
  • October 15, 2017: My O'Reilly Blog post on "Enabling Data Science for the Majority" is live! In it, I lay out 5 BIG challenges in democratizing data science, and describe some of our work as well as some of the other work in this space. Read this if you want to find out what's new and cool in data science research.
  • October 10, 2017: New preprint on characterizing the spectrum of scalability issues in Microsoft Excel via Reddit posts here as part of our DataSpread project. Led by Kelly, our intrepid undergrad!
  • October 1, 2017: The Zenvisage gang chronicles our multi-year effort in participatory design with Zenvisage, along with scientists from material science, genetics, and astrophysics, here. Many interesting insights on how visual exploration systems like Zenvisage can fit into scientific data exploration workflows + many real instances of valuable scientific findings gained from the process!
  • September 11, 2017: Thanks to new funding from the NSF Algorithms in the Field (AitF) program, we can advance scalable visualization by applying sublinear time techniques, along with the super smart theory duo of Ronitt Rubinfeld (MIT) and Ilias Diakonikolas (USC). NSF page here.
  • September 1, 2017: VLDB Blog Posts! Here they are:
    • "Towards Automating Insight", here.
    • "Drawing Conclusions Early with Incvisage", here.
    • "Painless Data Versioning for Collaborative Data Science", here.
    • "Crowdsourcing in Practice: Our Findings", here.
  • August 15, 2017: Grateful to receive the C.W. Gear Junior Faculty Award from the University of Illinois! Thanks to the Department of CS for being such a supportive environment for junior faculty!
  • August 1, 2017: New/Updated preprints:
    • on Needletail, our "any-k" browsing and sampling engine, here.
    • on FastMatch, an algorithm for rapidly matching histograms to a target, applying a variety of systems and algorithmic ideas, here; a key component of Zenvisage.
    • on DataSpread, studying representation and indexing schemes for spreadsheet data, here.
    • on Datamaran, our unsupervised extraction tool for large-scale extraction from data lakes, here.
  • May 18, 2017: The OrpheusDB demo received a best demo honorable mention! Congrats to Liqi + Silu! Missed it at SIGMOD? You can still catch it here: video.
  • May 15, 2017: Paper on IncVisage, our incrementally improving visualization algorithm and interface, has been accepted to VLDB'17! Paper here. Joint work with theorists at MIT and Waterloo, and HCI/Viz folks at Illinois. Perhaps the first paper that has theory, DB, and HCI co-authors? (Would love to be corrected if not.)
  • April 15, 2017: Thrilled and honored to receive:
    • The NSF CAREER Award: Abstract here. Excited to pursue the vision of optimizing "open-ended" crowdsourcing! Vision paper from IEEE Data Engg. Bulletin here.
    • The TCDE (Technical Committee on Data Engineering) Early Career Award, awarded for an individual's whole body of work in the first 5 years after the PhD. The award citation: "The award is for developing new interactive tools and techniques that expand the reach of data analytics, enabling powerful data-driven discoveries by experts and non-experts alike."
  • April 15, 2017: Orpheus Updates: demo accepted at SIGMOD 2017; paper accepted at VLDB 2017 (no revisions!); open-source release here.
  • April 3, 2017: The New York Times cited my book with Adam Marcus on crowdsourced data management. Article here.
  • April 3, 2017: The HILDA 2017 workshop (co-located with SIGMOD) program is up.
  • March 1, 2017: Manas's TKDE paper on smart drill-down (from the "best of ICDE 2016") was accepted.
  • February 20, 2017: Vision paper on next-gen visualization recommendation systems with Manasi Vartak, Sam et al. is out at SIGMOD Record. Link here.
  • January 30, 2017: My student Silu Huang won the MSR Faculty Fellowship: the first Illinois student since 2011! A great honor! Silu has recently been working on Orpheus.
  • January 30, 2017: Yihan's paper on calibrating classifiers has been accepted as an ORAL presentation at AISTATS'17!
  • January 15, 2017: Our paper analyzing a very large log of all tasks from a popular crowdsourcing marketplace has been accepted at VLDB'17. Learn all about how a marketplace operates, what the distribution of tasks looks like, and how the workers behave here.
  • January 10, 2017: Three of our key analytics tools, DataSpread, Zenvisage, and OrpheusDB, are moving out of private betas with a few interested parties to the public, available for easy download and deployment. More details and download links here:
  • January 1, 2017: New preprint release on Catamaran, our new fully unsupervised tool for data extraction from machine-generated data: no examples or supervision needed! Preprint here.

Synergistic Activities

I am an Associate Editor for SIGMOD Record, focusing on vision articles. Please consider sending us your most controversial and/or interesting papers!

I served as a co-chair for the HILDA (Human-In-the-Loop Data Analytics) Workshop at SIGMOD 2017, and now serve on the steering committee. Website here. Please consider submitting a paper to further this nascent area at the intersection of databases, data mining, and visualization/HCI.

I've served on the program committees of VLDB, KDD, SIGMOD, WSDM, WWW, SOCC, HCOMP, ICDE, and EDBT, many of them multiple times.

I served as an Area Chair for SIGMOD 2017. I was the SIGMOD 2016 Undergraduate Research Chair. Our competition has concluded; we had 3X the number of submissions in 2016 compared to previous years.

Selected Projects


Zenvisage: A visualization recommendation system

Zenvisage is a tool for effortlessly visualizing insights from very large data sets. It automates finding the right visualization for a query, significantly simplifying the laborious task of identifying appropriate visualizations.

Project page here. Try it live here!


DataSpread: A Spreadsheet-Database Hybrid

DataSpread is a tool that marries the best of databases and spreadsheets.

Project page: here


Orpheus: Relational Dataset Version Management at Scale

DataHub (or "GitHub for Data") is a system that enables collaborative data science by keeping track of large numbers of versions and their dependencies compactly, and allowing users to progressively clean, integrate and visualize their datasets. OrpheusDB is a component of DataHub focused on using a relational database for versioning.

Project page: here


Populace: A Suite of Crowd-Powered Algorithms

Our work has developed a number of algorithms for gathering, processing, and understanding data obtained from humans (or crowds), while minimizing cost, latency, and error. Since 2014, our focus has been on optimizing open-ended crowdsourcing: an understudied and challenging class.

Project page: here