Aditya Parameswaran

I am an assistant professor of Computer Science at the University of Illinois (UIUC). My research interests are broadly in simplifying and improving data analytics, i.e., helping users make better use of their data.

My work involves building real data analytics systems with principled foundations, designing algorithms (with formal guarantees) for the systems, as well as mining data obtained from such systems.

Biographical Sketch

Aditya Parameswaran is an Assistant Professor in Computer Science at the University of Illinois (UIUC), with affiliate appointments at the Institute for Genomic Biology and the Beckman Institute for Advanced Science and Technology. He spent a year as a PostDoc at MIT CSAIL following his PhD at Stanford University, before starting at Illinois in August 2014. He develops systems and algorithms for "human-in-the-loop" data analytics, synthesizing techniques from database systems, data mining, and human computation.

He has received the NSF CAREER Award, the TCDE Early Career Award, multiple "best" Doctoral Dissertation Awards (from SIGMOD, SIGKDD, and Stanford), an "Excellent" Lecturer award from Illinois, a Google Faculty award, the Key Scientific Challenges award from Yahoo!, five best-of-conference citations (VLDB 2010, KDD 2012, ICDE 2014, ICDE 2016, AISTATS 2017), and a Gold Medal from IIT Bombay. His research group is supported with funding from Toyota, Adobe, the Siebel Energy Institute, the NIH (2x), the NSF (3x), and Google.

News

  • April 15, 2017: Thrilled and honored to receive:
    • The NSF CAREER Award: Abstract here. Excited to pursue the vision of optimizing "open-ended" crowdsourcing! Vision paper from IEEE Data Engg. Bulletin here.
    • The TCDE (Technical Committee on Data Engineering) Early Career Award, awarded for an individual's whole body of work in the first 5 years after the PhD. The award citation: "The award is for developing new interactive tools and techniques that expand the reach of data analytics, enabling powerful data-driven discoveries by experts and non-experts alike."
  • April 15, 2017: Orpheus Updates: demo accepted at SIGMOD 2017; paper accepted at VLDB 2017 (no revisions!); open-source release here.
  • April 3, 2017: The New York Times cited the book on crowdsourced data management by Adam Marcus and me. Article here.
  • April 3, 2017: The HILDA 2017 workshop (co-located with SIGMOD) program is up.
  • March 1, 2017: Manas's TKDE paper on smart drill-down (from the "best of ICDE 2016") was accepted.
  • February 20, 2017: Vision paper on next-gen visualization recommendation systems with Manasi Vartak, Sam et al. is out at SIGMOD Record. Link here.
  • January 30, 2017: My student Silu Huang won the MSR PhD Fellowship: the first Illinois student since 2011! A great honor! Silu has recently been working on Orpheus.
  • January 30, 2017: Yihan's paper on calibrating classifiers has been accepted as an ORAL presentation at AISTATS'17!
  • January 15, 2017: Our paper analyzing a very large log of all tasks from a popular crowdsourcing marketplace has been accepted at VLDB'17. Learn all about how a marketplace operates, what the distribution of tasks looks like, and how the workers behave here.
  • January 10, 2017: Three of our key analytics tools, DataSpread, Zenvisage, and OrpheusDB, are moving out of private betas with a few interested parties to the public, available for easy download and deployment. More details and download links here: http://tiny.cc/three-tools.
  • January 1, 2017: New preprint release on Catamaran, our new fully unsupervised tool for extracting data from machine-generated data: no examples or supervision needed! Preprint here.
  • December 1, 2016: Two new paper updates:
    • Our paper on SlimFast, a data fusion algorithm, spearheaded by Theo Rekatsinas, has been accepted at SIGMOD'17!
    • Our vision paper on Open-Ended Crowdsourcing, spearheaded by the amazing Tova Milo, was accepted to appear in the IEEE Data Engineering Bulletin.
  • December 1, 2016: I've given a bunch of talks on our three tools for human-in-the-loop data analytics: a distinguished colloquium at Northwestern, a keynote at the Enterprise Intelligence workshop at KDD'16, and BigData events at Illinois and Chicago. Grab the slides here.
  • December 1, 2016: My exceptional PhD student, Silu Huang, was a finalist in the prestigious Microsoft Research PhD fellowship competition, with an in-person interview at MSR HQ -- so proud of her! Fingers crossed for the eventual outcome.
  • November 15, 2016: Many new preprints! Grab 'em while they're hot:
    • From the Zenvisage project: a paper on visualizations that incrementally improve over time, and a paper on our rapid sampling engine for visualizations.
    • From the Orpheus project: a paper describing data models and partitioning schemes for relational dataset versioning.
    • From the Populace project: a paper on consensus-based clustering of unstructured data.
    • From the DataSpread project: a paper evaluating representation schemes and indexing structures for billion-cell spreadsheets.
  • November 15, 2016: We delivered our tutorial on crowdsourced data management at HCOMP'16. Slides: part 1, part 2.
  • November 1, 2016: Thanks to Adobe for supporting our research efforts!
  • November 1, 2016: New releases for Zenvisage: a paper on the Zenvisage query language, ZQL, and our smart-fuse query optimizer, accepted at VLDB'17 here, plus a demonstration paper accepted at CIDR'17 here.
  • October 15, 2016: I am one of the chairs of the Human-in-the-loop Data Analytics (HILDA) Workshop at SIGMOD'17, along with the peerless Joe Hellerstein, from Berkeley, and Carsten Binnig from Brown. Website here. Follow us on twitter.
  • October 1, 2016: My outstanding MS student, Vipul Venkataraman, won the Siebel Scholarship: a cool cash prize of $20K. Well-deserved!
  • September 15, 2016: Participated in a fun panel on "Will AI eat us all?" with the eminent team of Sunita Sarawagi, Sihem Amer-Yahia, H. Jagadish, and Ihab Ilyas at VLDB'16. Short answer: no.
  • September 1, 2016: Thanks to NSF for funding our work on DataSpread with an NSF BigData grant. Some Illinois press here.
  • September 1, 2016: Participated in an invited workshop on the "Theory and Models for Crowds and Networks" with an eminent team of researchers in Oaxaca, Mexico. I presented a tutorial on the data management community's take on crowdsourcing. Slides here.
  • August 1, 2016: New slick websites for projects:
    • DataSpread, our spreadsheet-database hybrid: here.
    • Orpheus, our versioned database system: here.
    • Zenvisage, our visualization recommendation system: here.
    • Populace, our project on optimizing crowdsourcing: here.
  • July 15, 2016: Our new paper on producing intelligent summaries of facets of papers, with Xiang Ren, Tarique Siddiqui, and Jiawei Han, has been accepted at CIKM 2016!
  • June 20, 2016: Our paper on data exploration at ICDE 2016 was invited to the TKDE "best of conference" issue, an honor reserved for the top few papers at the conference. Great job Manas!
  • June 15, 2016: After two years of extensive collaborations with folks at the two institutes, I am now an "official" affiliate of the Institute for Genomic Biology and the Beckman Institute for Advanced Science and Technology.
  • June 1, 2016: Our paper on Squish, a tool for compression of relational datasets, was accepted at KDD 2016! Our code is open-source and available on GitHub.
  • May 1, 2016: New release on our visual data exploration platform zenvisage. Paper here, and website dedicated to Zenvisage here. Contact us if you'd like to test run zenvisage on your datasets!
  • April 15, 2016: We just received a small seed grant from the Siebel Energy Institute to develop Zenvisage in collaboration with battery scientists at Carnegie Mellon! Excited to see what happens next.
  • April 10, 2016: We received a whopping 3X the number of submissions for the undergraduate research contest. Who knows what these young researchers will accomplish next?
  • April 1, 2016: Our paper on Decibel, the storage engine underlying DataHub, was accepted at SIGMOD 2016!
  • March 1, 2016: Thrilled to be among the "List of Teachers Ranked as Excellent by their Students" at Illinois! Happy to see that students enjoy my classes.
  • January 6, 2016: Adam and I are proud to finally release a book on crowdsourced data management, a labor of love under development for two years. The book not only covers the state of the art, but also contains a survey of both industry users of crowdsourcing and managers of crowdsourcing marketplaces. We hope that this book will be the definitive reference for how crowdsourcing is used in practice. Do send us comments!
  • January 1, 2016: Our vision paper on the unsolved challenges in large-scale data crowdsourcing was accepted at TKDE.
  • December 15, 2015: Our paper on interactive exploration using a more expressive drill-down operator was accepted at ICDE 2016 in Finland.
  • November 25, 2015: Some Illinois press on our NSF-funded DataHub grant. Thrilled and honored to be working with the amazing Sam Madden and Amol Deshpande at solving the problems underlying collaborative data analytics.
  • November 15, 2015: Our paper on optimally managing worker and answer quality in crowdsourcing was accepted at SIGMOD 2016.
  • October 1, 2015: We just heard word that NIH has funded our BD2K commons supplement. Looking forward to working with folks at UChicago to improve data publication workflows!

Synergistic Activities

I am serving as a co-chair for the HILDA (Human-In-the-Loop Data Analytics) Workshop at SIGMOD 2017. Regular research papers, summaries of recent work, and vision pieces are all welcome. Website here.

I am serving as an Area Chair for SIGMOD 2017. I've served on the program committees of VLDB, KDD, SIGMOD, WSDM, WWW, SOCC, HCOMP, ICDE, and EDBT, many of them multiple times.

I am an Associate Editor for SIGMOD Record. Please consider sending us your most controversial and/or interesting papers!

I am the SIGMOD 2016 Undergraduate Research Chair. Our competition has concluded; we had 3X the number of submissions this year compared to previous years.

Selected Projects

Zenvisage: A visualization recommendation system

Zenvisage is a tool for effortlessly visualizing insights from very large data sets. It automates finding the right visualization for a query, significantly simplifying the laborious task of identifying appropriate visualizations.

Project page: here


DataSpread: A Spreadsheet-Database Hybrid

DataSpread is a tool that marries the best of databases and spreadsheets.

Project page: here


Orpheus: Relational Dataset Version Management at Scale

DataHub (or "GitHub for Data") is a system that enables collaborative data science by keeping track of large numbers of versions and their dependencies compactly, and allowing users to progressively clean, integrate and visualize their datasets. OrpheusDB is a component of DataHub focused on using a relational database for versioning.

Project page: here


Populace: A Suite of Crowd-Powered Algorithms

Our work has developed a number of algorithms for gathering, processing, and understanding data obtained from humans (or crowds), while minimizing cost, latency, and error. Since 2014, our focus has been on optimizing open-ended crowdsourcing: an understudied and challenging class of problems.

Project page: here