Publications
The Myria System
- The Myria Big Data Management and Analytics System and Cloud Services. Jingjing Wang, Tobin Baker, Magdalena Balazinska, Daniel Halperin, Brandon Haynes, Bill Howe, Dylan Hutchison, Shrainik Jain, Ryan Maas, Parmita Mehta, Dominik Moritz, Brandon Myers, Jennifer Ortiz, Dan Suciu, Andrew Whitaker, and Shengliang Xu. CIDR 2017.
- Demonstration of the Myria Big Data Management Service. D. Halperin, V. T. Almeida, L. L. Choo, S. Chu, P. Koutris, D. Moritz, J. Ortiz, V. Ruamviboonsuk, J. Wang, A. Whitaker, S. Xu, M. Balazinska, B. Howe, and D. Suciu. SIGMOD 2014.
Big Data Management Theory
- A Worst-Case Optimal Multi-Round Algorithm for Parallel Computation of Conjunctive Queries. Bas Ketsman, Dan Suciu. PODS 2017.
- What do Shannon-type inequalities, submodular width, and disjunctive datalog have to do with one another? Mahmoud Abo Khamis, Hung Ngo, Dan Suciu (2017). (under review)
- Worst-Case Optimal Algorithms for Parallel Query Processing. Paul Beame, Paraschos Koutris, Dan Suciu. ICDT 2016.
- Query Processing for Massively Parallel Systems. Paraschos Koutris. Ph.D. Dissertation, 2015.
- Skew in Parallel Query Processing. Paul Beame, Paraschos Koutris, Dan Suciu. PODS 2014.
- Communication steps for parallel query processing. Paul Beame, Paraschos Koutris, Dan Suciu. PODS 2013.
- Parallel Evaluation of Conjunctive Queries. Paraschos Koutris, Dan Suciu. PODS 2011.
Big Data Systems Research
- PipeGen: Data Pipe Generator for Hybrid Analytics. Brandon Haynes, Alvin Cheung, and Magdalena Balazinska. SOCC 2016.
- Comparative Evaluation of Big-Data Systems on Scientific Image Analytics Workloads. Parmita Mehta, Sven Dorkenwald, Dongfang Zhao, Tomer Kaftan, Alvin Cheung, Magdalena Balazinska, Ariel Rokem, Andrew J. Connolly, Jacob VanderPlas, Yusra AlSayyad. VLDB 2017.
- High-performance parallel systems for data-intensive computing. Brandon Myers. Ph.D. Dissertation, 2016.
- Compiling queries for high-performance computing. Brandon Myers, Bill Howe, and Mark Oskin. UW Technical Report 2016.
- Asynchronous and Fault-Tolerant Recursive Datalog Evaluation in Shared-Nothing Engines. Jingjing Wang, Magdalena Balazinska, and Daniel Halperin. VLDB 2015.
- From Theory to Practice: Efficient Join Query Evaluation in a Parallel Database System. Shumo Chu, Magda Balazinska, and Dan Suciu. SIGMOD 2015.
- Gaussian Mixture Models Use-Case: In-Memory Analysis with Myria. Ryan Maas, Jeremy Hyrkas, Olivia Grace Telford, Magdalena Balazinska, Andrew Connolly, and Bill Howe. IMDM 2015 (at VLDB).
- Big-Data Management Use-Case: A Cloud Service for Creating and Analyzing Galactic Merger Trees. Sarah Loebman, Jennifer Ortiz, Lee Lee Choo, Laurel Orr, Lauren Anderson, Daniel Halperin, Magdalena Balazinska, Thomas Quinn, and Fabio Governato. SIGMOD Workshop on Data Analytics in the Cloud (DanaC) 2014.
- Compiled Plans for In-Memory Path-Counting Queries. Brandon Myers, Jeremy Hyrkas, Daniel Halperin, and Bill Howe. IMDM 2013 (at VLDB).
Profiling and Visualization
- Viziometrics: Analyzing Visual Patterns in the Scientific Literature. Poshen Lee, Jevin West, Bill Howe (2017). (under review)
- Voyager: Exploratory analysis via faceted browsing of visualization recommendations. K Wongsuphasawat, D Moritz, A Anand, J Mackinlay, B Howe, J Heer (2016). IEEE Transactions on Visualization and Computer Graphics.
- Viziometrix: A platform for analyzing the visual information in big scholarly data. P Lee, JD West, B Howe (2016). Proceedings of the 25th International Conference Companion on World Wide Web.
- Perfopticon: Visual Query Analysis for Distributed Databases. Dominik Moritz, Daniel Halperin, Bill Howe, and Jeffrey Heer. Computer Graphics Forum (Proc. EuroVis) 2015.
- Dynamic Client-Server Optimization for Scalable Interactive Visualization on the Web. Dominik Moritz, Jeffrey Heer, and Bill Howe.
Big Data Management as a Cloud Service
- Elastic Memory Management for Cloud Data Analytics. Jingjing Wang and Magdalena Balazinska. USENIX ATC 2017.
- Toward Elastic Memory Management for Cloud Data Analytics. Jingjing Wang and Magdalena Balazinska. BeyondMR 2016.
- High-variety cloud databases. S Jain, D Moritz, B Howe. Data Engineering Workshops (ICDEW) 2016.
- A Vision for Personalized Service Level Agreements in the Cloud. Jennifer Ortiz, Victor T. Almeida, Magda Balazinska. SIGMOD Workshop on Data Analytics in the Cloud (DanaC) 2013.
- Changing the Face of Database Cloud Services with Personalized Service Level Agreements. Jennifer Ortiz, Victor T. Almeida, Magda Balazinska. CIDR 2015.
- PerfEnforce Demonstration: Data Analytics with Performance Guarantees. Jennifer Ortiz, Brendan Lee, Magda Balazinska. SIGMOD 2016.
- PerfEnforce: A Dynamic Scaling Engine for Analytics with Performance Guarantees. Jennifer Ortiz, Brendan Lee, Magda Balazinska, Joseph L. Hellerstein. arXiv 2016.
- SLAOrchestrator: Reducing the Cost of Performance SLAs for Cloud Data Analytics. Jennifer Ortiz, Brendan Lee, Magda Balazinska, Johannes Gehrke, Joseph L. Hellerstein. USENIX ATC 2018.
SQLShare: Interactive DB-as-a-Service