Myria is a distributed, shared-nothing Big Data management system and cloud service from the University of Washington. We derive requirements from real users and complex workflows, especially in science.
Extracting knowledge from Big Data today is a high-touch business, requiring a human expert who deeply understands the application domain as well as a growing ecosystem of complex distributed systems and advanced statistical methods. These experts are hired in part for their statistical expertise, but they report spending the majority of their time scaling and optimizing relatively basic data manipulation tasks in preparation for the actual statistical analysis or machine learning step: identifying relevant data, cleaning, filtering, joining, grouping, transforming, extracting features, and evaluating results.
The Myria project focuses on three goals: building a new Big Data management system, MyriaDB, that is both fast and flexible; offering this system as a cloud service; and addressing the theoretical and systems challenges associated with Big Data management as a cloud service.