Service Provisioning through High Level, Complexity Hiding Interfaces
Citation
Guangchen Ruan, Hui Zhang, Esen Tuna and Eric Wernert, "Service Provisioning through High Level, Complexity Hiding Interfaces," 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 2020, pp. 3288-3294.
Description
Over the past decade, cyberinfrastructure communities such as XSEDE have substantially fostered and enriched knowledge discovery for scholars, researchers, and engineers from a variety of domains by enabling access to advanced computing systems, where continuing support for classic packages and parallel computing frameworks (e.g., MPI and OpenMP) is well established. However, with the rise of the "Big Data" era, the user community increasingly wants to run sophisticated, state-of-the-art distributed frameworks that handle various data-related tasks. Examples include Hadoop and Spark for data processing and analytics, Cassandra and Redis for scalable on-disk and in-memory data stores, and Apache Airflow as a distributed workflow engine, to name just a few. Although such frameworks by design provide high-level, user-friendly programming APIs for composing business logic, their deployment is often complex, requiring expertise well beyond what the majority of cyberinfrastructure users may have. To bridge this gap, in this paper we propose provisioning such frameworks through "cyberinfrastructure managed", system-wide services, in particular by applying the "high level, complexity hiding interfaces" principle when designing the interfaces exposed to users for framework setup and shutdown. We use Spark-as-a-Service at Indiana University as a concrete case study to illustrate how we applied these design principles. Furthermore, to demonstrate the generality of the design, we showcase Cassandra-as-a-Service, a work-in-progress prototype.
Date
Dec 2020
Type
Conference Paper