CADRE: A Cloud-Based Data Service for Big Bibliographic Data

Citation

“CADRE: A Cloud-Based Data Service for Big Bibliographic Data”, accepted to Proceedings of 2021 ACM International Conference on Information and Knowledge Management, Queensland, Australia. X. Yan, G. Ruan, D. Nikolov, M. Hutchinson, C. Kankanamalage, B. Serrette, J. McCombs, A. Walsh, E. Tuna, V. Pentchev, November 2021.

Description

Large bibliographic data sets hold the promise of revolutionizing the scientific enterprise when combined with state-of-the-science computational capabilities. Providing high-quality data services for large network datasets such as the Microsoft Academic Graph, which contains more than two billion citation links, poses significant difficulties for universities. Data systems based on the property graph model are capable of delivering efficient graph query services for large networks. However, real-life queries often combine multiple types of data models. To satisfy the needs of different user groups, we developed and deployed a cloud-based data system consisting of scalable graph and text-indexed query engines. For non-expert users, the property graph model also presents a technological barrier. To alleviate the steep learning curve, we designed an intuitive graphical user interface for query-building. For advanced users, a scalable notebook service in our platform provides a more flexible computing environment where the query results can be further analyzed. These systems form the data backbone of the Collaborative Archive and Data Research Environment (CADRE).

Date

Aug 2021

Staff

Guangchen Ruan
Dimitar Nikolov
James McCombs
Alan Walsh
Esen Tuna

Projects

CADRE

Services

MySQL

Type

Conference Paper