Events Calendar

PhD Final Exam – Benjamin McCamish

Usable and Scalable Data Querying for Scientific Discovery

Scientists and engineers must analyze and query multiple large databases. Analysis of databases created by phasor measurement units (PMUs) can provide insight into the health of the power grid, thereby improving control over operations. Realizing this data-driven control, however, requires validating, processing, and storing massive amounts of PMU data efficiently, which modern systems do not always achieve. Furthermore, to use these systems, users must know formal query languages, such as SQL, as well as the structure and content of the database. Scientists, however, are often unfamiliar with query languages and with the content and structure of the databases. Finally, the information related to most queries is spread across multiple data sources, each of which represents information in a distinct form. Traditionally, users have to write programming rules to integrate the data from these sources into one database with a homogeneous structure. This takes a great deal of time and effort. Moreover, end users often lack the programming background and expertise required to write and maintain these rules.

To address these challenges, we propose novel methods to query multiple large databases easily and efficiently. We describe a PMU data management system that supports input from multiple PMU data streams, features an event-detection algorithm, and provides an efficient method for retrieving archival data. To be more usable, database systems offer keyword query interfaces, with which users do not need to know formal query languages or the content and structure of the database. Because keyword queries are inherently ambiguous, it is challenging for database systems to answer them precisely. Using extensive empirical studies, we show that users explore and learn to formulate more precise keyword queries over the course of their interaction with the database system. We propose an effective and efficient online learning algorithm that adapts to this user learning during the interaction and has convergence guarantees. Furthermore, we set forth a novel approach to progressively learning rules for integrating and querying multiple databases using end-user feedback. In our framework, each data source learns to translate its information into a form compatible with the other data sources. We show that our method delivers effective rules using a modest number of interactions with the end user.

Co-Advisor: Arash Termehchy
Co-Advisor: Eduardo Cotilla-Sanchez
Committee: David Maier
Committee: Liang Huang
Committee: Alan Fern
GCR: Kyle Niemeyer

Tuesday, August 13 at 11:00am to 1:00pm


Kelley Engineering Center, 1007
110 SW Park Terrace, Corvallis, OR 97331

Event Type: Lecture or Presentation
Event Topic: Research
Organization: Electrical Engineering and Computer Science
Contact Name: Calvin Hughes
Contact Email: calvin.hughes@oregonstate.edu
