Interactively discovering and ranking desired tuples by data exploration

Citation Data: VLDB Journal, ISSN: 0949-877X, Vol: 31, Issue: 4, Page: 753-777

Publication Year: 2022

Metrics Details

Citations: 9
- Citation Indexes: 9
Captures: 3
- Readers: 3

Article Description

Data exploration, the problem of extracting knowledge from a database even when we do not know exactly what we are looking for, is important for data discovery and analysis. However, precisely specifying SQL queries is not always practical for tasks such as “finding and ranking off-road cars based on a combination of Price, Make, Model, Age, Mileage, etc.” This is not only due to query complexity (e.g., the queries may contain many if-then-else, and, or, and not conditions), but also because the user typically does not know all data instances (and their variants). We propose DExPlorer, a system for interactive data exploration. From the user's perspective, we propose a simple and user-friendly interface that allows the user to: (1) confirm whether a tuple is desired or not, and (2) decide whether one tuple is preferred over another. Behind the scenes, we jointly use multiple ML models to learn from these two types of user feedback. Moreover, to effectively involve a human in the loop, we need to select a set of tuples for each user interaction so as to solicit feedback. Therefore, we devise question selection algorithms, which consider not only the estimated benefit of each tuple, but also the possible partial orders between any two suggested tuples. Experiments on real-world datasets show that DExPlorer outperforms existing approaches in effectiveness.
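
The two feedback types and the question-selection step described in the abstract can be illustrated with a small sketch. This is not DExPlorer's actual algorithm: the names `Feedback`, `select_questions`, and `rank_desired` are hypothetical, the per-tuple scores are assumed to come from some already-trained model, and uncertainty (closeness of the score to 0.5) is only an assumed stand-in for the paper's "estimated benefit". The ranking step simply orders confirmed tuples so that they agree with the partial order implied by the pairwise preferences.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Feedback:
    """Holds the two kinds of user feedback named in the abstract."""
    desired: dict = field(default_factory=dict)  # tuple_id -> True / False
    prefers: set = field(default_factory=set)    # (better_id, worse_id) pairs

    def confirm(self, tuple_id, is_desired):
        # Feedback type (1): is this tuple desired or not?
        self.desired[tuple_id] = is_desired

    def prefer(self, better_id, worse_id):
        # Feedback type (2): one tuple is preferred over another.
        self.prefers.add((better_id, worse_id))


def select_questions(scores, feedback, k):
    """Pick k unlabeled tuples whose estimated P(desired) is closest to 0.5.

    ASSUMPTION: uncertainty is used here as a simple proxy for benefit;
    the paper's own benefit estimate is not given in the abstract.
    """
    unlabeled = [t for t in scores if t not in feedback.desired]
    return sorted(unlabeled, key=lambda t: (abs(scores[t] - 0.5), t))[:k]


def rank_desired(feedback):
    """Order confirmed-desired tuples consistently with the preferences.

    Raises graphlib.CycleError if the preferences contradict each other.
    """
    wanted = {t for t, ok in feedback.desired.items() if ok}
    # TopologicalSorter expects node -> predecessors; better tuples come first.
    graph = {t: set() for t in wanted}
    for better, worse in feedback.prefers:
        if better in wanted and worse in wanted:
            graph[worse].add(better)
    return list(TopologicalSorter(graph).static_order())
```

In an interactive loop, one would call `select_questions` to choose which tuples to show, record the answers with `confirm` and `prefer`, retrain the scoring model, and repeat. Tuples that no preference relates to each other may come out of `rank_desired` in any relative order, since the feedback only defines a partial order.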

Bibliographic Details

DOI: 10.1007/s00778-021-00714-0

URL ID: http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=85123090782&origin=inward; http://dx.doi.org/10.1007/s00778-021-00714-0; https://link.springer.com/10.1007/s00778-021-00714-0; https://dx.doi.org/10.1007/s00778-021-00714-0; https://link.springer.com/article/10.1007/s00778-021-00714-0

AUTHOR(S)

Xuedi Qin; Chengliang Chai; Yuyu Luo; Tianyu Zhao; Guoliang Li; Jianhua Feng; Xiang Yu; Nan Tang; Mourad Ouzzani

PUBLISHER(S)

Springer Science and Business Media LLC

TAG(S)

Computer Science
