DATA-IMP: An Interactive Approach to Specify Data Imputation Transformations on Large Datasets

Citation Data: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), ISSN: 1611-3349, Vol: 13591 LNCS, Page: 55-74

Publication Year: 2022

Metrics Details

Citations: 2
- Citation Indexes: 2
Captures: 1
- Readers: 1

Conference Paper Description

In recent years, the volume of data to be analyzed has increased tremendously. However, purposeful data analyses on large-scale data require in-depth domain knowledge. A common approach to reducing data volume while preserving interactivity is the use of sampling algorithms. However, when using a sample, the semantic context across the entire dataset is lost, which impedes data preprocessing. In particular, data imputation transformations, which aim to fill empty values for more accurate data analyses, suffer from this problem. To cope with this issue, we introduce DATA-IMP, a novel human-in-the-loop approach that enables data imputation transformations in an interactive manner while preserving scalability. We implemented a fully working prototype and conducted a comprehensive user study as well as a comparison to several non-interactive data imputation techniques. We show that our approach significantly outperforms state-of-the-art approaches regarding accuracy, preserves user satisfaction, and enables domain experts to preprocess large-scale data in an interactive manner.

Bibliographic Details

DOI: 10.1007/978-3-031-17834-4_4

URL ID: http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=85140459145&origin=inward; http://dx.doi.org/10.1007/978-3-031-17834-4_4; https://link.springer.com/10.1007/978-3-031-17834-4_4; https://dx.doi.org/10.1007/978-3-031-17834-4_4; https://link.springer.com/chapter/10.1007/978-3-031-17834-4_4

AUTHOR(S)

Michael Behringer; Manuel Fritz; Holger Schwarz; Bernhard Mitschang

PUBLISHER(S)

Springer Science and Business Media LLC

TAG(S)

Mathematics; Computer Science
