PlumX Metrics

DATA-IMP: An Interactive Approach to Specify Data Imputation Transformations on Large Datasets

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), ISSN: 1611-3349, Vol: 13591 LNCS, Page: 55-74
2022
  • Citations: 2
  • Usage: 0
  • Captures: 1
  • Mentions: 0
  • Social Media: 0

Conference Paper Description

In recent years, the volume of data to be analyzed has increased tremendously. However, purposeful analyses of large-scale data require in-depth domain knowledge. A common approach to reducing data volume while preserving interactivity is to use sampling algorithms. When working on a sample, however, the semantic context across the entire dataset is lost, which impedes data preprocessing. In particular, data imputation transformations, which aim to fill empty values for more accurate data analyses, suffer from this problem. To cope with this issue, we introduce DATA-IMP, a novel human-in-the-loop approach that enables data imputation transformations in an interactive manner while preserving scalability. We implemented a fully working prototype and conducted a comprehensive user study as well as a comparison to several non-interactive data imputation techniques. We show that our approach significantly outperforms state-of-the-art approaches in accuracy while preserving user satisfaction and enabling domain experts to preprocess large-scale data interactively.
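
The problem described above, that imputation specified on a sample loses context from the full dataset, can be illustrated with a minimal sketch. This is not the DATA-IMP algorithm, which the description does not detail; the toy dataset, the group-mean imputation rule, and the helper names `group_means` and `impute` are all assumptions made purely for illustration.

```python
from statistics import mean

# Hypothetical toy dataset: (city, temperature); None marks a missing value.
rows = [
    ("Oslo", 4.0), ("Oslo", 6.0), ("Oslo", None),
    ("Cairo", 30.0), ("Cairo", 34.0), ("Cairo", None),
]

def group_means(data):
    """Return the mean of the observed (non-missing) values per group."""
    groups = {}
    for key, value in data:
        if value is not None:
            groups.setdefault(key, []).append(value)
    return {key: mean(values) for key, values in groups.items()}

def impute(data, means, fallback):
    """Fill each missing value with its group mean, or with fallback if the group is unknown."""
    return [(k, v if v is not None else means.get(k, fallback)) for k, v in data]

# Statistics computed on the full dataset keep the per-city context.
full_result = impute(rows, group_means(rows), fallback=None)

# A sample that happens to contain no Cairo rows (here: the first three rows).
sample = rows[:3]
sample_fallback = mean(v for _, v in sample if v is not None)
sample_result = impute(rows, group_means(sample), fallback=sample_fallback)

print(full_result[5])    # missing Cairo value filled from Cairo observations
print(sample_result[5])  # missing Cairo value filled from Oslo-only statistics
```

In this sketch the sample never sees a Cairo observation, so a rule derived from it fills the missing Cairo temperature with an Oslo-based value. That is the kind of lost context that the description says impedes imputation on sampled data.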
