Conference paper, Year: 2021

A framework for statistically-sound customer segment search

Abstract

We develop S4, a Statistically-Sound Segment Search framework that combines principled data partitioning and sound statistical testing to verify common hypotheses in retail data and return interpretable customer data segments. Our framework accommodates one-sample, two-sample, and multiple-sample testing, to provide various aggregations and comparisons of customer transactions. To control the proportion of false discoveries in multiple hypothesis testing, we enforce an FDR-controlling procedure and formulate a unified optimization problem that returns customer data segments that satisfy the test for a given significance level, maximize coverage of the input data, and are within a risk capital. We develop a greedy algorithm to explore different data partitions and test multiple hypotheses in a sound manner. Our extensive experiments on four retail data sets examine the interaction between significance, risk and coverage, and demonstrate the expressivity, usefulness, and scalability of S4 in practice.
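
The abstract does not say which FDR-controlling procedure S4 enforces. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up rule (the paper's actual procedure may differ), the following shows how a set of per-segment p-values could be filtered at a given significance level. The function name `benjamini_hochberg` and the p-values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return one reject flag per hypothesis using the Benjamini-Hochberg step-up rule.

    Assumption: this is a generic FDR-controlling procedure, not necessarily
    the one used in S4.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Sort hypothesis indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank
    # Reject every hypothesis whose p-value is among the k smallest.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, one per candidate customer segment (not from the paper).
segment_p_values = [0.001, 0.008, 0.039, 0.041, 0.2]
flags = benjamini_hochberg(segment_p_values, alpha=0.05)
print(flags)
```

With these illustrative values, only the first two segments pass: the third p-value (0.039) exceeds its rank threshold of 3/5 × 0.05 = 0.03, and no later rank satisfies its threshold either.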
Main file
S4DSAA2021.pdf (2.24 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03379740, version 1 (15-10-2021)

Cite

Sihem Amer-Yahia, Laure Berti-Equille, Abdelouahab Chibah. A framework for statistically-sound customer segment search. The 8th IEEE International Conference on Data Science and Advanced Analytics, Oct 2021, Porto (virtual), Portugal. ⟨10.1109/DSAA53316.2021.9564199⟩. ⟨hal-03379740⟩