TY - JOUR
T1 - Artificial neural networks, classification trees and regression: Which method for which customer base?
AU - Linder, Roland
AU - Geier, Jeannine
AU - Kölliker, Mathias
PY - 2004/7/1
Y1 - 2004/7/1
N2 - The most commonly used modelling methods for targeting customers in direct marketing are artificial neural networks (ANNs), classification trees (CTs) and logistic regression (LR). These methods differ in how rules for the association between purchase behaviour and customer information are derived from the data. The authors investigated the predictive performances of the three methods in a competitive test in a simulated direct marketing scenario. The experimental design consisted of a number of situations comprising varying sample sizes and data complexities. The results show that the performance of all methods increased with the size of the customer base. This relation was less strong for ANNs than for CTs and LR, especially when data complexity was high. As a consequence, ANNs outperformed the other methods when sample size was small, but CTs and LR yielded better results when sample size was large, with LR being generally superior to CTs. The combination of the prediction scores of ANNs, CTs and LR into a single model revealed synergistic effects among the three modelling approaches. The combination mostly resulted in better results than any single model. This study shows that ANNs may be especially valuable for small customer bases, but might not be used in isolation for analysing larger customer bases. Irrespective of the size of the customer base and the underlying data complexity, the combination of ANNs, CTs and LR into a single model mostly resulted in the best prediction, suggesting that model combination might be a safe way of maximising predictive performance when the degree of data complexity is unknown (as is the case for most real customer bases).
UR - http://www.imi.uni-luebeck.de/de/content/artificial-neural-networks-classification-trees-and-regression-which-method-which-customer
U2 - 10.1057/palgrave.dbm.3240233
DO - 10.1057/palgrave.dbm.3240233
M3 - Journal articles
SN - 1741-2447
VL - 11
SP - 344
EP - 356
JO - Journal of Database Marketing & Customer Strategy Management
JF - Journal of Database Marketing & Customer Strategy Management
IS - 4
ER -