March 2016 Meeting

When
March 16, 2016 | 12:00pm - 2:00pm

TOPIC
On Gradient Boosting and Comparative Performance in Business Applications

ABSTRACT
Business applications usually involve classification problems, such as response/no-response in direct marketing, and customer attrition and fraud detection in many industries. Many techniques are used, such as logistic regression, neural networks and trees. We present Friedman's Gradient Boosting (GB) algorithm, which belongs in the data mining toolbox for both classification and regression problems. We start with the bias-variance trade-off and then present a simple GB algorithm.

We also present a classification application in direct marketing and compare the performance of GB with that of other methods, such as logistic regression and classification trees. We provide Linear dependence plots to compare model interpretation across the different methods.

We conclude with remarks on the pros and cons of the method.

Speaker

Leonardo Auslender

Independent Statistical Research Consultant

Leonardo Auslender is a statistician (and economist) with more than 25 years of business experience and SAS expertise. His area of expertise is Giga-Data Analysis and Methods, and he has written papers and given lectures on Variable Selection, Missing Value Imputation, Tree Regression, Support Vector Machines, Market-Basket Analysis, Database Marketing, CRM, GDP and (Relative Price) Inflation studies, Expectation Formations, and Productivity and Technology effects in the economy. He was a lecturer of Finance and Macroeconomics at Rutgers University. He presented two seminars on Market Basket Analysis in New York City (INFORMS and AMCIS) and a two-day seminar on Variable and Feature Selection at the NYC Direct Marketing Association in November 2004. He also presented on Collinearity and Variable Selection at the December 2005 SCMA meeting in Auburn, Alabama, and on modeling issues at the SAS M2007 and M2008 Data Mining Conferences and at INFORMS in NYC.