CRISP-DM Methodology for Data Mining in Marketing

CRISP-DM

Data science has been around for some time. Over the years, data scientists came to recognize the need for a standard methodology and a set of best practices for data mining and analysis. Drawing on years of combined experience, they created a well-structured approach to the process. The Cross-Industry Standard Process for Data Mining, generally known as CRISP-DM, was created as an open standard to provide a clear model for analysis. It serves not just as a road map for mining and analyzing data, but also as a common framework that makes professional collaboration easier. When an organization uses CRISP-DM, its clients know what standards to expect.

CRISP-DM is made up of six phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment. These phases are nominally approached in sequence. The process itself is iterative, however, meaning that models and understanding are expected to improve as knowledge is gained along the way.

The procedure used within CRISP-DM is demonstrated in the image below. Let’s walk through this process from the perspective of analyzing data for a direct marketing organization.

Business Understanding

Many businesses, when seeking a greater understanding of their customers, clients, or target markets, will already have a set of data. For example, they may have a list of customer contacts drawn from people who have made a purchase or filled out a form online, or from data purchased through a third party.

The first stage of CRISP-DM is to gain a true understanding of the business and to identify the specific needs or goals that the organization has. Understanding a business means identifying which problems it wishes to solve.

At a high level, a business may wish to increase response rates for various marketing campaigns. During the Business Understanding phase, one of the first tasks is to drill down and define the problem more specifically. For example, the question can be narrowed to identifying which customer subsets are most likely to make repeat purchases, or how much they are willing to spend. The key point is that before any analysis begins, two things must be clear: the business’ goals at a high level, and the measurable markers that will determine the success or failure of an initiative. With each individual business goal defined in this way, the analytics plan can be correctly tailored.

Data Understanding

Once the goals of the organization are understood, the next step is to identify what exists in the data that the client already has available. A company may have a name, address, and other contact information for a customer or potential client. It may also have information about past purchases. Depending on the source of the data, there may be information about customer interests or family makeup. All this information can be extremely useful for future campaigns.

Sometimes data is included that is not particularly useful for the stated business needs of a client. For instance, a contact may live outside the targeted geographical area or may not fit the demographic profile best suited to a specific campaign. Often, especially when data comes from form-based response campaigns or mailers, information about consumers is incomplete or missing outright. Data can sometimes be massaged to fill in the missing information, but this is not always the case. Some entries may be missing too many crucial details to be of use, or the data may be inaccurate.

A direct marketer may have several different datasets. They may have a simple CSV or Excel file of contacts, but they may also have one or more databases of well-structured information. On top of this, a client may have access to various unstructured Big Data datasets stored in NoSQL databases. Each of these datasets is examined to identify exactly what it contains.
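To make the idea of incomplete records concrete, here is a minimal sketch of how missing fields might be counted during data understanding. The records and field names are hypothetical, made up for illustration, and the choice of which fields count as "required" is an assumption that would depend on the campaign.

```python
# Hypothetical contact records; field names are illustrative only.
contacts = [
    {"name": "A. Rivera", "address": "12 Elm St", "postal_code": "02139", "email": "a@example.com"},
    {"name": "B. Chen", "address": "", "postal_code": None, "email": "b@example.com"},
    {"name": "C. Okafor", "address": "9 Oak Ave", "postal_code": "", "email": None},
]


def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or str(value).strip() == ""


def missing_counts(records, fields):
    """Count how many records lack a value for each field."""
    return {field: sum(is_missing(r.get(field)) for r in records) for field in fields}


def usable(record, required):
    """A record is usable only if every required field is filled in."""
    return not any(is_missing(record.get(field)) for field in required)


fields = ["name", "address", "postal_code", "email"]
report = missing_counts(contacts, fields)

# Assumption: a mail campaign needs both an address and a postal code.
complete = [r for r in contacts if usable(r, ["address", "postal_code"])]
print(report)
print(len(complete), "of", len(contacts), "records are usable for mailing")
```

A report like this helps decide early whether a dataset is worth repairing or whether too many crucial details are missing.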

Data Preparation

Once we have developed a solid understanding of what data does or does not exist, the data is prepared and analyzed in a way that makes it useful. The data preparation process is extensive and is commonly estimated to take around 80% of project time.

Much of this process is handled through ETL (Extract, Transform, and Load) software, which is designed to pull data from multiple locations, transform it into useful formats, and then load it into a uniform location, such as a database.
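The three ETL steps can be sketched in a few lines of Python using only the standard library. This is an illustration of the idea, not a substitute for a dedicated ETL tool. The CSV contents, column names, and table layout are all assumptions invented for the example.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Extract: a real project would read files or query source systems.
# Here an in-memory CSV with made-up rows stands in for one source.
raw_csv = io.StringIO(
    "customer_id,name,last_purchase\n"
    "1, Ana Lopez ,03/15/2023\n"
    "2,Ben Wu,11/02/2022\n"
)
rows = list(csv.DictReader(raw_csv))


def transform(row):
    """Trim whitespace, cast the id, and normalize the date to ISO format."""
    purchase = datetime.strptime(row["last_purchase"], "%m/%d/%Y").date()
    return (int(row["customer_id"]), row["name"].strip(), purchase.isoformat())


# Load: write the cleaned rows into one uniform location (an in-memory database).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, last_purchase TEXT)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [transform(r) for r in rows])
conn.commit()

loaded = conn.execute(
    "SELECT customer_id, name, last_purchase FROM customers ORDER BY customer_id"
).fetchall()
print(loaded)
```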

Data Dictionary

The first part of the data preparation process involves creating a data dictionary. Information is divided into segments. To ensure that each element is clearly understandable, particularly to someone who may not be a data scientist, each element of metadata is defined in human-readable terms.

This process is applied to all initial data sources. In many cases we may discover that some data in one set duplicates data in another. Unique identifiers that tie these records together make the process considerably easier.

Examples of data that are defined and encoded include:
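As a rough sketch of how a unique identifier simplifies this step, the snippet below merges two sources keyed on a shared ID and keeps the first non-empty value seen for each field. The `customer_id` key, the source names, and the "first non-empty value wins" rule are assumptions made for illustration; a real project would choose its own conflict rules.

```python
def merge_by_id(*sources, key="customer_id"):
    """Combine records that share the same identifier into one record each."""
    merged = {}
    for source in sources:
        for record in source:
            target = merged.setdefault(record[key], {})
            for field, value in record.items():
                # Keep the first non-empty value seen for each field.
                if value not in (None, "") and field not in target:
                    target[field] = value
    return merged


# Hypothetical sources that overlap on customer 101.
crm = [{"customer_id": 101, "name": "Ana Lopez", "email": ""}]
web_forms = [
    {"customer_id": 101, "name": "Ana Lopez", "email": "ana@example.com"},
    {"customer_id": 102, "name": "Ben Wu", "email": "ben@example.com"},
]

customers = merge_by_id(crm, web_forms)
print(customers[101])
```

Without a shared identifier, the same merge would require fuzzy matching on names and addresses, which is slower and more error-prone.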

Date of last purchase
Date of mailing
Purchase dat
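One simple way to hold a data dictionary is as a list of small records, each pairing a raw column name with a human-readable label and description. The sketch below assumes hypothetical column names (`last_purch_dt`, `mail_dt`) and a date format. Only the two labels come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDefinition:
    column: str       # raw column name in the source data (hypothetical)
    label: str        # human-readable name
    data_type: str    # expected type or format (assumed)
    description: str  # plain-language meaning


data_dictionary = [
    FieldDefinition(
        "last_purch_dt", "Date of last purchase", "date (YYYY-MM-DD)",
        "Most recent order date on record for the customer.",
    ),
    FieldDefinition(
        "mail_dt", "Date of mailing", "date (YYYY-MM-DD)",
        "Date the campaign piece was sent to the customer.",
    ),
]


def describe(column):
    """Return a readable one-line definition for a raw column name."""
    for entry in data_dictionary:
        if entry.column == column:
            return f"{entry.label} [{entry.data_type}]: {entry.description}"
    raise KeyError(column)


print(describe("mail_dt"))
```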

Data Analytics

Part of the data preparation process involves identifying and creating new data points that can be calculated from existing entries. These can serve multiple purposes. With direct marketing data in particular, it is often possible to derive values that are useful across many campaigns and that can feed a predictive model or a set of data categories, such as the time between mailing and purchase, the time between purchases, or even the drive time to the closest retail location.
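Two of the derived values mentioned here, time between mailing and purchase and time between purchases, can be computed directly from dates already in the data. The following sketch uses a made-up customer record. The field names and dates are assumptions for illustration.

```python
from datetime import date

# Hypothetical customer record with one mailing and three purchases.
customer = {
    "mailing_date": date(2023, 3, 1),
    "purchase_dates": [date(2023, 3, 11), date(2023, 4, 10), date(2023, 6, 9)],
}


def days_to_first_purchase(mailing_date, purchase_dates):
    """Days from the mailing to the first purchase on or after it, if any."""
    later = [d for d in purchase_dates if d >= mailing_date]
    return (min(later) - mailing_date).days if later else None


def days_between_purchases(purchase_dates):
    """Gaps in days between consecutive purchases, in chronological order."""
    ordered = sorted(purchase_dates)
    return [(b - a).days for a, b in zip(ordered, ordered[1:])]


response_lag = days_to_first_purchase(customer["mailing_date"], customer["purchase_dates"])
gaps = days_between_purchases(customer["purchase_dates"])
average_gap = sum(gaps) / len(gaps) if gaps else None
print(response_lag, gaps, average_gap)
```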

A major part of this phase involves identifying data entries that may not be useful so that they can be removed from the process. Alternatively, records that are incomplete in some fields but complete in others (such as an incomplete address or a missing postal code) can be set aside for further analysis, since the missing data can often be reconstructed from other fields.

Below are a few examples of new data points that can be created through the analytics process.

Zip codes can be determined from a street address.
Gender information can often be inferred from respondents’ first names. Where a name is ambiguous, probability charts can still give a fairly high level of accuracy.
Mailing addresses can be converted to geo-coordinates or drive distances, which help identify geographic concentrations of target markets.
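Once addresses have been geocoded, distance to a retail location can be estimated. The sketch below uses the haversine formula, which gives straight-line (great-circle) distance rather than true drive distance, so it should be read as a rough proxy. The coordinates and store names are hypothetical, and the geocoding step itself is assumed to have happened elsewhere.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_store(lat, lon, stores):
    """Return the store closest to the given point by straight-line distance."""
    return min(stores, key=lambda s: haversine_km(lat, lon, s["lat"], s["lon"]))


# Made-up coordinates for illustration only.
stores = [
    {"name": "Store A", "lat": 0.0, "lon": 1.0},
    {"name": "Store B", "lat": 0.0, "lon": 5.0},
]
closest = nearest_store(0.0, 0.0, stores)
distance = haversine_km(0.0, 0.0, closest["lat"], closest["lon"])
print(closest["name"], round(distance, 1), "km")
```

Actual drive time would require a routing service, which is outside the scope of this sketch.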

Once basic information is identified as described above, it is possible to create groupings of individuals, based on demographic characteristics and user behavior (e.g., women between the ages of 45-55 in the Northeastern US with college degrees who show an interest in sports).
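A grouping like this amounts to assigning each person a segment key built from a few attributes. Here is a minimal sketch with made-up records. The field names and the ten-year age bands are assumptions, and a real segmentation would use whatever bands and attributes suit the campaign.

```python
from collections import defaultdict


def age_band(age):
    """Map an age to a ten-year band such as '40-49'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def segment_key(person):
    """Combine region, gender, and age band into one grouping key."""
    return (person["region"], person["gender"], age_band(person["age"]))


# Hypothetical individuals for illustration.
people = [
    {"id": 1, "region": "Northeast", "gender": "F", "age": 47},
    {"id": 2, "region": "Northeast", "gender": "F", "age": 41},
    {"id": 3, "region": "Midwest", "gender": "M", "age": 33},
]

segments = defaultdict(list)
for person in people:
    segments[segment_key(person)].append(person["id"])

for key, ids in segments.items():
    print(key, ids)
```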

Using this information, it becomes possible to create a number of useful profiles which can then be used for predictive modeling, to create well-targeted marketing campaigns.

Modeling and Artificial Intelligence

Information gained through data preparation is then used to create various models of behavior. In direct marketing, this means assembling “training data”: records of typical individuals who fit the profile of the “ideal customer.” These customer profiles then serve as the basis for models used to scale the success of campaigns.

Numerous tests are then run on this data using various machine learning tools. Each candidate model is measured to see how well it performs against the others. The ML tools themselves can pick up patterns that a human might not immediately recognize.

Multiple blind tests are run to see how well actual tests compare against the model, and how that model performs with our sample data. Through this process, new opportunities often appear for tweaking the model to improve response rates, average purchase amount, customer reactivation, etc.
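The comparison of candidate models on held-out data can be illustrated with a toy example. The two "models" below are simple scoring rules rather than trained machine learning models, the holdout records are invented, and the metric (response rate among the top-scored half) is just one reasonable choice among many.

```python
def response_rate_at_top(records, score, fraction=0.5):
    """Response rate among the highest-scored fraction of records."""
    ranked = sorted(records, key=score, reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    top = ranked[:cutoff]
    return sum(r["responded"] for r in top) / len(top)


# Made-up holdout data that the candidate rules have not "seen".
holdout = [
    {"recency_days": 10, "past_orders": 5, "responded": 1},
    {"recency_days": 20, "past_orders": 1, "responded": 1},
    {"recency_days": 200, "past_orders": 4, "responded": 0},
    {"recency_days": 300, "past_orders": 0, "responded": 0},
]

# Two hypothetical candidates: favor recent buyers, or favor frequent buyers.
candidates = {
    "recency": lambda r: -r["recency_days"],
    "frequency": lambda r: r["past_orders"],
}

results = {name: response_rate_at_top(holdout, fn) for name, fn in candidates.items()}
best = max(results, key=results.get)
print(results, "best:", best)
```

In practice the candidates would be trained models and the holdout set far larger, but the principle of ranking models by a metric on unseen data is the same.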

In good evaluation procedures, it is best not to rely entirely on machine learning. After all, machines tend to lack human intuition. For instance, in the absence of actual buyer data, a test could discover that a significant number of the people interested in purchasing fast cars are 18-year-old boys. The interest makes sense, but it is unlikely that someone so young can actually make such a purchase. For this reason, it is considered good practice to apply some subjective judgment to determine whether an approach makes sense.

Graphical Presentation

As information is processed, clear visual reporting can be very important for truly understanding the results. Graphical presentation tools become essential not just for understanding but also for detecting patterns. Where a string of numbers may not look like much to most people, trends can become immediately apparent when the same data is shown in a graph. A number of useful tools can quickly generate standard reports, ranging from basic bar charts and scatter plots to regression trend analysis.

Iterative Process

CRISP-DM is by nature iterative. Each stage not only informs future stages but also past ones. As the diagram shows, as new information is learned, it is applied to previous steps. Each portion of the process informs and re-informs the models. As data is prepared, new data points appear. As models are created and evaluated, these improve. Results from “final” deployments can be converted back into new models for future testing and evaluation.

Key to the CRISP-DM process is the principle that business understanding and data understanding inform each other. During modeling, new information is continually fed back into data preparation as we work to build new models. Each new deployment brings more business understanding and, in turn, new data.

As we continually work with the process, the models improve with the goal of creating better and better results.

Conclusion/Our Approach

Xperra applies the principles of CRISP-DM, using each of the methods described, to provide clear and workable business intelligence models from existing client data. By combining this data with external datasets, we use advanced data modeling and machine learning methods to uncover new insights that help direct marketing organizations get the most out of their markets.

We combine client datasets with widely available data that consumers have provided about themselves. Together with some already well-tested models, this allows us to draw a detailed picture of customer and prospect lists. We are able to determine, with a high level of accuracy, not only who the people in the dataset are but also what they like, and to pick up certain less obvious characteristics of their likely or actual consumer behavior.

With the information we gain, we create new machine learning models and improve their accuracy with each run-through. We may not share every step of the process with a client while it goes through the rougher stages. For instance, some models that were expected to perform well may not produce optimal results. To save a client’s time (and to avoid anything that might kill well-deserved optimism), we focus on providing information that will create real results. Similarly, we need to control for data leakage, which can produce false positives in a model and throw off the accuracy or effectiveness of actual campaigns. To help clients avoid making poor decisions based on misleading data, we run through many evaluations before presenting results.

However, we provide access to useful insights that appear during the evaluation stages, so that it is possible to understand which factors or features most matter to a campaign.

Overall, due to our adherence to CRISP-DM, organizations working with Xperra can expect high-quality information out of their data, which can be used to get the best ROI out of their marketing campaigns. The longer they work with us, the better results they can expect to receive.
