Conducting Data Audits to Ensure Integrity

By Perry and Rhonda Drake

The following article, written by Perry and Rhonda Drake of Drake Direct, appeared in the February 15, 2006 issue of DM News Online.

Whether you are involved in a new business or one that is well established, the importance of conducting data audits to ensure the integrity of your customer information cannot be stressed enough. Without clean, accurate data, you will not be in a position to make the best use of your customer information in decision making.

The issues surrounding data integrity are the same for a new business and an established one. For a new business, the information captured and how it is used will evolve and change quickly. For an established business, changes to the database and the information it contains will also occur, but over a longer time period, often at the request of marketing managers and analysts long since gone. In either case, there will be discrepancies in how fields are used and updated within and between product lines and marketing divisions. Many times quick fixes or patches will also be applied to old legacy systems in order to accommodate new business models that the current system was not designed to handle. Be aware of them.

All of these situations are reality and are therefore a part of life for any database marketer. There is no such thing as a perfect database. That is acceptable as long as the data issues are few or, when they are more numerous, they are either fixed or properly documented in the data dictionary. For example, a field may have been temporarily used by fulfillment to handle a unique billing test for a specific promotion, or certain customer actions prior to a particular year may not have been converted when changing to a new database system. Either situation may be unavoidable, but as long as it is documented, errors in the use of this data can be minimized.

Customer relationship issues occur when you use the values of a corrupt data field to personalize a promotional piece, or when a corrupt field causes you to select names for a promotion that should not have been selected. Nothing will sour a customer relationship more than incorrectly referring to someone as a new customer in your copy when in fact he or she has been a customer for many years, or sending two drastically different offers for the same product or service to a husband and wife residing at the same address because you did not properly household your file.

Maintaining a customer database is complex, and errors will occur, but with a little effort each year via an audit you can minimize the negative effect they have on the business.

In this article we will discuss in detail the following six points:

  1. Why it is important to conduct a data audit
  2. How often an audit should be conducted
  3. Whether an audit should be conducted in-house or outsourced
  4. Customer data issues to expect during an audit
  5. The steps in conducting an audit
  6. Best practices

Why it is important to conduct a data audit
There are three main reasons to audit a database:

  1. To ensure the accuracy in selection of customers and prospects for promotional offers.
    Promotional offers typically drive off of RFM data, demographic data or affinity data. Thus it is important to make sure all of these elements are accurate if you want to make offers to the right target. The importance of clean RFM data lies in the ability to properly segment “hot” leads and high value customers from the rest. Clean reliable demographic data makes sure you are targeting properly by lifestage or economic strata. Having reliable affinity data helps to ensure the relevance of your communication to a customer. Customer relationships can be damaged when a marketer continually sends irrelevant communications to customers due to bad data.
  2. To optimize deliverability of your package to ensure it gets into the hands of the right customer.
    Cleaning up deliverability fields will directly impact the ROI on marketing efforts. Postal address, email and telephone fields should be reviewed on a routine schedule (1-3 times per year). In addition to proper NCOA processing, address standardization, householding and deduping of the file, ongoing maintenance of email bouncebacks, opt-outs, and telephone verification should be done on a schedule. Sending multiple promotions to the same household is not only a waste of promotional dollars but can also damage your customer relationship. Deduping ensures a single promotion is sent to a particular household. In addition, it protects against different offers being presented to those within the same household.
  3. To ensure maximum compliance of customer preferences.
    Good direct marketers are sensitive to their customers’ preferences with respect to promotional channel. Whether this information is explicitly collected from the customer in a communication designed to establish communication protocol, or inferred from DMA preference lists (DMA mail and telephone preference lists), it is good to observe a consumer’s preference and communicate via the channel preferred. For example, some magazine customers may explicitly request that they not receive any promotional pieces other than those relating to their magazine renewal. To accommodate this request, a patch to your current system may need to be put in place. For juvenile marketers, if children’s information is on your database as recipients, you will want to protect these names to avoid any possible issue with promoting to them or making offers to them. This will involve the development of fields and procedures that must be set and maintained properly.
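
The householding and deduping idea in point 2 can be sketched in a few lines of Python. This is a minimal illustration only, assuming hypothetical field names (`last_name`, `address`, `zip`) and a crude matching key; a production process would rely on proper address standardization rather than the simple normalization shown here.

```python
import re

def household_key(record):
    """Build a crude household key from surname, street address and ZIP code.

    Assumption: the field names are hypothetical, and real householding
    would use standardized addresses instead of this simple normalization.
    """
    parts = (record["last_name"], record["address"], record["zip"])
    # Lowercase and drop everything except letters and digits so that
    # "12 Oak St." and "12 OAK ST" collapse to the same key.
    return tuple(re.sub(r"[^a-z0-9]", "", part.lower()) for part in parts)

def one_per_household(records):
    """Keep the first record seen for each household key."""
    seen = {}
    for record in records:
        seen.setdefault(household_key(record), record)
    return list(seen.values())

customers = [
    {"first_name": "Pat", "last_name": "Lee", "address": "12 Oak St.", "zip": "60614"},
    {"first_name": "Sam", "last_name": "Lee", "address": "12 OAK ST", "zip": "60614"},
    {"first_name": "Ana", "last_name": "Ruiz", "address": "9 Elm Ave", "zip": "60614"},
]
mail_list = one_per_household(customers)
```

Here the two Lee records share a key, so only one promotion would go to that address. Which record to keep for a household is a business decision; keeping the first one seen is just the simplest choice for the sketch.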

As you can see, data audits are conducted not only to ensure we are using the data correctly to maximize our profit but also to strengthen the customer relationship.

How often an audit should be conducted
How often we should conduct an audit depends on how the data is used. If you are using the data to segment and select names for future promotions via mining or to personalize your copy, then you will want to run an audit at least once a year. Keep in mind that it will be easier to conduct an audit the second time around. The first time will be much more time consuming. Remember, your customer database is your most valuable asset. Without your database you are not in business. So, the effort put into making sure your database is clean and efficient is time well spent.

For new businesses, audits will need to be done more frequently until the business becomes stable. A new business will be very dynamic. As such, data capture and usage needs will be evolving daily.

For direct marketers in the mode of acquiring new businesses, more frequent data audits will also need to be conducted. In this situation, time will need to be spent to properly integrate the new data fields and denote differences where they exist.

Whether an audit should be conducted in-house or outsourced
A data audit can either be conducted in-house or outsourced. Due to resource issues it may not be possible to conduct a database audit in-house even if there are knowledgeable, capable database analysts on staff. In the event that current demands on resources prohibit conducting an audit in-house, consider utilizing those knowledgeable about the data and its usage as participants in the audit process. The questions below should assist in determining who should be on the database audit team.

  1. Does the function of database analyst, database administrator, or an equivalent currently exist within your organization?
    If there is a knowledgeable point person in your organization who is currently well versed with the data and usage of the data, then this individual should be involved with the mechanics of the audit. The individual who knows which database variables are used in selections, personalization, and delivery can easily prioritize the most important variables to evaluate.
  2. Can reports be easily generated on a single variable at a time for the purpose of reviewing data quality?
    If the current database platform in use by your organization provides for a means of creating distributions of variables, or if the data can be extracted from the database and reported via an external software package, then you have the tools required for a data review.
  3. Does the in-house staff have adequate time available to conduct a data audit?
    If the database team is already backlogged with projects and a data audit project is a high priority, it may be better to outsource. This is especially true if the database has never undergone an audit. The first audit will most likely uncover more data anomalies requiring investigation and therefore take up more time than subsequent audits.
  4. Who has the authority to restate data that is proven to be bad?
    When erroneous data is identified, and a fix can be defined, do the audit participants have free rein to “restate” the data, or will the organization require a process for presenting recommended changes? Such a process is good practice when restated data affects fields used for current name selection, as these changes may alter the counts of segments currently available for marketing.

Customer data issues to expect during an audit
What can you expect in your audit? Pretty much anything. The most common issues you should be on the lookout for are listed below:

Dates out of range – Watch for dates that fall outside the plausible range for the field. If dates are out of range and you cannot determine the correct values, then mark them as missing. Dates may be out of range due to (1) input errors, or (2) conversion issues to a new system.
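
As a rough sketch of this check in Python, the function below marks any date outside an allowed window as missing (`None`). The window boundaries and the sample values are assumptions chosen for illustration; the right range depends on the field and on your business history.

```python
from datetime import date

def clean_date(value, earliest, latest):
    """Return the date if it lies within [earliest, latest]; otherwise None.

    None stands in for "missing". The bounds are supplied by the caller
    and are assumptions about what is plausible for the field.
    """
    if value is None or not (earliest <= value <= latest):
        return None
    return value

# Hypothetical first-order dates, including two that are clearly out of range.
first_order_dates = [date(2004, 3, 1), date(1900, 1, 1), date(2031, 7, 4), None]
cleaned_dates = [
    clean_date(d, earliest=date(1985, 1, 1), latest=date(2006, 2, 15))
    for d in first_order_dates
]
```

Only the 2004 date survives; the other entries are treated as missing until the correct values can be determined.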

Values not in domain – Non-date fields may contain values not in the domain. Every effort should be made to correct them or mark them as missing or unknown. These data fields may take on values not in the domain due to (1) input errors, (2) conversion issues to a new system, (3) old source values, for example, that are no longer being used, or (4) values that were used temporarily by fulfillment or billing to deal with system constraints.
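
One way to "correct or mark as unknown" might look like the Python sketch below. The valid domain, the correction table for retired codes, and the `UNKNOWN` marker are all hypothetical; in practice they would come from your data dictionary.

```python
VALID_SOURCES = {"MAIL", "PHONE", "WEB"}     # hypothetical domain from the data dictionary
CORRECTIONS = {"TEL": "PHONE", "M": "MAIL"}  # hypothetical retired codes mapped to current ones
UNKNOWN = "UNKNOWN"                          # hypothetical marker for unresolved values

def normalize_source(value):
    """Correct a source code where a fix is known; otherwise mark it unknown."""
    if value is None:
        return UNKNOWN
    code = value.strip().upper()        # repair simple input errors (case, stray spaces)
    code = CORRECTIONS.get(code, code)  # translate retired codes
    return code if code in VALID_SOURCES else UNKNOWN

raw_sources = ["mail", "TEL", "WEB ", "TV99", None]
cleaned_sources = [normalize_source(v) for v in raw_sources]
```

Values with a known fix are corrected, and anything that still falls outside the domain is flagged rather than silently kept.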

Missing values – Some missing values are legitimate and others are not. An example of legitimate missing data is incomplete age or income coverage; you will not have this information for everyone. Missing values that need to be corrected or noted are typically caused by conversion issues and sloppy fulfillment systems. Every effort should be made to clean these up or at least document why particular nonconforming values are present and for what groups of customers or prospects.

Additionally, you may have created missing values or incomplete data because you wanted to save time and money in the conversion of your database to a new system. While this is understandable, especially for small direct marketers, it is not advised, as you are in essence losing customer history, which is hard to replace.
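
To document which groups of customers carry the missing values, a breakdown of the missing rate by group can help. The Python sketch below assumes hypothetical field names (`system`, `income`) and treats both `None` and an empty string as missing.

```python
from collections import defaultdict

def missing_rate_by_group(records, field, group_field):
    """Return {group: share of records in that group where `field` is missing}."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for record in records:
        group = record[group_field]
        totals[group] += 1
        if record.get(field) in (None, ""):  # assumption: these two values mean "missing"
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}

# Hypothetical records tagged with the system they originated from.
records = [
    {"id": 1, "system": "legacy", "income": None},
    {"id": 2, "system": "legacy", "income": ""},
    {"id": 3, "system": "legacy", "income": "50-75K"},
    {"id": 4, "system": "current", "income": "25-50K"},
    {"id": 5, "system": "current", "income": None},
    {"id": 6, "system": "current", "income": "75K+"},
    {"id": 7, "system": "current", "income": "25-50K"},
]
rates = missing_rate_by_group(records, field="income", group_field="system")
```

A much higher rate in one group, such as records carried over from an old system, points toward a conversion issue rather than ordinary incomplete coverage.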

The steps in conducting an audit
The steps involved in conducting the audit are not complex but will be time consuming initially.

First, determine which records will be audited. For a marketer with a database of 500,000 records or fewer, a full audit can be conducted relatively easily. However, if the database is significantly larger, it makes more sense to select an nth of the database for auditing. A representative nth should provide a good cross section to highlight any problems that exist, and reduce the processing time associated with conducting the audit.
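
Selecting an nth is simple to express in Python. The sketch below is illustrative only; the function name and the `start` offset are assumptions, and the customer IDs are made up.

```python
def every_nth(records, n, start=0):
    """Return every nth record, beginning at position `start` (0-based)."""
    return records[start::n]

customer_ids = list(range(1, 21))      # hypothetical file of 20 customer IDs
sample = every_nth(customer_ids, 5)    # a one-in-five sample
```

With 20 records and n = 5, the sample holds IDs 1, 6, 11 and 16. If the file is sorted in a way that repeats on a cycle related to n, an nth can be skewed, so varying the `start` offset or sorting on an unrelated key first is worth considering.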

Next identify the fields which need to be audited. These will be the data fields used for purposes of selecting customers for promotion, personalization, reporting, and complying with customer communication preferences.

Tabulate or run distributions on all audit fields. Once the fields are identified, run tabulations/distributions of the data and compare the values against the domains in your data dictionary.
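
A tabulation of a single field against its domain might be sketched as follows in Python. The field values and the domain are hypothetical; most database platforms and reporting packages can produce an equivalent frequency report.

```python
from collections import Counter

def tabulate(values, domain):
    """Return (value, count, share, in_domain) rows, most frequent value first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        (value, count, count / total, value in domain)
        for value, count in counts.most_common()
    ]

gender_values = ["F", "M", "F", "X", "F", "M"]   # hypothetical field contents
rows = tabulate(gender_values, domain={"F", "M"})

for value, count, share, in_domain in rows:
    flag = "" if in_domain else "  <-- not in data dictionary"
    print(f"{value!r:>5} {count:>6} {share:6.1%}{flag}")
```

Reading down such a report, any value flagged as outside the data dictionary becomes a candidate for the field-by-field analysis that follows.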

Analyze results on a field by field basis. For every field noted as having missing or invalid data, assess what proportion of records is problematic and whether the problems stem from a single underlying issue or many. On an initial audit there may be many data issues that are uncovered, and an analysis of the types of problems by field can be useful to prioritize the data problems. Such a prioritization can deliver the maximum clean up impact for the time expended.
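
The prioritization can be as simple as ranking fields by the number of problem records found. The Python sketch below uses invented counts purely for illustration; ranking by record count is one possible rule, and you may prefer to weight fields by how heavily they are used in selections.

```python
def prioritize(problem_counts, total_records):
    """Rank fields by number of problem records, largest first.

    Returns (field, problem_count, share_of_audited_records) tuples.
    """
    ranked = sorted(problem_counts.items(), key=lambda item: item[1], reverse=True)
    return [(field, count, count / total_records) for field, count in ranked]

# Hypothetical audit results: problem records per field out of 50,000 audited.
problems = {"birth_date": 1200, "source_code": 300, "email": 4500}
ranking = prioritize(problems, total_records=50_000)
```

In this made-up case `email` comes first, with problems in 9% of the audited records, so cleaning it up first would give the largest return for the time spent.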

Keep in mind that the integrity of the data may also vary depending on the source. For example, you may capture age of child with the customer’s first order. For orders placed via the phone, the child’s age is confirmed, whereas this cannot be done when orders are placed via the mail. As such, this data field will be more accurate when obtained via the telephone than when obtained via the mail. As a result, this data may not be as strong when used for selection purposes if the source is not also considered.

Characterize the data problems and look for patterns to be resolved. In the analysis of field by field results, the audit may reveal that 80% of the problem fields relate to data conversion issues when fulfillment systems were changed. If that is the case, then the users of the data can feel comfortable that once the data issues are resolved, they will not recur so long as another conversion is not undertaken. If there are data issues related to manual input or other causes, then these should be investigated fully to address and correct their source.

Best practice: check the data during the database build or update
One benefit of completing a data audit is a ready-made list of important variables and valid domains. This list can be used to conduct quality assurance on data prior to its incorporation in your database. Prescreening data on load will keep bad data out of your database to begin with. While prescreening is a good step, not all data issues can be foreseen, so it will not eliminate the need to conduct an audit, but it will ensure that the audit reveals fewer problems.
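
A prescreen on load could be sketched in Python as a set of per-field rules applied to each incoming record. The field names, rules, and sample records here are hypothetical; the real rules would be drawn from the list of variables and domains produced by your audit.

```python
# Hypothetical validation rules, one per audited field.
RULES = {
    "source_code": lambda v: v in {"MAIL", "PHONE", "WEB"},
    "state": lambda v: isinstance(v, str) and len(v) == 2 and v.isalpha(),
}

def failed_fields(record, rules):
    """Return the names of the fields whose values fail their rule."""
    return [field for field, is_valid in rules.items() if not is_valid(record.get(field))]

incoming = [
    {"customer_id": 101, "source_code": "MAIL", "state": "IL"},
    {"customer_id": 102, "source_code": "TV99", "state": "IL"},
    {"customer_id": 103, "source_code": "WEB", "state": "Illinois"},
]

accepted, rejected = [], []
for record in incoming:
    failures = failed_fields(record, RULES)
    if failures:
        # Hold the record back and note why, instead of loading bad data.
        rejected.append((record["customer_id"], failures))
    else:
        accepted.append(record)
```

Only customer 101 would be loaded here; the other two records are set aside along with the fields that failed, so they can be corrected at the source before the next update.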