Did Your New Marketing Test Really Beat the Control?

By Perry D. Drake

The following article, written by Perry D. Drake, appeared in Inside Direct Mail, January 2000.
It provides direct marketers with a means of test comparison.


In the September 1999 issue of Inside Direct Mail we discussed how to assess a single test response rate by taking into account the sampling error associated with the test results. In particular, we showed how to place bounds around a single test response rate in order to assess, with a certain level of confidence, the range in which the actual response rate is likely to fall in roll-out. This method of test assessment is primarily used to determine the potential of a new "list" as a new-name generator or to assess a new product's potential for a large-scale roll-out.

When interest revolves around assessing one test result against another, however, we employ a hypothesis test. For example, you will conduct a hypothesis test when interested in determining if the lift in response or payment rate seen for the new format/offer versus the control is real or simply due to error associated with the test samples. Based on the results of hypothesis testing, specific marketing decisions can be made with confidence.

Setting the Hypothesis
Having tested a new format versus the control, you must now determine if the observed difference in response rates is meaningful and, therefore, not simply due to sampling error. When conducting a hypothesis test, you are testing the following "main" hypothesis statement for truth:

    The response rate of the new format test equals the response rate of the control format

versus the "alternative" hypothesis statement:

    The response rate of the new format test is not equal to the response rate of the control format

Based on the test sample results for both panels, you will either accept the main hypothesis or reject it in favor of the alternative hypothesis.

If you accept the main hypothesis, you infer that any observed difference in the test panel results is solely due to sampling error and is not meaningful.

If you accept the alternative hypothesis and the response rate of the test panel is greater than the control, you infer that the test has in fact beaten your control. Likewise, if you accept the alternative and the response rate of the control panel is greater than the test, you infer that the control is in fact better than the test.

Conducting the Hypothesis Test
To conduct a hypothesis test, the following information is required:

  • The response rates for both the control and test panels, which we will label Pc and Pt.
    These are the response rates obtained from your test sample results (e.g. percent responders, payment rates, percent telemarketing hits).

  • The sample sizes for both the control and test panels, which we will label nc and nt.
    These are the numbers of names tested per panel. NOTE: In order to use this formula, each sample size, when multiplied by its response rate and when multiplied by one minus its response rate, must be greater than or equal to 5. That is, nc x Pc, nc x (1 - Pc), nt x Pt and nt x (1 - Pt) must all be at least 5. These conditions ensure that both sample response rates can be treated as approximately normally distributed, which is the basis for conducting this hypothesis test.

  • The desired confidence level.
    In conducting a hypothesis test you may incorrectly conclude the test has beaten the control when in fact it will be no different than the control in a roll-out situation. The amount of risk you are willing to take is up to you. You control the amount of risk by setting the confidence level accordingly. A general rule is to set the confidence level at 90% or higher. Anything lower than this is considered too risky. For example, with an 85% confidence level, there is a 15% chance you will conclude the two test results are different when in fact they are not.
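
The sample-size condition in the note above can be checked mechanically before running the test. The following is a minimal Python sketch; the function name `meets_normality_rule` and the example panels are illustrative assumptions, not part of the original article.

```python
def meets_normality_rule(n, rate, minimum=5):
    """Return True if n*rate and n*(1 - rate) are both at least `minimum`.

    `n` is the number of names mailed in a panel and `rate` is that
    panel's response rate expressed as a fraction (e.g. 0.02 for 2%).
    """
    return n * rate >= minimum and n * (1 - rate) >= minimum


# Hypothetical panels: 10,000 names at a 2% response rate pass the rule,
# while 200 names at the same rate yield only 4 expected responders.
print(meets_normality_rule(10_000, 0.02))  # True
print(meets_normality_rule(200, 0.02))     # False
```

Both panels must pass the check before the test statistic is used.
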
To conduct a test of the main hypothesis that the two test panel response rates are the same versus the alternative that they are different, first calculate the hypothesis "Test Statistic" value denoted as TS:
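
A standard form of the two-sample test statistic for response rates, written with the symbols defined above, is sketched here. It rests on two assumptions: the difference is taken as control minus test (which matches the negative TS value in the ACME Direct example later in this article), and the standard error is computed from each panel's own rate rather than from a pooled rate. The pooled variant gives nearly the same value when the two sample sizes are similar.

```latex
TS = \frac{P_c - P_t}{\sqrt{\dfrac{P_c\,(1 - P_c)}{n_c} + \dfrac{P_t\,(1 - P_t)}{n_t}}}
```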



Given the confidence level chosen, you will reject the main hypothesis that the response rates for both panels are the same in favor of the alternative hypothesis using the following three "decision rules":
  • 90-percent Confidence Level
    Reject the main hypothesis if TS is greater than 1.645 or less than -1.645.

  • 95-percent Confidence Level
    Reject the main hypothesis if TS is greater than 1.96 or less than -1.96.

  • 99-percent Confidence Level
    Reject the main hypothesis if TS is greater than 2.575 or less than -2.575.
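
The three cutoffs above are two-sided critical values of the standard normal distribution. As a rough sketch, they can be reproduced, or computed for other confidence levels, with Python's standard library. The helper names `critical_value` and `reject_main_hypothesis` are illustrative assumptions. The computed 99% value is about 2.576, which the decision rule above states as 2.575.

```python
from statistics import NormalDist


def critical_value(confidence):
    """Two-sided standard normal cutoff for a confidence level such as 0.95."""
    # Split the remaining risk (1 - confidence) equally between both tails.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def reject_main_hypothesis(ts, confidence):
    """Apply the decision rule: reject if TS falls outside +/- the cutoff."""
    return abs(ts) > critical_value(confidence)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: reject if |TS| > {critical_value(level):.3f}")
```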

To illustrate the use of this formula, assume the marketing director at ACME Direct has conducted a new direct mail format test. The results of this new format test and the current control format are shown below:

                    Number of           Number of
                    Customers Mailed    Customers who Ordered    Response Rate
  Control Format    9,978               348                      3.49%
  New Format        10,002              416                      4.16%

The marketing director wants to determine if the lift in response for the new format test is meaningful or due to sampling error with 95% confidence. This is accomplished by performing a hypothesis test.

Using the formulas previously mentioned, he will first calculate the Test Statistic value:
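
The calculation can be reproduced with a short Python sketch. It assumes the unpooled two-proportion form of the test statistic, with the difference taken as control minus test, so a negative value means the test panel responded at the higher rate. The function name `response_test_statistic` is illustrative. Computed from the raw counts in the table, the result rounds to -2.48.

```python
from math import sqrt


def response_test_statistic(responders_c, n_c, responders_t, n_t):
    """Two-proportion test statistic, computed as control minus test."""
    p_c = responders_c / n_c  # control response rate, Pc
    p_t = responders_t / n_t  # test response rate, Pt
    # Standard error of the difference, using each panel's own rate (unpooled).
    std_error = sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    return (p_c - p_t) / std_error


# ACME Direct results: control 348 orders from 9,978 mailed,
# new format 416 orders from 10,002 mailed.
ts = response_test_statistic(348, 9_978, 416, 10_002)
print(f"TS = {ts:.2f}")

# Decision rules from the article: 1.96 at 95% confidence, 2.575 at 99%.
print("Reject at 95%:", abs(ts) > 1.96)
print("Reject at 99%:", abs(ts) > 2.575)
```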


At the 95% confidence level, the decision rule is to reject the main hypothesis if the value of TS is greater than 1.96 or less than -1.96. Since TS = -2.48 is less than -1.96, the marketing director will reject the main hypothesis and conclude that the two response rates are different with 95% confidence. In other words, the marketing director can be 95% certain that the new test format has in fact beaten the control format.

If the marketing director wants to be 99% confident in the results of his test, he will reject the main hypothesis if the value of TS is greater than 2.575 or less than -2.575. Since TS = -2.48 is not less than -2.575, he will not be able to reject the main hypothesis and must conclude the test format is not different from the control format in terms of response.
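
One way to see why the two confidence levels disagree is to compute the highest two-sided confidence level at which a given TS would still lead to rejection. The sketch below does this with Python's standard library; the function name `max_rejecting_confidence` is an illustrative assumption. For TS = -2.48 the answer is roughly 98.7%, which falls between the 95% and 99% levels considered here.

```python
from statistics import NormalDist


def max_rejecting_confidence(ts):
    """Largest two-sided confidence level at which |TS| exceeds the cutoff."""
    # Probability that a standard normal value falls within +/- |TS|.
    return 2 * NormalDist().cdf(abs(ts)) - 1


level = max_rejecting_confidence(-2.48)
print(f"TS = -2.48 leads to rejection up to about {level:.1%} confidence")
```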

Should the marketing director base his decision on the results of the hypothesis test at the 95% or 99% level of confidence? He will come to two totally different conclusions depending upon his choice.

Setting the Confidence Level
In order to determine the level of confidence to use when conducting your hypothesis test, ask yourself: How much risk am I willing to take in concluding the two test response rates are different when, in reality, they are not different?

To best illustrate the process of determining the confidence level to use, reconsider the example in which the marketing director was faced with not knowing whether to use a 95% or 99% confidence level.

His decision will be based on the amount of risk he is willing to take in the final decision.

  • If the costs associated with the new format are significantly higher than the costs associated with the control format, there is a major risk in changing to a new format if, in reality, it ends up performing worse than or the same as the control format. In this case, if the new format's results are no better or worse than the control format, erroneously changing to the new format will yield an increase in promotional costs with a zero-to-negative change in the overall response rate. He should set the confidence level high (95% or 99%) to minimize the chance of concluding the test format has outperformed the control format when, in reality, it did not.

  • If the costs associated with both promotional formats are similar, there is certainly less risk if the marketing director decides to change to the new format when, in reality, it performs worse than the control format. In this case, if the new format's results are worse than the control format, erroneously changing to the new format will yield a negative change in the overall response rate but leave promotional costs unchanged (unlike the first scenario). As a result, he will set the confidence level at "industry standard" levels (90% or 95%) since the risk in making an incorrect decision is not as high, relatively speaking, as in the first scenario.

  • If the costs associated with the new format are lower than the costs associated with the control format, there is also less risk if he decides to change to the new format when, in reality, it ends up performing worse than the control format. Erroneously changing to the new format will yield a negative change in the overall response rate, but this time promotional costs will decline, offsetting that loss. In this case, he may want to set the confidence level at 90% since, relatively speaking, the risk in making an incorrect decision is not as high as in the first or second scenarios.

Assuming the test format in this example is only moderately more expensive than the control format, a 95% confidence level seems appropriate. There is no need to be more conservative than this. As such, the marketing director will conclude that the new format has beaten his control and can feel comfortable in switching to the new format for his next major promotional effort.

Hypothesis testing can provide a powerful means of comparing two test results. As was the case with confidence intervals, a hypothesis test will not give you a definitive answer. The answer you obtain will depend on the confidence level chosen and the amount of error associated with your test panels. Use the results of hypothesis testing as a guide to test panel interpretation.


