"Sampling" for Internal Audit, ICFR Compliance Testing

Internal Auditing and Management Testing: Sampling Techniques

By James J. Finn, MBA, CISA, and CIA, Independent Consultant

James J. Finn is the founder of an independent financial, IT, and ICFR consulting business, and has worked as a CFO, program manager (PMO), internal auditor, and compliance consultant for small, medium, and large public companies and for mutual insurance companies. Mr. Finn holds a BSBA degree in Finance with Honors and an MBA from Northeastern University, Boston, Massachusetts. Over the years, Mr. Finn has acquired more than 25 years of hands-on experience in financial positions ranging from “Management Trainee” at the First National Bank of Boston to “CFO and VP of Finance” at a commercial printer, Dynagraf Inc. As a qualified CIA and CISA, he has focused on internal controls and compliance programs for Sarbanes-Oxley since 2004. In addition to authoring this “Guideline”, he has written comments to the SEC on Sarbanes-Oxley related issues, and was the editor for a comprehensive accounting policy and procedures guideline for Digital Equipment Corporation’s worldwide internal “Product Line Management Accounting” system, and a “White Paper” titled “The Great SOX Caper”, which discusses the impact of AS-2 and AS-5 on SOX programs.

Version 1.50, 2/10/10

While this document is believed to contain correct information, the author, James J. Finn, does not make any warranty, express or implied, or assume any legal responsibility for its accuracy, completeness, or usefulness. Reference herein to any specific product or publication does not necessarily constitute or imply its endorsement, recommendation, or favoring by the author. The views and opinions expressed are those of the author.

Intellectual Property of James J. Finn. Copyright 2009 ©

For discussion and negotiation purposes only

2010

Finn Consulting LLC

James J Finn

Table of Contents

I. Guideline Overview
    A. Description
    B. Scope and Application of the Guideline
    C. Purpose of the Guideline
II. Sampling and Risk
    A. What is sampling?
    B. What is Sampling Risk?
III. Sample Bias
    A. Risk of Sample Bias
    B. How Sample Bias Arises
        1. Bias from Sampling Procedures
        2. "Crazy Eddies, Inc.", an example of fraud
IV. Use of Sampling in Auditing
    A. Sampling Methods and Procedures
    B. Statistical Sampling
        1. General Considerations
        2. Specific Considerations for Auditing
        3. Valid Statistical Sampling Examples
    C. Nonstatistical Sampling
        1. General Considerations
        2. Specific Considerations for Auditing
    D. Testing of Controls, Non Inferential Sampling
        1. Intended End Use of a Sample (inferential vs. non inferential)
        2. Sampling Steps for Tests of Controls
    E. Practical Limitations on Sampling
    F. Statistical Inference and Sample Size
    G. The Rise and Fall of Statistical Sampling in Auditing
V. Effective Statistical Sampling
    A. Probability Theory
    B. Sampling Method vs. Sample Selection Methods
        1. Techniques for selecting samples
VI. Comparative Analysis of Sampling
    1. Sampling in Financial Reporting Processes
    2. Statistical Process Control (SPC)
Bibliography

I. Guideline Overview

A. Description

This guideline surveys the concepts underlying the use of sampling techniques to strengthen the sufficiency, relevance, and reliability of evidence collected to support internal audit conclusions and management’s testing of the effectiveness of internal controls and financial reporting procedures. Evidence derived from effective sampling techniques is one way of fulfilling the requirements of “Practice Advisory 2310-1: Identifying Information”, and, because sampling is based on testing a relatively small number of items, it can be a cost-effective technique. This guideline is intended to provide practical information for improving sampling techniques. In addition, it reviews the history of sampling and provides examples of erroneous audit conclusions caused by intentional sample bias (fraud) or by unintentional sample bias (incorrect sampling training or techniques). Sampling is also viewed comparatively, to provide insight into the use of similar sampling techniques in industries where sampling is governed by military specifications and ISO commercial standards.

Sampling, as used in internal auditing, has generally relied on the PCAOB authoritative guidance contained in AU 350, the AICPA Audit Guide, and the prior AICPA guidance as provided by SAS-39, which has been amended by SAS-111. These sources are analyzed and expanded upon to address the practical application of statistical and non-statistical sampling techniques for internal auditing and management control testing.

B. Scope and Application of the Guideline

Sampling is essentially a process of gathering partial information with the expectation that it can be used either to determine a statistic (e.g., the mean or median value) of a population of interest, or to estimate the percentage of occurrence of an item’s feature (attribute) in that population. In addition to estimating population statistics, sampling can be used to determine the probability that a “lot” or “batch” of products or transactions has an “acceptable” percentage of deficiencies. This application of sampling is generally referred to as “Lot Acceptance” testing, or, in auditing, as “controls testing”. In each case, sampling and examining items for a statistic, or for the presence or absence of an attribute, is a process that an internal auditor or management can use to great effect to acquire an understanding of the population being audited. Examining a sample provides one basis for reaching a conclusion regarding a population’s statistic (mean or deviation), or regarding a control’s acceptability or effectiveness based on control attributes. Once an auditor has reached a conclusion regarding the features or statistics of interest for a population, the process of sampling and testing can be considered complete. However, the conclusion reached through sampling and examination must still be presented to and accepted by management before the effort provides any practical business value. The sampling procedures and techniques used to reach the conclusion may be questioned by management or peers who take a different view of the results.
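The “Lot Acceptance” idea described above can be illustrated with a short calculation. The sketch below is illustrative only: it assumes that each sampled item deviates independently with the same probability (a binomial model, which this guideline does not prescribe), and the function name, the sample size of 25, and the acceptance number of zero are hypothetical choices rather than values taken from any standard.

```python
from math import comb


def acceptance_probability(sample_size: int, acceptance_number: int,
                           deviation_rate: float) -> float:
    """Probability that a sample shows no more than `acceptance_number`
    deviations when the population's true deviation rate is `deviation_rate`.

    Assumes a binomial model (large population, independent items).
    """
    max_allowed = min(acceptance_number, sample_size)
    return sum(
        comb(sample_size, k)
        * deviation_rate ** k
        * (1.0 - deviation_rate) ** (sample_size - k)
        for k in range(max_allowed + 1)
    )


# Hypothetical plan: test 25 items and accept the control only if none fail.
for rate in (0.01, 0.05, 0.10):
    p_accept = acceptance_probability(25, 0, rate)
    print(f"true deviation rate {rate:.0%}: P(accept) = {p_accept:.3f}")
```

Under these assumptions, a population with a 5% deviation rate would still pass a 25-item, zero-deviation plan a little more than a quarter of the time. This is the kind of quantitative risk a small acceptance sample carries.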

Once a conclusion has been reached based on a sample, management (the auditee) can challenge the credibility of the sample, and the auditor may have to demonstrate that both the conclusion and the sample are credible. Of course, if the conclusion is favorable to the auditee, there may be no questions about the sampling method or inference. However, if the auditee were ‘Crazy Eddies’ (see Section III, B, 2), and a sample of inventory item quantities indicated that many of the inventoried product boxes were empty or did not exist, and that the count sheets were overstated, then the auditor would need to have used a defensible sampling method to support the unfavorable conclusion.

If an unfavorable result based on sampling is being challenged, a fully statistically valid sampling methodology with a high confidence level and a narrow precision interval should be used to verify the initial results. This will generally require a larger sample size. If a larger sample is needed to provide adequate credibility for an unpopular but valid negative conclusion, the benefit to stakeholders may outweigh any reasonable cost. Alternatively, if a statistically valid sample is impractical, a nonstatistical sample can still provide supporting audit evidence, provided the auditor properly considers the other elements of the audit risk structure and takes additional precautions such as narrowing the sample frame, selecting as large a sample as is practical, and selecting only the most relevant document(s) to be sampled.
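One way to see why a larger sample is easier to defend is to compare the confidence interval around the same observed exception rate at two sample sizes. The sketch below uses the Wilson score interval, a standard textbook method chosen here for illustration and not one named by this guideline. The exception counts (12 of 60 and 40 of 200) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(deviations: int, sample_size: int,
                    confidence: float = 0.95) -> tuple:
    """Two-sided Wilson score interval for a population deviation rate."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p_hat = deviations / sample_size
    denom = 1.0 + z * z / sample_size
    center = (p_hat + z * z / (2.0 * sample_size)) / denom
    half_width = (z / denom) * sqrt(
        p_hat * (1.0 - p_hat) / sample_size
        + z * z / (4.0 * sample_size ** 2)
    )
    return center - half_width, center + half_width


# Same observed rate (20%), two hypothetical sample sizes.
for found, n in ((12, 60), (40, 200)):
    low, high = wilson_interval(found, n)
    print(f"{found}/{n} exceptions: 95% interval {low:.3f} to {high:.3f}")
```

Both intervals contain the observed 20% rate, but the interval from the larger sample is noticeably narrower, which leaves less room for an auditee to argue that the true rate is acceptable.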

C. Purpose of the Guideline

The purpose of this guideline is to provide an analysis of the use of sampling methods and sample selection techniques for management testing and internal auditing. A secondary purpose is to provide a context for using sampling plans in internal auditing and management testing for compliance programs such as Sarbanes-Oxley. The scope includes researching relevant authoritative auditing guidance, and work by others on using sampling and probabilities in internal auditing. Also in scope is a comparison of control-testing sample plans with similar sample plans as they have been developed and used commercially in other industries. This comparison focuses on the special application of attribute sampling referred to as ‘Testing of Controls’, or, in manufacturing quality control, “Lot Acceptance sampling”. Acceptance sampling plans are widely used in manufacturing for quality control actions such as ‘sentencing’ product lots as either accepted or rejected based on assertions of an acceptable quality level. Also in scope are basic probability concepts related to sampling and inferential statistics. The importance of ‘sample risk’ as it relates to the selection of a ‘sample frame’ is discussed, to ensure that the sample frame is representative of the population and to highlight sources of potential error that inadequate sample sizes may introduce. A further purpose of this guideline is to move beyond an intuitive acceptance of sampling’s credibility and ease of communication, and to acquire an understanding of the quantitative risks of sampling techniques. This understanding will help the practitioner develop an unbiased and accurate view of the sampling processes and procedures being applied, and avoid becoming an unsuspecting victim of excessive sampling risk or of fraudulent and erroneous sampled data.

Sections II and III examine some of the risks associated with sampling, especially sample bias, which can distort the source of samples (the sample frame) and produce erroneous results. An unknown sample risk from sample bias occurs whenever a systemic bias or preference in selecting the sample results in items being drawn from a population in a non-random or preferential manner without that being the auditor’s intention. Examples of sample bias and its impact on audit conclusions or decision-making are presented. Section III focuses specifically on sample bias and examines some of the common causes of biased sampling.

Section IV analyzes the uses of sampling within the auditing profession and examines the history of statistical and nonstatistical sampling. In addition, this section provides an introduction to “Acceptance” sampling, which highlights the sometimes-confusing aspects of using small sample sizes to test internal control effectiveness. Also examined in this section are some of the limitations of sampling techniques, including the risks involved in projecting or inferring sample features to a population. Section IV evaluates both statistical and nonstatistical sampling methodologies and compares their relative strengths and weaknesses. It highlights appropriate ways to apply sampling when auditing financial transactions and processes. This section also examines the auditing community's acceptance of statistical sampling,[i] and the recently observed reluctance to apply the more reliable and involved statistical methods of sampling in day-to-day audit work (See Section IV, F). Section IV also explores non inferential sampling, and how the ‘Testing of Controls’ and the Sarbanes-Oxley Act have affected the auditor’s use of sampling in ways other than projecting values and percentages to a population.

Section V develops a background and understanding of the probability concepts that support statistical sampling, and examines the various methods of selecting samples, whether for statistical or nonstatistical sample plans.

Section VI compares sampling as used in internal auditing with similar attribute based “Acceptance Sampling” plans in wide use in the high-tech and other industries, and also briefly discusses the concepts used in sampling for Statistical Process Control (SPC).

II. Sampling and Risk

Internal auditing employs many methodologies and techniques to provide management with assurances that assets are being safeguarded, operations are being performed as intended, and that the organization is in compliance with laws applicable to the company. One of these techniques consists of sampling a population and testing the sample, and, as a result, increasing the auditor's knowledge about the population. However, sampling carries with it the risk that what is selected for the sample is not the same as what is in the population. This is the major risk of using samples to form conclusions. This risk that the sample does not represent the population is referred to as sample risk, or sample bias.

Using a well-designed sample plan provides one source of sufficient evidence to support an auditor’s conclusion,[ii] and using a statistically valid sampling plan provides the strongest basis for a quantitative measure of sample risk.[iii] It also provides a sample size and selection method that is representative of the population. The intuitive nature of sampling enhances the persuasive power and credibility of auditing or testing by supporting the technique of inferring or projecting features from the sample to the larger population. Furthermore, because projecting features of a sample to a population is intuitive, it is a simple way to communicate audit results across barriers such as different backgrounds, specializations, levels of authority, and complex technical specifications. This section introduces the basics of sampling, its benefits, and its risks.

A. What is sampling?

Sampling is selecting less than 100 percent of items from a population that contains the complete set of the items of interest, and evaluating only the selected items for a pre-determined value, characteristic, or attribute. The population is frequently either an account with transactions that have amounts making up the account balance, or a document that has evidence of the performance or completion of a control procedure. The Audit Standards define audit sampling as follows:

Audit sampling is the application of an audit procedure to less than 100 percent of the items within an account balance or class of transactions for the purpose of evaluating some characteristic of the balance or class.1 This section provides guidance for planning, performing, and evaluating audit samples.[iv]

This definition is specific, but it is not quite identical to some working definitions used outside the audit profession. A definition closer to one used by a business process control manager or a general business manager might include a reference to inferring or projecting the measured characteristic of the sample to the population. The audit standards do address projecting results from samples to the population, but they do so in other paragraphs rather than in the definition itself.

Using a sample to infer or project conclusions about a population characteristic is a technique universally used in many aspects of business, as well as professions and trades. For example, a criminologist may sample substances obtained during a drug raid to decide if the entire lot is an illegal substance. The medical profession relies on extensive sampling through clinical trials in order to determine if new pharmaceutical compounds are safe for human use. Aircraft engineers sample and test metals and other aircraft and technical components to determine if they will perform according to design expectations. One definition of sampling that expresses this purpose of sampling was found in an Internet article related to “Six Sigma” sampling, which defines sampling as follows:

Sampling is a method to draw inference about one or more characteristics of a large group by examining a smaller but representative selection of group items. This selection is referred to as the sample. This selection can be probability driven or judgment/non-probability driven.[v]

As discussed earlier, the initial difference in definitions (i.e., projecting characteristics from the sample to the population) is compensated for in the first sentence of a separate paragraph of the auditing standard, which states: “The auditor should project the misstatement results from the sample to the items from which the sample was selected.”[vi] Thus, on a conceptual level, the definition and purpose of sampling in AU Section 350 is essentially the same as the more general commercial definition once the concept of projecting characteristics of the sample to the population is considered. This is significant: once it is concluded that inferring and projecting sample characteristics to a population is an objective included in the auditing standard, additional care must be taken during sampling to ensure the sample is valid for inferential applications. This is especially true when selecting a sample frame and determining the sample size, since these are two areas where the representativeness of the sample affects overall sample risk. Also, since projecting or “inferring” characteristics from the sample to the population is part of the standard, it is critical that practitioners understand the difference between “Classical” sampling and “Acceptance” sampling, which is discussed in Section IV, D, Non Inferential Sampling.

B. What is Sampling Risk?

Sampling risk is a significant component of overall audit risk. Audit risk consists of both sampling risk and other risks, which may be procedural in nature, or related to other factors involved in the audit. The auditing standards address audit risk as follows:

The uncertainty inherent in applying audit procedures is referred to as audit risk. Audit risk includes both uncertainties due to sampling and uncertainties due to factors other than sampling. These aspects of audit risk are sampling risk and non-sampling risk, respectively.3 [As amended, effective for audits of financial statements for periods beginning on or after December 15, 2006, by Statement on Auditing Standards No. 111.][vii]

Sampling risk occurs because there is a probability that the items selected as a sample are not representative of the population. As a result, the sampled items’ deviations, or their percentage of attribute successes or failures, may not be the same in the sample as they are in the population. Thus, the auditor evaluating the characteristics of the sample may not reach the same conclusions they would reach if they examined the entire population.[viii] Samples that are not representative of the population can be selected as the result of errors in the sample procedures, such as the determination of the sample frame and/or the sample size. These sampling errors can come from multiple sources, but the increased sample risk is usually caused by not including a sufficient representation of the population’s variability or attribute proportions in the sample. A sample may be biased systemically and, as a result, not be representative of the population. A sample may be too small and, as a result, not represent all dispersions or attributes of items in the population. Sample bias may result from an intentional or unintentional selection of a stratum or of an incorrect portion of the true population, or from a subjective but inadequate determination of sample size. Of the potential sample errors mentioned, sample bias caused by an incorrect sample frame selection, or by the determination of an inadequate sample size, can be a very significant source of sampling risk. The possibility that an auditor may reach an incorrect conclusion because of a sampling error (sample risk) is addressed in the audit standard as follows:

Sampling risk arises from the possibility that, when a test of controls or a substantive test is restricted to a sample, the auditor's conclusions may be different from the conclusions he would reach if the test were applied in the same way to all items in the account balance or class of transactions. That is, a particular sample may contain proportionately more or less monetary misstatements or deviations from prescribed controls than exist in the balance or class as a whole. For a sample of a specific design, sampling risk varies inversely with sample size: the smaller the sample size, the greater the sampling risk.[ix]
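The inverse relationship between sample size and sampling risk quoted above can be sketched numerically. The example below assumes a binomial model (independent items drawn from a large population). The function name, the 5% population deviation rate, and the 2.5-point tolerance are hypothetical values chosen for illustration. It computes the chance that a sample's deviation rate lands more than the tolerance away from the population rate.

```python
from math import ceil, comb, floor


def misleading_sample_probability(sample_size: int, true_rate: float,
                                  tolerance: float) -> float:
    """Probability that the sample deviation rate differs from the true
    population rate by more than `tolerance` (binomial model)."""
    eps = 1e-9  # guards against floating-point noise at the boundaries
    lowest_ok = max(0, ceil(sample_size * (true_rate - tolerance) - eps))
    highest_ok = min(sample_size,
                     floor(sample_size * (true_rate + tolerance) + eps))
    p_close = sum(
        comb(sample_size, k)
        * true_rate ** k
        * (1.0 - true_rate) ** (sample_size - k)
        for k in range(lowest_ok, highest_ok + 1)
    )
    return 1.0 - p_close


# Hypothetical population deviation rate of 5%, tolerance of 2.5 points.
for n in (25, 100, 400):
    risk = misleading_sample_probability(n, 0.05, 0.025)
    print(f"sample size {n:>3}: P(sample rate off by > 2.5 points) = {risk:.3f}")
```

With these illustrative inputs, the probability of a misleading sample falls steadily as the sample grows from 25 to 100 to 400 items.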

Sampling risk is classified in the audit standards by the potential impact of sampling errors on the conclusions that an auditor may reach. These risks of potentially incorrect conclusions are divided into two categories. The categories relate to conclusions affecting either substantive testing or the testing of controls.

Substantive testing risks are:

a) The risk of incorrect acceptance.
b) The risk of incorrect rejection.

Control testing risks are:

a) The risk of assessing control risk too low.
b) The risk of assessing control risk too high.

Refer to the auditing standards for details.[x]

It is important to note that, for financial auditors, assessing control risk either too low or too high affects the level of substantive testing that an external auditor determines is required. Thus, if control risk is assessed too high, it simply means that more substantive testing will be applied than may have been needed. However, for control procedure design, if management testing assesses control risk as either too high or too low, the process control design can be completely ineffective, or can result in an excessively costly overdesign of the control procedures. Because of this difference in the end use of the sampling and testing results, management testing for the purpose of designing and implementing internal control processes and procedures should be more robust, and should employ larger sample sizes, than would be adequate for acceptance testing (control testing). The elimination or reduction of sample bias is so important when the results are to be used for design purposes that 100% testing, or a statistically valid sample size, should be used wherever possible rather than acceptance sampling techniques. Given the importance of sample bias, and the impact sample risks can have on audit conclusions, the sources and relevance of sample bias are reviewed next.



III. Sample Bias

A. Risk of Sample Bias

Regardless of the particular techniques employed, all sampling methods base their efficacy on selecting a representative sample from a larger population of items. The sample must be selected from a population with the necessary attributes or values of interest. Otherwise, the sampling results will fail to represent accurately the characteristic values or attributes being tested for the audit. A sample frame can be misleading whenever it does not include items with the attribute that needs to be tested, or if it only contains an abnormally small portion of items with the attribute, or because there is an intentional or unintentional preference in the items selected to be included in the sample frame. If a sample frame is incorrectly selected, or an inadequate sample size is used, the entire sample can be flawed and may not be representative of the population. Even if the sample frame is selected from a relevant population, and the sample size is adequate, there is still a possible source of sample bias. The potential source of sample bias is the introduction of any systemic preference when selecting (picking or drawing) the sample items. All of these sampling variables impact the reliability of the sample when it is used for inferences about the population.

A sample frame can introduce an unknown bias, and can be considered inadequate for projections or inferences about the control’s original target population, whenever it is either not as complete as it is assumed to be (e.g., accounts payable vouchers that include only those paid by check, but which should also have included those paid by cash or wire transfer), or is not composed of the type of items required to test a control attribute (e.g., vouchers as opposed to disbursements). Also, when the items available for selection within a sample frame depart systemically from the randomness anticipated, or when the method of selecting the samples favors some items over others in the population, there is a high probability of sample bias. The major risk in biased sampling is that the bias is unknown. Intentional stratification, by contrast, is a known and purposeful form of sampling bias designed to reduce sampling risk in a population with a high dispersion (variation of item values); it is not a form of unknown sampling risk. However, it does redefine the sample frame (population) to a stratum.
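The accounts payable example can be made concrete with a small sketch. All counts and exception rates below are hypothetical figures invented for illustration, not data from any engagement. The point is that a frame restricted to check payments misstates the exception rate of the full voucher population, and that random selection within the incomplete frame cannot repair this.

```python
import random

# Hypothetical population of 1,000 accounts payable vouchers.
# Each voucher is a tuple: (payment_method, has_exception).
population = (
    [("check", i < 16) for i in range(800)]    # 16 exceptions in 800 checks
    + [("wire", i < 20) for i in range(120)]   # 20 exceptions in 120 wires
    + [("cash", i < 10) for i in range(80)]    # 10 exceptions in 80 cash items
)


def exception_rate(vouchers) -> float:
    """Share of vouchers flagged with a control exception."""
    return sum(1 for _, has_exception in vouchers if has_exception) / len(vouchers)


# Incomplete frame: only vouchers paid by check are available for selection.
frame_checks_only = [v for v in population if v[0] == "check"]

true_rate = exception_rate(population)          # rate in the target population
frame_rate = exception_rate(frame_checks_only)  # rate in the incomplete frame

rng = random.Random(2010)  # fixed seed so the illustrative draw is repeatable
sample = rng.sample(frame_checks_only, 60)

print(f"target population exception rate: {true_rate:.1%}")
print(f"check-only frame exception rate:  {frame_rate:.1%}")
print(f"rate in a random sample of 60 from the frame: {exception_rate(sample):.1%}")
```

In this made-up population the check-only frame carries less than half the exception rate of the full population, so even a perfectly random sample from that frame would understate the problem.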

Having an unknown or unintended systematic preference when selecting items in the sample is sample bias, and this is a major risk inherent in the use of sampling. Sample bias can produce samples, and subsequent testing results, that are not representative of the actual or intended “target” population. In addition, if a sample frame consists only of a stratum of population items, or, conversely, consists of a whole mix of different items when a stratum is expected, this can be considered sample bias. It is a biased presentation of items to be sampled because it is not what the tester is expecting, i.e., it is not the target population or target sample frame. A sample drawn in a biased manner may represent only the sample’s features, and not the intended population’s features. Since the sample does not represent the target population, projections to the population can be incorrect, or flawed to the extent that the sample is not a “mini” model of the population. Because sampling bias may be unknown or unquantified, any inference or projection to the population based on the biased sample may lead to an incorrect conclusion, and to correspondingly incorrect management decisions based on the testing results.



Occurrences of intentional or unintentional sample bias may create easily communicated misinformation with high intuitive credibility, resulting in incorrect conclusions and potentially incorrect management decisions. In some cases, sampling may appear correct intuitively but not be representative in the actual environment because of an unknown sample bias created by an incorrect selection methodology (see Section III, B, below).

The risk of sample bias affecting audit results arose as sampling became necessary, which happened as corporations became larger, more global, and more complex. Financial transaction volume, complexity, incompatible processing operations, and corporate asset size all increased because of company growth driven by product demand and by mergers and acquisitions. Combined companies create their own unique sampling problems when dissimilar workflows or business processes operate at different levels of control effectiveness but produce a common output. This combination of internal ‘organic’ growth and the sudden need to combine different transaction processing systems (absorbed with the acquired businesses) increased both the number of transactions to be audited and the differences in the workflows that process them. Combining workflows and systems disrupted existing legacy processes and increased the possibility that ‘end transaction’ sample frames would be more representative of one workflow than of another. Consequently, as the volume and complexity of 100% transaction testing became unmanageable, the auditing profession had little choice but to adopt sampling techniques. Along with sampling came the insidious threat of sample bias, and the need to guard against its effect on audit conclusions. To minimize the effects of sample bias, it is necessary to understand its origins and then to develop a structured method of designing sample frames and sampling procedures, so that samples selected from a population accurately represent the population and contain the features or attributes intended to be tested.

This guideline focuses on sampling methodologies and techniques to be used by the internal auditor or by individuals performing management testing for compliance, and on the impact of those methodologies on the reliability and usability of statistical or nonstatistical samples that support conclusions in auditing and management testing. This work explores examples of sampling techniques ranging from intuitive nonstatistical checking of a few items to the design of a valid statistical sampling plan. Intuitive, subjective, or judgmental selections of samples (nonstatistical sampling) may be used to gather information, or to support and document an auditor’s conclusion. The auditor’s conclusion, when it uses nonstatistical sampled information as supporting documentation, is generally also based on other analytic information obtained during the audit. While nonstatistical sampling is an acceptable documentation methodology, a valid statistical sample can reduce sampling risk by using probability mathematics to quantify the risk and to infer characteristics from a sample to the population.

B. How Sample Bias Arises

1. Bias from Sampling Procedures

Sample bias in classical variable sampling or attribute sampling can arise from a judgmental preference in how samples are selected, from the sample frame, or from a lack of consideration of the variability of items in the population. Bias introduced by using an ineffective sample size can result from incorrect estimates of the occurrence rate of failures of items in the population. Since randomness is widely known as a method of selecting samples, when an inference is planned for a population’s average value (mean), or when inference is used to determine the proportion of an attribute feature in a population, the practitioner frequently considers the most appropriate method of achieving “randomness” in the sample selection procedures. However, less effort is usually spent evaluating the variability or dispersion of the population items, or the estimated occurrence rate for an attribute feature. This oversight can introduce a sample bias: an incorrect sample size that is not representative of the population, and that therefore contains too little or too much of a population trait. The sample size may be incorrect because the variability of the population items (or the proportion of the attribute) is one of the critical variables in sample size calculations. When the variation or proportions are not reasonably estimated, the calculation may produce a sample size that is too small to reflect the actual dispersion or proportion in the population, and the sample is thereby biased and not representative. Thus, a bias caused by not including enough items to represent the true population increases the sampling risk. Although nonstatistical sampling risk related to variability in the population can be reduced by approximating or estimating the key variables and using appropriate statistical sampling tables, or by using formulas to ensure the sample size is adequate, statistical sample size calculations still require a best estimate of the standard deviation (sigma) or the expected occurrence rate (percentage) in the population.
These requirements for probability based sample size calculations may be addressed by estimating the required variables based on an analysis of the population, or, preferably by taking a pilot sample (possibly 10 to 15 items) to get a level of comfort regarding these variables. Creating this initial starting point by having a supportable estimate of these parameters is a very important, but often overlooked part of an effective sampling procedure. However, while a pilot sample is desirable when sampling for population statistics such as a population mean and precision interval, or the percentage(s) of attribute feature(s) in the population, it is less critical when there is a previous year’s sample history.
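The role of the pilot sample can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: it assumes the common normal-approximation formula n = (Z x sigma / E) squared, where E is the acceptable precision in dollars, and it applies no finite population correction. The function name and the pilot amounts are hypothetical.

```python
import math
import statistics


def sample_size_for_mean(sigma, precision, z=1.96):
    """Normal-approximation sample size for estimating a mean.

    sigma     -- estimated standard deviation of the population items
    precision -- acceptable margin of error, in the same units as sigma
    z         -- Z factor for the confidence level (1.96 is roughly 95%, two sided)
    """
    return math.ceil((z * sigma / precision) ** 2)


# Hypothetical pilot sample of 12 invoice amounts (illustrative values only).
pilot = [120.0, 95.5, 210.0, 180.25, 75.0, 160.0,
         134.5, 99.0, 250.0, 142.0, 88.75, 175.0]

# The pilot supplies a supportable estimate of sigma.
sigma_estimate = statistics.stdev(pilot)

# Sample size needed to estimate the mean within plus or minus $20.
n = sample_size_for_mean(sigma_estimate, precision=20.0)
print(f"estimated sigma = {sigma_estimate:.2f}, sample size = {n}")
```

Because sigma enters the formula squared, doubling the estimated standard deviation roughly quadruples the required sample size, which is why an unsupported guess at this parameter can leave the sample far too small.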

However, “acceptance-sampling” sample sizes are smaller by design since they represent the number of items to be selected without finding a specified number of deficiencies (c). This is because acceptance sampling is based on the laws of probability (binomial, or hypergeometric) and uses smaller sample sizes to determine the “chance” that a defect will be found in a given sample. Attribute acceptance sampling plans are used in auditing when performing probabilistic sampling for tests of controls to determine an estimated level of control risk.xi When “acceptance sampling” sample sizes are being used, the results should not be projected as values or proportions to the population without recalculating a sample size for confidence level and interval based on the normal probability curve and standard deviation. This will facilitate an accurate forecast with known statistical limits. Concurrent with the results of the sample size calculations, an additional source of sampling bias, the “what” or sample frame to be sampled is simultaneously designed. The “sample frame” is sometimes (incorrectly) assumed to be obvious. The “what” that is to be sampled is referred to as the sample frame or population, and is the only population that the sample will be representative of. It is the only population to which evaluation results can be inferred or projected. Selecting an incorrect sample frame may be one of the less noticed sample bias threats to effective use of sampling. Sample bias caused by an inadequate definition of the sample frame usually results from not comparing the “what” that is being sampled with the “what” of the control attribute being tested.
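The acceptance-sampling idea described at the start of the preceding paragraph can be illustrated with the binomial distribution. The sketch below is an assumption-laden example, not the method behind any particular published table: it searches for the smallest sample size at which finding no more than c deviations would be unlikely if the true deviation rate were as high as the tolerable rate. The function names and the chosen rates are hypothetical.

```python
from math import comb


def prob_at_most_c(n, rate, c):
    """Binomial probability of finding c or fewer deviations in n items."""
    return sum(comb(n, k) * rate ** k * (1 - rate) ** (n - k)
               for k in range(c + 1))


def acceptance_sample_size(tolerable_rate, confidence, c=0, max_n=10000):
    """Smallest n where P(at most c deviations) <= 1 - confidence.

    If the sample then shows c or fewer deviations, the auditor has the
    stated confidence that the true rate does not exceed tolerable_rate.
    """
    risk = 1 - confidence
    for n in range(c + 1, max_n + 1):
        if prob_at_most_c(n, tolerable_rate, c) <= risk:
            return n
    raise ValueError("no sample size found below max_n")


# Zero allowed deviations, 5% tolerable rate, 95% confidence.
print(acceptance_sample_size(0.05, 0.95))
# Zero allowed deviations, 10% tolerable rate, 90% confidence.
print(acceptance_sample_size(0.10, 0.90))
```

Sizes obtained this way answer a pass or fail question about the control. They are not designed to estimate a population proportion with a stated precision, which is consistent with the caution above against projecting acceptance-sampling results to the population.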



A simple example illustrates this danger. Consider a set of internal controls in accounts payable where an auditor is instructed to verify that a control activity requiring all checks over $5000 to be approved by a second person is operating effectively. Suppose that the recommended sample was a minimum of 25 checks out of weekly check runs averaging 1000 checks per week. The checks were selected by dividing the most recent 1000 check numbers by 25 to obtain a sampling interval of 40, and then selecting every 40th check until 25 had been selected. Once selected, the checks were reviewed to verify that both a primary and a secondary approver had signed every check over $5000. When tested, all nine checks over $5000 in the sample had been signed by the two appropriate authorizers. However, this test is not as credible as it could be because of its “Achilles heel”: a sample bias caused by a poorly designed sample frame. The sampling technique was biased because “what” was sampled (the sample frame) was all checks issued, rather than all issued checks over $5000. Including many checks that could not be tested for the second signature biased the sample, and left only nine testable items instead of the intended 25. The objective of the audit was to determine whether a control activity requiring a second signer on checks over $5000 was effective. This is an application of sampling for acceptance testing, not of sampling to determine the percent of checks over $5000. The control activity being tested applies only to the portion of the check population over $5000; correspondingly, the sample frame should consist only of those documents (i.e., checks over $5000). This is because the proportion of ‘defects’ subject to the binomial distribution testing procedure is the percent of checks that should have been signed but were not (occurrences of failed control).
The first sample frame included a population of checks that, by definition, could not have the defect being tested for acceptance. An unbiased sample frame could have contained only checks over $5000. This is preferred since the attribute success or failure to be verified is the dual signatures on all checks over $5000. Thus, the sample frame to test for dual signatures should be all checks over $5000.
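The difference between the two sample frames in the check example can be reproduced with a short sketch. The check data below are invented purely for illustration (every third check is set above $5000 so that the numbers mirror the example), and the function name is hypothetical. A fixed starting point is used for reproducibility; in practice the start would normally be chosen at random.

```python
def systematic_sample(frame, sample_size, start=0):
    """Select every k-th item from the frame, where k = len(frame) // sample_size."""
    interval = max(1, len(frame) // sample_size)
    return frame[start::interval][:sample_size]


# Hypothetical weekly check run: 1000 checks, every third one over $5000.
checks = [
    {"number": 10000 + i, "amount": 7500.00 if i % 3 == 0 else 450.00}
    for i in range(1000)
]

# Biased frame: all checks issued (sampling interval of 40).
all_checks_sample = systematic_sample(checks, 25)
testable = [c for c in all_checks_sample if c["amount"] > 5000]

# Corrected frame: only the checks that are subject to the control.
frame_over_5000 = [c for c in checks if c["amount"] > 5000]
corrected_sample = systematic_sample(frame_over_5000, 25)

print(len(all_checks_sample), len(testable))  # 25 selected, 9 testable
print(len(corrected_sample))                  # 25 selected, all testable
```

With the frame restricted before selection, every one of the 25 sampled items can actually be tested for the second signature.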

The sample frame selected is the population boundary, or ‘set of items’, from which samples are intended to be selected. It is also the population to which projected conclusions will be relevant. However, in the case of “Crazy Eddies Inc.”, company personnel were manipulating the sample frame and counts. The sample frame being counted did not always consist of valuable electronic products; in some cases empty boxes were included in the counted items in order to increase the quantity of inventory items being valued for financial reporting of inventories. What is actually in the sample frame is controllable by company personnel, and its manipulation has been a frequent source of intentional sample bias through fraud. Crazy Eddies is reviewed in the next section.

2. "Crazy Eddies, Inc.", an example of fraud

This is an example of a financial fraud based on misstated inventory quantities. Auditing the ‘quantity’ of items counted is an important internal control for companies with inventories or investment-based financial assets. Confirming quantities of items that have been used as the basis for financial reporting of asset values is one of the viable applications of sampling. As such, sampling can be used as an effective and efficient technique to verify product inventory counts or investment-certificate ‘counts’ for financial valuation. An internal auditor’s verification of quantities counted is an appropriate use of sampling techniques since ‘quantity counts’ is an area historically targeted for fraud and/or unintentional misstatement. Financial reporting based on item counts is susceptible to misstatement because counting item quantities is labor intensive, boring, susceptible to human error, and historically a source of fraudulent reporting. Thus, although this example focuses on a fraudulent manipulation of inventory quantity counts, it is also an ideal example of the kinds of counting errors and misstatements that an internal auditor may encounter in the field and that could be detected by using sampling procedures to audit inventory quantities. In this example, neither the counted population nor the ‘count sheets’ of recorded quantities were representative of the true inventory population. The inventory population “sample frame” included empty boxes and previously counted merchandise moved from other locations, which in many instances management fraudulently presented as “valuable merchandise”. In addition, company personnel altered the quantities recorded on the manual count sheets when they were left unattended. An effective internal audit sampling procedure could have defined a target sample frame that included verifying that the boxes sampled at a specific location contained actual merchandise, and could then have performed sampled recounts to confirm whether the original recorded counts were correct.

“Crazy Eddies, Inc.” was a consumer electronics products distributor operating on the east coast of the United States, which was managed by Eddie Antar (Eddie) and a tight group of family members. “Crazy Eddies” financial reporting fraud was perpetrated, in many instances, by inventory quantity misrepresentations. These misrepresentations were based on both the quantity counts recorded and on “What” (the sample frame) was being counted. While sampling techniques related to inventory count verifications by internal auditors are not specifically referred to in the referenced legal or commercial documentation related to the trial, this is the type of fraud or misrepresentation that points out the importance of accurately defining “What” is being counted, whether in total or for a sample. This example also highlights the importance of audit sampling to confirm that quantities recorded on count sheets are accurate. Rather than sampling and verifying what was being counted, it was assumed that the items were as management represented (i.e., expensive commercial electronics devices in their original boxes).

During its existence as a business, “Crazy Eddies” management and other family members engaged in multiple instances of financial fraud, both as a private business and, after its initial public offering (IPO), as a publicly traded company. Sam E. Antar, the CFO of Crazy Eddies, and now a convicted felon as a result of the fraudulent activities at the business, produced the 10-K and 10-Q filings required during the years Crazy Eddies operated as a public company. Between 1984 and 1987 inventory counts were fabricated or modified, and empty boxes were counted as if they were filled with higher priced merchandise. Through this fraud in inventory counting, as well as revenue manipulation, Eddie Antar managed to misstate SEC financial reports to his advantage. The fraudulent inventory counting (and the apparent lack of effective sampling controls over counting) allowed the fraud to remain undetected until new management took over the company in 1987. Since his involvement and conviction for SEC violations, Sam Antar has become an advocate for reducing financial fraud and maintains a web site focusing on improving financial reporting.xii

Some of the most effective and long running frauds involving financial misstatements are simple and involve miscounting or misevaluating inventory. If sampling is used as an audit technique to test for this type of problem, the ‘Achilles’ Heel’ of the sampling procedures will frequently be the sample frame, since the sample frame is the population available to be sampled. In a similar manner, sampling frames can be fraudulently defined to produce sampling results that are not representative of the intended or target audit/test population. If the intent is to commit a fraud, the distortions in a sample frame may be willfully concealed. For example, the inventory fraud started soon after the Crazy Eddies IPO in September 1984, and continued undetected until its full impact was realized in 1987. The full amount of the inventory valuation ‘shortfall’ was not known until after a new executive team took over management of the company. The inflation of inventory counts resulted in an initial two million dollar fraudulent increase in inventory valuation. According to the transcript from the trial at which the SEC was the plaintiff, this affected the 1985 fiscal year 10-K and represented a material misstatement of profitability:

For the 1985 fiscal year, Crazy Eddie filed with the SEC an annual report on Form 10-K. In that filing, Crazy Eddie reported that it earned pretax income of $ 12.6 million. This figure, however, was fraudulently inflated by $ 2 million, or almost twenty percent, as a result of the warehouse inventory inflations. When Eddie sold the stock on behalf of the Relief Defendants, he was fully aware of the fraudulent inventory inflation and its effect on Crazy Eddie's pretax income.xiii

This initial two million dollar overvaluation of inventory was achieved by misstating the inventory counts. However, this initial misstatement was small compared to the cumulative inventory misstatement discovered in 1987. In 1987, when the new management team valued the inventory, they discovered a $ 60 million shortfall. This is also mentioned in the Court transcript as follows:

On November 6, 1987, Crazy Eddie's shareholders voted to remove the company's existing management backed by Eddie. A new management installed by Zinn and Palmieri took control of the company. n30 Soon thereafter, it discovered a $ 40 million inventory shortfall. This estimate was subsequently increased to $ 60 million. Zinn and Palmieri ultimately disclosed this inventory shortfall to the public.xiv

This huge shortfall, which was due mainly to counting distortions, has resulted in the auditors for Crazy Eddies being criticized for not doing their job. The criticism focuses on not auditing all inventory locations simultaneously, and on leaving audit count sheets available to be changed overnight. One documented criticism of the inventory audits is an article by Joseph T. Wells indicating that there were numerous instances of inventory fraud involving counting that could have been caught.

Rather than climb over boxes in the warehouse, the auditors asked employees to assist them. Crooked employees volunteered. An employee would stand on top of a stack of television sets, for example, and call down the count to the auditors. If there were 10 sets, the worker would claim there were 25. Repeated many times, this clever trick helped to greatly increase the inventory count. The message here is obvious: If you're supposed to verify the inventory count, then you must observe it. xv



IV. Use of Sampling in Auditing

Unfortunately, sampling is rarely applied consistently in the auditing or compliance-testing field. The underlying understanding of sampling principles and concepts often varies from person to person. Different auditors may draw different conclusions from the same or similar data. This is due to the fact that the level of understanding and experience in applying sampling techniques varies greatly from one individual to another.

In addition, the acceptance of nonstatistical sampling as evidentiary documentation in auditing has complicated the situation.

Comment: The knowledge that, in many instances, sampling did not have to conform to a ‘valid statistical sampling’ approach may have removed most of the corral fences around the use of sampling. In response to the acceptance of nonstatistical sampling as evidentiary material,xvi some auditors began to wander from the underlying statistical and probability concepts to an approach based predominantly on judgment and intuitive risk valuation.

The previous sections II and III focused on sample risk and the element of ‘sample bias’. Bias is a pervasive problem that must be guarded against by the auditor when designing a sample plan, especially when determining the sample frame and sample size, both of which contribute to sample risk when unknown or unmanaged. The sample frame should be clearly defined for a sampling plan, and the sample risk should be quantified and, although not required by the standards, documented where appropriate.xvii Sampling risk can, in many instances, be minimized as well as quantified by applying probability theory to determine a statistically valid sample size. This capability to quantify sampling risk is not available when relying on a nonstatistical sample size.

The previous sections also focused on potential and/or actual negative impacts of sample bias, sample size and sample count misrepresentations on financial reporting, as well as on a specific example of the misuse of inventory counts to distort financial reporting. In these examples, sampling methods and procedures could have been an effective internal audit procedure to detect quantity misstatements or errors. In the next section, sampling methods and procedures are examined as a method of preventing distortions in results based on sampling.

A. Sampling Methods and Procedures

The effectiveness of the application of sampling techniques can be improved by instituting a structured and documented process for designing and applying sampling procedures. The methods and procedures used can also be standardized and improved by appointing a designated and specially trained auditor to establish and monitor the preferred methods to be used. A matrix summarizing the primary steps and activities that could be used as a base for developing an audit department procedure is provided below. This table of components for an effective sampling process has been developed based on information extracted from “Sampling: A Guide for Internal Auditors”xviii, and modified by this author. This matrix relates to both variables sampling and attribute sampling.



| Summary of sampling plan elements | Attribute Sampling | Variables Sampling |
| --- | --- | --- |
| Determine the objectives of the audit. | Based on a prescribed control procedure. | Based on a dollar amount in an account balance or class of transactions and expected risks. |
| Define the attribute and the deviation condition(s). | Attribute is either present or not, a binary condition. | NA |
| Define the sample unit with the variable to be evaluated. | NA | Units or items containing amounts (e.g., invoices with dollar amounts). |
| Define the population (the sample frame). | A specific definition of the pool of all items eligible for sampling. For example "all customer invoices processed by the accounting department located at Boston, MA, USA". | A specific definition of the pool of all items eligible for sampling. For example "all invoices for customers with a bill-to location in the USA". |
| Determine the sample selection method. | Alternatives to be considered include Statistical vs. Non-statistical, and random or judgmental selection. | Alternatives to be considered include Statistical vs. Non-statistical, and random or judgmental selection. |
| Determine the sample size. | Based on the sample selection method. | Based on the sample selection method. |
| Perform audit procedures on the sample items. | Based on audit standards and internal procedures. | |
| Evaluate the sample results and express a conclusion. | | |



A key element of any sampling plan is to “Determine the objectives of the audit”, as indicated in the upper left corner of the matrix. While important for variables testing, this is especially important for internal auditors when performing “Attribute Sampling” for control operating effectiveness:

A control objective provides a specific target against which to evaluate the effectiveness of controls. A control objective for internal control over financial reporting generally relates to a relevant assertion and states a criterion for evaluating whether the company's control procedures in a specific area provide reasonable assurance that a misstatement or omission in that relevant assertion is prevented or detected by controls on a timely basis. xix

In order to prevent sampling risk from becoming the weak point in an audit and testing process, it is essential to carefully review and match the items being sampled to the objective(s) of the control activity and to the audit purpose. This can be done by comparing the attribute features that the control requires in order to mitigate risk with the documents being sampled. A comparison of the audit objectives to the samples needed to test a control is necessary in order to ensure that the control evidence being sampled does have the attribute feature required to reduce the risks as intended by the control. For example, if the objective of a control activity is to ensure that revenue is not recognized until a product has been shipped and invoiced, and an auditor is selecting a sample of pick-lists in manufacturing, something is probably wrong, and the sample frame selection technique may need to be reviewed. While pick-lists are part of the complete “Order to Revenue” workflow, they are usually not the point at which revenue recognition is controlled. The auditor should probably be sampling shipping verification documents such as a shipper’s waybill, packing list, or a customer’s acceptance of delivery. Another possible source of samples for testing the control activity’s effectiveness could be an invoicing document file containing evidence that a required matching of invoices to shipping documents has occurred.

The preceding has presented a summary of how an organized approach to sampling can help reduce sampling risk through effective, documented sampling procedures. The sample procedures are the foundation of repeatable and reliable sampling results. The procedures can facilitate the audit of controls required for compliance by developing uniform sample plans to audit the controls for financial reporting. However, field effort is still needed to ensure not only that a statistically valid sample meets the intended confidence level and precision, but also that the sample plan is representative of the control’s target population. The next section develops statistical sampling further.

B. Statistical Sampling

1. General considerations

Statistical sampling is a method of sampling where the number of items required for the sample (the sample size) is determined by mathematics based on probabilities and on characteristics of the population such as variability and proportion. Both mathematical tablesxx and formulas are available for use by an auditor to determine a statistically valid sample size. In addition to a sample size based on probabilities, statistical sampling requires selecting the sample randomly, so that each item in the population has an equal probability of being selected. The mathematics of probability distributions take into consideration either the variance in the population from the mean (standard deviation, sigma) or the expected percent of occurrence (success or failure) of the population attribute when determining a valid statistical sample size. Using a sample that is statistically valid provides the auditor with the ability to infer or project sample results directly to the population with a quantifiable sample risk by establishing a confidence level and range of precision (precision interval). Both are available as components of the calculation of sample size based on probabilities. The sample frame, confidence level, and precision interval can be determined by the practitioner based on their acceptability for the intended end use of the audit testing being performed. The auditor initially establishes the acceptable precision and confidence level as part of designing the statistical sampling plan. Confidence levels of 90% and 95% are common in tables and calculations. Since a statistical sample allows the auditor to define how tight or precise an inference from the sample to the population will be (the precision interval), and since these key parameters are determined by the auditor, the sampling assumptions, while not mandatory, should be defined and documented when performing an audit test based on sampling.xxi

Valid variable statistical sampling procedures should take into consideration a sampling frame designed to represent the population, and the dispersion between the population mean and the item values, also referred to as the variability or dispersion of the items in the population. By measuring or estimating the variability of items in the population an auditor can determine a value for the formula variable sigma; the auditor may then calculate a sample size that represents the desired confidence level and “precision” using appropriate formulas. The “precision interval” is the population’s target-range for the estimate. Statistical sampling is used when a statistic of the sample (e.g., the ‘mean’ in the case of variables sampling, or the proportion of successes or failures in the case of attribute sampling) is intended to be projected (inferred) from the sample to the population. This is the branch of statistics referred to as “Inferential Statistics” and is based on a statistically valid sample randomly selected from the population. The sample size is determined by applying an appropriate formula (or tables). The calculated sample size takes into consideration the variability of items in the population (standard deviation, sigma or σ) for variables sampling, or the expected percent of an attribute (proportion) in the case of attribute sampling. The precision, or acceptable margin of error inherent in projecting an estimate of a feature to the population, is frequently expressed together with the selected confidence level. The use of statistical sampling in internal auditing is mentioned on page II-115, Section E: Engagement tools, Part II of the “IIA CIA learning system”xxii which states:

Statistical sampling methods produce a scientifically random sample with test results that may be quantified in terms of a confidence level and precision. As an example an auditor might describe the results of testing a random sample of items this way: “We are 95% confident that the error rate of the population is 6%, plus or minus 3%.”

At this point in the discussion of statistical sampling, we have established some of the underlying structure that leads to the benefits of a statistical sample: the linking of probabilities to a sample size, and to the projection from the sample to the population. It is the mathematics of the normal probability distribution that yields the formulas for calculating sample size. These formulas allow an auditor to determine a sample size from input variables that include the probability distribution factor (Z) and a confidence interval, or precision, for the error rate of the population. The result of sampling with a calculated sample size is expressed, as in the above quote, as “We are 95% confident that the error rate of the population is 6%, plus or minus 3%.” It is this ability to state the sample risk of a sample-based inference in terms of the confidence level and confidence interval that quantifies the statistical credibility of the projection from the sample to the population. The formulas for determining sample size and precision include the effect of the normal probability distribution through a “Z” factor (the standard deviation factor for the probability distribution) in the calculation.
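To make the quoted statement concrete, the precision attached to an observed error rate can be computed from the sample size and the Z factor. The sketch below assumes the usual normal approximation for a proportion, Z times the square root of p(1 - p)/n. The sample size of 241 is a hypothetical figure chosen so that the result lands close to the “plus or minus 3%” in the quote; it does not come from the quoted source.

```python
import math


def precision_for_proportion(sample_rate, sample_size, z=1.96):
    """Margin of error for an observed rate (normal approximation).

    z = 1.96 corresponds to roughly 95% confidence, two sided.
    """
    return z * math.sqrt(sample_rate * (1 - sample_rate) / sample_size)


rate = 0.06   # observed error rate in the sample
n = 241       # hypothetical number of items tested
margin = precision_for_proportion(rate, n)

print(f"{rate:.0%} plus or minus {margin:.1%}")  # prints: 6% plus or minus 3.0%
```

Holding the observed rate constant, a smaller sample widens the margin, so the same 6% finding from 60 items would support a much weaker statement about the population.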

There are two related but different applications for determining a statistical sample size that have been covered thus far in this guideline. These are determining a sample size for an average value and its precision range, and determining the sample size for an attribute-sampling plan to estimate the percent of an attribute feature and its range or standard error percent. The latter is also referred to as population percents for a “dummy variable”xxiii. Determining the sample size for an attribute sample depends on what the sample will be used to conclude. Specifically, the sample size for evaluating a population’s average and both the upper and lower limits of the normal curve is frequently larger than a sample intended to determine the probability that the percent of the feature in the population falls outside an upper or lower one sided limit.xxiv Thus, there are three basic approaches of interest to determining sample sizes for statistical sampling that are effective for making inferences to a population, and these are as follows:

1. Sample size for variable sampling to infer a mean value and the range of variance from the mean (two sided test).

2. Sample size for attribute sampling to infer a percent for a ‘success or failure’ of an attribute and the range of variance or estimated error from the percentage (two sided test).

3. Sample size to infer whether an attributes percent of ‘success or failure’ is probably greater than or less than a selected percent (one sided limit test).
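The three approaches listed above can be compared side by side in a short sketch. It assumes normal-approximation formulas and derives the Z factor from the standard normal distribution; the one-sided case simply uses the one-sided Z factor, which is one common convention and not necessarily the method used in the tables this guideline references. All function names and input values are hypothetical.

```python
import math
from statistics import NormalDist


def z_factor(confidence, two_sided=True):
    """Z factor for a confidence level; two sided splits the risk across both tails."""
    tail = (1 - confidence) / 2 if two_sided else (1 - confidence)
    return NormalDist().inv_cdf(1 - tail)


def size_for_mean(sigma, precision, confidence=0.95):
    """Approach 1: infer a mean value within plus or minus `precision`."""
    z = z_factor(confidence, two_sided=True)
    return math.ceil((z * sigma / precision) ** 2)


def size_for_proportion(expected_rate, precision, confidence=0.95, two_sided=True):
    """Approach 2 (two sided) or approach 3 (one-sided limit) for an attribute."""
    z = z_factor(confidence, two_sided)
    return math.ceil(z ** 2 * expected_rate * (1 - expected_rate) / precision ** 2)


print(size_for_mean(sigma=50.0, precision=10.0))          # approach 1
print(size_for_proportion(0.06, 0.03))                    # approach 2
print(size_for_proportion(0.06, 0.03, two_sided=False))   # approach 3
```

With these illustrative inputs the one-sided test needs noticeably fewer items than the two-sided test at the same confidence level, because all of the allowed risk is placed in a single tail.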

There is also a special application of attribute sampling for ‘testing of controls’, which should not be used for inferring to a population; it will be addressed later in this manuscript in a section on “Tests of Controls”. However, the first two methods of determining a sample size can be used to infer two different features of a population: (1) a dollar value or “mean”, projected from a sample to the relevant population, which is usually an account balance or an account misstatement; or (2) an attribute’s population proportion or percentage for “success or failure”, which can be a control attribute such as the dual sign off of checks over some specific dollar amount. The latter is binomial in that the attribute either exists or does not exist (i.e., each trial has the same specific probability of occurrence); thus the use of the term “success or failure” of a control attribute.

The formulas for determining a valid statistical sample size are the simplified result of more complex calculations, which rely, among other advanced techniques, upon determining the maximum error of the estimates. There is a structured mathematical methodology supporting both the sample size and the inference precision when a valid statistical sample is used. This is not so when a nonstatistical sample is used. Before moving on to a comparison of statistical and nonstatistical sampling it is important to clarify that there are some other sampling techniques that may be confused with valid statistical sampling. Statistical sampling is frequently confused with “Random” sampling or, in some cases, with ‘Haphazard’ sample selection. Statistical sampling requires a sample size that is calculated based on probabilities and has incorporated population variability or proportions into its calculation. Thus, even when an auditor goes through very rigorous procedures to ensure that sampling was done on a random basis, the result is not necessarily a valid statistical sample, as the following reference indicates:

“…Any sampling procedure that does not measure the sampling risk is a nonstatistical sampling procedure. Even though the auditor rigorously selects a random sample, the sampling procedure is a nonstatistical procedure if the auditor does not make a statistical evaluation of the sample results” xxv

Thus, randomness alone does not produce a valid statistical sample that can be used for projecting from the sample to the population. Random sampling is a method of ‘drawing’ or selecting the sample, but it is not a method of valid statistical sampling, nor is it a method of quantifying the sampling risk. Although it is true that valid statistical samples must be selected or drawn randomly from the population to benefit from a normal probability distribution, not all samples that are drawn randomly are valid statistical samples. Additional variables are needed in order to produce a random sample that may be used to statistically infer features of the sample to the population. For a valid statistical sample, the sample size is calculated using a “Z” factor for the probability confidence level and the variable sigma, as well as the expected error rate. The size of a valid statistical sample can be determined using appropriate statistical tables or an appropriate formula, and it is this calculation of a sample size based on probabilities that is a major factor separating a nonstatistical sample from a statistical sample. The difference in classifying sampling for auditing as “statistically valid” as opposed to random, haphazard, or judgmental is demonstrated by the following excerpt from the auditing section of the operating manual from the Department of the Treasury:

Random sampling is a method in which each member of the population has an equal chance of being selected. This method ensures no bias is used in the sample selection. However, a random sample, without additional attributes, does not imply a statistical sample …

Statistically valid sampling is a method that combines the use of a random sample method with additional criteria including confidence level, expected error rate, and precision. This method is used when the auditors want to make a statement about the population from which the sample was selected. Using this method, outcome measures may be projected to the population.xxvi

It is clear from the above excerpt that the “Operations Manual” recognizes statistically valid sampling as the accepted method to use “…when the auditors want to make a statement about the population from which the sample was selected” xxvii

Simply stated, statistical sampling is a credible sampling method based on probability mathematics for projecting features from a sample to the population with a quantifiably defined sampling risk. When using a valid statistical sample, the statistical risks of making a projection can be quantified and stated in a formal manner. This is not the case for a nonstatistical sample method where projections from the sample to the population are based on an auditor’s judgment. However, as mentioned earlier in this manuscript, nonstatistical sampling has its applications and value, but they should not be confused with the inferential capability and value of statistical sampling.

2. Specific Considerations for Auditing

This section focuses on the operational use of statistical sampling for auditing. Auditors can use classical variable statistical sampling to estimate the dollar amount in an account or class of accounts, or to estimate the difference in dollar amounts between the transactions’ “book amounts” and the transactions’ audited amounts for an account or class of accounts. This is an application suitable for ‘classical variables’ sampling, which can be used to project a population ‘mean’ or average, and which requires an estimate of variability in the population in order to calculate a valid statistical sample size. Examples of this application of variables sampling are as follows:

• What is the average selling price for a product?
• What is the average error amount in payable processing?
• What is the average invoice amount?
• What is the average check amount?
• What are average discounts taken by a major customer?

In addition to variables sampling to determine an average value for a target population, auditors can use attribute sampling to determine the percentage of a population that has a feature they are interested in. Examples of this application of attribute statistical sampling are as follows:

• What percent of invoices are over $5000.00?
• What percent of inventory items have a zero standard cost?
• What percent of purchase orders for fixed assets are signed?
• What percent of cash disbursements are over $50,000?
• What percent of checks over $5000.00 have duplicate signatures?
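As a hedged illustration of the attribute questions above, the sketch below estimates a population percentage from a random sample and attaches a two-sided precision interval using the normal approximation. The counts (12 unsigned purchase orders in a sample of 96) and the function name are hypothetical and are not figures from this guideline.

```python
import math

def proportion_estimate(failures, sample_size, z=1.96):
    """Return (point estimate, lower limit, upper limit) for a proportion.

    Uses the normal approximation p +/- z * sqrt(p(1-p)/n). This is an
    assumed evaluation method, reasonable when n*p and n*(1-p) are not tiny.
    """
    p = failures / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical result: 12 of 96 sampled purchase orders lacked a signature.
p, low, high = proportion_estimate(12, 96)
print(f"estimate {p:.1%}, 95% interval {low:.1%} to {high:.1%}")
```

The interval, not just the point estimate, is what allows the sampling risk of the projection to be stated.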

If an auditor decides to use statistical sampling, the sample size should be calculated using a formula or a statistical table designed for that purpose (or auditing software with the capability). Variables sampling can be used whenever the objective of the audit is to verify that an expected amount or other variable such as an average price, cost, or days-outstanding is in the appropriate ‘value’ range required by management or required for operation of an internal control. It is also used to evaluate the significance or materiality of errors made when recording financial transactions into a journal or general ledger. In this instance, the amounts recorded in a general ledger account would be compared to sample transactions drawn from a sample frame containing all the transactions making up the account balances. The average ‘mean’ of the misstatements can then be inferred to the account balance and a conclusion made regarding the significance of the errors.
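The projection of an average misstatement to an account balance described above can be sketched as follows. This is a minimal mean-per-unit illustration under stated assumptions: the eight misstatement amounts are hypothetical, the function name is invented for the example, and the precision formula (Z times the standard deviation over the square root of n, with a finite population correction) is a textbook evaluation rather than this guideline's own worksheet. A real sample would be far larger than eight items.

```python
import math
import statistics

def project_misstatement(sample_errors, population_size, z=1.96):
    """Project the mean sampled misstatement to the whole account.

    Returns (projected total, precision of the total).
    """
    n = len(sample_errors)
    mean_error = statistics.mean(sample_errors)
    std_dev = statistics.stdev(sample_errors)  # sample standard deviation
    # Finite population correction for sampling without replacement.
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    precision_per_item = z * std_dev / math.sqrt(n) * fpc
    return population_size * mean_error, population_size * precision_per_item

# Hypothetical misstatements (audited minus book amount) for 8 sampled items.
errors = [0.0, 2.0, 0.0, -1.0, 3.0, 0.0, 0.0, 4.0]
total, precision = project_misstatement(errors, population_size=3000)
print(f"projected misstatement {total:,.0f} +/- {precision:,.0f}")
```

With so few items the precision is wider than the projection itself, which is the practical reason the sample size has to be calculated before the test rather than assumed.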

However, it is not always ‘errors’ that constitute the risk for a misstatement. The risk may be that a corporate pricing policy is not being followed. Valid statistical sampling can provide a basis for testing the reasonableness of an account such as revenue or accounts receivable balances that are based on pricing standards. While it cannot verify that all transactions adhered to the policy, it can provide some confidence to an auditor that the account balances are reasonable or unreasonable when compared to what would be expected if the pricing policies were being followed.

3. Valid Statistical Sampling Examples

The application of statistical sampling concepts and techniques can be demonstrated through a review of the concepts involved in sample size determination for both a classical variable sample and an attribute sample. The discussion will focus on determining a sample size for two-sided testing, which includes determining a range in which the true mean or deviation percent may reside, rather than a one-sided (also referred to as one-tailed) test. Two-sided testing generally requires larger sample sizes, since the sample must be sufficient to determine a specific value or percent as well as a range for the population inference, as opposed to determining whether the value is greater than or less than a tail percent. The confidence interval provides a precision or range that the true value or percent lies within. This range is also referred to as a precision interval or margin of error. The decision regarding the level of confidence required should be made at the beginning of the sample plan design process. Although the plan can always be redesigned using a higher or lower confidence level, the confidence level for this example sample plan will be 95%.

The strength of the two-sided evaluation is that a projection of a value or percent can be made to the population directly from the sample. The strength of a one-sided test, by contrast, is that it provides a probability-based estimate of the risk that a deviation rate exceeds a predetermined tolerable or acceptable rate. One-sided testing is primarily interested in the probability that the true mean or deviation percent may be greater than or less than an upper or lower limit. The decision regarding the testing method and the confidence level that will be used should be made before the sample plan is determined in order to ensure the techniques are appropriate for a calculated statistical sample. This emphasis on deciding between two-sided and one-sided testing is reinforced in Montgomery’s Auditing, 12th Edition:

When designing an attributes test, the auditor should decide if it is necessary to estimate the range within which the true deviation rate lies (i.e. whether upper and lower deviation limits are relevant) or if it is sufficient to test whether the true deviation rate either exceeds or falls below a certain tolerable level.xxviii

Thus, the objective of these examples is to demonstrate the relative sensitivity of the variables used in calculating a sample size used to support an inference (projection) of average values, or percentages of an attribute feature, with a two-sided precision interval. Although tables and software are available to determine the appropriate sample size more quickly, and allow more iterations or “modeling” of the variables to select your sample than Example 1, Example 1 allows insight into the variables used to calculate sample sizes. See Exhibit 1 for a calculation of sample sizes at the 95% confidence level.
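The “Z” factors used in the examples that follow can be derived from the standard normal distribution. The sketch below, which assumes Python 3.8 or later for `statistics.NormalDist`, shows the two-sided factor used in the exhibits next to the one-sided factor for the same confidence level; the function names are chosen only for this illustration. Computed values are unrounded, so the 90% two-sided factor prints as 1.645 where the exhibits list the table value 1.65.

```python
from statistics import NormalDist

def two_sided_z(confidence_level):
    """Z factor leaving (1 - CL)/2 in each tail of the normal curve."""
    return NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

def one_sided_z(confidence_level):
    """Z factor leaving all of (1 - CL) in a single tail."""
    return NormalDist().inv_cdf(confidence_level)

for cl in (0.80, 0.85, 0.90, 0.95, 0.99):
    print(f"{cl:.0%}  two-sided {two_sided_z(cl):.3f}  one-sided {one_sided_z(cl):.3f}")
```

Because the one-sided factor is smaller at the same confidence level, and the sample size formulas use the square of Z, a one-sided plan requires fewer items than a two-sided plan with otherwise identical inputs.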

Sample Size; Example 1A

In the first calculation of a statistically valid variable sample size, based on a normal distribution, the “Z” value of 1.96 is selected from the column on the far right at Seq. # 4 and placed in the input cell located at Seq. # 1; "Z" Std deviation factor. The next step is to input a value for the precision interval, which in this case is selected as .1 or 10%. The final value that is input for this formula is the value of sigma, which can be determined by a separate calculation based on a pilot sample or determined judgmentally by the practitioner. This value is important to auditors since a large sigma, such as is found in financial transactions, will increase the required sample size substantially. To demonstrate this, a larger sigma, based on a greater population unit variation, will be used for the calculation of Example 1B. As can be seen from Example 1A, the unadjusted sample size for the population of 3000 is 96 items (Seq. # 16), and the adjusted sample size is 93 (Seq. # 25). These results are for a relatively small sigma of .5.
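Where sigma is taken from a pilot sample rather than set judgmentally, the separate calculation mentioned above can be as simple as the sample standard deviation of the pilot items. The values below are hypothetical and are assumed to be in the same units as the precision interval.

```python
import statistics

# Hypothetical pilot sample of per-item values (same units as the precision interval).
pilot_values = [9.5, 10.5, 10.0, 9.0, 11.0, 10.0]

# Sample standard deviation (n - 1 denominator) as the working estimate of sigma.
sigma_estimate = statistics.stdev(pilot_values)
print(f"pilot sigma estimate: {sigma_estimate:.3f}")
```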

Exhibit 1A

| Seq. # | Sample size calculation - Variables Sampling | Formula Calculations | Z factors: CL | Z factors: Z |
|---|---|---|---|---|
| 1 | "Z" Std deviation factor | 1.96 | 80% | 1.28 |
| 2 | "A" Precision interval ("L") | 0.1 | 85% | 1.44 |
| 3 | "s" Std deviation (Sigma) | 0.5 | 90% | 1.65 |
| 4 | | | 95% | 1.96 |
| 5 | | | 99% | 2.58 |
| 7 | Formula | | | |
| 9 | n = (Z^2)(S^2)/A^2 | | | |
| 11 | Numerator | 1.0 | | |
| 13 | Denominator | 0.01 | | |
| 16 | n = Unadjusted Sample size | 96 | | |
| 18 | "N" Population | 3000 | | |
| 20 | na = n/[1+(n/N)] | | | |
| 21 | Numerator | 96 | | |
| 23 | Denominator | 1.032 | | |
| 25 | "na" Adjusted Sample size | 93 | | |
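The arithmetic in Exhibit 1A can be reproduced with a few lines of code. This is a sketch of the two formulas shown in the exhibit, n = (Z^2)(S^2)/A^2 and na = n/[1+(n/N)]; the function name is an assumption made for illustration, and rounding to the nearest whole item is chosen to mirror the exhibit (rounding up would be the more conservative choice).

```python
def variables_sample_size(z, sigma, precision, population):
    """Return (unadjusted n, adjusted na) rounded to whole items.

    n  = (Z^2)(S^2) / A^2
    na = n / (1 + n / N)   (adjustment for a finite population)
    """
    n = (z ** 2) * (sigma ** 2) / precision ** 2
    na = n / (1 + n / population)
    return round(n), round(na)

# Inputs from Exhibit 1A: Z = 1.96, sigma = 0.5, precision = 0.1, N = 3000.
print(variables_sample_size(z=1.96, sigma=0.5, precision=0.1, population=3000))
```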

Sample Size; Example 1B

In the second calculation of a statistically valid variable sample size, based on a normal distribution, the input values are the same as in Example 1A, except for the final value that is input for this formula, the value of sigma, which has been increased from .5 to 1.5. This value is important to auditors since this larger value of sigma, as may be found in financial transactions, will increase the required sample size substantially. As Exhibit 1B shows, the unadjusted sample size for the population of 3000 items is 864 (Seq. # 16), and the adjusted sample size is 671 (Seq. # 25). In this instance, the adjustment for population size also has a greater impact on the adjusted sample size, since the sample size is a larger portion of the total population.

Exhibit 1B



| Seq. # | Sample size calculation - Variables Sampling | Formula Calculations | Z factors: CL | Z factors: Z |
|---|---|---|---|---|
| 1 | "Z" Std deviation factor | 1.96 | 80% | 1.28 |
| 2 | "A" Precision interval ("L") | 0.1 | 85% | 1.44 |
| 3 | "s" Std deviation (Sigma) | 1.5 | 90% | 1.65 |
| 4 | | | 95% | 1.96 |
| 5 | | | 99% | 2.58 |
| 7 | Formula | | | |
| 9 | n = (Z^2)(S^2)/A^2 | | | |
| 16 | n = Unadjusted Sample size | 864 | | |
| 18 | "N" Population | 3000 | | |
| 20 | na = n/[1+(n/N)] | | | |
| 21 | Numerator | 864 | | |
| 23 | Denominator | 1.288 | | |
| 25 | "na" Adjusted Sample size | 671 | | |
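To make the sensitivity to sigma concrete, the sketch below reruns the same two formulas for several sigma values while holding Z, the precision interval, and the population constant. The sigma of 1.0 is an added intermediate value that does not appear in the exhibits, and the function name is illustrative only.

```python
def sample_sizes(z, sigma, precision, population):
    """Unadjusted and population-adjusted sizes, rounded to whole items."""
    n = (z ** 2) * (sigma ** 2) / precision ** 2
    return round(n), round(n / (1 + n / population))

# Same Z, precision interval and population as the exhibits; only sigma changes.
results = {sigma: sample_sizes(1.96, sigma, 0.1, 3000) for sigma in (0.5, 1.0, 1.5)}
for sigma, (n, na) in results.items():
    print(f"sigma {sigma}: unadjusted {n}, adjusted {na}, reduction {n - na}")
```

Because sigma enters the formula squared, tripling sigma multiplies the unadjusted sample size by nine, and the population adjustment removes far more items from the larger sample than from the smaller one.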

Sample Size; Example 2A

In this first calculation of a statistically valid attribute sample size, based on a normal distribution, that could be used to evaluate and infer a percent or portion of a population that has failed a control attribute, the “Z” value of 1.96 is selected from the column on the far right at Seq. # 4 and placed in the input cell located at Seq. # 1; "Z" Std deviation factor. The next step is to input a value for the precision interval, which in this case is selected as .1 or 10%. The final value that is input for this formula is the value of “P”, the probability of occurrence (which takes the place of the variable ‘sigma’ used for variable sampling), and can be determined by a separate calculation based on a pilot sample or determined judgmentally by the practitioner. However, the value used in this example is 50%, since this will result in the largest sample size if the other parameters remain the same. The most sensitive and important value for determining the sample size in this example is the desired “Precision Interval”, which is the accuracy or range of precision that can be estimated. This value is important to auditors since too large a ‘Precision Interval’ can lead to uncertainty about whether high-risk financial controls exceed the estimated or acceptable probability of occurrence for failures. To demonstrate this, a smaller (tighter) precision interval, which produces a larger sample size, will be used for the calculation of Example 2B. As can be seen from Example 2A, the unadjusted sample size for this population of 3000 is 96 items (Seq. # 17), and the adjusted sample size is 93 (Seq. # 25).

Exhibit 2A

| Seq. # | Sample size calculation, for Attributes - Attribute Sample | Formula Calculation | "Z" factors: CL | "Z" factors: Z |
|---|---|---|---|---|
| 1 | "Z" Std deviation factor | 1.96 | 80% | 1.28 |
| 2 | "p" Probability of occurrence (f) | 50% | 85% | 1.44 |
| 3 | "A" Precision interval ("L") | 0.1 | 90% | 1.65 |
| 4 | | | 95% | 1.96 |
| 5 | | | 99% | 2.58 |
| 9 | Formula | | | |
| 11 | n = Z^2(p)(1-p)/A^2 | | | |
| 13 | Numerator | 0.9604 | | |
| 15 | Denominator | 0.01 | | |
| 17 | n = Unadjusted Sample size | 96 | | |
| 19 | "N" Population | 3000 | | |
| 21 | Numerator | 96 | | |
| 23 | Denominator | 1.032013333 | | |
| 25 | "na" Adjusted Sample size | 93 | | |
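The attribute calculation in Exhibit 2A follows the same pattern as the variables calculation, with p(1-p) taking the place of sigma squared. The sketch below is an illustration under that reading of the exhibit; the function name is assumed, and results are rounded to the nearest whole item to mirror the exhibit.

```python
def attribute_sample_size(z, p, precision, population):
    """Return (unadjusted n, adjusted na) rounded to whole items.

    n  = Z^2 * p * (1 - p) / A^2
    na = n / (1 + n / N)   (adjustment for a finite population)
    """
    n = (z ** 2) * p * (1 - p) / precision ** 2
    na = n / (1 + n / population)
    return round(n), round(na)

# Inputs from Exhibit 2A: Z = 1.96, p = 50%, precision = 0.1, N = 3000.
print(attribute_sample_size(z=1.96, p=0.5, precision=0.1, population=3000))
```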

Sample Size; Example 2B

In this second calculation of a statistically valid attribute sample size, based on a normal distribution, the “Z” value of 1.96 is selected from the column on the far right at Seq. # 4 and placed in the input cell located at Seq. # 1; "Z" Std deviation factor. The next step is to input a value for the precision interval, which in this case is selected as .05 or 5%. The final value that is input for this formula is the value of “P”, the probability of occurrence (which takes the place of the variable ‘sigma’ used for variable sampling). The value used in this example is 50%, since this will result in the largest sample size if the other parameters remain the same. However, the most important value for determining the sample size in this example is the desired “Precision Interval”, which is the accuracy or range of precision that can be estimated. This value is important to auditors since too large a ‘Precision Interval’ can lead to uncertainty about whether high-risk financial controls exceed the estimated or acceptable probability of occurrence for failures. To demonstrate this, the smaller (tighter) precision interval is used for the calculation of Example 2B. As can be seen from Example 2B, the unadjusted sample size for this population of 3000 is 384 items (Seq. # 17), and the adjusted sample size is 341 (Seq. # 25). This is a significant increase in sample size over Example 2A, and indicates the cost of tightening the precision achievable by the sample.

Exhibit 2B

| Seq. # | Sample size calculation, for Attributes - Attribute Sample | Formula Calculation | "Z" factors: CL | "Z" factors: Z |
|---|---|---|---|---|
| 1 | "Z" Std deviation factor | 1.96 | 80% | 1.28 |
| 2 | "p" Probability of occurrence (f) | 50% | 85% | 1.44 |
| 3 | "A" Precision interval ("L") | 0.05 | 90% | 1.65 |
| 4 | | | 95% | 1.96 |
| 5 | | | 99% | 2.58 |
| 9 | Formula | | | |
| 11 | n = Z^2(p)(1-p)/A^2 | | | |
| 17 | n = Unadjusted Sample size | 384 | | |
| 19 | "N" Population | 3000 | | |
| 21 | Numerator | 384 | | |
| 23 | Denominator | 1.128053333 | | |
| 25 | "na" Adjusted Sample size | 341 | | |
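Two claims made in Examples 2A and 2B can be checked numerically: that a probability of occurrence of 50% gives the largest sample size, and that tightening the precision interval is costly. The sketch below uses the attribute formula from the exhibits; the list of alternative p values is hypothetical and chosen only to show the shape of the relationship.

```python
def unadjusted_n(z, p, precision):
    """Unrounded attribute sample size: Z^2 * p * (1 - p) / A^2."""
    return (z ** 2) * p * (1 - p) / precision ** 2

# Vary p at the Example 2B precision of 0.05; p * (1 - p) peaks at p = 0.5.
sizes = {p: unadjusted_n(1.96, p, 0.05) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}
for p, n in sizes.items():
    print(f"p = {p:.0%}: unadjusted n = {round(n)}")

# Halving the precision interval (0.1 to 0.05) at p = 50%.
ratio = unadjusted_n(1.96, 0.5, 0.05) / unadjusted_n(1.96, 0.5, 0.1)
print(f"sample size multiplier from halving the precision interval: {ratio:.1f}")
```

Since the precision interval is squared in the denominator, halving it multiplies the unadjusted sample size by four, which is the move from 96 to 384 items seen between Exhibits 2A and 2B.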

Acceptance Sampling:


Binomial Distribution;

Manual or spreadsheet calculations can become complex if the auditor wishes to calculate a complete attribute acceptance sampling plan with multiple values of “C”, which is the critical number of defects allowed in the sample (given a specified probability that there are fewer than XX% deficient controls in the population). It is nevertheless possible to create an example that illustrates how the probability of drawing a sample with zero defects decreases as the sample size grows, given assumptions about population size and the population defect quantity or percent. Below is a sample calculation based on binomial distribution calculations that clearly shows the decreasing probability of selecting items without getting a defective control. It is important to realize that these calculations indicate the probabilities “without replacement”. Strictly, probabilities without replacement follow the hypergeometric distribution, which the binomial distribution approximates closely when the sample is small relative to the population.

Population: 3000; Defects: 300

| Sample Size | Event | Prob. C=0 | Reliability |
|---|---|---|---|
| 5 | Prob of 0 or less defects | 0.588 | 0.412 |
| 10 | Prob of 0 or less defects | 0.343 | 0.657 |
| 15 | Prob of 0 or less defects | 0.198 | 0.802 |
| 20 | Prob of 0 or less defects | 0.113 | 0.887 |
| 25 | Prob of 0 or less defects | 0.064 | 0.936 |
| 30 | Prob of 0 or less defects | 0.036 | 0.964 |
| 35 | Prob of 0 or less defects | 0.020 | 0.980 |
| 40 | Prob of 0 or less defects | 0.011 | 0.989 |
| 45 | Prob of 0 or less defects | 0.006 | 0.994 |
| 50 | Prob of 0 or less defects | 0.003 | 0.997 |
| 55 | Prob of 0 or less defects | 0.002 | 0.998 |
| 60 | Prob of 0 or less defects | 0.001 | 0.999 |
| 65 | Prob of 0 or less defects | 0.000 | 1.000 |
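The zero-defect probabilities above can be sketched in code as a draw without replacement. The function below is an illustration, not the author's spreadsheet: it computes the chance of selecting only non-defective items using combinations, and reliability as one minus that chance. Recomputing with this function reproduces the printed column when the population is 300 with 30 defects (the same 10% defect rate); with a population of 3000 and 300 defects the values come out marginally higher, for example about 0.590 rather than 0.588 at a sample size of 5.

```python
from math import comb

def prob_zero_defects(population, defects, sample_size):
    """P(no defects in the sample) when drawing without replacement."""
    return comb(population - defects, sample_size) / comb(population, sample_size)

# 10% defect rate; population of 300 with 30 defects matches the printed figures.
for n in (5, 10, 20, 30):
    p0 = prob_zero_defects(300, 30, n)
    print(f"sample {n}: Prob. C=0 = {p0:.3f}, reliability = {1 - p0:.3f}")
```

Either way, the pattern the table is meant to show holds: each additional group of items sampled makes a clean, zero-defect result less likely if 10% of the controls are in fact deficient.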

C. Nonstatistical Sampling

1. General Considerations:

Previous analyses have focused on an internal auditor’s need to collect sufficient information by sampling, and then make projections or inferences about account balances, potential misstatements, or the percent of a control feature in the target population. These projected values or proportions can be used to assure management that the expected values or proportions are operating within intended policy guidelines. Inferential statistics is the field of study dedicated to understanding and performing the task of projecting (inferring) sample characteristics (statistics) to the population that was sampled, based on the mathematics of probabilities. Inferential statistics requires statistically valid samples in order to quantify the sampling risk and support the inferences (projections). Nonstatistical sampling, however, does not have a quantified sampling risk, and projections are justified based on the auditor’s judgment. An exception can occur when a nonstatistical sample is statistically analyzed after being performed, and its usefulness and sampling risk as a statistical sample for projecting values or features to a population are determined.

Nonstatistical sampling, also referred to as judgmental sampling, involves a determination of sample size based on the practitioner’s judgment. The judgmental determination of a sample size is based on parameters considered relevant by the individual performing the sampling. In general, there are no specific rules limiting nonstatistical sampling; there are, however, guidelines provided by auditing standards that are designed to consider elements of sampling risk as part of the process for determining the sample size. Regardless of what guidelines are followed, the sample is still nonstatistical in that it is not based on, or bound by, the rules or axioms of probability mathematics. Nor is it a substitute for inferential statistical techniques, which require a valid statistical sample in order to provide a measure of the sampling risk associated with a projection or inference.

2. Specific Considerations for Auditing

Nonstatistical sampling is an option available to the auditor in which the sample size, in addition to other sampling considerations, is determined based on the auditor’s judgment. The audit standards are clear that AU Section 350, Audit Sampling, applies to both statistical sampling and nonstatistical sampling:

There are two general approaches to audit sampling: nonstatistical and statistical. Both approaches require that the auditor use professional judgment in planning, performing, and evaluating a sample and in relating the audit evidence produced by the sample to other audit evidence when forming a conclusion about the related account balance or class of transactions. The guidance in this section applies equally to nonstatistical and statistical sampling. [Revised, March 2006, to reflect conforming changes necessary due to the issuance of Statement on Auditing Standards No. 105.]xxix

If elements of designing a reliable sample are fulfilled, and a non-statistical sample size is large enough to include sufficient population variations to be representative of the target population, then non-statistical sampling could produce results that are similar to the results obtained by statistical sampling. This is discussed in the auditing standard AU 350:

The sufficiency of audit evidence is related to the design and size of an audit sample, among other factors. The size of a sample necessary to provide sufficient audit evidence depends on both the objectives and the efficiency of the sample. For a given objective, the efficiency of the sample relates to its design; one sample is more efficient than another if it can achieve the same objectives with a smaller sample size. In general, careful design can produce more efficient samples. [Revised, March 2006, to reflect conforming changes necessary due to the issuance of Statement on Auditing Standards No. 105.]xxx



The “Sample Frame” for a non-statistical sample may be designed to be similar or even identical to that used for a statistical sample. A sample frame defines the items in the population that are available for sampling, and can be similar or the same for either nonstatistical sampling (judgmental sampling), or statistical sampling. An effort to properly select the items in the population in a random manner to minimize sample bias can be performed with the same diligence for either a statistically valid sample or a nonstatistical sample. Either a nonstatistical sample or a statistical sample may be used to provide sufficient audit evidence. See AU 350 Audit Sampling:

Either approach to audit sampling, when properly applied, can provide sufficient audit evidence. [Revised, March 2006, to reflect conforming changes necessary due to the issuance of Statement on Auditing Standards No. 105.]xxxi

The difference between the two sampling techniques is that the sampling risk related to inferring values or characteristics to the population when using a nonstatistical sample cannot be quantitatively defined as it can be for a statistically valid sample. The AICPA recognizes both the potential benefits of nonstatistical sampling and the risks involved when a target population has a large variation or deviation of items in the population. See page 42 of the Audit Guide for Audit Sampling:

The characteristics (such as the amounts) of individual items in a population often vary significantly. The auditor subjectively considers this variation when determining the appropriate sample size for a substantive test. The appropriate sample size generally decreases as the variations become smaller.xxxii

However, considering the variation of items in a large population is difficult unless the variation is measured and quantified. This variation in value(s) of individual items in the population, or “Standard deviation”, is frequently calculated and referred to as “sigma” (σ) when calculating a classical variable statistical sample. Knowing and understanding the implications of this value is valuable if an auditor is to infer an account balance from a sample, or is to infer an error in an account balance based on a sample. Since the nonstatistical sampling method is not a probabilistic application of sampling, the technique of inferring account amounts (balances or misstatements) has an element of subjectivity that can be relevant and can contribute to audit risk. In fact, it is not possible to statistically infer or project an account balance with confidence using a nonstatistical sample without recognizing these uncertainties related to having an undefined sampling risk and a high dependence on subjective judgment.

Using a stratification technique to strengthen the application of a nonstatistical sample is highlighted in the following paragraph from the AICPA Audit Guide - Audit Sampling, page 42

By separating a population into relatively homogeneous groups, the auditor can minimize the effect of variation of amounts for items in the population and thereby reduce the sample size. Common basis for stratification of substantive tests are, for example, the recorded amounts of the items, the nature of controls related to processing the items, and special considerations associated with certain items (for example portions of the population that might be more likely to contain misstatements). The auditor selects separate samples from each group and combines the results for all groups in reaching an overall conclusion about the population.xxxiii
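The effect of stratification on variation can be illustrated with a small, entirely hypothetical set of recorded amounts. The sketch below compares the standard deviation of the whole population with the standard deviation inside each stratum; the amounts and the split into small and large items are assumptions made only for this example.

```python
import statistics

# Hypothetical recorded amounts: many small items and a few large ones.
small_items = [100, 120, 90, 110, 105, 95, 115, 85]
large_items = [5000, 5200, 4800, 5100]

# Variation across the unstratified population versus within each stratum.
overall_sigma = statistics.stdev(small_items + large_items)
small_sigma = statistics.stdev(small_items)
large_sigma = statistics.stdev(large_items)

print(f"overall sigma: {overall_sigma:,.0f}")
print(f"small-item stratum sigma: {small_sigma:,.0f}")
print(f"large-item stratum sigma: {large_sigma:,.0f}")
```

Because each stratum is far more homogeneous than the combined population, the variation that drives the required sample size is much smaller within strata, which is the mechanism the AICPA paragraph describes.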

The effort to separate a population by stratum for sampling improves the data from which a projection is being developed for the population, and is effective for both statistical sampling and nonstatistical sampling. However, it still does not ensure a statistically valid sample, and does not have a quantifiable confidence level or precision interval. This reality is noted in the AICPA Audit Guide to Audit Sampling, PAR 2.18.

…Any sampling procedure that does not measure the sampling risk is a nonstatistical sampling procedure. Even though the auditor rigorously selects a random sample, the sampling procedure is a non-statistical application if the auditor does not make a statistical evaluation of the audit results.xxxiv

i AICPA Statement on Auditing Standards No. 111, Amendment to Statement on Auditing Standards No. 39, Audit Sampling
ii AICPA Professional Standards, Audit Sampling, Section 350.45
iii AU 350.46
iv AU 350.01
v Sanjaya Kumar Saxena, Discover 6 Sigma Sampling, http://www.discover6sigma.org/post/2007/02sampling/ Last visited 11/24/2007
vi AU 350.26
vii AU 350.08
viii AU 350.10
ix AU 350.10
x AU 350.12
xi AICPA Audit Guide, Audit Sampling, New Edition as of April 1, 2001, AAG-SAM 3.02
xii Sam Antar, Whitecollarfraud.com, Visited 7/15/07, http://www.whitecollarfraud.com/index.html
xiii Civil Action No. 93-3988 (HAA), United States District Court for the District of New Jersey
xiv Id.
xv “So that’s why they call it a pyramid scheme”, © October 2000 Association of Certified Fraud Examiners; Visited 7/15/07, http://www.acfe.com/fraud/view.asp?ArticleID=31
xvi Id. at .04
xvii AAG-SAM 2.38
xviii Barbara Apostolou, Sampling: A Guide for Internal Auditors, (Copyright 2004, IIARF)
xix PCAOB Auditing Standard No. 5, An Audit of Internal Control Over Financial Reporting That is Integrated with An Audit of Financial Statements, Page A1-41, Standard Appendix A, Definitions, A2.
xx Barbara Apostolou, Sampling: A Guide for Internal Auditors, (Copyright 2004, IIARF), Appendix A through E.
xxi AAG-SAM 3.67
xxii Institute of Internal Auditing’s training and review course for the CIA certification examination.
xxiii Professor Arthur Schleifer, “Sampling and Statistical Inference”, note 9-191-092
xxiv Vincent M. O’Reilly, Barry N. Winograd, James S. Gerson, Henry R. Jaenicke, Montgomery’s Auditing, 12th Edition, pp. 344, 345
xxv AAG-SAM 2.17

Thus, regardless of other methods used to determine sample size, a sample size is still not based on probability theory unless a statistical analysis is performed on the sample plan. A nonstatistical sample size is still determined based on subjectivity or “Judgment”.

A focus on attribute sampling is intended to assist internal auditors in improving the testing of internal financial controls and workflow process controls related (primarily) to transaction processing, and of controls where the number of times a control is applied is too high to allow 100% testing of the control attributes. While the need for the application of control attribute sampling is clear in high volume transaction applications such as disbursements, it may also apply to areas such as capitalizing Fixed Asset acquisitions, and the movement and valuation of Inventories, as well as controls related to the iterative process of developing standard costs for inventory valuations. The dividing line between 100% testing and sampling is a practical consideration (cost vs. benefit) determined by the frequency at which the control is being applied to the transactions, the availability of documentation needed for testing, and the amount of time required to test each item. If a control is being applied weekly (a frequency of 52), the required documents for testing are in one folder, and it takes a minute to test each document, then 100% of the items could be tested. If, however, the required documents are co-mingled in separate files with other unrelated documents, require sequential sorting, and the folders are filed at different locations, then selecting a nonstatistical sample and applying professional judgment to determine the testing results may be a viable option. How many items to select for a nonstatistical sample is based on the auditor’s experience and judgment.

The use of control attribute testing to determine whether or not specific process controls are being performed effectively has a wide range of application, but it can be confused with, and used in place of, variable testing. This happens especially when an attempt is made to apply the results of a sample plan designed for acceptance testing to infer a dollar balance, or to extrapolate a dollar discrepancy that was found during acceptance testing. This may result in a serious testing error, especially if the sample size was nonstatistical and determined based on a “Lot Acceptance Sampling” approach, but is assumed by the auditor to be usable as a “Variables Sample” for determining dollar values or, more frequently, percents of defects in population controls or inventory counts. Nonstatistical sampling is prone to this error in usage because the confidence level calculations inherent in statistical sample size calculations are not available as “guidelines”. Normally, in statistical sampling the use of a “Margin of Error”, or a “Standard Deviation” vs. a percent of occurrence, would separate the two sample selection size calculations and provide a guideline as to their appropriate end use.

xxvi “The Operations Manual; Office of Treasury Inspector General for Tax Administration”, Chapter 300, Page 4
xxvii Id. at 4
xxviii Vincent M. O’Reilly, Barry N. Winograd, James S. Gerson, Henry R. Jaenicke, Montgomery’s Auditing, 12th Edition, ISBN 0-471-34605-5
xxix AU 350.03
xxx AU 350.04
xxxi AU 350.04
xxxii AAG-SAM 5.07
xxxiii AAG-SAM 5.08
xxxiv AAG-SAM 2.18

Attribute sampling, when applied properly, is a very powerful application of statistical sampling in internal auditing. The use of attribute sampling for operational effectiveness testing of financial processing controls is well accepted and extremely valuable to the accurate testing and auditing of Internal Control over Financial Reporting (ICFR). A reference to this application of attribute sampling is available in the “AICPA Audit Guide to Audit Sampling” on Page 12 in the section “Types of Statistical Sampling Plans”, which states:

2.28 Some examples of tests of controls in which attributes sampling is typically used include tests of the following:
Voucher Processing
Billing Systems
Payroll and related personnel-policy systems

Although this reference from the “AICPA Audit Guide to Audit Sampling” is in the section for “Types of Statistical Sampling Plans”, attribute sampling can be performed with both statistical and nonstatistical sampling plans. The same chapter in the AICPA Audit Guide covers nonstatistical sampling to some extent in an earlier paragraph, in the section “Nonstatistical Sampling and Statistical Sampling”, with the following statement:

2.19 A properly designed nonstatistical sampling application can provide results that are as effective as those from a properly designed statistical sampling application. However, there is one difference: Statistical sampling explicitly measures the sampling risk associated with the sampling procedure.xxxv

The application of statistical sampling to control attribute measurement and acceptance could become a dominant and pervasive control feature for Internal Control over Financial Reporting (ICFR) if attribute sampling is widely accepted. Without measuring and knowing the sampling risk associated with the sampling procedure, the results of testing a sample are almost always subjective in nature rather than being based on a statistical and probabilistic structure. An additional consideration is that nonstatistical samples are frequently not random, as is required for statistical inference. By their nature they are biased, in that some units are more likely to be selected than other units in the total population because of human nature and preferences. In financial transaction processing where there is a wide variance of dollar amount per transaction, a nonstatistical sample may not cover a wide enough distribution of items when samples are selected, thus eliminating the ability to probabilistically infer a sample characteristic to the population. This limitation exists regardless of the sophistication of the methods used in selecting the sample.

Even though some of the available guidelines for nonstatistical sampling recommend a thorough and comprehensive analysis of the population and of qualitative risk factors, there is no way to assess whether such analysis was done, or how it was performed, unless it is documented. The thoroughness and quality of an analysis of the population may itself depend on a sample to determine whether the results of the sample testing should be usable to any extent other than to support an auditor’s judgment regarding the sample itself. The only way to qualify a sample is to evaluate its randomness and its appropriateness as a representation of the population being tested, while considering the end use of the results of the sample. If an auditor is untrained or not properly supervised in sampling techniques, the selection of a sample may be the equivalent of “Got a hunch, pick a bunch”. Also, management may request that a “dollar amount” be placed on an ineffective control or financial reporting process in order to rank needed remediation in dollar priority order. An audit team might comply with this request by reusing the same sampled items (which were nonstatistical, nonrandom, and intended for attribute testing of controls), pulling the dollar amounts off the existing sampled documents, and then using a ratio methodology to extrapolate a dollar amount of risk to the ineffective control. Before responding in this way, the team should first perform a statistical validation of the sample and the results.

xxxv AAG-SAM 2.27

The first step would be to draw a second sample that is a statistically valid variables sample focused on the high-dollar strata of transactions that could be misstated, and then to extrapolate the average misstatement (sample mean) to the population. This method of projecting dollar risk can at least be qualified by a sampling risk as defined by the confidence level, the precision, and a measure of variation around the mean (the standard deviation, σ).
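The projection described above can be sketched in a few lines. This is a minimal illustration only, assuming simple random selection, a normal approximation, and a finite population correction; the function name and the misstatement amounts are hypothetical and are not taken from any audit.

```python
import math
import statistics

def project_misstatement(sample_misstatements, population_size, z=1.96):
    """Mean-per-unit projection of sample misstatements to a population.

    Returns (projected_total, precision), where precision is the
    plus-or-minus dollar range at the confidence level implied by z.
    Assumes a simple random sample (an assumption of this sketch).
    """
    n = len(sample_misstatements)
    mean = statistics.mean(sample_misstatements)
    s = statistics.stdev(sample_misstatements)  # sample standard deviation
    # Finite population correction for sampling without replacement
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    projected_total = population_size * mean
    precision = z * (s / math.sqrt(n)) * fpc * population_size
    return projected_total, precision

# Hypothetical dollar misstatements found in 10 sampled transactions
hypothetical_sample = [0, 0, 120, 0, 45, 0, 0, 300, 0, 35]
total, precision = project_misstatement(hypothetical_sample, population_size=1000)
print(f"Projected misstatement: {total:,.0f} +/- {precision:,.0f}")
```

With only ten hypothetical items, the precision range comes out wider than the projected amount itself, which is the kind of result that signals the sample is too small to support a dollar projection.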

Comment: In practice, I believe that this extra effort is imperative if the inference is to have any credibility, and to prevent the audit client from investing substantial sums of money to fix something that may be deficient but still immaterial for financial reporting.

D. Testing of Controls, Non Inferential sampling

1. Intended End Use of a Sample (inferential vs. non inferential)

When the intended use of the sample is to project significant and relevant characteristics of the sample to the population, the sample should be an “inferential type” sample. An inferential sample is a valid statistical sample, and should be required in all instances where the sample size is not cost prohibitive. To determine the percent or proportion of defective controls in a process, attribute sampling can be used to infer the percent of deficient controls by calculating a statistically valid sample size using tables, the appropriate formulas, or statistical auditing software. If the intended use of the sample is to determine what percent of checks over a set dollar amount have effective duplicate signatures (the control), assuming a deviation rate greater than 30% will usually result in a sufficient sample size for evaluation based on a normal distribution. However, if a maximum sample size is desired, it can be calculated by assuming a 50% deviation rate in the population. This gives the maximum sample size because the variance of a binomial proportion, p(1 - p), is largest at a 50% deviation rate. However, if one only wants to know “what are the chances that the control is defective only 10% of the time or less”, then an “acceptance” sampling methodology and corresponding sample size can be used. Thus, the intended end use of the sample, which can be either to determine a confident estimate of a control’s specific percent defective, or merely to determine the “chances” that the deviant controls are probably in an acceptable range, can be the major factor in the final decision of what sampling method to use. However, as with many other decisions, cost vs. benefit considerations also weigh in on the final decision.
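The statement that a 50% deviation rate yields the largest sample size can be checked against the attribute sample size formula used in this guideline (Formula A1 in Exhibit 3). The short derivation below treats Z and A as fixed; the numeric values Z = 1.96 and A = 0.05 are used only as an illustration.

```latex
n(p) = \frac{Z^2 \, p(1-p)}{A^2}, \qquad
\frac{dn}{dp} = \frac{Z^2 (1 - 2p)}{A^2} = 0 \;\Rightarrow\; p = 0.5, \qquad
n_{\max} = \frac{Z^2}{4A^2}

\text{For } Z = 1.96,\ A = 0.05: \quad
n_{\max} = \frac{3.8416}{0.01} = 384.16 \approx 385 \text{ items (before any finite population adjustment)}
```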

Certainly, if 100% of a population is accessible with very little incremental effort over drawing a sample, testing every item is preferred to either sampling method. However, when an internal auditor is auditing a transaction-based population with items in the hundreds, thousands, or even hundreds of thousands, 100% manual testing becomes impractical (if not impossible), and a portion of the population must be selected using sampling. This is where the intended end use of the sample can become a major factor in deciding between statistical sampling and nonstatistical sampling, between one-sided and two-sided testing, and between classical sampling and acceptance sampling. If the intended use of the sample’s test result is to support a conclusion where projected sampling results will be the primary audit evidence, then valid statistical sampling is preferred in order to provide confidence levels and statistical inferential credibility to the conclusions. On the other hand, if the intended end use of the sample’s test results is to be just one minor consideration among many other stronger arguments, then a nonstatistical or acceptance sample could be adequate.

As the previous paragraph shows, an underlying reason for choosing between statistical and nonstatistical sampling is whether the auditor intends to use sampling for an inferential application, or to determine the level of risk or the probabilities involved in accepting or rejecting an internal control’s assertion of effectiveness. The latter is a typical end use of internal audit sampling, used specifically to evaluate the risk inherent in accepting internal controls as being effective. Evaluating the risk that controls may not be effective or, alternatively, the risk of rejecting internal controls that are effective, is an objective of testing of controls.xxxvi This special application of attribute sampling (it can also be used for variable sampling, but that is of no concern here) is referred to in authoritative audit guidance as “Sampling in tests of Controls”. xxxvii Chapter three of the AICPA Audit Guide on Audit Sampling discusses this application further.

In other industries, similar sampling methodologies are generally referred to as “Lot Acceptance Sampling”, and are used in quality control to sentence lots of manufactured products as either accepted or rejected. Sampling in tests of controls (where probabilities of deviations are less than 20%) is usually based on hypergeometric or binomial probability distributions, as opposed to the normal distribution that we previously discussed. In the previous examples, formulas and calculations for sample size and statistics were based on the normal distribution, and were either two-sided tests or, if they were one-tailed tests, would have used a ‘t’ test for significance. However, an ‘acceptance or rejection’ of ‘risk’ based on probabilities, using this special application of attribute sample plans, can be efficient with smaller sample sizes than are required for the two-sided testing method, and is referenced in “Montgomery’s Auditing, 12th Edition” in the section on ‘Statistical Tests of Controls’.xxxviii Sampling for tests of controls is especially valuable early in an audit since it can be a major factor in determining the level of substantive testing that may be necessary, and in what areas the testing should be most intense.

xxxvi AAG-SAM 3.29
xxxvii AAG-SAM 3.01
xxxviii Vincent M. O’Reilly, Barry N. Winograd, James S. Gerson, Henry R. Jaenicke, Montgomery’s Auditing, 12th Edition, P 343 - 347


An efficient audit would require substantive testing only as necessary for the audit to be effective in determining the auditee’s reliability and fairness of financial reporting. However, while efficiency is desirable, a primary consideration is that the auditors’ conclusions from sampling are not biased in such a way as to result in the acceptance of ineffective controls as being effective. The importance of testing for internal controls effectiveness, and the risk related to the auditor of accepting controls as being effective when they are not, is discussed in the ‘AICPA Audit Guide on Audit Sampling’ as indicated below.

The risk of assessing control risk too high relates to the efficiency of the audit. The auditor’s assessed level of control risk based on a sample may lead him or her to increase the scope of substantive tests unnecessarily to compensate for the higher level of perceived risk. Although the audit may be less efficient in this example, it is nevertheless effective. However, the second aspect of sampling risk in performing tests of controls – the risk of assessing control risk too low – relates to the effectiveness of the audit. If the auditor assesses control risk too low, he or she inappropriately reduces the evidence obtained from substantive tests. Therefore, the discussion of sampling risk in the following paragraphs relates primarily to the risk of assessing control risk too low.xxxix

Because this testing for the effectiveness of internal controls is a significant factor in determining the course and development of the subsequent substantive testing, the design of a sample plan must be as reliable as is practical. The testing for the effectiveness of internal controls is the foundation for determining the remaining levels and types of testing. However, it should be clear that the testing of internal controls is not necessarily the major component for an auditor to form their opinion regarding the quality of the financial reports themselves. The testing of controls provides evidence related to the overall internal control environment, not the quality of the financial reports. For this reason, valid statistical sample sizes and inferential techniques are frequently not used. Rather, testing techniques based on acceptance sampling concepts are preferred. This is discussed in the audit-sampling guide as follows:

Samples taken for tests of controls are intended to provide evidence about the operating effectiveness of the controls. Because a test of controls is the primary source of evidence about whether the controls are operating effectively, the auditor generally wishes to obtain a high degree of assurance that the conclusion from the sample would not differ from the conclusion that would be reached if the test were applied to all transactions. Therefore, in these circumstances the auditor should allow for a low level of risk for assessing the control risk too low. Although consideration of risk is implicit in all audit sampling applications, it is explicit in statistical sampling.xl

If the testing of internal controls indicates a high risk of control ineffectiveness, then a greater amount of substantive testing may be necessary. However, since this method of ‘acceptance’ testing for internal control effectiveness is based on finding either more or fewer deviations in the sample than the critical value ‘c’ that has been calculated as acceptable based on probabilities, it has the advantage of generally requiring smaller sample sizes than would be needed for two-sided inferential testing. It is important to understand that the acceptance method of sampling and evaluation allows ‘sentencing’ a lot as either good or bad, but does not tell you anything about the process creating the items in the lot. The effectiveness of smaller sample sizes allows an auditor to evaluate the risks of accepting an auditee’s internal controls as being effective in a fast and efficient manner. However, the sample size and the cutoff number (critical value ‘c’) are determined by two simultaneous nonlinear equations, and the resulting sample size is not intended to be used as a statistically valid sample size for inferring or projecting deficiency rates or item values to the population. When determining the sample size (n) and the cutoff defective quantity (c), two points are required. These two points are the tolerable percent defective in the population, and the percent defective that will result in rejecting the assertion that the control is effective. In industry applications these points are referred to as Alpha and Beta probability levels.

xxxix AAG-SAM 3.30
xl AAG-SAM 3.31

The calculations for the sample plan values of n and c are usually performed by a computer and are based on a binomial distribution, a hypergeometric distribution, or a Poisson formula. In addition to being calculated using formulas and a computer, a “Larson Diagram” or published tables are sometimes used. However, the important “take away” from this discussion is that, in acceptance testing, an operating characteristics curve is calculated for the two risk points (acceptance percents and rejection percents), which integrates the sample size (n) and the number of allowed defects for the sample plan. Because of the mutual dependency of all elements of the sampling plan, the sample size (n) and the number of defects acceptable (c) are mathematically bound together. For this reason, the sampling plan is referred to in some industries as an ‘npc’ sample plan. For example, if an auditor selects a sample plan from appropriate tablesxli of 93 items (‘n’ sample size) with a critical value of 1 (‘c’ = 1), and exceeds the critical value by finding 20 deficient control attributes, it is certainly appropriate to conclude that the internal controls are probably ineffective. However, it is not appropriate to extrapolate the observed deviation rate of roughly 20% to the population without testing the statistical validity of the sample size, confidence level, and precision interval using formulas similar to the ones used in the attribute sample size example. A sample originally used for testing control effectiveness can be used as a pilot sample to determine a statistically valid attribute sample size using the previous formulas. One result of the statistically valid sample size calculation is that a sample size over 200 items could be required to infer a 20% deviation rate in the population, given the preferred confidence levels and precision intervals indicated in Exhibit 3 below.
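The way n and c are bound together through the operating characteristics curve can be illustrated with a short calculation. The sketch below assumes a binomial model (one of the distributions mentioned above) and uses hypothetical function names; it computes points on the curve for the 93-item, c = 1 plan used in the example.

```python
from math import comb

def acceptance_probability(n, c, deviation_rate):
    """Probability of finding at most c deviations in n items,
    assuming a binomial model with the given true deviation rate."""
    return sum(
        comb(n, k) * deviation_rate**k * (1 - deviation_rate) ** (n - k)
        for k in range(c + 1)
    )

def oc_curve(n, c, rates):
    """Points (true deviation rate, probability of acceptance) for one plan."""
    return [(rate, acceptance_probability(n, c, rate)) for rate in rates]

if __name__ == "__main__":
    for rate, prob in oc_curve(93, 1, [0.01, 0.02, 0.05, 0.10, 0.20]):
        print(f"true deviation rate {rate:.0%}: P(accept) = {prob:.4f}")
```

Under these assumptions the plan accepts a control whose true deviation rate is 5% slightly less than 5% of the time, while a control with a 1% deviation rate is accepted roughly three times out of four. Changing either n or c moves the whole curve, which is why the two cannot be chosen independently.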

Exhibit 3

Sample size calculation - Attribute

Inputs
"Z" Std deviation factor: 1.96
"p" Probability of occurrence (f): 0.2
"A" Precision interval ("L"): 0.05
"s" Std deviation: 0.400

Z factors by confidence level (CL)
CL 80%: Z = 1.28
CL 85%: Z = 1.44
CL 90%: Z = 1.65
CL 95%: Z = 1.96
CL 99%: Z = 2.58

Formula A1: $n = Z^2 \, p (1 - p) / A^2$
Numerator: 0.614656
Denominator: 0.0025
n = Unadjusted Sample size: 246

Formula A2: $n = 16 s^2 / L^2$
Numerator: 2.56
Denominator: 0.01 (consistent with L = 2A = 0.10)
n = Unadjusted Sample size: 256

Finite population adjustment (Formula A1 result): $n_{adj} = n / (1 + n/N)$
"N" Population: 3000
Numerator: 246
Denominator: 1.081954133
n = Adjusted Sample size: 227

xli AAG-SAM APP A, A-1
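The Exhibit 3 figures can be reproduced with a few lines of code. This is a sketch only: the function names are hypothetical, results are rounded to the nearest whole item to match the exhibit, and Formula A2 is evaluated with L = 0.10 (twice the 0.05 precision), which is an assumption made here because it reproduces the exhibit's denominator of 0.01.

```python
def formula_a1(z, p, precision):
    """Formula A1: n = Z^2 * p * (1 - p) / A^2 (unadjusted)."""
    return z**2 * p * (1 - p) / precision**2

def formula_a2(s, interval_width):
    """Formula A2: n = 16 * s^2 / L^2 (unadjusted).
    L is taken as the full interval width (assumption of this sketch)."""
    return 16 * s**2 / interval_width**2

def finite_population_adjustment(n_unadjusted, population):
    """Adjusted n = n / (1 + n / N)."""
    return n_unadjusted / (1 + n_unadjusted / population)

# Exhibit 3 inputs: Z = 1.96, p = 0.2, A = 0.05, s = 0.4, N = 3000
n_a1 = formula_a1(1.96, 0.2, 0.05)
n_a2 = formula_a2(0.4, 0.10)
n_adjusted = finite_population_adjustment(n_a1, 3000)

print(round(n_a1), round(n_a2), round(n_adjusted))
```

A more conservative convention would round every result up to the next whole item rather than to the nearest one.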

Even less appropriate than projecting a control deficiency rate to the population is to take a ‘value’ amount, such as the ‘mean’ or deviation of dollar amounts found in the acceptance samples, and project that statistic to the account balance or class of transactions without determining the effectiveness of the sample size. The acceptability of a sample for projections can be determined by using formulas similar to the ones used in the variables sampling exercise for confidence level and precision interval. A valid statistical sample size for Classical Variables sampling could be recalculated by treating the acceptance sample as a pilot sample and using its standard deviation (sigma).

In order to perform a test for control effectiveness using this reduced sample size sampling method, the following steps could be followed.

2. Sampling Steps for tests of Controls

When designing a sample plan for tests of controls, the steps required are the same as for the other methods of attribute sampling described in the table included at section IV, A; only the method of determining the sample size is different. The sample size and critical value can be determined using a binomial nomograph or special applications of statistical software; however, appropriate combinations of sample sizes and critical values can also be determined using tables provided by the AICPAxlii.

xlii AAG-SAM APP A, Table A1, A2


In order to use these tables, the auditor must determine the factors needed to ‘pick’ the correct sample plan. These factors include a judgmental estimate of the expected population deviation rate, the tolerable population deviation rate, and the acceptable risk of assessing control risk too low. Because these factors can be specific to the particular circumstances the factors are determined by the auditor using their experience and judgment and may initially be expressed in a relative and qualitative manner. The general effect of these factors is discussed in the AICPA Audit Guidexliii. Once the magnitude of these factors has been determined, it is necessary to view them quantitatively when using the tables.

The auditor must first quantify the acceptable risk of assessing control risk too low in order to select the appropriate table of sample sizes. The two most frequently used tables are those published in the audit guide as Tables A.1 and A.2. These correspond to a 5% risk of assessing control risk too low (95% confidence, Table A.1) and a 10% risk (90% confidence, Table A.2). Once the desired table has been selected, it is then necessary to quantify the ‘Expected population deviation rate’ and the ‘Tolerable Rate’ in order to select sample plans from the tables. An example of this process to select a sample plan is as follows.

Based on the audit environment, a 5% risk of Assessing Control Risk Too Low is considered appropriate. This results in using the ‘Table A.1’.xliv

Based on previous audits, the “Expected Population Deviation Rate” is considered to be low so a rate of 1% is selected as the row in the table.

The control being tested is the approval of purchase orders, so the tolerable rate cannot be too high. A tolerable rate of 5% is selected as the column in the table.

The result of this process is to select a sample size of 93 items and a critical value of one (1) deviation. If subsequent testing of the sampled items finds 1 or fewer deviations, the auditor can conclude, with no more than the desired 5% risk of assessing control risk too low, that the population deviation rate does not exceed the 5% tolerable ratexlv. Additional interpretations of the results of sampling are also included in the Audit Sampling Guidexlvi.
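The table lookup in this example can be cross-checked with a short search. The sketch below is an illustration under stated assumptions, not the method used to build the published tables: it assumes a binomial model, takes the critical value c as given, and uses hypothetical function names. It looks for the smallest sample size at which a population deviating at the tolerable rate would pass the test no more often than the acceptable risk.

```python
from math import comb

def risk_of_incorrect_acceptance(n, c, tolerable_rate):
    """Chance of seeing c or fewer deviations in n items when the true
    deviation rate equals the tolerable rate (binomial assumption)."""
    return sum(
        comb(n, k) * tolerable_rate**k * (1 - tolerable_rate) ** (n - k)
        for k in range(c + 1)
    )

def smallest_sample_size(c, tolerable_rate, acceptable_risk, max_n=5000):
    """Smallest n for which the plan (n, c) keeps the risk of assessing
    control risk too low at or below acceptable_risk."""
    for n in range(c + 1, max_n + 1):
        if risk_of_incorrect_acceptance(n, c, tolerable_rate) <= acceptable_risk:
            return n
    raise ValueError("no sample size up to max_n meets the risk level")

# Example from the text: 5% risk, 5% tolerable rate, one allowed deviation
print(smallest_sample_size(c=1, tolerable_rate=0.05, acceptable_risk=0.05))
```

Under these assumptions the search returns 93 for one allowed deviation, matching the plan selected above. The published tables remain the authoritative source for actual sample plans.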

E. Practical Limitations on Sampling

When deciding whether to use statistical sampling or nonstatistical sampling for population means or attribute percents, the decision will be influenced by physical and economic realities. The auditor must be able to access and select the required items within the sample frame, given the time allowed and the resources available. Thus, nonstatistical sampling may be preferred when resources and documented evidence are scarce, since smaller samples are easier to test under these constraints. However, this concession to cost or time limitations may restrict the auditor from inferring statistically valid conclusions about the population. This is especially true when testing high-volume, transaction-based systems such as accounts payable processes, customer invoicing, and payroll processing. These accounting processes may have small but significant control deficiencies that may require a large sample size to measure. Stratifying the population into subgroups could reduce the size of the sample; however, each subgroup will still need an adequate sample size of its own.

xliii Id. Exhibit A.1
xliv Id. page 85
xlv Id. at A.4
xlvi Id. at P 84, A5, A.6, A.7

Nonstatistical sampling may help an auditor determine that a portion of the transactions in a sample have a deficient characteristic; alternatively, the auditor could find that there are no failure characteristics or deficiencies. In either case, the nonstatistical sampling methodology does not support inferring that the population has the same proportion or percent of the failure characteristic, or that the population does not have any failure characteristics or deficiencies. Information from a nonstatistical sample does not support a probability-based direct projection, since the sampling risks are not quantified. Any projection must be based on the auditor’s judgment. This is not true if a statistically valid sampling plan is used.

An internal auditor could have an intentional and documented basis for selecting a nonstatistical sample over a statistical sample, in order to support their judgment. A preference for using a nonstatistical sample when a statistical sample would result in minimal additional costs, and provide better support for the audit conclusions, should require sufficient documentation. This preference may be caused by a lack of understanding that the nonstatistical sample may not fill the need to infer or project the results. Or nonstatistical sampling may be selected based on knowing from prior experience that deviations are extremely small or that the results will be highly predictable. However, as indicated previously, statistical samples are generally less risky, although they are large when compared to nonstatistical samples.

F. Statistical Inference and Sample Size

Even when an auditor uses a valid statistical sample, the relevance of a projection of the sample results to the population must be understood and evaluated based on sound judgment. A projection may be inadequate because of the selection of variables involved in determining the sample plan. The internal auditor must review and determine the adequacy of the confidence level assumption, i.e., is 90% or 95% really sufficient, or is the population being audited sensitive enough to require a 99% confidence level? The variables involved in calculating sample size are different for variable sampling, attribute sampling, and the special form of attribute sampling known as acceptance or npc sampling. Variable sampling plans are applied most frequently to confirm account balances, or misstatement amounts in dollars, and are therefore most concerned with values and variability in the population (sigma). Acceptance sampling plans, by contrast, are applied most frequently to determine whether a control attribute is operating effectively, and are therefore most concerned with the binomial distribution, or binary conditions, such as whether a reconciliation exists or doesn’t exist. Variable sampling plan formulas require the input of a value for the variance or standard deviation (“σ”) of the population (usually an estimate or the result of a pilot sample), since the inference will be in the form of a population mean and the variance around that mean. These components of sample size determination must be evaluated knowledgeably, and taken into consideration when performing a projection from samples to populations.

This analysis focuses on attribute sampling plans since they are the essence of sampling plans for determining the proportion of “successes or failures” of controls in the population. These are the most comprehensive sampling plans used by internal auditors or process managers to test a control attribute’s success or failure percent in a population. In attribute sampling plans, two of the critical input variables are the ‘expected occurrence rate’, or probability of occurrence, and the desired ‘Precision’, or acceptable range of probable inference. The usability of attribute sampling plans and the resulting statistical inference is determined by the variables mentioned previously and by the selection of the related Confidence Level or tolerable deviation rate.

The reliability of a statistical inference from a sample is dependent on the selection of confidence levels and population parameters. If an auditor selects a 90% confidence level with an expected occurrence rate of 10% and a desired precision of 10%, the sample size will be quite low because it is a ‘loose’ set of specifications for the projection or inference. The statistical sample size for this set of variables (population of 500) is approximately 25, which is a small but easily managed sample size for most audit situations. It is convenient to find this many items in most filing systems, and it is a statistically valid calculated sample size. However, what the projected results may be saying is not that valuable. The sampling results may not be useful information, even though the sampling plan is statistically valid, because the quantity selected may not be representative of the population. However, if we were to go to the other extreme and design an attribute sampling plan based on detecting even a small percentage of attribute failures in the population, we could use a sample based on a 99% confidence level with an expected occurrence rate of 0.1% and a desired precision of 0.1%. With those sample plan specifications, the sample size calculation results in a much larger sample of 465 for a population of 500 items. Again, this sample size is statistically valid, but it is almost as useless as the prior example because, in this case, the sample size is impractical and saves almost no cost or time as compared to testing 100% of the population.
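The two extremes described above can be compared side by side. The sketch below assumes the attribute sample size formula and finite population adjustment shown in Exhibit 3, together with the Z factors listed there (1.65 for 90% and 2.58 for 99%); the function name and the dictionary layout are illustrative choices, not part of the guideline.

```python
def adjusted_attribute_sample_size(z, p, precision, population):
    """Attribute sample size with finite population adjustment:
    n0 = Z^2 * p * (1 - p) / A^2, then n = n0 / (1 + n0 / N)."""
    n0 = z**2 * p * (1 - p) / precision**2
    return n0 / (1 + n0 / population)

POPULATION = 500

# "Loose" and "tight" specifications from the text
plans = {
    "loose (90% CL, p = 10%, precision = 10%)": (1.65, 0.10, 0.10),
    "tight (99% CL, p = 0.1%, precision = 0.1%)": (2.58, 0.001, 0.001),
}

results = {
    label: adjusted_attribute_sample_size(z, p, a, POPULATION)
    for label, (z, p, a) in plans.items()
}

for label, n in results.items():
    share = n / POPULATION
    print(f"{label}: n is about {n:.1f} ({share:.0%} of the population)")
```

Under these assumptions the loose plan needs a sample in the low-to-mid twenties, consistent with the "approximately 25" cited above, while the tight plan needs about 465 of the 500 items, which is close to a full census.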

G. The Rise and Fall of Statistical Sampling in Auditing

Many external auditing firms focused on statistical (probabilistic) sampling for the period up to and immediately after the issuance of SAS 39; however, that focus changed during the 1990s to a preference for using nonstatistical sampling as the basis for a judgmental projection of values or control deviation ratings. The use of sampling plans in Internal Audit and Sarbanes-Oxley testing of internal controls that are not statistically valid for inference has become an acceptable methodology because of its simplicity, acceptability, and cost effectiveness, not because of its inferential statistical reliability. Sarbanes-Oxley acceptance testing for the risk of control effectiveness has become a popular auditing procedure; however, the results, in some instances, were projected to the population without verifying the statistical validity of the sample size. The result can be similar to inferring from a nonstatistical sample. The basis for a projection to a population from a sample that is not statistically valid for inferential statistics should be defined as such by the auditor. In some instances, the application of nonstatistical sampling is also referred to as “Judgmental Sampling” or “Haphazard Sampling”.

The emergent use of nonstatistical sampling techniques has been criticized by one of the authors of SAS No. 39xlvii, the auditing standard that supported nonstatistical sampling’s acceptability as evidence in auditing.

xlvii SAS 39 has been amended by SAS 111


Throughout the 1960s and ’70s, the largest accounting firms devoted extensive resources to the development and implementation of statistical sampling procedures. The firms wrote new policies and guidance, developed time-sharing and batch computer programs, and trained specialized staff. Monetary unit sampling was developed and became a widespread audit tool. The AICPA issued Statement on Auditing Procedure (SAP) 54 and published Statistical Auditing, by Donald M. Roberts. Then, in 1980, the Auditing Standards Board (ASB) issued SAS 39, Audit Sampling (AU 350). Members of the Statistical Sampling Subcommittee that wrote SAS 39, which included this author, expected that the imposition of risk, materiality, and selection requirements would further establish statistical sampling as a principal audit testing procedure. In fact, the opposite has occurred, largely because the ASB gave nonstatistical sampling equal evidentiary weight.xlviii

Thus, statistical sampling held a preferred position in auditing for a few decades, but then appears to have gone out of favor. My conjecture on why this occurred is that the effective use of valid statistical sampling can be demanding on an auditor as well as costly for the client. However, these negative factors might not have discouraged the use of statistical sampling if a credible argument had been put forward to justify its use. I believe that a credible argument can be made to justify the cost of training internal auditors on the detailed theory and application of valid statistical sampling, based on the conceptual view that a financial transaction processing and control system is, in fact, a process that can be statistically controlled rather than a collection of individual tasks. Effective cost reductions can be achieved by organizing the volume of activities into predictable and controllable processes and using statistical sampling to ensure the process is operating effectively. This would be an end objective in applying statistically valid sampling plans to an internal audit, or as part of the design of internal financial controls and procedures.

V. Effective Statistical Sampling

A. Probability Theory

Much of the credibility of inferential statistical projections and believability of conclusions about attributes of a population is founded on sampling that is based on the application of probability theory. Consequently, the more thoroughly an internal auditor understands probability concepts, history, and principles, the better equipped they are to determine when to use statistical sampling as opposed to nonstatistical sampling. Also, even in those instances where the auditor is not the person determining the sampling methodology, the auditor will be better prepared to summarize the audit conclusion and explain the test results. The credibility of probability theory and probability axioms has a long and rich history supporting the current body of theoretical and applied principles.xlix Although the first recorded publication of a correct method of calculating probabilities for gaming was in the 16th century, the undocumented application of probabilities in the form of gaming probably took place in ancient Greece and Rome. Over time, many of the observations and experiments related to probability theories were formalized. The end result of this process of analysis and mathematical proofs has been a formalized and specialized set of axioms that constitute 20th century probability theory. As a result of this extensive vetting, the increased credibility that statistically valid sample sizes add to an internal auditor’s evaluation is well worth the cost, especially in those instances where the effectiveness of a critical internal process or control activity is being tested in a highly visible environment.

xlviii Neal B. Hitzig, Statistical Sampling Revisited, CPA Journal, May 2004/Vol. LXXIV No. 5.
xlix Dimitri P. Bertsekas, John N. Tsitsiklis, Introduction to Probability, MIT, Athena Scientific, Belmont Massachusetts, P 17

Even at a very basic level, the selection of a population to be tested, or the effectiveness of selecting a sample at all (versus inquiry or observation), can be improved by applying quantitative probability concepts. However, wherever it is used, the term ‘probability’ has a wide range of interpretations. These range from measuring the percent of an attribute based on the frequency of occurrence (see page 2,l) to a completely subjective evaluation of entity controls based purely on an educated opinion. The percentage probabilities based on a frequency of occurrence are the quantitative assessments of specific control attributes such as date, amount, signature, and existence of a filed document, whereas a subjective attribute may be ‘sufficient support’ or ‘adequate review’. Even determining what the population to be sampled should be can be a point of disagreement or uncertainty. However, selecting a population and a sample frame can be thought out more clearly by applying the logic of “sets” (see page 5 footnotes 19). An example of an application of logical sets would be in the area of AP transaction sampling. There is a large universe of different types of AP transactions, but if the specific attribute to be tested is whether a ‘receiver for a document’ is properly verified prior to vouching a vendor invoice, the auditor actually wants to select only that set of transactions that require a receiver to be created. These are frequently manufacturing materials, capitalized purchases, and manufacturing MRO items. This is a subset of the total AP transaction universe, and would constitute the appropriate population for the sample frame. This is not a small or unimportant problem.
Quality assurance evaluations on previously tested controls specified verifying the receiver was matched to the PO, and yet the population for selecting the sample frame was defined simply as ‘AP vouchers’; however ‘vouchers’ also included approved documents that did not require or have ‘receivers’.
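The set logic described above can be illustrated with a minimal Python sketch. The voucher records, the field names, and the `requires_receiver` flag are hypothetical and serve only to show how the population for the sample frame is a subset of the AP universe; they are not drawn from any actual system.

```python
# Hypothetical AP voucher records; the field names are illustrative assumptions.
vouchers = [
    {"id": 1001, "type": "manufacturing materials", "requires_receiver": True},
    {"id": 1002, "type": "utilities", "requires_receiver": False},
    {"id": 1003, "type": "capitalized purchase", "requires_receiver": True},
    {"id": 1004, "type": "manufacturing MRO", "requires_receiver": True},
    {"id": 1005, "type": "legal fees", "requires_receiver": False},
]

# The universe is every AP voucher.
universe = {v["id"] for v in vouchers}

# The population for the 'receiver matched' attribute is only the subset
# of vouchers that require a receiver to be created.
population = {v["id"] for v in vouchers if v["requires_receiver"]}

# The population must be a subset of the universe.
assert population <= universe

# The sample frame is an ordered listing of that population.
sample_frame = sorted(population)
print(sample_frame)  # [1001, 1003, 1004]
```

Defining the frame this way keeps vouchers that never had a receiver out of the test, which is the problem noted in the quality assurance evaluations above.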

Thus, the “take away” recommendation on the issue of probability theory for sample selection is that an internal auditor who has a solid understanding of probabilities, and of how they impact sample sizes and sampling in general, is much more likely to be an effective communicator of the credibility of sampled test results, even under client pressure. I don’t think management would have believed a negative conclusion by the auditors at “Crazy Eddies” without a statistically valid sample.

B. Sampling Method vs. Sample Selection Methods

Analysis up to this point has focused on dividing sampling methodologies into major classifications or methods. All sampling methods can be categorized into one of two categories: 1.) Statistically valid sampling; or 2.) Nonstatistical sampling. In a statistically valid method of sampling, the sample size is based on calculations or algorithms relying on probability theory and rigorous mathematical proofs, which provide quantitative values for measures of confidence level, variability, occurrence rates, and precision for the range of inferred values. Nonstatistical sampling does not provide these mathematical links to probability distributions. However, while previous discussions were intended to reduce confusion regarding the usage of statistical and nonstatistical sampling, there is another aspect of sampling where there may be significant confusion among both auditors and clients. This confusion occurs when they are determining or discussing a ‘sample selection method’. The sample selection method relates to how the samples are physically acquired or ‘drawn’ from the population, rather than to whether the sampling plan is statistical or nonstatistical.

l Id.

When statistical sampling is used, the sample selection method is restricted to one that gives each population item an equal probability of being selected (i.e., a simple random sample). Generally, the most effective technique for achieving this ‘randomness’ is to use random number sample selection, or interval sampling beginning with a random starting point. In addition to these two methods, “cluster” sampling may be accepted in some instances where a multistage sample selection procedure is being used. A multistage selection usually first selects clusters (or locations) at random, and then either tests 100% of the items within the selected cluster(s) or calculates a statistically valid sample size within each randomly selected cluster. Test results are determined for the selected clusters and are then inferred to all other clusters. However, cluster sampling may not be acceptable to some statistically valid testing programs unless there is sufficient reason to believe that all clusters have an identical distribution of attributes. For example, the IRS Treasury Department’s internal audit program does not accept applying the results of one branch to all branches: the results from statistically valid samples selected at one field office are not acceptable as the basis of conclusions at other field offices.

1. Techniques for selecting samples

Statistical sampling requires a truly random sample selection method – nonstatistical sampling does not. In order to achieve an acceptable random sample it is necessary to use a truly random method of selecting the sample – not a potentially human biased method such as haphazard selection by a person. Selecting a truly random sample is frequently done using software to generate random numbers within the sample frame’s range of reference numbers. However, using a structured method to select a truly random sample does not, by itself, create a statistically valid sample. While a truly random sample selection method is required for a valid statistical sample, it is only one factor in creating a statistically valid sample. A statistically valid sample requires the calculation of a sample size based on a suitable confidence level and precision range and the consideration of either the proportion or variability in the population being sampled.
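As a rough illustration of the two ingredients named above (a calculated sample size and a random selection), the following Python sketch uses the common normal-approximation formula for a proportion with a finite population correction. The formula choice, the function names, the default parameters, and the 5,000-item frame are assumptions made for illustration; this guideline does not prescribe them, and published sampling tables may give different sizes.

```python
import math
import random
from statistics import NormalDist


def attribute_sample_size(population_size, confidence=0.95,
                          precision=0.05, expected_rate=0.5):
    """Sample size for estimating an occurrence rate (illustrative formula).

    Uses n0 = z^2 * p * (1 - p) / e^2, then a finite population correction.
    expected_rate=0.5 is the most conservative (largest) assumption.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided z value
    n0 = z ** 2 * expected_rate * (1 - expected_rate) / precision ** 2
    n = n0 / (1 + (n0 - 1) / population_size)
    return min(population_size, math.ceil(n))


def draw_random_sample(frame, n, seed=None):
    """Select n distinct items, each with an equal chance of selection."""
    rng = random.Random(seed)  # a recorded seed makes the selection reproducible
    return sorted(rng.sample(frame, n))


# Hypothetical sample frame: reference numbers 1 through 5000.
frame = list(range(1, 5001))
n = attribute_sample_size(len(frame))   # 357 at 95% confidence, +/-5% precision
sample = draw_random_sample(frame, n, seed=2010)
print(n, sample[:5])
```

Recording the seed in the workpapers lets a reviewer regenerate the same selection, which supports the claim that the items were not chosen by hand.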

Nonstatistical sampling, however, has any number of available sample selection methods that can be applied based on the circumstances or the professional judgment of an auditor. However, the conclusions related to the sample can only be applied to the sample. Any projection to the population is actually based on a subjective evaluation by the auditor. In fact, as mentioned above, a structured random sample selection method may actually be used to select the samples, even though the end result is still a nonstatistical sample size selection methodology. Other popular sample selection methods, their appropriate use, and brief definitions are as follows:

a. Haphazard:

This is a method that approximates the effect of random sampling by selecting samples without any intended bias and with no discernible pattern. Examples of this method include selecting document files from a filing cabinet in an unbiased or haphazard manner, or selecting purchase orders from a listing of all purchase orders with no apparent bias or preference. When using this sample selection method, the auditor’s intent is to select items without intentional bias. However, when an auditor uses Excel to generate selection numbers, or has a random number generator pick the items for selection, they are actually using a random sample selection method, even though they may be applying it to a nonstatistical sampling plan as discussed in the previous paragraph.

b. Interval:

This is a method that can be accepted as a random sample if the beginning point for the interval is chosen at random and there are no periodic or cyclic recurrences expected in the population. However, this method may be biased if the listing used as the sample frame has a built-in cyclic variation or recurrence. An example of this problem occurs when the auditor is testing checks over a specific dollar amount, and the only listing available is a check register in which the first 5 checks of each weekly check run are for the same recurring items over the dollar amount, such as rent, insurance, employee benefits, or other cyclic payments. As a result of this periodicity, those checks, which appear in every check run, have a higher probability of being selected; thus, there is a bias in the selection. When using this sample selection method, the practitioner’s intent is to select items without a known bias. As a result, it can be applied to statistical sampling, but it is also frequently used for nonstatistical sampling. In any case, it is best used if the auditor’s judgment is that periodicity is non-existent or is an acceptable bias. Finally, even though the auditor may be comfortable applying it to a nonstatistical sampling plan, it may not be appropriate for all statistically valid sample plans because of the possible bias caused by a periodic or cyclic recurrence.
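A minimal Python sketch of interval selection with a random starting point follows. The function name and the 1,000-item check register are hypothetical, and the fractional interval is one of several reasonable ways to handle a frame whose length is not an exact multiple of the sample size.

```python
import random


def interval_sample(frame, n, seed=None):
    """Select n items at a fixed interval, beginning at a random start.

    The interval is len(frame) / n, so it may be fractional; positions
    are truncated to list indices. Assumes the frame has no periodic
    pattern that coincides with the interval.
    """
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the size of the frame")
    rng = random.Random(seed)
    step = len(frame) / n
    start = rng.random() * step          # random start within the first interval
    last = len(frame) - 1
    return [frame[min(last, int(start + i * step))] for i in range(n)]


# Hypothetical check register: check numbers 1 through 1000, sample of 50.
register = list(range(1, 1001))
selected = interval_sample(register, 50, seed=7)
print(selected[:5])
```

If the register repeats the same recurring payments at the same position in every check run, an interval equal to the length of a run would select the same type of payment every time. That is the periodicity risk described above, and a random start alone does not remove it.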

c. Judgmental:

This is a method of selecting samples (or actually multiple methods of selecting samples) that is based on a deliberate sample bias being either created or accepted by the auditor. The previous comments regarding interval sampling could be interpreted as creating a judgmental sample, because the auditor accepted a known cyclical bias based on their judgment. However, usually the bias is more specific and intentional. An example of a judgmental sample selection may be one based on sampling only purchase orders from a specific buyer, over a specified dollar amount, or from a specific vendor. Even a haphazard sample selection could fall under the umbrella of a judgmental sample if the samples are selected haphazardly from a one-month period that was determined judgmentally, or from a specific filing cabinet that is determined by the auditor’s judgment (probably an unlocked, conveniently located cabinet).

d. Block:

This is a method of sample selection where a sequential group of items is selected from a list, or a group of physically contiguous folders or documents are selected from a file. I do not see anything unique about this method other than that it is quick and convenient as long as there is a list or a filing repository. I have only used this method of sample selection where I expected there to be no differences in the test results regardless of where, on a list, I selected the samples. An example would be when testing a Pos-Pay electronic approved check listing by agreeing the check number and amounts to the corresponding check register. A block of sequential transactions could be a one-week check run, and, since it is an automated control, if that block of transactions is appropriate, then I can be reasonably certain that all other blocks produced by the same process are also appropriate.

e. Probability Proportional to Size

Probability Proportional to Size (PPS) sampling is a method of sampling based on attribute sampling in which the attribute is dollars rather than a physical attribute. This method of selecting the sample creates an intentional systemic bias toward the selection of high dollar items. This intentional selection of a stratum consisting of high dollar items is indicated in the sampling guide for internal auditors as follows:

The sampling technique is named for the manner in which the accounts are selected to be audited. The probability of an account being included in the sample is proportional to its size. As a result, large accounts have a higher probability of being audited. Thus the technique automatically provides audit evidence to large accounts in a population.li

This method of sampling requires using a set of tables and methods of evaluation that are specific to the methodology. Details for using this methodology can be found in the sampling guide for internal auditors.lii
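The selection step alone can be sketched in Python as follows. This is an illustrative dollar-interval selection, not the procedure or tables from the cited guide, and it does not cover the evaluation of results. The function name and the account balances are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pps_select(amounts, n, seed=None):
    """Return the indices of items hit by n evenly spaced dollar units.

    Every dollar in the population is treated as a sampling unit, so an
    item's chance of selection is proportional to its amount. Any item at
    least as large as the dollar interval is always selected. Amounts are
    assumed to be positive.
    """
    total = sum(amounts)
    interval = total / n
    rng = random.Random(seed)
    start = rng.random() * interval        # random start within the first interval
    cumulative = list(accumulate(amounts)) # running total of dollars
    hits = set()
    for i in range(n):
        dollar = start + i * interval
        # First item whose cumulative total exceeds the selected dollar.
        index = min(bisect_right(cumulative, dollar), len(amounts) - 1)
        hits.add(index)
    return sorted(hits)


# Hypothetical account balances in whole dollars (total 14,000).
balances = [100, 250, 50, 9000, 300, 120, 4000, 80, 60, 40]
chosen = pps_select(balances, 4, seed=1)
print([balances[i] for i in chosen])
```

With four selection points the dollar interval is 3,500, so the 9,000 and 4,000 balances are selected regardless of the random start, while the small balances are selected only occasionally. A large item can absorb more than one selection point, so fewer than n distinct items may be returned.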

f. Computer Assisted Auditing Techniques (“CAAT”)

The number of transactions being processed in today’s financial payments systems and financial reporting systems makes it almost impossible to manually select and audit 100% of the transactions for control attributes or dollar amounts. This has provided an impetus for Computer Assisted Auditing Techniques (CAAT). CAAT software, with well-designed, cost-effective sampling programs, is among the most economically feasible methods to sample transactions (up to 100%) for control acceptance testing or account ‘Summing’ assurance. For these reasons, the technique can be a cornerstone for gathering auditable information on automated transaction-based processes, and for systems utilizing large database management systems. For organizations with consolidated subsidiaries, CAAT sampling and testing of subsidiary transactions for accuracy and correctness can be a significant time saver, and there is audit software available for this purpose. One of the more effective ways to accomplish an audit of all transactions is to have the transaction details safely copied from the consolidating closing process to an auditor’s “Sandbox” and to use CAAT techniques to verify and sample the detail transactions. This eliminates the risk of changing transactions in the actual “Production” database.

li Barbara Apostolou, Sampling: A Guide for Internal Auditors, (Copyright 2004, IIARF) P 45
lii Id. at P 47

As food for thought, consider that in modern high transaction-volume, computer-based accounting systems, 100% of the auditable transactions are in the IT system’s databases or files at one point or another. Given that consideration, Computer Assisted Auditing Techniques that can apply 100% testing capability to stored transactions make a lot of sense.

VI. Comparative Analysis of Sampling

1. Sampling in Financial Reporting Processes

Once a financial reporting process has been described, the reporting and operational risks analyzed, and the workflow documented, preferably as a flowchart with a supporting walkthrough narrative, the internal auditor can evaluate the effectiveness of specific control attributes by sampling transactions that flow through the complete process, or through complete branches of the reporting process. Generally, the sampling can be done by selecting documents that evidence whether or not the control activity or attribute is being performed. This may be basic, but it is critical if one is basing an audit conclusion related to the process on a sampling plan. There must be adequate documentation that all steps in the process have been performed. Generally a valid statistical sampling plan can verify the operational effectiveness of this complete workflow. If a financial process is organized as a sequence of specific activities performed in a stable and controllable process, the auditor should be able to sample process flow documents rather than the individual transaction documents and verify whether or not the entire process is performing as designed. This could require the process itself to be designed with the capability to collect the necessary information needed (process flow documents), and to provide documented evidence that the attributes and process activities were being performed and recorded in a controlled manner. A possible equivalent of this would be monitoring a manufacturing process using statistical charts to continuously inform the auditor and the process manager if the transaction processing steps were functioning in a stable and controlled manner. This would be a form of statistical process control.

2. Statistical Process Control (SPC)

Statistical process control (SPC) is a technique of relying on a monitoring process to produce predictable quality and quantity production results rather than performing one extensive inspection at the end of a manufacturing line. Although it is predominantly used in manufacturing to produce standard products, it can also be applied to processing transactions for financial reporting. The focus of statistical process control is to design an effective method of processing items, and then to measure and control any deviations or variations in the items being processed or produced. If a process is properly designed and resourced to produce a desired product and level of quality, it will do so consistently and reliably if there are no deviations from the process activities. The purpose of SPC is to ensure that the process is performing reliably and consistently. Statistical sampling of activities and process quality is an essential component of SPC, as is continuous monitoring of the process for any deviations. Samples of the process and product are drawn and converted to charts and other graphical representations of the quality of the process. Variations in the process are discovered quickly, and the “root cause” of the deviation is determined and corrected. In this “feedback loop” manner, the process is statistically controlled, and the products or transactions are processed properly.
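One common SPC chart for attribute data is the p-chart, which tracks the proportion of deviations in periodic samples against three-sigma control limits. The Python sketch below is offered only as an illustration of the charting idea. The choice of a p-chart, the weekly sample size of 50, and the deviation counts are assumptions that do not come from this guideline.

```python
import math


def p_chart_limits(deviations, sample_size):
    """Center line and 3-sigma limits for equal-sized periodic samples."""
    p_bar = sum(deviations) / (len(deviations) * sample_size)
    sigma = math.sqrt(p_bar * (1 - p_bar) / sample_size)
    lower = max(0.0, p_bar - 3 * sigma)   # a proportion cannot be below 0
    upper = min(1.0, p_bar + 3 * sigma)   # or above 1
    return p_bar, lower, upper


def out_of_control(deviations, sample_size):
    """Indices of periods whose deviation rate falls outside the limits."""
    _, lower, upper = p_chart_limits(deviations, sample_size)
    return [i for i, d in enumerate(deviations)
            if not lower <= d / sample_size <= upper]


# Hypothetical data: control deviations found in weekly samples of 50 vouchers.
weekly_deviations = [1, 2, 0, 1, 3, 1, 2, 9, 1, 0]
p_bar, lower, upper = p_chart_limits(weekly_deviations, 50)
flagged = out_of_control(weekly_deviations, 50)
print(round(p_bar, 3), round(upper, 3), flagged)  # 0.04 0.123 [7]
```

Here the eighth week (index 7) shows a deviation rate of 18%, above the upper limit of roughly 12%, so it would be flagged for root cause analysis. As a simplification, the limits are computed from the same periods being judged; in practice they are often set from a baseline period known to be stable.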

Bibliography

PCAOB Auditing Standard No. 5, An Audit of Internal Control Over Financial Reporting That Is Integrated with An Audit of Financial Statements

AICPA Statement on Auditing Standards No. 111, Amendment to Statement on Auditing Standards No. 39, Audit Sampling

AICPA Professional Standards, Audit Sampling, Section 350

AICPA Audit Guide, Audit Sampling, New Edition as of April 1, 2001

Barbara Apostolou, Sampling: A Guide for Internal Auditors, (Copyright 2004, IIARF)

The Professional Practices Framework, March 2007, (Copyright 2004, The IIA Research Foundation, IIARF)

Vincent M. O’Reilly, Barry N. Winograd, James S. Gerson, Henry R. Jaenicke, Montgomery’s Auditing, 12th Edition

Neal B. Hitzig, Statistical Sampling Revisited, CPA Journal, May 2004/Vol. LXXIV No. 5.

Dimitri P. Bertsekas, John N. Tsitsiklis, Introduction to Probability, MIT, Athena Scientific, Belmont Massachusetts P 17