Measuring Compliance: Overcoming Four Challenges and Matching Method to Purpose

Over the past few decades, we have seen the rapid rise of a global “compliance industry.”^[1] With the emergence of this industry came criticisms about whether compliance efforts (e.g., compliance programs, trainings) by corporations are truly effective, or whether programs are at best superficial and at worst symbolic^[2]. Now more than ever corporations are under pressure to show that the compliance programs they have developed actually deliver on their promises. Government agencies have provided recommendations to companies and/or to law enforcement in their efforts to evaluate whether these programs are actually doing what they are supposed to do. The US Department of Health and Human Services, for example, issued a policy document for regulated entities in 2017 titled “Measuring Compliance Program Effectiveness: A Resource Guide,” while the US Department of Justice’s Criminal Division issued a guidance document for prosecutors titled “Evaluation of Corporate Compliance Programs.” These efforts herald an emerging period in corporate compliance—one that prioritizes data collection demonstrating that programs really do reduce undesirable behaviors (e.g., by increasing the accuracy of disclosures, maintaining consistent records), not simply using vague indicators (e.g., number of trainings provided, number of employees on the compliance team) that tell us more about the process than the outcome. In sum, corporations must demonstrate that the implementation of such strategies precedes an actual change in behavior.

To that end, measurement of effectiveness has become the new focus in corporate compliance. This changes the perspective from examining the content of a compliance program to scrutinizing the methods used to evaluate and measure impact. Given this new emphasis on evaluation validity, it is more important than ever that compliance practitioners and compliance scholars understand how their methodological decisions in studying or assessing compliance impact their ability to draw accurate conclusions.

Unfortunately, even within a seemingly cohesive field of study focusing on corporate compliance, there have been conceptual and methodological divergences without a broader discussion about the benefits and costs associated with different research methods. Our edited in-press book, titled Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention (Cambridge University Press)^[3], describes the utility of various research methods and their inherent advantages and disadvantages. Our primary goal in this post, and in the publication of our book, is to encourage compliance practitioners and researchers to think carefully about the research question being asked and whether the data they are using can answer that question^[4]. All too often, researchers limit themselves to available data (generated by the corporation) and fit their questions into a narrow box. For the purpose of this post, we focus on the trade-offs associated with methodological decisions in the financial compliance domain. Before we do so, let us first look at the four fundamental challenges that plague any compliance measurement.

Four Core Challenges in Conducting Compliance Research

There are four core challenges that come with measuring compliance outcomes: conceptual ambiguity, the “dark figure” of corporate compliance, establishing causality, and using other people’s data or previous research.

Conceptual Ambiguity.

It is important to note that the term “corporate compliance” means different things to different people—whether you’re a regulator or a regulated entity, an investor, or on the Board of Directors dictates your definition of compliance, as well as the behaviors that fall under compliant versus noncompliant labels. All law ultimately requires interpretation, and thus establishing whether there is compliance or not often entails an interpretation of rules that are not clear-cut and may be different for different circumstances and even for different actors. This challenges measurement efforts because evaluating compliance requires a uniform and simple measure of the behavior involved and simple labeling of whether such behavior is lawful or illegal.

All of this has become more complex because compliance oversight and measurement is no longer the sole domain of government actors. With the so-called “governance turn,”^[5] a wide range of actors have come to play a role in defining what is compliance or not, including the corporations themselves, private independent monitors hired to provide oversight, credit rating agencies, or insurance companies who provide liability insurance. Each of these parties can define what compliance looks like to them—and then might measure compliance according to their own criteria. As a result of all this, the measurement of compliance comes to depend on who defines and interprets what is and what is not compliant, and—because of the increasing number of parties engaging in such definitional or interpretative actions—data look increasingly different from one measurement source to another (often, even when looking at measurements of the same type of behavior). In all this, there is also a challenge involving the dynamism of financial regulation that can come to inhibit the usage of uniform concepts in compliance measurement. For example, a financial institution may measure the presence of in-person consumer identification protocols as a sign of compliance with “Know Your Customer” regulations. However, changes in banking (e.g., more mobile applications and verification procedures) are likely to impact the adequacy of traditional identification verification measures for truly following the spirit of Know Your Customer regulations—as technology changes the nature of financial transactions over time, measuring compliance as if it were a static process will result in a different picture of compliance than what might be expected by regulators.^[6]

The Dark Figure.

In criminology, there is a long history of studying the underreporting of crime. It is widely recognized that the most used sources of data about crime (e.g., the FBI’s Uniform Crime Reports) woefully undercount how much crime actually occurs. This primarily happens because victims don’t report their victimizations to the police; the extent of underreporting varies by the type of crime being measured (e.g., sexual assaults are historically among the most underreported, while car thefts tend to be the most accurately reported). In corporate compliance there is a similar problem, and it may be even greater. External regulators cannot discover all regulatory violations that occur inside corporations. Corporations are more likely to find out about deviance before law enforcement does, and they prefer to use internal mechanisms over contacting regulators or law enforcement agencies. However, even corporate compliance departments do not know about all illegal activities in the organization. Victims of illegal corporate behavior may not know that the conduct was illegal or that it affected them negatively; they may be unwilling to come forward because they are ashamed of their victimization or worried about repercussions from reporting these crimes to external parties (e.g., being unable to ever recoup lost money^[7]); or they may have been involved in the deceit themselves (e.g., subprime mortgage borrowers^[8]).

In any compliance measurement or assessment, this means that there is no objectively valid estimate of how much compliance and non-compliance there is. Any method of study will face this crucial problem, as the sensitivity of illegal behavior makes it hard to get people to admit that they themselves or their colleagues broke the law. As such, any assessment or measurement approach must account for its inherent limits in assessing the true nature of the non-compliance problem it studies. The dark figure undermines any compliance assessment: without a proper understanding of how much compliance or violation there is at any given time, it is impossible to establish whether compliance management efforts are having an effect. At worst, the act of measurement can itself shape the behavior being measured, through a so-called observer effect.^[9]

Causality.

Although we use the word “cause” in everyday life to mean that something had an impact on something else, it’s important to realize that empirical or scientific research imposes a strict set of criteria for establishing that factor A “causes” factor B. Specifically, it’s not enough to see that two things change together to make a strong argument that A caused a change in B. Scientifically, in addition to observing that the two change together, you also have to determine that A changed first and eliminate alternative explanations for the observed changes.

For example, a company may implement a training to increase compliance with recordkeeping as stipulated by Sarbanes-Oxley. They assign all administrative staff who have been with the company for less than one year to receive the initial training. Over the next year, recordkeeping compliance increased by a large margin across all relevant employees in the company. At first glance, it would appear that the training itself had a better impact than expected! However, there are a few things that might have happened aside from the training that could explain it. The assignment of relatively new employees to the training may have disguised the fact that they would have improved over time naturally, as they gained more experience. Alternatively, there may have been a well-publicized scandal whereby a corporation got in trouble for noncompliance in this domain over the course of that year; if employees were motivated to be more in compliance due to a deterrent effect, the training itself may not have had an impact at all. In this case, had one used random assignment to training (as opposed to choosing a particular type of employee) and carefully assessed pre-training knowledge and compliance versus post-training compliance or knowledge at the individual level, we would be much more confident that the training had the intended impact.
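The logic of this example can be sketched in a few lines of Python. The sketch is illustrative only: the scores, the size of the background improvement, and the function names (`assign_randomly`, `difference_in_differences`) are assumptions invented for this illustration, not data or methods from any actual study. It shows why a simple before-and-after comparison among trained employees overstates the training's impact when everyone improves for other reasons, and how a randomly assigned control group helps isolate the training's own contribution.

```python
import random
from statistics import mean


def assign_randomly(employee_ids, seed=0):
    """Randomly split employees into a training group and a control group."""
    rng = random.Random(seed)
    shuffled = list(employee_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])


def difference_in_differences(pre, post, treated, control):
    """Mean pre-to-post change among treated minus mean change among controls."""
    treated_change = mean(post[e] - pre[e] for e in treated)
    control_change = mean(post[e] - pre[e] for e in control)
    return treated_change - control_change


# Hypothetical data: 100 employees, each starting at a compliance score of 50.
employees = list(range(100))
treated, control = assign_randomly(employees, seed=42)
pre = {e: 50 for e in employees}

# Assumed gains (made up): everyone improves by 10 points for reasons unrelated
# to the training (experience, a publicized scandal); training adds 5 more.
BACKGROUND_GAIN = 10
TRAINING_GAIN = 5
post = {
    e: pre[e] + BACKGROUND_GAIN + (TRAINING_GAIN if e in treated else 0)
    for e in employees
}

# A naive before/after comparison credits the training with the full change.
naive_change = mean(post[e] - pre[e] for e in treated)

# Subtracting the control group's change removes the background improvement.
estimated_effect = difference_in_differences(pre, post, treated, control)

print(naive_change, estimated_effect)
```

In this toy setup the gains are uniform by construction so that the arithmetic is easy to follow. In real data, random assignment matters because it makes the two groups comparable on characteristics, such as tenure, that could otherwise explain the improvement.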

Secondary Data and Other People’s Research.

An easy (and common) way to assess compliance is by using existing data collected, for instance, through governmental inspectors or by companies themselves. Such data may have an advantage because it provides a readily available (should one have access) picture of compliance over a longer period of time, covering larger organizations and jurisdictions. This type of data is excellent for statistical analysis and may even be used to study natural or quasi-experiments, in which compliance before and after a change in compliance strategies can be compared.

However, such data comes with core drawbacks, closely related to the first and second fundamental problems. First, secondary data suffers from the dark figure problem just like any other source of data, yet users of such data may assume that corporate or government inspection data represents actual compliance, given the authoritative nature of the source compared to survey data. A second problem is that each such dataset is rooted in specific interpretations of what is and what is not compliance; these interpretations are made by the organization or individuals that collected it. Often the exact interpretation of law and the definitions used to define core aspects of compliance are not sufficiently clear, and there may well be multiple and conflicting conceptualizations within the data. Moreover, the definitions used may not align with those of the researcher or practitioner evaluating compliance. Furthermore, it is important to realize that the limitations and unknowns associated with such data may mean that it is not well-suited to a particular research question. For example, if I want to understand how many employees failed to take a training on securities regulation compliance and what demographics are associated with noncompliance, the company would have to have decided that such information was worth keeping and worth putting into one place. Finally, there may be inconsistencies when combining data from multiple sources: secondary data from one company or organization can look very different from data provided by another organization, which makes it difficult to run consistent analyses and compare findings across large groups.

Relatedly, there are a lot of studies about corporate compliance and, again, these studies have been conducted using different definitions of compliance, different methods, and different populations of people or industries. Our ability to compare completed studies is limited by the inconsistencies in how those studies were conducted, even if they are looking at the same research questions.

Choosing the Appropriate Method for Your Research Question

In confronting these four challenges and selecting the best measurement method or methods, it is essential that researchers match the methods of measurement to the purpose of the study and the research question (instead of changing the research question to fit the data or drawing inaccurate conclusions from ill-suited data). There is no “perfect” single method: each approach to measuring compliance comes with strengths and weaknesses, and none can singularly overcome all four core challenges. We encourage potential researchers to ask themselves four questions, drawn from the four core challenges, about their interests when deciding on data collection strategies:

Are you interested in understanding how compliance is conceptualized by various parties, or are you approaching your study of compliance with a clear definition of what compliance means to the people interested in the results?
Are you concerned with actual noncompliance, or is your research question answerable with a focus on only recorded instances of noncompliance, proxies (indirect measures) of such compliance, or even just the outputs from the compliance management system?
Do you want to firmly establish a causal relationship between your predictor and your outcome, or is your research question more descriptive or exploratory in nature?
How do we make sense of existing data and research?

Question 1. With regard to conceptual challenges, if you are more interested in assessing how compliance is defined by various parties, then qualitative methods like intensive interviews and focus groups^[10], ethnographies^[11], and “mixed methods” studies^[12] are better suited for exploring the complicated nature of compliance as it is experienced by the people within corporations. These methods all encourage the researcher to “get close” to the people in organizations, to build trust, and to explore their lived experience. Survey research^[13] can also capture various perspectives but is more superficial in exploring conceptual differences from the perspective of people in the organizations studied.

If your project more simply wants to determine what narrowly (and clearly) defined compliance looks like, then existing data such as corporate data^[14], regulatory inspection data^[15], and aggregate-level outcome data^[16] are excellent approaches.

Question 2. Many compliance researchers would love to examine noncompliance as it is actually experienced in everyday life, not simply measure compliance that is detected by external parties or as is reported on surveys. Again, here, qualitative methods will be better for getting at the “dark figure” of noncompliance because these researchers spend a lot of time encouraging people to share their experiences in a confidential setting^[17]. Qualitative researchers are often carefully trained in the art of interviewing people about sensitive subjects, and when a scholar can directly observe the inner workings of an organization (as in an ethnography or field research) then they might see noncompliance themselves that might otherwise have gone unnoticed.

In the case of quantitative data collection efforts like surveys, experimental studies and the various forms of existing data described above, a fundamental problem exists in that much of the time people will be aware that they are being studied. People know that their organization is recording their efforts, and survey questions or experiments are often transparent about their purpose. All of this means that people are unlikely to act “naturally” when they know they are being monitored but don’t trust or don’t know the person collecting the information.

The trade-off here, of course, is the scope of the data collected. Qualitative work is labor intensive, and therefore studies using this method are unlikely to include a large number of people or organizations. Those who wish to get a broader picture of compliance across a larger organization or across many organizations will likely have to rely on quantitative data that may be more vulnerable to the dark figure of noncompliance.

Question 3. When thinking about causal relationships, experimental methods in which people are randomly assigned to an intervention^[18] are often considered to be the best way to determine that a supposed predictor has an actual impact on an outcome and that the relationship is not due to something else. When surveys are conducted multiple times on the same people, they can also provide strong evidence of a causal impact, so long as possible confounding factors are accounted for. Unfortunately, though, much compliance scholarship relies on cross-sectional research and existing data, which are limited in establishing time ordering and in eliminating rival hypotheses. Of course, as mentioned above, the trade-off here is that the stronger causal inference gained through experiments or surveys may come at the cost of a less valid picture of the behavior itself.

Question 4. Finally, it’s important for scholars to think about how they can use existing data and previous research in a meaningful way. To overcome the difficulties in knowing which research study best reflects the question of interest, or in making sense of inconsistent results stemming from methodological choices, systematic reviews and meta-analyses^[19] are excellent methods for synthesizing data across studies and coming to an “overall” conclusion about a body of research. In fact, if enough data exists on a topic, meta-analyses can further inform us about how methodological choices impact the findings.
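To make the idea of an “overall” conclusion concrete, here is a minimal sketch of one common pooling approach, inverse-variance (fixed-effect) weighting, together with Cochran's Q as a rough indicator of inconsistency across studies. The effect sizes and standard errors are invented for illustration, and the function name `fixed_effect_pool` is hypothetical. When studies differ substantially in definitions and methods, as the compliance studies discussed here often do, a random-effects model is frequently the more appropriate choice; the fixed-effect version is shown only because it is the simplest to follow.

```python
import math


def fixed_effect_pool(effects, std_errors):
    """Inverse-variance (fixed-effect) pooling of study effect sizes.

    Returns (pooled_effect, pooled_standard_error, cochran_q).
    """
    # More precise studies (smaller standard errors) receive larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    # Values well above (number of studies - 1) suggest heterogeneity.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return pooled, pooled_se, q


# Hypothetical effect sizes and standard errors from three imaginary studies.
effects = [0.20, 0.40, 0.30]
std_errors = [0.10, 0.10, 0.20]

pooled, pooled_se, q = fixed_effect_pool(effects, std_errors)
print(round(pooled, 3), round(pooled_se, 3), round(q, 3))
```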

Again, it’s also important to understand that existing data (e.g., corporate or inspection data) has limitations, but we should also emphasize that it is appropriate to use and can add much-needed value to the literature. It is especially useful in a mixed methods study, where the scholar can triangulate the results from the secondary data with another form of primary data collection. Furthermore, there are increasingly sophisticated statistical techniques (such as data simulations^[20]) that help improve upon the utility of existing data.

Additional considerations. In addition to the four questions, scholars should also consider two factors that matter immensely when conducting research: generalizability and feasibility. Generalizability refers to the idea that the findings from a single study’s sample can be applied to a larger group of people who did not actually participate in that research. If you want to be able to take the results of your study and talk about them as though they apply to a larger group of people, then surveys are often thought to be the best method simply because they are so easy to distribute (i.e., they can reach many people very quickly). Furthermore, it’s often easier to use “probability sampling designs”—which means that you choose a sample of people randomly from a larger list—when distributing surveys through the mail, over the phone, or in-person. Addresses and phone numbers can be gathered from various government sources, workplaces, schools, etc. and can be used to reduce bias in recruiting study participants.
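As a small illustration of a probability sampling design, the sketch below draws a simple random sample from a sampling frame using Python's standard library. The roster of employee IDs is made up, and the helper name `simple_random_sample` is hypothetical. Every unit in the list has the same chance of selection, which is what distinguishes this design from convenience recruitment.

```python
import random


def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Draw a simple random sample without replacement from a sampling frame."""
    frame = list(sampling_frame)
    if sample_size > len(frame):
        raise ValueError("sample_size cannot exceed the size of the sampling frame")
    # A seeded generator makes the draw reproducible and auditable.
    rng = random.Random(seed)
    return rng.sample(frame, sample_size)


# Hypothetical sampling frame: an employee roster with made-up ID strings.
roster = [f"employee_{i:04d}" for i in range(1, 501)]

# Invite 50 of the 500 listed employees to take the survey.
invited = simple_random_sample(roster, sample_size=50, seed=2024)
print(len(invited))
```

Recording the seed alongside the roster lets others reproduce exactly who was invited, which can help when documenting how a sample was drawn.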

Unfortunately, the same methods discussed above that help ensure more authentic data or establish causal mechanisms are those that make it more difficult to obtain information from a large or representative group of people. Experiments, ethnographies, intensive interviews, etc. are all more difficult to implement and make it harder to recruit participants; such research designs often collect data from one specific setting and from relatively few people compared to the larger population of interest.

Finally, feasibility refers to the notion that researchers are often limited in terms of the time and financial resources they have. The study being conducted should be one that can be fully completed given time and funding constraints. If a scholar wants to maximize efficiency, survey research and using existing data are often thought to be strongest in this area. As stated in the previous paragraph, experiments, ethnographies, intensive interviews, etc. are all more difficult to implement; they often require some form of travel, compensation, or intense recruitment efforts and thus increase the time and money required from the researcher.

Conclusion

In sum, Measuring Compliance seeks to provide compliance scholars and practitioners with a repository of methodological strategies that can be used to get a great fit between the research question of interest and the data collected. In contributing to this repository, our authors did a fantastic job of outlining both the benefits and limitations of each strategy. Again, it must be emphasized that there is no single method that will be good at everything; knowing the trade-offs from choosing one strategy over another is essential. It’s also important to realize that despite methodological limitations, all research has the potential to make a meaningful contribution to knowledge. Compliance researchers should be encouraged to do the best research they can (given any constraints they operate under) while also being discouraged from making claims about their research that simply are not warranted (e.g., claiming that a training definitively caused a change in behavior when an experimental design was not used; claiming that the findings apply to an entire industry when only one company was studied).

There is no question, however, that compliance research can be improved across the field. Primarily, we believe that the use of “mixed methods” studies, which combine qualitative and quantitative methodological approaches to study one research question, can overcome many limitations associated with the use of a single method. Quantitative methods are often pitted against qualitative methods in terms of their benefits and costs. For example, qualitative methods often achieve in-depth understandings of a phenomenon but are narrow in their generalizability, while quantitative methods often take a broader view of the population of interest but do so more superficially. By combining a qualitative approach with a quantitative approach, one can overcome many of the limitations associated with each method.

As compliance researchers ourselves, we naturally encourage academic, governmental, and private organizations to make more time and money available for the study of these problems! Perhaps pessimistically, however, we don’t see a sudden influx of either happening any time soon, and the easiest or cheapest methodological choices will likely continue to proliferate in compliance research. As stated above, all research can contribute meaningfully to the body of knowledge. Existing data sources are often goldmines of information, and we think the field would benefit from having such data made more available to scholars and from organizations collaborating with researchers to ensure that their data is prepared for scholarly analysis as well as practical applications.

Relatedly, we believe that the field would benefit more generally from increased partnerships between scholars and practitioners. Being able to access the people within corporations who define compliance, observe actual misconduct, participate in compliance trainings, etc. is incredibly difficult for people outside of the organization; inaccessibility ultimately limits our ability to see what actual compliance (including behaviors missing from official records) looks like. Having scholar-practitioner teams who can work together to understand more fully what is happening within organizations could improve not only our knowledge of corporate behavior but also the experiences of employees and the culture of the organization. We recognize that there will be concerns about research conducted in such collaborations: conflicts of interest are likely to arise, and the corporation may fear legal repercussions if an outside researcher uncovers misconduct. Even so, we know that compliance practitioners and compliance scholars have a similar goal: to prevent noncompliance and protect people from harm. With that goal in mind, collaborative teams could feasibly agree upon what the true outcome of interest is and how to handle noncompliance in a transparent manner before data collection begins. For example, “nudge units” (in which government agencies hire researchers to work with government employees to design policy changes that impact citizen behavior) have increasingly proliferated across the globe. These efforts might serve as a template for private organizations to obtain help in understanding and motivating compliance within their organization, protecting their workers as well as themselves from undesirable outcomes.

Melissa Rorie is an Associate Professor of Criminal Justice at the University of Nevada, Las Vegas

Benjamin van Rooij is a Professor at the University of Amsterdam

This post is adapted from their paper, “Measuring Compliance: The Challenges in Assessing and Understanding the Interaction between Law and Organizational Misconduct” available on SSRN.

^[1] A recent analysis estimated that the global market size for corporate governance, risk, and compliance reached US$31.27 billion in 2019; https://www.grandviewresearch.com/industry-analysis/enterprise-governance-risk-compliance-egrc-market.

^[2] See, e.g., the following references: Chen, Hui, and Eugene Soltes. 2018. “Why compliance programs fail and how to fix them.” Harvard Business Review 96 (2):115-125.; Krawiec, Kimberly D. 2003. “Cosmetic compliance and the failure of negotiated governance.” Wash. ULQ 81:487; McKendall, Marie, Beverly DeMarr, and Catherine Jones-Rikkers. 2002. “Ethical compliance programs and corporate illegality: Testing the assumptions of the corporate sentencing guidelines.” Journal of Business Ethics 37 (4):367-383.; Parker, Christine, and Vibeke Lehmann Nielsen. 2009. “Corporate Compliance Systems Could They Make Any Difference?” Administration & Society 41 (1):3-37.; Weaver, Gary R, Linda Klebe Treviño, and Philip L Cochran. 1999. “Corporate ethics practices in the mid-1990’s: An empirical study of the Fortune 1000.” Journal of Business Ethics 18 (3):283-294.

^[3] Rorie, M. and van Rooij, B (eds.). in press. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.

^[4] See also van Rooij, B. and Rorie, M. in press. “Measuring Compliance: The Challenges in Assessing and Understanding the Interaction between Law and Organizational Misconduct.” In Rorie, M. and van Rooij, B (eds.). (in press). Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.

^[5] van Wingerde, K., and Bisschop, L. in press. “Measuring compliance in the age of governance: How the governance turn has impacted compliance measurement by the state.” In Rorie, M. and van Rooij, B. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.

^[6] See, e.g., Arner, D.W., Zetzsche, D.A., Buckley, R.P. et al. 2019. “The Identity Challenge in Finance: From Analogue Identity to Digitized Identification to Digital KYC Utilities.” European Business Organization Law Review 20, 55–80. https://doi.org/10.1007/s40804-019-00135-1.

^[7] See Deem, D. L. (2000). Notes from the field: Observations in working with the forgotten victims of personal financial crimes. Journal of Elder Abuse & Neglect, 12(2), 33-48.

^[8] See Nguyen, T. H., & Pontell, H. N. (2010). Mortgage origination fraud and the global economic crisis: A criminological analysis. Criminology & Public Policy, 9(3), 591-612.

^[9] Van Rooij, B., Wu, Y., & Li, N. (2021). Compliance Ethnography: What gets lost in compliance measurement. In M. Rorie & B. Van Rooij (Eds.), Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press.

^[10] Rinfret, S., and Pautz, M. in press. “Engaging qualitative research approaches to investigate compliance motivations: Understanding the how and why of compliance.” In Rorie, M. and van Rooij, B. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.

^[11] van Rooij, B., Wu, Y., and Li, N. in press. “Compliance ethnography: What gets lost in compliance measurement.” In Rorie, M. and van Rooij, B. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.

^[12] Jordanoska, A., and Lord, N. in press. “Mixing and combining research strategies and methods to understand compliance.” In Rorie, M. and van Rooij, B. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.

^[13] Rorie, M., in press. “Self-Report Surveys and Factorial Survey Experiments.” In Rorie, M. and van Rooij, B. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.

^[14] Soltes, E., in press. “Measuring compliance risk and the emergence of analytics.” In Rorie, M. and van Rooij, B. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.; Pellafone, R. in press. “A practical way to measure corporate compliance efforts.” In Rorie, M. and van Rooij, B. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.

^[15] Stafford, S., in press. “Using Regulatory Inspection Data to Measure Environmental Compliance”. In Rorie, M. and van Rooij, B. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.

^[16] Blanc, F., and Colletti, P. in press. “Using Outcomes to Measure Aggregate-Level Compliance: Justifications, Challenges, and Practices”. In Rorie, M. and van Rooij, B. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.

^[17] van Rooij, B., and Rorie, M. in press. “Admitting noncompliance: Interview strategies for assessing undetected legal deviance.” In Rorie, M. and van Rooij, B. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.

^[18] Rorie, M., in press. “The use of randomized experiments for assessing corporate compliance.” In Rorie, M. and van Rooij, B. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.

^[19] Schell-Busey, N. in press. “Using meta-analysis/systematic review to examine corporate compliance.” In Rorie, M. and van Rooij, B. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.

^[20] West, M., and Rorie, M., in press. “Data Simulations as a Means of Improving Compliance Measurement.” In Rorie, M. and van Rooij, B. Measuring Compliance: Assessing Corporate Crime and Misconduct Prevention. Cambridge University Press. ISBN: 9781108488594.
