1 Introduction

The need to incorporate empirical research methods in legal scholarship remains a debated topic.1 Nevertheless, it is clear that ‘[…] empirical legal scholarship has expanded dramatically in the past decade.’2 Law schools increasingly hire empirically oriented researchers, while nearly half of the legal articles in the law reviews analyzed by Diamond and Mueller refer to empirical data.3 Legal researchers, however, rarely conduct their own empirical research.4 Hence, the most common approach to answering legal questions involves the reading and interpretation of legal material without a specific approach as to the selection of materials and the method used to draw conclusions.5

This article does not delve into the discussion of whether legal scholars ought to conduct empirical research. Instead, it focuses on when and how a particular research method,6 commonly utilized by empirical researchers, namely “systematic content analysis” (SCA), could be highly beneficial to legal scholarship. SCA is a systematic and replicable technique applied to analyze a variety of texts, ranging from interview transcripts to legal texts such as case law and legislation.7 To be precise, SCA is a research technique, and its application does not necessarily result in an empirically oriented paper. Scholars can therefore use it to enhance traditional legal research without producing an empirical study.

Support for SCA was already voiced in 2008 by Hall and Wright, who claimed that SCA provides a valuable alternative to analyze the content of judicial opinions.8 Moreover, in 2012, Oldfather and others argued that SCA should be incorporated in legal research. They stated that ‘[…] systematic content analysis reflects both traditional legal scholarship’s attention to texts and empirical legal studies’ systematization.’9 Hall and Wright further proposed that SCA could provide the basis for a unique legal empirical methodology.10 One of their main arguments was that SCA ‘[…] comes naturally to legal scholars because it resembles the classic scholarly exercise of reading a collection of cases, finding common threads that link the opinions, and commenting on their significance.’11 They further noted that SCA enables not only the better reading of cases, but also a deeper understanding of the case law.12 Their work, however, is solely focused on the utility of SCA in analyzing judicial opinions.

This article has a broader scope and discusses other possible uses of SCA in legal scholarship. Furthermore, as Hall and Wright’s article is almost a decade old, this article provides a current overview of the existing arguments for and against applying SCA in legal research. In shedding light on the utility of SCA in legal research, the aim is not to discredit other research methods originating from social sciences. Rather, the choice to focus on SCA was influenced by the author’s positive experience with this method in the author’s ongoing PhD research.

Moreover, in advocating for the need to apply SCA in legal research, this article does not aim to present SCA as a replacement for traditional legal research. It simply argues that SCA is a beneficial supplement as it enables legal scholars to gain fresh insights into their sources.13 To further support this claim, the second section provides an in-depth overview of SCA, as well as its various stages. Subsequently, the third section assesses the suitability and necessity of SCA by considering its limitations and benefits. In addition, the fourth section reflects on the issues, ranging from emotional to practical, that may be faced by legal scholars when attempting to conduct SCA. This article concludes by emphasizing the need to familiarize not only researchers, but also law students, with the world of SCA.

2 Systematic Content Analysis

SCA is frequently utilized by social scientists as well as researchers in communication studies to analyze interview transcripts, literature, and field notes amongst other sources. The advantage of SCA is that it can produce both qualitative and quantitative results by enabling the researcher to compress ‘[…] many words of text into fewer content categories based on explicit rules of coding.’14 Furthermore, this method empowers the researcher to analyze large volumes of data ‘[…] with relative ease and in a systematic fashion.’15 It is considered by some as ‘a sophisticated form of accounting.’16 Thereby, the researcher can detect words and concepts in the data in order to draw inferences and find patterns. The feature that distinguishes SCA from traditional legal research is that the data collection and its subsequent analysis must follow a specific path. The sections below provide a detailed overview of the various stages that legal scholars must journey through to conduct a SCA.17

2.1 Stages of Systematic Content Analysis

Hall and Wright claim that SCA of judicial opinions can be divided into three stages: ‘(1) selecting cases; (2) coding cases; and (3) analyzing the case coding, often through statistical methods.’18 In my opinion, it is more accurate to divide the various stages of SCA in legal research into five steps: (1) determination of a suitable research question or hypothesis for SCA; (2) identification and collection of sufficient data for analysis; (3) coding of the data, which has its own stages; (4) drawing of conclusions/observations; and (5) reporting the findings in a manner comprehensible to the legal community.

The first stage of SCA is the same as the initial stage of most legal research; it involves the determination of the research question or hypothesis. When a legal researcher opts for SCA, she must ensure not only that the questions or hypotheses posed are interesting topics for research, but also that they are suitable for SCA. Although it is impossible to list all scenarios where SCA would be suitable, there are several instances where SCA has been successfully used in legal research. The most common use of SCA in law involves the analysis of large volumes of data of equal weight or value, such as case law.19 For instance, Van Harten utilized SCA to analyze all publicly available arbitral awards (140 awards) addressing jurisdictional matters under investment treaties until May 2010 to study arbitrators’ behavior in asymmetrical adjudication.20 In addition, SCA can be used to answer questions for which there is no answer in legal scholarship, judicial opinions or legislative texts. In my research, I employed SCA to study the content of 172 Alternative Dispute Resolution (ADR) agreements in order to answer the question ‘what are the parties’ obligations under an ADR agreement?’ Amongst other aspects, I coded the agreements for the obligation to refrain from litigating and/or arbitrating, participation obligations and the obligation to produce information. The coding enabled me to draw conclusions about the agreements under analysis.

The second stage of SCA requires the identification and collection of data for analysis. Here, the researcher must determine whether her study aims at covering an entire population or a (representative) sample.21 Legal scholars employing SCA tend to limit the scope of their research based on subject matter, geography, language and/or time. Diamond and Mueller limited their study to law journals written in English published from 1998 to 2008.22 The scope can also be very limited. Manderson and others, in their article ‘Building Information Modelling and Standardised Construction Contracts: a Content Analysis of the GC21 Contract’, limited their scope to the GC21 2nd edition, an Australian standardized construction contract, to identify possible changes to facilitate the implementation of Building Information Modelling in a collaborative environment.23

Moreover, a researcher may decide to set rather open parameters for sample selection, such as ‘all Alternative Dispute Resolution agreements provided by surveyed respondents’.24 Parameters differing from the traditional time and geographical components are needed when it is impossible to estimate the population. For example, it is impossible to know how many commercial contracts have been concluded with a mediation clause in a given time range and geographical scope, because these agreements are not publicly recorded. Lastly, even with a clear scope, the total population may be too large to analyze. In these situations, the researcher must resort to sampling techniques, including random or systematic sampling, such as every tenth case; quota sampling, such as all cases up to 100 cases per year per jurisdiction; or purposive sampling, such as cases cited by arbitrators in investment disputes.25
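For readers who prefer to see the sampling techniques in concrete form, the following Python sketch illustrates one possible way to draw random, systematic, quota and purposive samples from a list of cases. It is an illustration only: the field names (`id`, `year`, `jurisdiction`), the function names and the toy population are assumptions made for this example and do not come from any of the studies cited.

```python
import random
from collections import defaultdict


def random_sample(cases, size, seed=0):
    """Simple random sample; a fixed seed keeps the draw replicable."""
    return random.Random(seed).sample(cases, size)


def systematic_sample(cases, step=10, start=0):
    """Every step-th case (e.g. every tenth), beginning at index start."""
    return cases[start::step]


def quota_sample(cases, quota=100):
    """Keep at most `quota` cases per (year, jurisdiction), in list order."""
    counts = defaultdict(int)
    kept = []
    for case in cases:
        key = (case["year"], case["jurisdiction"])
        if counts[key] < quota:
            counts[key] += 1
            kept.append(case)
    return kept


def purposive_sample(cases, criterion):
    """Keep only the cases that satisfy a criterion chosen by the researcher."""
    return [case for case in cases if criterion(case)]


# Hypothetical population of 60 cases, invented purely for illustration.
cases = [
    {"id": i, "year": 2000 + i % 3, "jurisdiction": "BE" if i % 2 else "NL"}
    for i in range(60)
]

every_tenth = systematic_sample(cases, step=10)
capped = quota_sample(cases, quota=5)
belgian_only = purposive_sample(cases, lambda c: c["jurisdiction"] == "BE")
```

Whichever technique is chosen, recording the parameters used (the step, the quota, the seed or the selection criterion) is what allows another researcher to reproduce the sample.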

The third stage of a SCA requires the researcher to code the data. The codes are assigned by the researcher to words or concepts in the material. They can be used to track, for example, the subject matter, reasoning and arguments, as well as superficial aspects of the data such as length, author and language. There are multiple approaches to coding that reflect the aim of the researcher. The varying approaches are extensively discussed by Saldaña in The Coding Manual for Qualitative Researchers.26 The coding process itself also proceeds in stages. Hall and Wright observe four stages of coding: (1) […] create a tentative set of coding categories a priori. Refine these categories after thorough evaluation, including feedback from colleagues, study team members, or expert consultants. (2) Write a coding sheet and set of coding instructions (called a “codebook”), and train coders to apply these to a sample of the material to be coded. Pilot test the reliability (consistency) among coders by having multiple people code some of the material. (3) Add, delete, or revise coding categories based on this pilot experience, and repeat reliability testing and coder training as required. (4) When the codebook is finalized, apply it to all the material.27

This article agrees with the steps discussed by Hall and Wright. The start list, that is, the tentative set of coding categories, can be created following a preliminary examination of the data (emergent coding) or on the basis of a specific theory (a priori coding).28 It is advisable that the start list includes several non-duplicative categories and sub-categories to fine-tune the analysis. For instance, parties’ participation obligations under an ADR agreement can be divided into the following sub-categories: obligation to set up the mediation, obligation to attend, obligation to behave in good faith during the mediation and obligations as to confidentiality and privacy. There is no limitation regarding the number of categories and sub-categories as long as the researcher explains her need to code these aspects.
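A start list of this kind can be written down as a simple nested structure. The Python sketch below uses the categories and sub-categories mentioned in this article for the ADR agreements; the short labels and the helper functions are assumptions introduced for illustration, not the codebook actually used in the study.

```python
from collections import Counter

# Categories mapped to their sub-categories. The labels are hypothetical
# shorthand for the obligations described in the text.
CODEBOOK = {
    "refrain_from_litigation_or_arbitration": [],
    "participation": [
        "set_up_mediation",
        "attend",
        "good_faith",
        "confidentiality_and_privacy",
    ],
    "produce_information": [],
}


def all_codes(codebook):
    """List every code, writing sub-categories as 'category/sub_category'."""
    codes = []
    for category, sub_categories in codebook.items():
        codes.append(category)
        codes.extend(f"{category}/{sub}" for sub in sub_categories)
    return codes


def duplicate_labels(codebook):
    """Return labels that appear more than once anywhere in the codebook."""
    labels = list(codebook)
    for sub_categories in codebook.values():
        labels.extend(sub_categories)
    return sorted(label for label, n in Counter(labels).items() if n > 1)
```

Running `duplicate_labels` after each revision of the start list is one simple way to check that the categories remain non-duplicative as codes are added, deleted or revised.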

Regarding the application of the codes to the data, it is expected that the researcher or a second researcher (co-coder) ensures the objectivity and reliability of the codes assigned through repeated coding.29 Reliability can be affected when there is ambiguity regarding the meaning of the codes and the strategy used to apply them. Accordingly, it is essential that the researcher develops explicit written instructions regarding how the codes are to be assigned. Although it is not mandatory to apply formal reliability testing, researchers may conduct stability30 and reproducibility31 tests.32 Hall and Wright, however, found that it was not common for legal studies utilizing SCA to address reliability: ‘65% (87/134) of projects we reviewed had no discussion of coding reliability.’33 This is not surprising, as legal scholars in general tend not to be versed in statistics. Section four further discusses the issues that can arise from the lack of familiarity with reliability testing.
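To give a sense of what a formal reliability test can involve, the Python sketch below compares two rounds of coding of the same material. This article does not prescribe a particular statistic; simple percent agreement and Cohen's kappa are used here as an assumption because they are widely used measures of agreement between two sets of codes. The two rounds can be read either as the same researcher coding twice or as a researcher and a co-coder. The function names and the data are hypothetical.

```python
from collections import Counter


def percent_agreement(first, second):
    """Share of items that received the same code in both coding rounds."""
    if len(first) != len(second) or not first:
        raise ValueError("both rounds must code the same, non-empty set of items")
    return sum(a == b for a, b in zip(first, second)) / len(first)


def cohens_kappa(first, second):
    """Agreement between two coding rounds, corrected for chance agreement."""
    n = len(first)
    observed = percent_agreement(first, second)
    counts_first, counts_second = Counter(first), Counter(second)
    # Chance agreement: probability that both rounds pick the same code
    # if each round assigned codes independently at its own observed rates.
    expected = sum(
        counts_first[code] * counts_second[code]
        for code in counts_first.keys() | counts_second.keys()
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both rounds used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical data: 1 = code applied to the agreement, 0 = not applied.
round_one = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
round_two = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

agreement = percent_agreement(round_one, round_two)
kappa = cohens_kappa(round_one, round_two)
```

In this invented example the two rounds agree on eight of the ten items, but kappa is lower than the raw agreement because part of that agreement would be expected by chance alone.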

The fourth stage of SCA requires the researcher to draw conclusions and make observations on the basis of the coding. There is a tendency amongst users of SCA to count and document the recurrence of codes. Here, researchers who rely on software for data analysis can utilize its built-in features. As describing the features of every such program would be too time-consuming, this study provides a simplified overview of one popular program, Nvivo. Nvivo supports qualitative and mixed methods research. Its purpose is to aid the researcher in organizing, analyzing and finding insights in unstructured or qualitative data, ranging from articles, interviews and survey responses to legislation and case law. Software meant for the analysis of qualitative data enables the researcher to organize and manage her material, to gain insights into the data and to delve deeper into its various aspects. The features of Nvivo are numerous, ranging from the creation of charts, word clouds and comparison diagrams to coding queries. The most efficient way to understand the functionality of Nvivo is to watch the training videos that are freely available on the web.

To demonstrate, in my study of 172 ADR agreements, I sought to assess how many of these agreements required the parties to utilize the rules of dispute resolution providers. Following the coding of the agreements for this condition, I conducted a query of all of the agreements that contained the specified code. According to the query, 71 percent of the agreements required the application of procedural rules of designated dispute resolution providers. To reiterate, SCA does not necessarily require the researcher to apply sophisticated statistics.34 Results of simple frequency counts are still valuable and provide insights into the material under analysis. Therefore, quantitative findings as such can have qualitative value if they are used to draw inferences.
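A frequency count of this kind does not require specialized software. The Python sketch below shows, under stated assumptions, how the share of documents carrying a given code could be computed once the coding is recorded. The five agreements and their codes are invented for illustration and are not the data from the study described above.

```python
from collections import Counter


def share_with_code(coded_documents, code):
    """Fraction of documents to which `code` was applied at least once."""
    if not coded_documents:
        raise ValueError("no coded documents supplied")
    hits = sum(1 for codes in coded_documents.values() if code in codes)
    return hits / len(coded_documents)


def code_frequencies(coded_documents):
    """Number of documents in which each code appears."""
    tally = Counter()
    for codes in coded_documents.values():
        tally.update(set(codes))
    return tally


# Hypothetical coding results: document name -> set of codes applied.
coded_agreements = {
    "agreement_01": {"provider_rules", "good_faith"},
    "agreement_02": {"good_faith"},
    "agreement_03": {"provider_rules"},
    "agreement_04": {"provider_rules", "confidentiality"},
    "agreement_05": set(),
}

provider_share = share_with_code(coded_agreements, "provider_rules")
frequencies = code_frequencies(coded_agreements)
```

The same tally can be repeated for every code in the codebook, which gives the researcher a simple overview of which obligations are common and which are rare in the material.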

The final stage of SCA relates to the reporting of the various links and themes as well as the conclusions drawn. The researcher must first and foremost ensure that the conclusions reached are not presented as reflective of aspects or data not analyzed. Furthermore, as emphasized by Epstein and Martin, it is of the essence to clearly report on the above stages in a manner comprehensible to the legal community. This is indispensable to a successful application of SCA in legal research, as the relative newness of this method might raise more questions than praise. In particular, questions might arise as to the validity of a SCA conducted by a junior legal scholar, as we often have very little to no non-law research experience.35 Section four further discusses this challenge.

3 Suitability and Necessity of Systematic Content Analysis

The 2008 study by Hall and Wright found that the application of SCA by legal scholars was increasing annually.36 There has been no further research into the frequency of the use of SCA in law since. This raises the question of whether the absence of recourse to this method is because legal research must follow traditional methods or because legal researchers are unaware of the utility of SCA. Perhaps the lack of use of SCA is due to its functional difference from traditional legal research. SCA provides objective and scientifically verifiable findings. Legal researchers, however, often seek to analyze leading decisions or legislation:

[I]n a subjective, interpretative, or advocacy mode akin to the way a literary critic interprets poetry […] These rhetorical goals are not advanced meaningfully by systematic, objective verification of claims about the content of a collection of judicial opinions.37

Therefore, SCA is limited in its utility, as it cannot be used to evaluate the correctness of judicial opinions or legislation.38 According to Hall and Wright, SCA ‘[…] does not fully capture the strength of a particular judge’s rhetoric, the level of generality used to describe the issue, and other subtle clues about the precedential value of the opinion. To some extent, the method trades depth for breadth.’39 SCA is also limited to acting as a tool for observation: ‘more broadly, content analysis involves a compromise between, on one hand, the benefits of systematized review and analysis and, on the other hand, the costs of reducing complex qualitative phenomena to quantitative indicators.’40 Evidently, SCA would be suitable in cases where the researcher seeks to analyze legal material in a replicable manner and from an objective stance.41 This is not always the aim in legal scholarship, as at times an influential scholar’s specific insights are sought about a particular case. In such a situation, the reader relies on the author’s judgement to select the cases worthy of discussion.42

Due to the limitations of SCA, this article only promotes SCA as a supplement to traditional legal research and not as a replacement. When used properly, SCA can strengthen the study of law. Moreover, according to Siems and Síthigh, the shift in legal scholarship towards the inclusion of social science methods ‘[…] may prove an eventual disciplinary consensus, turning law into a mature science.’43 The section below further discusses when and how the incorporation of SCA in legal scholarship can enhance legal research.

First, traditional legal research, while effective in addressing legal questions, does not enable the in-depth analysis of the content of legal material, as it does not contain a specific approach to studying legal material. Furthermore, it does not provide sufficient answers to questions regarding the frequency of certain practices or the relationship between multiple aspects of the sources under study. As Hall has stated, ‘content analysis can identify previously-unnoticed patterns that warrant deeper study, or sometimes correct misimpressions based on more ad hoc surveys of atypical cases.’44 There have been several instances of legal researchers relying on SCA to disprove previous claims based on a conventional view. For instance, Henderson and Eisenberg employed SCA to disprove critics who argued that product liability litigation was rapidly increasing.45

SCA hence enhances the existing approach to the analysis of legal texts, as it requires the researcher to record the consistent features in the data by applying codes. To illustrate, Matheson conducted a SCA of piercing the corporate veil in the parent-subsidiary context. By statistically explaining the propensities of modern courts for piercing the corporate veil in the parent-subsidiary situation, his aim was ‘[…] that courts, commentators, and practitioners may be better equipped to understand and predict under what circumstances a court is likely to exercise its equitable discretion and hold a parent company liable.’46

Second, SCA provides researchers with an alternative route when traditional legal research does not provide answers. As discussed above, there are no judicial opinions or legislative texts, and only a few scholarly works, that address the parties’ obligations under ADR agreements. This means that the author would be the first legal scholar to attempt to answer these questions in a systematic manner. The choice of SCA was fitting, as the parties’ agreement to resort to ADR is the basis of their rights and obligations under the contract.

Third, as mentioned above, various coding software packages, some of which are freely available, enable the organized and at times automatic analysis of large volumes of data. As Chaib notes, Nvivo helped her ‘[…] process the large amount of cases in a structured way’.47 The benefit of having the ability to organize and analyze larger volumes of data is threefold. First, the research method enables the researcher to produce more representative findings. For example, Vandaele used content coding software to analyze 976 cases of the Belgian Council of State from 1996 to 2012.48 Her aim in studying this representative sample was to assess ‘[…] if the interpretation of the interest to sue by the Council of States is indeed as eccentric as often claimed.’49 The software enabled her to keep better track of the codes she used to qualify the content of the cases. Her reliance on SCA and the accompanying software meant she was the first researcher to answer her research question in a systematic manner. The ability to analyze and organize large datasets liberates the researcher from the limitations of the human mind in organizing and remembering numerous points. The analysis of vast datasets ‘[…] promise[s] to lead us to a greatly enhanced understanding of the workings of the judiciary and the legal system more generally.’50

Lastly, according to Hall and Wright, studies that rely on SCA are more likely to be cited on average than those that do not:

[U]sing the number of citations as a crude but widely accepted proxy for the influence of a scholarly work, content analysis projects published during the 1990s generated an average of seventy-seven citations per article, with a median of thirty citations per project. Considering projects from all decades, 87% received at least one citation and 71% generated at least five citations. These citation patterns compare favorably to the general trends in legal scholarship. According to Thomas Smith’s ongoing research on the citation patterns in legal scholarship, 40% of articles receive no citations at all.51

Their finding is not surprising, as SCA empowers the researcher to justify a claim numerically, which is easy to comprehend, refer to and reproduce.

4 Remaining Challenges for Legal Scholars

Overall, utilizing SCA when appropriate is beneficial to legal scholarship, but it can be rather challenging without adequate knowledge. Although SCA is a suitable method to test research hypotheses and answer legal questions, several practical obstacles remain. This section provides a non-exhaustive overview of the most prominent challenges that legal researchers, unfamiliar with non-legal research, may face in utilizing SCA. The various challenges can be summarized into four categories: (1) lack of familiarity and training; (2) time demands; (3) demanding data collection; and (4) fear of criticism.

If a legal scholar wishes to conduct empirical research including SCA, the first challenge faced is the unfamiliarity with this method and lack of training.52 In learning about conducting a valid SCA, one is limited to materials that are intended for social scientists and empirical researchers.53 The problem with these sources is that they describe the practicality of this method for non-legal purposes, such as analyzing interviews, media messages or understanding society as such. Unsurprisingly, legal scholars who have embarked on the journey to conduct a SCA have a tendency to design their research without referring to previous studies, learning to conduct SCA “on the fly”.54

The inability to justify one’s approach on the basis of prior proven strategies may affect the reliability of legal research utilizing SCA. Consequently, legal studies utilizing SCA have to extensively explain their choices. As a result, ‘articles [based on SCA] commonly approach or exceed one hundred pages, laboriously detailing how the author devised techniques and resolved quandaries.’55 The ensuing insecurity regarding the validity of the findings of a researcher new to the world of empirical research is one that might prevent recourse to such methods. Thus, there is a need for specialized courses for law students and researchers to learn how to incorporate SCA in legal research.56 However, there is considerable work needed to create a guide or a course for the members of the legal community interested in the use of SCA.

The second challenge faced in conducting SCA is one that cannot be remedied with courses or skills training, as it relates to the time required to conduct such analysis. Already in 1982, Gupta found that ‘content analysis, though a very useful technique, is time consuming.’57 The reasons why such a method is time-consuming are twofold. First, as discussed in the second section, SCA must be reliable, which means that the study must be reproducible with similar results. To achieve reliability, a researcher must at a minimum code the data several times and, when needed, apply a formal reliability test.58 Reliability testing is a time-consuming exercise that must be undertaken for a valid utilization of SCA. Second, SCA may be a time-intensive exercise if the researcher opts to analyze a large data set, which was often the case in previous applications of SCA. This is because larger volumes of data enable more nuanced results.

Although the time required to code larger data sets can be slightly mitigated if the researcher utilizes specialized coding software,59 the utility of such software can only be realized if the researcher is familiar with how it functions. The software, while user-friendly, is not similar to the software most commonly known in the legal world, such as Microsoft Word and EndNote. Moreover, the data entered in any software must be formatted (e.g. from PDF to Word) and at times translated in order to be readable by the software. The readability of the material is key if a researcher opts for automatic coding. Nevertheless, such software is a tool that, in addition to aiding in the coding of the text, aids the researcher in quantifying her workload and results. This can have a motivating effect, as it provides the researcher with tangible proof of her progress.

The third challenge in conducting a valid SCA relates to data collection. Here again, the challenge is twofold. First, the researcher must ensure that the data selection is scientifically justifiable. As discussed in section two, such justification is not problematic if the researcher is referring to an entire population. Nevertheless, it is necessary that the researcher can explain her sampling plan if the findings are meant to be representative.60 To remedy the gaps in legal researchers’ knowledge regarding data collection, specialized training courses are highly beneficial. While there are generalized courses on empirical research, a review of legal studies curricula in Belgium, the Netherlands, and Luxembourg indicated a lack of any specialized courses designed for legal researchers interested in SCA. This is not to say that one cannot conduct a SCA without specialized social science and statistical training. To the contrary, many legal researchers employing such a method have a tendency to rely on a certain logic to justify their data collection.61

Second, when the researcher has identified the data to be analyzed, she must then collect the data, which can be problematic if the data is not available or easily accessible. For instance, although many court cases and legislative texts are available online, serious gaps remain. This is especially evident in the field of dispute resolution research. Arbitral awards in commercial disputes are rarely made public, which makes research into the content of such awards highly problematic. Thus, a researcher cannot conduct SCA regarding arbitral awards unless she gains access to the awards.62

A potential and final challenge to a legal researcher considering the application of SCA is the fear of the adverse attitude of others to unorthodox legal research methods. To shift the views of those skeptical of SCA, a slight cultural shift in the legal community is required. Perhaps the skeptics must be reminded that the introduction of a new method is in no way meant as a criticism of traditional methods, but simply as a new avenue to enrich our work. Therefore, it is important to note that SCA will not replace traditional legal research, but merely supplement and thereby enhance it in situations where its use would be beneficial.63 In the absence of such a shift, however, a researcher embarking on SCA must ensure that they properly justify the reasons for supplementing traditional legal research in order to limit the possibility of criticism.

5 Concluding Thoughts

The primary aim of this article was to demonstrate the utility of SCA in legal research. To achieve this goal, the second section provided an overview of the concept of SCA, including its various stages. The third section assessed both the suitability and necessity of including SCA in legal scholarship. Subsequently, the fourth section reflected on the challenges legal scholars might face when attempting to employ SCA. This article argues that SCA is a research technique that can strengthen the study of law. However, its utility is limited to instances where the researcher’s goal matches the functionality of SCA. Moreover, legal scholars interested in SCA must ensure that they are prepared to face the challenges that arise when using a non-legal research method.

Law schools can play a prominent role in integrating SCA in legal research. Among other things, law schools can provide researchers with the confidence to employ new methods. Today, however, the utility of SCA is not emphasized in legal education. Consequently, many young legal researchers and students remain oblivious to the added benefits of SCA. It is thus of the essence that law schools take on the task of introducing their students to enriching research methods that fall outside traditional legal research. A first step for law schools in introducing SCA to their students would be to establish a task force for the creation of a handbook on the application of this method in law. Perhaps the work of Hall and Wright, as well as this article, can serve as starting points for such a handbook.