Esprii Chapman – Notes on RACIAL EQUITY IN ALGORITHMIC CRIMINAL JUSTICE (Article 3 of 10)

“The concerns of constitutional law simply do not map onto the ways in which race impinges on algorithmic criminal justice. The result is a gap between legal criteria and their objects.” (1088)

“A focus on racial animus will almost never be fruitful. A focus on classification leads to perverse and unjustified results. The replacement of unstructured discretion with algorithmic precision, therefore, thoroughly destabilizes how equal protection doctrine works on the ground. The resulting mismatches compel my conclusion that a new framework is needed for thinking about the pertinent racial equity questions.” (1088)

“I suspect that the notion of machine intentionality is sufficiently counterintuitive to find no place in constitutional law. Speculation about a future of “superintelligent” artificial intelligences aside, the transformation of training data into new schemes of classification by machine learning or deep learning does not obviously map onto familiar forms of human intentionality. The most advanced artificial intelligences can now pass the Turing test and defeat (human) world champions at Go. But even these machines do not obviously possess the sort of psychological interiority commonly thought to be a necessary predicate to intentionality. Talk of machine intentionality, therefore, is either premature or a badly poised metaphor. It is better to treat the algorithm itself as irrelevant to the constitutional analysis so far as intentionality is concerned.” (1088-89)

“Bracketing the machine-learning tool as agent, however, there are two possible ways in which intention might enter the picture. First, an algorithm’s designer might be motivated by either an animosity toward a racial group, or else a prior belief that race correlates with criminality, and then deliberately design the algorithm on that basis. Barocas and Selbst call this “masking.” Masking might occur through either a choice to use polluted training data or the deliberate selection of some features but not others on racial grounds. For instance, it is well understood that when employers ignore credit score information, they tend to search for proxies that have the inadvertent effect of deepening racial disparities. A discriminatory algorithm designer will leverage such knowledge to fashion instruments that yield the disparate racial effects they believe to be warranted a priori. Without knowing the full spectrum of features that could, conceivably, have been included in the training data—which can be “enormous”—it will be difficult or impossible to diagnose this kind of conduct absent direct evidence of discriminatory intent. It will, moreover, be especially difficult to show that, but for race, a specific feature would or would not have been included, as the doctrine requires. A basic principle of “feature selection” instructs that one should keep the important features and discard the unimportant ones. To the extent that masking occurs, therefore, it seems clear that the litigation process would rarely yield evidence of such intentional manipulation of the algorithm’s design.” (1089-90)

  • The fault does not necessarily lie with the algorithm; the creator may have programmed bias into the AI/algorithm at its base level.

“Another reason to set aside the masking phenomenon, however, is the fact that it does not appear to be a significant one in practice. Part of the reason for this is that racial animus has a performative, interpersonal aspect. Racial discrimination commonly entails an effort by one group to “produce esteem for itself by lowering the status of another group,” correlatively producing a “set of . . . privileges[] and benefits” of superordinate group membership. Masking is a form of discrimination that involves no interpersonal interaction and no esteem-affirming performance.” (1090)

  • A silent bias that anyone with data could introduce. If I understand it correctly, something like it is already happening in our legal system right now. Black people are treated as a higher-risk population because they live in areas that, due to systemic racism, the police visit more often, and so the police more often catch people in the act of doing something illegal. Illegal activity is no more common there than in historically White neighborhoods; it only appears more prevalent among Black people because systemic racism still exists and is perpetuated by feedback loops.

“The anticlassification account of equal protection is premised on two main justifications. First, it is motivated by a concern that the state’s use of racial classifications will facilitate or amplify private discrimination. This worry is premised on an empirical claim that a “perception . . . fostered by [government]” of differences between racial groups “can only exacerbate rather than reduce racial prejudice.” The foundation of this empirical claim is hardly clear. Why would the communicative effect of state racial classifications entail a legitimation of private animus? The causal link here is not obvious. One interpretation of the Court’s argument might start with the Court’s claim that race is “‘in most circumstances irrelevant’ to any constitutionally acceptable legislative purpose.” Read sympathetically, the Court appears to be saying that because race is irrelevant to the vindication of legitimate government ends, the observation that the state is treating race nevertheless as salient has the effect of propagating a false popular belief in racial hierarchies. The second possible interpretation of an anticlassification rule turns on a nonconsequentialist, deontological intuition. That is, according to some Justices, it is a moral axiom that the state must treat all persons as individuals, and such individualization precludes any taking account of their race. This moral demand for individuation entails demanding judicial scrutiny for all racial classifications. There are, to be sure, reasons for skepticism about these moral and theoretical premises of the anticlassification principle. But even bracketing those hesitations, and taking those justifications at face value, there is still no reason to think that the logic of anticlassification strongly militates against the use of race either as a feature or as an element of a classifier by machine-learning tools. 
To the contrary, as a matter of either precedent or logic, equal protection law can accommodate racially sensitive algorithmic criminal justice.” (1094-96)

  • A counterpoint on whether the racially sensitive statistics used in court can be eliminated from AI. I believe they can be, but it will take time and much more developed AI.

“Northpointe omitted race from the training data used for COMPAS. But this appears to reflect corporate risk aversion, not an effort at legal compliance. Current law does not address whether the availability of race as an input into the deliberative process that results in state action violates the Equal Protection Clause on anticlassification grounds. To be sure, there is language in earlier precedent that suggests that any racial trace in official deliberation raises a constitutional problem.” (1098)

  • This passage analyzes the issue strictly in the legal sense. Making race available as an input does not necessarily violate any laws, but it is clearly a turning point where the algorithm would discriminate even more heavily based on race.

“Imagine that wearing a particular baseball cap is used by police as a proxy for drug possession (say, because it may signal gang membership). Both blacks and whites wear this cap. For 100 percent of whites, and for zero percent of blacks, the cap is an accurate signal of drug possession. Let us say that police stop all those encountered wearing the cap, and this population is 75 percent white and 25 percent black. Because the cap generates a 75 percent success rate, its categorical (and colorblind) use might be deemed a meritorious criterion. But the efficacy of searches, and the avoidance of needless hassle for minorities, can be increased by limiting the instrument to white suspects. Colorblindness here generates substantial and avoidable social costs. These can be corrected by simply accounting for race.” (1100)
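The arithmetic in this hypothetical can be checked with a short sketch. The 75/25 split and the 100 percent and zero percent accuracy figures come from the quoted passage; the count of 100 stops and all names in the code are assumptions made only for illustration.

```python
# Assumed for illustration: 100 people are stopped for wearing the cap.
# The 75/25 racial split comes from the quoted hypothetical.
stops = {"white": 75, "black": 25}

# From the quoted hypothetical: the cap is an accurate signal of drug
# possession for all white wearers and for no black wearers.
accuracy = {"white": 1.0, "black": 0.0}


def hit_rate(groups):
    """Share of stops within the given groups that find drugs."""
    total = sum(stops[g] for g in groups)
    hits = sum(stops[g] * accuracy[g] for g in groups)
    return hits / total


def needless_stops(groups):
    """Number of stops within the given groups that find nothing."""
    return sum(stops[g] * (1 - accuracy[g]) for g in groups)


colorblind_rate = hit_rate(["white", "black"])
race_aware_rate = hit_rate(["white"])
colorblind_needless = needless_stops(["white", "black"])
race_aware_needless = needless_stops(["white"])

print(colorblind_rate, race_aware_rate)
print(colorblind_needless, race_aware_needless)
```

Under these numbers the colorblind rule succeeds in 75 percent of stops and produces 25 fruitless stops, all of black cap wearers, while the race-aware rule produces none.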

  • However, this would only work in a system that had no systemic bias built in; it would not work in today’s system as is. Police have been shown to stop more Black suspects than White suspects on a daily basis, so recorded drug possession among Black people would appear higher even if, in the grand scheme of things, the actual rate was not higher.
  • “At the beginning of the twentieth century, national public discourse about “law and order became racialized, and conviction and incarceration rates for African Americans jumped disproportionately.” As the leading historical work by Khalil Gibran Muhammad vividly demonstrates, Progressive-era academics, journalists, and politicians in the North linked crime to African Americans at the same time as they downplayed white ethnic groups as sources of crime. By the early 1940s, Muhammad explains, “‘Black’ stood as the unmitigated signifier of deviation (and deviance) from the normative category of ‘White.’” Concomitant to this rhetorical shift, urban policing and carceral resources were disproportionately allocated to African Americans who were in the process of migrating up from the rural South. In northern cities in particular, police singled out blacks for intense surveillance and coercion. This pushed up the rate of black incarceration and the proportion of the prison population that was black. The black share of that population never subsequently dropped. Racialized mass incarceration, that is, was at its inception a product of a moral panic stoked by northern elites in respect to the growing presence of an African American population that previously had been the South’s “problem.”” (1106)

“As Randall Kennedy cogently observed three decades ago, African American men experience a “racial tax” from American criminal justice systems— even if they have no contact with it—because police and citizens are prone to perceive their race as a proxy for criminality and, hence, to configure them as potential criminals rather than potential victims. Recent empirical work has confirmed Kennedy’s account of the externalities of criminal justice for minority groups as a whole. African American men hence continue to receive disfavored treatment in a wide array of economic and social contexts that limit important life opportunities. The increased risk of contact with police, and thus incarceration, undermines the economic and social resources available to the larger racial cohort embedded in the same geographic community. One in four black children also experiences parental incarceration—an experience that directly and negatively impacts their health and education outcomes. Most notably, and dismayingly, black parental incarceration is associated with a 49 percent increase in infant mortality, an increase that has no parallel among white families affected by incarceration. So not even children are spared. Rather, a concentration of policing and incarceration within black communities generates distinctive burdens with no parallel for majority racial groups—burdens that diffuse and concatenate across communities and generations. It is on this basis, I think, that it is plausible to characterize the contemporary American criminal justice system as “a systemic and institutional phenomenon that reproduces racial inequality and the presumption of black and brown criminality.”” (1109-10)

  • These are the effects that systemic racial bias has had on minorities.

“the introduction of new computational and epistemic technologies does not alter the basic stakes of racial equity. They should be evaluated, that is, as elements of that overall system. In this light, the key question for racial equity is whether the costs that an algorithmically driven policy imposes upon a minority group outweigh the benefits accruing to that group. If an algorithmic tool generates public security by imposing greater costs (net of benefits) for blacks as a group, it raises a racial equity concern. That policy undermines racial equity by deepening the causal effect of the criminal justice system on race-based social stratification. This test is consequentialist. It focuses on the effects of an algorithm’s use. It is also holistic. Unlike older risk assessment tools, it accounts for both the benefits and the costs of intervention. And, to emphasize again, it is quite general: There is no reason not to apply it to criminal justice more generally. I develop the test here nevertheless because I am concerned with algorithmic tools that can develop precise cut-points for using coercion based on analyses of large volumes [of] data.” (1111-12)

“Conflicts Between Algorithmic Fairness Definitions. It would seem desirable to satisfy all these definitions of equality. At least at first blush, all capture colorable and important intuitions about the fair allocation of coercion. But matters are not so simple. It turns out that this is not possible in many cases—and not possible under conditions that are reasonably likely to occur in practice.” (1124)
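One way to see such a conflict is a counting identity that follows directly from the definitions of the rates involved. The sketch below is not taken from the article. It assumes two groups with different underlying rates of reoffending, the same precision among those coerced, and the same true positive rate, and it uses invented numbers throughout.

```python
def false_positive_rate(base_rate, ppv, tpr):
    """False positive rate implied by the other three quantities.

    base_rate: share of the group that would reoffend
    ppv:       P(recidivist | high risk), i.e. 1 - P(nonrecidivist | high risk)
    tpr:       P(high risk | recidivist)

    Derivation by counting, for a group of size N:
      true positives  TP = tpr * base_rate * N
      false positives FP = TP * (1 - ppv) / ppv
      FPR = FP / ((1 - base_rate) * N)
    """
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * tpr


# Assumed, illustrative numbers (not from the article).
ppv = 0.6   # identical precision for both groups
tpr = 0.7   # identical true positive rate for both groups

fpr_a = false_positive_rate(base_rate=0.3, ppv=ppv, tpr=tpr)
fpr_b = false_positive_rate(base_rate=0.5, ppv=ppv, tpr=tpr)

print(fpr_a, fpr_b)
```

With precision and true positive rate held equal, the group with the higher base rate necessarily ends up with the higher false positive rate, so P(nonrecidivist|high risk) and P(high risk|nonrecidivist) cannot both be equalized under these assumptions.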

“The Irrelevance of False Positive Rates.” — “False-positive focused definitions not only played a central role in the debate between Northpointe and ProPublica, they have also infiltrated public debate more broadly. A concern with false positives is not without normative appeal. But definitions of nondiscrimination that hinge on false positive rates do not index in any obvious fashion the extent to which an algorithmic instrument exacerbates racial stratification.” (1125)

“Evaluating the Impact of Algorithmic Criminal Justice on Racial Stratification.” — “Existing criminal justice systems influence the extent of racialized social stratification in society as a whole. Racial equity in criminal justice generally—and in particular in the algorithmic context—should be primarily concerned with mitigating these pernicious effects. It should repudiate the tight linkages that have bound criminal justice to the reproduction of racial hierarchy since the beginning of the twentieth century. Even if the present-day operation of criminal justice institutions cannot undo past harms, at a minimum they should not compound those harms.” (1128)


“Algorithmic criminal justice, relying first on machine learning and then on deep learning, is only now beginning to impinge on criminal justice institutions. For a much longer time, the latter have been sites for the production of racial stratification. This comes in the form of a policing and carceral apparatus that weighs most heavily on African Americans. It also arises thanks to a racial tax that extends to all members of the group, whether or not they have any connection to criminality. Given this history, it seems to me important to get algorithmic criminal justice right. Such tools, if fashioned wisely, might be useful in restoring equilibrium and mitigating the burden of racial externalities. Wrongly configured, they may prove subtle levers for preserving or even exacerbating those burdens. Wrongly configured, I also fear, they would be exceedingly hard to dislodge. My aim in this Article has been to demonstrate that constitutional law does not contain effectual tools to meet these problems. It is a mistake, therefore, to contort constitutional doctrine in the hope that it will do service in a context where it is so substantially ill fitted. Far better, in my view, to recognize that the constitutional law of racial equality has almost nothing cogent to say about what counts as a racially just algorithm. It might instead achieve the remarkable doubleheader of impeding both racial equity and social welfare maximization. The doctrine is thus a moral vacuity. Reformulation of the doctrine, in my view, is desirable but unlikely. In the interim, algorithm designers, local officials, and state legislators should instead ask directly how best to achieve racial equity given the shape of existing criminal justice institutions and the technical tools at their disposal. I have offered an answer to that question that draws on, without quite tracking, existing technical definitions of algorithmic nondiscrimination. 
I have further stressed that my approach has the distinctive feature of aligning racial equity with social efficiency. My project has been demarcated in terms of algorithmic criminal justice. But it should not escape notice that there is no particular reason to confine the scope of the analysis to algorithmic tools, or even to criminal justice. But those extensions are for another day. For now, a recognition of the potential convergence of equity and efficiency might move us closer to a remedy for the difficult, enduring, and damaging legacy of our racialized criminal justice past.” (1133-34)

The four concepts of fairness:

“The four conceptions of algorithmic fairness or algorithmic nondiscrimination can be elaborated as follows. First, an algorithmic classifier might exhibit statistical parity. This means that an equal proportion of members of each group are subject to coercion. In terms of the graphic, this means that the shaded areas under the white and the black curves to the right of the threshold are equal to each other. This can happen, it is worth noting, even if there is wide variation in the ratio of false positives to true positives for whites and for blacks. Where there is no threshold, one might instead use the average risk score for a given group. A variant on statistical parity is “conditional statistical parity,” which requires that, having controlled for a “limited set of ‘legitimate’ risk factors, an equal proportion of defendants within each race group” are treated as risky. In practice, however, this definition is highly sensitive to what counts as a “legitimate” risk factor. Because my analysis does not assume an answer to the question of what counts as a legitimate risk factor, I put aside here the possibility of conditional statistical parity. Statistical parity is a clear and simple idea. Indeed, it is employed as part of the prima facie case in disparate impact analysis in employment discrimination law. Under longstanding administrative agency construction, a racial difference in selection rates of “less than four-fifths” is “generally” taken as evidence of “adverse impact.” On the other hand, there is no a priori reason why state coercion should be equally distributed among racial groups. To be sure, there is some evidence that at least for certain sorts of offenses, such as narcotics crimes, there are “no statistically significant differences” in offending rates for different racial and ethnic groups. But on the assumption that the algorithm’s training data are not flawed, the hypothetical would simply not capture such cases. 
Second, an algorithmic classifier might be viewed as fair if it treated two people who evinced the same ex ante evidence of risk, but differed by race, in the same way. The computer science literature has distinguished between a single threshold and “multiple race-specific thresholds.” A recent paper further offers a formal proof to the effect that the “immediate utility” of a decision rule—defined in terms of the immediate benefits of crime directly suppressed and direct costs of coercion (and ignoring externalities)—is typically optimized by maintaining a single threshold rule for coercion rather than having plural thresholds. That is, a social planner with an algorithmic tool that is trained on unbiased data would select a single risk threshold for both whites and blacks if she wished to optimize over the costs and benefits of crime control. This analysis of social welfare, however, does not answer the question of what necessarily furthers racial equity under all conditions. In particular, it is important to observe that the formal proof of optimality is limited to the immediate effects of an algorithmic tool. Racial stratification is plausibly understood to be a compounding effect of the latter concept rather than something captured by the former. This conception of fairness in algorithmic criminal justice has not so far attracted a distinctive label. Indeed, some accounts of discrimination in the algorithmic context simply do not cite this kind of fairness, preferring to focus on the relative frequency of false (or true) positives (or negatives) in the two racial groups. In other work, this conception has been characterized simply as “fairness,” but that nomenclature is too vague to be helpful. I label this definition, therefore, the single threshold definition of algorithmic fairness. 
Graphically, the single threshold definition of fairness is represented by the fact that the vertical line that marks the threshold between coercion and its absence is in the same place for both racial groups. If the vertical thresholds were placed in different locations on the x-axis, there would be a group of individuals between the two thresholds who would present the same evaluated risk but would be treated differently solely on account of their race. 
A third conception of algorithmic nondiscrimination examines only the portion of the population that lies to the right of the risk threshold. In Figure 1, this comprises the shaded areas under the curves. These encompass parts of the white and black populations subject to coercion as a consequence of the algorithm’s recommendations. Not all of these recommendations, however, will be borne out by future events. In the bail context, for example, some fraction of those subject to state coercion would not have gone on to commit crimes that justified pretrial detention. They will, in other words, be false positives. One way of thinking about nondiscrimination is in terms of the false positive error rate conditional on being assigned state coercion by the algorithm—which can also be stated as P(nonrecidivist|high risk). So if a greater fraction of blacks stopped or detained turn out to be innocent in the relevant sense than the same fraction of nonrecidivist whites, then this would violate the third conception of fairness. Or, stated in yet another form, if the proportion of those false positives under the black curve to the right of the risk threshold is greater than the proportion of false positives under the white curve to the right of the threshold, then this conception of equality is violated. This notion is captured by a number of different terms in the computer science literature. A leading group of analysts label it “conditional use accuracy.” In my view, it is simplest to label it equally precise coercion because this conception is centrally concerned with the rate at which false positives occur conditional on the fact of being coerced. Equally precise coercion played a role in the debate over the COMPAS algorithm. Responding to ProPublica’s allegations of racial disparity, Northpointe focused on the fact that the rate of error among the black and white groups subject to coercion was the same. In effect, the Northpointe argument was that so long as equally precise coercion obtained, there was no discrimination problem. 
The fourth and final conception of fairness in the algorithmic context also focuses on false positives, but from a different angle. Rather than the subset subject to coercion, it focuses on the subset that would not go on to commit a crime or violent act. This subset of nonrecidivating persons is used as a denominator. For a numerator, it asks what fraction of that subpopulation is incorrectly subject to coercion. In the bail context, for example, this means asking whether “among defendants who would not have gone on to commit a violent crime if released, detention rates are equal across race groups.” In other words, conditional on being a nonrecidivist (in whatever sense of that term is relevant), the rate of erroneous false positives across racial groups does not vary—or P(high risk|nonrecidivist). This conception of equality is not easy to capture using Figure 1, since the baseline category of nonrecidivists are dispersed on both sides of the risk thresholds. In effect, it comprises a diffuse subset of whites and blacks who in fact would not commit actions that justify coercion. This conception of fairness requires that we look for the proportion of that nonrecidivist subset to the right of the risk threshold. If one racial group’s ratio is larger than the other’s, there is reason for concern under this theory.” (1119-22)
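The four conceptions quoted above can be made concrete with a small sketch. Everything in it is hypothetical: the `Person` record, the `fairness_report` helper, the risk scores, and the outcomes are invented for illustration and do not come from the article or from COMPAS. It also assumes that the outcome each person would have had absent coercion is known, which real data does not provide.

```python
from dataclasses import dataclass


@dataclass
class Person:
    group: str
    risk: float      # risk score from a hypothetical algorithm
    reoffends: bool  # outcome absent coercion (assumed known for illustration)


def fairness_report(people, thresholds):
    """Per-group quantities behind three of the four conceptions.

    Assumes every group has at least one coerced person and at least one
    nonrecidivist, so no denominator is zero.
    """
    report = {}
    for group, cut in thresholds.items():
        members = [p for p in people if p.group == group]
        coerced = [p for p in members if p.risk >= cut]
        nonrecidivists = [p for p in members if not p.reoffends]
        false_positives = [p for p in coerced if not p.reoffends]
        report[group] = {
            # Statistical parity compares this share across groups.
            "coercion_rate": len(coerced) / len(members),
            # Equally precise coercion: P(nonrecidivist | high risk).
            "p_nonrecid_given_high": len(false_positives) / len(coerced),
            # Fourth conception: P(high risk | nonrecidivist).
            "p_high_given_nonrecid": len(false_positives) / len(nonrecidivists),
        }
    return report


# Toy data, invented for illustration only.
people = [
    Person("white", 0.2, False), Person("white", 0.4, False),
    Person("white", 0.6, False), Person("white", 0.6, True),
    Person("white", 0.8, True),
    Person("black", 0.3, False), Person("black", 0.6, False),
    Person("black", 0.7, False), Person("black", 0.7, True),
    Person("black", 0.9, True),
]

thresholds = {"white": 0.5, "black": 0.5}

# Single threshold definition: the same cut-point applies to every group.
single_threshold = len(set(thresholds.values())) == 1

report = fairness_report(people, thresholds)
for group, values in report.items():
    print(group, values)
```

With these toy numbers, a shared cut-point of 0.5 satisfies the single threshold definition, yet the coercion rate, P(nonrecidivist|high risk), and P(high risk|nonrecidivist) all differ between the two groups.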

Huq, Aziz Z. “RACIAL EQUITY IN ALGORITHMIC CRIMINAL JUSTICE.” Duke Law Journal, Vol. 68, No. 6, March 2019, pp. 1043-1134.
