
Replication code for "Indian judges show no gender or religious in-group bias" (Ash et al. 2020)


In-group bias in the Indian judiciary


These code and data files replicate the results in "In-group bias in the Indian judiciary: Evidence from 5 million criminal cases" by Elliot Ash, Sam Asher, Aditi Bhowmick, Daniel Chen, Tanaya Devi, Christoph Goessman, Paul Novosad, Bilal Siddiqi (2021). A working paper version of the manuscript can be found here.

Data Availability

All data sources used in the paper are available in the paper's data packet. The primary data source is the recently digitized data from the eCourts platform (a semi-public system by the Indian government to host summary data and full text from orders and judgements in courts across the country), which covers the outcomes of close to the universe of criminal cases in India from 2010-2018. The data files are split by year and follow the naming convention "cases_clean_20xx".

The authors have legitimate access to and permission to use the data used in this manuscript.

Description of Data Files

Dataset                         Description
cases_clean_2010                Contains data on all criminal court cases in the Indian lower judiciary from the year 2010.
cases_clean_2011                Contains data on all criminal court cases in the Indian lower judiciary from the year 2011.
cases_clean_2012                Contains data on all criminal court cases in the Indian lower judiciary from the year 2012.
cases_clean_2013                Contains data on all criminal court cases in the Indian lower judiciary from the year 2013.
cases_clean_2014                Contains data on all criminal court cases in the Indian lower judiciary from the year 2014.
cases_clean_2015                Contains data on all criminal court cases in the Indian lower judiciary from the year 2015.
cases_clean_2016                Contains data on all criminal court cases in the Indian lower judiciary from the year 2016.
cases_clean_2017                Contains data on all criminal court cases in the Indian lower judiciary from the year 2017.
cases_clean_2018                Contains data on all criminal court cases in the Indian lower judiciary from the year 2018.
judges_clean                    Contains data on judges in all courts in the Indian lower judiciary, from the eCourts platform.
poi_master                      Contains data on People of India; only modules used in the data are shared.
ACLED_India_violence_2005-2023  Contains data on violent conflict and protests in India, collected by ACLED (Armed Conflict Location & Event Data), an independent, impartial, international non-profit organization that collects data on violent conflict and protest across the world.
acled_districts                 Contains keys to match the ACLED violence data to Indian districts.

Computational Requirements

This package is designed to be run on a *nix system with Python 3.2+, Matlab 2019+, and Stata 16+ installed. Data and code folders for the replication must not include spaces. This package may require modification to run on Windows due to the use of some Unix shell commands. This package was tested on a system with about 30 GB of memory.
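
As a rough illustration of the requirements above, the following sketch checks the Python version and the operating system before a run. It is not part of the replication package; the function name `check_requirements` is hypothetical, and it does not check for Stata or Matlab because their executable names vary between installations.

```python
import platform
import sys


def check_requirements(version=None, system=None):
    """Return a list of warnings; an empty list means nothing was flagged.

    `version` and `system` default to the running interpreter and OS.
    They are parameters only so the check can be exercised with other values.
    """
    version = sys.version_info[:2] if version is None else tuple(version)[:2]
    system = platform.system() if system is None else system
    problems = []
    # The README asks for Python 3.2 or newer.
    if version < (3, 2):
        problems.append("Python %d.%d found; 3.2+ is required" % version)
    # The package uses Unix shell commands, so Windows may need modification.
    if system == "Windows":
        problems.append("Windows detected; some shell commands may need modification")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("WARNING: " + problem)
```

Running the file directly prints one warning per flagged item and nothing otherwise.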

Description of programs / code

The file describes the build and analysis process in detail.

Instruction to Replicators

To regenerate the tables and figures from the paper, take the following steps:

  • Download and unzip the replication data package linked at the end of this document

  • Clone this repo (github) or copy all the code into a folder.

  • Create a python environment following the package list in requirements.yml. For example:

conda env create -f requirements.yml -n py_justice
conda activate py_justice
  • Set the following environment variables so that Python will be able to find the data and output paths. From the Unix/OSX shell (before running Stata):
export TMP=[path to working files]
export OUT=[destination path for exhibits]
export JDATA=[folder where the replication data package is unzipped]
  • Open the do file, and set the globals out, jdata, tmp, and jcode. These need to match the environment variables set in the previous step!
  1. $out is the target folder for all outputs, such as tables and graphs.
  2. $tmp is the folder for the data files and temporary data files that will be created during the rebuild.
  3. $jdata is the folder where you unzipped and saved the replication data package.
  4. $jcode is the code folder of the clone of the replication repo
  • Run the do file. This will run through all the other do files to regenerate all of the results.

  • We have included all the required programs to generate the main results. However, some of the estimation output commands (like estout) may fail if certain Stata packages are missing. These can be replaced by the estimation output commands preferred by the user.

  • Please note that we use globals for pathnames, which will cause errors if file paths contain spaces. Please store code and data in paths that can be accessed without spaces in any folder name or filename.

  • This code was tested using Stata 16.0. Run time to generate all results on our server was about 8 hours.
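
Since a missing environment variable or a path with spaces only surfaces as an error partway through the build, a small pre-flight check can help. The sketch below is not part of the replication package, and the function name `check_paths` is hypothetical. It only looks at the three environment variables named above (TMP, OUT, JDATA); the Stata globals still have to be set to matching values by hand.

```python
import os

# Environment variables the instructions above ask you to export.
REQUIRED_VARS = ("TMP", "OUT", "JDATA")


def check_paths(env=None):
    """Return a list of problems with the path variables (empty if none)."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        if not value:
            problems.append(name + " is not set")
        elif any(ch.isspace() for ch in value):
            # Globals are used for pathnames, so spaces in paths cause errors.
            problems.append(name + " contains whitespace: " + repr(value))
    return problems


if __name__ == "__main__":
    for problem in check_paths():
        print("ERROR: " + problem)
```

Run it from the same shell session in which the variables were exported, before starting Stata.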

The mapping of do files to tables and figures is as follows:

Exhibit     Code filename   Output filename
Figure 1                    g_coef1.png, g_coef2.png, r_coef1.png, r_coef2.png
Table 1                     judge_summary.tex
Table 2                     gender_acquitted.tex, gender_decision.tex
Table 3                     religion_acquitted.tex, religion_decision.tex
Table 4                     victim_inter.tex
Table 5                     last_names.tex
Figure 2                    , lit_coef.png, pub_bias.png
Table 6                     pub_bias.tex
Figure A5                   judge_acquittal_resids.png
Figure A6                   name_balance_coef_rcap.png
Figure A7                   rare_names_weighted.png, rare_names_unweighted.png
Table A2                    table_crime_in_sample.tex, table_state_in_sample.tex
Table A3                    class_success.tex
Table A5                    gender_amb.tex
Table A6                    religion_amb.tex
Table A7                    gender_non_convicted.tex
Table A8                    gender_acquitted_amb.tex
Table A9                    religion_non_convicted.tex
Table A10                   religion_acquitted_amb.tex
Table A11                   gbal.tex
Table A12                   rbal.tex
Table A13                   output_sample_accounting_1.tex
Table A14                   output_sample_accounting_2.tex
Table A15                   balance_extended_missing.tex
Table A16                   balance_extended_lawyers.tex
Table A17                   table_judges_by_crime_category.tex
Table A18                   low_ambiguity_rcts.tex
Table A19                   balance_lawyers.tex
Table A20                   lawyers_religion.tex
Table A21                   lawyers_gender.tex
Table A22                   victim_inter_all_g.tex
Table A23                   victim_inter_all_r.tex
Table A24                   crimes_against_women.tex
Table A25                   victim_inter_cy.tex
Table A26                   rct_2year_bins.tex
Table A27                   table_election_month.tex
Table A28                   last_names_loc_year.tex
Table A29                   surname_freq_table.tex
Table A30                   table_balance_poi.tex
Table A31                   table_ingroup_poi.tex
Table B1                    random_acq.tex

Data download

The data to replicate this paper is available on Google Drive and at the Harvard Dataverse.

  • The Google Drive version is recommended, because Harvard Dataverse requires us to split up the files in strange ways. If you download from the Harvard Dataverse, you need to: (1) unzip all case files separately into the raw/ subfolder; (2) recombine the large 2018 case file: cat cases_clean_2018_part_* > and put it into the raw/ subfolder.

  • The layout of data files should look like this when everything is unzipped. $jdata should point to the raw folder with the * below.

├── out
├── raw*
│   ├── acled_district_key.dta
│   ├── acled_districts.dta
│   ├── ACLED_India_violence_2005-2023.csv
│   ├── cases_clean_2010.dta
│   ├── cases_clean_2011.dta
│   ├── cases_clean_2012.dta
│   ├── cases_clean_2013.dta
│   ├── cases_clean_2014.dta
│   ├── cases_clean_2015.dta
│   ├── cases_clean_2016.dta
│   ├── cases_clean_2017.dta
│   ├── cases_clean_2018.dta
│   ├── cases_state_key.dta
│   ├── classification
│   │   └── pooled_names_clean_appended.dta
│   ├── judges_clean.dta
│   ├── keys
│   │   ├── acled_district_key.dta
│   │   ├── cases_district_key.dta
│   │   ├── cases_state_key.dta
│   │   ├── disp_name_key.dta
│   │   ├── pc11_court_district_key.dta
│   │   ├── purpose_name_key.dta
│   │   └── type_name_key.dta
│   ├── lit_coefs.dta
│   ├── names
│   ├── poi_master.dta
│   └── raw
│       └── ACLED_India_violence_2005-2023.csv
└── tmp
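
For Harvard Dataverse downloads, the recombination step and a quick check of the layout above can be sketched in Python as follows. This is an illustration and not part of the replication package. The function names are hypothetical, and the default output name `cases_clean_2018.dta` is an assumption taken from the file tree above, because the `cat` command in the instructions does not show its target filename.

```python
import glob
import os
import shutil


def recombine_parts(raw_dir, target_name="cases_clean_2018.dta",
                    pattern="cases_clean_2018_part_*"):
    """Concatenate the split 2018 case file, like `cat parts > target`.

    target_name is an ASSUMPTION based on the file tree in this README.
    """
    # Sorting mirrors the lexicographic order in which the shell expands `*`.
    parts = sorted(glob.glob(os.path.join(raw_dir, pattern)))
    if not parts:
        raise FileNotFoundError("no files matching " + pattern + " in " + raw_dir)
    target = os.path.join(raw_dir, target_name)
    with open(target, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so the large file never sits fully in memory.
                shutil.copyfileobj(src, out)
    return target


def missing_case_files(raw_dir, years=range(2010, 2019)):
    """List the yearly case files (2010-2018) that are absent from raw_dir."""
    expected = ["cases_clean_%d.dta" % year for year in years]
    return [name for name in expected
            if not os.path.isfile(os.path.join(raw_dir, name))]
```

A typical use is to call `recombine_parts(os.environ["JDATA"])` once after unzipping, then confirm that `missing_case_files(os.environ["JDATA"])` returns an empty list.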

