
This repository uses SEC EDGAR Schedule 13D and Schedule 13G filings to find all positions above 5% in all US stocks between 1994 and 2018.


Erfanit64/Block_Codes

 
 


Block_Codes

This GitHub page describes the construction of the data in the paper "Is Blockholder Diversity Detrimental?" by Miriam Schwartz-Ziv and Ekaterina Volkova (2020).

The most recent version of the paper is available on SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3621939

Step 1. Download Files.

  • download_forms.R downloads SC 13D and SC 13G filings and their amendments and stores them in a SQL database.
  • The script downloads the list of all forms for each year from the SEC website. The only things you need to specify are the range of years in the loop and the working directory.
  • The code is slow and can take several hours to complete. To make sure that I get all possible files, I download each file twice: once from the master file entry for the filer and once from the entry for the subject.
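As a rough illustration of the first step, the sketch below builds the URL of a quarterly EDGAR master index and filters its rows for Schedule 13D/13G forms and amendments. It is not the code in download_forms.R: the function names `master_index_url` and `parse_master_index` and the column names are hypothetical, and the sketch assumes the usual pipe-delimited `master.idx` layout (CIK, company name, form type, date filed, file name).

```r
# Build the URL of a quarterly EDGAR master index.
master_index_url <- function(year, quarter) {
  sprintf("https://www.sec.gov/Archives/edgar/full-index/%d/QTR%d/master.idx",
          as.integer(year), as.integer(quarter))
}

# Parse the lines of a master.idx file and keep SC 13D / SC 13G forms
# together with their amendments (SC 13D/A, SC 13G/A).
parse_master_index <- function(lines) {
  cols <- c("cik", "company", "form", "date_filed", "filename")
  # Data rows are pipe-delimited and start with a numeric CIK;
  # this skips the descriptive header and the dashed separator line.
  rows <- grep("^[0-9]+\\|", lines, value = TRUE)
  if (length(rows) == 0) {
    empty <- as.data.frame(matrix(character(0), ncol = length(cols)),
                           stringsAsFactors = FALSE)
    names(empty) <- cols
    return(empty)
  }
  idx <- read.table(text = rows, sep = "|", quote = "", comment.char = "",
                    header = FALSE, col.names = cols,
                    colClasses = "character", stringsAsFactors = FALSE)
  idx[grepl("^SC 13[DG](/A)?$", idx$form), , drop = FALSE]
}

# Hypothetical usage (requires network access, so it is left commented out):
# for (year in 1994:2018) {
#   for (quarter in 1:4) {
#     lines <- readLines(master_index_url(year, quarter))
#     forms <- parse_master_index(lines)
#     # ... download each forms$filename and store it in the database
#   }
# }
```

The SEC expects automated requests to declare a User-Agent and to respect its rate limits; that handling is left out of the sketch.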

Step 2. Extract and Convert Main Filings.

  • extract_body_form.R extracts the main filing from the complete submission files and converts .htm to plain text where needed.
  • The output goes into another SQL database.
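A minimal sketch of this step, assuming the complete submission file wraps each document in `<DOCUMENT>` ... `<TEXT>` ... `</TEXT>` tags and that the main filing is the first document. The helper names `extract_main_body` and `html_to_text` are hypothetical, and the regex-based tag stripping is a simplification, not the conversion used in extract_body_form.R.

```r
# Convert an HTML fragment to rough plain text with base R regular expressions.
html_to_text <- function(html) {
  # Drop script and style blocks entirely.
  txt <- gsub("(?is)<(script|style)[^>]*>.*?</\\1>", " ", html, perl = TRUE)
  # Turn common block-level breaks into newlines.
  txt <- gsub("(?i)<br\\s*/?>|</p>|</div>|</tr>", "\n", txt, perl = TRUE)
  # Remove all remaining tags.
  txt <- gsub("<[^>]+>", " ", txt)
  # Decode a few frequent entities (not an exhaustive list).
  txt <- gsub("&nbsp;|&#160;", " ", txt)
  txt <- gsub("&amp;", "&", txt, fixed = TRUE)
  # Collapse runs of spaces and tabs.
  gsub("[ \t]+", " ", txt)
}

# Return the text of the first <TEXT> block of a complete submission,
# converted to plain text if it looks like HTML. `submission` is the whole
# file as a single string. Returns NA if no <TEXT> block is found.
extract_main_body <- function(submission) {
  m <- regexpr("(?s)<TEXT>.*?</TEXT>", submission, perl = TRUE)
  if (m == -1) return(NA_character_)
  body <- regmatches(submission, m)
  body <- sub("^<TEXT>", "", body)
  body <- sub("</TEXT>$", "", body)
  if (grepl("<html|<body|<p[ >]|<div", body, ignore.case = TRUE)) {
    body <- html_to_text(body)
  }
  trimws(body)
}

# Hypothetical usage on a file read from disk:
# raw <- paste(readLines("submission.txt", warn = FALSE), collapse = "\n")
# main_text <- extract_main_body(raw)
```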

Step 3. Parse SEC Header.

Step 4. Extract CUSIP from the filings.

  • extract_CUSIP.R returns the six- and eight-digit CUSIPs from SEC filings.
  • The output of this step is a CIK-CUSIP map, which can be downloaded in .csv format from my website (www.evolkova.info).
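One possible way to approach this step is sketched below: collect nine-character candidates from the filing text, keep those whose ninth character matches the standard CUSIP check digit, and return the six- and eight-character prefixes. This is an assumption about how such an extraction could work, not the logic of extract_CUSIP.R. The function names are hypothetical, and the pattern ignores CUSIPs printed with spaces or dashes between their parts as well as the special characters `*`, `@` and `#`.

```r
# Compute the CUSIP check digit from the first eight characters.
# Digits keep their value, letters map to 10..35, every second character
# is doubled, and the digits of each value are summed.
cusip_check_digit <- function(cusip8) {
  chars <- strsplit(toupper(cusip8), "")[[1]]
  vals <- vapply(chars, function(ch) {
    if (grepl("^[0-9]$", ch)) {
      as.numeric(ch)
    } else if (grepl("^[A-Z]$", ch)) {
      match(ch, LETTERS) + 9
    } else {
      NA_real_  # special characters are not handled in this sketch
    }
  }, numeric(1))
  even <- seq_along(vals) %% 2 == 0
  vals[even] <- vals[even] * 2
  (10 - sum(vals %/% 10 + vals %% 10) %% 10) %% 10
}

# Find valid nine-character CUSIPs in a text and return their
# six- and eight-character prefixes.
extract_cusip <- function(text) {
  pat <- "\\b[0-9A-Za-z]{8}[0-9]\\b"
  cands <- unique(regmatches(text, gregexpr(pat, text, perl = TRUE))[[1]])
  ok <- vapply(cands, function(x) {
    isTRUE(cusip_check_digit(substr(x, 1, 8)) == as.numeric(substr(x, 9, 9)))
  }, logical(1))
  valid <- toupper(cands[ok])
  data.frame(cusip6 = substr(valid, 1, 6),
             cusip8 = substr(valid, 1, 8),
             stringsAsFactors = FALSE)
}

# Example call on a made-up cover-page fragment:
extract_cusip("Common Stock (CUSIP Number) 037833100")
```

The check-digit filter removes most nine-character strings that only look like CUSIPs, but some false positives can remain.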

Step 5. Extract size of the block position.

  • parsing_prc_position.R extracts the aggregate block size from the filing.
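To give a sense of what this kind of parsing can look like, the sketch below searches for the cover-page item "Percent of class represented by amount in row (...)" and reads the percentage that follows it. It is a simplified assumption, not the logic of parsing_prc_position.R: the function name `extract_percent_of_class` is hypothetical, the value must be followed by a `%` sign within 80 non-digit characters, and filings with several reporting persons return one value per cover page, leaving aggregation to the caller.

```r
# Return every "percent of class" value found on the cover pages of a filing.
extract_percent_of_class <- function(text) {
  pat <- paste0(
    "(?is)percent\\s+of\\s+class\\s+represented\\s+by\\s+amount\\s+in\\s+row",
    "\\s*\\(?\\s*[0-9]+\\s*\\)?",        # the row number, e.g. (9) or (11)
    "[^0-9]{0,80}?",                      # short gap before the value
    "[0-9]{1,3}(?:\\.[0-9]+)?\\s*%"       # the percentage itself
  )
  hits <- regmatches(text, gregexpr(pat, text, perl = TRUE))[[1]]
  # Keep only the number that directly precedes the trailing % sign.
  nums <- sub("(?s).*?([0-9]{1,3}(?:\\.[0-9]+)?)\\s*%$", "\\1", hits, perl = TRUE)
  vals <- as.numeric(nums)
  # Discard values that cannot be a percentage of a share class.
  vals[!is.na(vals) & vals <= 100]
}

# Example call on a made-up cover-page fragment:
extract_percent_of_class(
  "11. PERCENT OF CLASS REPRESENTED BY AMOUNT IN ROW (9)\n     5.2%"
)
```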

Step 6. Extract identity of blockholders.
