upstream cleaning English affricate wrt tie bar (#491)
* upstream cleaning English affricate wrt tie bar

* Changelog updated and summary.tsv added

* removed extra tsv files
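
The commit does not show how the cleaning itself was done, so the following is only an illustrative sketch of the kind of inconsistency at issue. It assumes pronunciations are space-separated segment strings, that the affected English affricates are `t ʃ` and `d ʒ`, and that the desired form joins them with the tie bar (U+0361). The names `AFFRICATES` and `tie_affricates` are hypothetical and do not come from the repository.

```python
TIE_BAR = "\u0361"  # COMBINING DOUBLE INVERTED BREVE

# Assumed affricate pairs for English; not taken from the commit.
AFFRICATES = {("t", "ʃ"), ("d", "ʒ")}


def tie_affricates(pron: str) -> str:
    """Joins adjacent affricate segments with a tie bar.

    `pron` is assumed to be a space-separated sequence of segments.
    Segments that already carry a tie bar are left unchanged.
    """
    segments = pron.split()
    out = []
    i = 0
    while i < len(segments):
        pair = tuple(segments[i:i + 2])
        if pair in AFFRICATES:
            # Merge the two segments into one tied segment.
            out.append(pair[0] + TIE_BAR + pair[1])
            i += 2
        else:
            out.append(segments[i])
            i += 1
    return " ".join(out)


if __name__ == "__main__":
    print(tie_affricates("t ʃ ɜː t ʃ"))
```

A blanket rewrite like this would also merge a `t` and `ʃ` that merely meet at a syllable or morpheme boundary, which may be one reason to fix entries at the source rather than in post-processing.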
Othergreengrasses authored Mar 28, 2023
1 parent 2af814c commit a194bca
Showing 10 changed files with 44,078 additions and 21,269 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -108,6 +108,7 @@ Unreleased
- Fixed Common character collection in `common_characters.py` (\#419)
- Scraping test fixed for `blt`. (\#436)
- Changed URLs to point at CUNY-CL repo, where applicable. (\#438)
+- Upstream cleaning wrt English tie bar. (\#491)

### Under `wikipron/` and elsewhere

12 changes: 6 additions & 6 deletions data/scrape/README.md
@@ -87,12 +87,12 @@
| [TSV](tsv/ell_grek_broad.tsv) | ell | Modern Greek (1453-) | Greek | Greek | | False | Broad | True | 11,900 |
| [TSV](tsv/ell_grek_broad_filtered.tsv) | ell | Modern Greek (1453-) | Greek | Greek | | True | Broad | True | 11,725 |
| [TSV](tsv/ell_grek_narrow.tsv) | ell | Modern Greek (1453-) | Greek | Greek | | False | Narrow | True | 407 |
-| [TSV](tsv/eng_latn_uk_broad.tsv) | eng | English | English | Latin | UK, Received Pronunciation | False | Broad | True | 66,476 |
-| [TSV](tsv/eng_latn_uk_broad_filtered.tsv) | eng | English | English | Latin | UK, Received Pronunciation | True | Broad | True | 65,765 |
-| [TSV](tsv/eng_latn_uk_narrow.tsv) | eng | English | English | Latin | UK, Received Pronunciation | False | Narrow | True | 1,343 |
-| [TSV](tsv/eng_latn_us_broad.tsv) | eng | English | English | Latin | US, General American | False | Broad | True | 63,813 |
-| [TSV](tsv/eng_latn_us_broad_filtered.tsv) | eng | English | English | Latin | US, General American | True | Broad | True | 62,993 |
-| [TSV](tsv/eng_latn_us_narrow.tsv) | eng | English | English | Latin | US, General American | False | Narrow | True | 1,700 |
+| [TSV](tsv/eng_latn_uk_broad.tsv) | eng | English | English | Latin | UK, Received Pronunciation | False | Broad | True | 71,727 |
+| [TSV](tsv/eng_latn_uk_broad_filtered.tsv) | eng | English | English | Latin | UK, Received Pronunciation | True | Broad | True | 71,021 |
+| [TSV](tsv/eng_latn_uk_narrow.tsv) | eng | English | English | Latin | UK, Received Pronunciation | False | Narrow | True | 1,557 |
+| [TSV](tsv/eng_latn_us_broad.tsv) | eng | English | English | Latin | US, General American | False | Broad | True | 69,683 |
+| [TSV](tsv/eng_latn_us_broad_filtered.tsv) | eng | English | English | Latin | US, General American | True | Broad | True | 68,874 |
+| [TSV](tsv/eng_latn_us_narrow.tsv) | eng | English | English | Latin | US, General American | False | Narrow | True | 2,035 |
| [TSV](tsv/enm_latn_broad.tsv) | enm | Middle English (1100-1500) | Middle English | Latin | | False | Broad | True | 8,825 |
| [TSV](tsv/epo_latn_broad.tsv) | epo | Esperanto | Esperanto | Latin | | False | Broad | True | 4,889 |
| [TSV](tsv/epo_latn_narrow.tsv) | epo | Esperanto | Esperanto | Latin | | False | Narrow | True | 12,218 |
3 changes: 2 additions & 1 deletion data/scrape/lib/languages.json
@@ -598,7 +598,8 @@
"script": {
"zyyy": "Common",
"latn": "Latin",
-"grek": "Greek"
+"grek": "Greek",
+"hebr": "Hebrew"
}
},
"enm": {