update_unicode.sh: move it into contrib/update-unicode
As it's used only by a tiny minority of the Git developer population,
this script does not belong in the main Git source directory.

Move it into contrib/ and adjust the paths to account for the new
location.

Signed-off-by: Beat Bolli <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
bbolli authored and gitster committed Dec 14, 2016
1 parent 32c239d commit f3eb549
Showing 4 changed files with 26 additions and 6 deletions.
1 change: 0 additions & 1 deletion .gitignore
@@ -231,7 +231,6 @@
 /config.mak.autogen
 /config.mak.append
 /configure
-/unicode
 /tags
 /TAGS
 /cscope*
3 changes: 3 additions & 0 deletions contrib/update-unicode/.gitignore
@@ -0,0 +1,3 @@
uniset/
UnicodeData.txt
EastAsianWidth.txt
20 changes: 20 additions & 0 deletions contrib/update-unicode/README
@@ -0,0 +1,20 @@
TL;DR: Run update_unicode.sh after the publication of a new Unicode
standard and commit the resulting unicode_width.h file.

The long version
================

The Git source code ships the file unicode_width.h, which contains
tables of zero-width and double-width Unicode code points.
These tables are generated using update_unicode.sh in this directory.
update_unicode.sh itself uses a third-party tool, uniset, to query two
Unicode data files for the interesting code points.

On first run, update_unicode.sh clones uniset from GitHub and builds it.
This requires a current-ish version of autoconf (2.69 works as of
December 2016).

On each run, update_unicode.sh checks whether more recent Unicode data
files are available from the Unicode Consortium, and rebuilds the header
unicode_width.h with the new data. The new header can then be
committed.
8 changes: 3 additions & 5 deletions contrib/update-unicode/update_unicode.sh
@@ -5,11 +5,9 @@
 #Mn Nonspacing_Mark a nonspacing combining mark (zero advance width)
 #Cf Format a format control character
 #
-UNICODEWIDTH_H=../unicode_width.h
-if ! test -d unicode; then
-mkdir unicode
-fi &&
-( cd unicode &&
+cd "$(dirname "$0")"
+UNICODEWIDTH_H=$(git rev-parse --show-toplevel)/unicode_width.h
+(
 if ! test -f UnicodeData.txt; then
 wget http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt
 fi &&
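
The path handling introduced by this hunk can be tried in isolation. The
following is a minimal sketch, not the script itself: resolve_header_path
is a hypothetical helper named here for illustration only, and it assumes
that git is on the PATH and that the script lives somewhere inside a Git
working tree.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): print where the generated header
# would be written, given the path to the script.
# The body runs in a subshell, so the caller's working directory is untouched.
resolve_header_path () (
	# Work relative to the script's own directory, wherever it was invoked from.
	cd "$(dirname "$1")" || exit 1
	# Ask Git for the top of the working tree; this fails outside a repository.
	toplevel=$(git rev-parse --show-toplevel) || exit 1
	printf '%s\n' "$toplevel/unicode_width.h"
)
```

Called as `resolve_header_path contrib/update-unicode/update_unicode.sh`
from any directory inside the tree, this should print the absolute path of
unicode_width.h at the top level. Resolving the path through Git rather
than through a relative `../` is what lets the script keep working if it
is run from elsewhere or moved again.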