Use code to generate Unicode-LaTeX character mapping table #223
@yihui in case you got a minute to review
First, I'd prefer using a matrix to write the data, which is a little more compact than the data frame.
Second, I wonder if it's worth the effort to make the file R/unicode_latex.R human-readable. If not, we could consider just dump() the data frame in update_unicode_latex().
I don't have a strong opinion on either point. It's fine to merge the current PR as is.
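To illustrate the two options mentioned above, here is a minimal sketch, not the code from this PR. It assumes three hypothetical sample rows and the column names unicode, latex, and int; the matrix is filled row by row and then converted to a data frame, and dump() is shown as the less readable alternative.

```r
# Hypothetical three-row sample; the real table has far more rows and
# these LaTeX strings are illustrative only.
m <- matrix(
  c(
    "\u00e9", "\\'{e}", "233",
    "\u00f1", "\\~{n}", "241",
    "\u00df", "\\ss{}", "223"
  ),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("unicode", "latex", "int"))
)

# A character matrix stores every cell as a string, so the code point
# column has to be converted back to integer after the conversion.
tbl <- as.data.frame(m, stringsAsFactors = FALSE)
tbl$int <- as.integer(tbl$int)

# Alternative: write a deparsed (valid but less readable) copy with dump().
f <- tempfile(fileext = ".R")
dump("tbl", file = f)
```

Writing one row per line keeps diffs small when a single mapping changes, which is the main reason to prefer it over the dump() output.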
Co-authored-by: Yihui Xie <[email protected]>
Great! Thanks. I've applied the changes and updated the table. The matrix version is exactly what we need to make this less tedious. How I wish there were a row-wise data frame constructor in base. 😂 Making it human-readable seems to be manageable in this case, so let's just keep it that way.
rows <- paste(
  sprintf(
    '"%s", "%s", %d',
    tbl$unicode,
    gsub("\\", "\\\\", tbl$latex, fixed = TRUE),
    tbl$int
  ),
  sep = ", "
)
paste() is unnecessary (commas have been added in sprintf()).
Suggested change:
- rows <- paste(
-   sprintf(
-     '"%s", "%s", %d',
-     tbl$unicode,
-     gsub("\\", "\\\\", tbl$latex, fixed = TRUE),
-     tbl$int
-   ),
-   sep = ", "
- )
+ rows <- sprintf(
+   '"%s", "%s", %d',
+   tbl$unicode,
+   gsub("\\", "\\\\", tbl$latex, fixed = TRUE),
+   tbl$int
+ )
Yes! Patched in another PR: #224
LGTM, thanks for improving the transparency of the source data!
Fixes #218
This PR creates an internal function in R/utils.R to generate the mapping table into R/unicode_latex.R. This eliminates the need for the binary file sysdata.rda and is more friendly for version control.

The new, code-generated data frame is bitwise identical to the version saved in sysdata.rda, except that the int column is of class integer, not numeric.

Data ingestion issue worth following up
You might want to check the data ingestion logic. I found no evidence of how the previous version was constructed. I used some ad hoc logic to get an identical version of the table, but it would be good to check whether the data included in the previous version is reasonable, or what specific filters were applied. For example, from the beginning, calling read.table() without quote = "" gives a warning and results in only 1740 rows, vs. 2757 rows when using quote = "", which avoids the warning.
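For anyone who wants to reproduce this kind of discrepancy on a small scale, here is a hedged sketch. The three tab-separated lines are hypothetical, not the real source file; they only mimic LaTeX accent commands that contain ' and ", which read.table() treats as quoting characters by default. The default-quote result depends on the data, so the sketch records it without asserting a value.

```r
# Hypothetical input: code point label, LaTeX command, integer code point.
txt <- c(
  "U+00E9\t\\'{e}\t233",
  "U+00F1\t\\~{n}\t241",
  "U+00FC\t\\\"{u}\t252"
)

# With quoting disabled, ' and " are ordinary characters,
# so every input line becomes exactly one row.
literal <- read.table(
  text = txt, sep = "\t", quote = "", stringsAsFactors = FALSE
)

# With the default quote = "\"'", lines may be merged, dropped, or
# rejected; capture whatever happens instead of assuming an outcome.
default_rows <- tryCatch(
  nrow(suppressWarnings(
    read.table(text = txt, sep = "\t", stringsAsFactors = FALSE)
  )),
  error = function(e) NA_integer_
)

c(with_quote_disabled = nrow(literal), with_default_quote = default_rows)
```

Comparing the two counts on the real input file would show which rows the previous version lost, if any, to quote handling.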