Zeno IMproved #153

Open
KOLANICH opened this issue May 21, 2019 · 6 comments · May be fixed by #343

Comments

@KOLANICH
Contributor

KOLANICH commented May 21, 2019

meta:
  id: zim
  title: "(Open) Zeno IMproved"
  application: 
    - Kiwix
    - zimlib
  file-extension: zim
  xref:
    wikidata: Q784695
  license: CC-BY-SA-3.0
  encoding: utf-8
  endian: le
doc: |
  A file format to store encyclopaedias of articles written in MediaWiki markup language. 
  Files for test: https://dumps.wikimedia.org/other/kiwix/zim/wikipedia/

doc-ref:
  - https://www.openzim.org/wiki/ZIM_file_format
  - https://wiki.openzim.org/wiki/OpenZIM
WiP:
  - https://github.com/KOLANICH/kaitai_struct_formats/blob/OpenZIM/media/openzim.ksy
@GreyCat
Member

GreyCat commented Jun 26, 2019

ecyclopaedias => encyclopaedias?

@KOLANICH
Contributor Author

KOLANICH commented Jun 26, 2019

Good catch! Fixed (currently in this issue only), thanks.

KOLANICH added a commit to KOLANICH-specs/kaitai_struct_formats that referenced this issue Feb 17, 2020
KOLANICH added a commit to KOLANICH-specs/kaitai_struct_formats that referenced this issue Jun 11, 2020
KOLANICH added a commit to KOLANICH-specs/kaitai_struct_formats that referenced this issue Jul 13, 2020
KOLANICH added a commit to KOLANICH-specs/kaitai_struct_formats that referenced this issue Jul 21, 2020
KOLANICH added a commit to KOLANICH-specs/kaitai_struct_formats that referenced this issue Jul 28, 2020
KOLANICH added a commit to KOLANICH-specs/kaitai_struct_formats that referenced this issue Sep 21, 2020
@KOLANICH KOLANICH linked a pull request Sep 21, 2020 that will close this issue
KOLANICH added a commit to KOLANICH-specs/kaitai_struct_formats that referenced this issue Oct 8, 2020
@generalmimon
Member

encyclopaedias

This is a British spelling. See https://en.wikipedia.org/wiki/Encyclopedia:

An encyclopedia or encyclopaedia (British English) is a reference work or (...)

and https://www.merriam-webster.com/dictionary/encyclopaedia:

encyclopaedia, encyclopaedic

Definition of encyclopaedia

chiefly British spellings of ENCYCLOPEDIA , ENCYCLOPEDIC

I thought that we prefer American spelling for KS identifiers, don't we? In fact, @KOLANICH suggested this himself in kaitai-io/kaitai_struct#522 (comment):

One more thing. American spelling: meter, not metre.

@KOLANICH
Contributor Author

KOLANICH commented Mar 13, 2021

In fact, @KOLANICH suggested this himself in kaitai-io/kaitai_struct#522 (comment)

Please don't take this out of context. That suggestion was made in the context of units. Units must be uniform, and most units libraries use the American spelling meter (though some support both), so introducing metre there instead of meter would require additional remapping of that id too.

In identifiers it probably makes sense to use American spelling only when the word with variability was introduced by spec author and if another part of the id is not already using British spelling. If the id was originally using British spelling it may make more sense to keep using it.

Also, doc is not an id and doesn't have such expectations as an id has. From a searchability point of view it may make sense to unify spelling, since not all search algorithms can handle that automatically. But I feel like encyclopaedia (and maybe even encyclopædia) looks cooler than encyclopedia, as naïve looks cooler than naive...

@generalmimon
Member

In identifiers it probably makes sense to use American spelling only when the word with variability was introduced by spec author and if another part of the id is not already using British spelling. If the id was originally using British spelling it may make more sense to keep using it.

I don't think so. Quite often, it doesn't make sense to blindly and thoughtlessly follow the style and internal conventions of the reference spec (see examples in the style guide). We want to ensure consistency, predictability and intelligibility of the formats in KSF, so we should follow stable style and conventions for KSY specs rather than conventions of anything else. If a spec for one image format uses color, then what's the point of using colour in another one?

For illustration, here's how it looks if you don't set any conventions:

- id: color_space
  type: u4
  enum: data_colour_spaces

In an ideal world, the same things should be called the same and different things differently, not only within one specification but across all of them. That's why it makes sense to normalize num_/len_/ofs_ prefixes (since every text specification has its own idea of how these fields should be named), spelling, etc.
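
A minimal sketch of what that normalization could look like in a KSY spec, assuming one spelling (`color`) throughout and the usual `num_`/`ofs_`/`len_` prefixes. All field and enum names here, and the enum values, are hypothetical and not taken from any existing spec:

```yaml
# Hypothetical fragment: every name below is illustrative only.
seq:
  - id: num_color_spaces      # count of entries: num_ prefix
    type: u4
  - id: ofs_color_table       # offset to a structure: ofs_ prefix
    type: u4
  - id: len_color_table       # byte length of a structure: len_ prefix
    type: u4
  - id: color_space           # field id and enum name share one spelling
    type: u4
    enum: color_spaces
enums:
  color_spaces:
    0: rgb                    # assumed values, for illustration
    1: cmyk
```

With one convention applied everywhere, a reader searching for `color` finds both the field and its enum, which the mixed `color_space` / `data_colour_spaces` pair above does not guarantee.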

Also, doc is not an id and doesn't have such expectations as an id has.

Perhaps. But that does not justify using different spelling in ids and doc.

But I feel like encyclopaedia (and maybe even encyclopædia) looks cooler than encyclopedia, as naïve looks cooler than naive...

Yes, that's exactly the problem - it's cool, stylish and definitely not monotonous. That's why it distracts attention - you're attracted by the unconventional spelling and not focused on the actual content.

@KOLANICH
Contributor Author

If a spec for one image format uses color, then what's the point of using colour in another one?

E.g. when a software name has the word colour in it and we include the software name in a spec id.

icc_4.ksy

Clearly a bug - the wording is used inconsistently without a proper justification to do so.

Perhaps. But that does not justify using different spelling in ids and doc.

Not quite. I feel like the pieces that are quotes should retain the original spelling. Especially if they are quotes within text literals marked as such, i.e. with >.

E.g. for software with the token colour in its name:

meta:
  id: ..._colour_...
  title: ... Color Format
  application: ... Colour ...

doc: |
  ... color ...
  The doc says:
  > ... colour ...

seq:
  ...
  - id: color
  ...

3 participants