Commit

Change name to packages.cfg.
Remove other syntax alternatives. We are aiming for a simple format.

Review URL: https://codereview.chromium.org//1080283002
lrhn committed Apr 14, 2015
1 parent 7d1c5f4 commit 5bd44ab
Showing 1 changed file with 24 additions and 62 deletions.
86 changes: 24 additions & 62 deletions DEP-pkgspec.md
@@ -34,30 +34,25 @@ It avoids creating extra directories with symbolic links, and instead puts the s

## Examples

An example `packages.txt` file could be:
An example `packages.cfg` file generated by the pub tool could be:

# auto-generated 2015-12-24 from somewhere/projectname/pubspec.yaml
# auto-generated 2015-12-24 11:24 from somewhere/projectname/pubspec.yaml
unittest=/home/somebody/.pub/cache/unittest-0.9.9/lib/
async=/home/somebody/.pub/cache/async-1.1.0/lib/
quiver=/home/somebody/.pub/cache/quiver-1.2.1/lib/
# end auto-generated
homebrew=../../libs/homebrew/lib/
rawangular=https://raw.githubusercontent.com/angular/angular.dart/master/lib/

This package specification will allow a program run using it to import `package:unittest/unittest.dart` and receive the file `/home/somebody/.pub/cache/unittest-0.9.9/lib/unittest.dart`.

At the same time, other libraries can be fetched from the net every time they are needed, or from another project that is still in development.

I envision that the comment-delimited lines are generated automatically by the "pub" tool, which leaves the manually added lines unchanged.

## Proposal

The solution proposed here is:

- Dart tools can load their "package"-URI resolution from a single file.
- They must still support the "packages" directory and "--package-root" argument for backwards compatibility.
- The proposed default name is `packages.txt`.
- The file uses a simple line-based key/value format similar to Java properties files or Windows ini files. This format is deliberately kept so simple that parsing it is trivial.
- The proposed default name is `packages.cfg`.
- The file uses a simple line-based key/value format similar to Java properties files or Windows ini files. This format is deliberately kept so simple that parsing it is trivial. It can be directly read using a Java `Properties` object.
- Tools that now support a "--package-root" parameter must also support a "--package-spec" parameter which takes a file name as argument.

The file itself contains a list of key/value pairs, separated by a `=` character. The syntax is:
@@ -82,83 +77,50 @@ If the `unittest` package was specified as:

unittest=../../packages/unittest-0.9.9/lib

in the specification file `file:///home/somebody/dart/project/smarty/packages.txt`, then the base path of the package `unittest` is `file:///home/somebody/dart/packages/unittest-0.9.9/lib/`. The remaining path "unittest.dart" is resolved against this, getting `file:///home/somebody/dart/packages/unittest-0.9.9/lib/unittest.dart`.
in the specification file `file:///home/somebody/dart/project/smarty/packages.cfg`, then the base path of the package `unittest` is `file:///home/somebody/dart/packages/unittest-0.9.9/lib/`. The remaining path "unittest.dart" is resolved against this, getting `file:///home/somebody/dart/packages/unittest-0.9.9/lib/unittest.dart`.
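
The resolution step described above can be sketched in Dart as follows. This is only an illustration, not the proposed implementation: the function name `resolvePackageUri` and its parameters are hypothetical, it assumes the `package:` URI has already been normalized, and it assumes that a location without a trailing slash is treated as a directory.

```dart
/// Resolves a `package:` URI using entries read from a package spec file.
///
/// [entries] maps package names to the raw (possibly relative) locations
/// written in the file; [specLocation] is the URI of the file itself.
/// Hypothetical helper, written only to illustrate the text above.
Uri resolvePackageUri(
    String packageUri, Map<String, String> entries, Uri specLocation) {
  const prefix = 'package:';
  if (!packageUri.startsWith(prefix)) {
    throw ArgumentError.value(packageUri, 'packageUri', 'Not a package: URI');
  }
  final rest = packageUri.substring(prefix.length);
  final slash = rest.indexOf('/');
  if (slash < 0) {
    throw ArgumentError.value(packageUri, 'packageUri', 'Missing file path');
  }
  final name = rest.substring(0, slash);
  final filePath = rest.substring(slash + 1);

  var location = entries[name];
  if (location == null) {
    throw StateError('No entry for package "$name"');
  }
  // Assumption: a missing trailing slash is added, so that the location is
  // treated as a directory when the file path is resolved against it.
  if (!location.endsWith('/')) location = '$location/';

  // First resolve the location against the spec file, then the remaining
  // path against the resulting base path.
  final base = specLocation.resolve(location);
  return base.resolve(filePath);
}

void main() {
  final spec =
      Uri.parse('file:///home/somebody/dart/project/smarty/packages.cfg');
  final entries = {'unittest': '../../packages/unittest-0.9.9/lib'};
  // Prints file:///home/somebody/dart/packages/unittest-0.9.9/lib/unittest.dart
  print(resolvePackageUri('package:unittest/unittest.dart', entries, spec));
}
```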

As another example, the import `import 'package:unittest/../../bar/something.dart';` is first normalized to `import 'package:bar/something.dart';` before it is resolved. This prevents clever URIs from escaping the specified package locations and reading arbitrary files on the same system.
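
A minimal sketch of such a normalization, written by hand over the path segments so that it does not depend on any library's dot-segment handling. The name `normalizePackageUri` is hypothetical, and discarding surplus `..` segments at the start of the path is an assumption about how the normalization clamps.

```dart
/// Normalizes the path part of a `package:` URI by removing `.` and `..`
/// segments, never stepping above the start of the path.
/// Hypothetical helper, written only to illustrate the text above.
String normalizePackageUri(String packageUri) {
  const prefix = 'package:';
  if (!packageUri.startsWith(prefix)) {
    throw ArgumentError.value(packageUri, 'packageUri', 'Not a package: URI');
  }
  final output = <String>[];
  for (final segment in packageUri.substring(prefix.length).split('/')) {
    if (segment == '.') continue;
    if (segment == '..') {
      // Drop the previous segment; a ".." with nothing before it is discarded.
      if (output.isNotEmpty) output.removeLast();
    } else {
      output.add(segment);
    }
  }
  return prefix + output.join('/');
}

void main() {
  // Prints package:bar/something.dart
  print(normalizePackageUri('package:unittest/../../bar/something.dart'));
}
```

Because normalization happens before the package name is looked up, the first remaining segment (`bar`) is what gets treated as the package name.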

If a tool gets neither a "--package-spec" nor a "--package-root" command line parameter, it may look for a `packages.txt` file next to the program entry point (which therefore cannot itself be given as a `package:` URI). For example, running an application like:
If a tool gets neither a "--package-spec" nor a "--package-root" command line parameter, it may look for a `packages.cfg` file next to the program entry point (which therefore cannot itself be given as a `package:` URI). For example, running an application like:

dart http://example.com/smarty/main.dart

will cause the `dart` stand-alone VM to check for the existence of `http://example.com/smarty/packages.txt`, and if that URI returns a file, use the content for resolving package URIs in the application.
will cause the `dart` stand-alone VM to check for the existence of `http://example.com/smarty/packages.cfg`, and if that URI returns a file, use the content for resolving package URIs in the application.

If a tool does not find a `packages.txt` file in that location, it should fall back to expecting a `packages` directory next to the entry point, as if that had been specified using `--package-root`, and resolving packages against that directory.
If a tool does not find a `packages.cfg` file in that location, it should fall back to expecting a `packages` directory next to the entry point, as if that had been specified using `--package-root`, and resolving packages against that directory.
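
The lookup order described above could be sketched like this. All names here (`PackageSource`, `PackageConfigChoice`, `choosePackageConfig`) are hypothetical, the `exists` callback stands in for whatever check the tool uses (a file-system test or an HTTP fetch), and rejecting the case where both parameters are passed is an assumption, since the text does not say how a command-line tool should handle it.

```dart
/// Where package resolution information comes from.
enum PackageSource { specFile, packageRoot }

/// The chosen source together with its location.
class PackageConfigChoice {
  final PackageSource source;
  final Uri location;
  PackageConfigChoice(this.source, this.location);
}

/// Picks the package resolution source for [entryPoint].
/// Hypothetical helper, written only to illustrate the text above.
PackageConfigChoice choosePackageConfig(
  Uri entryPoint, {
  Uri? packageSpec,
  Uri? packageRoot,
  required bool Function(Uri) exists,
}) {
  if (packageSpec != null && packageRoot != null) {
    // Assumption: passing both parameters is treated as an error.
    throw ArgumentError('Pass only one of --package-spec and --package-root');
  }
  if (packageSpec != null) {
    return PackageConfigChoice(PackageSource.specFile, packageSpec);
  }
  if (packageRoot != null) {
    return PackageConfigChoice(PackageSource.packageRoot, packageRoot);
  }
  // Neither parameter given: look for the file next to the entry point.
  final specFile = entryPoint.resolve('packages.cfg');
  if (exists(specFile)) {
    return PackageConfigChoice(PackageSource.specFile, specFile);
  }
  // Fall back to a "packages" directory next to the entry point.
  return PackageConfigChoice(
      PackageSource.packageRoot, entryPoint.resolve('packages/'));
}

void main() {
  final entry = Uri.parse('http://example.com/smarty/main.dart');
  final choice = choosePackageConfig(entry, exists: (uri) => true);
  print('${choice.source} at ${choice.location}');
}
```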

As part of the implementation of this proposal, the "pub" tool should be changed to allow writing a `packages.txt` file where it currently creates a "packages" directory in a package's root directory.
Pub should avoid automatically creating duplicate `packages.txt` files in other locations.
As part of the implementation of this proposal, the "pub" tool should be changed to allow writing a `packages.cfg` file where it currently creates a "packages" directory in a package's root directory.
Pub should avoid automatically creating duplicate `packages.cfg` files in other locations.

Tools that need to support the "--package-spec" parameter include the standalone VM, dart2js, and dart-analyzer.

## Alternatives and Variants

The simplest alternative is to keep using just the "--package-root" parameter.
The shortcomings are not crippling in most cases, but it has some problems that are not easily solved.

Alternative implementations of the same concept could be:

### Using a more structured and extensible format
The format should be *efficiently* parsable by the VM (which parses the file during start-up) and easily parsable by other tools. It's likely that more tools will need to read the file in the future, so the format should either be dead simple or it should have existing efficient parsers.

The proposed format for the `packages.txt` file is basically the least common denominator of configuration files, so there are lots of existing parsers compatible with it (it should be directly parsable as a Java .properties file, by most Windows .ini file parsers, and by any number of Unix config file parsers).
### Look for `packages.cfg` recursively for local files.
A tool is only required to look for the `packages.cfg` file next to the entry point. It would be useful if the tool checks the parent directory as well, recursively, so that there is no need for extra package resolution files when running files in a sub-directory. This mainly makes sense for local files, so it should be done only if the entry-point is a `file:` URI.

It has been suggested that the file could later be repurposed to also contain other meta-data for the program.
The proposed format is not extensible in this way.
A more structured format would allow other, unrelated, data to be added to the file in the future without breaking existing implementations.
We require that tools that are passed neither `--package-root` nor `--package-spec` as command line parameters with an entry point that is a `file:` URI must also *check* if a `packages` directory exists next to the entry point, and not just assume it.
If the directory does not exist, the tool should check for the existence of a `packages.cfg` in the parent directory, and so on along the path up to the file system root, stopping when the first file is found.

One example would be using JSON, with a format like `{"packages":{"unittest":"...../lib", ...}}`. Any JSON parser can parse the file, and only look at the object under `"packages"` for package resolution information. That would allow adding more properties to the top-level object later without affecting package resolution.
This allows running `dart program.dart` on any Dart file in a package without extra symlinks and without extra duplicates of the package resolution file, since the `packages.cfg` file in the root of the package directory will take precedence.

Another option would be to have a format that is a subset of Dart syntax, for example:
```dart
var packages = const { r"unittest": r"...../lib", .... };
```
That would allow the VM to use its existing Dart parser to parse the data, but would also require all other tools that want to read the file to implement a (partial) Dart parser. There is no particular reason to prefer this format over JSON for the VM, and it's very inconvenient for other existing or future tools.
The current solution is to create duplicate package directories in all directories that might contain a Dart program. Adding this complication in looking for the package resolution file is probably worth it if it avoids having similar duplicate `packages.cfg` files.

Reusing the file for other information as well is so far purely speculative.
It is not clear that using this file for any other metadata is preferable to having it in a separate file. If anything, VM start-up time considerations would suggest *not* adding any data to the package resolution file that isn't needed by the VM at start-up. One example of data that would be necessary at start-up is configuration options, normally passed on the command line as `-Dfoo=bar`, which could be added from a file instead.

Using a more complex format adds extra overhead and complexity. Changing from a simple line-based format to a structured format requires a real parser. This makes tools more complex, and since the file must be read on start-up, more complexity may increase the delay before code can start running.
Using JSON can leverage existing efficient parsers, but it's still unlikely to be more efficient than a simple line-based format.
It still takes extra time to search the parent path in the case where there is no file in any higher-level directory, and the search continues to the root directory. Some file systems may even have networked directories in the parent path, even if the current directory is on local disk, making the extra lookup potentially more expensive.

The proposed format is probably the simplest format that solves the original problem, and if we later figure out that we need more, we can change the format at that time.
Using a file with just the simple format is still an improvement over using file-system links, and making it more complex doesn't add anything to solving that problem.
The check can only be performed on `file:` URIs using OS file operations. There is no guaranteed way to check if a `packages` directory exists on an HTTP server. Fetching `http://example.com/app/packages/` may fail even if `http://example.com/app/packages/foo/foo.dart` would succeed.

On the other hand, all experience shows that configuration files tend to increase in complexity as "just one more feature" is added to a format that eventually proves too simple. Picking an existing structured format will leverage this experience ahead of time, instead of maybe having to do it later.
## Alternatives and Variants

### Allow more than one specification file.
It could be possible to add more than one file on the command line, and/or allow imports in specification files. This would allow reuse of existing files, and patching together a configuration from partial configurations. Again this will increase start-up latency, and imports will require a more complex format.
There are a few tweaks that can be applied to the behavior above if it is deemed advantageous, but which should probably not be part of a first implementation.

### Formatting tweaks
The simple line-based syntax may be tweaked slightly.

- Simplifying the format by dropping comments. Comments are very useful for both manually written files and for adding extra information.
- Simplifying the format by dropping comments. Comments are very useful for both manually written files and for adding extra information, and as currently specified, detecting a comment is only a matter of reading the first character of a line.

- Allowing extra white-space in the format at the start and end of a line and around the `=`. This allows, for example, aligning entries, but requires defining "whitespace" and (very) slightly increases the complexity of the parser. It's not necessary, but might be convenient. Just accepting space and tab is likely sufficient for most users, but it's also annoying to have other white-space characters not allowed if they are not visually distinguishable from allowed spaces.

- Add sections like in Windows ini-files. A section is begun by a line like `[section name]` and reaches until the next section or the end of the file. Sections have no influence on resolution, but can allow tools that create or manipulate the package resolution file to tag specific entries in the file. The same behavior can be achieved using recognizable comments (like in the first example above), but if the feature is useful, it is safer to have it supported explicitly instead of hacking it up using meaningful comments. Parsers that are uninterested in sections can treat a line starting with `[` as a comment. A `[` character is not valid in a path segment, so using it for comments is not ambiguous.

### Look for `packages.txt` recursively.
A tool is only required to look for the `packages.txt` file next to the entry point. It may be useful if the tool checks the parent directory as well, recursively, so that there is no need for extra package resolution files when running files in a sub-directory. This mainly makes sense for local files, so it should be done only if the entry-point is a `file:` URI.

We can require that tools that are passed neither `--package-root` nor `--package-spec` as command line parameters for an entry point that is a `file:` URI must also *check* if a `packages` directory exists next to the entry point, and not just assume it.
If the directory does not exist, the tool should check for the existence of a `packages.txt` in the parent directory, and so on along the path up to the file system root, stopping when the first file is found.

This allows running `dart program.dart` on any Dart file in a package without extra symlinks and without extra duplicates of the package resolution file, since the `packages.txt` file in the root of the package directory will take precedence.

The current solution is to create duplicate package directories in all directories that might contain a Dart program. Adding this complication in looking for the package resolution file is probably worth it if it avoids having similar duplicate `packages.txt` files.

It still takes extra time to search the parent path in the case where there is no file in any higher-level directory, and the search continues to the root directory. Some file systems may even have networked directories in the parent path, even if the current directory is on local disk, making the extra lookup potentially more expensive.

The check can only be performed on `file:` URIs using OS file operations. There is no guaranteed way to check if a `packages` directory exists on an HTTP server. Fetching `http://example.com/app/packages/` may fail even if `http://example.com/app/packages/foo/foo.dart` would succeed.
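
The upward search described in this section can be sketched as below. It covers only the walk through parent directories, not the preceding check for a `packages` directory. The function name `findPackageSpec` and its parameters are hypothetical; the file name is a parameter so the sketch does not depend on which default name is chosen, and the `exists` callback stands in for an OS file check.

```dart
/// Searches the directory of [entryPoint] and then each parent directory
/// for [fileName], returning the first match or `null` if none exists.
/// Only `file:` URIs are searched, as described in the text above.
/// Hypothetical helper, written only for illustration.
Uri? findPackageSpec(Uri entryPoint, bool Function(Uri) exists,
    {String fileName = 'packages.cfg'}) {
  if (entryPoint.scheme != 'file') return null;
  var dir = entryPoint.resolve('.'); // Directory containing the entry point.
  while (true) {
    final candidate = dir.resolve(fileName);
    if (exists(candidate)) return candidate;
    final parent = dir.resolve('..');
    if (parent == dir) return null; // Reached the file system root.
    dir = parent;
  }
}

void main() {
  // A fake file system with a single spec file in the package root.
  final files = {'file:///home/somebody/project/packages.cfg'};
  final entry = Uri.parse('file:///home/somebody/project/bin/tool/main.dart');
  print(findPackageSpec(entry, (uri) => files.contains(uri.toString())));
}
```

With a real file system, the callback could be `(uri) => File.fromUri(uri).existsSync()` from `dart:io`. Each step of the loop is one extra existence check, which is the cost discussed above when no file is found before the root.
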
### Allow more than one specification file.
It could be possible to add more than one file on the command line, and/or allow imports in specification files. This would allow reuse of existing files, and patching together a configuration from partial configurations. Again this will increase start-up latency, and imports will require a more complex format.

### Multiple locations for the same package
Allow the same package name to occur more than once, associating it to more than one target location. Resolving a file in that package then checks each possible target location in order until it finds one that holds the requested file. The use of this feature is highly speculative, but could allow some parts of a package to reside in a different location than the rest, without having to copy the files to a common location.
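
As a rough sketch of this variant, resolution would try each location in the order given. The function name `resolveFirstExisting` is hypothetical, each location is assumed to be an absolute URI ending in `/`, and the `exists` callback stands in for a file check or fetch.

```dart
/// Resolves [filePath] against each of [locations] in order and returns the
/// first candidate for which [exists] is true, or `null` if there is none.
/// Hypothetical helper, written only to illustrate the text above.
Uri? resolveFirstExisting(
    String filePath, List<Uri> locations, bool Function(Uri) exists) {
  for (final location in locations) {
    final candidate = location.resolve(filePath);
    if (exists(candidate)) return candidate;
  }
  return null;
}

void main() {
  final locations = [
    Uri.parse('file:///home/somebody/overrides/quiver/lib/'),
    Uri.parse('file:///home/somebody/.pub/cache/quiver-1.2.1/lib/'),
  ];
  // A fake file system in which only the cached copy holds the file.
  final files = {'file:///home/somebody/.pub/cache/quiver-1.2.1/lib/quiver.dart'};
  print(resolveFirstExisting(
      'quiver.dart', locations, (uri) => files.contains(uri.toString())));
}
```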
@@ -171,7 +133,7 @@ The format was picked to be quick for the VM to parse.

The `Isolate.spawnUri` function has a `packageRoot` parameter. It should probably be extended with a `packageResolution` parameter of type `Map<String,Uri>`. If both parameters are passed, the `spawnUri` function should fail.

There should be a Dart package for reading and writing `packages.txt` files, converting to and from `Map<String, Uri>`. This can be used by all Dart based tools that need to read or write the package resolution file.
There should be a Dart package for reading and writing `packages.cfg` files, converting to and from `Map<String, Uri>`. This can be used by all Dart based tools that need to read or write the package resolution file.
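
A minimal sketch of what such reading and writing could look like. The function names `parsePackages` and `writePackages` are hypothetical. The syntax assumed here is the one visible in the examples: one `name=location` entry per line, lines starting with `#` are comments, and empty lines are ignored. Adding a missing trailing slash to a location is likewise an assumption.

```dart
import 'dart:convert';

/// Parses the content of a package spec file located at [specLocation]
/// into a map from package name to absolute base URI.
/// Hypothetical helper, written only to illustrate the text above.
Map<String, Uri> parsePackages(String content, Uri specLocation) {
  final result = <String, Uri>{};
  for (final line in LineSplitter.split(content)) {
    // Comments are detected by looking at the first character only.
    if (line.isEmpty || line.startsWith('#')) continue;
    final eq = line.indexOf('=');
    if (eq <= 0) throw FormatException('Expected "name=location"', line);
    final name = line.substring(0, eq);
    var location = line.substring(eq + 1);
    // Assumption: locations are directories, so ensure a trailing slash.
    if (!location.endsWith('/')) location = '$location/';
    result[name] = specLocation.resolve(location);
  }
  return result;
}

/// Writes [packages] back out in the same line-based format.
String writePackages(Map<String, Uri> packages, {String? comment}) {
  final buffer = StringBuffer();
  if (comment != null) buffer.writeln('# $comment');
  packages.forEach((name, location) => buffer.writeln('$name=$location'));
  return buffer.toString();
}

void main() {
  const content = '''
# auto-generated
unittest=/home/somebody/.pub/cache/unittest-0.9.9/lib/
homebrew=../../libs/homebrew/lib/
''';
  final spec = Uri.parse('file:///home/somebody/dart/project/packages.cfg');
  final packages = parsePackages(content, spec);
  print(writePackages(packages, comment: 're-written'));
}
```

Note that this writer emits the resolved absolute URIs, so relative locations in the input are not preserved when a file is read and written back.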

The "per package name" configuration enforces that package names are special: they are not just the first segment of the path of the `package:` URI. Nothing currently prevents placing a file *in* the "packages" directory, say "trick.dart", and then importing "package:trick.dart". With the package-spec file, that is no longer possible, because "trick.dart" would be seen as a package name with no file path, so no file would be found.
