What is this thing?
===================

EZID is a Django web application that can be run under Django's
built-in development server or under Apache with the mod_wsgi
extension.

Prerequisites
=============

EZID requires:

- Django <http://www.djangoproject.com/> (version 1.8+);
- Python <http://www.python.org/> (version 2.7+);
- a relational database (SQLite <http://www.sqlite.org/> and MySQL
  <http://www.mysql.com/> have both been used);
- if using MySQL, a MySQL API driver, e.g., mysqlclient
  <https://pypi.python.org/pypi/mysqlclient>;
- lxml <http://lxml.de/> for processing XML;
- Apache <http://httpd.apache.org/>, if running under Apache, and
- mod_wsgi <http://code.google.com/p/modwsgi/> (note that to work with
  mod_wsgi, Python must have been compiled with shared object
  support);
- an SSL library, e.g., OpenSSL <http://www.openssl.org/>, to
  communicate with DataCite and LDAP;
- OpenLDAP <http://www.openldap.org/> if LDAP authentication is used,
  and
- python-ldap <http://www.python-ldap.org/>.

Those are the core prerequisites for running EZID standalone.  But
EZID is of little value unless there are some other, external servers
running: a Noid <http://wiki.ucop.edu/display/Curation/NOID> "egg"
server for storing metadata (strictly speaking, optional, but an
essential component in the grand scheme of things); additional Noid
"nog" servers for minting identifiers; a shoulder server (optional);
DataCite services for creating DOI identifiers; and an LDAP server, if
LDAP authentication is used.

General layout and configuration
================================

Whether it is run under the built-in development server or under
Apache, EZID assumes that it is embedded in the following directory
layout:

    .../SITE_ROOT/
        PROJECT_ROOT/
            (this software distribution:)
            LDAP
            LOCALIZATION
            NOTES
            README
            apache/
            code/
            doc/
            etc/
            ezidapp/
            profiles/
            settings/
            static/
            templates/
                info/ (separate distribution)
            tools/
            ui_tags/
            xsd/
        db/
            alert_message
            django.sqlite3
            search.sqlite3 (if using SQLite)
            secret_key
            shoulders_cache.txt
            stats
            store.sqlite3
        download/
            public/
        logs/
            transaction_log

The names of the SITE_ROOT and PROJECT_ROOT directories are arbitrary
(EZID automatically detects what they are), but for the remainder of
this document we'll assume that they are literally those names.
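This document does not describe how the automatic detection works.
One plausible approach, shown here purely as a sketch and not as
EZID's actual mechanism, is to derive both directories from the
location of a file inside the settings directory.  The function name
detect_roots and the file name common.py are hypothetical.

```python
import os.path


def detect_roots(settings_file):
    """Return (site_root, project_root) for a file in PROJECT_ROOT/settings.

    Assumption: the file lives directly inside the settings directory
    shown in the layout above.
    """
    settings_dir = os.path.dirname(os.path.abspath(settings_file))
    project_root = os.path.dirname(settings_dir)   # .../SITE_ROOT/PROJECT_ROOT
    site_root = os.path.dirname(project_root)      # .../SITE_ROOT
    return site_root, project_root


if __name__ == "__main__":
    # common.py is a made-up file name used only for illustration.
    print(detect_roots("/srv/SITE_ROOT/PROJECT_ROOT/settings/common.py"))
```

Because the roots are computed from a file's own location, the
directory names themselves never need to be configured.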

EZID requires certain static HTML files to be present in the
.../SITE_ROOT/PROJECT_ROOT/templates/info directory.  These files are
provided by a separate Mercurial repository.

EZID requires one environment variable, DJANGO_SETTINGS_MODULE, that
indicates the deployment level.  Possible values:

    settings.localdev
    settings.remotedev
    settings.development
    settings.staging
    settings.workflow
    settings.production

See the corresponding modules in the settings directory for the
effects of the deployment level.  In addition, the deployment level
selects deployment-level-specific values for options in
settings/ezid.conf and settings/ezid.conf.shadow: an option name
prefixed with a deployment level in braces applies only at that
level, as in:

    [datacite]
    enabled: false
    {production}enabled: true
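The lookup implied by this example can be sketched with Python's
standard configparser module.  This is an illustration of the idea
only, not EZID's own configuration loader; the helper names
load_config and get_option are hypothetical.

```python
import configparser

SAMPLE = """
[datacite]
enabled: false
{production}enabled: true
"""


def load_config(text):
    """Parse configuration text without value interpolation."""
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    return parser


def get_option(parser, section, option, level):
    """Return the level-specific value if present, else the default."""
    prefixed = "{%s}%s" % (level, option)
    if parser.has_option(section, prefixed):
        return parser.get(section, prefixed)
    return parser.get(section, option)


if __name__ == "__main__":
    config = load_config(SAMPLE)
    for level in ("staging", "production"):
        print(level, get_option(config, "datacite", "enabled", level))
```

With the sample above, the production level picks up the prefixed
value and every other level falls back to the unprefixed one.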

Running under Django
====================

When running EZID under Django's built-in server, it will probably be
necessary to set PYTHONPATH, e.g., in csh:

    setenv PYTHONPATH .../SITE_ROOT/PROJECT_ROOT

(In sh or bash, use "export PYTHONPATH=.../SITE_ROOT/PROJECT_ROOT".)

The combination of the PYTHONPATH and DJANGO_SETTINGS_MODULE
environment variables determines the location of the Django
application.
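For a standalone Python script, the same two settings can be
established in code before Django is imported.  The following is a
minimal sketch; the function name configure_environment is
hypothetical, and settings.localdev is used only as an example
default.

```python
import os
import sys


def configure_environment(project_root, settings_module="settings.localdev"):
    """Make the project importable and select the deployment level.

    Mirrors setting PYTHONPATH and DJANGO_SETTINGS_MODULE in the shell.
    An already-set DJANGO_SETTINGS_MODULE is left untouched.
    """
    if project_root not in sys.path:
        sys.path.insert(0, project_root)
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)
    return os.environ["DJANGO_SETTINGS_MODULE"]


if __name__ == "__main__":
    # The path is a placeholder for the real PROJECT_ROOT directory.
    print(configure_environment("/path/to/SITE_ROOT/PROJECT_ROOT"))
```

Using setdefault means a value exported in the shell still takes
precedence over the default given in the script.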

To run the built-in server:

    django-admin runserver

By default, the server listens at http://localhost:8000/.

Initial setup
=============

To create the initial databases, assuming the above environment
variables have been set:

    django-admin migrate
    django-admin migrate --database=search

    sqlite3 .../SITE_ROOT/db/store.sqlite3 \
      < .../SITE_ROOT/PROJECT_ROOT/etc/store-schema.sql
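Where the sqlite3 command-line tool is not available, the store
schema could instead be loaded with Python's standard sqlite3 module.
This sketch assumes the schema file contains only plain SQL statements
(no sqlite3 shell dot-commands), which this document does not confirm;
the function name load_schema is hypothetical.

```python
import sqlite3


def load_schema(db_path, schema_path):
    """Execute every SQL statement in schema_path against db_path."""
    with open(schema_path, "r") as f:
        script = f.read()
    conn = sqlite3.connect(db_path)  # creates the file if it is missing
    try:
        conn.executescript(script)
        conn.commit()
    finally:
        conn.close()
```

It would be called with the same two paths as the shell command above:
the store.sqlite3 file under db and the store-schema.sql file under
etc.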

Also, if using MySQL for the search database:

    mysql < .../SITE_ROOT/PROJECT_ROOT/etc/search-mysql-addendum.sql

To load initial data into the search database, run the following
command.  (search-init.json is stored in
.../SITE_ROOT/PROJECT_ROOT/ezidapp/fixtures, but the full path need
not be specified on the command line.)  The users, groups, and realms
defined in this file are just examples.

    django-admin loaddata --database=search search-init.json

Running under Apache
====================

To run EZID under Apache and mod_wsgi, only the DJANGO_SETTINGS_MODULE
environment variable need be set.  Five sets of directives are
required in Apache's httpd.conf.  The following assume that EZID is
hosted at http://{host}/.

1. Load mod_wsgi.

    LoadModule wsgi_module path/to/mod_wsgi.so

2. Tell mod_wsgi to run EZID as a single, separate, multithreaded
process.  (Actually, this directive applies to all mod_wsgi
applications within the same server or virtual host.)  The name is
arbitrary.

    WSGIDaemonProcess site-1 threads=50 shutdown-timeout=60

The parameters are not required, but may be desirable.  The default
number of threads is 15, which may be too low if clients submit many
requests concurrently.  The default shutdown timeout is 5 seconds; a
longer timeout gives current operations more time to finish cleanly.

3. Map EZID URLs to mod_wsgi+Django.

    WSGIScriptAlias / /path/to/SITE_ROOT/PROJECT_ROOT/apache/django.wsgi

4. Add the following directives.  If SSL is used to protect the login
page, the WSGIApplicationGroup directive is needed to avoid creating
two parallel instances of the Python interpreter, one for HTTP and
one for HTTPS.  (Strangely, though, while this directive has the
desired effect, it does not need to be mentioned anywhere in Apache's
SSL configuration.  In fact, nothing EZID-related needs mentioning in
the SSL configuration.)  The WSGIPassAuthorization directive is needed
to pass through HTTP Basic authentication credentials (otherwise, they
get swallowed).

    <Directory /path/to/SITE_ROOT/PROJECT_ROOT/apache>
    Order Allow,Deny
    Allow from all
    WSGIApplicationGroup %{GLOBAL}
    WSGIProcessGroup site-1
    WSGIPassAuthorization on
    </Directory>

The application group *must* be "%{GLOBAL}" and not an arbitrary name
such as "ezid", because lxml is incompatible with mod_wsgi's use of
Python sub-interpreters (even though EZID is being run as a separate
daemon process).
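To illustrate what WSGIPassAuthorization makes available: with the
directive on, mod_wsgi passes the Authorization request header to the
application as the HTTP_AUTHORIZATION key of the WSGI environment.
The sketch below decodes HTTP Basic credentials from that key.  It is
not EZID's authentication code; the function name basic_credentials is
hypothetical.

```python
import base64


def basic_credentials(environ):
    """Return (username, password) from a WSGI environ, or None.

    None is returned when the header is absent, uses a scheme other
    than Basic, or cannot be decoded.
    """
    header = environ.get("HTTP_AUTHORIZATION", "")
    scheme, _, payload = header.partition(" ")
    if scheme.lower() != "basic" or not payload:
        return None
    try:
        decoded = base64.b64decode(payload.strip()).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None
    username, sep, password = decoded.partition(":")
    if not sep:
        return None
    return username, password
```

Without the directive, the key is absent and the function returns
None, which is the "swallowed" behavior described above.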

5. Add aliases so that static files are served by Apache, not Django.

    Alias /static /path/to/SITE_ROOT/PROJECT_ROOT/static

    <Directory /path/to/SITE_ROOT/PROJECT_ROOT/static>
    Order Allow,Deny
    Allow from all
    </Directory>

    Alias /download /path/to/SITE_ROOT/download/public

    <Directory /path/to/SITE_ROOT/download/public>
    Order Allow,Deny
    Allow from all
    Options -Indexes
    </Directory>