Seedster is a way to work with real content in a development database, for a Rails application that uses Postgres.
Why not use the built-in Rails seed mechanism?
We preferred a custom solution so that the development data was based on real production data, with referential integrity intact. What we settled on was exporting particular rows of data for a specific user, to provide a meaningful set of their data that could periodically be refreshed and added to with minimal effort.
How is it different from Rails' db seeds?
With Seedster, you write SQL queries for the tables you want data from. The queries can have a parameter like a user_id
. A Rake task is provided to dump data from a production database, and a separate task is provided to load data locally on developers machines.
This works with real user data?
Seedster uses content from a real user, so the recommendation is to use an employee user, test user, or something along those lines, so that the content being used for development is appropriate to share on a development team. Dump files can be viewed by all team members, to ensure they are based on an agreed upon user's data. At this time the dump file is not encrypted although that could be added.
What are the dependencies?
Seedster was developed for use with a Rails application, using PostgreSQL database and depends on psql
, ssh
, scp
, and tar
commands.
Requirements and expectations:
- Local and remote database schema versions are in same. The development database is empty prior to load.
- Use the provided Rake task to dump data from production.
rake seedster:dump
Individual dump files are consolidated into a single tar file. - Use the provided Rake task to download the data file (
rake seedster:load
) to download, extract, and load the data. NEW: Use theskip_download
option if you wish to load from a local data file. - Use
ENV
variables to pass in dynamic data to SQL queries responsible for selecting data, for example supplyUSER_ID
with a value on the command line, and add a parameter to your SQL query with syntax like this%{}
, for example:SELECT * FROM users WHERE id = '%{user_id}'
.
In order to dump the data for user ID 1, using a parameterized SQL query that expects a value for user_id
:
$ rake seedster:dump USER_ID=1
To download, extract, and load the data file:
$ rake seedster:load
gem 'seedster'
And then execute:
$ bundle
Or install it yourself as:
$ gem install seedster
An initializer is required to make Seedster work with your hosts, database, ssh config, and really drives everything that is unique to your application.
The gem provides an initializer. Run:
rails generate seedster:initializer
This will create a file config/initializers/seedster.rb
where you can begin putting in values for your application. The configuration is both for DB credentials, SSH credentials, but also the configuration of the tables you wish to dump.
These are both configured to read your Rails database configuration for defaults to get started. However the intention is the dump host is specified to be a Production host, and the load host is the Development host (or Staging etc.).
Database name
Database connection username
Database connection password
The path on the production host where the application is deployed. For example: /var/company/www/app/current
.
To use a parameter like user_id
in SQL queries, passing the value in via an environment variable like USER_ID=1
, provide a mapping like this {user_id: ENV['user_id']}
. Provide any additional key and value mappings you expect to be present when the query executes.
An SSH user that can connect (for purposes of downloading the file with scp
) to the host where the production dump is.
The SSH host (a hostname) where the dump file is stored. We made this configurable to be able to dump data from production and staging.
e.g. tables = [ {name: '', query: '' } ]
The tables option is where each table being dumped is configured. This is currently an array of hashes, with two items, name
and query
. name
is the name of the database table, and query
is the SQL query.
We are dumping 10-15 tables worth of data for the same user, so that means we have 10-15 hashes in this array.
This is where we would modify those queries over time to include or exclude more tables or fields of data.
To install this gem onto your local machine, run bundle exec rake install
. To release a new version, update the version number in version.rb
, and then run bundle exec rake release
, which will create a git tag for the version, push git commits and tags, and push the .gem
file to rubygems.org.
Run the test suite with rake test
.
Bug reports and pull requests are welcome on GitHub at https://github.com/groupon/seedster. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.
-
Update the version in
seedster.gemspec
-
git commit seedster.gemspec
with the following message format:Version x.x.x Changelog: * Some new feature * Some new bug fix
-
rake release
The gem is available as open source under the terms of the APACHE LICENSE, VERSION 2.0.
Everyone interacting in the Seedster project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.