Docker file configuration. #200

Status: Open. Wants to merge 2 commits into base: main.
15 changes: 15 additions & 0 deletions .dockerignore
@@ -0,0 +1,15 @@
# macOS files.
.DS_Store
/*/.DS_Store

# Git
/.git/*

# Documentation
*.md

# Ignore log and temp files.
/log/*
/tmp/*
/*/log/*
/*/tmp/*
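As a quick sanity check, the patterns above can be approximated in Ruby with File.fnmatch. This is an editor's sketch: Docker's real matcher follows Go filepath rules, so edge cases differ.

```ruby
# Approximate check of which paths the .dockerignore above excludes.
# File.fnmatch only approximates Docker's Go-style matching; treat this
# as an illustration rather than a faithful reimplementation.
PATTERNS = [
  ".DS_Store", "*/.DS_Store",  # macOS metadata at the root or one level down
  ".git/*",                    # git internals
  "*.md",                      # documentation
  "log/*", "tmp/*",            # root-level log and temp files
  "*/log/*", "*/tmp/*"         # the same, one directory down
].freeze

def ignored?(path)
  PATTERNS.any? { |pattern| File.fnmatch(pattern, path, File::FNM_PATHNAME) }
end

puts ignored?("README.md")   # true  (matches *.md)
puts ignored?("log/app.log") # true  (matches log/*)
puts ignored?("Gemfile")     # false (kept for the image build)
```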
2 changes: 2 additions & 0 deletions .env
@@ -0,0 +1,2 @@
VERSION=8.16.1
ENV=local
37 changes: 37 additions & 0 deletions Dockerfile
@@ -0,0 +1,37 @@
# Use Ruby 3.x as the base image
FROM ruby:3.4

ENV PORT=9393

# Set working directory
WORKDIR /app

# Copy the entire project
COPY . .


# Retain just the files needed for building
# Note: the depth options must precede the name tests to avoid `find` warnings.
RUN find . -mindepth 2 -maxdepth 2 \! -name "Gemfile" \! -name "*.gemspec" -print | xargs rm -rf
RUN find . -maxdepth 1 -type f \! -name "Gemfile*" \! -name "*.gemspec" | xargs rm

# We also need the version file, so add it back.
COPY elasticgraph-support/lib/elastic_graph/version.rb ./elasticgraph-support/lib/elastic_graph/version.rb


# Second stage: start from a fresh Ruby 3.x image
FROM ruby:3.4

WORKDIR /app


# Copy files from the first build stage.
COPY --from=0 /app .

# Install Ruby dependencies
RUN bundle install

# Copy the entire project
COPY . .


# Use shell form so ${PORT} is expanded at runtime (exec form performs no
# variable substitution).
CMD bundle exec rake "boot_in_container[${PORT},--host=0.0.0.0,true]"
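For context on the CMD argument string: Rake's bracket syntax pairs the comma-separated values positionally with the task's declared argument names (here :port, :rackup_args, :no_open, matching the rake task changed later in this diff). A simplified sketch of that positional mapping — real Rake parsing also handles escaped commas:

```ruby
# Simplified sketch of how Rake's bracket syntax pairs comma-separated
# values with a task's declared argument names. Real Rake parsing also
# handles escaped commas; this only shows the positional mapping.
def parse_task_args(arg_string, names)
  values = arg_string.split(",", names.size)
  names.zip(values).to_h
end

args = parse_task_args("9393,--host=0.0.0.0,true", [:port, :rackup_args, :no_open])
puts args[:port]        # "9393"
puts args[:rackup_args] # "--host=0.0.0.0"
puts args[:no_open]     # "true"
```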
2 changes: 1 addition & 1 deletion config/settings/development.yaml
@@ -4,7 +4,7 @@ datastore:
require: httpx/adapters/faraday
clusters:
main:
url: http://localhost:9334
url: http://elasticsearch:9200
backend: elasticsearch
settings: {}
index_definitions:
90 changes: 90 additions & 0 deletions docker-compose.yaml
Review comment (Collaborator, PR author): There may be a way to simplify both, but the depends_on logic only works if both services are in the same compose file, and it's a very nice way to make sure Elasticsearch is running before we bootstrap ElasticGraph.

@@ -0,0 +1,90 @@
---
networks:
default:
name: elastic
external: false
services:
elasticsearch:
build:
context: ./elasticgraph-local/lib/elastic_graph/local/elasticsearch/.
dockerfile: Dockerfile
args:
VERSION: ${VERSION}
container_name: elasticsearch-${VERSION}-${ENV}
healthcheck:
interval: 10s
retries: 80
test: curl --write-out 'HTTP %{http_code}' --fail --silent --output /dev/null http://localhost:9200/
Review comment (Collaborator, PR author): Probably a good idea to add, but the value is mostly that this is used to ensure Elasticsearch is running before we start up ElasticGraph.

environment:
# Note: we use `discovery.type=single-node` to ensure that the Elasticsearch node does not
# try to join a cluster (or let another node join it). This prevents problems when you
# have multiple projects using elasticgraph-local at the same time. You do not want
# their Elasticsearch nodes to try to join into a single cluster.
- discovery.type=single-node
# Note: we use `xpack.security.enabled=false` to silence an annoying warning Elasticsearch 7.13 has
# started spewing (as in hundreds of times!) as we run our test suite:
#
# > warning: 299 Elasticsearch-7.13.0-5ca8591c6fcdb1260ce95b08a8e023559635c6f3 "Elasticsearch built-in
# > security features are not enabled. Without authentication, your cluster could be accessible to anyone.
# > See https://www.elastic.co/guide/en/elasticsearch/reference/7.13/security-minimal-setup.html to enable
# > security."
#
# Since this is only used in local dev/test environments where the added security would make things harder
# (we'd have to setup credentials in our tests), it's simpler/better just to explicitly disable the security,
# which silences the warning.
- xpack.security.enabled=false
# We disable `xpack.ml` because it's not compatible with the `darwin-aarch64` distribution we use on M1 Macs.
# Without that flag, we get this error:
#
# > [2022-01-20T10:06:54,582][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [myron-macbookpro.local] uncaught exception in thread [main]
# > org.elasticsearch.bootstrap.StartupException: ElasticsearchException[Failure running machine learning native code. This could be due to running
# > on an unsupported OS or distribution, missing OS libraries, or a problem with the temp directory. To bypass this problem by running Elasticsearch
# > without machine learning functionality set [xpack.ml.enabled: false].]
#
# See also this github issue: https://github.com/elastic/elasticsearch/pull/68068
- xpack.ml.enabled=false
# We don't want Elasticsearch to block writes when the disk allocation passes a threshold for our local/test
# Elasticsearch we run using this docker setup.
# https://stackoverflow.com/a/75962819
#
# Without this, I frequently get `FORBIDDEN/10/cluster create-index blocked (api)` errors when running tests.
- cluster.routing.allocation.disk.threshold_enabled=false
# Necessary on Elasticsearch 8 since our test suites indiscriminately deletes all documents
# between tests to sandbox the state of each test. Without this setting, we get errors like:
#
# > illegal_argument_exception: Wildcard expressions or all indices are not allowed
- action.destructive_requires_name=false
- ES_JAVA_OPTS=-Xms4g -Xmx4g
ulimits:
nofile:
soft: 65536
hard: 65536
volumes:
- elasticsearch:/usr/share/elasticsearch/data
ports:
- ${PORT:-9200}:9200
kibana:
Review comment (Collaborator): We don't actually need Kibana for this.

Review comment (Collaborator, PR author): I figured it doesn't hurt to give people access to Kibana, but I can remove it. I'm looking into how to run containers conditionally so we can choose Elasticsearch vs. OpenSearch; if I do that, I'll make Kibana optional.

build:
context: ./elasticgraph-local/lib/elastic_graph/local/elasticsearch/.
dockerfile: UI-Dockerfile
args:
VERSION: ${VERSION}
container_name: kibana-${VERSION}-${ENV}
environment:
- ELASTICSEARCH_HOSTS=http://elasticsearch:9200
ports:
- ${UI_PORT:-5601}:5601
elasticgraph:
build:
context: .
dockerfile: Dockerfile
args:
VERSION: ${VERSION}
container_name: elasticgraph-${ENV}
ports:
- 9393:9393
depends_on:
elasticsearch:
condition: service_healthy
volumes:
elasticsearch:
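The compose file leans on variable interpolation such as container_name: elasticsearch-${VERSION}-${ENV} and ${PORT:-9200}, with values coming from the .env file above. A simplified Ruby sketch of the ${VAR} / ${VAR:-default} rule — real Compose supports more forms, e.g. ${VAR:?error}:

```ruby
# Rough sketch of docker compose's ${VAR} / ${VAR:-default} interpolation.
# Simplified: real Compose also supports ${VAR-default}, ${VAR:?err}, etc.
def interpolate(text, env)
  text.gsub(/\$\{(\w+)(?::-([^}]*))?\}/) do
    env[Regexp.last_match(1)] || Regexp.last_match(2).to_s
  end
end

env = { "VERSION" => "8.16.1", "ENV" => "local" }        # from the .env file above
puts interpolate("elasticsearch-${VERSION}-${ENV}", env) # "elasticsearch-8.16.1-local"
puts interpolate("${PORT:-9200}:9200", env)              # "9200:9200" (default applied)
```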
17 changes: 10 additions & 7 deletions elasticgraph-local/lib/elastic_graph/local/rake_tasks.rb
@@ -375,7 +375,7 @@ def initialize(local_config_yaml:, path_to_schema:)
yield self if block_given?

# Default the local port from the local_config_yaml file.
self.env_port_mapping = {"local" => local_datastore_port}.merge(env_port_mapping || {})
local_port = 9200
if (invalid_port_mapping = env_port_mapping.reject { |env, port| VALID_PORT_RANGE.cover?(port) }).any?
raise "`env_port_mapping` has invalid ports: #{invalid_port_mapping.inspect}. Valid ports must be in the #{VALID_PORT_RANGE} range."
end
@@ -466,9 +466,13 @@ def define_other_tasks
desc "Boots ElasticGraph locally from scratch: boots #{datastore_to_boot}, configures it, indexes fake data, and boots GraphiQL"
task :boot_locally, [:port, :rackup_args, :no_open] => ["#{datastore_to_boot.downcase}:local:daemon", *index_fake_data_tasks, "boot_graphiql"]

desc "Boots ElasticGraph locally and connects to an existing #{datastore_to_boot} cluster: configures it, indexes fake data, and boots GraphiQL"
task :boot_in_container, [:port, :rackup_args, :no_open] => [*index_fake_data_tasks, "boot_graphiql"]
Review comment (Collaborator): It's a little weird to expose a rake task which is only for use when running the "demonstrate EG" one-liner, and isn't itself useful when working in an EG project. (And also, defining a task here defines it for ALL EG projects!)

Can we instead improve boot_locally to tolerate the case where the datastore is already running? Maybe the elasticsearch:local:daemon task can be a no-op if the datastore is already running. Or if need be we can use an ENV var to enable that behavior or something...

Review comment (Collaborator, PR author): Curious what you think of reworking local to be able to run everything in a container like this? It makes a nice, easy way to have multiple versions of all the components. I saw the comment about Elasticsearch nodes accidentally forming a cluster; we can make that impossible if they don't share a network.

Review comment (@myronmarston, Feb 16, 2025):

> Curious what you think of reworking local to be able to run everything in a container like this?

I'm open to it as an option, particularly if it simplifies things. However, I wouldn't want to change things to require that ElasticGraph projects run the Ruby bits in a docker container, and I have a hard time thinking of situations where running the Ruby pieces in a docker container makes for a better development experience, outside of this one use case (offering a way to boot an example ElasticGraph project locally for people checking out ElasticGraph who may not have a working Ruby development environment).

Maybe this is just my lack of experience with docker, but while Docker's great for dependencies written in other languages that need to run as a separate process (e.g. Elasticsearch, OpenSearch, DynamoDB, LocalStack, etc.), I've never used docker to run a project's "main" language environment: in a Ruby project I run Ruby directly on my dev machine, and in a java/kotlin project I run the JVM outside docker directly on my dev machine. Running the main language environment inside a docker container while developing feels like it would have lots of downsides: it'd be more difficult to attach a debugger, IDE features might work differently (or not work at all), and start-up time to run a test or some other short-lived task would be greater.
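As an editor's illustration of the suggestion to make the daemon task tolerate an already-running datastore: a minimal sketch, where datastore_running? and the URL are hypothetical names, not code from this PR.

```ruby
require "net/http"
require "uri"

# Hypothetical sketch of the reviewer's suggestion: skip booting the daemon
# when something is already answering on the datastore URL. Any connection
# failure is treated as "not running".
def datastore_running?(url)
  Net::HTTP.get_response(URI(url)).is_a?(Net::HTTPSuccess)
rescue StandardError
  false
end

# In the rake task, the daemon boot could then become a no-op:
#   boot_daemon unless datastore_running?("http://elasticsearch:9200/")
```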



desc "Boots ElasticGraph locally with the GraphiQL UI, and opens it in a browser."
task :boot_graphiql, [:port, :rackup_args, :no_open] => :ensure_datastore_ready_for_indexing_and_querying do |task, args|
args.with_defaults(port: 9393, rackup_args: "", no_open: false)
args.with_defaults(port: 9393, host: "localhost", rackup_args: "", no_open: false)
port = args.fetch(:port)

# :nocov: -- we can't test `open` behavior through a test
@@ -495,7 +499,7 @@
end

task :ensure_local_datastore_running do
unless /200 OK/.match?(`curl -is localhost:#{local_datastore_port}`)
unless /200 OK/.match?(`curl -is #{local_datastore_url}`)
if elasticsearch_versions.empty?
raise <<~EOS
OpenSearch is not running locally. You need to start it in another terminal using this command:
@@ -533,13 +537,12 @@ def define_other_tasks
end
end

def local_datastore_port
@local_datastore_port ||= local_config
def local_datastore_url
@local_datastore_url ||= local_config
.fetch("datastore")
.fetch("clusters")
.fetch("main")
.fetch("url")[/localhost:(\d+)$/, 1]
.then { |port_str| Integer(port_str) }
.fetch("url")
end

def local_cluster_backends
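The switch from local_datastore_port to local_datastore_url in the hunk above matters because the old regex only extracted a port from localhost URLs; against the compose service URL it returns nil. A small demonstration:

```ruby
# The old implementation extracted the port with a localhost-only regex,
# which cannot handle the containerized URL http://elasticsearch:9200.
old_extract = ->(url) { url[/localhost:(\d+)$/, 1] }

puts old_extract.call("http://localhost:9334")             # "9334"
puts old_extract.call("http://elasticsearch:9200").inspect # nil
```

Passing the full URL through to curl, as the PR now does, works for both the local and the containerized setup.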