Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark

mcoGaia/dr-elephant

 
 

Dr. Elephant

Join the chat at https://gitter.im/linkedin/dr-elephant

Dr. Elephant is a performance monitoring and tuning tool for Hadoop and Spark. It automatically gathers all the metrics, runs analysis on them, and presents them in a simple way for easy consumption. Its goal is to improve developer productivity and increase cluster efficiency by making it easier to tune the jobs. It analyzes the Hadoop and Spark jobs using a set of pluggable, configurable, rule-based heuristics that provide insights on how a job performed, and then uses the results to make suggestions about how to tune the job to make it perform more efficiently.

Documentation

For more information on Dr. Elephant, check the wiki pages here.

For quick setup instructions: Click here

Developer guide: Click here

Administrator guide: Click here

User guide: Click here

Engineering Blog: Click here

Mailing list & GitHub Issues

Google Groups mailing list: Click here

GitHub issues: Click here

Meetings

We have scheduled a weekly Dr. Elephant meeting for the interested developers and users to discuss future plans for Dr. Elephant. Please click here for details.

How to Contribute?

Check this link.

How to compile and launch on Gaia3 from Thales gitlab

  1. Clone the project

  2. Global variables

    • Check the file "setEnv.txt", then load it: source setEnv.txt
    • Double-check that HTTP_PROXY and HTTPS_PROXY are set correctly.
  3. Database

    • Start the mysql service.
    • The default account is drelephant, with password "Dr-elephant123".
    • Connect with: mysql -u drelephant -p (or use the default root account: mysql -u root)
    • Create your own account or use the default account.
      • To create your own account, connect as root and run:
      • GRANT ALL PRIVILEGES ON *.* TO 'newUserName'@'localhost' IDENTIFIED BY 'newPassword' WITH GRANT OPTION;
    • Create a database or use the default database (the default database is "drelephant").
      • use drelephant or create database databaseName
    • Exit the mysql prompt.
  4. Test the application

    • Go to $PROJECT_ROOT.
    • Type "activator" to enter the Play framework prompt.
    • Type the command "test" to run all the tests.
  5. Compile

    • For the first compilation only:
      • ./compile.sh compile.conf
      • The following warning is expected on the first compilation only: DEPRECATION: You're using legacy binding syntax: valueBinding="newUser". It should not appear on later compilations.
    • For all other compilations:
      • In compile.sh you can add or remove the test step.
        • Replace "play_command $OPTS clean compile dist" with "play_command $OPTS clean compile test dist"
      • ./compile.sh compile.conf
      • The result of the compilation is stored in $PROJECT_ROOT/dist as a zip file.
  6. Start & Stop
    • After compilation:
      • cd dist/; unzip dr-elephant*.zip; cd dr-elephant*
      • Edit the following parameters in the file app-conf/elephant.conf: port, db_url, db_name, db_user and db_password.
    • Launch Dr. Elephant:
      • ./bin/start.sh app-conf/ then go to localhost:<port> to use the web UI.
    • Stop Dr. Elephant:
      • ./bin/stop.sh
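
The edit to app-conf/elephant.conf can also be scripted. The following is a minimal sketch, assuming the file stores each setting on its own line as key=value. The set_conf helper is hypothetical (it is not part of Dr. Elephant), and the starting values used in the demo (8080, localhost, root) are illustrative only.

```shell
# Sketch: update key=value settings in app-conf/elephant.conf.
# Assumption: each setting sits on its own line as key=value.

# set_conf FILE KEY VALUE: rewrite the line starting with "KEY=", or append it.
set_conf() {
  conf_file="$1"; conf_key="$2"; conf_value="$3"
  if grep -q "^${conf_key}=" "$conf_file"; then
    # Edit in place; the .bak copy keeps the command portable across sed versions.
    sed -i.bak "s|^${conf_key}=.*|${conf_key}=${conf_value}|" "$conf_file" \
      && rm -f "${conf_file}.bak"
  else
    printf '%s=%s\n' "$conf_key" "$conf_value" >> "$conf_file"
  fi
}

# Demo on a throwaway file; point conf at app-conf/elephant.conf for real use.
conf="$(mktemp)"
printf 'port=8080\ndb_url=localhost\ndb_name=drelephant\ndb_user=root\ndb_password=""\n' > "$conf"

set_conf "$conf" port 9000
set_conf "$conf" db_user drelephant
set_conf "$conf" db_password '"Dr-elephant123"'

cat "$conf"
```

The helper assumes that values contain no |, & or \ characters, since sed would interpret them.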

Get the latest modifications from the LinkedIn GitHub master branch

  1. Type the following command: git remote -v
     This should output something like:
     GAIA-repo https://[email protected]/gitlab/GAIA/Dr-elephant.git (fetch)
     GAIA-repo https://[email protected]/gitlab/GAIA/Dr-elephant.git (push)
     linkedIn-repo https://github.com/linkedin/dr-elephant.git (fetch)
     linkedIn-repo https://github.com/linkedin/dr-elephant.git (push)
     If not, add the missing remote with git remote add <name> <url>.

  2. Pull the project from github

    • git pull linkedIn-repo master
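
The command for adding a missing remote is not given in step 1. One plausible way to register the LinkedIn remote is sketched below; the add_remote_if_missing helper is hypothetical, and the demo runs in a throwaway repository rather than in your clone. The GAIA-repo URL is left out because it is not fully legible in this page.

```shell
# Sketch: register the upstream remote only if it is not configured yet.
# Run the helper inside your clone; the demo uses a throwaway repository.
repo="$(mktemp -d)"
git init -q "$repo"
cd "$repo" || exit 1

# add_remote_if_missing NAME URL
add_remote_if_missing() {
  if git config --get "remote.$1.url" >/dev/null; then
    echo "remote $1 already configured"
  else
    git remote add "$1" "$2"
  fi
}

add_remote_if_missing linkedIn-repo https://github.com/linkedin/dr-elephant.git

# Both the fetch and push lines for linkedIn-repo should now be listed.
git remote -v
```

Once the remote is listed, git pull linkedIn-repo master works as described in step 2.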

Push modifications to Thales GitLab

  • Update the remote (or check the file .git/config).
  • Check all modified files: git status
  • Add all modified/untracked files to the staging area: git add .
  • Commit your modifications: git commit -m "commit message"
  • Push to the remote repository: git push -u GAIA-repo

Tag a new version of Dr. Elephant

  • Check out the master branch: git checkout master
  • Create the tag: git tag -a v1.0 -m "Version 1.0"
  • Check that the tag was created on the local machine: git tag
  • Push the tag to the remote: git push GAIA-repo --tags

Generate the delivery

  • Check out the tag on a new branch: git checkout tags/tagName -b branchName
  • Check that the branch was created: git branch
  • You can also check the last commit of the tag: git log
  • Generate the delivery: compile the project (follow steps 2 and 4 of the section "How to compile and launch on Gaia3 from Thales gitlab")
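
The tagging and delivery steps can be rehearsed end to end without touching the real repository. The sketch below runs in throwaway directories: a local bare repository stands in for GAIA-repo, and the names v1.0 and delivery-1.0 as well as the demo identity are illustrative assumptions.

```shell
# Sketch: tag a version, push the tag, then branch from the tag for a delivery.
# A local bare repository plays the role of GAIA-repo.
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"
git init -q "$work/clone"
cd "$work/clone" || exit 1
git remote add GAIA-repo "$work/remote.git"

# One commit so there is something to tag (identity set only for this demo).
echo "demo" > README
git add README
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Initial commit"

# Tag a new version; an annotated tag also records an identity.
git -c user.name="Demo" -c user.email="demo@example.com" tag -a v1.0 -m "Version 1.0"
git tag                        # lists v1.0
git push -q GAIA-repo --tags

# Generate the delivery: check out the tag on a new branch.
git checkout -q tags/v1.0 -b delivery-1.0
git branch                     # the current branch is marked with *
git log --oneline -1           # last commit of the tag
```

Branching from the tag keeps the delivery build pinned to the tagged commit even if master moves on.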

License

Copyright 2016 LinkedIn Corp.

Licensed under the Apache License, Version 2.0 (the "License"); you may not
use this file except in compliance with the License. You may obtain a copy of
the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under
the License.
