
added makefile #34

Open · de-code wants to merge 3 commits into base: master

Conversation

@de-code (Contributor) commented Jun 18, 2019

This is work-in-progress.

It introduces a Makefile, which seems to be a good way to encapsulate commands (I would also be in favour of dockerizing it in the future).

I don't have a GPU on my laptop, therefore I moved the tensorflow_gpu dependency out.

You can run make dev-venv USE_GPU=1 to install it with that dependency.

It might be good to enforce flake8 and pylint, adding exceptions where necessary. Otherwise probably nobody checks the warnings.

Using a .flake8 (and .pylintrc) file instead of command line arguments would also help editor integrations of those tools pick up the same rules.

/cc @kermitt2
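The USE_GPU switch described above could be sketched as follows. This is a hypothetical illustration in Python (the actual mechanism lives in the Makefile, which is not shown here); the function name is invented, but tensorflow and tensorflow-gpu are the real PyPI package names.

```python
import os

# Hypothetical sketch of the dependency selection behind
# `make dev-venv USE_GPU=1`: pick which TensorFlow package to install
# depending on whether the GPU flag was passed.
def tensorflow_requirement(use_gpu):
    """Return the pip requirement to install for this environment."""
    return "tensorflow-gpu" if use_gpu else "tensorflow"

# Mirror the Makefile convention of passing USE_GPU=1 via the environment.
requirement = tensorflow_requirement(os.environ.get("USE_GPU") == "1")
print(requirement)
```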

@kermitt2 (Owner)

Hi @de-code

Thanks !

You don't need to remove tensorflow_gpu: it automatically defaults to CPU when no GPU is found (tensorflow_gpu includes tensorflow).

Question: don't we want to use flake8 to check for undefined variables?

@de-code (Contributor, Author) commented Jun 18, 2019

> You don't need to remove tensorflow_gpu: it automatically defaults to CPU when no GPU is found (tensorflow_gpu includes tensorflow).

It failed for me while trying to find a CUDA library. Maybe if we created a Docker container with the CUDA libraries it would automatically use the CPU if no GPU was found?

> Question: don't we want to use flake8 to check for undefined variables?

I am not sure I understand you correctly here. I usually don't have many flake8 exceptions; I tend to have more pylint exceptions. It should certainly highlight undefined variables and fail on them. Sometimes it gets this wrong with third-party libraries.

My recommendation is to use the default rules with the fewest possible exceptions, make the build fail on violations, and revise the code or rules until the checks pass (that could be a separate PR to keep this one small).
(some people also use mypy)

Example config from another project: .flake8, .pylintrc.

@kermitt2 (Owner)

With all the CUDA libs installed, I think you need to set the environment variable CUDA_VISIBLE_DEVICES=-1, e.g. os.environ['CUDA_VISIBLE_DEVICES'] = '-1', to avoid creating a GPU device.

But the CPU fallback works fine for me: I commonly deploy on servers without a GPU and there is no issue with the tensorflow-gpu dependency. I haven't seen any issue so far on machines without CUDA installed.
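The workaround described here can be sketched in a few lines. The key detail is ordering: the environment variable has to be set before TensorFlow is imported, which is why the import itself is shown only as a comment.

```python
import os

# Hide all GPUs from TensorFlow. This must be set BEFORE `import tensorflow`,
# otherwise the CUDA runtime may already have been initialised by the time
# the variable is read.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

# import tensorflow as tf  # would now see no GPU devices and run on CPU
```

Note that, as the discussion below shows, this only avoids creating a GPU device; it does not help when the tensorflow-gpu wheel cannot even load its CUDA shared libraries at import time.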

@kermitt2 (Owner)

Sorry about flake8, I overlooked the corresponding part in the Makefile; there is no difference from the current setting in .travis.yml.

@de-code (Contributor, Author) commented Jun 18, 2019

> With all the CUDA libs installed, I think you need to set the environment variable CUDA_VISIBLE_DEVICES=-1, e.g. os.environ['CUDA_VISIBLE_DEVICES'] = '-1', to avoid creating a GPU device.
>
> But the CPU fallback works fine for me: I commonly deploy on servers without a GPU and there is no issue with the tensorflow-gpu dependency. I haven't seen any issue so far on machines without CUDA installed.

This is what I am getting when trying to use tensorflow_gpu:

Using TensorFlow backend.
Traceback (most recent call last):
  File "/path/to/delft/venv/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/path/to/delft/venv/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/path/to/delft/venv/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/path/to/delft/venv/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/path/to/delft/venv/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "grobidTagger.py", line 3, in <module>
    from delft.utilities.Embeddings import Embeddings
  File "/path/to/delft/delft/utilities/Embeddings.py", line 3, in <module>
    from keras.preprocessing import text, sequence
  File "/path/to/delft/venv/lib/python3.6/site-packages/keras/__init__.py", line 3, in <module>
    from . import utils
  File "/path/to/delft/venv/lib/python3.6/site-packages/keras/utils/__init__.py", line 6, in <module>
    from . import conv_utils
  File "/path/to/delft/venv/lib/python3.6/site-packages/keras/utils/conv_utils.py", line 9, in <module>
    from .. import backend as K
  File "/path/to/delft/venv/lib/python3.6/site-packages/keras/backend/__init__.py", line 89, in <module>
    from .tensorflow_backend import *
  File "/path/to/delft/venv/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 5, in <module>
    import tensorflow as tf
  File "/path/to/delft/venv/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/path/to/delft/venv/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/path/to/delft/venv/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/path/to/delft/venv/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/path/to/delft/venv/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/path/to/delft/venv/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/path/to/delft/venv/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/path/to/delft/venv/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions.  Include the entire stack trace

Setting CUDA_VISIBLE_DEVICES to -1 doesn't change anything for me.

@de-code (Contributor, Author) commented Jun 18, 2019

> Sorry about flake8, I overlooked the corresponding part in the Makefile; there is no difference from the current setting in .travis.yml.

Yes, it should be the same settings at the moment. Going forward, flake8-syntax and flake8-warning-only could be removed in favour of just using the flake8 (or tests) target.

@de-code changed the title from "[wip] added makefile" to "added makefile" on Jun 18, 2019
@kermitt2 (Owner)

I looked into it a bit: tensorflow-gpu will automatically run on CPU when there is no GPU, but it still requires CUDA and cuDNN to be installed.

It appears that conda install -c anaconda tensorflow-gpu will also install CUDA and cuDNN as required (as we would expect from a dependency manager), but pip3 install tensorflow-gpu will not install those dependencies. This would explain why it tries to find libcublas.so.

So I guess we could either strongly recommend that DeLFT be installed with conda, or we do indeed need a mechanism to either install tensorflow rather than tensorflow-gpu when CUDA is not installed on the machine, or to install CUDA and cuDNN in addition to tensorflow-gpu.
Maybe it would be interesting to see how this is done elsewhere.
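One way the "install tensorflow when CUDA is absent" mechanism could look is sketched below. The function name is hypothetical; the probe only checks whether cuBLAS is on the loader path, so it is a heuristic rather than a full CUDA/cuDNN compatibility check.

```python
import ctypes.util

# Illustrative sketch: probe for the cuBLAS shared library (the one the
# traceback above fails to load) and fall back to the CPU-only TensorFlow
# package when it is absent.
def pick_tensorflow_package():
    has_cublas = ctypes.util.find_library("cublas") is not None
    return "tensorflow-gpu" if has_cublas else "tensorflow"

print(pick_tensorflow_package())
```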

@de-code (Contributor, Author) commented Jun 18, 2019

Since TensorFlow provides two separate Python packages, it seems to make sense to provide a mechanism to use either of the two. It would have been better if they had provided GPU support as an extra (e.g. tensorflow[gpu]) rather than as a completely separate package, because then a dependency on tensorflow would also be satisfied by the user having installed tensorflow[gpu] (I think).
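The extras pattern mentioned here could be sketched as a setup.py fragment. TensorFlow does not actually ship a tensorflow[gpu] extra, but delft itself could expose the choice this way, letting users run pip install delft[gpu] or delft[cpu]; the version pins are illustrative only.

```python
# Hypothetical setup.py fragment showing the extras_require pattern.
# Users would install `delft[cpu]` or `delft[gpu]` as appropriate.
extras_require = {
    "cpu": ["tensorflow>=1.12"],
    "gpu": ["tensorflow-gpu>=1.12"],
}

# setup(
#     name="delft",
#     ...
#     extras_require=extras_require,
# )
```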

The mechanism seems easy when using delft as an application, as the user can simply pass the option in. When using it as a library it might be slightly more complicated.

I wouldn't want to install CUDA libraries globally. Some libraries built from source might use the presence of those libraries as a cue to build for GPU.

I have never actually used conda, but it generally packages more binaries. I'm not sure there are real technical differences: pip can certainly also install binary packages when they are provided (usually the responsibility of the package maintainer).

@kermitt2 (Owner)

Thanks Daniel, that all makes sense indeed.

About conda, see this article about installing TensorFlow:

https://towardsdatascience.com/tensorflow-gpu-installation-made-easy-use-conda-instead-of-pip-52e5249374bc

The idea is to always work in an environment anyway, so there is no system-wide impact from installing CUDA/cuDNN. Given that we could then simply use the tensorflow-gpu dependency independently of CPU or GPU usage, it's very convenient, and it might be a good idea to enforce this usage.

@de-code (Contributor, Author) commented Jun 20, 2019

Are you suggesting switching to conda for development?

Another thing to consider is that someone might already have an optimized version of TensorFlow installed. For example, Google ML Engine will already have TensorFlow installed with GPU enabled. Or at least in the early days it recommended compiling the CPU version yourself (not sure whether they have baked that in since; compiling TF was no fun). That would be an argument for not having TensorFlow as a direct dependency of the delft library itself (setup.py), and, for development, decoupling it a bit from the other dependencies.

@kermitt2 (Owner)

> Are you suggesting switching to conda for development?

Yes, apparently it solves the tensorflow/tensorflow-gpu issue and would greatly simplify things for users. For use as a library, I will look a bit more at how it is done by other similar libraries.

I don't think we want to use an already installed version of TensorFlow. First, it would be a bit of a miracle to have the right cocktail of compatible delft, tensorflow, CUDA and cuDNN versions, and even if that were the case, it would likely break as soon as there is an update somewhere (in particular in delft). I think working only within an environment is the only manageable solution.

@de-code (Contributor, Author) commented Jun 21, 2019

Personally I would be in favour of leaving it more up to the user. It's easy to install TensorFlow; it's more difficult to skip installing it while installing the other dependencies.

In any case, in the interest of avoiding long-running branches / PRs, would you accept the PR if I removed the CPU/GPU switch and added the dependency back to requirements.txt? Or are there other changes? Or maybe you don't like the idea of having a Makefile at all?

@de-code (Contributor, Author) commented Aug 21, 2019

My plan for this was:

  • try to get this PR (revised if necessary) merged
  • address the linting issues and make the build fail on violations
  • start moving back changes I made in sciencebeam-trainer-delft (such as the layout features)
