Disponibilidade geral do Databricks Runtime 6.4 ML (EoS)
Observação
O suporte para esta versão do Databricks Runtime foi encerrado. Para obter a data de fim do suporte, consulte o Histórico de fim do suporte. Para todas as versões compatíveis do Databricks Runtime, consulte Versões e compatibilidade de notas sobre a versão do Databricks Runtime.
O Databricks lançou essa versão em fevereiro de 2020.
O Databricks Runtime 6.4 para Machine Learning fornece um ambiente de aprendizado de máquina e ciência de dados pronto para uso baseado no Databricks Runtime 6.4 (EoS). O Databricks Runtime ML contém muitas bibliotecas de aprendizado de máquina populares, inclusive TensorFlow, PyTorch, Keras e XGBoost. Ele também dá suporte ao treinamento de aprendizado profundo distribuído com o uso do Horovod.
Para obter mais informações, incluindo instruções para criar um cluster de ML do Databricks Runtime, confira IA e Machine Learning no Databricks.
Novos recursos
O Databricks Runtime 6.4 ML foi desenvolvido com base no Databricks Runtime 6.4. Para obter informações sobre as novidades do Databricks Runtime 6.4, consulte as notas da versão do Databricks Runtime 6.4 (EoS).
Aprimoramentos
Desativações
- O pacote
keras
autônomo foi preterido e será removido em uma próxima versão principal do Databricks Runtime para ML. O Databricks recomenda que você usetensorflow.keras
. - O pacote
pymongo
foi preterido e será removido em uma próxima versão principal do Databricks Runtime para ML.
Bibliotecas de aprendizado de máquina atualizadas
- PyTorch: 1.3.1 a 1.4.0
- Horovod: 0.18.2 a 1.19.0
Ambiente do sistema
O ambiente do sistema no Databricks Runtime 6.4 ML é diferente do Databricks Runtime 6.4 nestes aspectos:
- DBUtils: Não contém Utilitário de biblioteca (dbutils.library) (herdado).
- Para clusters de GPU, as seguintes bibliotecas de GPU NVIDIA:
- CUDA 10.0
- cuDNN 7.6.4
- NCCL 2.4.8
Bibliotecas
As seções a seguir listam as bibliotecas incluídas no Databricks Runtime 6.4 ML que são diferentes daquelas incluídas no Databricks Runtime 6.4.
Nesta seção:
- Bibliotecas de camada superior
- Bibliotecas do Python
- Bibliotecas do R
- Bibliotecas do Java e Scala (cluster do Scala 2.11)
Bibliotecas de camada superior
O Databricks Runtime 6.4 ML inclui as seguintes bibliotecas de camada superior:
- GraphFrames
- Horovod e HorovodRunner
- MLflow
- PyTorch
- spark-tensorflow-connector
- TensorFlow
- TensorBoard
Bibliotecas do Python
O Databricks Runtime 6.4 ML usa o Conda para gerenciamento de pacotes do Python e inclui vários pacotes populares de ML. A seção a seguir descreve o ambiente do Conda para o Databricks Runtime 6.4 ML.
Python em clusters de CPU
name: databricks-ml
channels:
- Databricks
- pytorch
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- _py-xgboost-mutex=2.0=cpu_0
- _tflow_select=2.3.0=mkl
- absl-py=0.9.0=py37_0
- asn1crypto=0.24.0=py37_0
- astor=0.8.0=py37_0
- backcall=0.1.0=py37_0
- backports=1.0=py_2
- bcrypt=3.1.7=py37h7b6447c_0
- blas=1.0=mkl
- boto=2.49.0=py37_0
- boto3=1.9.162=py_0
- botocore=1.12.163=py_0
- c-ares=1.15.0=h7b6447c_1001
- ca-certificates=2019.1.23=0
- certifi=2019.3.9=py37_0
- cffi=1.12.2=py37h2e261b9_1
- chardet=3.0.4=py37_1003
- click=7.0=py_0
- cloudpickle=0.8.0=py37_0
- colorama=0.4.1=py_0
- configparser=3.7.4=py37_0
- cpuonly=1.0=0
- cryptography=2.6.1=py37h1ba5d50_0
- cycler=0.10.0=py37_0
- cython=0.29.6=py37he6710b0_0
- decorator=4.4.0=py37_1
- docutils=0.14=py37_0
- entrypoints=0.3=py37_0
- et_xmlfile=1.0.1=py37_0
- flask=1.0.2=py37_1
- freetype=2.9.1=h8a8886c_1
- future=0.17.1=py37_0
- gast=0.2.2=py37_0
- gitdb2=2.0.6=py_0
- gitpython=2.1.11=py37_0
- google-pasta=0.1.8=py_0
- grpcio=1.16.1=py37hf8bcb03_1
- gunicorn=19.9.0=py37_0
- h5py=2.9.0=py37h7918eee_0
- hdf5=1.10.4=hb1b8bf9_0
- html5lib=1.0.1=py_0
- icu=58.2=h9c2bf20_1
- idna=2.8=py37_0
- intel-openmp=2019.3=199
- ipykernel=5.1.0=py37h39e3cac_0
- ipython=7.4.0=py37h39e3cac_0
- ipython_genutils=0.2.0=py37_0
- itsdangerous=1.1.0=py_0
- jdcal=1.4=py37_0
- jedi=0.13.3=py37_0
- jinja2=2.10=py37_0
- jmespath=0.9.4=py_0
- jpeg=9b=h024ee3a_2
- jupyter_client=5.2.4=py37_0
- jupyter_core=4.4.0=py37_0
- keras-applications=1.0.8=py_0
- keras-preprocessing=1.1.0=py_1
- kiwisolver=1.0.1=py37hf484d3e_0
- krb5=1.16.1=h173b8e3_7
- libedit=3.1.20181209=hc058e9b_0
- libffi=3.2.1=hd88cf55_4
- libgcc-ng=8.2.0=hdf63c60_1
- libgfortran-ng=7.3.0=hdf63c60_0
- libpng=1.6.36=hbc83047_0
- libpq=11.2=h20c2e04_0
- libprotobuf=3.11.4=hd408876_0
- libsodium=1.0.16=h1bed415_0
- libstdcxx-ng=8.2.0=hdf63c60_1
- libtiff=4.0.10=h2733197_2
- libxgboost=0.90=he6710b0_1
- libxml2=2.9.9=hea5a465_1
- libxslt=1.1.33=h7d1a2b0_0
- llvmlite=0.28.0=py37hd408876_0
- lxml=4.3.2=py37hefd8a0e_0
- mako=1.0.10=py_0
- markdown=3.1.1=py37_0
- markupsafe=1.1.1=py37h7b6447c_0
- mkl=2019.3=199
- mkl_fft=1.0.10=py37ha843d7b_0
- mkl_random=1.0.2=py37hd81dba3_0
- ncurses=6.1=he6710b0_1
- networkx=2.2=py37_1
- ninja=1.9.0=py37hfd86e86_0
- nose=1.3.7=py37_2
- numba=0.43.1=py37h962f231_0
- numpy=1.16.2=py37h7e9f1db_0
- numpy-base=1.16.2=py37hde5b4d6_0
- olefile=0.46=py_0
- openpyxl=2.6.1=py37_1
- openssl=1.1.1b=h7b6447c_1
- opt_einsum=3.1.0=py_0
- pandas=0.24.2=py37he6710b0_0
- paramiko=2.4.2=py37_0
- parso=0.3.4=py37_0
- pathlib2=2.3.3=py37_0
- patsy=0.5.1=py37_0
- pexpect=4.6.0=py37_0
- pickleshare=0.7.5=py37_0
- pillow=5.4.1=py37h34e0f95_0
- pip=19.0.3=py37_0
- ply=3.11=py37_0
- prompt_toolkit=2.0.9=py37_0
- protobuf=3.11.4=py37he6710b0_0
- psutil=5.6.1=py37h7b6447c_0
- psycopg2=2.7.6.1=py37h1ba5d50_0
- ptyprocess=0.6.0=py37_0
- py-xgboost=0.90=py37he6710b0_1
- py-xgboost-cpu=0.90=py37_1
- pyasn1=0.4.8=py_0
- pycparser=2.19=py_0
- pygments=2.3.1=py37_0
- pymongo=3.8.0=py37he6710b0_1
- pynacl=1.3.0=py37h7b6447c_0
- pyopenssl=19.0.0=py37_0
- pyparsing=2.3.1=py37_0
- pysocks=1.6.8=py37_0
- python=3.7.3=h0371630_0
- python-dateutil=2.8.0=py37_0
- python-editor=1.0.4=py_0
- pytorch=1.4.0=py3.7_cpu_0
- pytz=2018.9=py37_0
- pyyaml=5.1=py37h7b6447c_0
- pyzmq=18.0.0=py37he6710b0_0
- readline=7.0=h7b6447c_5
- requests=2.21.0=py37_0
- s3transfer=0.2.1=py37_0
- scikit-learn=0.20.3=py37hd81dba3_0
- scipy=1.2.1=py37h7c811a0_0
- setuptools=40.8.0=py37_0
- simplejson=3.16.0=py37h14c3975_0
- singledispatch=3.4.0.3=py37_0
- six=1.12.0=py37_0
- smmap2=2.0.5=py_0
- sqlite=3.27.2=h7b6447c_0
- sqlparse=0.3.0=py_0
- statsmodels=0.9.0=py37h035aef0_0
- tabulate=0.8.3=py37_0
- tensorboard=1.15.0+db2=pyhb230dea_0
- tensorflow=1.15.0+db2=mkl_py37hc5fbf04_0
- tensorflow-base=1.15.0+db2=mkl_py37h2ae1e84_0
- tensorflow-estimator=1.15.1+db2=pyh2649769_0
- tensorflow-mkl=1.15.0+db2=h4fcabd2_0
- termcolor=1.1.0=py37_1
- tk=8.6.8=hbc83047_0
- torchvision=0.5.0=py37_cpu
- tornado=6.0.2=py37h7b6447c_0
- tqdm=4.31.1=py37_1
- traitlets=4.3.2=py37_0
- urllib3=1.24.1=py37_0
- virtualenv=16.0.0=py37_0
- wcwidth=0.1.7=py37_0
- webencodings=0.5.1=py37_1
- websocket-client=0.56.0=py37_0
- werkzeug=0.14.1=py37_0
- wheel=0.33.1=py37_0
- wrapt=1.11.1=py37h7b6447c_0
- xz=5.2.4=h14c3975_4
- yaml=0.1.7=had09818_2
- zeromq=4.3.1=he6710b0_3
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.7=h0b5b093_0
- pip:
- argparse==1.4.0
- databricks-cli==0.9.1
- deprecated==1.2.7
- docker==4.2.0
- fusepy==2.0.4
- gorilla==0.3.0
- horovod==0.19.0
- hyperopt==0.2.2.db1
- keras==2.2.5
- matplotlib==3.0.3
- mleap==0.8.1
- mlflow==1.5.0
- nose-exclude==0.5.0
- pyarrow==0.13.0
- querystring-parser==1.2.4
- seaborn==0.9.0
- tensorboardx==1.9
prefix: /databricks/conda/envs/databricks-ml
Python em clusters GPU
name: databricks-ml-gpu
channels:
- Databricks
- pytorch
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- _py-xgboost-mutex=1.0=gpu_0
- _tflow_select=2.1.0=gpu
- absl-py=0.9.0=py37_0
- asn1crypto=0.24.0=py37_0
- astor=0.8.0=py37_0
- backcall=0.1.0=py37_0
- backports=1.0=py_2
- bcrypt=3.1.7=py37h7b6447c_0
- blas=1.0=mkl
- boto=2.49.0=py37_0
- boto3=1.9.162=py_0
- botocore=1.12.163=py_0
- c-ares=1.15.0=h7b6447c_1001
- ca-certificates=2019.1.23=0
- certifi=2019.3.9=py37_0
- cffi=1.12.2=py37h2e261b9_1
- chardet=3.0.4=py37_1003
- click=7.0=py_0
- cloudpickle=0.8.0=py37_0
- colorama=0.4.1=py_0
- configparser=3.7.4=py37_0
- cryptography=2.6.1=py37h1ba5d50_0
- cudatoolkit=10.0.130=0
- cudnn=7.6.4=cuda10.0_0
- cupti=10.0.130=0
- cycler=0.10.0=py37_0
- cython=0.29.6=py37he6710b0_0
- decorator=4.4.0=py37_1
- docutils=0.14=py37_0
- entrypoints=0.3=py37_0
- et_xmlfile=1.0.1=py37_0
- flask=1.0.2=py37_1
- freetype=2.9.1=h8a8886c_1
- future=0.17.1=py37_0
- gast=0.2.2=py37_0
- gitdb2=2.0.6=py_0
- gitpython=2.1.11=py37_0
- google-pasta=0.1.8=py_0
- grpcio=1.16.1=py37hf8bcb03_1
- gunicorn=19.9.0=py37_0
- h5py=2.9.0=py37h7918eee_0
- hdf5=1.10.4=hb1b8bf9_0
- html5lib=1.0.1=py_0
- icu=58.2=h9c2bf20_1
- idna=2.8=py37_0
- intel-openmp=2019.3=199
- ipykernel=5.1.0=py37h39e3cac_0
- ipython=7.4.0=py37h39e3cac_0
- ipython_genutils=0.2.0=py37_0
- itsdangerous=1.1.0=py_0
- jdcal=1.4=py37_0
- jedi=0.13.3=py37_0
- jinja2=2.10=py37_0
- jmespath=0.9.4=py_0
- jpeg=9b=h024ee3a_2
- jupyter_client=5.2.4=py37_0
- jupyter_core=4.4.0=py37_0
- keras-applications=1.0.8=py_0
- keras-preprocessing=1.1.0=py_1
- kiwisolver=1.0.1=py37hf484d3e_0
- krb5=1.16.1=h173b8e3_7
- libedit=3.1.20181209=hc058e9b_0
- libffi=3.2.1=hd88cf55_4
- libgcc-ng=8.2.0=hdf63c60_1
- libgfortran-ng=7.3.0=hdf63c60_0
- libpng=1.6.36=hbc83047_0
- libpq=11.2=h20c2e04_0
- libprotobuf=3.11.4=hd408876_0
- libsodium=1.0.16=h1bed415_0
- libstdcxx-ng=8.2.0=hdf63c60_1
- libtiff=4.0.10=h2733197_2
- libxgboost=0.90=h688424c_0
- libxml2=2.9.9=hea5a465_1
- libxslt=1.1.33=h7d1a2b0_0
- llvmlite=0.28.0=py37hd408876_0
- lxml=4.3.2=py37hefd8a0e_0
- mako=1.0.10=py_0
- markdown=3.1.1=py37_0
- markupsafe=1.1.1=py37h7b6447c_0
- mkl=2019.3=199
- mkl_fft=1.0.10=py37ha843d7b_0
- mkl_random=1.0.2=py37hd81dba3_0
- ncurses=6.1=he6710b0_1
- networkx=2.2=py37_1
- ninja=1.9.0=py37hfd86e86_0
- nose=1.3.7=py37_2
- numba=0.43.1=py37h962f231_0
- numpy=1.16.2=py37h7e9f1db_0
- numpy-base=1.16.2=py37hde5b4d6_0
- olefile=0.46=py_0
- openpyxl=2.6.1=py37_1
- openssl=1.1.1b=h7b6447c_1
- opt_einsum=3.1.0=py_0
- pandas=0.24.2=py37he6710b0_0
- paramiko=2.4.2=py37_0
- parso=0.3.4=py37_0
- pathlib2=2.3.3=py37_0
- patsy=0.5.1=py37_0
- pexpect=4.6.0=py37_0
- pickleshare=0.7.5=py37_0
- pillow=5.4.1=py37h34e0f95_0
- pip=19.0.3=py37_0
- ply=3.11=py37_0
- prompt_toolkit=2.0.9=py37_0
- protobuf=3.11.4=py37he6710b0_0
- psutil=5.6.1=py37h7b6447c_0
- psycopg2=2.7.6.1=py37h1ba5d50_0
- ptyprocess=0.6.0=py37_0
- py-xgboost=0.90=py37h688424c_0
- py-xgboost-gpu=0.90=py37h28bbb66_0
- pyasn1=0.4.8=py_0
- pycparser=2.19=py_0
- pygments=2.3.1=py37_0
- pymongo=3.8.0=py37he6710b0_1
- pynacl=1.3.0=py37h7b6447c_0
- pyopenssl=19.0.0=py37_0
- pyparsing=2.3.1=py37_0
- pysocks=1.6.8=py37_0
- python=3.7.3=h0371630_0
- python-dateutil=2.8.0=py37_0
- python-editor=1.0.4=py_0
- pytorch=1.4.0=py3.7_cuda10.0.130_cudnn7.6.3_0
- pytz=2018.9=py37_0
- pyyaml=5.1=py37h7b6447c_0
- pyzmq=18.0.0=py37he6710b0_0
- readline=7.0=h7b6447c_5
- requests=2.21.0=py37_0
- s3transfer=0.2.1=py37_0
- scikit-learn=0.20.3=py37hd81dba3_0
- scipy=1.2.1=py37h7c811a0_0
- setuptools=40.8.0=py37_0
- simplejson=3.16.0=py37h14c3975_0
- singledispatch=3.4.0.3=py37_0
- six=1.12.0=py37_0
- smmap2=2.0.5=py_0
- sqlite=3.27.2=h7b6447c_0
- sqlparse=0.3.0=py_0
- statsmodels=0.9.0=py37h035aef0_0
- tabulate=0.8.3=py37_0
- tensorboard=1.15.0+db2=pyhb230dea_0
- tensorflow=1.15.0+db2=gpu_py37h9fd0ff8_0
- tensorflow-base=1.15.0+db2=gpu_py37hd56f5dd_0
- tensorflow-estimator=1.15.1+db2=pyh2649769_0
- tensorflow-gpu=1.15.0+db2=h0d30ee6_0
- termcolor=1.1.0=py37_1
- tk=8.6.8=hbc83047_0
- torchvision=0.5.0=py37_cu100
- tornado=6.0.2=py37h7b6447c_0
- tqdm=4.31.1=py37_1
- traitlets=4.3.2=py37_0
- urllib3=1.24.1=py37_0
- virtualenv=16.0.0=py37_0
- wcwidth=0.1.7=py37_0
- webencodings=0.5.1=py37_1
- websocket-client=0.56.0=py37_0
- werkzeug=0.14.1=py37_0
- wheel=0.33.1=py37_0
- wrapt=1.11.1=py37h7b6447c_0
- xz=5.2.4=h14c3975_4
- yaml=0.1.7=had09818_2
- zeromq=4.3.1=he6710b0_3
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.7=h0b5b093_0
- pip:
- argparse==1.4.0
- databricks-cli==0.9.1
- deprecated==1.2.7
- docker==4.2.0
- fusepy==2.0.4
- gorilla==0.3.0
- horovod==0.19.0
- hyperopt==0.2.2.db1
- keras==2.2.5
- matplotlib==3.0.3
- mleap==0.8.1
- mlflow==1.5.0
- nose-exclude==0.5.0
- pyarrow==0.13.0
- querystring-parser==1.2.4
- seaborn==0.9.0
- tensorboardx==1.9
prefix: /databricks/conda/envs/databricks-ml-gpu
Pacotes do Spark que contêm módulos do Python
Pacote do Spark | Módulo do Python | Versão |
---|---|---|
graphframes | graphframes | 0.7.0-db1-spark2.4 |
spark-deep-learning | sparkdl | 1.5.0-db12-spark2.4 |
tensorframes | tensorframes | 0.8.2-s_2.11 |
Bibliotecas do R
As bibliotecas do R são idênticas às Bibliotecas do R do Databricks Runtime 6.4.
Bibliotecas do Java e Scala (cluster do Scala 2.11)
Além das bibliotecas do Java e do Scala no Databricks Runtime 6.4, o Databricks Runtime 6.4 ML contém os seguintes JARs:
ID do Grupo | Artifact ID | Versão |
---|---|---|
com.databricks | spark-deep-learning | 1.5.0-db12-spark2.4 |
com.typesafe.akka | akka-actor_2.11 | 2.3.11 |
ml.combust.mleap | mleap-databricks-runtime_2.11 | 0.15.0 |
ml.dmlc | xgboost4j | 0,90 |
ml.dmlc | xgboost4j-spark | 0,90 |
org.graphframes | graphframes_2.11 | 0.7.0-db1-spark2.4 |
org.mlflow | mlflow-client | 1.4.0 |
org.tensorflow | libtensorflow | 1.15.0 |
org.tensorflow | libtensorflow_jni | 1.15.0 |
org.tensorflow | spark-tensorflow-connector_2.11 | 1.15.0 |
org.tensorflow | tensorflow | 1.15.0 |
org.tensorframes | tensorframes | 0.8.2-s_2.11 |