Databricks Runtime 12.2 LTS for Machine Learning

Databricks Runtime 12.2 LTS for Machine Learning provides a ready-to-use environment for machine learning and data science based on Databricks Runtime 12.2 LTS. Databricks Runtime ML contains many popular machine learning libraries, including TensorFlow, PyTorch, and XGBoost. Databricks Runtime ML includes AutoML, a tool that automatically trains machine learning pipelines. Databricks Runtime ML also supports distributed deep learning training using Horovod.

Note

LTS means this version is under long-term support. See Databricks Runtime LTS version lifecycle.

For more information, including instructions for creating a Databricks Runtime ML cluster, see AI and machine learning on Azure Databricks.

Tip

To see release notes for Databricks Runtime versions that have reached end-of-support (EoS), see End-of-support Databricks Runtime release notes. The EoS Databricks Runtime versions have been retired and might not be updated.

New features and improvements

Databricks Runtime 12.2 LTS ML is built on top of Databricks Runtime 12.2 LTS. For information on what's new in Databricks Runtime 12.2 LTS, including Apache Spark MLlib and SparkR, see the Databricks Runtime 12.2 LTS release notes.

AutoML

You can use existing feature tables in Feature Store to augment the original input dataset for AutoML forecasting problems. For details, see AutoML Feature Store integration.

For more information about AutoML, see What is AutoML?.

System environment

The system environment in Databricks Runtime 12.2 LTS ML differs from Databricks Runtime 12.2 LTS as follows:

Databricks Runtime 12.2 LTS ML includes XGBoost 1.7.2, which does not support GPU clusters with compute capability 5.2 and below.
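
The compute capability restriction above can be checked before scheduling GPU work. The following is a minimal sketch, not part of the runtime: the function name `xgboost_gpu_supported` and the constant are hypothetical, and treating every capability above 5.2 as supported is an inference from the note rather than a statement from the XGBoost documentation.

```python
from typing import Tuple

# Assumption: mirrors the note above (compute capability 5.2 and below is unsupported).
MAX_UNSUPPORTED_CAPABILITY: Tuple[int, int] = (5, 2)


def xgboost_gpu_supported(capability: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) compute capability is above 5.2."""
    # Tuples compare element by element, so (5, 3) > (5, 2) and (6, 0) > (5, 2).
    return tuple(capability) > MAX_UNSUPPORTED_CAPABILITY


if __name__ == "__main__":
    # On a GPU cluster, PyTorch can report the capability of device 0, e.g.:
    #   import torch; capability = torch.cuda.get_device_capability(0)
    for capability in [(3, 7), (5, 2), (6, 0), (7, 5)]:
        status = "supported" if xgboost_gpu_supported(capability) else "not supported"
        print(f"compute capability {capability[0]}.{capability[1]}: {status}")
```

The check is kept as a pure function so it can run on a machine without a GPU; only the commented-out line depends on PyTorch being installed.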

Libraries

The following sections list the libraries included in Databricks Runtime 12.2 LTS ML that differ from those included in Databricks Runtime 12.2 LTS.

Top-tier libraries

Databricks Runtime 12.2 LTS ML includes the following top-tier libraries:

Python libraries

Databricks Runtime 12.2 LTS ML uses Virtualenv for Python package management and includes many popular ML packages.

In addition to the packages specified in the following sections, Databricks Runtime 12.2 LTS ML also includes the following packages:

  • hyperopt 0.2.7+db3
  • sparkdl 2.3.0-db3
  • automl 1.16.0

To reproduce the Databricks Runtime ML Python environment in your local Python virtual environment, download the requirements-12.2.txt file and run pip install -r requirements-12.2.txt. This command installs all of the open source libraries that Databricks Runtime ML uses, but does not install libraries developed by Databricks, such as databricks-automl, databricks-feature-store, or the Databricks fork of hyperopt.
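
After installing from the requirements file, it can be useful to confirm that the local environment matches the pinned versions. The sketch below assumes the file uses plain `name==version` pins, which is not confirmed by this page; the helper names `parse_pins`, `installed_version`, and `find_mismatches` are hypothetical. It relies only on the standard library's `importlib.metadata`.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional, Tuple


def parse_pins(lines: Iterable[str]) -> Dict[str, str]:
    """Collect name==version pins, skipping comments, blanks, and pip options."""
    pins: Dict[str, str] = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-") or "==" not in line:
            continue
        name, _, version = line.partition("==")
        name = name.split("[", 1)[0].strip()  # drop extras such as pkg[extra]
        version = version.split(";", 1)[0].strip()  # drop environment markers
        pins[name] = version
    return pins


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs to (pinned, installed)."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, wanted in pins.items():
        found = installed_version(name)
        # Plain string comparison: "1.0" and "1.0.0" are treated as different.
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Sample pins copied from the CPU table on this page; pass the lines of
    # requirements-12.2.txt instead to check a full environment.
    sample = ["numpy==1.21.5", "pandas==1.4.2", "scikit-learn==1.0.2"]
    for name, (wanted, found) in find_mismatches(parse_pins(sample)).items():
        print(f"{name}: pinned {wanted}, installed {found or 'not installed'}")
```

Because the Databricks-developed packages are not installed by the requirements file, they would not appear in its pins and are outside the scope of this check.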

Python libraries on CPU clusters

Library Version Library Version Library Version
absl-py 1.0.0 argon2-cffi 21.3.0 argon2-cffi-bindings 21.2.0
astor 0.8.1 asttokens 2.0.5 astunparse 1.6.3
attrs 21.4.0 azure-core 1.26.3 azure-cosmos 4.2.0
backcall 0.2.0 backports.entry-points-selectable 1.2.0 bcrypt 3.2.0
beautifulsoup4 4.11.1 black 22.3.0 bleach 4.1.0
blis 0.7.9 boto3 1.21.32 botocore 1.24.32
cachetools 4.2.2 catalogue 2.0.8 category-encoders 2.5.1.post0
certifi 2021.10.8 cffi 1.15.0 chardet 4.0.0
charset-normalizer 2.0.4 click 8.0.4 cloudpickle 2.0.0
cmdstanpy 1.1.0 confection 0.0.4 configparser 5.2.0
convertdate 2.4.0 cryptography 3.4.8 cycler 0.11.0
cymem 2.0.7 Cython 0.29.28 databricks-automl-runtime 0.2.15
databricks-cli 0.17.4 databricks-feature-store 0.10.0 dbl-tempo 0.1.12
dbus-python 1.2.16 debugpy 1.5.1 decorator 5.1.1
defusedxml 0.7.1 dill 0.3.4 diskcache 5.4.0
distlib 0.3.6 docstring-to-markdown 0.11 entrypoints 0.4
ephem 4.1.4 executing 0.8.3 facet-overview 1.0.0
fastjsonschema 2.16.2 fasttext 0.9.2 filelock 3.6.0
Flask 1.1.2 flatbuffers 23.1.21 fonttools 4.25.0
fsspec 2022.2.0 future 0.18.2 gast 0.4.0
gitdb 4.0.10 GitPython 3.1.27 google-auth 1.33.0
google-auth-oauthlib 0.4.6 google-pasta 0.2.0 grpcio 1.42.0
gunicorn 20.1.0 gviz-api 1.10.0 h5py 3.6.0
hijri-converter 2.2.4 holidays 0.18 horovod 0.27.0
htmlmin 0.1.12 huggingface-hub 0.12.0 idna 3.3
ImageHash 4.3.1 imbalanced-learn 0.10.1 importlib-metadata 4.11.3
ipykernel 6.15.3 ipython 8.5.0 ipython-genutils 0.2.0
ipywidgets 7.7.2 isodate 0.6.1 itsdangerous 2.0.1
jedi 0.18.1 Jinja2 2.11.3 jmespath 0.10.0
joblib 1.1.1 joblibspark 0.5.1 jsonschema 4.4.0
jupyter-client 6.1.12 jupyter_core 4.11.2 jupyterlab-pygments 0.1.2
jupyterlab-widgets 1.0.0 keras 2.11.0 kiwisolver 1.3.2
korean-lunar-calendar 0.3.1 langcodes 3.3.0 libclang 15.0.6.1
lightgbm 3.3.4 llvmlite 0.38.0 LunarCalendar 0.0.9
Mako 1.2.0 Markdown 3.3.4 MarkupSafe 2.0.1
matplotlib 3.5.1 matplotlib-inline 0.1.2 mccabe 0.7.0
mistune 0.8.4 mleap 0.20.0 mlflow-skinny 2.1.1
multimethod 1.9.1 murmurhash 1.0.9 mypy-extensions 0.4.3
nbclient 0.5.13 nbconvert 6.4.4 nbformat 5.3.0
nest-asyncio 1.5.5 networkx 2.7.1 nltk 3.7
nodeenv 1.7.0 notebook 6.4.8 numba 0.55.1
numpy 1.21.5 oauthlib 3.2.0 opt-einsum 3.3.0
packaging 21.3 pandas 1.4.2 pandas-profiling 3.6.2
pandocfilters 1.5.0 paramiko 2.9.2 parso 0.8.3
pathspec 0.9.0 pathy 0.10.1 patsy 0.5.2
petastorm 0.12.1 pexpect 4.8.0 phik 0.12.3
pickleshare 0.7.5 Pillow 9.0.1 pip 21.2.4
platformdirs 2.6.2 plotly 5.6.0 pluggy 1.0.0
pmdarima 2.0.2 preshed 3.0.8 prometheus-client 0.13.1
prompt-toolkit 3.0.20 prophet 1.1.1 protobuf 3.19.4
psutil 5.8.0 psycopg2 2.9.3 ptyprocess 0.7.0
pure-eval 0.2.2 pyarrow 7.0.0 pyasn1 0.4.8
pyasn1-modules 0.2.8 pybind11 2.10.3 pycparser 2.21
pydantic 1.10.2 pyflakes 2.5.0 Pygments 2.11.2
PyGObject 3.36.0 PyJWT 2.6.0 PyMeeus 0.5.12
PyNaCl 1.5.0 pyodbc 4.0.32 pyparsing 3.0.4
pyright 1.1.283 pyrsistent 0.18.0 python-dateutil 2.8.2
python-editor 1.0.4 python-lsp-jsonrpc 1.0.0 python-lsp-server 1.6.0
pytz 2021.3 PyWavelets 1.3.0 PyYAML 6.0
pyzmq 22.3.0 regex 2022.3.15 requests 2.27.1
requests-oauthlib 1.3.1 requests-unixsocket 0.2.0 rope 0.22.0
rsa 4.7.2 s3transfer 0.5.0 scikit-learn 1.0.2
scipy 1.7.3 seaborn 0.11.2 Send2Trash 1.8.0
setuptools 61.2.0 setuptools-git 1.2 shap 0.41.0
simplejson 3.17.6 six 1.16.0 slicer 0.0.7
smart-open 5.2.1 smmap 5.0.0 soupsieve 2.3.1
spacy 3.4.4 spacy-legacy 3.0.12 spacy-logger 1.0.4
spark-tensorflow-distributor 1.0.0 sqlparse 0.4.2 srsly 2.4.5
ssh-import-id 5.10 stack-data 0.2.0 statsmodels 0.13.2
tabulate 0.8.9 tangled-up-in-unicode 0.2.0 tenacity 8.0.1
tensorboard 2.11.2 tensorboard-data-server 0.6.1 tensorboard-plugin-profile 2.11.1
tensorboard-plugin-wit 1.8.1 tensorflow-cpu 2.11.0 tensorflow-estimator 2.11.0
tensorflow-io-gcs-filesystem 0.30.0 termcolor 2.2.0 terminado 0.13.1
testpath 0.5.0 thinc 8.1.7 threadpoolctl 2.2.0
tokenize-rt 4.2.1 tokenizers 0.13.2 tomli 1.2.2
torch 1.13.1+cpu torchvision 0.14.1+cpu tornado 6.1
tqdm 4.64.0 traitlets 5.1.1 transformers 4.25.1
typeguard 2.13.3 typer 0.7.0 typing_extensions 4.1.1
ujson 5.1.0 unattended-upgrades 0.1 urllib3 1.26.9
virtualenv 20.8.0 visions 0.7.5 wasabi 0.10.1
wcwidth 0.2.5 webencodings 0.5.1 websocket-client 0.58.0
Werkzeug 2.0.3 whatthepatch 1.0.4 wheel 0.37.1
widgetsnbextension 3.6.1 wrapt 1.12.1 xgboost 1.7.2
yapf 0.31.0 zipp 3.7.0

Python libraries on GPU clusters

Library Version Library Version Library Version
absl-py 1.0.0 argon2-cffi 21.3.0 argon2-cffi-bindings 21.2.0
astor 0.8.1 asttokens 2.0.5 astunparse 1.6.3
attrs 21.4.0 azure-core 1.26.3 azure-cosmos 4.2.0
backcall 0.2.0 backports.entry-points-selectable 1.2.0 bcrypt 3.2.0
beautifulsoup4 4.11.1 black 22.3.0 bleach 4.1.0
blis 0.7.9 boto3 1.21.32 botocore 1.24.32
cachetools 4.2.2 catalogue 2.0.8 category-encoders 2.5.1.post0
certifi 2021.10.8 cffi 1.15.0 chardet 4.0.0
charset-normalizer 2.0.4 click 8.0.4 cloudpickle 2.0.0
cmdstanpy 1.1.0 confection 0.0.4 configparser 5.2.0
convertdate 2.4.0 cryptography 3.4.8 cycler 0.11.0
cymem 2.0.7 Cython 0.29.28 databricks-automl-runtime 0.2.15
databricks-cli 0.17.4 databricks-feature-store 0.10.0 dbl-tempo 0.1.12
dbus-python 1.2.16 debugpy 1.5.1 decorator 5.1.1
defusedxml 0.7.1 dill 0.3.4 diskcache 5.4.0
distlib 0.3.6 docstring-to-markdown 0.11 entrypoints 0.4
ephem 4.1.4 executing 0.8.3 facet-overview 1.0.0
fastjsonschema 2.16.2 fasttext 0.9.2 filelock 3.6.0
Flask 1.1.2 flatbuffers 23.1.21 fonttools 4.25.0
fsspec 2022.2.0 future 0.18.2 gast 0.4.0
gitdb 4.0.10 GitPython 3.1.27 google-auth 1.33.0
google-auth-oauthlib 0.4.6 google-pasta 0.2.0 grpcio 1.42.0
gunicorn 20.1.0 gviz-api 1.10.0 h5py 3.6.0
hijri-converter 2.2.4 holidays 0.18 horovod 0.27.0
htmlmin 0.1.12 huggingface-hub 0.12.0 idna 3.3
ImageHash 4.3.1 imbalanced-learn 0.10.1 importlib-metadata 4.11.3
ipykernel 6.15.3 ipython 8.5.0 ipython-genutils 0.2.0
ipywidgets 7.7.2 isodate 0.6.1 itsdangerous 2.0.1
jedi 0.18.1 Jinja2 2.11.3 jmespath 0.10.0
joblib 1.1.1 joblibspark 0.5.1 jsonschema 4.4.0
jupyter-client 6.1.12 jupyter_core 4.11.2 jupyterlab-pygments 0.1.2
jupyterlab-widgets 1.0.0 keras 2.11.0 kiwisolver 1.3.2
korean-lunar-calendar 0.3.1 langcodes 3.3.0 libclang 15.0.6.1
lightgbm 3.3.4 llvmlite 0.38.0 LunarCalendar 0.0.9
Mako 1.2.0 Markdown 3.3.4 MarkupSafe 2.0.1
matplotlib 3.5.1 matplotlib-inline 0.1.2 mccabe 0.7.0
mistune 0.8.4 mleap 0.20.0 mlflow-skinny 2.1.1
multimethod 1.9.1 murmurhash 1.0.9 mypy-extensions 0.4.3
nbclient 0.5.13 nbconvert 6.4.4 nbformat 5.3.0
nest-asyncio 1.5.5 networkx 2.7.1 nltk 3.7
nodeenv 1.7.0 notebook 6.4.8 numba 0.55.1
numpy 1.21.5 oauthlib 3.2.0 opt-einsum 3.3.0
packaging 21.3 pandas 1.4.2 pandas-profiling 3.6.2
pandocfilters 1.5.0 paramiko 2.9.2 parso 0.8.3
pathspec 0.9.0 pathy 0.10.1 patsy 0.5.2
petastorm 0.12.1 pexpect 4.8.0 phik 0.12.3
pickleshare 0.7.5 Pillow 9.0.1 pip 21.2.4
platformdirs 2.6.2 plotly 5.6.0 pluggy 1.0.0
pmdarima 2.0.2 preshed 3.0.8 prompt-toolkit 3.0.20
prophet 1.1.1 protobuf 3.19.4 psutil 5.8.0
psycopg2 2.9.3 ptyprocess 0.7.0 pure-eval 0.2.2
pyarrow 7.0.0 pyasn1 0.4.8 pyasn1-modules 0.2.8
pybind11 2.10.3 pycparser 2.21 pydantic 1.10.2
pyflakes 2.5.0 Pygments 2.11.2 PyGObject 3.36.0
PyJWT 2.6.0 PyMeeus 0.5.12 PyNaCl 1.5.0
pyodbc 4.0.32 pyparsing 3.0.4 pyright 1.1.283
pyrsistent 0.18.0 python-dateutil 2.8.2 python-editor 1.0.4
python-lsp-jsonrpc 1.0.0 python-lsp-server 1.6.0 pytz 2021.3
PyWavelets 1.3.0 PyYAML 6.0 pyzmq 22.3.0
regex 2022.3.15 requests 2.27.1 requests-oauthlib 1.3.1
requests-unixsocket 0.2.0 rope 0.22.0 rsa 4.7.2
s3transfer 0.5.0 scikit-learn 1.0.2 scipy 1.7.3
seaborn 0.11.2 Send2Trash 1.8.0 setuptools 61.2.0
setuptools-git 1.2 shap 0.41.0 simplejson 3.17.6
six 1.16.0 slicer 0.0.7 smart-open 5.2.1
smmap 5.0.0 soupsieve 2.3.1 spacy 3.4.4
spacy-legacy 3.0.12 spacy-logger 1.0.4 spark-tensorflow-distributor 1.0.0
sqlparse 0.4.2 srsly 2.4.5 ssh-import-id 5.10
stack-data 0.2.0 statsmodels 0.13.2 tabulate 0.8.9
tangled-up-in-unicode 0.2.0 tenacity 8.0.1 tensorboard 2.11.2
tensorboard-data-server 0.6.1 tensorboard-plugin-profile 2.11.1 tensorboard-plugin-wit 1.8.1
tensorflow 2.11.0 tensorflow-estimator 2.11.0 tensorflow-io-gcs-filesystem 0.30.0
termcolor 2.2.0 terminado 0.13.1 testpath 0.5.0
thinc 8.1.7 threadpoolctl 2.2.0 tokenize-rt 4.2.1
tokenizers 0.13.2 tomli 1.2.2 torch 1.13.1+cu117
torchvision 0.14.1+cu117 tornado 6.1 tqdm 4.64.0
traitlets 5.1.1 transformers 4.25.1 typeguard 2.13.3
typer 0.7.0 typing_extensions 4.1.1 ujson 5.1.0
unattended-upgrades 0.1 urllib3 1.26.9 virtualenv 20.8.0
visions 0.7.5 wasabi 0.10.1 wcwidth 0.2.5
webencodings 0.5.1 websocket-client 0.58.0 Werkzeug 2.0.3
whatthepatch 1.0.4 wheel 0.37.1 widgetsnbextension 3.6.1
wrapt 1.12.1 xgboost 1.7.2 yapf 0.31.0
zipp 3.7.0
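
The CPU and GPU tables differ mainly in build variants: `torch 1.13.1+cpu` versus `torch 1.13.1+cu117`, and `tensorflow-cpu` versus `tensorflow`. As an illustrative sketch, the local version label after the `+` can be inspected to tell which build is installed. The helper names `split_local_version` and `build_flavor` are hypothetical, and the `cu` prefix check follows the PyTorch wheel naming convention (where `cu117` denotes CUDA 11.7) rather than anything stated on this page.

```python
from typing import Optional, Tuple


def split_local_version(version: str) -> Tuple[str, Optional[str]]:
    """Split '1.13.1+cu117' into ('1.13.1', 'cu117'); the label may be None."""
    public, sep, local = version.partition("+")
    return public, (local if sep else None)


def build_flavor(version: str) -> str:
    """Classify a version string by its local label: cpu, cuda, other, or unspecified."""
    _, local = split_local_version(version)
    if local is None:
        return "unspecified"
    if local == "cpu":
        return "cpu"
    if local.startswith("cu"):
        return "cuda"
    return "other"


if __name__ == "__main__":
    # Version strings copied from the tables and package list on this page.
    # On a cluster, build_flavor(torch.__version__) works the same way.
    for version in ["1.13.1+cpu", "1.13.1+cu117", "0.2.7+db3", "2.11.0"]:
        print(f"{version}: {build_flavor(version)}")
```

Labels such as `db3` on the Databricks fork of hyperopt fall into the "other" bucket here, since they mark a vendor build rather than a hardware target.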

R libraries

The R libraries are identical to the R libraries in Databricks Runtime 12.2 LTS.

Java and Scala libraries (Scala 2.12 cluster)

In addition to the Java and Scala libraries in Databricks Runtime 12.2 LTS, Databricks Runtime 12.2 LTS ML contains the following JARs:

CPU clusters

Group ID Artifact ID Version
com.typesafe.akka akka-actor_2.12 2.5.23
ml.combust.mleap mleap-databricks-runtime_2.12 v0.20.0-db1
ml.dmlc xgboost4j-spark_2.12 1.7.3
ml.dmlc xgboost4j_2.12 1.7.3
org.graphframes graphframes_2.12 0.8.2-db1-spark3.2
org.mlflow mlflow-client 2.1.1
org.scala-lang.modules scala-java8-compat_2.12 0.8.0
org.tensorflow spark-tensorflow-connector_2.12 1.15.0

GPU clusters

Group ID Artifact ID Version
com.typesafe.akka akka-actor_2.12 2.5.23
ml.combust.mleap mleap-databricks-runtime_2.12 v0.20.0-db1
ml.dmlc xgboost4j-gpu_2.12 1.7.3
ml.dmlc xgboost4j-spark-gpu_2.12 1.7.3
org.graphframes graphframes_2.12 0.8.2-db1-spark3.2
org.mlflow mlflow-client 2.1.1
org.scala-lang.modules scala-java8-compat_2.12 0.8.0
org.tensorflow spark-tensorflow-connector_2.12 1.15.0
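
Each row in the JAR tables maps onto a Maven coordinate of the form `groupId:artifactId:version`. The sketch below shows that mapping for a few rows from the CPU table; the `Jar` type and `scala_binary_version` helper are hypothetical names introduced for illustration. Whether the Databricks-specific builds (for example, versions ending in `-db1`) are available from public repositories is not stated on this page and should be verified separately.

```python
from typing import NamedTuple, Optional


class Jar(NamedTuple):
    """One row of the JAR tables: group ID, artifact ID, version."""

    group_id: str
    artifact_id: str
    version: str

    def coordinate(self) -> str:
        """Return the standard Maven coordinate groupId:artifactId:version."""
        return f"{self.group_id}:{self.artifact_id}:{self.version}"


def scala_binary_version(artifact_id: str) -> Optional[str]:
    """Return the Scala suffix of an artifact ID ('2.12'), or None if absent."""
    _, sep, suffix = artifact_id.rpartition("_")
    return suffix if sep else None


# Rows copied from the CPU clusters table above.
CPU_JARS = [
    Jar("ml.dmlc", "xgboost4j-spark_2.12", "1.7.3"),
    Jar("ml.dmlc", "xgboost4j_2.12", "1.7.3"),
    Jar("org.mlflow", "mlflow-client", "2.1.1"),
]

if __name__ == "__main__":
    for jar in CPU_JARS:
        scala = scala_binary_version(jar.artifact_id) or "no Scala suffix"
        print(f"{jar.coordinate()} ({scala})")
```

Artifacts without a `_2.12` suffix, such as `mlflow-client`, are plain Java libraries and do not depend on the cluster's Scala version.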