Minimal python requirements.txt #210

@yxtay

Is it possible to provide a minimal requirements.txt for the python container, listing only the packages required to run on databricks-runtime with support for the necessary functions (i.e. cell magics)?

I note that this was available prior to commit 832dc53, where a condensed list of python requirements was provided. I believe this was valid for 14.3 LTS. However, this was changed for the 15.4 LTS release.

# required packages
six==1.16.0
jedi==0.18.1
ipython==8.14.0
numpy==1.23.5
pandas==1.5.3
pyarrow==8.0.0
matplotlib==3.7.0
jinja2==3.1.2
ipykernel==6.25.0
grpcio==1.48.2
grpcio-status==1.48.1
databricks-sdk==0.1.6
python-lsp-jsonrpc==1.1.2

A similar list is currently still available for the GPU container. Is it still valid? I suppose I should update the versions to the ones in 15.4 LTS. Can I expect it to work fine after that?

six==1.16.0
jedi==0.18.1
ipython==8.14.0
ipython-genutils==0.2.0
numpy==1.23.5
pandas==1.5.3
pyarrow==8.0.0
matplotlib==3.7.0
Jinja2==3.1.2
ipykernel==6.25.0
protobuf==4.23.3
pyccolo==0.0.52
grpcio==1.48.2
grpcio-status==1.48.1
databricks-sdk==0.1.6

In comparison, the requirements.txt in the python container is the full list replicating databricks-runtime 15.4 LTS.

asttokens==2.0.5
astunparse==1.6.3
azure-core==1.30.2
azure-storage-blob==12.19.1
azure-storage-file-datalake==12.14.0
backcall==0.2.0
black==23.3.0
blinker==1.4
boto3==1.34.39
botocore==1.34.39
cachetools==5.3.3
certifi==2023.7.22
cffi==1.15.1
chardet==4.0.0
charset-normalizer==2.0.4
click==8.0.4
cloudpickle==2.2.1
comm==0.1.2
contourpy==1.0.5
cryptography==41.0.3
cycler==0.11.0
Cython==0.29.32
databricks-sdk==0.20.0
dbus-python==1.2.18
debugpy==1.6.7
decorator==5.1.1
distlib==0.3.8
entrypoints==0.4
executing==0.8.3
facets-overview==1.1.1
filelock==3.13.4
fonttools==4.25.0
gitdb==4.0.11
GitPython==3.1.43
google-api-core==2.18.0
google-auth==2.31.0
google-cloud-core==2.4.1
google-cloud-storage==2.17.0
google-crc32c==1.5.0
google-resumable-media==2.7.1
googleapis-common-protos==1.63.2
grpcio==1.60.0
grpcio-status==1.60.0
httplib2==0.20.2
idna==3.4
importlib-metadata==6.0.0
ipyflow-core==0.0.198
ipykernel==6.25.1
ipython==8.15.0
ipython-genutils==0.2.0
ipywidgets==7.7.2
isodate==0.6.1
jedi==0.18.1
jeepney==0.7.1
jmespath==0.10.0
joblib==1.2.0
jupyter_client==7.4.9
jupyter_core==5.3.0
keyring==23.5.0
kiwisolver==1.4.4
launchpadlib==1.10.16
lazr.restfulclient==0.14.4
lazr.uri==1.0.6
matplotlib==3.7.2
matplotlib-inline==0.1.6
mlflow-skinny==2.11.4
more-itertools==8.10.0
mypy-extensions==0.4.3
nest-asyncio==1.5.6
numpy==1.23.5
oauthlib==3.2.0
packaging==23.2
pandas==1.5.3
parso==0.8.3
pathspec==0.10.3
patsy==0.5.3
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.4.0
pip==23.2.1
platformdirs==3.10.0
plotly==5.9.0
prompt-toolkit==3.0.36
proto-plus==1.24.0
protobuf==4.24.1
psutil==5.9.0
psycopg2==2.9.3
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==14.0.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyccolo==0.0.52
pycparser==2.21
pydantic==1.10.6
Pygments==2.15.1
PyGObject==3.42.1
PyJWT==2.3.0
pyodbc==4.0.38
pyparsing==3.0.9
python-dateutil==2.8.2
python-lsp-jsonrpc==1.1.1
pytz==2022.7
PyYAML==6.0
pyzmq==23.2.0
requests==2.31.0
rsa==4.9
s3transfer==0.10.2
scikit-learn==1.3.0
scipy==1.11.1
seaborn==0.12.2
SecretStorage==3.3.1
setuptools==68.0.0
six==1.16.0
smmap==5.0.1
sqlparse==0.5.0
ssh-import-id==5.11
stack-data==0.2.0
statsmodels==0.14.0
tenacity==8.2.2
threadpoolctl==2.2.0
tokenize-rt==4.2.1
tornado==6.3.2
traitlets==5.7.1
typing_extensions==4.10.0
tzdata==2022.1
ujson==5.4.0
unattended-upgrades==0.1
urllib3==1.26.16
virtualenv==20.24.2
wadllib==1.3.6
wcwidth==0.2.5
wheel==0.38.4
zipp==3.11.0
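
For reference, the version update I have in mind could be sketched as below. This is only a rough illustration: `normalize`, `parse_pins` and `repin` are hypothetical helper names, and the script assumes every line has the simple `name==version` form. It re-pins each package in a short list to the version found in the full list and reports any name the full list does not contain. It does not check whether the short list is actually sufficient for 15.4 LTS.

```python
import re


def normalize(name: str) -> str:
    """Normalize a package name per PEP 503 (case-insensitive; '-', '_', '.' equivalent)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(text: str) -> dict[str, tuple[str, str]]:
    """Map normalized name -> (name as written, version) for each 'name==version' line."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if "==" not in line:
            continue  # skip blank lines and anything that is not a simple pin
        name, version = line.split("==", 1)
        pins[normalize(name)] = (name.strip(), version.strip())
    return pins


def repin(minimal_text: str, full_text: str) -> tuple[list[str], list[str]]:
    """Re-pin the minimal list to the versions in the full list.

    Returns (re-pinned lines, names that are absent from the full list).
    """
    full = parse_pins(full_text)
    pinned, missing = [], []
    for key, (name, _old_version) in parse_pins(minimal_text).items():
        if key in full:
            pinned.append(f"{name}=={full[key][1]}")
        else:
            missing.append(name)
    return pinned, missing


if __name__ == "__main__":
    # Small excerpts of the lists quoted in this issue, for illustration only.
    minimal = """
    # required packages
    ipython==8.14.0
    Jinja2==3.1.2
    pyarrow==8.0.0
    """
    full = """
    ipython==8.15.0
    pyarrow==14.0.1
    """
    pinned, missing = repin(minimal, full)
    print("\n".join(pinned))
    print("not in full list:", missing)
```

With these excerpts, `ipython` and `pyarrow` pick up the 15.4 LTS versions, while `Jinja2` is reported as missing because it does not appear in the full list above. Packages reported this way would still need a version chosen by hand.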

To reiterate my main point, is there a minimal requirements.txt for the python container? Will you also provide the same for the lsp-requirements.txt? It also seems to have expanded quite a bit for the 15.4 LTS release. This would help me build a small container image for my team, as the current one is >10GB uncompressed.
