Labels: build/install issue (Build or installation issue)
Description
Checklist
- I have searched for similar issues.
- I have tested with the latest development wheel.
- I have checked the release documentation and the latest documentation (for the `main` branch).
Steps to reproduce the issue
I have a custom dataset of very large point clouds that I want to process on an HPC cluster. The cluster has A100, H100, V100, and H200 GPUs.
This is my container definition (`.def`) file:
```
Bootstrap: docker
From: nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04

%post
    # Update package lists
    apt-get update

    # Install essential system libraries and Python3 (including OpenMP - fixes libgomp.so.1 error)
    apt-get install -y --no-install-recommends \
        python3 \
        python3-pip \
        python3-dev \
        python3-venv \
        libgl1-mesa-glx \
        libglib2.0-0 \
        libgomp1 \
        libgcc-s1 \
        libstdc++6 \
        libomp-dev \
        git \
        ca-certificates \
        wget \
        build-essential \
        && rm -rf /var/lib/apt/lists/*

    # Ensure PyTorch 2.0.x (cu118) compatible with Open3D-ML
    python3 -m pip install --no-cache-dir \
        torch==2.0.1+cu118 \
        torchvision==0.15.2+cu118 \
        torchaudio==2.0.2+cu118 \
        --index-url https://download.pytorch.org/whl/cu118

    # Install Open3D and dependencies (exact versions from working SSH env)
    python3 -m pip install --no-cache-dir \
        open3d==0.18.0 \
        numpy==1.26.4 \
        scikit-learn==1.7.1 \
        pyyaml==6.0.2 \
        laspy==2.5.4 \
        torchmetrics==1.4.0 \
        tensorboard \
        wandb==0.17.5

    # Clone and install Open3D-ML from source (idempotent)
    mkdir -p /opt
    rm -rf /opt/Open3D-ML
    git clone --depth 1 https://github.com/isl-org/Open3D-ML.git /opt/Open3D-ML
    python3 -m pip install -e /opt/Open3D-ML

    # Test installations during build to catch issues early
    python3 -c "import torch; print(f'✓ PyTorch {torch.__version__} installed')"
    python3 -c "import open3d; print('✓ Open3D imported successfully')"
    python3 -c "import open3d.ml.torch; print('✓ Open3D-ML imported successfully')"
    python3 -c "import numpy as np; print(f'✓ NumPy {np.__version__}')"
    python3 -c "import sklearn; print(f'✓ scikit-learn {sklearn.__version__}')"

%environment
    # Set environment variables for optimal performance
    export OMP_NUM_THREADS=8
    export MPLBACKEND=Agg
    export PYTHONUNBUFFERED=1
    # Ensure CUDA libraries are found (critical for GPU training)
    export LD_LIBRARY_PATH="/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"
    # Python path for custom modules
    export PYTHONPATH="/workspace/code:$PYTHONPATH"

%runscript
    exec python3 "$@"
```
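
The import checks at the end of `%post` only confirm that each package loads. They do not compare the CUDA version each package was built against. A rough sketch of such a comparison is shown here. The helper names (`cuda_major_minor`, `cuda_versions_match`, `check_installed_versions`) are hypothetical. Reading Open3D's build-time CUDA version from `open3d._build_config["CUDA_VERSION"]` is an assumption about a private attribute, which may differ between releases. `torch.version.cuda` is the documented PyTorch attribute.

```python
def cuda_major_minor(version):
    """Return (major, minor) from a CUDA version string such as '11.8' or '11.8.89'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def cuda_versions_match(open3d_cuda, torch_cuda):
    """True when both version strings share the same major.minor CUDA version."""
    return cuda_major_minor(open3d_cuda) == cuda_major_minor(torch_cuda)


def check_installed_versions():
    """Compare the installed Open3D and PyTorch builds; needs both packages."""
    import open3d
    import torch

    # Assumption: open3d._build_config is a dict with a "CUDA_VERSION" key.
    open3d_cuda = open3d._build_config["CUDA_VERSION"]
    torch_cuda = torch.version.cuda  # None for CPU-only PyTorch builds

    if torch_cuda is None or not cuda_versions_match(open3d_cuda, torch_cuda):
        raise SystemExit(
            f"CUDA mismatch: Open3D built with {open3d_cuda}, "
            f"PyTorch built with {torch_cuda}"
        )
    print(f"Open3D and PyTorch both built with CUDA {torch_cuda}")
```

Calling `check_installed_versions()` from a script run in `%post` would make the image build fail on a mismatch, instead of the mismatch surfacing later as a CPU fallback at run time.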
### Error message
```shell
The following modules were not unloaded:
(Use "module --force purge" to unload all):
Warning: Open3D was built with CUDA 11.7 butPyTorch was built with CUDA 11.8. Falling back to CPU for now.Otherwise, install PyTorch with CUDA 11.7.
/usr/bin/python3: Error while finding module specification for 'open3d.ml.torch.scripts.run_pipeline' (ModuleNotFoundError: No module named 'open3d.ml.torch.scripts')
```
Open3D, Python and System information
- Operating system: Ubuntu 22.04.3
- Python version: Python 3.10.12
- Open3D version: 0.18.0
- System type: HPC
- Is this remote workstation?: yes
- How did you install Open3D?: pip
- Compiler version (if built from source): (e.g. gcc 7.5, clang 7.0)
Additional information
No response
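
For reference, the `ModuleNotFoundError` in the error message can be reproduced without launching a job by probing the dotted module path inside the container. This is a minimal sketch using the standard library's `importlib.util.find_spec`. The helper name `module_available` is hypothetical.

```python
import importlib.util


def module_available(dotted_name):
    """Return True if `python3 -m dotted_name` could locate the module."""
    try:
        # find_spec imports parent packages first, so a missing parent
        # raises ModuleNotFoundError instead of returning None.
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        return False
```

Inside the container, `module_available("open3d.ml.torch.scripts.run_pipeline")` should return `False` whenever `python3 -m` fails with the error shown above.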
Inshu0302