Conversation

@volatilemolotov

This PR adds an AI starter kit Helm chart that aims to provide an out-of-the-box development solution for AI workloads. It uses Ray Serve, Ollama, or RamaLama to run the LLMs, and JupyterHub for development.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 24, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: volatilemolotov
Once this PR has been reviewed and has the lgtm label, please assign soltysh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Sep 24, 2025
@volatilemolotov
Author

Here is the initial PR, currently in draft state. I think we should be able to send it out for review.

@janetkuo @gongmax @fcabrera23

Member

@janetkuo janetkuo left a comment

I'm not sure if we want Cloud Build and Terraform as prerequisites. Suggest making this example more generic, like other AI examples. I'd like to focus on the Kubernetes manifests and make it customizable for different platforms.

@volatilemolotov
Author

> I'm not sure if we want Cloud Build and Terraform as prerequisites. Suggest making this example more generic, like other AI examples. I'd like to focus on the Kubernetes manifests and make it customizable for different platforms.

Removed the example values and the ci folder. I hope the Makefile can stay; it can be useful.

```
-f values.yaml
```

3. **Access JupyterHub:**

This comment was marked as resolved.

Author

I will check which ones can actually run on Minikube and note it accordingly.

Author

All should work. The multi-agent Ray one needs Ray enabled, but we are not enabling it by default.

Comment on lines 85 to 88
```bash
helm install ai-starter-kit . \
  --set huggingface.token="YOUR_HF_TOKEN" \
  -f values.yaml \
  -f values-gke.yaml
```

@janetkuo do you have any concerns about including the GKE-specific setup in the example? Do you think we should remove all of this?

Member

Yes, the example should be as generic as possible so that it's applicable to most Kubernetes clusters. It might be challenging at times for some platform-specific setup, and in that case we should call it out and mention alternatives.


In our case, there is some platform-specific setup, such as specifying the GPU on GKE with `cloud.google.com/gke-accelerator: nvidia-l4`. What is your suggestion for handling this?

Author

Removed all GKE mentions and added a README entry that demonstrates how to work with GPUs.
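
For illustration, a values override along these lines could request a GPU and pin scheduling to a GPU node pool. The key paths below are a sketch, not the chart's confirmed schema, and the GKE accelerator label is just one platform's example:

```yaml
# Sketch of a GPU values override; key paths are illustrative.
resources:
  limits:
    nvidia.com/gpu: 1

# Platform-specific: GKE uses this label to select the accelerator type.
# Other platforms expose their own node labels for GPU scheduling.
nodeSelector:
  cloud.google.com/gke-accelerator: nvidia-l4
```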

@linux-foundation-easycla

linux-foundation-easycla bot commented Oct 15, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Oct 15, 2025
```
"source": [
"import os, time, requests, json\n",
"\n",
"USE_WRAPPER = True\n",
```

Can this be auto-set like what you did in cell 5?


This was resolved in previous commit.

```
@@ -0,0 +1,64 @@
.PHONY: check_hf_token check_OCI_target package_helm lint dep_update install install_gke start uninstall push_helm
```

What is the usage of the make commands?

Author

Do you want me to document each one?


Just in general, in the README. Users can still follow the current README to install via Helm, so I'm not sure when these make commands should be used.


Documented in commit: 78a03d7
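
Going by the target names in the `.PHONY` line, typical usage would look something like the following (a sketch only; the exact variables each target expects are defined in the Makefile itself):

```shell
# Illustrative invocations of the chart's Makefile targets.
make lint         # lint the chart
make dep_update   # refresh chart dependencies
make install      # install the release into the current kube context
make uninstall    # remove the release
```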

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Oct 28, 2025
Comment on lines 180 to 183
### Delete GKE cluster
```bash
gcloud container clusters delete ${CLUSTER_NAME} --region=${REGION}
```

Please remove this


Removed in commit: ced46e9

```
"id": "0af596cf-5ba6-42df-a030-61d7a20d6f7b",
"metadata": {},
"source": [
"### Cell 6 - MLFlow: connect to tracking server and list recent runs\n",
```

This comment was marked as outdated.


@alex-akv did you reproduce this issue? I'm still seeing it.


No, I get 4 recent runs as an output.

```yaml
nvidia.com/gpu: 1

nodeSelector:
  cloud.google.com/gke-accelerator: nvidia-l4
```

Let's call out in the description above that this is using GKE as an example


@alex-akv alex-akv Nov 4, 2025

Described in commit: ced46e9

```
},
"outputs": [],
"source": [
"!pip install numpy mlflow tensorflow \"ray[serve,default,client]\""
```

@gongmax gongmax Oct 31, 2025

Specifying `tensorflow==2.20.0` fixed some errors.
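
Applied to the install cell quoted above, the pin would look something like this (a sketch; the package list and extras are taken from the notebook cell):

```shell
pip install numpy mlflow "tensorflow==2.20.0" "ray[serve,default,client]"
```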


@alex-akv alex-akv Nov 4, 2025

Specified in commit: ced46e9

```
"id": "8111d705-595e-4e65-8479-bdc76191fa31",
"metadata": {},
"source": [
"### Cell 3 - Deploy model on Ray Serve with llama-cpp\n",
```

Running this cell doesn't output any error, but the corresponding Ray job failed with the logs below:

```
runtime_env setup failed: Failed to set up runtime environment.
Could not create the actor because its associated runtime env failed to be created.
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/_private/runtime_env/agent/runtime_env_agent.py", line 384, in _create_runtime_env_with_retry
    runtime_env_context = await asyncio.wait_for(
                          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ray/anaconda3/lib/python3.12/asyncio/tasks.py", line 520, in wait_for
    return await fut
           ^^^^^^^^^
  File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/_private/runtime_env/agent/runtime_env_agent.py", line 350, in _setup_runtime_env
    await create_for_plugin_if_needed(
  File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/_private/runtime_env/plugin.py", line 254, in create_for_plugin_if_needed
    size_bytes = await plugin.create(uri, runtime_env, context, logger=logger)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/_private/runtime_env/pip.py", line 309, in create
    pip_dir_bytes = await task
                    ^^^^^^^^^^
  File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/_private/runtime_env/pip.py", line 289, in _create_for_hash
    await PipProcessor(
  File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/_private/runtime_env/pip.py", line 191, in _run
    await self._install_pip_packages(
  File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/_private/runtime_env/pip.py", line 167, in _install_pip_packages
    await check_output_cmd(pip_install_cmd, logger=logger, cwd=cwd, env=pip_env)
  File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/_private/runtime_env/utils.py", line 105, in check_output_cmd
    raise SubprocessCalledProcessError(
ray._private.runtime_env.utils.SubprocessCalledProcessError: Run cmd[13] failed with the following details.
Command '['/tmp/ray/session_2025-10-31_14-03-31_982555_1/runtime_resources/pip/8dc32a48ead56d51e7e1a0de9341332701cf7b2f/virtualenv/bin/python', '-m', 'pip', 'install', '--disable-pip-version-check', '--no-cache-dir', '-r', '/tmp/ray/session_2025-10-31_14-03-31_982555_1/runtime_resources/pip/8dc32a48ead56d51e7e1a0de9341332701cf7b2f/ray_runtime_env_internal_pip_requirements.txt']' returned non-zero exit status 1.
Last 50 lines of stdout:
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.8/62.8 kB 179.2 MB/s eta 0:00:00
    Downloading graphql_core-3.2.6-py3-none-any.whl (203 kB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 203.4/203.4 kB 196.9 MB/s eta 0:00:00
    Downloading graphql_relay-3.2.0-py3-none-any.whl (16 kB)
    Downloading greenlet-3.2.4-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (607 kB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 607.6/607.6 kB 184.8 MB/s eta 0:00:00
    Downloading itsdangerous-2.2.0-py3-none-any.whl (16 kB)
    Downloading joblib-1.5.2-py3-none-any.whl (308 kB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 308.4/308.4 kB 217.8 MB/s eta 0:00:00
    Downloading kiwisolver-1.4.9-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (1.5 MB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 188.7 MB/s eta 0:00:00
    Downloading pillow-12.0.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (7.0 MB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.0/7.0 MB 165.1 MB/s eta 0:00:00
    Downloading threadpoolctl-3.6.0-py3-none-any.whl (18 kB)
    Downloading werkzeug-3.1.3-py3-none-any.whl (224 kB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 224.5/224.5 kB 182.1 MB/s eta 0:00:00
    Downloading zipp-3.23.0-py3-none-any.whl (10 kB)
    Downloading mako-1.3.10-py3-none-any.whl (78 kB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.5/78.5 kB 181.9 MB/s eta 0:00:00
    Downloading smmap-5.0.2-py3-none-any.whl (24 kB)
    Building wheels for collected packages: llama-cpp-python
      Building wheel for llama-cpp-python (pyproject.toml): started
      Building wheel for llama-cpp-python (pyproject.toml): finished with status 'error'
      error: subprocess-exited-with-error
  
      × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
      │ exit code: 1
      ╰─> [16 lines of output]
          *** scikit-build-core 0.11.6 using CMake 4.1.2 (wheel)
          *** Configuring CMake...
          loading initial cache file /tmp/tmpqiu0581x/build/CMakeInit.txt
          CMake Error at /tmp/pip-build-env-6dhn2ys0/normal/lib/python3.12/site-packages/cmake/data/share/cmake-4.1/Modules/CMakeDetermineCCompiler.cmake:48 (message):
            Could not find compiler set in environment variable CC:
      
            gcc -pthread -B /home/ray/anaconda3/compiler_compat.
          Call Stack (most recent call first):
            CMakeLists.txt:3 (project)
      
      
          CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
          CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
          -- Configuring incomplete, errors occurred!
      
          *** CMake configuration failed
          [end of output]
  
      note: This error originates from a subprocess, and is likely not a problem with pip.
      ERROR: Failed building wheel for llama-cpp-python
    Failed to build llama-cpp-python
    ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
```
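
One workaround sketch for this class of failure: `llama-cpp-python` is built from source by pip, so the Ray runtime env needs working C/C++ compilers. Ray's `runtime_env` accepts an `env_vars` map, so `CC`/`CXX` can be pointed at real compilers in the image. The paths below are hypothetical, and whether the Ray image ships gcc at all is an assumption to verify:

```python
# Sketch (assumption): override CC/CXX inside the Ray runtime env so pip's
# source build of llama-cpp-python can find a usable compiler.
runtime_env = {
    "pip": ["llama-cpp-python"],
    "env_vars": {
        "CC": "/usr/bin/gcc",   # hypothetical compiler path in the Ray image
        "CXX": "/usr/bin/g++",
    },
}

# This dict would then be passed to ray.init(runtime_env=runtime_env)
# or attached to the Serve deployment that builds the model.
```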


We have tested on macOS and Linux desktop environments and were not able to reproduce the issue.


Were you testing using minikube on Linux?

Author

Yes
