[DMP 2025]: AI Code generation for lesson plans and model abstraction layer #4671

@Joshithach18

Ticket Contents

Description

This feature aims to develop and integrate an AI-powered code generation system within the Music Blocks platform to automatically produce project code snippets from lesson plans. The goal is to make it easier for educators and students to create, understand, and extend Music Blocks projects using natural language inputs.

By training an open-source Large Language Model (LLM) on curated lesson plans and project data, the system will allow seamless generation of Music Blocks-compatible code. A model abstraction layer will be introduced to ensure flexibility—so different AI models can be plugged in over time without changing the core application logic.

This will make the Music Blocks platform more accessible, intuitive, and sustainable for future generations of learners, educators, and developers.

Goals & Mid-Point Milestone

Goals

  • Train an open-source Large Language Model (LLM) to generate Music Blocks project code from lesson plans.
  • Implement a model abstraction layer to keep the AI system flexible and model-agnostic.
  • Expand the dataset by adding more Music Blocks lesson plans and project metadata.
  • Integrate Approximate Nearest Neighbor (ANN) algorithms for efficient code/context retrieval.
  • Create and document FastAPI endpoints for deploying the AI model.
  • Develop strategies and safeguards to minimize AI hallucinations and ensure accurate outputs.
  • Document the technical setup, dataset structure, and contributor guidelines for future maintainers.

Goals Achieved By Mid-point Milestone (1.5 Months)

  • LLM is selected, fine-tuned, and generates basic Music Blocks code from a prompt.
  • A functional model abstraction layer is implemented with at least two interchangeable models.
  • At least 30–50 new lesson plans or projects added to the dataset.
  • Initial version of ANN-based retrieval is working with sample queries.
  • Initial FastAPI service is set up and running locally or in a test environment.

Setup/Installation

No response

Expected Outcome

The final product will be an integrated AI-assisted code generation system within the Music Blocks platform that empowers educators and learners to automatically generate Music Blocks project code from natural language lesson plans or prompts.

Key Features and Behaviors:

  • Code Generation Interface
    Users can input lesson objectives or prompts (e.g., "Create a loop-based melody with tempo variation") and receive ready-to-use Music Blocks code snippets.

  • Model-Agnostic AI Backend
    A model abstraction layer allows swapping or upgrading LLMs (e.g., switching from a fine-tuned GPT-J to Mistral or another open-source model) without changing the front-end or API structure.

  • Expanded and Searchable Dataset
    The system is trained on a diverse set of Music Blocks lesson plans and project files, organized and indexed for fast retrieval using Approximate Nearest Neighbor (ANN) algorithms.

  • FastAPI Deployment
    The AI code generation engine is exposed via RESTful FastAPI endpoints that support local or cloud-based deployment, enabling integration with both the Music Blocks app and external tools.

  • Reduced Hallucination & High Relevance
    Responses from the AI model are grounded in actual project data via Retrieval-Augmented Generation (RAG), minimizing hallucinations and ensuring accuracy.

  • Well-Documented for Open Source
    All components—datasets, model training scripts, APIs, and abstraction layers—are clearly documented with setup guides and contribution instructions to help new contributors onboard quickly.

Final Behavior Summary:

When a user (teacher, student, or developer) types a lesson idea, the system will:

  1. Retrieve similar past lesson plans or code blocks using ANN.
  2. Use the LLM to generate relevant Music Blocks code with annotations.
  3. Display or export the code for immediate use or customization in the Music Blocks app.
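As a rough illustration of this flow, the sketch below (Python) wires the three steps together. The `retrieve` and `generate_code` callables stand in for the ANN retrieval and LLM adapter components described under Implementation Details, and the prompt template is purely illustrative.

```python
from typing import Callable

def generate_from_lesson_idea(
    idea: str,
    retrieve: Callable[[str, int], list[str]],  # step 1: ANN retrieval of similar lessons
    generate_code: Callable[[str], str],        # step 2: LLM code generation
    k: int = 3,
) -> str:
    examples = retrieve(idea, k)
    context = "\n\n".join(examples)
    prompt = (
        "You are generating Music Blocks project code.\n"
        f"Similar lesson plans:\n{context}\n\n"
        f"Lesson idea: {idea}\n"
        "Annotated Music Blocks code:"
    )
    # step 3: the caller displays or exports the returned code in the app
    return generate_code(prompt)
```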

Acceptance Criteria

The feature will be considered complete and accepted when the following criteria are met:

  • The open-source LLM is trained/fine-tuned and generates relevant Music Blocks code based on natural language prompts.
  • A model abstraction layer is implemented, tested, and allows switching between at least two different models without modifying the core API.
  • At least 50 high-quality lesson plans and project examples are added to the training dataset.
  • Approximate Nearest Neighbor (ANN)-based retrieval is integrated and improves the relevance of generated code through context-aware grounding.
  • A FastAPI backend is available with endpoints for:
    • Prompt submission
    • Code snippet retrieval
    • Model switching (optional)
  • The AI-generated code runs without errors in the Music Blocks application and is pedagogically meaningful.
  • A technical guide and contributor documentation are available in the repository or wiki.
  • Unit tests and integration tests are written for major components (model, API, abstraction layer).
  • Clear instructions for setup, usage, and contribution are provided for future developers and contributors.
  • AI hallucinations are minimized by using Retrieval-Augmented Generation (RAG) or other grounding techniques.

Implementation Details

The implementation of this feature will involve several technical components across AI model training, API development, and system integration. Below are the key details:

Technologies & Tools

  • Programming Languages: Python (backend, AI), JavaScript (Music Blocks frontend)
  • Frameworks/Libraries:
    • FastAPI – for building RESTful APIs to serve the AI model
    • Transformers (Hugging Face) – for working with and fine-tuning open-source LLMs (e.g., Mistral, GPT-J, LLaMA)
    • Faiss or Annoy – for implementing Approximate Nearest Neighbor (ANN) search for retrieving similar lesson plans
    • LangChain or Haystack (optional) – for building Retrieval-Augmented Generation (RAG) pipelines
    • Pandas, NumPy – for data handling and preprocessing
    • Docker – for containerizing the application and model server
  • Deployment Tools: Uvicorn (FastAPI server), optionally Hugging Face Spaces or local server

AI Model Training

  • Use an open-source LLM as the base (e.g., GPT-J, Mistral, Phi-2).
  • Fine-tune the model on a curated dataset of Music Blocks lesson plans and project descriptions.
  • Preprocess lesson plans and code into prompt–completion pairs for supervised fine-tuning.
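A minimal sketch of the preprocessing step, assuming lesson plans are stored as JSON files; the field names (`title`, `objective`, `blocks_code`) and paths are placeholders until the dataset schema is fixed:

```python
import json
from pathlib import Path

def build_pairs(lesson_dir: str, out_file: str) -> None:
    """Turn lesson-plan JSON files into prompt-completion pairs (JSONL) for fine-tuning."""
    pairs = []
    for path in sorted(Path(lesson_dir).glob("*.json")):
        lesson = json.loads(path.read_text())
        prompt = (
            f"Lesson: {lesson['title']}\n"
            f"Objective: {lesson['objective']}\n"
            "Generate the Music Blocks code."
        )
        pairs.append({"prompt": prompt, "completion": lesson["blocks_code"]})
    with open(out_file, "w") as f:
        for pair in pairs:
            f.write(json.dumps(pair) + "\n")

if __name__ == "__main__":
    build_pairs("data/lesson_plans", "data/train.jsonl")
```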

Model Abstraction Layer

  • Design an abstraction class/interface (e.g., ModelInterface) with methods like generate_code(prompt) and get_model_info().
  • Implement separate adapters for each model backend (e.g., MistralAdapter, GPTJAdapter).
  • Allow dynamic switching of models via config or API call without affecting the front-end logic.
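A sketch of what this interface could look like; the class and method names follow the bullets above, while the Hugging Face model ID and the registry mechanism are illustrative choices, not final decisions:

```python
from abc import ABC, abstractmethod

class ModelInterface(ABC):
    @abstractmethod
    def generate_code(self, prompt: str) -> str:
        """Return Music Blocks code for a natural language prompt."""

    @abstractmethod
    def get_model_info(self) -> dict:
        """Return metadata (name, backend, version) about the active model."""

class MistralAdapter(ModelInterface):
    def __init__(self, model_name: str = "mistralai/Mistral-7B-Instruct-v0.2"):
        from transformers import pipeline  # lazy import keeps other adapters lightweight
        self._pipe = pipeline("text-generation", model=model_name)
        self._name = model_name

    def generate_code(self, prompt: str) -> str:
        return self._pipe(prompt, max_new_tokens=512)[0]["generated_text"]

    def get_model_info(self) -> dict:
        return {"name": self._name, "backend": "transformers"}

# Switching models becomes a config/API concern rather than a code change.
MODEL_REGISTRY: dict[str, type[ModelInterface]] = {"mistral": MistralAdapter}

def load_model(key: str) -> ModelInterface:
    return MODEL_REGISTRY[key]()
```

A GPTJAdapter (or any other backend) would implement the same two methods and register itself under a new key, leaving the front-end and API untouched.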

Approximate Nearest Neighbor (ANN) Integration

  • Convert lesson plans and code examples into embeddings using SentenceTransformers or OpenAI-compatible models.
  • Index embeddings using FAISS or Annoy for fast similarity search.
  • Use top-k retrieved examples as additional context in RAG.
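A small sketch of the indexing and retrieval step using sentence-transformers and FAISS; `all-MiniLM-L6-v2` is a common lightweight embedding model chosen here only for illustration. Note that `IndexFlatL2` performs exact search, which is fine for a small corpus; an IVF or HNSW index would give true approximate search as the dataset grows.

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def build_index(documents: list[str]) -> faiss.IndexFlatL2:
    """Embed lesson-plan texts and index them for similarity search."""
    embeddings = encoder.encode(documents, convert_to_numpy=True).astype(np.float32)
    index = faiss.IndexFlatL2(embeddings.shape[1])
    index.add(embeddings)
    return index

def retrieve(index: faiss.IndexFlatL2, documents: list[str], query: str, k: int = 3) -> list[str]:
    """Return the top-k most similar documents to use as RAG context."""
    query_vec = encoder.encode([query], convert_to_numpy=True).astype(np.float32)
    _, ids = index.search(query_vec, k)
    return [documents[i] for i in ids[0]]
```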

API Layer

  • Expose endpoints like:
    • POST /generate – accepts a prompt and returns generated code
    • GET /lesson-plan/<id> – returns code from stored lesson plan
    • POST /switch-model – switches between supported models (if enabled)
  • Validate and sanitize all input/output.
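A sketch of the FastAPI layer wiring these endpoints to the abstraction layer; the `model_layer` module name refers to the abstraction-layer sketch above and is hypothetical, and the in-memory lesson-plan store is a placeholder for whatever persistence is chosen:

```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

from model_layer import load_model  # hypothetical module holding the abstraction-layer sketch

app = FastAPI(title="Music Blocks AI Composer")
model = None  # set at startup, e.g. model = load_model("mistral")

LESSON_PLANS: dict[str, str] = {}  # plan_id -> stored code (placeholder store)

class GenerateRequest(BaseModel):
    prompt: str

class SwitchRequest(BaseModel):
    name: str  # registry key, e.g. "mistral"

@app.post("/generate")
def generate(req: GenerateRequest):
    if model is None:
        raise HTTPException(status_code=503, detail="No model loaded")
    return {"code": model.generate_code(req.prompt)}

@app.get("/lesson-plan/{plan_id}")
def get_lesson_plan(plan_id: str):
    if plan_id not in LESSON_PLANS:
        raise HTTPException(status_code=404, detail="Unknown lesson plan")
    return {"code": LESSON_PLANS[plan_id]}

@app.post("/switch-model")
def switch_model(req: SwitchRequest):
    global model
    model = load_model(req.name)
    return model.get_model_info()
```

Run locally with `uvicorn main:app --reload` (assuming the file is saved as `main.py`).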

Testing & Validation

  • Write unit tests for all components (model adapters, API routes, retrieval logic).
  • Evaluate model output quality using real lesson plan prompts and a rubric covering relevance, correctness, and usability.
  • Include regression tests to prevent degradation when updating models.
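As a starting point, the sketch below uses a fake adapter to unit-test the abstraction-layer contract and smoke-test the `/generate` route with FastAPI's TestClient; the `api` module name is hypothetical.

```python
from fastapi.testclient import TestClient

class FakeAdapter:
    """Deterministic stand-in for a real LLM adapter, keeping tests fast."""
    def generate_code(self, prompt: str) -> str:
        return "start\nrepeat 4\n  note C4\nend"

    def get_model_info(self) -> dict:
        return {"name": "fake", "backend": "test"}

def test_adapter_contract():
    adapter = FakeAdapter()
    assert isinstance(adapter.generate_code("loop-based melody"), str)
    assert "name" in adapter.get_model_info()

def test_generate_endpoint(monkeypatch):
    import api  # hypothetical module containing the FastAPI app sketched above
    monkeypatch.setattr(api, "model", FakeAdapter())
    client = TestClient(api.app)
    resp = client.post("/generate", json={"prompt": "loop-based melody"})
    assert resp.status_code == 200
    assert "code" in resp.json()
```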

Documentation

  • Provide setup instructions, API usage examples, and dataset schema in the project wiki or README.
  • Include a contributor guide to onboard new developers easily.

Mockups/Wireframes

No response

Product Name

Music Blocks AI Composer

Organisation Name

Sugar Labs

Domain

Education

Tech Skills Needed

Artificial Intelligence

Mentor(s)

@walterbender
@sumitsrv (Sumit Srivastava)
@devinulibarri

This issue proposes a high-impact feature to enhance Music Blocks with AI-powered code generation and model-agnostic architecture. The feature aligns with the mission of Sugar Labs and Music Blocks to make learning engaging, creative, and accessible.

I request your review and guidance on this ticket. If approved, this would be a great issue for GSoC contributors or open-source AI enthusiasts interested in the intersection of education, music, and AI. Kindly assign the issue if suitable or suggest modifications.

Thank you for your support and mentorship!
Joshitha Chennamsetty

Category

AI
