Azure Computer Vision 🤝 Python #366

### Talk title

Azure Computer Vision with Python

### Short talk description

This session is designed to take you on a journey through the powerful capabilities of **Azure Computer Vision**.

We'll begin by exploring the **out-of-the-box features** available, showing you how to apply AI to image and video analysis right away, without building complex machine learning models. This introductory segment will include:

- **Detailed explanations** of various features, such as object detection, optical character recognition (OCR), image classification, and facial recognition.

- **Real-world examples** to illustrate the practical applications of each service.

- A **guided walkthrough of Azure Vision Studio**, demonstrating a no-code approach to experimenting with and understanding these services interactively.

### Long talk description

This session dives deep into **Azure Computer Vision**, Microsoft's powerful, AI-based service designed for analyzing and extracting meaningful insights from images and videos. We'll start by showcasing the practical, no-code capabilities available right out of the box.

**Exploring the Azure Vision Studio** 
We'll introduce the dedicated **Azure Vision Studio**, a user-friendly interface that lets you instantly test and visualize the service's various features—like image tagging, object detection, and optical character recognition (OCR)—without writing a single line of code. This provides a perfect foundation for understanding the core functionality before we move to development.

**Hands-On Development with Python**
The core of this talk focuses on **programmatic development** using the **Python SDK**. You'll learn the essentials of integrating Azure Computer Vision into your applications through step-by-step examples that show how to send images to the service and interpret the AI-generated results. We'll then apply this knowledge in two real-world mini-projects built with Python and **Streamlit**, which provides a quick and engaging user interface.
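
A minimal sketch of the send-and-interpret step, assuming the `azure-ai-vision-imageanalysis` package (the proposal does not say which SDK package the talk uses). The environment variable names `VISION_ENDPOINT` and `VISION_KEY`, and the helper names `analyze_image` and `summarize`, are illustrative choices and not part of the talk material.

```python
import os
from types import SimpleNamespace


def analyze_image(image_path: str):
    """Send one local image to the service and return the raw result."""
    # Imported lazily so summarize() below can be tried without the SDK installed.
    from azure.ai.vision.imageanalysis import ImageAnalysisClient
    from azure.ai.vision.imageanalysis.models import VisualFeatures
    from azure.core.credentials import AzureKeyCredential

    # Assumed environment variable names; use whatever your setup provides.
    client = ImageAnalysisClient(
        endpoint=os.environ["VISION_ENDPOINT"],
        credential=AzureKeyCredential(os.environ["VISION_KEY"]),
    )
    with open(image_path, "rb") as f:
        return client.analyze(
            image_data=f.read(),
            visual_features=[VisualFeatures.CAPTION, VisualFeatures.TAGS],
        )


def summarize(result, min_confidence: float = 0.5) -> dict:
    """Reduce a result to its caption text and the tags above a confidence cutoff."""
    caption = result.caption.text if result.caption else None
    tags = []
    if result.tags:
        tags = [t.name for t in result.tags.list if t.confidence >= min_confidence]
    return {"caption": caption, "tags": tags}


# Offline stand-in with the same attribute shape, so the helper runs without a key.
fake_result = SimpleNamespace(
    caption=SimpleNamespace(text="a red car", confidence=0.9),
    tags=SimpleNamespace(
        list=[
            SimpleNamespace(name="car", confidence=0.95),
            SimpleNamespace(name="tree", confidence=0.2),
        ]
    ),
)
print(summarize(fake_result))
```

Keeping the network call and the interpretation in separate functions means the interpretation logic can be exercised with stand-in data before any Azure resource exists.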

**Project 1: Smart Image Renaming**
This project presents an elegant solution to a common organizational challenge. We will use a Streamlit application to process a directory of images. By leveraging Azure Computer Vision's **image captioning** feature, the solution will automatically rename every image file based on the descriptive caption generated by the AI, turning cryptic file names into meaningful labels.
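
The renaming step could be sketched roughly as follows. This is not the talk's actual implementation: `caption_to_stem`, `rename_images`, and the `get_caption` callable are hypothetical names, and the captioning call itself is left behind that callable so the file handling can be shown on its own.

```python
import re
from pathlib import Path
from typing import Callable

# Assumed set of extensions to treat as images.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def caption_to_stem(caption: str, max_words: int = 8) -> str:
    """Turn a free-text caption into a filesystem-friendly file stem."""
    words = re.findall(r"[a-z0-9]+", caption.lower())
    return "_".join(words[:max_words]) or "untitled"


def rename_images(folder: Path, get_caption: Callable[[Path], str]) -> dict[str, str]:
    """Rename each image in `folder` after its caption; return old name -> new name."""
    renamed = {}
    # sorted() takes a snapshot, so files renamed during the loop are not revisited.
    for path in sorted(folder.iterdir()):
        if not path.is_file() or path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        stem = caption_to_stem(get_caption(path))
        suffix = path.suffix.lower()
        target = path.with_name(stem + suffix)
        counter = 2
        # Avoid overwriting when two images receive the same caption.
        while target.exists() and target != path:
            target = path.with_name(f"{stem}_{counter}{suffix}")
            counter += 1
        if target != path:
            path.rename(target)
        renamed[path.name] = target.name
    return renamed
```

In an application like the one described, `get_caption` would wrap the call to the image captioning feature and return the caption text. A Streamlit front end would then only need to collect the folder path and display the returned mapping.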

**Project 2: Automated Image Classification and Segregation**
We will build a Streamlit application that uses Azure Computer Vision's **object detection and tagging** capabilities to intelligently sort files. Given a collection of images (e.g., photos of cars, buildings, and street signs), the solution will auto-segregate them, placing each image into its correct, automatically created folder. This demonstrates automation that scales to large content collections.
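
One possible shape for the sorting step is sketched below, under stated assumptions: the `CATEGORIES` keyword mapping, the 0.6 confidence cutoff, and the names `pick_category`, `segregate`, and `get_tags` are all hypothetical. The tags are expected as `(name, confidence)` pairs, however they are obtained from the service.

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable

# Assumed set of extensions to treat as images.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}

# Hypothetical mapping from folder name to the tag names that select it.
CATEGORIES = {
    "cars": {"car", "vehicle"},
    "buildings": {"building", "skyscraper"},
    "signs": {"sign", "street sign"},
}


def pick_category(
    tags: Iterable[tuple[str, float]],
    categories: dict[str, set[str]],
    min_confidence: float = 0.6,
) -> str:
    """Return the category matched by the most confident tag, or 'uncategorized'."""
    best_name, best_conf = "uncategorized", 0.0
    for tag, confidence in tags:
        if confidence < min_confidence:
            continue
        for category, keywords in categories.items():
            if tag.lower() in keywords and confidence > best_conf:
                best_name, best_conf = category, confidence
    return best_name


def segregate(
    folder: Path,
    get_tags: Callable[[Path], Iterable[tuple[str, float]]],
    categories: dict[str, set[str]],
) -> dict[str, str]:
    """Move each image into a per-category subfolder; return file name -> category."""
    placed = {}
    # sorted() takes a snapshot, so newly created subfolders are not iterated.
    for path in sorted(folder.iterdir()):
        if not path.is_file() or path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        category = pick_category(get_tags(path), categories)
        dest_dir = folder / category
        dest_dir.mkdir(exist_ok=True)
        shutil.move(str(path), str(dest_dir / path.name))
        placed[path.name] = category
    return placed
```

Routing unmatched images to an `uncategorized` folder, rather than leaving them in place, keeps the result easy to inspect: anything the tags could not place ends up in one location.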

By the end of this talk, you'll have the foundational knowledge and concrete code examples to start building your own intelligent vision applications!


### What format do you have in mind?

Talk (20-25 minutes + Q&A)

### Talk outline / Agenda


## **1\. Introduction & Foundational Concepts (5 min)**

* **1.1. Welcome & Session Goal:** Introduce Azure Computer Vision and its role in modern application development.  
* **1.2. What is Azure Computer Vision?**  
  * AI-as-a-Service model for image and video analysis.  
  * Key components: Classification, Detection, OCR, Captioning.  
* **1.3. Why Vision Studio?** Transitioning to the no-code environment.

## **2\. Out-of-the-Box Power (10 min)**

* **2.1. Guided Tour of Azure Vision Studio**  
  * How to quickly experiment and test models without code.  
  * Understanding feature capabilities (e.g., detecting famous landmarks, content moderation).  
* **2.2. Core Feature Walkthrough (Examples)**  
  * **Image Tagging & Categorization:** Automatically describing image content.  
  * **Object Detection:** Locating and bounding specific items (cars, signs, etc.).  
  * **Optical Character Recognition (OCR):** Extracting text from images/documents.  
* **2.3. Key Takeaway:** Demonstrating the ease of use and immediate insights.


## **3\. Real-World Mini-Projects (10 min)**

### **Project 1 Demo: Smart Image Renaming**

* **3.1. The Problem:** Managing a folder full of cryptic image file names.  
* **3.2. The Solution:** Using Streamlit and Azure's **Image Captioning** feature.  
* **3.3. Demo:** Live walk-through of the application, renaming files based on AI-generated descriptions.

### **Project 2 Demo: Automated Image Classification**

* **3.4. The Problem:** Automatically sorting large volumes of images based on content.  
* **3.5. The Solution:** Using Streamlit and Azure's **Object Detection & Tagging**.  
* **3.6. Demo:** Showing the app segregating images (cars, buildings, signs) into respective folders.

## **4\. Conclusion (5 min)**

* **4.1. Q\&A and Discussion.**

### Key takeaways

- **No-Code Mastery:** How to immediately leverage Vision Studio for rapid prototyping and testing.
- **Seamless Integration:** Mastering the Python SDK to connect CV services to your custom applications.
- **Real-World Automation:** Applying AI for practical solutions like file renaming and auto-classification.

### What domain would you say your talk falls under?

Web Development

### Duration (including Q&A)

30 minutes (20-25 minute session + 5-10 minutes Q&A)

### Prerequisites and preparation

- **Basic Python Programming:** Familiarity with core Python fundamentals.
- **API Understanding:** Basic knowledge of RESTful APIs and HTTP requests.
- **Cloud Exposure:** Basic understanding of cloud services.

### Resources and references

_No response_

### Link to slides/demos (if available)

[Slides](https://docs.google.com/presentation/d/17UO4srh-j8t-uMF3orid1IRbTRSHPC2e/edit?usp=sharing&ouid=109895395201970378286&rtpof=true&sd=true)

### Twitter/X handle (optional)

https://x.com/vipul_malhotra

### LinkedIn profile (optional)

https://www.linkedin.com/in/vipulm124/

### Profile picture URL (optional)

_No response_

### Speaker bio

- Currently working as a Principal Consultant, focusing on backend (Python) and Azure.
- I like building small, useful things that make life a bit easier.
- I’ve created a couple of Chrome extensions — [Handy Links](https://chromewebstore.google.com/detail/mfhgonmedmkpgibomffmjfogmdolelke?utm_source=item-share-cb) and [Boring X](https://chromewebstore.google.com/detail/boring-x/njjlikigbflidkddjflbplnlfkillcpe?utm_source=item-share-cb) — and also a few Python packages like [qrstyler](https://pypi.org/project/qrstyler/) and [csv-excel-azure](https://pypi.org/project/csv-excel-azure/).
- I’m ranked [192](https://www.c-sharpcorner.com/members/vipul-malhotra5) on C# Corner, where I write about coding stuff.
- I also run a small [YouTube channel](https://www.youtube.com/@vipulm124) where I share quick POCs, code demos, and Azure experiments.
- Currently working with the open-source community TeamShiksha on a couple of projects and mentoring freshers.

### Availability

I'll be available for the 8th November tech session.

### Accessibility & special requirements

- External display
- Internet

### Speaker checklist

- [x] I have read and understood the [PyDelhi guidelines](https://github.com/pydelhi/talks/blob/main/guidelines.md) for submitting proposals and giving talks
- [x] I will make my talk accessible to all attendees and will proactively ask for any accommodations or special requirements I might need
- [x] I agree to share slides, code snippets, and other materials used during the talk with the community
- [x] I will follow PyDelhi's Code of Conduct and maintain a welcoming, inclusive environment throughout my participation
- [x] I understand that PyDelhi meetups are community-centric events focused on learning, knowledge sharing, and networking, and I will respect this ethos by not using this platform for self-promotion or hiring pitches during my presentation, unless explicitly invited to do so by means of a sponsorship or similar arrangement
- [x] If the talk is recorded by the PyDelhi team, I grant permission to release the video on PyDelhi's YouTube channel under the CC-BY-4.0 license, or a different license of my choosing if I am specifying it in my proposal or with the materials I share

### Additional comments

_No response_
