Description
Talk title
Azure Computer Vision with Python
Short talk description
This session walks you through the capabilities of Azure Computer Vision.
We'll begin by exploring the wealth of out-of-the-box features available, showing you how to immediately leverage AI for image and video analysis without writing complex machine learning models. This introductory segment will include:
- Detailed explanations of various features, such as object detection, optical character recognition (OCR), image classification, and facial recognition.
- Real-world examples to illustrate the practical applications of each service.
- A guided walkthrough of Azure Vision Studio, demonstrating a no-code approach to experimenting with and understanding these services interactively.
Long talk description
This session dives deep into Azure Computer Vision, Microsoft's powerful, AI-based service designed for analyzing and extracting meaningful insights from images and videos. We'll start by showcasing the practical, no-code capabilities available right out of the box.
Exploring the Azure Vision Studio
We'll introduce the dedicated Azure Vision Studio, a user-friendly interface that lets you instantly test and visualize the service's various features—like image tagging, object detection, and optical character recognition (OCR)—without writing a single line of code. This provides a perfect foundation for understanding the core functionality before we move to development.
Hands-On Development with Python
The core of this talk focuses on programmatic development using the Python SDK. You'll learn the essentials of integrating Azure Computer Vision into your applications, starting with practical, step-by-step examples demonstrating how to send images to the service and interpret the AI-generated results. We'll then solidify this knowledge by diving into two distinct, real-world mini-projects built using Python and Streamlit for a quick and engaging user interface:
Project 1: Smart Image Renaming
This project presents an elegant solution to a common organizational challenge. We will use a Streamlit application to process a directory of images. By leveraging Azure Computer Vision's image captioning feature, the solution will automatically rename every image file based on the descriptive caption generated by the AI, turning cryptic file names into meaningful labels.
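The renaming flow described above could be sketched roughly as follows. This is a minimal illustration, not the talk's actual code: the proposal does not say which SDK package is used, so the sketch assumes `azure-ai-vision-imageanalysis`, and the helper names `make_azure_captioner`, `caption_to_stem`, and `rename_images` are hypothetical. The Streamlit UI is left out; the captioner is passed in as a plain function so the renaming logic can be tried without an Azure resource.

```python
import re
from pathlib import Path
from typing import Callable

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def make_azure_captioner(endpoint: str, key: str) -> Callable[[bytes], str]:
    """Return a function that captions image bytes (assumes azure-ai-vision-imageanalysis)."""
    # Imported lazily so the renaming logic below works without the SDK installed.
    from azure.ai.vision.imageanalysis import ImageAnalysisClient
    from azure.ai.vision.imageanalysis.models import VisualFeatures
    from azure.core.credentials import AzureKeyCredential

    client = ImageAnalysisClient(endpoint=endpoint, credential=AzureKeyCredential(key))

    def caption(image_bytes: bytes) -> str:
        result = client.analyze(
            image_data=image_bytes, visual_features=[VisualFeatures.CAPTION]
        )
        return result.caption.text if result.caption else ""

    return caption


def caption_to_stem(caption: str, max_len: int = 60) -> str:
    """Turn a caption into a lowercase, filesystem-safe file stem."""
    stem = re.sub(r"[^a-z0-9]+", "-", caption.lower()).strip("-")
    return stem[:max_len].rstrip("-") or "untitled"


def rename_images(folder: Path, captioner: Callable[[bytes], str]) -> dict[str, str]:
    """Rename each image in `folder` after its caption; return old name -> new name."""
    renamed = {}
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        stem = caption_to_stem(captioner(path.read_bytes()))
        suffix = path.suffix.lower()
        target = path.with_name(stem + suffix)
        counter = 2
        # Add a numeric suffix when two images receive the same caption.
        while target.exists() and target != path:
            target = path.with_name(f"{stem}-{counter}{suffix}")
            counter += 1
        path.rename(target)
        renamed[path.name] = target.name
    return renamed
```

With a real resource, the call would look like `rename_images(Path("photos"), make_azure_captioner(endpoint, key))`, where `endpoint` and `key` come from your own Azure resource.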
Project 2: Automated Image Classification and Segregation
We will build a Streamlit application that uses Azure Computer Vision's object detection and tagging capabilities to sort files by content. Given a collection of images (e.g., photos of cars, buildings, and street signs), the solution will segregate them automatically, placing each image into the matching, automatically created folder. This demonstrates how the service can automate large-scale content management.
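One possible shape for the sorting step is sketched below. It is an illustration under stated assumptions rather than the talk's implementation: it assumes the `azure-ai-vision-imageanalysis` package and uses only the tagging feature, and the names `make_azure_tagger`, `pick_category`, and `segregate`, the 0.5 confidence threshold, and the `unsorted` fallback folder are all hypothetical choices. The Streamlit layer is omitted.

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}
Tag = tuple[str, float]  # (tag name, confidence between 0 and 1)


def make_azure_tagger(endpoint: str, key: str) -> Callable[[bytes], list[Tag]]:
    """Return a function that tags image bytes (assumes azure-ai-vision-imageanalysis)."""
    # Imported lazily so the sorting logic below works without the SDK installed.
    from azure.ai.vision.imageanalysis import ImageAnalysisClient
    from azure.ai.vision.imageanalysis.models import VisualFeatures
    from azure.core.credentials import AzureKeyCredential

    client = ImageAnalysisClient(endpoint=endpoint, credential=AzureKeyCredential(key))

    def tags(image_bytes: bytes) -> list[Tag]:
        result = client.analyze(
            image_data=image_bytes, visual_features=[VisualFeatures.TAGS]
        )
        if result.tags is None:
            return []
        return [(tag.name, tag.confidence) for tag in result.tags.list]

    return tags


def pick_category(
    tags: Iterable[Tag],
    categories: Iterable[str],
    min_confidence: float = 0.5,
    fallback: str = "unsorted",
) -> str:
    """Pick the most confident tag that is one of the wanted categories."""
    wanted = {category.lower() for category in categories}
    matches = [
        (confidence, name.lower())
        for name, confidence in tags
        if name.lower() in wanted and confidence >= min_confidence
    ]
    return max(matches)[1] if matches else fallback


def segregate(
    folder: Path, tagger: Callable[[bytes], list[Tag]], categories: Iterable[str]
) -> dict[str, str]:
    """Move each image in `folder` into a subfolder named after its category."""
    categories = list(categories)
    placed = {}
    for path in sorted(folder.iterdir()):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        category = pick_category(tagger(path.read_bytes()), categories)
        destination = folder / category
        destination.mkdir(exist_ok=True)  # folders are created on demand
        shutil.move(str(path), str(destination / path.name))
        placed[path.name] = category
    return placed
```

Images whose tags match none of the requested categories land in the fallback folder instead of being skipped, which keeps the source directory empty after a run and makes misses easy to review.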
By the end of this talk, you'll have the foundational knowledge and concrete code examples to start building your own intelligent vision applications!
What format do you have in mind?
Talk (20-25 minutes + Q&A)
Talk outline / Agenda
1. Introduction & Foundational Concepts (5 min)
- 1.1. Welcome & Session Goal: Introduce Azure Computer Vision and its role in modern application development.
- 1.2. What is Azure Computer Vision?
  - AI-as-a-Service model for image and video analysis.
  - Key components: Classification, Detection, OCR, Captioning.
- 1.3. Why Vision Studio? Transitioning to the no-code environment.
2. Out-of-the-Box Power (10 min)
- 2.1. Guided Tour of Azure Vision Studio
  - How to quickly experiment and test models without code.
  - Understanding feature capabilities (e.g., detecting famous landmarks, content moderation).
- 2.2. Core Feature Walkthrough (Examples)
  - Image Tagging & Categorization: Automatically describing image content.
  - Object Detection: Locating and bounding specific items (cars, signs, etc.).
  - Optical Character Recognition (OCR): Extracting text from images/documents.
- 2.3. Key Takeaway: Demonstrating the ease of use and immediate insights.
3. Real-World Mini-Projects (10 min)
Project 1 Demo: Smart Image Renaming
- 3.1. The Problem: Managing a folder full of cryptic image file names.
- 3.2. The Solution: Using Streamlit and Azure's Image Captioning feature.
- 3.3. Demo: Live walk-through of the application, renaming files based on AI-generated descriptions.
Project 2 Demo: Automated Image Classification
- 3.4. The Problem: Automatically sorting large volumes of images based on content.
- 3.5. The Solution: Using Streamlit and Azure's Object Detection & Tagging.
- 3.6. Demo: Showing the app segregating images (cars, buildings, signs) into respective folders.
4. Conclusion (5 min)
- 4.1. Q&A and Discussion.
Key takeaways
- No-Code Mastery: How to immediately leverage Vision Studio for rapid prototyping and testing.
- Seamless Integration: Mastering the Python SDK to connect CV services to your custom applications.
- Real-World Automation: Applying AI for practical solutions like file renaming and auto-classification.
What domain would you say your talk falls under?
Web Development
Duration (including Q&A)
30 Minutes (20-25 minutes session + 5-10 minutes Q&A)
Prerequisites and preparation
- Basic Python programming: Familiarity with core Python fundamentals.
- API understanding: Basic knowledge of RESTful APIs and HTTP requests.
- Cloud exposure: Basic understanding of cloud services.
Resources and references
No response
Link to slides/demos (if available)
Twitter/X handle (optional)
LinkedIn profile (optional)
https://www.linkedin.com/in/vipulm124/
Profile picture URL (optional)
No response
Speaker bio
- Currently working as a Principal Consultant, focusing on backend development (Python) and Azure.
- I like building small, useful things that make life a bit easier.
- I’ve created a couple of Chrome extensions — Handy Links and Boring X — and also a few Python packages like qrstyler and csv-excel-azure.
- I’m ranked 192 on C# Corner, where I write about coding stuff.
- I also run a small YouTube channel where I share quick POCs, code demos, and Azure experiments.
- Currently working with the open-source community TeamShiksha on a couple of projects and mentoring freshers.
Availability
I'll be available for the 8th November tech session.
Accessibility & special requirements
- External display
- Internet
Speaker checklist
- I have read and understood the PyDelhi guidelines for submitting proposals and giving talks
- I will make my talk accessible to all attendees and will proactively ask for any accommodations or special requirements I might need
- I agree to share slides, code snippets, and other materials used during the talk with the community
- I will follow PyDelhi's Code of Conduct and maintain a welcoming, inclusive environment throughout my participation
- I understand that PyDelhi meetups are community-centric events focused on learning, knowledge sharing, and networking, and I will respect this ethos by not using this platform for self-promotion or hiring pitches during my presentation, unless explicitly invited to do so by means of a sponsorship or similar arrangement
- If the talk is recorded by the PyDelhi team, I grant permission to release the video on PyDelhi's YouTube channel under the CC-BY-4.0 license, or a different license of my choosing if I am specifying it in my proposal or with the materials I share
Additional comments
No response