From 894bc572037d72431ebf5d75c062c0b0fb20dadf Mon Sep 17 00:00:00 2001
From: carlosmolina2007-gif <carlosmolina2007@hotmail.com>
Date: Mon, 25 Aug 2025 20:27:31 -0500
Subject: [PATCH] Created with Colab
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 LiteralA-Grupo-9.ipynb | 1005 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1005 insertions(+)
 create mode 100644 LiteralA-Grupo-9.ipynb

diff --git a/LiteralA-Grupo-9.ipynb b/LiteralA-Grupo-9.ipynb
new file mode 100644
index 0000000..7b60bb3
--- /dev/null
+++ b/LiteralA-Grupo-9.ipynb
@@ -0,0 +1,1005 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "provenance": [],
+      "authorship_tag": "ABX9TyMEdnFiwn2ZS6PxZp5kBQVs"
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    },
+    "language_info": {
+      "name": "python"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "source": [
+        "**Group 9 Practice - Following the recommendations, we chose a website other than the one in the example and carried out the activity on it.**\n",
+        "**http://books.toscrape.com**\n",
+        "\n",
+        "1. This section installs the packages we will use.\n",
+        "\n"
+      ],
+      "metadata": {
+        "id": "ZfReJrK77xQs"
+      }
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 1,
+      "metadata": {
+        "id": "2_M3acf55VPn"
+      },
+      "outputs": [],
+      "source": [
+        "%pip -q install requests beautifulsoup4 lxml"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Section for importing the packages:\n",
+        "1. requests lets me make the HTTP request.\n",
+        "2. BeautifulSoup parses the HTML into a tree that I can traverse.\n",
+        "3. time is used to pause between requests so the scraper does not overload the site.\n",
+        "4. urljoin builds absolute URLs when the page contains relative links."
+      ],
+      "metadata": {
+        "id": "CRnShOTg7SI3"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "import time, math, re\n",
+        "import requests\n",
+        "from urllib.parse import urljoin\n",
+        "from bs4 import BeautifulSoup"
+      ],
+      "metadata": {
+        "id": "3luOLpTy5mgF"
+      },
+      "execution_count": 3,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Here I learned how to download the HTML of a web page.\n",
+        "I use requests.get(URL) to request the page from the server, and then .text gives me the HTML as a string."
+      ],
+      "metadata": {
+        "id": "TxOgH6o07-BE"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "BASE = \"http://books.toscrape.com/\"\n",
+        "headers = {\"User-Agent\": \"Mozilla/5.0\"}\n",
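+        "resp = requests.get(BASE, headers=headers, timeout=20)\n",
+        "resp.encoding = \"utf-8\"  # the page is UTF-8; with requests' fallback decoding, '£' comes out as 'Â£'\n",
+        "print(\"status:\", resp.status_code, \"| bytes:\", len(resp.content))\n",
+        "src = resp.text"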
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "nC0gXWjL5pgo",
+        "outputId": "15202c11-aeed-4d20-d865-9a85cadac16a"
+      },
+      "execution_count": 4,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "status: 200 | bytes: 51294\n"
+          ]
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "The downloaded HTML is just plain text. BeautifulSoup turns it into an object called soup, which lets us navigate easily by tags, attributes, and classes."
+      ],
+      "metadata": {
+        "id": "3oPfd-Us8Gjr"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "soup = BeautifulSoup(src, \"lxml\")\n",
+        "print(soup.title.text)  # page title"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "yFCagT3Y5sNI",
+        "outputId": "15990207-ea86-44f7-a877-5ff5e80f9564"
+      },
+      "execution_count": 5,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "\n",
+            "    All products | Books to Scrape - Sandbox\n",
+            "\n"
+          ]
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "One of the most important things I learned is how to search for tags in the HTML: .select(\"article.product_pod\") finds every book article on the page.\n",
+        "Each article holds the title, price, rating, and link in different tags."
+      ],
+      "metadata": {
+        "id": "H0DhjDyY8TDJ"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# By tag + class\n",
+        "pods = soup.select(\"article.product_pod\")\n",
+        "len(pods), pods[0]"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "a5Bq1_ZW5u_U",
+        "outputId": "6015a237-fc28-4db9-c705-ec199a0c0123"
+      },
+      "execution_count": 6,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "(20,\n",
+              " <article class=\"product_pod\">\n",
+              " <div class=\"image_container\">\n",
+              " <a href=\"catalogue/a-light-in-the-attic_1000/index.html\"><img alt=\"A Light in the Attic\" class=\"thumbnail\" src=\"media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg\"/></a>\n",
+              " </div>\n",
+              " <p class=\"star-rating Three\">\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " </p>\n",
+              " <h3><a href=\"catalogue/a-light-in-the-attic_1000/index.html\" title=\"A Light in the Attic\">A Light in the ...</a></h3>\n",
+              " <div class=\"product_price\">\n",
+              " <p class=\"price_color\">£51.77</p>\n",
+              " <p class=\"instock availability\">\n",
+              " <i class=\"icon-ok\"></i>\n",
+              "     \n",
+              "         In stock\n",
+              "     \n",
+              " </p>\n",
+              " <form>\n",
+              " <button class=\"btn btn-primary btn-block\" data-loading-text=\"Adding...\" type=\"submit\">Add to basket</button>\n",
+              " </form>\n",
+              " </div>\n",
+              " </article>)"
+            ]
+          },
+          "metadata": {},
+          "execution_count": 6
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "links = pods[0].select(\"h3 a\")\n",
+        "links[0]"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "UcMSwMIo5yNC",
+        "outputId": "54e1f594-2ad0-400e-d7d0-fccc030b8549"
+      },
+      "execution_count": 7,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "<a href=\"catalogue/a-light-in-the-attic_1000/index.html\" title=\"A Light in the Attic\">A Light in the ...</a>"
+            ]
+          },
+          "metadata": {},
+          "execution_count": 7
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "soup.select(\"article.product_pod p.price_color\")[:3]"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "d0svdSkp51qq",
+        "outputId": "3d750e01-b228-41eb-da78-550f44240533"
+      },
+      "execution_count": 8,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "[<p class=\"price_color\">£51.77</p>,\n",
+              " <p class=\"price_color\">£53.74</p>,\n",
+              " <p class=\"price_color\">£50.10</p>]"
+            ]
+          },
+          "metadata": {},
+          "execution_count": 8
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "soup.select(\"article.product_pod p.star-rating\")[:3]"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "CX_bQFp-54Sv",
+        "outputId": "1f363fef-feef-4151-eeab-87f3895cfd44"
+      },
+      "execution_count": 9,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "[<p class=\"star-rating Three\">\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " </p>,\n",
+              " <p class=\"star-rating One\">\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " </p>,\n",
+              " <p class=\"star-rating One\">\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " <i class=\"icon-star\"></i>\n",
+              " </p>]"
+            ]
+          },
+          "metadata": {},
+          "execution_count": 9
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Once I had found the tags, I learned how to extract the information they contain:\n",
+        "\n",
+        "tag.get_text(strip=True) returns the text inside a tag, with surrounding whitespace stripped.\n",
+        "\n",
+        "tag[\"attribute\"] accesses a specific attribute, such as a link's href.\n",
+        "\n",
+        "I also learned that some HTML classes can themselves carry values (e.g. the rating)."
+      ],
+      "metadata": {
+        "id": "45sRp5m38j-g"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "pod = pods[0]\n",
+        "\n",
+        "# the title is in the 'title' attribute of the <a>\n",
+        "title = pod.select_one(\"h3 a\")[\"title\"]\n",
+        "\n",
+        "# the price is in p.price_color (e.g. '£51.77', currency symbol included)\n",
+        "price = pod.select_one(\"p.price_color\").get_text(strip=True)\n",
+        "\n",
+        "# rating: second class of p.star-rating (e.g. ['star-rating', 'Three'] → 'Three')\n",
+        "rating = pod.select_one(\"p.star-rating\")[\"class\"][1]\n",
+        "\n",
+        "# relative link → absolute\n",
+        "href = pod.select_one(\"h3 a\")[\"href\"]\n",
+        "url  = urljoin(BASE, href)\n",
+        "\n",
+        "title, price, rating, url"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "S1lkc9jx58WY",
+        "outputId": "700b04b5-139f-4e3d-fd87-f38b19223210"
+      },
+      "execution_count": 10,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "('A Light in the Attic',\n",
+              " '£51.77',\n",
+              " 'Three',\n",
+              " 'http://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html')"
+            ]
+          },
+          "metadata": {},
+          "execution_count": 10
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "**Challenge 1 – Extract all the books on the page**"
+      ],
+      "metadata": {
+        "id": "otWYDGnX9HCp"
+      }
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Here I practiced writing a for loop that walks through every book on the page and stores each one as a tuple (title, price, rating, url)."
+      ],
+      "metadata": {
+        "id": "RgUc7NM69Lnb"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "books = []\n",
+        "for pod in soup.select(\"article.product_pod\"):\n",
+        "    title  = pod.select_one(\"h3 a\")[\"title\"]\n",
+        "    price  = pod.select_one(\"p.price_color\").get_text(strip=True)\n",
+        "    rating = pod.select_one(\"p.star-rating\")[\"class\"][1]\n",
+        "    href   = pod.select_one(\"h3 a\")[\"href\"]\n",
+        "    url    = urljoin(BASE, href)\n",
+        "    books.append((title, price, rating, url))\n",
+        "\n",
+        "len(books), books[:3]"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "fSA05cSK6IDa",
+        "outputId": "94d3ad1f-bf28-4505-ad2d-0654d4729325"
+      },
+      "execution_count": 11,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "(20,\n",
+              " [('A Light in the Attic',\n",
+              "   '£51.77',\n",
+              "   'Three',\n",
+              "   'http://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html'),\n",
+              "  ('Tipping the Velvet',\n",
+              "   '£53.74',\n",
+              "   'One',\n",
+              "   'http://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html'),\n",
+              "  ('Soumission',\n",
+              "   '£50.10',\n",
+              "   'One',\n",
+              "   'http://books.toscrape.com/catalogue/soumission_998/index.html')])"
+            ]
+          },
+          "metadata": {},
+          "execution_count": 11
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "**Challenge 2 – The get_books function**"
+      ],
+      "metadata": {
+        "id": "vXZgWawx9Q1m"
+      }
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Next I learned to modularize the code, that is, to wrap it in a function so it can be reused on different pages."
+      ],
+      "metadata": {
+        "id": "MZgWkSZ69Vuy"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "def get_books(page_url: str):\n",
+        "    \"\"\"Scrape one BooksToScrape listing page and return [(title, price, rating, url), ...].\"\"\"\n",
+        "    r = requests.get(page_url, headers=headers, timeout=20)\n",
+        "    s = BeautifulSoup(r.content, \"lxml\")  # pass raw bytes so the parser detects UTF-8 and keeps '£' intact\n",
+        "    out = []\n",
+        "    for pod in s.select(\"article.product_pod\"):\n",
+        "        title  = pod.select_one(\"h3 a\")[\"title\"]\n",
+        "        price  = pod.select_one(\"p.price_color\").get_text(strip=True)\n",
+        "        rating = pod.select_one(\"p.star-rating\")[\"class\"][1]\n",
+        "        href   = pod.select_one(\"h3 a\")[\"href\"]\n",
+        "        url    = urljoin(page_url, href)\n",
+        "        out.append((title, price, rating, url))\n",
+        "    return out\n",
+        "\n",
+        "test = get_books(BASE)\n",
+        "len(test), test[0]"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "XCt4Mx1M6M2Z",
+        "outputId": "64956bf6-9794-4e3a-a4f3-916092a07f0f"
+      },
+      "execution_count": 12,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "(20,\n",
+              " ('A Light in the Attic',\n",
+              "  '£51.77',\n",
+              "  'Three',\n",
+              "  'http://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html'))"
+            ]
+          },
+          "metadata": {},
+          "execution_count": 12
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "**Challenge 3 – Follow pagination (all pages)**"
+      ],
+      "metadata": {
+        "id": "d5WGO3fT9epn"
+      }
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Here I learned that reading only the first page is not enough. Many sites split their listings across several pages, and we have to find the Next button to continue.\n",
+        "The trick is:\n",
+        "\n",
+        "1. Look for li.next a in the HTML.\n",
+        "\n",
+        "2. Use urljoin to build the next page's URL.\n",
+        "\n",
+        "3. Repeat until there are no more pages."
+      ],
+      "metadata": {
+        "id": "sKyY5CEi9hOT"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "def get_all_books(start_url: str, delay=0.5):\n",
+        "    books = []\n",
+        "    url = start_url\n",
+        "    while True:\n",
+        "        r = requests.get(url, headers=headers, timeout=20)\n",
+        "        s = BeautifulSoup(r.content, \"lxml\")  # pass raw bytes so the parser detects UTF-8 and keeps '£' intact\n",
+        "        # collect the books on the current page\n",
+        "        for pod in s.select(\"article.product_pod\"):\n",
+        "            title  = pod.select_one(\"h3 a\")[\"title\"]\n",
+        "            price  = pod.select_one(\"p.price_color\").get_text(strip=True)\n",
+        "            rating = pod.select_one(\"p.star-rating\")[\"class\"][1]\n",
+        "            href   = pod.select_one(\"h3 a\")[\"href\"]\n",
+        "            absurl = urljoin(url, href)\n",
+        "            books.append((title, price, rating, absurl))\n",
+        "        # is there a next page?\n",
+        "        nxt = s.select_one(\"li.next a\")\n",
+        "        if not nxt:\n",
+        "            break\n",
+        "        url = urljoin(url, nxt[\"href\"])\n",
+        "        time.sleep(delay)\n",
+        "    return books\n",
+        "\n",
+        "all_books = get_all_books(BASE)  # ~1000 books in total\n",
+        "len(all_books), all_books[:3]"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "F3ELLvlX6S_T",
+        "outputId": "40057320-502f-4575-8474-34733cb7982a"
+      },
+      "execution_count": 13,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "(1000,\n",
+              " [('A Light in the Attic',\n",
+              "   '£51.77',\n",
+              "   'Three',\n",
+              "   'http://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html'),\n",
+              "  ('Tipping the Velvet',\n",
+              "   '£53.74',\n",
+              "   'One',\n",
+              "   'http://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html'),\n",
+              "  ('Soumission',\n",
+              "   '£50.10',\n",
+              "   'One',\n",
+              "   'http://books.toscrape.com/catalogue/soumission_998/index.html')])"
+            ]
+          },
+          "metadata": {},
+          "execution_count": 13
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "**Challenge 4 – Extract by category**"
+      ],
+      "metadata": {
+        "id": "9wc8lFeI9vaP"
+      }
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Finally, I learned that the scraping can also be organized by category, since the site has a sidebar menu with a link for each book genre."
+      ],
+      "metadata": {
+        "id": "NpN4lATX9xvm"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# 1) get the category URLs\n",
+        "r = requests.get(BASE, headers=headers, timeout=20)\n",
+        "s = BeautifulSoup(r.text, \"lxml\")\n",
+        "cat_links = [(a.get_text(strip=True), urljoin(BASE, a[\"href\"]))\n",
+        "             for a in s.select(\"ul.nav-list a\") if a.get(\"href\")]\n",
+        "cat_links[:5]"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "G7HUFgw26eRf",
+        "outputId": "504eff9d-7729-4b39-fc7e-4010502a277b"
+      },
+      "execution_count": 14,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "[('Books', 'http://books.toscrape.com/catalogue/category/books_1/index.html'),\n",
+              " ('Travel',\n",
+              "  'http://books.toscrape.com/catalogue/category/books/travel_2/index.html'),\n",
+              " ('Mystery',\n",
+              "  'http://books.toscrape.com/catalogue/category/books/mystery_3/index.html'),\n",
+              " ('Historical Fiction',\n",
+              "  'http://books.toscrape.com/catalogue/category/books/historical-fiction_4/index.html'),\n",
+              " ('Sequential Art',\n",
+              "  'http://books.toscrape.com/catalogue/category/books/sequential-art_5/index.html')]"
+            ]
+          },
+          "metadata": {},
+          "execution_count": 14
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# 2) dictionary: category -> first N books\n",
+        "def get_books_by_category(limit_per_cat=30, delay=0.5):\n",
+        "    res = {}\n",
+        "    for cat_name, cat_url in cat_links:\n",
+        "        books = get_all_books(cat_url, delay=delay)\n",
+        "        res[cat_name] = books[:limit_per_cat]\n",
+        "        time.sleep(delay)\n",
+        "    return res\n",
+        "\n",
+        "buckets = get_books_by_category(limit_per_cat=10, delay=0.2)\n",
+        "list(buckets.keys())[:5], len(buckets)"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "PqCCdUjL6jKs",
+        "outputId": "d8874e3c-26bc-4444-d89d-a78ea2a88d01"
+      },
+      "execution_count": 15,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "(['Books', 'Travel', 'Mystery', 'Historical Fiction', 'Sequential Art'], 51)"
+            ]
+          },
+          "metadata": {},
+          "execution_count": 15
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "import pandas as pd\n",
+        "df = pd.DataFrame(all_books, columns=[\"title\",\"price\",\"rating\",\"url\"])\n",
+        "df.head()\n",
+        "\n",
+        "# df.to_csv(\"books.csv\", index=False)"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 206
+        },
+        "id": "8lJoA4gb6uXN",
+        "outputId": "8d796f3b-fdcd-4a4a-d32c-6145bfc16512"
+      },
+      "execution_count": 16,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "                                   title   price rating  \\\n",
+              "0                   A Light in the Attic  £51.77  Three   \n",
+              "1                     Tipping the Velvet  £53.74    One   \n",
+              "2                             Soumission  £50.10    One   \n",
+              "3                          Sharp Objects  £47.82   Four   \n",
+              "4  Sapiens: A Brief History of Humankind  £54.23   Five   \n",
+              "\n",
+              "                                                 url  \n",
+              "0  http://books.toscrape.com/catalogue/a-light-in...  \n",
+              "1  http://books.toscrape.com/catalogue/tipping-th...  \n",
+              "2  http://books.toscrape.com/catalogue/soumission...  \n",
+              "3  http://books.toscrape.com/catalogue/sharp-obje...  \n",
+              "4  http://books.toscrape.com/catalogue/sapiens-a-...  "
+            ],
+            "text/html": [
+              "\n",
+              "  <div id=\"df-2dc07230-37e3-4a4d-96b7-786a857bbccc\" class=\"colab-df-container\">\n",
+              "    <div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>title</th>\n",
+              "      <th>price</th>\n",
+              "      <th>rating</th>\n",
+              "      <th>url</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>0</th>\n",
+              "      <td>A Light in the Attic</td>\n",
+              "      <td>£51.77</td>\n",
+              "      <td>Three</td>\n",
+              "      <td>http://books.toscrape.com/catalogue/a-light-in...</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>1</th>\n",
+              "      <td>Tipping the Velvet</td>\n",
+              "      <td>£53.74</td>\n",
+              "      <td>One</td>\n",
+              "      <td>http://books.toscrape.com/catalogue/tipping-th...</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>2</th>\n",
+              "      <td>Soumission</td>\n",
+              "      <td>£50.10</td>\n",
+              "      <td>One</td>\n",
+              "      <td>http://books.toscrape.com/catalogue/soumission...</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>3</th>\n",
+              "      <td>Sharp Objects</td>\n",
+              "      <td>£47.82</td>\n",
+              "      <td>Four</td>\n",
+              "      <td>http://books.toscrape.com/catalogue/sharp-obje...</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>4</th>\n",
+              "      <td>Sapiens: A Brief History of Humankind</td>\n",
+              "      <td>£54.23</td>\n",
+              "      <td>Five</td>\n",
+              "      <td>http://books.toscrape.com/catalogue/sapiens-a-...</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>\n",
+              "    <div class=\"colab-df-buttons\">\n",
+              "\n",
+              "  <div class=\"colab-df-container\">\n",
+              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-2dc07230-37e3-4a4d-96b7-786a857bbccc')\"\n",
+              "            title=\"Convert this dataframe to an interactive table.\"\n",
+              "            style=\"display:none;\">\n",
+              "\n",
+              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
+              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
+              "  </svg>\n",
+              "    </button>\n",
+              "\n",
+              "  <style>\n",
+              "    .colab-df-container {\n",
+              "      display:flex;\n",
+              "      gap: 12px;\n",
+              "    }\n",
+              "\n",
+              "    .colab-df-convert {\n",
+              "      background-color: #E8F0FE;\n",
+              "      border: none;\n",
+              "      border-radius: 50%;\n",
+              "      cursor: pointer;\n",
+              "      display: none;\n",
+              "      fill: #1967D2;\n",
+              "      height: 32px;\n",
+              "      padding: 0 0 0 0;\n",
+              "      width: 32px;\n",
+              "    }\n",
+              "\n",
+              "    .colab-df-convert:hover {\n",
+              "      background-color: #E2EBFA;\n",
+              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
+              "      fill: #174EA6;\n",
+              "    }\n",
+              "\n",
+              "    .colab-df-buttons div {\n",
+              "      margin-bottom: 4px;\n",
+              "    }\n",
+              "\n",
+              "    [theme=dark] .colab-df-convert {\n",
+              "      background-color: #3B4455;\n",
+              "      fill: #D2E3FC;\n",
+              "    }\n",
+              "\n",
+              "    [theme=dark] .colab-df-convert:hover {\n",
+              "      background-color: #434B5C;\n",
+              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
+              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
+              "      fill: #FFFFFF;\n",
+              "    }\n",
+              "  </style>\n",
+              "\n",
+              "    <script>\n",
+              "      const buttonEl =\n",
+              "        document.querySelector('#df-2dc07230-37e3-4a4d-96b7-786a857bbccc button.colab-df-convert');\n",
+              "      buttonEl.style.display =\n",
+              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
+              "\n",
+              "      async function convertToInteractive(key) {\n",
+              "        const element = document.querySelector('#df-2dc07230-37e3-4a4d-96b7-786a857bbccc');\n",
+              "        const dataTable =\n",
+              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
+              "                                                    [key], {});\n",
+              "        if (!dataTable) return;\n",
+              "\n",
+              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
+              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
+              "          + ' to learn more about interactive tables.';\n",
+              "        element.innerHTML = '';\n",
+              "        dataTable['output_type'] = 'display_data';\n",
+              "        await google.colab.output.renderOutput(dataTable, element);\n",
+              "        const docLink = document.createElement('div');\n",
+              "        docLink.innerHTML = docLinkHtml;\n",
+              "        element.appendChild(docLink);\n",
+              "      }\n",
+              "    </script>\n",
+              "  </div>\n",
+              "\n",
+              "\n",
+              "    <div id=\"df-76a03c54-fdf0-42fc-a2f9-2cc075928565\">\n",
+              "      <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-76a03c54-fdf0-42fc-a2f9-2cc075928565')\"\n",
+              "                title=\"Suggest charts\"\n",
+              "                style=\"display:none;\">\n",
+              "\n",
+              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
+              "     width=\"24px\">\n",
+              "    <g>\n",
+              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
+              "    </g>\n",
+              "</svg>\n",
+              "      </button>\n",
+              "\n",
+              "<style>\n",
+              "  .colab-df-quickchart {\n",
+              "      --bg-color: #E8F0FE;\n",
+              "      --fill-color: #1967D2;\n",
+              "      --hover-bg-color: #E2EBFA;\n",
+              "      --hover-fill-color: #174EA6;\n",
+              "      --disabled-fill-color: #AAA;\n",
+              "      --disabled-bg-color: #DDD;\n",
+              "  }\n",
+              "\n",
+              "  [theme=dark] .colab-df-quickchart {\n",
+              "      --bg-color: #3B4455;\n",
+              "      --fill-color: #D2E3FC;\n",
+              "      --hover-bg-color: #434B5C;\n",
+              "      --hover-fill-color: #FFFFFF;\n",
+              "      --disabled-bg-color: #3B4455;\n",
+              "      --disabled-fill-color: #666;\n",
+              "  }\n",
+              "\n",
+              "  .colab-df-quickchart {\n",
+              "    background-color: var(--bg-color);\n",
+              "    border: none;\n",
+              "    border-radius: 50%;\n",
+              "    cursor: pointer;\n",
+              "    display: none;\n",
+              "    fill: var(--fill-color);\n",
+              "    height: 32px;\n",
+              "    padding: 0;\n",
+              "    width: 32px;\n",
+              "  }\n",
+              "\n",
+              "  .colab-df-quickchart:hover {\n",
+              "    background-color: var(--hover-bg-color);\n",
+              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
+              "    fill: var(--hover-fill-color);\n",
+              "  }\n",
+              "\n",
+              "  .colab-df-quickchart-complete:disabled,\n",
+              "  .colab-df-quickchart-complete:disabled:hover {\n",
+              "    background-color: var(--disabled-bg-color);\n",
+              "    fill: var(--disabled-fill-color);\n",
+              "    box-shadow: none;\n",
+              "  }\n",
+              "\n",
+              "  .colab-df-spinner {\n",
+              "    border: 2px solid var(--fill-color);\n",
+              "    border-color: transparent;\n",
+              "    border-bottom-color: var(--fill-color);\n",
+              "    animation:\n",
+              "      spin 1s steps(1) infinite;\n",
+              "  }\n",
+              "\n",
+              "  @keyframes spin {\n",
+              "    0% {\n",
+              "      border-color: transparent;\n",
+              "      border-bottom-color: var(--fill-color);\n",
+              "      border-left-color: var(--fill-color);\n",
+              "    }\n",
+              "    20% {\n",
+              "      border-color: transparent;\n",
+              "      border-left-color: var(--fill-color);\n",
+              "      border-top-color: var(--fill-color);\n",
+              "    }\n",
+              "    30% {\n",
+              "      border-color: transparent;\n",
+              "      border-left-color: var(--fill-color);\n",
+              "      border-top-color: var(--fill-color);\n",
+              "      border-right-color: var(--fill-color);\n",
+              "    }\n",
+              "    40% {\n",
+              "      border-color: transparent;\n",
+              "      border-right-color: var(--fill-color);\n",
+              "      border-top-color: var(--fill-color);\n",
+              "    }\n",
+              "    60% {\n",
+              "      border-color: transparent;\n",
+              "      border-right-color: var(--fill-color);\n",
+              "    }\n",
+              "    80% {\n",
+              "      border-color: transparent;\n",
+              "      border-right-color: var(--fill-color);\n",
+              "      border-bottom-color: var(--fill-color);\n",
+              "    }\n",
+              "    90% {\n",
+              "      border-color: transparent;\n",
+              "      border-bottom-color: var(--fill-color);\n",
+              "    }\n",
+              "  }\n",
+              "</style>\n",
+              "\n",
+              "      <script>\n",
+              "        async function quickchart(key) {\n",
+              "          const quickchartButtonEl =\n",
+              "            document.querySelector('#' + key + ' button');\n",
+              "          quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
+              "          quickchartButtonEl.classList.add('colab-df-spinner');\n",
+              "          try {\n",
+              "            const charts = await google.colab.kernel.invokeFunction(\n",
+              "                'suggestCharts', [key], {});\n",
+              "          } catch (error) {\n",
+              "            console.error('Error during call to suggestCharts:', error);\n",
+              "          }\n",
+              "          quickchartButtonEl.classList.remove('colab-df-spinner');\n",
+              "          quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
+              "        }\n",
+              "        (() => {\n",
+              "          let quickchartButtonEl =\n",
+              "            document.querySelector('#df-76a03c54-fdf0-42fc-a2f9-2cc075928565 button');\n",
+              "          quickchartButtonEl.style.display =\n",
+              "            google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
+              "        })();\n",
+              "      </script>\n",
+              "    </div>\n",
+              "\n",
+              "    </div>\n",
+              "  </div>\n"
+            ],
+            "application/vnd.google.colaboratory.intrinsic+json": {
+              "type": "dataframe",
+              "summary": "{\n  \"name\": \"# df\",\n  \"rows\": 5,\n  \"fields\": [\n    {\n      \"column\": \"title\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 5,\n        \"samples\": [\n          \"Tipping the Velvet\",\n          \"Sapiens: A Brief History of Humankind\",\n          \"Soumission\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    },\n    {\n      \"column\": \"price\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 5,\n        \"samples\": [\n          \"\\u00c2\\u00a353.74\",\n          \"\\u00c2\\u00a354.23\",\n          \"\\u00c2\\u00a350.10\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    },\n    {\n      \"column\": \"rating\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 4,\n        \"samples\": [\n          \"One\",\n          \"Five\",\n          \"Three\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    },\n    {\n      \"column\": \"url\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 5,\n        \"samples\": [\n          \"http://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html\",\n          \"http://books.toscrape.com/catalogue/sapiens-a-brief-history-of-humankind_996/index.html\",\n          \"http://books.toscrape.com/catalogue/soumission_998/index.html\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    }\n  ]\n}"
+            }
+          },
+          "metadata": {},
+          "execution_count": 16
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Conclusion: For this practice I chose a different web page so I could try other scenarios, and I learned how to go from a web page in plain HTML to a Python structure that I can traverse and analyze. With requests I fetched the content, with BeautifulSoup I turned it into something navigable, and then I learned to find tags, extract attributes, iterate over pages, and modularize my code into functions."
+      ],
+      "metadata": {
+        "id": "i3cPlfNb97OG"
+      }
+    }
+  ]
+}
\ No newline at end of file