Deprecate Nokogiri::HTML4::ElementDescription #3443

@flavorjones

(Originally posted as [RFC] Deprecate Nokogiri::HTML4::ElementDescription · sparklemotion/nokogiri · Discussion #3311)

Resolved

Deprecate the class Nokogiri::HTML4::ElementDescription in a 1.x release, and remove it in a 2.x release.

Reasoning

The two HTML4 parsing libraries that are used by Nokogiri -- nekohtml and libxml2 -- have always had different metadata about HTML elements [1], and this metadata has even changed over time [2].

In an upcoming release of libxml2, much of the metadata reported by that library will be removed [3].

All of this leads me to the conclusions that:

  1. the metadata is not reliable or usable today
  2. the metadata will continue to change over time
  3. the metadata will likely only get worse over time as upstream deprecates it

Short-term actions

I'm going to add deprecation warnings to the release of Nokogiri that packages libxml2 2.14.0 (not yet released as of this writing) for the particular facets that are already deprecated upstream:

  • implied_start_tag?
  • deprecated? (how meta)
  • sub_elements
  • default_sub_element
  • optional_attributes
  • deprecated_attributes (how meta)
  • required_attributes
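
To make the idea of per-facet deprecation warnings concrete, here is a minimal sketch of one way such warnings could be layered onto existing methods in plain Ruby. This is not Nokogiri's implementation: `ExampleElementDescription`, its return values, and the `deprecate_facet` helper are all hypothetical stand-ins, and the warning text is invented for illustration. Only the facet names come from the list above.

```ruby
# Hypothetical stand-in class; NOT Nokogiri's class or its real implementation.
class ExampleElementDescription
  # Facet names copied from the list in this issue.
  DEPRECATED_FACETS = %i[
    implied_start_tag? deprecated? sub_elements default_sub_element
    optional_attributes deprecated_attributes required_attributes
  ].freeze

  # Made-up data, just so the example has something to return.
  def name
    "ul"
  end

  def sub_elements
    ["li"]
  end

  # Hypothetical helper: wrap an existing method so that each call emits
  # a warning on stderr and then delegates to the original implementation.
  def self.deprecate_facet(method_name)
    original = instance_method(method_name)
    define_method(method_name) do
      warn("#{self.class}##{method_name} is deprecated and will be removed in a future release", uplevel: 1)
      original.bind(self).call
    end
  end

  # Only wrap the facets this example class actually defines.
  DEPRECATED_FACETS.each do |facet|
    deprecate_facet(facet) if method_defined?(facet)
  end
end
```

With this in place, `ExampleElementDescription.new.sub_elements` still returns its value but also writes a warning pointing at the caller's file and line, while `name` stays silent. Wrapping at the method level keeps existing callers working during the deprecation window.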

Long-term actions

I'd like to mark the entire class and all its methods as deprecated in a pre-2.0 release, and remove it entirely in a 2.0 release.

Are you surprised by this?

If you are a user of this API, I would really like to talk to you to understand your use case and help make this change easier.
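
If you're not sure whether your code touches this API, a rough text search can help. The sketch below is a heuristic, not an official tool: `find_usages` and `USAGE_PATTERN` are hypothetical names, and because several facet names are generic (`required_attributes`, for example), expect false positives from unrelated classes.

```ruby
# Facet names copied from the list in this issue.
DEPRECATED_FACETS = %w[
  implied_start_tag? deprecated? sub_elements default_sub_element
  optional_attributes deprecated_attributes required_attributes
].freeze

# Matches either the class name or a method call (".name") to one of the facets.
# The lookahead keeps ".sub_elements_foo" and similar longer names from matching.
USAGE_PATTERN = Regexp.union(
  /ElementDescription/,
  /\.(?:#{DEPRECATED_FACETS.map { |f| Regexp.escape(f) }.join("|")})(?![\w?])/
)

# Hypothetical helper: returns "file:line: code" strings for each suspicious line.
def find_usages(source, filename = "(string)")
  source.each_line.with_index(1).filter_map do |line, lineno|
    "#{filename}:#{lineno}: #{line.strip}" if line.match?(USAGE_PATTERN)
  end
end
```

To scan a project, call `find_usages(File.read(path), path)` for each path returned by `Dir.glob("**/*.rb")` and print the results. A hit only means the line is worth a look; it does not prove the receiver is an `ElementDescription`.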

Footnotes

  1. For example, 583f6f0

  2. For example, 277db2e

  3. https://gitlab.gnome.org/GNOME/libxml2/-/issues/758#note_2232549
