From 7d0980c9e3272843780a3064f794eca5a5e4f9ba Mon Sep 17 00:00:00 2001
From: justmhie
Date: Sun, 26 Oct 2025 11:12:35 +0800
Subject: [PATCH 1/3] DOC: Clarify groupby operates on axis 0 and remove 'selected axis' reference

This commit addresses issue #56397 by removing outdated references to "the selected axis" in groupby documentation and clarifying that:

1. DataFrame.groupby() always operates along axis 0 (rows)
2. The axis parameter was removed in pandas 3.0
3. To group by columns, users must transpose the DataFrame first

Changes:
- Updated API reference docstring in DataFrame.groupby() to replace "selected axis" with "number of rows"
- Enhanced user guide to explicitly state groupby operates on axis 0
- Added note explaining the removal of the axis parameter and the need to use .T for column-wise grouping

Fixes #56397
---
 doc/source/user_guide/groupby.rst | 10 ++++++++--
 pandas/core/frame.py              |  2 +-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/doc/source/user_guide/groupby.rst b/doc/source/user_guide/groupby.rst
index 4ec34db6ed959..07bb2d104f851 100644
--- a/doc/source/user_guide/groupby.rst
+++ b/doc/source/user_guide/groupby.rst
@@ -137,8 +137,9 @@ We could naturally group by either the ``A`` or ``B`` columns, or both:
 
 ``df.groupby('A')`` is just syntactic sugar for ``df.groupby(df['A'])``.
 
-The above GroupBy will split the DataFrame on its index (rows). To split by columns, first do
-a transpose:
+The above GroupBy will split the DataFrame on its index (rows). DataFrame groupby 
+always operates along axis 0 (rows). To split by columns instead, first transpose 
+the DataFrame:
 
 .. ipython::
 
@@ -151,6 +152,11 @@ a transpose:
 
     In [5]: grouped = df.T.groupby(get_letter_type)
 
+.. note::
+
+    Prior to pandas 3.0, groupby had an ``axis`` parameter. This has been removed.
+    To group by columns, transpose your DataFrame using ``.T`` before calling groupby.
+
 pandas :class:`~pandas.Index` objects support duplicate values. If a
 non-unique index is used as the group key in a groupby operation, all values
 for the same index value will be considered to be in one group and thus the
diff --git a/pandas/core/frame.py b/pandas/core/frame.py
index dc81610d220bc..c2a7563b659d6 100644
--- a/pandas/core/frame.py
+++ b/pandas/core/frame.py
@@ -9432,7 +9432,7 @@ def groupby(
             index. If a dict or Series is passed, the Series or dict VALUES
             will be used to determine the groups (the Series' values are first
             aligned; see ``.align()`` method). If a list or ndarray of length
-            equal to the selected axis is passed (see the `groupby user guide
+            equal to the number of rows is passed (see the `groupby user guide
             `_),
             the values are used as-is to determine the groups. A label or list
             of labels may be passed to group by the columns in ``self``.

From 6bfdbde809e5c568694be1b1a6df851fb3656e41 Mon Sep 17 00:00:00 2001
From: justmhie
Date: Sun, 26 Oct 2025 13:38:45 +0800
Subject: [PATCH 2/3] DOC: Fix trailing whitespace in groupby.rst

---
 doc/source/user_guide/groupby.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/source/user_guide/groupby.rst b/doc/source/user_guide/groupby.rst
index 07bb2d104f851..c5b868bc743b6 100644
--- a/doc/source/user_guide/groupby.rst
+++ b/doc/source/user_guide/groupby.rst
@@ -137,8 +137,8 @@ We could naturally group by either the ``A`` or ``B`` columns, or both:
 
 ``df.groupby('A')`` is just syntactic sugar for ``df.groupby(df['A'])``.
 
-The above GroupBy will split the DataFrame on its index (rows). DataFrame groupby 
-always operates along axis 0 (rows). To split by columns instead, first transpose 
+The above GroupBy will split the DataFrame on its index (rows). DataFrame groupby
+always operates along axis 0 (rows). To split by columns instead, first transpose
 the DataFrame:
 
 .. ipython::

From 3635d814c5a2f65a368f26bcee32b08d9642f821 Mon Sep 17 00:00:00 2001
From: justmhie
Date: Tue, 28 Oct 2025 22:21:46 +0800
Subject: [PATCH 3/3] DOC: Simplify groupby docs per reviewer feedback

- Combine redundant sentences about axis 0
- Keep original transpose wording
- Remove redundant note block
---
 doc/source/user_guide/groupby.rst | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/doc/source/user_guide/groupby.rst b/doc/source/user_guide/groupby.rst
index c5b868bc743b6..40369bd40cdb5 100644
--- a/doc/source/user_guide/groupby.rst
+++ b/doc/source/user_guide/groupby.rst
@@ -137,9 +137,8 @@ We could naturally group by either the ``A`` or ``B`` columns, or both:
 
 ``df.groupby('A')`` is just syntactic sugar for ``df.groupby(df['A'])``.
 
-The above GroupBy will split the DataFrame on its index (rows). DataFrame groupby
-always operates along axis 0 (rows). To split by columns instead, first transpose
-the DataFrame:
+DataFrame groupby always operates along axis 0 (rows). To split by columns, first do
+a transpose:
 
 .. ipython::
 
@@ -152,11 +151,6 @@ the DataFrame:
 
     In [5]: grouped = df.T.groupby(get_letter_type)
 
-.. note::
-
-    Prior to pandas 3.0, groupby had an ``axis`` parameter. This has been removed.
-    To group by columns, transpose your DataFrame using ``.T`` before calling groupby.
-
 pandas :class:`~pandas.Index` objects support duplicate values. If a
 non-unique index is used as the group key in a groupby operation, all values
 for the same index value will be considered to be in one group and thus the