Skip to content

docs: add example to dataframe.nlargest, dataframe.nsmallest, datafra… #234

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Dec 11, 2023
151 changes: 147 additions & 4 deletions third_party/bigframes_vendored/pandas/core/frame.py
Original file line number Diff line number Diff line change
Expand Up @@ -3324,6 +3324,58 @@ def nlargest(self, n: int, columns, keep: str = "first"):
``df.sort_values(columns, ascending=False).head(n)``, but more
performant.

**Examples:**

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None

>>> df = bpd.DataFrame({"A": [1, 1, 3, 3, 5, 5],
... "B": [5, 6, 3, 4, 1, 2],
... "C": ['a', 'b', 'a', 'b', 'a', 'b']})
>>> df
A B C
0 1 5 a
1 1 6 b
2 3 3 a
3 3 4 b
4 5 1 a
5 5 2 b
<BLANKLINE>
[6 rows x 3 columns]

Returns rows with the largest value in 'A', including all ties:

>>> df.nlargest(1, 'A', keep = "all")
A B C
4 5 1 a
5 5 2 b
<BLANKLINE>
[2 rows x 3 columns]

Returns the first row with the largest value in 'A', default behavior in case of ties:

>>> df.nlargest(1, 'A')
A B C
4 5 1 a
<BLANKLINE>
[1 rows x 3 columns]

Returns the last row with the largest value in 'A' in case of ties:

>>> df.nlargest(1, 'A', keep = "last")
A B C
5 5 2 b
<BLANKLINE>
[1 rows x 3 columns]

Returns the row with the largest combined values in both 'A' and 'C':

>>> df.nlargest(1, ['A', 'C'])
A B C
5 5 2 b
<BLANKLINE>
[1 rows x 3 columns]

Args:
n (int):
Number of rows to return.
Expand Down Expand Up @@ -3359,6 +3411,59 @@ def nsmallest(self, n: int, columns, keep: str = "first"):
``df.sort_values(columns, ascending=True).head(n)``, but more
performant.

**Examples:**

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None

>>> df = bpd.DataFrame({"A": [1, 1, 3, 3, 5, 5],
... "B": [5, 6, 3, 4, 1, 2],
... "C": ['a', 'b', 'a', 'b', 'a', 'b']})
>>> df
A B C
0 1 5 a
1 1 6 b
2 3 3 a
3 3 4 b
4 5 1 a
5 5 2 b
<BLANKLINE>
[6 rows x 3 columns]

Returns rows with the smallest value in 'A', including all ties:

>>> df.nsmallest(1, 'A', keep = "all")
A B C
0 1 5 a
1 1 6 b
<BLANKLINE>
[2 rows x 3 columns]

Returns the first row with the smallest value in 'A', default behavior in case of ties:

>>> df.nsmallest(1, 'A')
A B C
0 1 5 a
<BLANKLINE>
[1 rows x 3 columns]

Returns the last row with the smallest value in 'A' in case of ties:

>>> df.nsmallest(1, 'A', keep = "last")
A B C
1 1 6 b
<BLANKLINE>
[1 rows x 3 columns]

Returns rows with the smallest values in 'A' and 'C'

>>> df.nsmallest(1, ['A', 'C'])
A B C
0 1 5 a
<BLANKLINE>
[1 rows x 3 columns]


Args:
n (int):
Number of rows to return.
Expand All @@ -3384,23 +3489,61 @@ def nsmallest(self, n: int, columns, keep: str = "first"):

def idxmin(self):
"""
Return index of first occurrence of minimum over requested axis.
Return index of first occurrence of minimum over columns.

NA/null values are excluded.

**Examples:**

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None

>>> df = bpd.DataFrame({"A": [3, 1, 2], "B": [1, 2, 3]})
>>> df
A B
0 3 1
1 1 2
2 2 3
<BLANKLINE>
[3 rows x 2 columns]

>>> df.idxmin()
A 1
B 0
dtype: Int64

Returns:
Series: Indexes of minima along the specified axis.
Series: Indexes of minima along the columns.
"""
raise NotImplementedError(constants.ABSTRACT_METHOD_ERROR_MESSAGE)

def idxmax(self):
"""
Return index of first occurrence of maximum over requested axis.
Return index of first occurrence of maximum over columns.

NA/null values are excluded.

**Examples:**

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None

>>> df = bpd.DataFrame({"A": [3, 1, 2], "B": [1, 2, 3]})
>>> df
A B
0 3 1
1 1 2
2 2 3
<BLANKLINE>
[3 rows x 2 columns]

>>> df.idxmax()
A 0
B 2
dtype: Int64

Returns:
Series: Indexes of maxima along the specified axis.
Series: Indexes of maxima along the columns.
"""
raise NotImplementedError(constants.ABSTRACT_METHOD_ERROR_MESSAGE)

Expand Down