docs: add artithmetic df sample code #153

Merged
merged 3 commits into from
Oct 30, 2023
4 changes: 2 additions & 2 deletions bigframes/session/__init__.py
@@ -352,7 +352,7 @@ def read_gbq_query(
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None

-Simple query input:
+Simple query input:

>>> df = bpd.read_gbq_query('''
... SELECT
@@ -368,7 +368,7 @@ def read_gbq_query(
<BLANKLINE>
[2 rows x 3 columns]

-Preserve ordering in a query input.
+Preserve ordering in a query input.

>>> df = bpd.read_gbq_query('''
... SELECT
494 changes: 490 additions & 4 deletions third_party/bigframes_vendored/pandas/core/frame.py
@@ -697,6 +697,7 @@ def align(
Join method is specified for each axis Index.
Args:
other (DataFrame or Series):
join ({{'outer', 'inner', 'left', 'right'}}, default 'outer'):
@@ -978,9 +979,9 @@ def sort_values(
Sort ascending vs. descending. Specify list for multiple sort
orders. If this is a list of bools, must match the length of
the by.
-kind (str, default `quicksort`):
-Choice of sorting algorithm. Accepts 'quicksort’, ‘mergesort,
-heapsort’, ‘stable. Ignored except when determining whether to
+kind (str, default 'quicksort'):
+Choice of sorting algorithm. Accepts 'quicksort', 'mergesort',
+'heapsort', 'stable'. Ignored except when determining whether to
sort stably. 'mergesort' or 'stable' will result in stable reorder.
na_position ({'first', 'last'}, default `last`):
``{'first', 'last'}``, default 'last' Puts NaNs at the beginning
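As a rough illustration of what "stable" means for the `kind` argument, the following plain-Python sketch sorts rows that share a key. It is not bigframes code, and the `rows` data is made up for the example. Python's built-in `sorted` is guaranteed to be stable, so rows with equal keys keep their original relative order, which is the behavior that 'mergesort' and 'stable' ask for.

```python
# Hypothetical (label, key) rows; two pairs of rows share the same key.
rows = [("b", 2), ("a", 1), ("c", 2), ("d", 1)]

# sorted() is stable: among equal keys, input order is preserved.
stable = sorted(rows, key=lambda row: row[1])

print(stable)  # [('a', 1), ('d', 1), ('b', 2), ('c', 2)]
```

Here 'a' stays ahead of 'd', and 'b' stays ahead of 'c', because that is the order in which they appeared in the input.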
@@ -1014,6 +1015,29 @@ def eq(self, other, axis: str | int = "columns") -> DataFrame:
Equivalent to `==`, `!=`, `<=`, `<`, `>=`, `>` with support to choose axis
(rows or columns) and level for comparison.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
You can use the method name:
>>> df = bpd.DataFrame({'angles': [0, 3, 4],
... 'degrees': [360, 180, 360]},
... index=['circle', 'triangle', 'rectangle'])
>>> df["degrees"].eq(360)
circle True
triangle False
rectangle True
Name: degrees, dtype: boolean
You can also use the comparison operator ``==``:
>>> df["degrees"] == 360
circle True
triangle False
rectangle True
Name: degrees, dtype: boolean
Args:
other (scalar, sequence, Series, or DataFrame):
Any single or multiple element data structure, or list-like object.
@@ -1036,6 +1060,30 @@ def ne(self, other, axis: str | int = "columns") -> DataFrame:
Equivalent to `==`, `!=`, `<=`, `<`, `>=`, `>` with support to choose axis
(rows or columns) and level for comparison.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
You can use the method name:
>>> df = bpd.DataFrame({'angles': [0, 3, 4],
... 'degrees': [360, 180, 360]},
... index=['circle', 'triangle', 'rectangle'])
>>> df["degrees"].ne(360)
circle False
triangle True
rectangle False
Name: degrees, dtype: boolean
You can also use the comparison operator ``!=``:
>>> df["degrees"] != 360
circle False
triangle True
rectangle False
Name: degrees, dtype: boolean
Args:
other (scalar, sequence, Series, or DataFrame):
Any single or multiple element data structure, or list-like object.
@@ -1061,6 +1109,30 @@ def le(self, other, axis: str | int = "columns") -> DataFrame:
floating point columns are considered different
(i.e. `NaN` != `NaN`).
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
You can use the method name:
>>> df = bpd.DataFrame({'angles': [0, 3, 4],
... 'degrees': [360, 180, 360]},
... index=['circle', 'triangle', 'rectangle'])
>>> df["degrees"].le(180)
circle False
triangle True
rectangle False
Name: degrees, dtype: boolean
You can also use the comparison operator ``<=``:
>>> df["degrees"] <= 180
circle False
triangle True
rectangle False
Name: degrees, dtype: boolean
Args:
other (scalar, sequence, Series, or DataFrame):
Any single or multiple element data structure, or list-like object.
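The remark that `NaN` != `NaN` follows from IEEE 754 floating-point rules. The sketch below uses plain Python floats only, as a stand-in; it does not call bigframes, and how bigframes treats nulls in its nullable dtypes is not shown here.

```python
import math

nan = float("nan")

# IEEE 754: NaN is unordered, so every comparison except != is False.
results = {
    "==": nan == nan,
    "!=": nan != nan,
    "<=": nan <= nan,
    ">=": nan >= nan,
}
print(results)          # {'==': False, '!=': True, '<=': False, '>=': False}

# Use an explicit check rather than equality to detect NaN.
print(math.isnan(nan))  # True
```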
@@ -1087,6 +1159,30 @@ def lt(self, other, axis: str | int = "columns") -> DataFrame:
floating point columns are considered different
(i.e. `NaN` != `NaN`).
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
You can use the method name:
>>> df = bpd.DataFrame({'angles': [0, 3, 4],
... 'degrees': [360, 180, 360]},
... index=['circle', 'triangle', 'rectangle'])
>>> df["degrees"].lt(180)
circle False
triangle False
rectangle False
Name: degrees, dtype: boolean
You can also use the comparison operator ``<``:
>>> df["degrees"] < 180
circle False
triangle False
rectangle False
Name: degrees, dtype: boolean
Args:
other (scalar, sequence, Series, or DataFrame):
Any single or multiple element data structure, or list-like object.
@@ -1113,6 +1209,30 @@ def ge(self, other, axis: str | int = "columns") -> DataFrame:
floating point columns are considered different
(i.e. `NaN` != `NaN`).
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
You can use the method name:
>>> df = bpd.DataFrame({'angles': [0, 3, 4],
... 'degrees': [360, 180, 360]},
... index=['circle', 'triangle', 'rectangle'])
>>> df["degrees"].ge(360)
circle True
triangle False
rectangle True
Name: degrees, dtype: boolean
You can also use the comparison operator ``>=``:
>>> df["degrees"] >= 360
circle True
triangle False
rectangle True
Name: degrees, dtype: boolean
Args:
other (scalar, sequence, Series, or DataFrame):
Any single or multiple element data structure, or list-like object.
@@ -1139,6 +1259,28 @@ def gt(self, other, axis: str | int = "columns") -> DataFrame:
floating point columns are considered different
(i.e. `NaN` != `NaN`).
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({'angles': [0, 3, 4],
... 'degrees': [360, 180, 360]},
... index=['circle', 'triangle', 'rectangle'])
>>> df["degrees"].gt(360)
circle False
triangle False
rectangle False
Name: degrees, dtype: boolean
You can also use the comparison operator ``>``:
>>> df["degrees"] > 360
circle False
triangle False
rectangle False
Name: degrees, dtype: boolean
Args:
other (scalar, sequence, Series, or DataFrame):
Any single or multiple element data structure, or list-like object.
@@ -1162,6 +1304,32 @@ def add(self, other, axis: str | int = "columns") -> DataFrame:
.. note::
Mismatched indices will be unioned together.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({
... 'A': [1, 2, 3],
... 'B': [4, 5, 6],
... })
You can use the method name:
>>> df['A'].add(df['B'])
0 5
1 7
2 9
dtype: Int64
You can also use the arithmetic operator ``+``:
>>> df['A'] + (df['B'])
0 5
1 7
2 9
dtype: Int64
Args:
other (float, int, or Series):
Any single or multiple element data structure, or list-like object.
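The note "Mismatched indices will be unioned together" can be pictured with a small plain-Python model. This is only a sketch: `add_aligned` is a hypothetical helper, dictionaries stand in for labeled Series, and `None` stands in for whatever missing-value marker the real library produces.

```python
def add_aligned(left, right):
    """Add two label->value mappings, aligning on the union of labels."""
    labels = sorted(set(left) | set(right))
    result = {}
    for label in labels:
        if label in left and label in right:
            result[label] = left[label] + right[label]
        else:
            # Label missing on one side: nothing to add, so mark as missing.
            result[label] = None
    return result


a = {0: 1, 1: 2, 2: 3}
b = {1: 10, 2: 20, 3: 30}
aligned = add_aligned(a, b)
print(aligned)  # {0: None, 1: 12, 2: 23, 3: None}
```

The result carries every label from either input, and only labels present on both sides get a computed value.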
@@ -1185,6 +1353,32 @@ def sub(self, other, axis: str | int = "columns") -> DataFrame:
.. note::
Mismatched indices will be unioned together.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({
... 'A': [1, 2, 3],
... 'B': [4, 5, 6],
... })
You can use the method name:
>>> df['A'].sub(df['B'])
0 -3
1 -3
2 -3
dtype: Int64
You can also use the arithmetic operator ``-``:
>>> df['A'] - (df['B'])
0 -3
1 -3
2 -3
dtype: Int64
Args:
other (float, int, or Series):
Any single or multiple element data structure, or list-like object.
@@ -1208,6 +1402,29 @@ def rsub(self, other, axis: str | int = "columns") -> DataFrame:
.. note::
Mismatched indices will be unioned together.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({
... 'A': [1, 2, 3],
... 'B': [4, 5, 6],
... })
>>> df['A'].rsub(df['B'])
0 3
1 3
2 3
dtype: Int64
This is equivalent to using the arithmetic operator ``-`` with the operands swapped:
>>> df['B'] - (df['A'])
0 3
1 3
2 3
dtype: Int64
Args:
other (float, int, or Series):
Any single or multiple element data structure, or list-like object.
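The relationship between a method and its reverse version (`sub` and `rsub`, `truediv` and `rtruediv`, and so on) can be sketched in a few lines. `Column` below is a hypothetical stand-in for a Series and is not part of bigframes; it only shows that the reverse method swaps the operands.

```python
class Column:
    """Hypothetical stand-in for a Series; not part of bigframes."""

    def __init__(self, values):
        self.values = list(values)

    def sub(self, other):
        # Element-wise self - other.
        return Column(x - y for x, y in zip(self.values, other.values))

    def rsub(self, other):
        # Reverse version: element-wise other - self.
        return other.sub(self)


a = Column([1, 2, 3])
b = Column([4, 5, 6])
print(a.sub(b).values)   # [-3, -3, -3]
print(a.rsub(b).values)  # [3, 3, 3]
```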
@@ -1231,6 +1448,32 @@ def mul(self, other, axis: str | int = "columns") -> DataFrame:
.. note::
Mismatched indices will be unioned together.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({
... 'A': [1, 2, 3],
... 'B': [4, 5, 6],
... })
You can use the method name:
>>> df['A'].mul(df['B'])
0 4
1 10
2 18
dtype: Int64
You can also use the arithmetic operator ``*``:
>>> df['A'] * (df['B'])
0 4
1 10
2 18
dtype: Int64
Args:
other (float, int, or Series):
Any single or multiple element data structure, or list-like object.
@@ -1254,6 +1497,32 @@ def truediv(self, other, axis: str | int = "columns") -> DataFrame:
.. note::
Mismatched indices will be unioned together.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({
... 'A': [1, 2, 3],
... 'B': [4, 5, 6],
... })
You can use the method name:
>>> df['A'].truediv(df['B'])
0 0.25
1 0.4
2 0.5
dtype: Float64
You can also use the arithmetic operator ``/``:
>>> df['A'] / (df['B'])
0 0.25
1 0.4
2 0.5
dtype: Float64
Args:
other (float, int, or Series):
Any single or multiple element data structure, or list-like object.
@@ -1277,6 +1546,29 @@ def rtruediv(self, other, axis: str | int = "columns") -> DataFrame:
.. note::
Mismatched indices will be unioned together.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({
... 'A': [1, 2, 3],
... 'B': [4, 5, 6],
... })
>>> df['A'].rtruediv(df['B'])
0 4.0
1 2.5
2 2.0
dtype: Float64
This is equivalent to using the arithmetic operator ``/`` with the operands swapped:
>>> df['B'] / (df['A'])
0 4.0
1 2.5
2 2.0
dtype: Float64
Args:
other (float, int, or Series):
Any single or multiple element data structure, or list-like object.
@@ -1300,6 +1592,32 @@ def floordiv(self, other, axis: str | int = "columns") -> DataFrame:
.. note::
Mismatched indices will be unioned together.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({
... 'A': [1, 2, 3],
... 'B': [4, 5, 6],
... })
You can use the method name:
>>> df['A'].floordiv(df['B'])
0 0
1 0
2 0
dtype: Int64
You can also use the arithmetic operator ``//``:
>>> df['A'] // (df['B'])
0 0
1 0
2 0
dtype: Int64
Args:
other (float, int, or Series):
Any single or multiple element data structure, or list-like object.
@@ -1323,6 +1641,29 @@ def rfloordiv(self, other, axis: str | int = "columns") -> DataFrame:
.. note::
Mismatched indices will be unioned together.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({
... 'A': [1, 2, 3],
... 'B': [4, 5, 6],
... })
>>> df['A'].rfloordiv(df['B'])
0 4
1 2
2 2
dtype: Int64
This is equivalent to using the arithmetic operator ``//`` with the operands swapped:
>>> df['B'] // (df['A'])
0 4
1 2
2 2
dtype: Int64
Args:
other (float, int, or Series):
Any single or multiple element data structure, or list-like object.
@@ -1346,6 +1687,32 @@ def mod(self, other, axis: str | int = "columns") -> DataFrame:
.. note::
Mismatched indices will be unioned together.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({
... 'A': [1, 2, 3],
... 'B': [4, 5, 6],
... })
You can use the method name:
>>> df['A'].mod(df['B'])
0 1
1 2
2 3
dtype: Int64
You can also use the arithmetic operator ``%``:
>>> df['A'] % (df['B'])
0 1
1 2
2 3
dtype: Int64
Args:
other:
Any single or multiple element data structure, or list-like object.
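The `floordiv` and `mod` examples use the same inputs, so their outputs can be cross-checked: for non-negative integers, `a == (a // b) * b + (a % b)`. The sketch below verifies this with plain Python lists holding the same values as columns 'A' and 'B'. It assumes Python integer semantics and makes no claim about negative or null inputs in bigframes.

```python
a_values = [1, 2, 3]
b_values = [4, 5, 6]

quotients = [a // b for a, b in zip(a_values, b_values)]
remainders = [a % b for a, b in zip(a_values, b_values)]

# Rebuild the dividend from quotient and remainder.
rebuilt = [q * b + r for q, b, r in zip(quotients, b_values, remainders)]

print(quotients)   # [0, 0, 0]
print(remainders)  # [1, 2, 3]
print(rebuilt)     # [1, 2, 3]
```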
@@ -1369,6 +1736,29 @@ def rmod(self, other, axis: str | int = "columns") -> DataFrame:
.. note::
Mismatched indices will be unioned together.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({
... 'A': [1, 2, 3],
... 'B': [4, 5, 6],
... })
>>> df['A'].rmod(df['B'])
0 0
1 1
2 0
dtype: Int64
This is equivalent to using the arithmetic operator ``%`` with the operands swapped:
>>> df['B'] % (df['A'])
0 0
1 1
2 0
dtype: Int64
Args:
other (float, int, or Series):
Any single or multiple element data structure, or list-like object.
@@ -1382,7 +1772,7 @@ def rmod(self, other, axis: str | int = "columns") -> DataFrame:
raise NotImplementedError(constants.ABSTRACT_METHOD_ERROR_MESSAGE)

def pow(self, other, axis: str | int = "columns") -> DataFrame:
-"""Get Exponential power of dataframe and other, element-wise (binary operator `pow`).
+"""Get Exponential power of dataframe and other, element-wise (binary operator `**`).
Equivalent to ``dataframe ** other``, but with support to substitute a fill_value
for missing data in one of the inputs. With reverse version, `rpow`.
@@ -1393,6 +1783,32 @@ def pow(self, other, axis: str | int = "columns") -> DataFrame:
.. note::
Mismatched indices will be unioned together.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({
... 'A': [1, 2, 3],
... 'B': [4, 5, 6],
... })
You can use the method name:
>>> df['A'].pow(df['B'])
0 1
1 32
2 729
dtype: Int64
You can also use the arithmetic operator ``**``:
>>> df['A'] ** (df['B'])
0 1
1 32
2 729
dtype: Int64
Args:
other (float, int, or Series):
Any single or multiple element data structure, or list-like object.
@@ -1417,6 +1833,29 @@ def rpow(self, other, axis: str | int = "columns") -> DataFrame:
.. note::
Mismatched indices will be unioned together.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({
... 'A': [1, 2, 3],
... 'B': [4, 5, 6],
... })
>>> df['A'].rpow(df['B'])
0 4
1 25
2 216
dtype: Int64
This is equivalent to using the arithmetic operator ``**`` with the operands swapped:
>>> df['B'] ** (df['A'])
0 4
1 25
2 216
dtype: Int64
Args:
other (float, int, or Series):
Any single or multiple element data structure, or list-like object.
@@ -1438,6 +1877,21 @@ def combine(
to element-wise combine columns. The row and column indexes of the
resulting DataFrame will be the union of the two.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df1 = bpd.DataFrame({'A': [0, 0], 'B': [4, 4]})
>>> df2 = bpd.DataFrame({'A': [1, 1], 'B': [3, 3]})
>>> take_smaller = lambda s1, s2: s1 if s1.sum() < s2.sum() else s2
>>> df1.combine(df2, take_smaller)
A B
0 0 3
1 0 3
<BLANKLINE>
[2 rows x 2 columns]
Args:
other (DataFrame):
The DataFrame to merge column-wise.
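To see why the `combine` example returns column 'A' from `df1` and column 'B' from `df2`, the column-by-column decision can be traced in plain Python. This is a sketch only: `combine_columns` is a hypothetical helper, dictionaries of lists stand in for DataFrames, and both inputs are assumed to have the same column names.

```python
def combine_columns(left, right, func):
    """Apply func to each pair of same-named columns (lists of values)."""
    return {name: func(left[name], right[name]) for name in left}


def take_smaller(s1, s2):
    # Keep whichever column has the smaller sum; ties go to s2.
    return s1 if sum(s1) < sum(s2) else s2


df1 = {"A": [0, 0], "B": [4, 4]}
df2 = {"A": [1, 1], "B": [3, 3]}
combined = combine_columns(df1, df2, take_smaller)
print(combined)  # {'A': [0, 0], 'B': [3, 3]}
```

For 'A' the sums are 0 and 2, so the `df1` column wins; for 'B' they are 8 and 6, so the `df2` column wins.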
@@ -1468,6 +1922,20 @@ def combine_first(self, other) -> DataFrame:
second.loc[index, col] are not missing values, upon calling
first.combine_first(second).
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df1 = bpd.DataFrame({'A': [None, 0], 'B': [None, 4]})
>>> df2 = bpd.DataFrame({'A': [1, 1], 'B': [3, 3]})
>>> df1.combine_first(df2)
A B
0 1.0 3.0
1 0.0 4.0
<BLANKLINE>
[2 rows x 2 columns]
Args:
other (DataFrame):
Provided DataFrame to use to fill null values.
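The fill rule behind `combine_first` can be sketched without any library: keep the value from the first frame unless it is missing, in which case take the value from the second. `fill_from` is a hypothetical helper, `None` marks a missing value, and the sketch assumes both inputs share the same columns and row order.

```python
def fill_from(first, second):
    """Element-wise: keep first's value unless it is missing."""
    return [b if a is None else a for a, b in zip(first, second)]


df1 = {"A": [None, 0], "B": [None, 4]}
df2 = {"A": [1, 1], "B": [3, 3]}
filled = {name: fill_from(df1[name], df2[name]) for name in df1}
print(filled)  # {'A': [1, 0], 'B': [3, 4]}
```

The values match the docstring example; the docstring shows them as floats (1.0, 0.0) because of the column dtype, which this sketch does not model.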
@@ -1485,6 +1953,24 @@ def update(
Aligns on indices. There is no return value.
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({'A': [1, 2, 3],
... 'B': [400, 500, 600]})
>>> new_df = bpd.DataFrame({'B': [4, 5, 6],
... 'C': [7, 8, 9]})
>>> df.update(new_df)
>>> df
A B
0 1 4
1 2 5
2 3 6
<BLANKLINE>
[3 rows x 2 columns]
Args:
other (DataFrame, or object coercible into a DataFrame):
Should have at least one matching index/column label
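The two points made about `update` (it modifies the frame in place, and it returns nothing) can be shown with a plain-Python model. `update_in_place` is a hypothetical helper and dictionaries of lists stand in for DataFrames. The rules that columns missing from the target are ignored and that missing values do not overwrite are assumptions taken from the pandas behavior this API mirrors.

```python
def update_in_place(target, other):
    """Overwrite matching columns of target with non-missing values from other."""
    for name, values in other.items():
        if name in target:  # columns absent from target are ignored
            target[name] = [
                old if new is None else new
                for old, new in zip(target[name], values)
            ]
    # No return statement: the caller's object is modified directly.


df = {"A": [1, 2, 3], "B": [400, 500, 600]}
new_df = {"B": [4, 5, 6], "C": [7, 8, 9]}
returned = update_in_place(df, new_df)
print(returned)  # None
print(df)        # {'A': [1, 2, 3], 'B': [4, 5, 6]}
```

Column 'C' does not appear in the result, which matches the two-column output in the docstring example.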