
Comparing changes

base repository: googleapis/python-bigquery-dataframes
base: v0.4.0
...
head repository: googleapis/python-bigquery-dataframes
compare: v0.5.0

Commits on Sep 18, 2023

  1. bbbd21e
  2. chore: enforce use of conventional commits (#31)

    This will prevent accidental merging of commits that release-please can't handle.
    tswast authored Sep 18, 2023
    69e51a6
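
The commit above makes a conventional-commit title a required status check. As a rough illustration of what such a title looks like, here is a minimal sketch of a header validator. The regular expression and the list of accepted types are assumptions based on the prefixes visible in this comparison (`feat`, `fix`, `perf`, `docs`, `test`, `refactor`, `chore`); this is not the check that the `conventionalcommits.org` status context actually runs.

```python
import re

# Simplified header pattern: type, optional "(scope)", optional "!", then ": description".
# The accepted type list is an assumption taken from the commit titles in this comparison.
CONVENTIONAL_HEADER = re.compile(
    r"^(?P<type>feat|fix|perf|docs|test|refactor|chore)"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def is_conventional(header: str) -> bool:
    """Return True if the first line of a commit message matches the pattern."""
    return CONVENTIONAL_HEADER.match(header) is not None


print(is_conventional("chore: enforce use of conventional commits (#31)"))
print(is_conventional("chore(main): release 0.5.0 (#35)"))
print(is_conventional("Update README"))
```

A title such as `Update README` has no type prefix, so a release tool that derives changelog sections from the prefix has nothing to classify it by.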

Commits on Sep 20, 2023

  1. chore: sync internal changes to GitHub (#34)

    feat: support `optimize_strategy` parameter in `bigframes.ml.linear_model.LinearRegression`
    feat: support `l2_reg` parameter in `bigframes.ml.linear_model.LinearRegression`
    feat: support `max_iterations` parameter in `bigframes.ml.linear_model.LinearRegression`
    feat: support `learn_rate_strategy` parameter in `bigframes.ml.linear_model.LinearRegression`
    feat: support `early_stop` parameter in `bigframes.ml.linear_model.LinearRegression`
    feat: support `min_rel_progress` parameter in `bigframes.ml.linear_model.LinearRegression`
    feat: support `ls_init_learn_rate` parameter in `bigframes.ml.linear_model.LinearRegression`
    feat: support `calculate_p_values` parameter in `bigframes.ml.linear_model.LinearRegression`
    feat: support `enable_global_explain` parameter in `bigframes.ml.linear_model.LinearRegression`
    test: add golden SQL test for logistic model
    test: extend ml golden sql test linear_reg
    docs: link to Remote Functions code samples from README and API reference
    feat: support `df[column_name] = df_only_one_column`
    feat: add `DataFrame.rolling` and `DataFrame.expanding` methods
    feat: add `DataFrame.kurtosis` / `DF.kurt` method
    feat: support `class_weights="balanced"` in `LogisticRegression` model
    tswast authored Sep 20, 2023
    c1900c2
  2. perf: simplify join order to use multiple order keys instead of string. (#36)
    
    Change-Id: I8c37e9296b2e4e0ea87f6a7e836d48988d161d37
    TrevorBergeron authored Sep 20, 2023
    5056da6

Commits on Sep 21, 2023

  1. edabdbb
  2. refactor: remove ibis references outside of arrayvalue code. (#37)

    Change-Id: I1386355446e90f89a43cee8a9f447f0775639902
    TrevorBergeron authored Sep 21, 2023
    109ee24
  3. feat: add items, apply methods to DataFrame. (#43)

    Change-Id: Id3a0e78da3bb9ccce64e190f7797f737b239c33f
    
    Co-authored-by: Tim Swast <swast@google.com>
    TrevorBergeron and tswast authored Sep 21, 2023
    3adc1b3
  4. feat: add index dtype, astype, drop, fillna, aggregate attributes. (#38)
    
    Change-Id: I4af249d10b2fcd779ad05d1f1d95049893e40135
    TrevorBergeron authored Sep 21, 2023
    1a254a4
  5. perf: inline small Series and DataFrames in query text (#45)

    This prevents unnecessary load and query jobs.
    
    Towards internal issue 296474170 🦕
    tswast authored Sep 21, 2023
    5e199ec
  6. refactor: ml.sql to Object (#44)

    * refactor: ml.sql to Object
    
    Change-Id: Ibf795b81619778eaf28572fccd95a09b65f8ad58
    GarrettWu authored Sep 21, 2023
    33274c2

Commits on Sep 22, 2023

  1. 2510461
  2. f9a93ce

Commits on Sep 23, 2023

  1. 416d7cb

Commits on Sep 25, 2023

  1. 14b262b
  2. 9cf9972

Commits on Sep 26, 2023

  1. fix: Fix header skipping logic in read_csv (#49)

    Change-Id: Ib575e2c2b07f819d1dc499a271fea91107fbb8b4
    shobsi authored Sep 26, 2023
    d56258c
  2. fix: LabelEncoder params consistent with Sklearn (#60)

    * fix: LabelEncoder params consistent with Sklearn
    
    * fix:add LabelTransformer
    
    * fix: address comments for base LabelTransformer
    
    * fix: type for params
    ashleyxuu authored Sep 26, 2023
    632caec
  3. 3502f83
  4. a6e32aa
  5. e804e13
  6. feat: add ml.preprocessing.MinMaxScaler (#64)

    * feat: add ml.preprocessing.MinMaxScaler
    
    * fix comments and typo
    
    * add test check for min value
    
    * nit fix
    ashleyxuu authored Sep 26, 2023
    392113b
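
The scalers added in this range (`MaxAbsScaler` in #56 and `MinMaxScaler` in #64) take their names from scikit-learn. Assuming they follow the scikit-learn definitions with default settings, the arithmetic can be sketched in plain Python as below. The function names are hypothetical and this is not how bigframes implements the transforms; it only shows what the scaled values should look like on a single column.

```python
from typing import List


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Constant column: avoid dividing by zero. Mapping to 0.0 is a choice made here.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def max_abs_scale(values: List[float]) -> List[float]:
    """Divide by the largest absolute value so results fall in [-1.0, 1.0]."""
    largest = max(abs(v) for v in values)
    if largest == 0:
        return [0.0 for _ in values]
    return [v / largest for v in values]


print(min_max_scale([2.0, 4.0, 6.0]))
print(max_abs_scale([-4.0, 1.0, 2.0]))
```

Min-max scaling shifts and stretches the data, while max-abs scaling only divides, so it keeps zeros at zero and preserves sign.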

Commits on Sep 27, 2023

  1. 61200bd

Commits on Sep 28, 2023

  1. fix: generate unique ids on join to avoid id collisions (#65)

    * fix: generate unique ids on join to avoid id collisions
    TrevorBergeron authored Sep 28, 2023
    7ab65e8
  2. chore(main): release 0.5.0 (#35)

    Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>
    release-please[bot] authored Sep 28, 2023
    0e0493f
Showing with 4,053 additions and 1,155 deletions.
  1. +1 −1 .github/release-trigger.yml
  2. +1 −0 .github/sync-repo-settings.yaml
  3. +48 −0 CHANGELOG.md
  4. +3 −1 README.rst
  5. +2 −1 bigframes/clients.py
  6. +187 −75 bigframes/core/__init__.py
  7. +105 −6 bigframes/core/block_transforms.py
  8. +144 −66 bigframes/core/blocks.py
  9. +63 −7 bigframes/core/groupby/__init__.py
  10. +16 −28 bigframes/core/indexers.py
  11. +229 −14 bigframes/core/indexes/index.py
  12. +83 −119 bigframes/core/joins/single_column.py
  13. +1 −49 bigframes/core/scalar.py
  14. +30 −20 bigframes/core/window/__init__.py
  15. +112 −66 bigframes/dataframe.py
  16. +14 −9 bigframes/dtypes.py
  17. +20 −0 bigframes/ml/base.py
  18. +3 −2 bigframes/ml/cluster.py
  19. +6 −2 bigframes/ml/compose.py
  20. +140 −155 bigframes/ml/core.py
  21. +3 −2 bigframes/ml/decomposition.py
  22. +11 −7 bigframes/ml/ensemble.py
  23. +10 −2 bigframes/ml/forecasting.py
  24. +30 −0 bigframes/ml/globals.py
  25. +9 −3 bigframes/ml/imported.py
  26. +69 −18 bigframes/ml/linear_model.py
  27. +5 −3 bigframes/ml/llm.py
  28. +34 −1 bigframes/ml/pipeline.py
  29. +290 −8 bigframes/ml/preprocessing.py
  30. +225 −172 bigframes/ml/sql.py
  31. +36 −9 bigframes/operations/__init__.py
  32. +1 −3 bigframes/operations/base.py
  33. +11 −22 bigframes/series.py
  34. +5 −5 bigframes/session.py
  35. +1 −1 bigframes/version.py
  36. +2 −2 setup.py
  37. +1 −1 testing/constraints-3.9.txt
  38. +12 −12 tests/system/large/ml/test_compose.py
  39. +10 −6 tests/system/large/ml/test_core.py
  40. +55 −38 tests/system/large/ml/test_linear_model.py
  41. +142 −17 tests/system/large/ml/test_pipeline.py
  42. +2 −1 tests/system/small/ml/conftest.py
  43. +320 −7 tests/system/small/ml/test_preprocessing.py
  44. +144 −5 tests/system/small/test_dataframe.py
  45. +24 −0 tests/system/small/test_groupby.py
  46. +230 −0 tests/system/small/test_index.py
  47. +26 −1 tests/system/small/test_series.py
  48. +12 −6 tests/system/small/test_session.py
  49. +41 −1 tests/system/small/test_window.py
  50. +13 −0 tests/unit/core/__init__.py
  51. +85 −0 tests/unit/core/test_blocks.py
  52. +74 −20 tests/unit/ml/test_compose.py
  53. +134 −10 tests/unit/ml/test_golden_sql.py
  54. +197 −75 tests/unit/ml/test_sql.py
  55. +6 −6 tests/unit/test_core.py
  56. +82 −9 third_party/bigframes_vendored/pandas/core/frame.py
  57. +55 −0 third_party/bigframes_vendored/pandas/core/generic.py
  58. +21 −0 third_party/bigframes_vendored/pandas/core/groupby/__init__.py
  59. +255 −0 third_party/bigframes_vendored/pandas/core/indexes/base.py
  60. +0 −55 third_party/bigframes_vendored/pandas/core/series.py
  61. +21 −1 third_party/bigframes_vendored/sklearn/linear_model/_base.py
  62. +8 −3 third_party/bigframes_vendored/sklearn/linear_model/_logistic.py
  63. +77 −1 third_party/bigframes_vendored/sklearn/preprocessing/_data.py
  64. +4 −1 third_party/bigframes_vendored/sklearn/preprocessing/_encoder.py
  65. +52 −0 third_party/bigframes_vendored/sklearn/preprocessing/_label.py
2 changes: 1 addition & 1 deletion .github/release-trigger.yml
@@ -1,2 +1,2 @@
 enabled: true
-multiScmName: bigframes
+multiScmName: python-bigquery-dataframes
1 change: 1 addition & 0 deletions .github/sync-repo-settings.yaml
@@ -7,6 +7,7 @@ branchProtectionRules:
 requiresCodeOwnerReviews: true
 requiresStrictStatusChecks: true
 requiredStatusCheckContexts:
+- 'conventionalcommits.org'
 - 'cla/google'
 - 'OwlBot Post Processor'
 - 'docs'
48 changes: 48 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,54 @@

[1]: https://pypi.org/project/bigframes/#history

## [0.5.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v0.4.0...v0.5.0) (2023-09-28)


### Features

* Add `DataFrame.kurtosis` / `DF.kurt` method ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))
* Add `DataFrame.rolling` and `DataFrame.expanding` methods ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))
* Add `items`, `apply` methods to `DataFrame`. ([#43](https://github.com/googleapis/python-bigquery-dataframes/issues/43)) ([3adc1b3](https://github.com/googleapis/python-bigquery-dataframes/commit/3adc1b3aa3e2b218d4fa5debdaa4298276bdf801))
* Add axis param to simple df aggregations ([#52](https://github.com/googleapis/python-bigquery-dataframes/issues/52)) ([9cf9972](https://github.com/googleapis/python-bigquery-dataframes/commit/9cf99721ed83704e6ee28b15c699326c431eb252))
* Add index `dtype`, `astype`, `drop`, `fillna`, aggregate attributes. ([#38](https://github.com/googleapis/python-bigquery-dataframes/issues/38)) ([1a254a4](https://github.com/googleapis/python-bigquery-dataframes/commit/1a254a496633957b9506dd8392dcc6fd10762201))
* Add ml.preprocessing.LabelEncoder ([#50](https://github.com/googleapis/python-bigquery-dataframes/issues/50)) ([2510461](https://github.com/googleapis/python-bigquery-dataframes/commit/25104610e5ffe526315923946533a66713c1d155))
* Add ml.preprocessing.MaxAbsScaler ([#56](https://github.com/googleapis/python-bigquery-dataframes/issues/56)) ([14b262b](https://github.com/googleapis/python-bigquery-dataframes/commit/14b262bde2bb86093bf4df63862e369c5a84b0ad))
* Add ml.preprocessing.MinMaxScaler ([#64](https://github.com/googleapis/python-bigquery-dataframes/issues/64)) ([392113b](https://github.com/googleapis/python-bigquery-dataframes/commit/392113b70d6a8c407accbb6684d75b31261e3741))
* Add more index methods ([#54](https://github.com/googleapis/python-bigquery-dataframes/issues/54)) ([a6e32aa](https://github.com/googleapis/python-bigquery-dataframes/commit/a6e32aa875370063c48ce7922c2aa369a770bd30))
* Support `calculate_p_values` parameter in `bigframes.ml.linear_model.LinearRegression` ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))
* Support `class_weights="balanced"` in `LogisticRegression` model ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))
* Support `df[column_name] = df_only_one_column` ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))
* Support `early_stop` parameter in `bigframes.ml.linear_model.LinearRegression` ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))
* Support `enable_global_explain` parameter in `bigframes.ml.linear_model.LinearRegression` ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))
* Support `l2_reg` parameter in `bigframes.ml.linear_model.LinearRegression` ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))
* Support `learn_rate_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))
* Support `ls_init_learn_rate` parameter in `bigframes.ml.linear_model.LinearRegression` ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))
* Support `max_iterations` parameter in `bigframes.ml.linear_model.LinearRegression` ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))
* Support `min_rel_progress` parameter in `bigframes.ml.linear_model.LinearRegression` ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))
* Support `optimize_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))
* Support casting string to integer or float ([#59](https://github.com/googleapis/python-bigquery-dataframes/issues/59)) ([3502f83](https://github.com/googleapis/python-bigquery-dataframes/commit/3502f835b35c437933430698e7a1c9badaddcb99))


### Bug Fixes

* Fix header skipping logic in `read_csv` ([#49](https://github.com/googleapis/python-bigquery-dataframes/issues/49)) ([d56258c](https://github.com/googleapis/python-bigquery-dataframes/commit/d56258cbfcda168cb9e437a021e282818d622d6a))
* Generate unique ids on join to avoid id collisions ([#65](https://github.com/googleapis/python-bigquery-dataframes/issues/65)) ([7ab65e8](https://github.com/googleapis/python-bigquery-dataframes/commit/7ab65e88deb0080e9c36c2709f8a5385ccaf8cf2))
* LabelEncoder params consistent with Sklearn ([#60](https://github.com/googleapis/python-bigquery-dataframes/issues/60)) ([632caec](https://github.com/googleapis/python-bigquery-dataframes/commit/632caec420a7e23188f01b96a00c354d205da74e))
* Loosen filter items tests to accomodate shifting pandas impl ([#41](https://github.com/googleapis/python-bigquery-dataframes/issues/41)) ([edabdbb](https://github.com/googleapis/python-bigquery-dataframes/commit/edabdbb131150707ea9211292cacbb60b8d076dd))


### Performance Improvements

* Add ability to cache dataframe and series to session table ([#51](https://github.com/googleapis/python-bigquery-dataframes/issues/51)) ([416d7cb](https://github.com/googleapis/python-bigquery-dataframes/commit/416d7cb9b560d7e33dcc0227f03a00d43f55ba0d))
* Inline small `Series` and `DataFrames` in query text ([#45](https://github.com/googleapis/python-bigquery-dataframes/issues/45)) ([5e199ec](https://github.com/googleapis/python-bigquery-dataframes/commit/5e199ecf1ecf13a68a2ed0dd4464afd9db977ab1))
* Reimplement unpivot to use cross join rather than union ([#47](https://github.com/googleapis/python-bigquery-dataframes/issues/47)) ([f9a93ce](https://github.com/googleapis/python-bigquery-dataframes/commit/f9a93ce71d053aa17b1e3a2946c90e0227076184))
* Simplify join order to use multiple order keys instead of string. ([#36](https://github.com/googleapis/python-bigquery-dataframes/issues/36)) ([5056da6](https://github.com/googleapis/python-bigquery-dataframes/commit/5056da6b385dbcfc179d2bcbb6549fa539428cda))


### Documentation

* Link to Remote Functions code samples from README and API reference ([c1900c2](https://github.com/googleapis/python-bigquery-dataframes/commit/c1900c29a44199d5d8d036d6d842b4f00448fa79))

## [0.4.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v0.3.2...v0.4.0) (2023-09-16)


4 changes: 3 additions & 1 deletion README.rst
@@ -241,7 +241,9 @@ Remote functions
 BigQuery DataFrames gives you the ability to turn your custom scalar functions
 into `BigQuery remote functions
 <https://cloud.google.com/bigquery/docs/remote-functions>`_ . Creating a remote
-function in BigQuery DataFrames creates a BigQuery remote function, a `BigQuery
+function in BigQuery DataFrames (See `code samples
+<https://cloud.google.com/bigquery/docs/remote-functions#bigquery-dataframes>`_)
+creates a BigQuery remote function, a `BigQuery
 connection
 <https://cloud.google.com/bigquery/docs/create-cloud-resource-connection>`_ ,
 and a `Cloud Functions (2nd gen) function
3 changes: 2 additions & 1 deletion bigframes/clients.py
@@ -18,7 +18,7 @@

 import logging
 import time
-from typing import Optional
+from typing import cast, Optional

 import google.api_core.exceptions
 from google.cloud import bigquery_connection_v1, resourcemanager_v3
@@ -80,6 +80,7 @@ def create_bq_connection(
 logger.info(
 f"Created BQ connection {connection_name} with service account id: {service_account_id}"
 )
+service_account_id = cast(str, service_account_id)
 # Ensure IAM role on the BQ connection
 # https://cloud.google.com/bigquery/docs/reference/standard-sql/remote-functions#grant_permission_on_function
 self._ensure_iam_binding(project_id, service_account_id, iam_role)
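
The change above wraps `service_account_id` in `typing.cast`. A small standalone sketch of what `cast` does and does not do may help when reading it. The helper functions and the example account string are hypothetical and unrelated to the bigframes code; the point is only that `cast` narrows the type for static checkers and has no runtime effect.

```python
from typing import Optional, cast


def lookup_account(name: str) -> Optional[str]:
    """Hypothetical helper standing in for a call that may return None."""
    return f"{name}@example.iam.gserviceaccount.com" if name else None


def require_str(value: str) -> str:
    """Hypothetical consumer that is annotated to accept only str."""
    return value.lower()


account = lookup_account("Demo")
# cast() only tells the type checker to treat the value as str;
# at runtime it returns its argument unchanged and performs no check.
narrowed = cast(str, account)
print(require_str(narrowed))
print(cast(str, None) is None)
```

Because no check happens at runtime, `cast` is appropriate only where the surrounding logic already guarantees the value is not `None`; otherwise an explicit `if value is None: raise ...` is the safer choice.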