Comparing changes

base repository: googleapis/python-bigquery-dataframes
base: v1.27.0
head repository: googleapis/python-bigquery-dataframes
compare: v1.28.0

Commits on Nov 19, 2024

  1. docs: add snippet for creating boosted tree model (#1142)

    * docs: create boosted tree model
    
    * merge main
    
    * update model
    
    * update test
    rey-esp authored Nov 19, 2024
    a972668

Commits on Nov 20, 2024

  1. docs: add snippet for evaluating a boosted tree model (#1154)

    * must verify evaluation_data and edit Output
    
    * line edit
    
    * edit desc
    
    * correct model info
    rey-esp authored Nov 20, 2024
    9d8970a
  2. docs: add snippet for predicting classifications using a boosted tree model (#1156)
    
    * docs: add snippet for predicting classifications using a boosted tree model
    
    * merge and rename bigquery_dataframes_bqml_boosted_tree_explain to bigquery_dataframes_bqml_boosted_tree_evaluate
    
    * remove training |
    
    * clean up asserts
    
    ---------
    
    Co-authored-by: Tim Sweña (Swast) <swast@google.com>
    rey-esp and tswast authored Nov 20, 2024
    e7b83f1

Commits on Nov 21, 2024

  1. fix: Fix error loading local dataframes into bigquery (#1165)
    5b355ef
  2. feat: `bigframes.bigquery.vector_search` supports `use_brute_force` and `fraction_lists_to_search` parameters (#1158)
    
    * feat: `bigframes.bigquery.vector_search` supports `use_brute_force` and `fraction_lists_to_search` parameters
    
    * fix f-string on lower python versions
    
    ---------
    
    Co-authored-by: Chelsea Lin <chelsealin@google.com>
    tswast and chelsea-lin authored Nov 21, 2024
    131edc3
  3. docs: Fix Bigframes.Pandas.General_Function missing docs (#1164)
    de923d0
  4. feat: (Series | DataFrame).plot.bar (#1152)

    * feat: (Series | DataFrame).plot.bar
    
    * add warning message
    
    * fix mypy
    
    * 🦉 Updates from OwlBot post-processor
    
    See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
    
    ---------
    
    Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
    chelsea-lin and gcf-owl-bot[bot] authored Nov 21, 2024
    0fae2e0
  5. fix: Fix null index join with 'on' arg (#1153)
    9015c33

Commits on Nov 25, 2024

  1. 682d938
  2. docs: add a code sample using `bpd.options.bigquery.ordering_mode = "partial"` (#909)
    
    * docs: add a code sample using `bpd.options.bigquery.ordering_mode = "partial"`
    
    * add warning filter too
    
    * add drop_duplicates alternative
    tswast authored Nov 25, 2024
    f80d705
  3. b39a4b7
  4. feat: add client_endpoints_override to bq options (#1167)

    * feat: add client_endpoints_override to bq options
    
    * fix wording
    GarrettWu authored Nov 25, 2024
    be74b99

Commits on Nov 26, 2024

  1. chore: close session after partial ordering mode sample (#1173)

    * chore: close session after partial ordering mode sample
    
    * 🦉 Updates from OwlBot post-processor
    
    See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
    
    * try manually selecting strict mode
    
    ---------
    
    Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
    tswast and gcf-owl-bot[bot] authored Nov 26, 2024
    1c8d510
  2. feat: Allow join-free alignment of analytic expressions (#1168)

    * feat: Allow join-free alignment of analytic expressions
    
    * address pr comments
    
    * fix bugs in pull_up_selection
    
    * fix unit test and remove validations
    
    * fix test failures
    TrevorBergeron authored Nov 26, 2024
    daef4f0

Commits on Nov 27, 2024

  1. docs: update bigframes.pandas.Index docstrings (#1144)

    * docs: update `bigframes.pandas.Index` docstrings
    
    * updating more methods
    
    * update docstrings of more methods
    
    * update docstrings
    
    * update docs
    
    * update docstrings
    
    * fix failing doc test
    
    * fix docs indentation
    
    * fix indentation error
    
    * fix doctest error
    
    * fix doctest error
    
    * fix errors
    
    * fix errors
    
    * fix failing doctest
    
    * remove .name docstring
    arwas11 authored Nov 27, 2024
    557ab8d
  2. refactor: use core.convert for series conversions under the ml packages (#1178)
    
    * refactor: use core.convert for series conversions under the ml packages
    
    * update method name
    
    * fetch global session lazily
    sycai authored Nov 27, 2024
    afa7cc4
  3. docs: add third party pandas.Index methods and docstrings (#1171)

    * docs: add third party pandas.Index methods and docstrings
    
    * fix indent error
    
    * update bigframes.pandas.Index docstrings
    
    * fix doctest errors
    
    * fix doctest error
    
    * fix doctest
    
    * add name docstring
    arwas11 authored Nov 27, 2024
    a970294

Commits on Nov 28, 2024

  1. refactor: Move dataframe conversion function from ml package to core package (#1180)
    
    * refactor: use core.convert for series conversions under the ml packages
    
    * fetch global session lazily
    
    * fix type error
    
    * remove unnecessary get_session call
    
    * code cleanup
    
    * 🦉 Updates from OwlBot post-processor
    
    See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
    
    * code cleanup
    
    * add session check
    
    * check session before conversion
    
    * check individual types during conversion
    
    * add missing return statement
    
    * remove individual session check to relax the arg requirement
    
    ---------
    
    Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
    sycai and gcf-owl-bot[bot] authored Nov 28, 2024
    faa2738

Commits on Dec 2, 2024

  1. 421d24d

Commits on Dec 3, 2024

  1. chore: Minor updates on doc "bigframes/ml/base.py" (#1184)

    * docs(bigquery): update minor parts in base.py
    
    * docs(bigquery): update minor changes for bigframes/ml/base.py
    
    ---------
    
    Co-authored-by: Shuowei Li <shuowei@google.com>
    Shuowei Li and shuoweil authored Dec 3, 2024
    b959838

Commits on Dec 4, 2024

  1. f882505

Commits on Dec 6, 2024

  1. 0693a7d
  2. feat: Update llm.TextEmbeddingGenerator to 005 (#1186)

    * docs(bigquery): update minor parts in base.py
    
    * docs(bigquery): update minor changes for bigframes/ml/base.py
    
    * update docs in semantics.py to match the text-embedding-005 update
    
    ---------
    
    Co-authored-by: Shuowei Li <shuowei@google.com>
    Shuowei Li and shuoweil authored Dec 6, 2024
    3072d38

Commits on Dec 9, 2024

  1. perf: update df.corr, df.cov to be used with more than 30 columns case. (#1161)
    
    * perf: update df.corr, df.cov to be used with more than 30 columns case.
    
    * add large test
    
    * remove print
    
    * fix_index
    
    * fix index
    
    * test fix
    
    * fix test
    
    * fix test
    
    * slightly improve multi_apply_unary_op to avoid RecursionError
    
    * update recursion limit for nox session
    
    * skip the test in e2e/python 3.12
    
    * simplify code
    
    * simplify code
    Genesis929 authored Dec 9, 2024
    9dcf1aa
  2. refactor: consolidate reshaping functions under the reshape package (#1194)
    
    * consolidate reshaping functions under the reshape package
    
    * format code
    
    * fix import
    
    * fix import
    
    * lint code
    sycai authored Dec 9, 2024
    d638f7c
  3. feat: Series.isin supports bigframes.Series arg (#1195)
    0d8a16b
  4. feat: add ARIMAPlus.predict_explain() to generate forecasts with explanation columns (#1177)
    
    * feat: create arima_plus_predict_attribution method
    
    * tmp: debug notes for time_series_arima_plus_model.predict_attribution
    
    * update test_arima_plus_predict_explain_default test and create test_arima_plus_predict_explain_params test
    
    * Merge branch 'ml-predict-explain' of github.com:googleapis/python-bigquery-dataframes into ml-predict-explain
    
    * update  test_arima_plus_predict_explain_params test
    
    * Revert "tmp: debug notes for time_series_arima_plus_model.predict_attribution"
    
    This reverts commit f6dd455.
    
    * format and lint
    
    * Update bigframes/ml/forecasting.py
    
    Co-authored-by: Tim Sweña (Swast) <swast@google.com>
    
    * update predict explain params test
    
    * update test
    
    * 🦉 Updates from OwlBot post-processor
    
    See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
    
    * add unit test file - bare bones
    
    * 🦉 Updates from OwlBot post-processor
    
    See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
    
    * fixed tests
    
    * 🦉 Updates from OwlBot post-processor
    
    See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
    
    * lint
    
    * lint
    
    * fix test: float -> int
    
    ---------
    
    Co-authored-by: Chelsea Lin <chelsealin@google.com>
    Co-authored-by: Tim Sweña (Swast) <swast@google.com>
    Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
    4 people authored Dec 9, 2024
    05f8b4d

Commits on Dec 10, 2024

  1. feat: Add support for temporal types in dataframe's describe() method (#1189)
    
    * feat: Add support for temporal types in dataframe's describe() method
    
    * add type hint to make mypy happy
    
    * directly use output_type to check agg op support for input type
    
    * format code
    
    * 🦉 Updates from OwlBot post-processor
    
    See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
    
    * perf: update df.corr, df.cov to be used with more than 30 columns case. (#1161)
    
    * perf: update df.corr, df.cov to be used with more than 30 columns case.
    
    * add large test
    
    * remove print
    
    * fix_index
    
    * fix index
    
    * test fix
    
    * fix test
    
    * fix test
    
    * lint code
    
    * fix import path
    sycai authored Dec 10, 2024
    2d564a6
  2. test: fix load test local run (#1201)

    * test: fix load test local run
    
    * add back removed
    GarrettWu authored Dec 10, 2024
    d9898e6

Commits on Dec 11, 2024

  1. fix: Fix series.isin using local path always (#1202)
    a44eafd
  2. chore(main): release 1.28.0 (#1159)

    Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>
    release-please[bot] authored Dec 11, 2024
    7c12e69
Showing with 49,104 additions and 1,725 deletions.
  1. +0 −2 .github/ISSUE_TEMPLATE/bug_report.md
  2. +37 −0 CHANGELOG.md
  3. +20 −0 bigframes/_config/bigquery_options.py
  4. +2 −2 bigframes/_config/experiment_options.py
  5. +27 −28 bigframes/bigquery/_operations/search.py
  6. +55 −29 bigframes/core/__init__.py
  7. +75 −49 bigframes/core/blocks.py
  8. +35 −34 bigframes/core/compile/aggregate_compiler.py
  9. +56 −43 bigframes/core/compile/compiled.py
  10. +9 −12 bigframes/core/compile/compiler.py
  11. +3 −3 bigframes/core/compile/concat.py
  12. +17 −13 bigframes/core/compile/default_ordering.py
  13. +27 −19 bigframes/core/compile/ibis_types.py
  14. +178 −125 bigframes/core/compile/scalar_op_compiler.py
  15. +4 −4 bigframes/core/compile/schema_translator.py
  16. +5 −5 bigframes/core/compile/single_column.py
  17. +50 −13 bigframes/core/convert.py
  18. +6 −0 bigframes/core/identifiers.py
  19. +2 −2 bigframes/core/indexers.py
  20. +0 −13 bigframes/core/indexes/base.py
  21. +105 −60 bigframes/core/nodes.py
  22. +2 −2 bigframes/core/ordering.py
  23. +0 −187 bigframes/core/reshape/__init__.py
  24. +5 −7 bigframes/core/{joins/__init__.py → reshape/api.py}
  25. +106 −0 bigframes/core/reshape/concat.py
  26. +5 −0 bigframes/core/{joins → reshape}/merge.py
  27. +129 −0 bigframes/core/reshape/tile.py
  28. +26 −0 bigframes/core/rewrite/__init__.py
  29. +59 −0 bigframes/core/rewrite/identifiers.py
  30. +190 −0 bigframes/core/rewrite/implicit_align.py
  31. +2 −235 bigframes/core/{rewrite.py → rewrite/legacy_align.py}
  32. +228 −0 bigframes/core/rewrite/slices.py
  33. +32 −31 bigframes/core/sql.py
  34. +264 −32 bigframes/dataframe.py
  35. +13 −4 bigframes/dtypes.py
  36. +6 −4 bigframes/functions/_remote_function_session.py
  37. +3 −3 bigframes/functions/_utils.py
  38. +3 −3 bigframes/functions/remote_function.py
  39. +1 −1 bigframes/ml/base.py
  40. +4 −4 bigframes/ml/cluster.py
  41. +2 −2 bigframes/ml/compose.py
  42. +8 −0 bigframes/ml/core.py
  43. +3 −3 bigframes/ml/decomposition.py
  44. +16 −16 bigframes/ml/ensemble.py
  45. +40 −3 bigframes/ml/forecasting.py
  46. +3 −3 bigframes/ml/imported.py
  47. +2 −2 bigframes/ml/impute.py
  48. +8 −8 bigframes/ml/linear_model.py
  49. +16 −12 bigframes/ml/llm.py
  50. +1 −0 bigframes/ml/loader.py
  51. +10 −10 bigframes/ml/metrics/_metrics.py
  52. +3 −3 bigframes/ml/metrics/pairwise.py
  53. +5 −5 bigframes/ml/model_selection.py
  54. +4 −4 bigframes/ml/pipeline.py
  55. +14 −14 bigframes/ml/preprocessing.py
  56. +1 −1 bigframes/ml/remote.py
  57. +8 −0 bigframes/ml/sql.py
  58. +42 −56 bigframes/ml/utils.py
  59. +4 −0 bigframes/operations/__init__.py
  60. +3 −2 bigframes/operations/_matplotlib/__init__.py
  61. +37 −7 bigframes/operations/_matplotlib/core.py
  62. +0 −11 bigframes/operations/aggregations.py
  63. +1 −1 bigframes/operations/base.py
  64. +36 −0 bigframes/operations/blob.py
  65. +9 −1 bigframes/operations/plotting.py
  66. +4 −4 bigframes/operations/semantics.py
  67. +1 −0 bigframes/operations/type.py
  68. +7 −126 bigframes/pandas/__init__.py
  69. +1 −2 bigframes/pandas/io/api.py
  70. +7 −1 bigframes/series.py
  71. +4 −4 bigframes/session/__init__.py
  72. +18 −2 bigframes/session/clients.py
  73. +2 −0 bigframes/session/loader.py
  74. +1 −1 bigframes/version.py
  75. +1 −1 notebooks/experimental/semantic_operators.ipynb
  76. +3 −5 owlbot.py
  77. +73 −1 samples/snippets/classification_boosted_tree_model_test.py
  78. +50 −0 samples/snippets/ordering_mode_partial_test.py
  79. +10 −2 setup.py
  80. +0 −3 testing/constraints-3.11.txt
  81. +0 −1 testing/constraints-3.9.txt
  82. +29 −2 tests/system/conftest.py
  83. +1 −1 tests/system/large/operations/conftest.py
  84. +42 −0 tests/system/large/test_dataframe.py
  85. +6 −2 tests/system/load/test_llm.py
  86. +59 −0 tests/system/small/core/test_convert.py
  87. +1 −0 tests/system/small/ml/test_core.py
  88. +8 −54 tests/system/small/ml/test_ensemble.py
  89. +63 −0 tests/system/small/ml/test_forecasting.py
  90. +9 −19 tests/system/small/ml/test_linear_model.py
  91. +3 −3 tests/system/small/ml/test_llm.py
  92. +4 −4 tests/system/small/ml/test_utils.py
  93. +12 −0 tests/system/small/operations/test_plotting.py
  94. +55 −13 tests/system/small/test_dataframe.py
  95. +6 −2 tests/system/small/test_multiindex.py
  96. +41 −1 tests/system/small/test_null_index.py
  97. +68 −0 tests/system/small/test_series.py
  98. +9 −0 tests/unit/_config/test_bigquery_options.py
  99. +8 −6 tests/unit/core/test_dtypes.py
  100. +4 −4 tests/unit/core/test_rewrite.py
  101. +34 −45 tests/unit/core/test_sql.py
  102. +58 −0 tests/unit/ml/test_forecasting.py
  103. +2 −1 tests/unit/ml/test_golden_sql.py
  104. +0 −83 tests/unit/operations/test_aggregations.py
  105. +0 −6 tests/unit/resources.py
  106. +0 −17 tests/unit/test_formatting_helper.py
  107. +14 −0 tests/unit/test_formatting_helpers.py
  108. +5 −8 tests/unit/test_planner.py
  109. +1 −1 tests/unit/test_remote_function.py
  110. +109 −0 third_party/bigframes_vendored/ibis/__init__.py
  111. +1,431 −0 third_party/bigframes_vendored/ibis/backends/__init__.py
  112. +1,366 −0 third_party/bigframes_vendored/ibis/backends/bigquery/__init__.py
  113. +30 −27 third_party/bigframes_vendored/ibis/backends/bigquery/backend.py
  114. +178 −0 third_party/bigframes_vendored/ibis/backends/bigquery/client.py
  115. +22 −0 third_party/bigframes_vendored/ibis/backends/bigquery/converter.py
  116. +10 −9 third_party/bigframes_vendored/ibis/backends/bigquery/datatypes.py
  117. 0 third_party/bigframes_vendored/ibis/backends/bigquery/udf/__init__.py
  118. +604 −0 third_party/bigframes_vendored/ibis/backends/bigquery/udf/core.py
  119. +64 −0 third_party/bigframes_vendored/ibis/backends/bigquery/udf/find.py
  120. +54 −0 third_party/bigframes_vendored/ibis/backends/bigquery/udf/rewrite.py
  121. +650 −0 third_party/bigframes_vendored/ibis/backends/sql/__init__.py
  122. +5 −3 third_party/bigframes_vendored/ibis/backends/sql/compilers/__init__.py
  123. +19 −19 third_party/bigframes_vendored/ibis/backends/sql/compilers/base.py
  124. +770 −0 third_party/bigframes_vendored/ibis/backends/sql/compilers/bigquery.py
  125. +34 −27 third_party/bigframes_vendored/ibis/backends/sql/compilers/bigquery/__init__.py
  126. +547 −0 third_party/bigframes_vendored/ibis/backends/sql/datatypes.py
  127. +13 −13 third_party/bigframes_vendored/ibis/backends/sql/rewrites.py
  128. 0 third_party/bigframes_vendored/ibis/common/__init__.py
  129. +651 −0 third_party/bigframes_vendored/ibis/common/annotations.py
  130. +248 −0 third_party/bigframes_vendored/ibis/common/bases.py
  131. +98 −0 third_party/bigframes_vendored/ibis/common/caching.py
  132. +379 −0 third_party/bigframes_vendored/ibis/common/collections.py
  133. +632 −0 third_party/bigframes_vendored/ibis/common/deferred.py
  134. +218 −0 third_party/bigframes_vendored/ibis/common/dispatch.py
  135. +830 −0 third_party/bigframes_vendored/ibis/common/egraph.py
  136. +173 −0 third_party/bigframes_vendored/ibis/common/exceptions.py
  137. +775 −0 third_party/bigframes_vendored/ibis/common/graph.py
  138. +232 −0 third_party/bigframes_vendored/ibis/common/grounds.py
  139. +50 −0 third_party/bigframes_vendored/ibis/common/numeric.py
  140. +1,712 −0 third_party/bigframes_vendored/ibis/common/patterns.py
  141. +33 −0 third_party/bigframes_vendored/ibis/common/selectors.py
  142. +262 −0 third_party/bigframes_vendored/ibis/common/temporal.py
  143. +284 −0 third_party/bigframes_vendored/ibis/common/typing.py
  144. +192 −0 third_party/bigframes_vendored/ibis/config.py
  145. +2,477 −0 third_party/bigframes_vendored/ibis/expr/api.py
  146. +312 −0 third_party/bigframes_vendored/ibis/expr/builders.py
  147. +70 −0 third_party/bigframes_vendored/ibis/expr/datashape.py
  148. +19 −0 third_party/bigframes_vendored/ibis/expr/datatypes/__init__.py
  149. +148 −0 third_party/bigframes_vendored/ibis/expr/datatypes/cast.py
  150. +1,131 −0 third_party/bigframes_vendored/ibis/expr/datatypes/core.py
  151. +211 −0 third_party/bigframes_vendored/ibis/expr/datatypes/parse.py
  152. +374 −0 third_party/bigframes_vendored/ibis/expr/datatypes/value.py
  153. +482 −0 third_party/bigframes_vendored/ibis/expr/decompile.py
  154. +373 −0 third_party/bigframes_vendored/ibis/expr/format.py
  155. +16 −0 third_party/bigframes_vendored/ibis/expr/operations/__init__.py
  156. +104 −16 third_party/bigframes_vendored/ibis/expr/operations/analytic.py
  157. +269 −0 third_party/bigframes_vendored/ibis/expr/operations/arrays.py
  158. +196 −0 third_party/bigframes_vendored/ibis/expr/operations/core.py
  159. +333 −0 third_party/bigframes_vendored/ibis/expr/operations/generic.py
  160. +501 −0 third_party/bigframes_vendored/ibis/expr/operations/geospatial.py
  161. +53 −0 third_party/bigframes_vendored/ibis/expr/operations/histograms.py
  162. +86 −3 third_party/bigframes_vendored/ibis/expr/operations/json.py
  163. +171 −0 third_party/bigframes_vendored/ibis/expr/operations/logical.py
  164. +101 −0 third_party/bigframes_vendored/ibis/expr/operations/maps.py
  165. +376 −0 third_party/bigframes_vendored/ibis/expr/operations/numeric.py
  166. +389 −12 third_party/bigframes_vendored/ibis/expr/operations/reductions.py
  167. +531 −0 third_party/bigframes_vendored/ibis/expr/operations/relations.py
  168. +43 −0 third_party/bigframes_vendored/ibis/expr/operations/sortkeys.py
  169. +396 −0 third_party/bigframes_vendored/ibis/expr/operations/strings.py
  170. +62 −0 third_party/bigframes_vendored/ibis/expr/operations/structs.py
  171. +86 −0 third_party/bigframes_vendored/ibis/expr/operations/subqueries.py
  172. +475 −0 third_party/bigframes_vendored/ibis/expr/operations/temporal.py
  173. +660 −0 third_party/bigframes_vendored/ibis/expr/operations/udf.py
  174. +114 −0 third_party/bigframes_vendored/ibis/expr/operations/window.py
  175. +11 −11 third_party/bigframes_vendored/ibis/expr/rewrites.py
  176. +167 −0 third_party/bigframes_vendored/ibis/expr/rules.py
  177. +303 −0 third_party/bigframes_vendored/ibis/expr/schema.py
  178. +383 −0 third_party/bigframes_vendored/ibis/expr/sql.py
  179. +21 −0 third_party/bigframes_vendored/ibis/expr/types/__init__.py
  180. +1,129 −0 third_party/bigframes_vendored/ibis/expr/types/arrays.py
  181. +43 −0 third_party/bigframes_vendored/ibis/expr/types/binary.py
  182. +752 −0 third_party/bigframes_vendored/ibis/expr/types/core.py
  183. +177 −0 third_party/bigframes_vendored/ibis/expr/types/dataframe_interchange.py
  184. +2,439 −0 third_party/bigframes_vendored/ibis/expr/types/generic.py
  185. +1,757 −0 third_party/bigframes_vendored/ibis/expr/types/geospatial.py
  186. +296 −0 third_party/bigframes_vendored/ibis/expr/types/groupby.py
  187. +440 −0 third_party/bigframes_vendored/ibis/expr/types/joins.py
  188. +510 −0 third_party/bigframes_vendored/ibis/expr/types/json.py
  189. +554 −0 third_party/bigframes_vendored/ibis/expr/types/logical.py
  190. +500 −0 third_party/bigframes_vendored/ibis/expr/types/maps.py
  191. +1,219 −0 third_party/bigframes_vendored/ibis/expr/types/numeric.py
  192. +487 −0 third_party/bigframes_vendored/ibis/expr/types/pretty.py
  193. +4,890 −0 third_party/bigframes_vendored/ibis/expr/types/relations.py
  194. +1,714 −0 third_party/bigframes_vendored/ibis/expr/types/strings.py
  195. +403 −0 third_party/bigframes_vendored/ibis/expr/types/structs.py
  196. +995 −0 third_party/bigframes_vendored/ibis/expr/types/temporal.py
  197. +94 −0 third_party/bigframes_vendored/ibis/expr/types/temporal_windows.py
  198. +11 −0 third_party/bigframes_vendored/ibis/expr/types/typing.py
  199. +21 −0 third_party/bigframes_vendored/ibis/expr/types/uuid.py
  200. +255 −0 third_party/bigframes_vendored/ibis/expr/visualize.py
  201. +266 −0 third_party/bigframes_vendored/ibis/formats/__init__.py
  202. +103 −0 third_party/bigframes_vendored/ibis/formats/numpy.py
  203. +431 −0 third_party/bigframes_vendored/ibis/formats/pandas.py
  204. +194 −0 third_party/bigframes_vendored/ibis/formats/polars.py
  205. +381 −0 third_party/bigframes_vendored/ibis/formats/pyarrow.py
  206. +655 −0 third_party/bigframes_vendored/ibis/selectors.py
  207. +699 −0 third_party/bigframes_vendored/ibis/util.py
  208. +718 −26 third_party/bigframes_vendored/pandas/core/indexes/base.py
  209. +60 −0 third_party/bigframes_vendored/pandas/plotting/_core.py
  210. +1 −1 third_party/bigframes_vendored/version.py
2 changes: 0 additions & 2 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -27,15 +27,13 @@ If you are still having issues, please be sure to include as much information as
 import sys
 import bigframes
 import google.cloud.bigquery
-import ibis
 import pandas
 import pyarrow
 import sqlglot

 print(f"Python: {sys.version}")
 print(f"bigframes=={bigframes.__version__}")
 print(f"google-cloud-bigquery=={google.cloud.bigquery.__version__}")
-print(f"ibis=={ibis.__version__}")
 print(f"pandas=={pandas.__version__}")
 print(f"pyarrow=={pyarrow.__version__}")
 print(f"sqlglot=={sqlglot.__version__}")
37 changes: 37 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,43 @@

[1]: https://pypi.org/project/bigframes/#history

## [1.28.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v1.27.0...v1.28.0) (2024-12-11)


### Features

* (Series | DataFrame).plot.bar ([#1152](https://github.com/googleapis/python-bigquery-dataframes/issues/1152)) ([0fae2e0](https://github.com/googleapis/python-bigquery-dataframes/commit/0fae2e0291ec8d22341b5b543e8f1b384f83cd3c))
* `bigframes.bigquery.vector_search` supports `use_brute_force` and `fraction_lists_to_search` parameters ([#1158](https://github.com/googleapis/python-bigquery-dataframes/issues/1158)) ([131edc3](https://github.com/googleapis/python-bigquery-dataframes/commit/131edc3d79f46d35a25422f0db7f150e63e8f561))
* Add `ARIMAPlus.predict_explain()` to generate forecasts with explanation columns ([#1177](https://github.com/googleapis/python-bigquery-dataframes/issues/1177)) ([05f8b4d](https://github.com/googleapis/python-bigquery-dataframes/commit/05f8b4d2b2b5f624097228e65a3c42364fc40d36))
* Add client_endpoints_override to bq options ([#1167](https://github.com/googleapis/python-bigquery-dataframes/issues/1167)) ([be74b99](https://github.com/googleapis/python-bigquery-dataframes/commit/be74b99977cfbd513def5b7e439de6b7706c0712))
* Add support for temporal types in dataframe's describe() method ([#1189](https://github.com/googleapis/python-bigquery-dataframes/issues/1189)) ([2d564a6](https://github.com/googleapis/python-bigquery-dataframes/commit/2d564a6a9925b69c7e9a15b532fb66ad68c3e264))
* Allow join-free alignment of analytic expressions ([#1168](https://github.com/googleapis/python-bigquery-dataframes/issues/1168)) ([daef4f0](https://github.com/googleapis/python-bigquery-dataframes/commit/daef4f0c7c5ff2d0a4e9a6ffefeb81f43780ac8b))
* Series.isin supports bigframes.Series arg ([#1195](https://github.com/googleapis/python-bigquery-dataframes/issues/1195)) ([0d8a16b](https://github.com/googleapis/python-bigquery-dataframes/commit/0d8a16ba77a66dce544d0a7cf411fca0adc2a694))
* Update llm.TextEmbeddingGenerator to 005 ([#1186](https://github.com/googleapis/python-bigquery-dataframes/issues/1186)) ([3072d38](https://github.com/googleapis/python-bigquery-dataframes/commit/3072d382c6ff57bdb37d7e080c794c67dbf6e701))


### Bug Fixes

* Fix error loading local dataframes into bigquery ([#1165](https://github.com/googleapis/python-bigquery-dataframes/issues/1165)) ([5b355ef](https://github.com/googleapis/python-bigquery-dataframes/commit/5b355efde122ed76b1cff39900ab8f94f5a13a30))
* Fix null index join with 'on' arg ([#1153](https://github.com/googleapis/python-bigquery-dataframes/issues/1153)) ([9015c33](https://github.com/googleapis/python-bigquery-dataframes/commit/9015c33e73675ebb2299487dce3295732ea0527e))
* Fix series.isin using local path always ([#1202](https://github.com/googleapis/python-bigquery-dataframes/issues/1202)) ([a44eafd](https://github.com/googleapis/python-bigquery-dataframes/commit/a44eafdd95eb1b994dc82411640b61fd0a78a492))


### Performance Improvements

* Update df.corr, df.cov to be used with more than 30 columns case. ([#1161](https://github.com/googleapis/python-bigquery-dataframes/issues/1161)) ([9dcf1aa](https://github.com/googleapis/python-bigquery-dataframes/commit/9dcf1aa918919704dcf4d12b05935b22fb502fc6))


### Documentation

* Add a code sample using `bpd.options.bigquery.ordering_mode = "partial"` ([#909](https://github.com/googleapis/python-bigquery-dataframes/issues/909)) ([f80d705](https://github.com/googleapis/python-bigquery-dataframes/commit/f80d70503b80559a0b1fe64434383aa3e028bf9b))
* Add snippet for creating boosted tree model ([#1142](https://github.com/googleapis/python-bigquery-dataframes/issues/1142)) ([a972668](https://github.com/googleapis/python-bigquery-dataframes/commit/a972668833a454fb18e6cb148697165edd46e8cc))
* Add snippet for evaluating a boosted tree model ([#1154](https://github.com/googleapis/python-bigquery-dataframes/issues/1154)) ([9d8970a](https://github.com/googleapis/python-bigquery-dataframes/commit/9d8970ac1f18b2520a061ac743e767ca8593cc8c))
* Add snippet for predicting classifications using a boosted tree model ([#1156](https://github.com/googleapis/python-bigquery-dataframes/issues/1156)) ([e7b83f1](https://github.com/googleapis/python-bigquery-dataframes/commit/e7b83f166ef56e631120050103c2f43f454fce44))
* Add third party `pandas.Index methods` and docstrings ([#1171](https://github.com/googleapis/python-bigquery-dataframes/issues/1171)) ([a970294](https://github.com/googleapis/python-bigquery-dataframes/commit/a9702945286fbe500ade4d0f0c14cc60a8aa00eb))
* Fix Bigframes.Pandas.General_Function missing docs ([#1164](https://github.com/googleapis/python-bigquery-dataframes/issues/1164)) ([de923d0](https://github.com/googleapis/python-bigquery-dataframes/commit/de923d01b904b96cc51dfd526b6a412f28ff10c4))
* Update `bigframes.pandas.Index` docstrings ([#1144](https://github.com/googleapis/python-bigquery-dataframes/issues/1144)) ([557ab8d](https://github.com/googleapis/python-bigquery-dataframes/commit/557ab8df526fcf743af0a609ec7ec636b00d0c0b))

## [1.27.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v1.26.0...v1.27.0) (2024-11-16)
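
The section headings in this changelog line up with the type prefixes on the commit titles listed earlier (`feat`, `fix`, `perf`, `docs`), while `chore`, `refactor`, and `test` commits do not appear. A minimal sketch of that grouping follows, assuming the prefix-to-heading mapping below; the mapping, the regular expression, and `group_commits` are illustrative names, not the release tool's actual code.

```python
import re
from collections import defaultdict

# Assumed mapping from commit-type prefix to changelog heading.
SECTION_BY_TYPE = {
    "feat": "Features",
    "fix": "Bug Fixes",
    "perf": "Performance Improvements",
    "docs": "Documentation",
}

# Matches "type: summary" and "type(scope): summary".
TITLE_RE = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?: (?P<summary>.+)$")


def group_commits(titles):
    """Group commit titles by changelog heading, skipping unlisted types."""
    sections = defaultdict(list)
    for title in titles:
        match = TITLE_RE.match(title)
        if match is None:
            continue
        section = SECTION_BY_TYPE.get(match.group("type"))
        if section is not None:
            summary = match.group("summary")
            # The changelog capitalizes the first letter of each summary.
            sections[section].append(summary[0].upper() + summary[1:])
    return dict(sections)


titles = [
    "feat: add client_endpoints_override to bq options (#1167)",
    "docs: add snippet for creating boosted tree model (#1142)",
    "chore: close session after partial ordering mode sample (#1173)",
    "perf: update df.corr, df.cov to be used with more than 30 columns case. (#1161)",
]
grouped = group_commits(titles)
```

Here the `chore` commit is dropped and the other three land under their headings, which matches how the 1.28.0 entries above are arranged.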


20 changes: 20 additions & 0 deletions bigframes/_config/bigquery_options.py
@@ -91,6 +91,7 @@ def __init__(
skip_bq_connection_check: bool = False,
*,
ordering_mode: Literal["strict", "partial"] = "strict",
client_endpoints_override: dict = {},
):
self._credentials = credentials
self._project = project
@@ -103,6 +104,7 @@ def __init__(
self._session_started = False
# Determines the ordering strictness for the session.
self._ordering_mode = _validate_ordering_mode(ordering_mode)
self._client_endpoints_override = client_endpoints_override

@property
def application_name(self) -> Optional[str]:
@@ -317,3 +319,21 @@ def ordering_mode(self) -> Literal["strict", "partial"]:
@ordering_mode.setter
def ordering_mode(self, ordering_mode: Literal["strict", "partial"]) -> None:
self._ordering_mode = _validate_ordering_mode(ordering_mode)

@property
def client_endpoints_override(self) -> dict:
"""Option that sets the BQ client endpoints addresses directly as a dict. Possible keys are "bqclient", "bqconnectionclient", "bqstoragereadclient"."""
return self._client_endpoints_override

@client_endpoints_override.setter
def client_endpoints_override(self, value: dict):
warnings.warn(
"This is an advanced configuration option for directly setting endpoints. Incorrect use may lead to unexpected behavior or system instability. Proceed only if you fully understand its implications."
)

if self._session_started and self._client_endpoints_override != value:
raise ValueError(
SESSION_STARTED_MESSAGE.format(attribute="client_endpoints_override")
)

self._client_endpoints_override = value
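
The setter above warns on every assignment and rejects a changed value once the session has started. A minimal standalone sketch of that guard follows; `EndpointOptions` and the text of `SESSION_STARTED_MESSAGE` are hypothetical stand-ins for illustration, not the bigframes class or constant, and the constructor copies its argument as one possible way to avoid sharing a mutable default.

```python
import warnings

# Hypothetical message template; the real constant is defined elsewhere in bigframes.
SESSION_STARTED_MESSAGE = "Cannot change '{attribute}' once a session has started."


class EndpointOptions:
    """Illustrative stand-in for an options object, not the bigframes class."""

    def __init__(self, client_endpoints_override=None):
        self._session_started = False
        # Copy the input so callers never share a mutable default.
        self._client_endpoints_override = dict(client_endpoints_override or {})

    @property
    def client_endpoints_override(self) -> dict:
        return self._client_endpoints_override

    @client_endpoints_override.setter
    def client_endpoints_override(self, value: dict) -> None:
        warnings.warn("Overriding client endpoints is an advanced option.")
        # Re-assigning the same value stays legal after the session starts.
        if self._session_started and self._client_endpoints_override != value:
            raise ValueError(
                SESSION_STARTED_MESSAGE.format(attribute="client_endpoints_override")
            )
        self._client_endpoints_override = value
```

Comparing against the stored value, rather than rejecting every post-start assignment, lets configuration code that re-applies identical settings run without error.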
4 changes: 2 additions & 2 deletions bigframes/_config/experiment_options.py
@@ -21,8 +21,8 @@ class ExperimentOptions:
"""

def __init__(self):
self._semantic_operators = False
self._blob = False
self._semantic_operators: bool = False
self._blob: bool = False

@property
def semantic_operators(self) -> bool:
55 changes: 27 additions & 28 deletions bigframes/bigquery/_operations/search.py
@@ -18,7 +18,6 @@
import typing
from typing import Collection, Literal, Mapping, Optional, Union

import bigframes_vendored.constants as constants
import google.cloud.bigquery as bigquery

import bigframes.core.sql
@@ -96,10 +95,10 @@ def vector_search(
query: Union[dataframe.DataFrame, series.Series],
*,
query_column_to_search: Optional[str] = None,
top_k: Optional[int] = 10,
distance_type: Literal["euclidean", "cosine"] = "euclidean",
top_k: Optional[int] = None,
distance_type: Optional[Literal["euclidean", "cosine", "dot_product"]] = None,
fraction_lists_to_search: Optional[float] = None,
use_brute_force: bool = False,
use_brute_force: Optional[bool] = None,
) -> dataframe.DataFrame:
"""
Conduct vector search which searches embeddings to find semantically similar entities.
@@ -141,7 +140,8 @@ def vector_search(
... base_table="bigframes-dev.bigframes_tests_sys.base_table",
... column_to_search="my_embedding",
... query=search_query,
... top_k=2)
... top_k=2,
... use_brute_force=True)
embedding id my_embedding distance
dog [1. 2.] 1 [1. 2.] 0.0
cat [3. 5.2] 5 [5. 5.4] 2.009975
@@ -185,17 +185,18 @@ def vector_search(
find nearest neighbors. The column must have a type of ``ARRAY<FLOAT64>``. All elements in
the array must be non-NULL and all values in the column must have the same array dimensions
as the values in the ``column_to_search`` column. Can only be set when query is a DataFrame.
top_k (int, default 10):
top_k (int):
Specifies the number of nearest neighbors to return. Default to 10.
distance_type (str, default "euclidean"):
Specifies the type of metric to use to compute the distance between two vectors.
Possible values are "euclidean" and "cosine". Default to "euclidean".
Possible values are "euclidean", "cosine" and "dot_product".
Default to "euclidean".
fraction_lists_to_search (float, range in [0.0, 1.0]):
Specifies the percentage of lists to search. Specifying a higher percentage leads to
higher recall and slower performance, and the converse is true when specifying a lower
percentage. It is only used when a vector index is also used. You can only specify
``fraction_lists_to_search`` when ``use_brute_force`` is set to False.
use_brute_force (bool, default False):
use_brute_force (bool):
Determines whether to use brute force search by skipping the vector index if one is available.
Default to False.
@@ -204,37 +205,35 @@ def vector_search(
"""
import bigframes.series

if not fraction_lists_to_search and use_brute_force is True:
raise ValueError(
"You can't specify fraction_lists_to_search when use_brute_force is set to True."
)
if (
isinstance(query, bigframes.series.Series)
and query_column_to_search is not None
):
raise ValueError(
"You can't specify query_column_to_search when query is a Series."
)
# TODO(ashleyxu): Support options in vector search. b/344019989
if fraction_lists_to_search is not None or use_brute_force is True:
raise NotImplementedError(
f"fraction_lists_to_search and use_brute_force is not supported. {constants.FEEDBACK_LINK}"
)
options = {
"base_table": base_table,
"column_to_search": column_to_search,
"query_column_to_search": query_column_to_search,
"distance_type": distance_type,
"top_k": top_k,
"fraction_lists_to_search": fraction_lists_to_search,
"use_brute_force": use_brute_force,
}

(query,) = utils.convert_to_dataframe(query)
# Only populate options if not set to the default value.
# This avoids accidentally setting options that are mutually exclusive.
options = None
if fraction_lists_to_search is not None:
options = {} if options is None else options
options["fraction_lists_to_search"] = fraction_lists_to_search
if use_brute_force is not None:
options = {} if options is None else options
options["use_brute_force"] = use_brute_force

(query,) = utils.batch_convert_to_dataframe(query)
sql_string, index_col_ids, index_labels = query._to_sql_query(include_index=True)

sql = bigframes.core.sql.create_vector_search_sql(
sql_string=sql_string, options=options # type: ignore
sql_string=sql_string,
base_table=base_table,
column_to_search=column_to_search,
query_column_to_search=query_column_to_search,
top_k=top_k,
distance_type=distance_type,
options=options,
)
if index_col_ids is not None:
df = query._session.read_gbq(sql, index_col=index_col_ids)
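
The comment in this hunk says options are only populated when the caller sets them, so mutually exclusive options are never sent by accident. A minimal sketch of that pattern, using a hypothetical helper `build_search_options` that is not part of bigframes:

```python
from typing import Any, Dict, Optional


def build_search_options(
    fraction_lists_to_search: Optional[float] = None,
    use_brute_force: Optional[bool] = None,
) -> Optional[Dict[str, Any]]:
    """Return only the options the caller set explicitly, or None if none were set."""
    options: Dict[str, Any] = {}
    if fraction_lists_to_search is not None:
        options["fraction_lists_to_search"] = fraction_lists_to_search
    # `is not None` keeps an explicit False, unlike a truthiness check.
    if use_brute_force is not None:
        options["use_brute_force"] = use_brute_force
    return options or None
```

Using `None` as the "unset" sentinel is what makes `use_brute_force=False` distinguishable from "not specified", which is presumably why the signature changed from `bool = False` to `Optional[bool] = None`.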
84 changes: 55 additions & 29 deletions bigframes/core/__init__.py
@@ -26,7 +26,6 @@
import pyarrow as pa
import pyarrow.feather as pa_feather

import bigframes.core.compile
import bigframes.core.expression as ex
import bigframes.core.guid
import bigframes.core.identifiers as ids
@@ -35,15 +34,13 @@
import bigframes.core.nodes as nodes
from bigframes.core.ordering import OrderingExpression
import bigframes.core.ordering as orderings
import bigframes.core.rewrite
import bigframes.core.schema as schemata
import bigframes.core.tree_properties
import bigframes.core.utils
from bigframes.core.window_spec import WindowSpec
import bigframes.dtypes
import bigframes.operations as ops
import bigframes.operations.aggregations as agg_ops
import bigframes.session._io.bigquery

if typing.TYPE_CHECKING:
from bigframes.session import Session
@@ -199,6 +196,8 @@ def as_cached(

def _try_evaluate_local(self):
"""Use only for unit testing paths - not fully featured. Will throw exception if fails."""
import bigframes.core.compile

return bigframes.core.compile.test_only_try_evaluate(self.node)

def get_column_type(self, key: str) -> bigframes.dtypes.Dtype:
@@ -422,22 +421,7 @@ def relational_join(
l_mapping = { # Identity mapping, only rename right side
lcol.name: lcol.name for lcol in self.node.ids
}
r_mapping = { # Rename conflicting names
rcol.name: rcol.name
if (rcol.name not in l_mapping)
else bigframes.core.guid.generate_guid()
for rcol in other.node.ids
}
other_node = other.node
if set(other_node.ids) & set(self.node.ids):
other_node = nodes.SelectionNode(
other_node,
tuple(
(ex.deref(old_id), ids.ColumnId(new_id))
for old_id, new_id in r_mapping.items()
),
)

other_node, r_mapping = self.prepare_join_names(other)
join_node = nodes.JoinNode(
left_child=self.node,
right_child=other_node,
@@ -449,14 +433,63 @@ def relational_join(
)
return ArrayValue(join_node), (l_mapping, r_mapping)

def try_align_as_projection(
def try_row_join(
self,
other: ArrayValue,
conditions: typing.Tuple[typing.Tuple[str, str], ...] = (),
) -> Optional[
typing.Tuple[ArrayValue, typing.Tuple[dict[str, str], dict[str, str]]]
]:
l_mapping = { # Identity mapping, only rename right side
lcol.name: lcol.name for lcol in self.node.ids
}
other_node, r_mapping = self.prepare_join_names(other)
import bigframes.core.rewrite

result_node = bigframes.core.rewrite.try_join_as_projection(
self.node, other_node, conditions
)
if result_node is None:
return None

return (
ArrayValue(result_node),
(l_mapping, r_mapping),
)

def prepare_join_names(
self, other: ArrayValue
) -> Tuple[bigframes.core.nodes.BigFrameNode, dict[str, str]]:
if set(other.node.ids) & set(self.node.ids):
r_mapping = { # Rename conflicting names
rcol.name: rcol.name
if (rcol.name not in self.column_ids)
else bigframes.core.guid.generate_guid()
for rcol in other.node.ids
}
return (
nodes.SelectionNode(
other.node,
tuple(
(ex.deref(old_id), ids.ColumnId(new_id))
for old_id, new_id in r_mapping.items()
),
),
r_mapping,
)
else:
return other.node, {id: id for id in other.column_ids}

def try_legacy_row_join(
self,
other: ArrayValue,
join_type: join_def.JoinType,
join_keys: typing.Tuple[join_def.CoalescedColumnMapping, ...],
mappings: typing.Tuple[join_def.JoinColumnMapping, ...],
) -> typing.Optional[ArrayValue]:
result = bigframes.core.rewrite.join_as_projection(
import bigframes.core.rewrite

result = bigframes.core.rewrite.legacy_join_as_projection(
self.node, other.node, join_keys, mappings, join_type
)
if result is not None:
@@ -488,11 +521,4 @@ def _gen_namespaced_uid(self) -> str:
return self._gen_namespaced_uids(1)[0]

def _gen_namespaced_uids(self, n: int) -> List[str]:
i = len(self.node.defined_variables)
genned_ids: List[str] = []
while len(genned_ids) < n:
attempted_id = f"col_{i}"
if attempted_id not in self.node.defined_variables:
genned_ids.append(attempted_id)
i = i + 1
return genned_ids
return [ids.ColumnId.unique().name for _ in range(n)]
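
The `prepare_join_names` helper in this file keeps right-side column names that do not collide with the left side and replaces the colliding ones with generated identifiers. A simplified sketch of that mapping on plain strings; `rename_conflicts` and the `uuid`-based names are assumptions for illustration (the source calls `bigframes.core.guid.generate_guid()`):

```python
import uuid
from typing import Dict, Iterable


def rename_conflicts(left_ids: Iterable[str], right_ids: Iterable[str]) -> Dict[str, str]:
    """Map each right-side name to itself, or to a fresh name if it clashes with the left."""
    left = set(left_ids)
    mapping: Dict[str, str] = {}
    for name in right_ids:
        if name in left:
            # Hypothetical naming scheme; any collision-free generator would do.
            mapping[name] = f"col_{uuid.uuid4().hex}"
        else:
            mapping[name] = name
    return mapping
```

The left side keeps an identity mapping, so only the right-side mapping needs to be consulted to translate column references after the join.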