docs/integrations/ibis typo fix (DAGWorks-Inc#727)
Co-authored-by: zilto <tjean@DESKTOP-V6JDCS2>
zilto and zilto authored Mar 1, 2024
1 parent d4b4c71 commit 80ad335
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/integrations/ibis/index.md
@@ -51,7 +51,7 @@ Ibis alleviates a major pain point by enabling data transformations to work acro
### Write modular Ibis code
Hamilton was initially developed to [structure pandas code for a large catalog of features](https://blog.dagworks.io/p/tidy-production-pandas-with-hamilton-3b759a2bf562), and has been adopted by multiple organizations since. Its syntax encourages users to chunk code into meaningful and reusable components, which facilitates documentation, unit testing, code reviews, and improves iteration speed. These benefits directly translate to organizing Ibis code.

-Below, we'll refactor the above code to use Hamilton. Users have the flexibility to chunk code at the table or the column-level depending on the needed granularity. This modularity is particularly beneficial to Ibis because:
+Now, we'll refactor the above code to use Hamilton. Users have the flexibility to chunk code at the table or the column-level depending on the needed granularity. This modularity is particularly beneficial to Ibis because:

- Well-scoped functions with type annotations and docstring are easier to understand for new Ibis users and facilitate onboarding.
- Unit testing and data validation becomes easier with smaller expressions. These checks become more important when working across backends since the [operation coverage varies](https://ibis-project.org/support_matrix) and bugs may arise.
@@ -223,7 +223,7 @@ Hamilton can directly produce a dataflow visualization from code, helping with p
![ibis_full_viz](ibis_plan.png)

### Working across rows with user-defined functions (UDFs)
-Hamilton and most backends are designed to work primarily on tables and columns, but sometimes you'd like to operate over a row (think of `pd.DataFrame.apply()`). However, pivoting tables is costly and manually iterating over rows to collect values and create a new column is quickly inconvenient. With [using scalar user-defined functions (UDFs)](https://ibis-project.org/reference/scalar-udfs), Ibis makes it to execute arbitrary Python code on rows directly on the backend.
+Hamilton and most backends are designed to work primarily on tables and columns, but sometimes you'd like to operate over a row (think of `pd.DataFrame.apply()`). However, pivoting tables is costly and manually iterating over rows to collect values and create a new column is quickly inconvenient. By [using scalar user-defined functions (UDFs)](https://ibis-project.org/reference/scalar-udfs), Ibis makes it possible to execute arbitrary Python code on rows directly on the backend.

``` {note}
Using `@ibis.udf.scalar.python` creates a non-vectorized function that iterates row-by-row. See [the docs](https://ibis-project.org/reference/scalar-udfs) to use backend-specific UDFs with `@ibis.udf.scalar.builtin` and create vectorized scalar UDFs.