Add documentation for MS fabric package install + config (#2180)
janet-can authored Oct 22, 2024
1 parent c70fb78 commit a59292a
Showing 2 changed files with 24 additions and 5 deletions.
17 changes: 16 additions & 1 deletion docs/configuration.md
@@ -15,7 +15,7 @@ Alternatively, you can provide data source connection configurations in the cont
1. Soda Core connects with Spark DataFrames in a unique way, using programmatic scans.
* If you are using Spark DataFrames, follow the configuration details in [Connect to Spark DataFrames](https://docs.soda.io/soda/connect-spark.html#connect-to-spark-dataframes).
* If you are *not* using Spark DataFrames, continue to step 2.
2. Create a `configuration.yml` file. This file stores connection details for your data sources. Use the data source-specific connection configurations listed below to copy+paste the connection syntax into your file, then adjust the values to correspond with your data source's details. You can use [system variables](#provide-credentials-as-system-variables) to pass sensitive values, if you wish. Access connection details in the [Connect a data source](https://docs.soda.io/soda/connect-athena.html) section of Soda documentation; the MS Fabric connection configuration appears below instead, as MS Fabric is supported in Soda Core only.
3. Save the `configuration.yml` file, then create another new YAML file named `checks.yml`.
4. A Soda Check is a test that Soda Core performs when it scans a dataset in your data source. The checks YAML file stores the Soda Checks you write using [SodaCL](https://docs.soda.io/soda-cl/soda-cl-overview.html). Copy+paste the following basic check syntax into your file, then adjust the value for `dataset_name` to correspond with the name of one of the datasets in your data source.
```yaml
checks for dataset_name:
  - row_count > 0
```
@@ -25,6 +25,21 @@ Alternatively, you can provide data source connection configurations in the cont
5. Save the changes to the `checks.yml` file.
6. Next: [run a scan](/docs/scan-core.md) of the data in your data source.
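
For reference, the scan described in step 6 is typically run from the command line; a minimal sketch, assuming the `configuration.yml` and `checks.yml` files created above and a data source named `my_data_source_name`:
```shell
# scan one data source using the configuration and checks files from the steps above
soda scan -d my_data_source_name -c configuration.yml checks.yml
```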

#### MS Fabric connection configuration

Add the following to your `configuration.yml` file:
```yaml
data_source my_data_source_name:
  type: fabric
  host: xxx
  database: xxx
  schema: xxx
  driver: ODBC Driver 18 for SQL Server
  client_id: xxx
  client_secret: xxx
  encrypt: True
  authentication: xxx
```
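
Because MS Fabric support is available in Soda Core only, you may also want to run scans programmatically. A minimal sketch using Soda Core's Python `Scan` API, assuming the `configuration.yml` and `checks.yml` files described above:
```python
from soda.scan import Scan

scan = Scan()
scan.set_data_source_name("my_data_source_name")   # must match the name in configuration.yml
scan.add_configuration_yaml_file("configuration.yml")
scan.add_sodacl_yaml_file("checks.yml")

exit_code = scan.execute()   # non-zero when checks fail or errors occur
print(scan.get_logs_text())
```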

## Provide credentials as system variables

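As an illustration, a minimal sketch of passing the MS Fabric service principal credentials through system variables, assuming variables named `FABRIC_CLIENT_ID` and `FABRIC_CLIENT_SECRET` are exported in the environment that runs the scan:
```yaml
data_source my_data_source_name:
  type: fabric
  host: xxx
  database: xxx
  schema: xxx
  driver: ODBC Driver 18 for SQL Server
  client_id: ${FABRIC_CLIENT_ID}         # resolved from the environment at scan time
  client_secret: ${FABRIC_CLIENT_SECRET} # keeps the secret out of the YAML file
  encrypt: True
  authentication: xxx
```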
12 changes: 8 additions & 4 deletions docs/installation.md
@@ -18,8 +18,8 @@ Use Soda Core to scan a variety of data sources.<br />

<table>
<tr>
<td>Amazon Athena<br /> Amazon Redshift<br /> Apache Spark DataFrames<sup>1</sup><br /> Apache Spark for Databricks SQL<br /> Azure Synapse<br />ClickHouse<br /> Dask and Pandas<sup>1</sup><br /> Denodo<br />Dremio <br />DuckDB<br /> GCP Big Query<br /> IBM DB2</td>
<td>Local file using Dask<sup>1</sup><br />MotherDuck<br /> MS Fabric<br />MS SQL Server<br /> MySQL<br /> OracleDB<br /> PostgreSQL<br /> Snowflake<br /> Teradata<br />Trino<br /> Vertica</td>
</tr>
</table>
<sup>1</sup> For use with programmatic Soda scans, only.
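
As a sketch of footnote 1: a programmatic scan attaches Soda to an existing Spark session rather than a connection configuration. The `spark` session and `df` DataFrame below are assumptions for illustration:
```python
from soda.scan import Scan

# assumes an existing SparkSession `spark` and a DataFrame `df`
df.createOrReplaceTempView("customers")  # expose the DataFrame as a view Soda can query

scan = Scan()
scan.set_data_source_name("spark_df")
scan.add_spark_session(spark, data_source_name="spark_df")
scan.add_sodacl_yaml_str("checks for customers:\n  - row_count > 0")
scan.execute()
```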
@@ -64,6 +64,7 @@ If you have not already installed Python, consider using <a href="https://github
| IBM DB2 | `soda-core-db2` |
| Local file | Use Dask. |
| MotherDuck | `soda-core-duckdb` |
| MS Fabric | `soda-core-fabric` |
| MS SQL Server | `soda-core-sqlserver` |
| MySQL | `soda-core-mysql` |
| OracleDB | `soda-core-oracle` |
@@ -99,16 +100,19 @@ deactivate
| ----------- | --------------- |
| Amazon Athena | `soda-core-athena` |
| Amazon Redshift | `soda-core-redshift` |
| Apache Spark DataFrames <br /> (For use with [programmatic Soda scans]({% link soda-core/programmatic.md %}), only.) | `soda-core-spark-df` |
| Azure Synapse (Experimental) | `soda-core-sqlserver` |
| ClickHouse (Experimental) | `soda-core-mysql` |
| Dask and Pandas (Experimental) | `soda-core-pandas-dask` |
| Databricks | `soda-core-spark[databricks]` |
| Denodo (Experimental) | `soda-core-denodo` |
| Dremio | `soda-core-dremio` |
| DuckDB (Experimental) | `soda-core-duckdb` |
| GCP Big Query | `soda-core-bigquery` |
| IBM DB2 | `soda-core-db2` |
| Local file | Use Dask. |
| MotherDuck | `soda-core-duckdb` |
| MS Fabric | `soda-core-fabric` |
| MS SQL Server | `soda-core-sqlserver` |
| MySQL | `soda-core-mysql` |
| OracleDB | `soda-core-oracle` |
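
Putting the installation steps together, a minimal sketch of the virtual-environment flow these tables describe, using the MS Fabric package added in this commit:
```shell
# create and activate a virtual environment, then install one package per data source
python -m venv .venv
source .venv/bin/activate
pip install soda-core-fabric

# ...run soda scans here...

# leave the virtual environment when finished
deactivate
```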
