Skip to content

FEAT: Adding setdecoding() API for connection #173

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: jahnvi/conn_setencoding
Choose a base branch
from

Conversation

jahnvi480
Copy link
Contributor

@jahnvi480 jahnvi480 commented Aug 7, 2025

Work Item / Issue Reference

AB#34918


Summary

This pull request introduces new functionality for configuring and retrieving text decoding settings in the Connection class of the mssql_python package. The main changes add support for a new special SQL type flag (SQL_WMETADATA) to allow explicit control over how column metadata is decoded, and provide two new methods (setdecoding and getdecoding) for managing decoding configuration per SQL type.

Enhancements to decoding configuration:

  • Added a new constant, SQL_WMETADATA, in both mssql_python/__init__.py and mssql_python/connection.py, to allow explicit configuration of column name decoding.
  • Initialized a _decoding_settings dictionary in the Connection class to store decoding settings for SQL_CHAR, SQL_WCHAR, and SQL_WMETADATA, with sensible Python 3 defaults.
  • Introduced the setdecoding method to the Connection class, allowing users to configure the decoding (encoding and ctype) for each SQL type, including validation and error handling.
  • Added the getdecoding method to the Connection class, enabling retrieval of the current decoding settings for a specific SQL type, with validation and error handling.

Testing configuration:

  • Updated the conn_str fixture in tests/conftest.py to use a hardcoded connection string, likely for local testing purposes.

@Copilot Copilot AI review requested due to automatic review settings August 7, 2025 10:09
@github-actions github-actions bot added pr-size: large Substantial code update labels Aug 7, 2025
@jahnvi480 jahnvi480 changed the base branch from main to jahnvi/conn_setencoding August 7, 2025 10:10
@github-actions github-actions bot added pr-size: large Substantial code update and removed pr-size: large Substantial code update labels Aug 7, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request introduces a comprehensive encoding/decoding configuration system for the Connection class, enabling users to control how text is encoded and decoded when interacting with SQL Server databases. The changes provide explicit control over text handling for SQL statements and results with Python 3 defaults and include security improvements for logging user input.

  • Added setencoding, getencoding, setdecoding, and getdecoding methods for comprehensive text encoding/decoding control
  • Introduced module-level constants (SQL_CHAR, SQL_WCHAR, SQL_WMETADATA) and helper functions for encoding validation
  • Added sanitize_user_input helper function to prevent log injection attacks

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
tests/test_003_connection.py Extensive test suite covering all encoding/decoding functionality with various scenarios
tests/conftest.py Hardcoded connection string for improved test reliability
mssql_python/type.py Minor documentation clarification removing pyodbc reference
mssql_python/helpers.py Added sanitize_user_input function for secure logging
mssql_python/connection.py Core implementation of encoding/decoding methods with validation and security features
mssql_python/__init__.py Export of SQL constants for public API access


def test_setencoding_invalid_encoding(db_connection):
"""Test setencoding with invalid encoding."""
from mssql_python.exceptions import ProgrammingError
Copy link
Preview

Copilot AI Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import statement is duplicated - ProgrammingError is already imported at the top of the file at line 21. Remove this redundant import statement.

Suggested change
from mssql_python.exceptions import ProgrammingError

Copilot uses AI. Check for mistakes.


def test_setencoding_closed_connection(conn_str):
"""Test setencoding on closed connection."""
from mssql_python.exceptions import InterfaceError
Copy link
Preview

Copilot AI Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import statement is duplicated - InterfaceError is already imported at the top of the file at line 21. Remove this redundant import statement.

Suggested change
from mssql_python.exceptions import InterfaceError

Copilot uses AI. Check for mistakes.


def test_setencoding_invalid_encoding(conn_str):
"""Test setencoding with invalid encoding raises ProgrammingError"""
from mssql_python.exceptions import ProgrammingError
Copy link
Preview

Copilot AI Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import statement is duplicated - ProgrammingError is already imported at the top of the file at line 21. Remove this redundant import statement.

Suggested change
from mssql_python.exceptions import ProgrammingError

Copilot uses AI. Check for mistakes.


def test_setencoding_invalid_ctype(conn_str):
"""Test setencoding with invalid ctype raises ProgrammingError"""
from mssql_python.exceptions import ProgrammingError
Copy link
Preview

Copilot AI Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import statement is duplicated - ProgrammingError is already imported at the top of the file at line 21. Remove this redundant import statement.

Suggested change
from mssql_python.exceptions import ProgrammingError

Copilot uses AI. Check for mistakes.

assert "Invalid ctype" in str(exc_info.value), "Should raise ProgrammingError for invalid ctype"
assert "999" in str(exc_info.value), "Error message should include the invalid ctype value"

def test_setencoding_closed_connection(conn_str):
Copy link
Preview

Copilot AI Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function name is duplicated - there are two test functions with the same name test_setencoding_closed_connection at lines 571 and 852. The second function will override the first one. Rename one of them to have a unique name.

Suggested change
def test_setencoding_closed_connection(conn_str):
def test_setencoding_closed_connection_variant(conn_str):

Copilot uses AI. Check for mistakes.

finally:
conn.close()

def test_setencoding_invalid_encoding(conn_str):
Copy link
Preview

Copilot AI Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function name is duplicated - there are two test functions with the same name test_setencoding_invalid_encoding at lines 551 and 830. The second function will override the first one. Rename one of them to have a unique name.

Suggested change
def test_setencoding_invalid_encoding(conn_str):
def test_setencoding_invalid_encoding_raises_error(conn_str):

Copilot uses AI. Check for mistakes.

finally:
conn.close()

def test_setencoding_invalid_ctype(conn_str):
Copy link
Preview

Copilot AI Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function name is duplicated - there are two test functions with the same name test_setencoding_invalid_ctype at lines 561 and 841. The second function will override the first one. Rename one of them to have a unique name.

Suggested change
def test_setencoding_invalid_ctype(conn_str):
def test_setencoding_invalid_ctype_error(conn_str):

Copilot uses AI. Check for mistakes.

@@ -18,7 +18,7 @@ def pytest_configure(config):

@pytest.fixture(scope='session')
def conn_str():
conn_str = os.getenv('DB_CONNECTION_STRING')
conn_str = "Server=tcp:DESKTOP-1A982SC,1433;Database=master;TrustServerCertificate=yes;Trusted_Connection=yes;"
Copy link
Preview

Copilot AI Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded connection string contains specific server name and configuration. This creates a dependency on a specific machine configuration and may expose internal infrastructure details. Consider using environment variables or configuration files instead.

Suggested change
conn_str = "Server=tcp:DESKTOP-1A982SC,1433;Database=master;TrustServerCertificate=yes;Trusted_Connection=yes;"
conn_str = os.environ.get("MSSQL_CONN_STR")
if not conn_str:
raise RuntimeError("Environment variable MSSQL_CONN_STR is not set. Please set it to your SQL Server connection string.")

Copilot uses AI. Check for mistakes.

Comment on lines 35 to 40
# Cache for encoding validation to improve performance
# Using a simple dict instead of lru_cache for module-level caching
_ENCODING_VALIDATION_CACHE = {}
_CACHE_MAX_SIZE = 100 # Limit cache size to prevent memory bloat


Copy link
Preview

Copilot AI Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Global cache dictionary is defined but never used in the code. The _validate_encoding function uses @lru_cache decorator instead. Remove this unused global variable or implement the manual caching logic if intended.

Suggested change
# Cache for encoding validation to improve performance
# Using a simple dict instead of lru_cache for module-level caching
_ENCODING_VALIDATION_CACHE = {}
_CACHE_MAX_SIZE = 100 # Limit cache size to prevent memory bloat

Copilot uses AI. Check for mistakes.

_CACHE_MAX_SIZE = 100 # Limit cache size to prevent memory bloat


@lru_cache(maxsize=128)
Copy link
Preview

Copilot AI Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Global cache size constant is defined but never used in the code. The _validate_encoding function uses a hardcoded maxsize=128 in the @lru_cache decorator. Remove this unused constant or use it in the decorator.

Suggested change
@lru_cache(maxsize=128)
@lru_cache(maxsize=_CACHE_MAX_SIZE)

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-size: large Substantial code update
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant