Skip to content

Conversation

jahnvi480
Copy link
Contributor

@jahnvi480 jahnvi480 commented Aug 21, 2025

Work Item / Issue Reference

AB#34907
AB#34908


Summary

This pull request adds support for configuring the decimal separator used when parsing and displaying NUMERIC/DECIMAL values in the mssql_python package. It introduces new Python and C++ APIs to control the decimal separator, ensures the separator is respected in string representations, and includes comprehensive tests to verify the new functionality.

Decimal separator configuration and propagation:

  • Added setDecimalSeparator and getDecimalSeparator functions in mssql_python/__init__.py to allow users to set and retrieve the decimal separator character, with input validation and propagation to the C++ layer (DDBCSetDecimalSeparator).
  • Introduced a global g_decimalSeparator variable and the DDBCSetDecimalSeparator function in C++ (ddbc_bindings.cpp and ddbc_bindings.h) to store and update the separator, and exposed this setter to Python via pybind11.

Integration with data handling and display:

  • Updated the C++ data fetching logic to use the configured decimal separator when converting NUMERIC/DECIMAL database values to Python Decimal objects, ensuring correct parsing and formatting.
  • Modified the Python Row.__str__ method to use the current decimal separator when displaying decimal values, so string representations reflect user configuration.

Testing:

  • Added and expanded tests in tests/test_001_globals.py and tests/test_004_cursor.py to cover the new decimal separator functionality, including validation, database operations, string formatting, and ensuring calculations are unaffected by the separator setting.
    These changes make the decimal separator fully configurable and ensure consistent behavior across both parsing and display of decimal values.

@github-actions github-actions bot added the pr-size: medium Moderate update size label Aug 21, 2025
@@ -271,3 +271,9 @@ inline std::wstring Utf8ToWString(const std::string& str) {
return converter.from_bytes(str);
#endif
}

// Global decimal separator setting
extern std::string g_decimalSeparator;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No mention of thread-safety for the global separator variable. If multiple threads use this setting, this could be a problem.
The extern std::string g_decimalSeparator; declaration introduces a global variable that can be accessed concurrently by multiple threads without any thread-safety mechanisms. Consider using std::atomic<std::string> or adding mutex protection.

@@ -2480,6 +2494,14 @@ void enable_pooling(int maxSize, int idleTimeout) {
});
}

// Global decimal separator setting with default value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The global variable std::string g_decimalSeparator = "."; can be modified concurrently without synchronization. This could lead to data races when multiple threads access this variable simultaneously.

// Global decimal separator setting with default value
std::string g_decimalSeparator = ".";

void DDBCSetDecimalSeparator(const std::string& separator) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple threads calling setDecimalSeparator() from Python can modify g_decimalSeparator concurrently, potentially causing:

Data corruption during string assignment
Inconsistent reads by other threads
Undefined behavior
Recommendation: Add mutex protection around this assignment or use atomic operations.

std::string numStr(reinterpret_cast<const char*>(
&buffers.charBuffers[col - 1][i * MAX_DIGITS_IN_NUMERIC]),
buffers.indicators[col - 1][i]);
if (g_decimalSeparator != ".") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FetchBatchData function reads g_decimalSeparator without synchronization while another thread could be modifying it via DDBCSetDecimalSeparator(). This can result in:

Reading partially written/corrupted separator values
Inconsistent decimal formatting within the same batch
String corruption during replacement operations

This affects critical data parsing functionality and could cause incorrect decimal value formatting.

Recommendation: Use atomic operations or mutex protection for both read and write access to g_decimalSeparator.

if (g_decimalSeparator != ".") >>> This may result in Unsafe Read
// Replace the driver's decimal point with our configured separator
size_t pos = numStr.find('.');
if (pos != std::string::npos) {
numStr.replace(pos, 1, g_decimalSeparator); >>> This may result in usage of potentially corrupted value

Copy link
Contributor

@sumitmsft sumitmsft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Global state in multi-threaded environment

The current implementation uses global state (g_decimalSeparator) without considering thread safety. Given that this library is likely to be used in multi-threaded applications (web servers, concurrent data processing), this could lead to:

Data corruption: Concurrent string operations on g_decimalSeparator
Inconsistent behavior: Different threads seeing different separator values
Race conditions: Database operations using incorrect separators
Suggested Solutions:

Use std::atomicstd::string for the global variable
Add mutex protection around read/write operations
Consider thread-local storage for separator configuration
Encapsulate separator in a thread-safe configuration object/context

@sumitmsft
Copy link
Contributor

@jahnvi480
Ensure edge cases are covered:
Setting an invalid separator (empty string, multiple characters, non-ASCII)
Changing the separator during active DB operations
Multi-threaded scenarios >> This can be todo as we do not anticipate multithreaded env without async.
Changing the separator back and forth and confirming correct parsing/display each time
What happens if the separator is set after fetching a result set but before display

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-size: medium Moderate update size
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants