feat: support decimal #114

RaphDal · 2025-10-13T14:48:41Z

This PR adds decimal support.

This is a tandem PR for:

feat(core): support for Decimal questdb#6068

Usage

Decimal object

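from decimal import Decimal

from questdb.ingress import TimestampNanos

# `sender` is assumed to be an already-open questdb.ingress.Sender.
sender.row(
    'trades',
    symbols={
        'symbol': 'ETH-USD',
        'side': 'sell'},
    columns={
        'price': 2615.54,
        # String input keeps the decimal value exact.
        'amount': Decimal('0.00044'),
        },
    at=TimestampNanos.now())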
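
One detail worth illustrating for Decimal values in general: constructing a Decimal from a float literal carries over the float's binary rounding error, while constructing it from a string keeps the digits as written. A minimal standard-library sketch (no QuestDB API involved):

```python
from decimal import Decimal

from_float = Decimal(0.00044)    # inherits the binary float approximation
from_text = Decimal('0.00044')   # exactly the digits as written

# The string form has two significant digits and exponent -5;
# the float form expands to many more digits.
print(from_text.as_tuple())
print(len(from_float.as_tuple().digits))
print(from_float == from_text)
```

If the scale of the stored value matters, the string form is usually the safer choice.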

Progress

support binary and text formats
update questdb version when decimal is released

Summary by CodeRabbit

New Features
- End-to-end Decimal (fixed-point) support for ingestion, Arrow/Pandas paths, and protocol v3; public APIs now accept Decimal values.
Documentation
- Added Decimals datatype mapping and usage examples for Pandas, NumPy, and PyArrow.
Bug Fixes
- Improved validation and clearer protocol-version/error messages (now reference 1–3).
Tests
- Added tests for decimal serialization, Arrow decimals, special decimal values, and protocol v3.
Chores
- Added Decimal compatibility layer, CI updates to test decimal branch, and submodule update.

RaphDal · 2025-10-21T16:06:05Z

@CodeRabbit review

coderabbitai · 2025-10-21T16:06:30Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai · 2025-10-21T16:06:33Z

Walkthrough

Adds end-to-end Decimal support: docs, dataframe serialization and Arrow decimal paths, mpdecimal compatibility and decimal→binary conversion, new line-sender decimal APIs and error codes, protocol version 3 handling, tests and mock-server updates, plus a c-questdb-client submodule reference bump.

Changes

Cohort / File(s)	Summary
Submodule `c-questdb-client`	Submodule reference updated to a new commit.
Documentation `src/questdb/dataframe.md`	Adds Decimals documentation and examples (Pandas/NumPy/PyArrow), supported decimal sizes and null handling; duplicated for parity.
Dataframe serialization (Cython) `src/questdb/dataframe.pxi`	Adds decimal target/source enums and dispatch codes, `col.scale` field, byte-swap helpers, Arrow decimal resolution/validation, decimal serialization paths for pyobj and Arrow variants, and Decimal sniffing.
mpdecimal compatibility (C / Cython) `src/questdb/mpdecimal_compat.h`, `src/questdb/mpdecimal_compat.pxd`	New mpd_t / PyDecObject types and platform typedefs; MPD_RADIX and flags; `decimal_pyobj_to_binary` inline to convert Python Decimal → unscaled bytes + encoded scale with NaN/Inf and scale validation.
Ingress / Line-sender APIs `src/questdb/ingress.pyi`, `src/questdb/ingress.pyx`, `src/questdb/line_sender.pxd`	Adds Decimal to Buffer.row / SenderTransaction.row type hints; new `IngressErrorCode.DecimalError`; maps C error for invalid decimal; adds `line_sender_error_invalid_decimal`, `line_sender_protocol_version_3`, and line-sender decimal column functions; protocol version validation accepts 1–3 and protocol v3 wiring.
Tests & Mock server `test/test.py`, `test/test_dataframe.py`, `test/mock_server.py`, `test/system_test.py`	Adds decimal test helpers/constants (DECIMAL_BINARY_FORMAT_TYPE, encode/decode helpers), tests for Decimal pyobj and Arrow decimals, protocol-v3 test class and settings, mock-server defaults updated to include [1,2,3], and QUESTDB_VERSION/FIRST_DECIMAL_RELEASE updates.
CI `ci/run_tests_pipeline.yaml`	Adds steps to clone/compile a decimal branch of QuestDB and parallel test flow against it; minor YAML quoting and step renames.

Sequence Diagram(s)

sequenceDiagram
    participant User as User code
    participant API as Buffer / SenderTransaction
    participant Compat as mpdecimal_compat
    participant Serializer as Line Sender
    participant ILP as ILP Buffer

    User->>API: row(..., columns={'a': Decimal(...)})
    API->>Compat: decimal_pyobj_to_binary(PyDecimal)
    Note right of Compat: inspect mpd_t (flags, exp, digits)
    alt NaN or Inf
        Compat-->>Serializer: special indicator (empty payload, scale=0)
    else Normal Decimal
        Compat-->>Serializer: (unscaled_bytes, encoded_scale)
        Serializer->>Serializer: format decimal field (scale, width, bytes)
        Serializer->>ILP: append decimal field to ILP buffer
    end
    ILP-->>User: serialized ILP message
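
The conversion step in the diagram can be approximated in pure Python. This is only a sketch, not the PR's C implementation: `decimal_to_unscaled` is a hypothetical name, and the sketch assumes a big-endian, signed two's-complement mantissa of minimal length, with positive exponents folded into the integer. It encodes zero as a single zero byte and does not try to mirror how the real code treats zero.

```python
from decimal import Decimal
from typing import Optional, Tuple

MAX_SCALE = 76            # scale limit cited in the review
MAX_MANTISSA_BYTES = 127  # mantissa size limit cited in the review


def decimal_to_unscaled(value: Decimal) -> Optional[Tuple[bytes, int]]:
    """Hypothetical pure-Python analogue of decimal_pyobj_to_binary.

    Returns (unscaled_bytes, scale), or None for NaN/Infinity.
    """
    if not value.is_finite():
        return None
    sign, digits, exponent = value.as_tuple()
    unscaled = int(''.join(map(str, digits)))
    if exponent >= 0:
        # Assumption: fold a positive exponent into the integer, scale 0.
        unscaled *= 10 ** exponent
        scale = 0
    else:
        scale = -exponent
    if scale > MAX_SCALE:
        raise ValueError(f'scale {scale} exceeds {MAX_SCALE}')
    if sign:
        unscaled = -unscaled
    # Smallest two's-complement width: magnitude bits plus one sign bit.
    magnitude_bits = (unscaled if unscaled >= 0 else ~unscaled).bit_length()
    length = (magnitude_bits + 1 + 7) // 8
    if length > MAX_MANTISSA_BYTES:
        raise ValueError(f'mantissa needs {length} bytes')
    # Assumption: big-endian, signed two's complement on the wire.
    return unscaled.to_bytes(length, 'big', signed=True), scale


print(decimal_to_unscaled(Decimal('123.45')))
```

For `Decimal('123.45')` the unscaled integer is 12345 with scale 2, so the value is recovered as `unscaled / 10**scale`.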

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Focus review on:
- mpdecimal_compat.h / mpdecimal_compat.pxd: struct layout, platform typedefs, MPD_RADIX and flag semantics.
- decimal_pyobj_to_binary: digit assembly, endianness, signedness, NaN/Inf propagation, and scale limit enforcement.
- dataframe.pxi: new enums/dispatch codes, col_t.scale usage, Arrow decimal resolution and per-variant serializers.
- line_sender boundary functions and protocol-version handling for ABI/backwards compatibility.
- Tests/mock-server/CI: correctness of new test helpers, version gating, and default mock settings.

Poem

🐰
I nibbled digits, counted scales with care,
Packed bytes in order, big-endian flair.
Arrow and Decimal now hop the stream,
Protocol three widened our scheme.
Precision tucked in every carrot I share.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat: support decimal' clearly and concisely summarizes the main change: adding Decimal support to py-questdb-client, which aligns with the extensive changes across multiple files implementing decimal handling.



coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/questdb/ingress.pyx (1)
1043-1049: Minor: fix valid-types error message (missing comma concatenates entries)

'datetime.datetime' 'numpy.ndarray' concatenates into one token.

Apply this diff:
-                'TimestampMicros',
-                'datetime.datetime'
-                'numpy.ndarray'))
+                'TimestampMicros',
+                'datetime.datetime',
+                'numpy.ndarray'))
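
For context, Python joins adjacent string literals at compile time, which is why the missing comma goes unnoticed. A small standalone illustration (the tuple contents are copied from the diff, the variable names are made up for the example):

```python
# Adjacent string literals are concatenated, so a missing comma
# silently merges two tuple entries into one.
missing_comma = (
    'TimestampMicros',
    'datetime.datetime'
    'numpy.ndarray')
with_comma = (
    'TimestampMicros',
    'datetime.datetime',
    'numpy.ndarray')

print(', '.join(missing_comma))
print(', '.join(with_comma))
```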
src/questdb/ingress.pyi (1)
1030-1039: Add Decimal to the columns type signature.

The Sender.row method's columns parameter is missing Decimal in its type union, while both Buffer.row (line 386) and SenderTransaction.row (line 207) include it. This inconsistency will cause type checking errors when users try to pass Decimal values to Sender.row.

Apply this diff to add Decimal to the type union:
     columns: Optional[
-            Dict[str, Union[bool, int, float, str, TimestampMicros, datetime, np.ndarray]]
+            Dict[str, Union[bool, int, float, str, TimestampMicros, datetime, np.ndarray, Decimal]]
     ] = None,

🧹 Nitpick comments (6)

src/questdb/ingress.pyx (1)

2414-2429: Doc nit: mention protocol version 3

Property doc explains v1 and v2 only. Consider adding a short note for v3 to avoid confusion.

src/questdb/dataframe.md (1)

96-106: Decimal docs read well

Clear coverage across pandas/NumPy/Arrow with examples; fits new tests. Consider adding the supported scale range (0–76) note here for completeness.

Also applies to: 129-157
src/questdb/ingress.pyi (1)
709-711: Clarify null representation for Decimal columns.

The table shows Y (NaN) for nulls in the Decimal row. However, NaN is typically associated with float types. For Decimal objects, nulls are represented as None or pandas.NA, not NaN. Consider changing this to just Y or Y (None) for clarity.

Apply this diff if you agree:
             * - ``'object'`` (``Decimal`` objects)
-              - Y (``NaN``)
+              - Y
               - ``DECIMAL``
src/questdb/mpdecimal_compat.h (1)
1-19: Document CPython version compatibility assumptions.

This compatibility layer relies on CPython's internal Decimal implementation details (struct layout and limb size). These internals may change between CPython versions. Consider:

Adding a comment documenting which CPython versions are supported (e.g., 3.8+)

Adding runtime checks in the Cython code to verify struct layout hasn't changed

Noting in documentation that this is a best-effort compatibility layer

Example comment to add:
+/* 
+ * Compatibility layer for CPython's decimal module (libmpdec).
+ * Tested with CPython 3.8 through 3.12.
+ * May break with future CPython versions if internal Decimal layout changes.
+ */
+
 /* Determine the limb type used by CPython's libmpdec build. */
 #if SIZE_MAX == UINT64_MAX
src/questdb/dataframe.pxi (2)
59-73: Add comments explaining byte-swap usage for Arrow decimals.

The bswap32 and bswap64 functions are used later for Arrow decimal types (lines 2226, 2245, etc.), but it's not immediately clear why byte-swapping is needed. Arrow stores decimal values in little-endian byte order, while the ILP binary decimal format expects big-endian.

Add a comment explaining the endianness conversion:
+# Arrow decimal types store values in little-endian byte order.
+# These functions convert to the big-endian format expected by the ILP protocol.
 cdef inline uint32_t bswap32(uint32_t value):
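
As a reference for what a byte swap does, here is a Python stand-in for the two helpers. It is only an illustration of the operation on fixed-width unsigned integers, not the Cython code from the PR:

```python
def bswap32(value: int) -> int:
    """Reverse the byte order of an unsigned 32-bit integer."""
    return int.from_bytes(value.to_bytes(4, 'little'), 'big')


def bswap64(value: int) -> int:
    """Reverse the byte order of an unsigned 64-bit integer."""
    return int.from_bytes(value.to_bytes(8, 'little'), 'big')


print(hex(bswap32(0x11223344)))
print(hex(bswap64(0x0102030405060708)))
```

Applying either function twice returns the original value, which makes a cheap round-trip check in tests.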
2213-2295: LGTM! Arrow decimal serialization correctly handles all bit widths.

All four Arrow decimal serialization functions properly:

Check Arrow validity bitmaps

Send NULL for invalid values

Perform correct byte-swapping for endianness conversion

Use the stored scale from column metadata

The 128-bit and 256-bit handlers correctly swap both byte order within each 64-bit word and reverse the word order.
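
The two-step conversion for wide decimals can be checked in isolation. The sketch below assumes a little-endian 16-byte source and a big-endian target, and uses a hypothetical helper name; it shows that swapping bytes inside each 64-bit word and reversing the word order is equivalent to reversing all 16 bytes:

```python
import struct


def decimal128_le_to_be(raw: bytes) -> bytes:
    """Convert 16 little-endian bytes to big-endian via two 64-bit words."""
    low, high = struct.unpack('<2Q', raw)  # least-significant word first
    # Packing big-endian swaps bytes within each word; emitting `high`
    # first reverses the word order.
    return struct.pack('>2Q', high, low)


value = -12345
raw = value.to_bytes(16, 'little', signed=True)
converted = decimal128_le_to_be(raw)
print(converted == raw[::-1])
```

The same idea extends to 256-bit values with four words instead of two.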

Optional: Consider a helper function to reduce duplication.

The four functions have similar structure. You could extract common logic:
cdef void_int _arrow_decimal_to_bytes(
        col_t* col, 
        size_t byte_count,
        uint64_t* out_buffer,
        bint* valid_out) noexcept nogil:
    """Extract and byte-swap Arrow decimal to output buffer."""
    # Common extraction and swapping logic
This would reduce duplication and make maintenance easier, though the current approach is also acceptable.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 996b251 and 1d4c69f.

📒 Files selected for processing (10)

c-questdb-client (1 hunks)
src/questdb/dataframe.md (1 hunks)
src/questdb/dataframe.pxi (17 hunks)
src/questdb/ingress.pyi (7 hunks)
src/questdb/ingress.pyx (5 hunks)
src/questdb/line_sender.pxd (3 hunks)
src/questdb/mpdecimal_compat.h (1 hunks)
src/questdb/mpdecimal_compat.pxd (1 hunks)
test/test.py (4 hunks)
test/test_dataframe.py (5 hunks)

🧰 Additional context used

🪛 Clang (14.0.6)

src/questdb/mpdecimal_compat.h

[error] 4-4: 'Python.h' file not found

(clang-diagnostic-error)

🪛 Ruff (0.14.1)

test/test_dataframe.py

90-90: Avoid specifying long messages outside the exception class

(TRY003)

93-93: Avoid specifying long messages outside the exception class

(TRY003)

95-95: Avoid specifying long messages outside the exception class

(TRY003)

97-97: Avoid specifying long messages outside the exception class

(TRY003)

102-103: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)

GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_arm64)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_x86_64)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel macos_x64)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_pypy)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_i686)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_musllinux)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_manylinux_x86_64)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel start_linux_arm64_agent_aws)
GitHub Check: questdb.py-questdb-client (Building and testing on windows-msvc-2019)
GitHub Check: questdb.py-questdb-client (Building and testing on mac)
GitHub Check: questdb.py-questdb-client (Building and testing on linux-qdb-master)
GitHub Check: questdb.py-questdb-client (Building and testing on linux-old-pandas)
GitHub Check: questdb.py-questdb-client (Building and testing on linux)
GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion2x)
GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion1x)

🔇 Additional comments (29)

c-questdb-client (1)

1-1: Submodule reference update requires verification of C extension changes.

This file contains only a pointer update to the C extension submodule. The actual Decimal support implementation in the C extension (questdb/c-questdb-client) is not accessible for review from this context.

Given that the broader PR adds significant Decimal support across the Python wrapper (dataframe serialization, ILP ingestion, type signatures), ensure that:

The C extension at commit 5b17715... includes corresponding Decimal serialization/deserialization logic.

The binary protocol changes (if any) are compatible with the Python-side changes.

The submodule commit has been tested with the tandem QuestDB core PR (questdb/questdb#6068).

Note: This PR is marked "DO NOT MERGE" and depends on upstream changes.

To verify C extension compatibility, you may want to:

Inspect the C extension diff at the target commit to ensure Decimal support aligns with the Python wrapper changes.

Confirm that protocol version 3 support (mentioned in the AI summary) is implemented in the C extension.

Verify integration tests pass with the updated submodule.

test/test.py (3)

45-49: V3 pandas tests import looks good

Keeps suite discoverable only when pandas is present.

415-425: Protocol-version validation updates are correct

Treating 3 as valid and 4/'4' as invalid with the updated error text matches the new Sender/Buffer checks.

If CI still runs the “unsupported client for V3” test, please confirm it’s updated (or gated) to reflect that the client now supports V3.

Also applies to: 430-432

1478-1479: Public name fix for V2

Renaming to “protocol version 2” is consistent with the class.

test/test_dataframe.py (3)

84-121: Decimal payload helpers are solid

Helpers make intent clear and align with the binary format used in assertions.

570-585: Comprehensive decimal test coverage

Covers pyobj decimals (incl. special values) and Arrow decimals across widths; version-gated appropriately.

Also applies to: 586-597, 598-608, 609-646

1705-1709: Updated error-message regex

The new wording (“Unsupported arrow type …”) matches current behavior.

src/questdb/mpdecimal_compat.pxd (1)

24-71: Decimal → ILP conversion helper looks correct

Handles NaN/Inf as nulls.

Builds unscaled integer from mpd limbs correctly (LE limbs × MPD_RADIX).

Enforces max scale 76 and applies sign.

One note: zero encodes to an empty mantissa (length 0), which matches the tests’ “special values” treatment; confirm your wire format also expects empty mantissa for numeric zero, or adjust to emit a single 0x00 byte.
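
To make the limb arithmetic concrete: libmpdec stores the coefficient as words in base MPD_RADIX, least-significant word first. A minimal Python sketch of the recombination (the function name is hypothetical, and this is not the helper from the PR):

```python
MPD_RADIX_64 = 10 ** 19  # libmpdec radix on 64-bit builds (10**9 on 32-bit)


def limbs_to_int(limbs, radix=MPD_RADIX_64):
    """Combine least-significant-first limbs into one Python int."""
    value = 0
    for limb in reversed(limbs):
        value = value * radix + limb
    return value


# Two limbs: low word 5, high word 1.
print(limbs_to_int([5, 1]))
```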

src/questdb/ingress.pyx (1)

1241-1260: Decimal support is limited to dataframe() path, not row(); verify PR description scope and consider scoping Decimal to dataframe only

The review comment is accurate. After examining the codebase:

Buffer.row() columns parameter type hint excludes Decimal (only: bool, int, float, str, TimestampMicros, TimestampNanos, datetime.datetime, numpy.ndarray)

Decimal support via decimal_pyobj_to_binary is implemented only in dataframe.pxi

No _column_decimal method exists in the Buffer class; only _column_bool, _column_i64, _column_f64, _column_str, _column_ts_micros, _column_ts_nanos, _column_numpy

Decimal in dataframe requires protocol v3 (tests skip for version < 3)

If the PR description shows sender.row(... Decimal(...)), the documentation/example is inconsistent with the implementation. Either add Decimal support to row() (requiring _column_decimal and protocol v3 guard) or scope examples/docs to dataframe-only.

src/questdb/line_sender.pxd (2)

43-56: LGTM! Protocol version and error code additions are well-structured.

The addition of line_sender_error_invalid_decimal and line_sender_protocol_version_3 follows existing conventions and provides the necessary foundation for Decimal support.

268-282: LGTM! Decimal buffer functions follow established patterns.

The two new functions line_sender_buffer_column_dec_str and line_sender_buffer_column_dec are well-designed:

Consistent with existing column buffer functions

Support both text (string) and binary formats

Include proper error handling via err_out parameter

src/questdb/ingress.pyi (4)

43-61: LGTM! Import and error code additions are correct.

The import of Decimal and the addition of DecimalError to the IngressErrorCode enum are necessary for type checking support.

207-207: LGTM! Type signature correctly includes Decimal.

The addition of Decimal to the columns parameter type union in SenderTransaction.row enables proper type checking for decimal column values.

386-386: LGTM! Type signature correctly includes Decimal.

The addition of Decimal to the columns parameter type union in Buffer.row enables proper type checking for decimal column values.

407-456: LGTM! Documentation clearly illustrates Decimal usage.

The example usage and type mapping table additions help users understand:

How to pass Decimal values in the columns dict

The mapping from Python Decimal to ILP DECIMAL type
src/questdb/mpdecimal_compat.h (3)
21-35: Add runtime validation for struct layout assumptions.

The mpd_t and PyDecObject struct definitions assume a specific memory layout that matches CPython's internal implementation. If CPython changes these internals, this code will silently produce incorrect results or crash.

Consider adding runtime checks in the Cython initialization code (e.g., in mpdecimal_compat.pxd or module init) to verify:

Size of Python Decimal objects matches expectations

Basic sanity checks on extracted values (e.g., comparing against decimal module's official API)

Example validation approach:
# At module initialization
test_decimal = Decimal("123.45")
# Extract using compatibility layer
# Also extract using official decimal API
# Assert they match
This would catch breaking changes early rather than producing silent corruption.
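
A fuller sketch of that validation idea, under stated assumptions: `extract` is a hypothetical callable wrapping the C-level reader and returning an `(unscaled, scale)` pair, and the reference values come only from the public `decimal` API. The sample values and the choice of `ImportError` are illustrative.

```python
from decimal import Decimal


def reference_unscaled_and_scale(value: Decimal):
    """Expected (unscaled, scale) computed only with the public decimal API."""
    sign, digits, exponent = value.as_tuple()
    unscaled = int(''.join(map(str, digits)))
    return (-unscaled if sign else unscaled), -exponent


def check_compat_layer(extract, samples=('123.45', '-0.001', '42')):
    """Compare a compat-layer extractor against the public API.

    `extract` is a hypothetical wrapper around the struct-based reader.
    """
    for text in samples:
        value = Decimal(text)
        expected = reference_unscaled_and_scale(value)
        actual = extract(value)
        if actual != expected:
            raise ImportError(
                f'Decimal layout mismatch for {text}: {actual} != {expected}')


# Self-check: the reference implementation trivially agrees with itself.
check_compat_layer(reference_unscaled_and_scale)
```

Running such a check once at import time would turn a silent layout change into an immediate, explicit failure.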

37-44: LGTM! Accessor functions correctly handle inline vs heap storage.

The decimal_digits() function properly handles both storage modes:

Heap-allocated: uses dec->dec.data

Inline (small decimals): uses dec->data[4]

This matches CPython's optimization for small decimal values.

46-54: LGTM! Flag definitions match mpdecimal constants.

The flag enum and MPD_RADIX constant definitions are correct and consistent with libmpdec's public interface.
src/questdb/dataframe.pxi (11)

96-110: LGTM! Enum additions for decimal target are correct.

The addition of col_target_column_decimal = 9 and updating col_target_at = 10 maintains the enum sequence. The target name "decimal" in _TARGET_NAMES is consistent with other entries.

152-179: LGTM! Decimal source types cover all supported formats.

The five decimal source types provide comprehensive coverage:

col_source_decimal_pyobj: Python Decimal objects

col_source_decimal32/64/128/256_arrow: Arrow decimal types of different bit widths

The inclusion in _PYOBJ_SOURCE_DESCR enables clear error messages.

249-272: LGTM! Target-to-source mappings are complete.

The _TARGET_TO_SOURCES mapping correctly includes all five decimal source types for the col_target_column_decimal target. The addition to _FIELD_TARGETS ensures decimal columns are recognized as field columns.

397-406: LGTM! Dispatch codes follow established patterns.

The five dispatch codes combining col_target_column_decimal with each decimal source type enable efficient routing in the serialization switch statement. This follows the same pattern used for other column types.

427-432: LGTM! Scale field addition is well-documented.

The scale field in col_t correctly stores the decimal scale for Arrow types. The comment clearly indicates it's only used for Arrow decimals and defaults to 0. uint8_t is sufficient for the 0-76 scale range.

956-979: LGTM! Arrow decimal type resolution is comprehensive.

The _dataframe_series_resolve_arrow function correctly:

Handles all four Arrow decimal bit widths (32/64/128/256)

Validates scale is within the supported 0-76 range

Provides clear error messages with GitHub issue link

Stores the scale for later use in serialization

1046-1047: LGTM! Decimal detection in object columns is correct.

The isinstance(<object>obj, Decimal) check properly identifies Decimal objects in pandas object-dtype columns and sets the appropriate source type. This is consistent with other type detection logic in _dataframe_series_sniff_pyobj.

1163-1164: LGTM! ArrowDtype support enables Arrow-backed decimal columns.

The handling of _PANDAS.ArrowDtype by delegating to _dataframe_series_resolve_arrow correctly enables support for Arrow-backed columns in pandas, including Arrow decimal types.

1300-1315: LGTM! Formatting improves readability of bitwise operations.

The reformatting of the bitwise validity checks with explicit parentheses and line breaks makes the bit manipulation logic clearer without changing behavior.

2171-2210: LGTM! Python Decimal serialization handles all cases correctly.

The serialization function properly handles:

Null values (returns early)

Special values like NaN/Inf (sends as NULL to server)

Mantissa size validation (127-byte limit)

Error reporting via IngressError

The use of decimal_pyobj_to_binary encapsulates the complex decimal-to-binary conversion logic.

2456-2465: LGTM! Dispatch switch correctly routes all decimal types.

The five decimal dispatch cases properly route to their respective serialization functions. The GIL handling is correct:

decimal_pyobj doesn't pass gs (requires GIL)

Arrow variants pass gs (can release GIL)

src/questdb/ingress.pyx

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/questdb/dataframe.pxi (1)
972-977: Verify the decimal scale range (0-76) is correct.

The code validates that the Arrow decimal scale is between 0 and 76 inclusive. Please confirm this range matches QuestDB's decimal column constraints and is documented in the database's decimal support specifications.

If this range is database-specific, consider adding a constant or comment referencing the specification:
+    # QuestDB decimal scale range: 0-76 (see: https://questdb.io/docs/reference/sql/datatypes/#decimal)
     if arrowtype.scale < 0 or arrowtype.scale > 76:

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 1d4c69f and b62520a.

📒 Files selected for processing (1)

src/questdb/dataframe.pxi (17 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-10-22T07:31:37.016Z

Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: src/questdb/ingress.pyx:192-195
Timestamp: 2025-10-22T07:31:37.016Z
Learning: In src/questdb/ingress.pyx, the IngressErrorCode.BadDataFrame enum member is Python-internal only and does not map to any C error code from the underlying library. It is defined as `<int>line_sender_error_invalid_decimal + 1` and should not be added to the c_err_code_to_py function, which only handles C-to-Python error code translation.

Applied to files:

src/questdb/dataframe.pxi


🔇 Additional comments (6)

src/questdb/dataframe.pxi (6)

59-73: Byte-swap utilities look correct.

The inline bswap32 and bswap64 functions implement standard endianness conversion. These are essential for converting Arrow's little-endian decimal representations to the big-endian format expected by the ILP protocol.

Ensure the byte-swapping operations are thoroughly tested, especially for the complex multi-chunk swaps used in decimal128 and decimal256 serialization (lines 2265-2267, 2287-2291).

96-97: Decimal enum and mapping additions look correct.

The new col_target_column_decimal target, associated sources, and dispatch codes are consistently integrated into the existing enum and mapping structures. The col_target_at offset correctly shifted from 9 to 10.

Also applies to: 110-110, 179-179, 249-255, 271-272, 397-406

432-432: Scale field addition is appropriate.

The scale field is used to store the Arrow decimal type's scale (set during resolution on line 978) and is unused for non-decimal columns. For Python Decimal objects, the scale is extracted during serialization from the object itself.

1046-1047: Decimal type detection is well-integrated.

The decimal sniffing in object columns (lines 1046-1047) and ArrowDtype handling (lines 1163-1164) follow the established patterns and integrate cleanly with existing type detection logic.

Also applies to: 1163-1164

2456-2465: Decimal dispatch integration looks correct.

The dispatch logic correctly routes each decimal dispatch code to its corresponding serialization function, following the established pattern for other column types.

2171-2296: Verify decimal mantissa size limit and byte-swapping correctness.

The decimal serialization implementation has well-tested code paths for Python Decimals and Arrow decimal types, with proper special value handling (NaN/Infinity produce NULL markers). However:

Undocumented 127-byte mantissa limit (line 2197): The hardcoded limit lacks any explanation. Verify that it aligns with QuestDB's ILP wire format specification. A 127-byte mantissa (1016 bits) holds roughly 305 decimal digits, which seems reasonable, but confirm that the value is not arbitrary or outdated.

Complex byte-swapping for multi-chunk decimals (decimal128 at lines 2265–2267, decimal256 at lines 2287–2291): The chunk reversal and byte-swap logic is correct in principle but should be validated against actual serialized output, especially endianness correctness across platforms.

No comments on pointer arithmetic: The bit-shift indexing (offset << 1, offset << 2) is correct but could benefit from a brief inline comment explaining the multi-chunk array indexing.

Confirm:

127-byte limit matches QuestDB protocol constraints

Byte-swap output is verified correct via integration testing

Logic handles edge cases (min/max values, overflow near 127-byte boundary)
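
The digit capacity of a 127-byte mantissa can be checked with plain integer arithmetic. The sketch below assumes the mantissa is a signed two's-complement integer, which the review does not state explicitly:

```python
MANTISSA_BYTES = 127

# Largest value of a signed two's-complement integer of that width.
max_signed = 2 ** (MANTISSA_BYTES * 8 - 1) - 1
digits_in_max = len(str(max_signed))

# Every integer with one digit fewer than the maximum is guaranteed to fit.
guaranteed_digits = digits_in_max - 1
print(digits_in_max, guaranteed_digits)
```

The same calculation is a convenient starting point for boundary tests around the limit.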

src/questdb/dataframe.pxi

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

test/test.py (1)
437-437: Remove debug print statement.

This print statement appears to be a debug artifact that should be removed before merging.

Apply this diff:
-                    print("Testing bad protocol version:", version)

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 5d7ee5e and 5b4cbfa.

📒 Files selected for processing (1)

test/test.py (7 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

test/test.py (2)

test/test_dataframe.py (1)

TestPandasProtocolVersionV3 (1806-1808)

test/mock_server.py (2)

HttpServer (124-211)

port (210-211)


🔇 Additional comments (6)

test/test.py (6)

27-27: LGTM: Import updated for expanded protocol version coverage.

The import correctly reflects the expanded protocol version coverage in the tests (v1, v2, and v3 as supported versions, plus v4 as the unsupported case), aligning with changes in mock_server.py.

48-48: LGTM: Protocol version v3 test class import.

The import of TestPandasProtocolVersionV3 is consistent with the existing pattern for v1 and v2, and is properly conditional on pandas availability.

420-421: LGTM: Bad protocol version test correctly updated.

The test now correctly validates that version 4 is invalid (since only versions 1-3 are supported), and the error message accurately reflects the valid range.

Also applies to: 430-431

1152-1153: LGTM: HTTP auto protocol version test extended to v3.

The test correctly validates that when the server supports v1, v2, and v3, the client auto-negotiates to the highest version (v3).

1176-1177: LGTM: Unsupported protocol version test updated.

The test now correctly validates the scenario where the server only supports v4 (which the client doesn't support), ensuring proper forward-compatibility error handling.
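
The behaviour these two tests describe can be summarised as "pick the highest version both sides support, otherwise fail". A hypothetical sketch of that rule follows; the function name and error text are illustrative and not taken from the client:

```python
CLIENT_SUPPORTED = (1, 2, 3)  # versions the review says the client accepts


def negotiate_protocol_version(server_versions,
                               client_versions=CLIENT_SUPPORTED):
    """Pick the highest version both sides support, or raise."""
    common = set(server_versions) & set(client_versions)
    if not common:
        raise ValueError(
            f'no common protocol version: server offers '
            f'{sorted(server_versions)}, client supports '
            f'{sorted(client_versions)}')
    return max(common)


print(negotiate_protocol_version([1, 2, 3]))
```

A server that only offers version 4 leaves the intersection empty, which is the error path the unsupported-version test exercises.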

1479-1479: LGTM: Bug fix for incorrect test class name.

The name attribute now correctly reflects that this test class is for protocol version 2, not version 1.

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9b00b39 and cad427c.

📒 Files selected for processing (1)

test/system_test.py (1 hunks)

test/system_test.py

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

test/system_test.py (1)

149-151: Use string variant for “_str” test.

test_auth_tls_insecure_skip_verify_str passes a bool; use the string form to match the test’s intent.
-        self._test_scenario(self.qdb_auth, 'tcps', tls_verify=False, **AUTH)
+        self._test_scenario(self.qdb_auth, 'tcps', tls_verify='unsafe_off', **AUTH)

src/questdb/ingress.pyx (1)

1048-1057: Fix error message list; add missing comma and include Decimal/TimestampNanos.

Current message concatenates 'datetime.datetime' 'numpy.ndarray' and omits new types.

-            valid = ', '.join((
-                'bool',
-                'int',
-                'float',
-                'str',
-                'TimestampMicros',
-                'datetime.datetime'
-                'numpy.ndarray'))
+            valid = ', '.join((
+                'bool',
+                'int',
+                'float',
+                'str',
+                'TimestampMicros',
+                'TimestampNanos',
+                'datetime.datetime',
+                'numpy.ndarray',
+                'Decimal'))
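
The missing comma matters because Python joins adjacent string literals at compile time. A short standalone sketch of the effect, using a trimmed copy of the tuple (not the actual code in ingress.pyx):

```python
# Adjacent string literals are concatenated, so the missing comma
# silently merges two entries into one.
valid_buggy = ', '.join((
    'TimestampMicros',
    'datetime.datetime'
    'numpy.ndarray'))

# With the comma restored, each type name is a separate tuple element.
valid_fixed = ', '.join((
    'TimestampMicros',
    'datetime.datetime',
    'numpy.ndarray'))

print(valid_buggy)  # TimestampMicros, datetime.datetimenumpy.ndarray
print(valid_fixed)  # TimestampMicros, datetime.datetime, numpy.ndarray
```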

♻️ Duplicate comments (2)

test/system_test.py (1)
222-222: Fix timestamp type threshold: use < (9, 1, 0), not <=.

9.1.0 should use TIMESTAMP_NS; current check treats it as pre‑9.1.0.
-exp_ts_type = 'TIMESTAMP' if self.qdb_plain.version <= (9, 1, 0) else 'TIMESTAMP_NS'
+exp_ts_type = 'TIMESTAMP' if self.qdb_plain.version < (9, 1, 0) else 'TIMESTAMP_NS'
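
Python compares tuples element by element, so the boundary version itself is where the two operators differ. A small sketch, with `expected_ts_type` as a hypothetical helper standing in for the inline expression in the test:

```python
def expected_ts_type(version: tuple) -> str:
    """Pick the expected column type for a (major, minor, patch) version."""
    # Strict '<' keeps 9.1.0 itself on the TIMESTAMP_NS side.
    return 'TIMESTAMP' if version < (9, 1, 0) else 'TIMESTAMP_NS'


# The two operators only disagree at exactly 9.1.0.
print((9, 1, 0) <= (9, 1, 0))       # True
print((9, 1, 0) < (9, 1, 0))        # False
print(expected_ts_type((9, 0, 3)))  # TIMESTAMP
print(expected_ts_type((9, 1, 0)))  # TIMESTAMP_NS
```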
src/questdb/dataframe.pxi (1)
152-157: Remove unnecessary GIL requirement for Arrow decimal sources.

The Arrow decimal serializers run without the GIL except on error paths; marking them as GIL-required degrades performance for the whole DataFrame serialization.
-    col_source_decimal_pyobj =          701100
-    col_source_decimal32_arrow =        702100
-    col_source_decimal64_arrow =        703100
-    col_source_decimal128_arrow =       704100
-    col_source_decimal256_arrow =       705100
+    col_source_decimal_pyobj =          701100
+    col_source_decimal32_arrow =        702000
+    col_source_decimal64_arrow =        703000
+    col_source_decimal128_arrow =       704000
+    col_source_decimal256_arrow =       705000

🧹 Nitpick comments (1)

ci/run_tests_pipeline.yaml (1)

85-91: Temporary decimal-branch test block: ensure removal before merge.

The comment says “Remove before merging decimal support PR”. Please gate or remove this block prior to merge into main.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cad427c and 3a57ea4.

📒 Files selected for processing (5)

ci/run_tests_pipeline.yaml (4 hunks)
src/questdb/dataframe.pxi (17 hunks)
src/questdb/ingress.pyx (8 hunks)
src/questdb/mpdecimal_compat.pxd (1 hunks)
test/system_test.py (5 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-10-22T07:31:37.016Z

Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: src/questdb/ingress.pyx:192-195
Timestamp: 2025-10-22T07:31:37.016Z
Learning: In src/questdb/ingress.pyx, the IngressErrorCode.BadDataFrame enum member is Python-internal only and does not map to any C error code from the underlying library. It is defined as `<int>line_sender_error_invalid_decimal + 1` and should not be added to the c_err_code_to_py function, which only handles C-to-Python error code translation.

Applied to files:

src/questdb/ingress.pyx
src/questdb/dataframe.pxi
src/questdb/mpdecimal_compat.pxd

🧬 Code graph analysis (1)

test/system_test.py (1)

src/questdb/ingress.pyi (7)
Sender (821-1153)
row (202-217)
row (380-485)
row (1030-1049)
dataframe (219-230)
dataframe (487-769)
dataframe (1051-1099)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)

GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_arm64)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel start_linux_arm64_agent_aws)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_x86_64)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_i686)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel macos_x64)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_musllinux)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_pypy)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_manylinux_x86_64)
GitHub Check: questdb.py-questdb-client (Building and testing on windows-msvc-2019)
GitHub Check: questdb.py-questdb-client (Building and testing on mac)
GitHub Check: questdb.py-questdb-client (Building and testing on linux-qdb-master)
GitHub Check: questdb.py-questdb-client (Building and testing on linux-old-pandas)
GitHub Check: questdb.py-questdb-client (Building and testing on linux)
GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion2x)
GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion1x)

🔇 Additional comments (9)

test/system_test.py (4)

32-32: Version pin looks fine; remember to bump for release.

Update QUESTDB_VERSION to the first release with decimals before merging this PR.

36-36: Confirm decimal feature floor.

Ensure FIRST_DECIMAL_RELEASE = (9, 2, 0) matches the server release that first exposes DECIMAL. Adjust if the final release differs.

273-293: Decimal (python object) test looks good. Verify type name stability.

Expected column type DECIMAL(18,3) matches current behavior; double‑check it won’t change across target server versions.

294-327: Arrow decimal: expected scale vs dtype scale.

DataFrame uses Arrow dtype with scale 2, but asserts DECIMAL(18,3) and values with 3 fractional digits. Confirm server’s canonicalization to scale=3 is intended and stable; otherwise, consider making the expectation reflect Arrow scale.
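
To make the scale question concrete, here is what re-expressing a scale-2 value at scale 3 looks like with the standard library. It is a sketch only: `rescale` is a hypothetical helper, and it says nothing about what the server actually does with the column type.

```python
from decimal import Decimal


def rescale(value: Decimal, scale: int) -> Decimal:
    """Return `value` expressed with exactly `scale` fractional digits."""
    # Decimal(1).scaleb(-3) is Decimal('0.001'), the quantization target.
    return value.quantize(Decimal(1).scaleb(-scale))


src = Decimal('12.34')       # scale 2, unscaled integer 1234
dst = rescale(src, 3)        # scale 3, unscaled integer 12340
print(dst)                   # 12.340
print(dst == src)            # True: same numeric value, different scale
```

The numeric value is unchanged; only the stored scale and unscaled integer differ, which is why the assertion in the test needs to agree with whichever scale the table ends up with.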

ci/run_tests_pipeline.yaml (1)

82-83: Env wiring for master tests looks correct.

src/questdb/ingress.pyx (2)

145-147: BadDataFrame computed offset is intentional—OK.

Design keeps Python-only code after last C enum. No C mapping needed.

Based on learnings
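
The layout described here can be sketched in plain Python. The numeric values and the reduced member list are hypothetical stand-ins (the real codes come from the C library headers); only the "last C code plus one" pattern and the fact that the translation function skips the Python-only member are taken from the discussion.

```python
from enum import IntEnum


class CErrorCode(IntEnum):
    """Hypothetical stand-in for the C library's error enum."""
    COULD_NOT_RESOLVE_ADDR = 0
    INVALID_DECIMAL = 1  # assumed here to be the last C member


class IngressErrorCode(IntEnum):
    CouldNotResolveAddr = CErrorCode.COULD_NOT_RESOLVE_ADDR.value
    InvalidDecimal = CErrorCode.INVALID_DECIMAL.value
    # Python-only member: placed right after the last C code.
    BadDataFrame = CErrorCode.INVALID_DECIMAL.value + 1


def c_err_code_to_py(code: int) -> IngressErrorCode:
    """Translate a C error code; BadDataFrame is deliberately absent."""
    mapping = {
        CErrorCode.COULD_NOT_RESOLVE_ADDR: IngressErrorCode.CouldNotResolveAddr,
        CErrorCode.INVALID_DECIMAL: IngressErrorCode.InvalidDecimal,
    }
    try:
        return mapping[CErrorCode(code)]
    except ValueError:
        raise ValueError(f'unexpected C error code: {code}') from None
```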

830-834: Protocol v3: validation message and range check look good.

src/questdb/dataframe.pxi (2)

956-979: Decimal serialization paths look correct. Two sanity notes.

Arrow endianness handling and sign-propagation via two’s complement look fine.

Python Decimal path delegates to decimal_pyobj_to_binary and respects the 127‑byte mantissa cap; with the mpdecimal fix (see separate comment), this avoids false rejections.

If available, add a negative Arrow decimal case in tests to exercise sign handling for 32/64/128/256.

Also applies to: 2170-2202, 2217-2300
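
A pure-Python sketch of the conversions these notes refer to may help when writing the negative-value tests. It assumes Arrow stores decimals as little-endian two's-complement integers (per the Arrow columnar format) and, as a labeled assumption, that the output is a big-endian two's-complement mantissa plus a scale; the helper names are hypothetical and this is not the code in dataframe.pxi.

```python
from decimal import Decimal
from typing import Tuple


def decimal_to_unscaled(value: Decimal) -> Tuple[int, int]:
    """Split a finite Decimal into (unscaled integer, scale)."""
    sign, digits, exponent = value.as_tuple()
    if not isinstance(exponent, int):
        raise ValueError('NaN and Infinity have no unscaled form')
    unscaled = int(''.join(map(str, digits)))
    if sign:
        unscaled = -unscaled
    if exponent > 0:
        # Fold a positive exponent into the integer so scale stays >= 0.
        unscaled *= 10 ** exponent
        exponent = 0
    return unscaled, -exponent


def to_twos_complement_be(unscaled: int) -> bytes:
    """Encode as big-endian two's complement with room for the sign bit."""
    length = (unscaled.bit_length() + 8) // 8
    return unscaled.to_bytes(length, 'big', signed=True)


def arrow_decimal_to_int(raw: bytes) -> int:
    """Decode one Arrow decimal slot (little-endian, signed)."""
    return int.from_bytes(raw, 'little', signed=True)


unscaled, scale = decimal_to_unscaled(Decimal('-12.345'))
print(unscaled, scale)                        # -12345 3
print(to_twos_complement_be(unscaled).hex())  # cfc7
```

A negative case such as `-12345` is useful in tests because every high byte of the fixed-width Arrow slot must be `0xff`, which is exactly what a sign-propagation bug would get wrong.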

1301-1316: Bitmask helpers: correctness looks good.

Validity and bit extraction logic are correct and match Arrow layout.
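
For reference, the Arrow validity bitmap is least-significant-bit first within each byte. A minimal sketch of the lookup, with `is_valid` as a hypothetical name rather than the helper in dataframe.pxi:

```python
def is_valid(bitmap: bytes, index: int) -> bool:
    """Return True if row `index` is non-null in an Arrow validity bitmap."""
    # Byte index is index // 8; the bit within that byte is index % 8,
    # counted from the least significant bit.
    return bool((bitmap[index >> 3] >> (index & 7)) & 1)


bitmap = bytes([0b00000101, 0b00000001])
print([is_valid(bitmap, i) for i in range(3)])  # [True, False, True]
print(is_valid(bitmap, 8))                      # True
```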

ci/run_tests_pipeline.yaml

src/questdb/mpdecimal_compat.pxd

test/system_test.py

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a57ea4 and 6990a20.

📒 Files selected for processing (2)

src/questdb/mpdecimal_compat.pxd (1 hunks)
test/system_test.py (5 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-10-22T07:31:37.016Z

Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: src/questdb/ingress.pyx:192-195
Timestamp: 2025-10-22T07:31:37.016Z
Learning: In src/questdb/ingress.pyx, the IngressErrorCode.BadDataFrame enum member is Python-internal only and does not map to any C error code from the underlying library. It is defined as `<int>line_sender_error_invalid_decimal + 1` and should not be added to the c_err_code_to_py function, which only handles C-to-Python error code translation.

Applied to files:

src/questdb/mpdecimal_compat.pxd

🧬 Code graph analysis (1)

test/system_test.py (1)

src/questdb/ingress.pyi (7)
Sender (821-1153)
row (202-217)
row (380-485)
row (1030-1049)
dataframe (219-230)
dataframe (487-769)
dataframe (1051-1099)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)

GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_arm64)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel macos_x64)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_x86_64)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_i686)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_pypy)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_musllinux)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel start_linux_arm64_agent_aws)
GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_manylinux_x86_64)
GitHub Check: questdb.py-questdb-client (Building and testing on windows-msvc-2019)
GitHub Check: questdb.py-questdb-client (Building and testing on mac)
GitHub Check: questdb.py-questdb-client (Building and testing on linux-qdb-master)
GitHub Check: questdb.py-questdb-client (Building and testing on linux-old-pandas)
GitHub Check: questdb.py-questdb-client (Building and testing on linux)
GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion1x)
GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion2x)

test/system_test.py

feat: Add support for decimal.Decimal object

8a0e6df

RaphDal added DO NOT MERGE tandem labels Oct 13, 2025

RaphDal added 4 commits October 15, 2025 17:50

feat: add support to arrow's decimals

4826301

tests: add serialization tests for decimals

7aae3a8

feat: uses binary format for decimal object

856c809

fix: correct enum value for col_source_decimal_pyobj

5772e0e

RaphDal changed the title ~~feat: decimal support~~ feat: support decimal Oct 17, 2025

Merge branch 'main' into rd_decimal

1d4c69f

coderabbitai bot reviewed Oct 21, 2025

src/questdb/ingress.pyx (resolved)

src/questdb/ingress.pyx (resolved)

RaphDal added 2 commits November 7, 2025 09:07

fix: hold GIL when working with py_obj

b62520a

tests: fix test to work with new protocol version 3

5d7ee5e

coderabbitai bot reviewed Nov 7, 2025

src/questdb/dataframe.pxi (resolved)

RaphDal and others added 3 commits November 7, 2025 09:33

adding print to debug failing test

0d7017f

Remove GIL requirement for arrow decimals

48e9709

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

fix: update decimal source enum values for compatibility

5b4cbfa

coderabbitai bot reviewed Nov 7, 2025

RaphDal added 3 commits November 7, 2025 10:04

test: add debug print for bad protocol version handling

c2c19d7

refactor: remove debug print statements for protocol version handling

9b00b39

fix: restore conditional for expected timestamp type based on QuestDB version

cad427c

coderabbitai bot reviewed Nov 7, 2025

test/system_test.py (outdated, resolved)

RaphDal added 4 commits November 7, 2025 15:13

tests: add decimal support tests for pyarrow and decimal types

674e83c

feat: support decimal serialization directly from column

784d8b9

fix: adjust byte length calculation for unscaled object in decimal conversion

94dba61

fix: add decimal repo test in ci

3a57ea4

coderabbitai bot reviewed Nov 7, 2025

ci/run_tests_pipeline.yaml (resolved)

src/questdb/mpdecimal_compat.pxd (outdated, resolved)

test/system_test.py (resolved)

fix: correct byte length calculation for unscaled object in decimal conversion

00ff8a4

fix: update pyarrow import handling

6990a20

coderabbitai bot reviewed Nov 7, 2025

test/system_test.py (resolved)
