-
Notifications
You must be signed in to change notification settings - Fork 30
feat(oauth): Support refresh_token_error on the OAuthAuthenticatorModel level and … #829
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat(oauth): Support refresh_token_error on the OAuthAuthenticatorModel level and … #829
Conversation
…add sensible defaults
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@maxi297/support_config_error_on_oauth_authentication#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch maxi297/support_config_error_on_oauth_authenticationHelpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
📝 WalkthroughWalkthroughThe PR promotes refresh-token error configuration from nested Changes
Sequence Diagram(s)sequenceDiagram
participant Factory as ModelToComponentFactory
participant Model as OAuthAuthenticatorModel
participant RTU as RefreshTokenUpdaterModel
participant Auth as DeclarativeOauth2Authenticator
Factory->>Factory: _get_refresh_token_error_information(model)
alt RTU defines error config (deprecated)
Factory->>RTU: read refresh_token_error_status_codes/key/values
RTU-->>Factory: refresh-token error info (deprecated source)
else OAuthAuthenticator defines error config
Factory->>Model: read refresh_token_error_status_codes/key/values
Model-->>Factory: refresh-token error info (authenticator-level)
else Neither defined
Factory->>Factory: use defaults [400], "error", ["invalid_grant","invalid_permissions"]
end
Factory->>Auth: __init__(..., refresh_token_error_status_codes=..., refresh_token_error_key=..., refresh_token_error_values=...)
Auth-->>Factory: authenticator instance
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Would you like me to run a quick grep of occurrences for the old nested field names to ensure no references were missed, wdyt? Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/auth/oauth.py (1)
83-89: The comment on line 83 seems misleading, wdyt?The comment mentions "Convert lists to tuples for parent class compatibility", but no conversion is actually happening in this
__post_init__method. The values are passed directly to the parent class as-is.Looking at the factory code in
model_to_component_factory.py, the conversion from lists to tuples happens in the_get_refresh_token_error_informationhelper method before the dataclass is instantiated. Would it be clearer to either:
- Remove this comment entirely since it describes work done elsewhere, or
- Rephrase it to something like "Pass refresh token error parameters to parent class"?
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
2928-2930: Consider documenting the default values more prominently, wdyt?The hardcoded defaults
(400,), "error", ("invalid_grant", "invalid_permissions")represent a behavior change from the previous implementation. Previously, if no refresh_token_updater was defined, these values would be empty tuples/strings. Now, error checking will always happen with these sensible defaults.While this aligns with the PR objective of providing "sensible defaults so connectors do not need explicit configuration," it might be worth:
- Adding a comment explaining why these specific values were chosen as defaults
- Mentioning this behavior in the docstring of
create_oauth_authenticatoror in the PR descriptionThis would help future maintainers understand the rationale and make it easier to adjust if different defaults are needed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
airbyte_cdk/sources/declarative/auth/oauth.py(3 hunks)airbyte_cdk/sources/declarative/declarative_component_schema.yaml(3 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py(2 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py(5 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2024-12-11T16:34:46.319Z
Learnt from: pnilan
Repo: airbytehq/airbyte-python-cdk PR: 0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.
Applied to files:
airbyte_cdk/sources/declarative/models/declarative_component_schema.pyairbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
📚 Learning: 2024-11-18T23:40:06.391Z
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Applied to files:
airbyte_cdk/sources/declarative/models/declarative_component_schema.py
📚 Learning: 2025-01-14T00:20:32.310Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
Applied to files:
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
🧬 Code graph analysis (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
RefreshTokenUpdater(398-440)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: Check: source-shopify
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: source-intercom
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: SDM Docker Image Build
- GitHub Check: Manifest Server Docker Image Build
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.13, Ubuntu)
- GitHub Check: Pytest (All, Python 3.12, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (10)
airbyte_cdk/sources/declarative/auth/oauth.py (1)
49-51: LGTM! Well-structured field additions.The new refresh token error configuration fields are properly typed and documented. Using tuples for immutability is a good choice, and the empty defaults are sensible fallbacks.
Also applies to: 78-80
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)
21-21: LGTM! Necessary imports for the new functionality.The
TupleandRefreshTokenUpdaterModelimports are correctly placed and required for the new helper method.Also applies to: 404-406
2883-2906: Nice validation logic for mutual exclusivity!The check ensuring refresh_token_error information is defined in only one place (either on RefreshTokenUpdaterModel or OAuthAuthenticatorModel) is excellent defensive programming. This prevents configuration errors and makes the deprecation path clear.
2893-2901: Clarification question: Is the truthy check for empty lists intentional?The logic uses truthy checks to determine if error information is defined:
is_defined_on_refresh_token_updated = refresh_token_updater and ( refresh_token_updater.refresh_token_error_status_codes or refresh_token_updater.refresh_token_error_key or refresh_token_updater.refresh_token_error_values )This means that:
refresh_token_error_status_codes = [](empty list) is treated as "not defined"refresh_token_error_key = ""(empty string) is treated as "not defined"Is this the intended behavior? It seems reasonable (treating empty values as "not configured"), but it might be surprising to users who explicitly set empty lists/strings. If this is intentional, it might be worth adding a brief comment to clarify this behavior.
2796-2798: LGTM! Clean integration of the helper method.The tuple unpacking and passing of refresh_token_error parameters to both authenticator types is consistent and correct. The values are properly extracted once via the helper method and reused for both
DeclarativeSingleUseRefreshTokenOauth2AuthenticatorandDeclarativeOauth2Authenticator.Also applies to: 2849-2851, 2878-2880
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)
1430-1454: New OAuth refresh token error fields are well-defined with sensible defaults.The three new top-level fields follow the existing schema patterns and the defaults (
[400]for status codes,"error"for key,["invalid_grant", "invalid_permissions"]for values) align well with common OAuth error scenarios. This eliminates the need for explicit configuration in most connectors.
1495-1519: Clear deprecation strategy with empty defaults on nested fields.The nested fields are properly marked as deprecated with titles referencing the OAuthAuthenticator level. The empty defaults (
[],"") are appropriate for deprecated fields—they avoid conflicting with the top-level values.One question: Does the implementation logic ensure that when both nested and top-level fields are specified, the top-level takes precedence (or vice versa if that's the intended behavior)? This would be worth verifying in the implementation code to ensure smooth migration for users who might have configured these fields at the nested level. Wdyt?
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3)
1-2: Note: This is an auto-generated file.As noted in the learnings, this file is auto-generated from
declarative_component_schema.yaml. The changes here reflect modifications to the source YAML file. If this PR includes the YAML file, that would be the more appropriate place to review the schema design decisions. The generated Python code here looks consistent with typical Pydantic model patterns.
423-440: Backward compatibility is properly handled via explicit factory migration logic—no action needed.The concern is addressed. The
_get_refresh_token_error_information()method inmodel_to_component_factory.py(lines 2884-2929) explicitly handles the migration between deprecatedRefreshTokenUpdaterfields and newOAuthAuthenticatorfields.The strategy:
- If fields are defined on both, a
ValueErroris raised- The factory prioritizes values from
RefreshTokenUpdaterif set, otherwise fromOAuthAuthenticator, with fallback defaults of(400,), "error", ("invalid_grant", "invalid_permissions")- The intentional default discrepancy (
[]vs[400]) preserves old opt-in behavior while enabling sensible defaults for new adoptersThe title-based deprecation (without
deprecated=True) is paired with factory-level enforcement, which is sufficient for this codebase's migration pattern.
1901-1918: Defaults look good—these fields are intentionally shifting from opt-in to opt-out, and the values align with OAuth2 standards.The verification shows these are new fields on
OAuthAuthenticatorthat replace the deprecated equivalents onRefreshTokenUpdater(which had empty defaults[],"", etc.). The intentional behavioral shift from opt-in to opt-out makes sense here because:
The defaults are RFC 6749–compliant:
[400]is the standard HTTP status for OAuth errors,"error"is the standard response key per the spec, and["invalid_grant", "invalid_permissions"]are the most common token refresh failures.Backward compatibility is preserved: The model factory's
_get_refresh_token_error_information()method handles both old and new locations, prevents conflicts, and gracefully defaults to empty values when nothing is configured.The deprecation notices on
RefreshTokenUpdaterfields show this is a deliberate migration path, not an accidental breaking change.Existing connectors without explicit
refresh_token_updaterconfiguration will now get automatic error detection out-of-the-box, which should generally be beneficial. The implementation is sound—this aligns well with the PR's goal of sensible defaults.
| information is defined only once and return the right fields. | ||
| """ | ||
| refresh_token_updater = model.refresh_token_updater | ||
| is_defined_on_refresh_token_updated = refresh_token_updater and ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I struggle a bit with is_defined_on_refresh_token_updated and is_defined_on_oauth_authenticator because if someone wanted to disable the error handling to flag these as system_errors, they would have to set impossible values instead of defining them with none values. That being said, it seems like a weird case to handle and there is an escape path so I'm not too worried
| if is_defined_on_refresh_token_updated: | ||
| not_optional_refresh_token_updater: RefreshTokenUpdaterModel = refresh_token_updater # type: ignore # we know from the condition that this is not None | ||
| return ( | ||
| tuple(not_optional_refresh_token_updater.refresh_token_error_status_codes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this will have the same behavior as the previous code but is a bit different. The previous code would pass refresh_token_updater.refresh_token_error_status_codes directly which felt very dangerous because the typing was different (Tuple[int, ...] vs Optional[List[int]]). I fear that the previous code could lead to None exception here if refresh_token_error_status_codes was not defined. I don't have signals to believe this actually happen in prod though...
PyTest Results (Fast)1 212 tests - 2 605 1 200 ✅ - 2 605 5m 8s ⏱️ - 1m 35s For more details on these failures, see this check. Results for commit b553004. ± Comparison against base commit 6504148. This pull request removes 2605 tests. |
PyTest Results (Full)3 820 tests 3 807 ✅ 11m 18s ⏱️ For more details on these failures, see this check. Results for commit b553004. |
| or model.refresh_token_error_values | ||
| ) | ||
| if is_defined_on_refresh_token_updated and is_defined_on_oauth_authenticator: | ||
| raise ValueError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unit tests are failing because the default value are for OAuthAuthenticatorModel
…add sensible defaults
What
We've seen auth error being flagged system_error (example with source-google-analytics-data-api). It seems like we only support validating the oauth response if a refresh_token_updater is defined.
How
I've moved the
refresh_token_erroron the root of theOAuthAuthenticatorModel. I've also added sensible defaults so that we don't have to configure this for all the connectors.Summary by CodeRabbit
New Features
Deprecations