-
Couldn't load subscription status.
- Fork 6.6k
transient cf implementation #14052
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
transient cf implementation #14052
Conversation
…mn_family_test that currently fails since transient CFs are not dropped
…he version_edit so that transient cfs can be skipped during the manifest replay (no access to cf options yet), add unittests for version_edit_test and column_family_test
…e version_edit_handler to track transient cfs and other missing cfs separately - distinguish between CFs that are skipped due to user error vs transient, avoiding corruption false alarm
|
@virajthakur has imported this pull request. If you are a Meta employee, you can view this in D84785372. |
…b into viraj-transient-cf-v2
…oids hang when we attempt to drop CFs during open
|
@virajthakur is this a prototype or a working progress? If the latter, you may want to consider adding stress test support for feature that has non-trivial interaction with others like this one. cc @cbi42 |
| memtable_avg_op_scan_flush_trigger( | ||
| options.memtable_avg_op_scan_flush_trigger) { | ||
| options.memtable_avg_op_scan_flush_trigger), | ||
| is_transient(options.is_transient) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should only add it to ImmutableCFOptions since it's an immutable option.
| edit.SetComparatorName(cf_options.comparator->Name()); | ||
| edit.SetPersistUserDefinedTimestamps( | ||
| cf_options.persist_user_defined_timestamps); | ||
| edit.SetIsTransientColumnFamily(cf_options.is_transient); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a couple other places where this needs to be set. E.g.
Line 6872 in e32c14e
| cfd->ioptions().persist_user_defined_timestamps); |
| // Create ColumnFamilyOptions from Options | ||
| explicit ColumnFamilyOptions(const Options& options); | ||
|
|
||
| bool is_transient = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add comment to explain what the option is for? Maybe include that opening a DB will fail if user passes in a transient CF, and mention forward/backward compatibility.
|
|
||
| // Validate that no transient CFs are requested | ||
| for (const auto& cf : column_families) { | ||
| if (cf.options.is_transient) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the CF was created as transient, but the user now opens it as non transient (cf.options.is_transient is false)?
| moptions.memtable_op_scan_flush_trigger; | ||
| cf_opts->memtable_avg_op_scan_flush_trigger = | ||
| moptions.memtable_avg_op_scan_flush_trigger; | ||
| cf_opts->is_transient = moptions.is_transient; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The option should not be mutable.
| if (s.ok()) { | ||
| impl->mutex_.AssertHeld(); | ||
|
|
||
| for (auto cfd : *impl->versions_->GetColumnFamilySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the transient CF is in the column family set, since its creation is skipped here: https://github.com/facebook/rocksdb/pull/14052/files#diff-18d1efc35e2c8fdc81a58b43fb821c003eee1a9fe5f212e82df18e99e8357945R262-R276
| // Create ColumnFamilyOptions from Options | ||
| explicit ColumnFamilyOptions(const Options& options); | ||
|
|
||
| bool is_transient = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also write this option to the logs?
We've discussed about stress test coverage. Since this change would drop the CF upon re-open, which is not something stress test easily covers, we can add that support later. |
this PR adds a new option for "transient" column families, for which data/metadata should not be persisted in the db long term.
Expected behavior for transient cfs:
The client cannot reopen a db with the transient cfs included. They will silently be dropped after recovery completes. First, during manifest replay, we distinguish between regular cfs that are not included (error) and transient cfs that are expected to be excluded (no error). After recovery, all transient cfs are dropped.
GetLiveFilesStorageInfo and GetLiveFilesMetadata exclude transient cfs, so they will not be included in backup or checkpoints.
During WAL recovery, ignore_missing_column families is always set to true, so no special handling for transient cfs is required.