Collaborator

@dyl10s dyl10s commented Jul 23, 2025

Description

Refactored the Mongodb receiver to use LoggingReceiverMacro.

A few issues came up with this refactor that I will need some guidance on:

  • I needed a few more operators (lift/rename), so I implemented them to match the existing functionality. Let me know if there is a better direction I can take here.
  • There are multiple Lua filter components here. I am not sure of their exact functionality or how they need to be refactored.
  • There is a lot of logic around parsing the timestamp. It currently does not seem to parse correctly; I just tried to match the existing logic.
  • As with the ElasticsearchJson receiver, we just register the processor in the transformation_test.go file.

Related issue

How has this been tested?

Validated that there were no changes to the generated fluentbit golden file before and after the refactor.

Checklist:

  • Unit tests
    • Unit tests do not apply.
    • Unit tests have been added/modified and passed for this PR.
  • Integration tests
    • Integration tests do not apply.
    • Integration tests have been added/modified and passed for this PR.
  • Documentation
    • This PR introduces no user visible changes.
    • This PR introduces user visible changes and the corresponding documentation change has been made.
  • Minor version bump
    • This PR introduces no new features.
    • This PR introduces new features, and there is a separate PR to bump the minor version since the last release already.
    • This PR bumps the version.

logging.googleapis.com/instrumentation_source: agent.googleapis.com/mongodb
logName: projects/my-project/logs/transformation_test
severity: 200.0
timestamp: now
Contributor

@franciscovalentecastro franciscovalentecastro Aug 20, 2025

Since all the log entries here seem to have timestamp: now (which happens when no specific timestamp was parsed from the log), there may be a bug in the timestamp parsing.

It may predate this PR, or it could be a mismatched format. Please address this missing timestamp.

Collaborator Author

In the original implementation there was some special logic to grab a specific fluentbit component from the LoggingProcessorParseRegex at a specific index, which was just the time parsing section. Because this is now becoming more generic, we cannot just grab a given index.

I created a new parser dedicated to parsing the timestamp.

This all stems from the following comment, where the nested timestamp has to be brought to the top level of the JSON structure.

	// have to bring $date to top level in order for it to be parsed as timeKey
	// see https://github.com/fluent/fluent-bit/issues/1013

I am open to other solutions here, but this seems reasonable and works similarly to the old implementation.
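To make the constraint concrete, here is a minimal Go sketch of the lift-then-parse idea, independent of fluent-bit. It assumes a MongoDB JSON log line whose timestamp is nested under t.$date with an RFC 3339 style value; the liftDate and parseTimestamp helpers and the sample record are hypothetical and are not taken from the agent's code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// liftDate copies the nested t.$date value to a top-level "time" key.
// Hypothetical helper: it only illustrates why the value has to sit at the
// top level before a single time key can be applied to it.
func liftDate(record map[string]any) bool {
	t, ok := record["t"].(map[string]any)
	if !ok {
		return false
	}
	date, ok := t["$date"].(string)
	if !ok {
		return false
	}
	record["time"] = date
	delete(record, "t")
	return true
}

// parseTimestamp lifts the nested date and parses it. The boolean reports
// whether a timestamp was found and parsed.
func parseTimestamp(line string) (time.Time, bool) {
	var record map[string]any
	if err := json.Unmarshal([]byte(line), &record); err != nil {
		return time.Time{}, false
	}
	if !liftDate(record) {
		return time.Time{}, false
	}
	ts, err := time.Parse(time.RFC3339, record["time"].(string))
	if err != nil {
		return time.Time{}, false
	}
	return ts, true
}

func main() {
	// Assumed sample line; real MongoDB log lines carry more fields.
	line := `{"t":{"$date":"2020-05-01T15:16:17.180+00:00"},"s":"I","msg":"example"}`
	ts, ok := parseTimestamp(line)
	fmt.Println(ts.UTC(), ok)
}
```

When the lift or the parse fails, the sketch reports false, which corresponds to the case where the entry would fall back to timestamp: now.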

apps/mongodb.go Outdated

func init() {
confgenerator.LoggingReceiverTypes.RegisterType(func() confgenerator.LoggingReceiver { return &LoggingReceiverMongodb{} })
confgenerator.RegisterLoggingReceiverMacro(func() LoggingReceiverMacroMongodb {
Contributor

Please register using RegisterLoggingFilesProcessorMacro instead. There is no longer any need to register only the receiver.

Field string
}

func (p LoggingProcessorRemoveField) Components(ctx context.Context, tag, uid string) []fluentbit.Component {
Contributor

I've thought about this a bit and realized that the operations done by LoggingProcessorRemoveField, LoggingProcessorHardRename and LoggingProcessorNestLift can be replicated with LoggingProcessorModifyFields. Please remove these processors and use modify fields instead.

Here is a list of these operations:

  1. LoggingProcessorRemoveField can be replicated with:
confgenerator.LoggingProcessorModifyFields{
	Fields: map[string]*confgenerator.ModifyField{
		"jsonPayload.Field": {
			MoveFrom: "jsonPayload.Field",
			OmitIf:   `jsonPayload.Field =~ ".*"`,
		},
	},
}
  2. LoggingProcessorHardRename can be replicated with:
confgenerator.LoggingProcessorModifyFields{
	Fields: map[string]*confgenerator.ModifyField{
		"jsonPayload.dest": {
			MoveFrom: "jsonPayload.src",
		},
	},
}
  3. LoggingProcessorNestLift can be replicated with:
confgenerator.LoggingProcessorModifyFields{
	Fields: map[string]*confgenerator.ModifyField{
		"jsonPayload.prefix_Field": {
			MoveFrom: "jsonPayload.nested.Field",
		},
	},
}

Using LoggingProcessorModifyFields instead of creating new processors improves the Mongodb processor in the following ways:

  • Removes the tech debt of having to maintain custom fluent-bit configs.
  • There is no need to implement each of these processors with Otel (modify fields already has both implementations).
  • Improves readability of the Mongodb processor.

Collaborator Author

Most of these suggestions work, but there is one case where LoggingProcessorHardRename is still needed because of a side effect of MoveFrom.

When dealing with a log from wiredTiger, we want to overwrite the top-level msg field with the nested attr.message field, but only if attr.message is present. MoveFrom always overwrites the field, so msg becomes an empty object when attr.message is not present.

As far as I can tell, the current functionality offers no way to preserve the old msg when attr.message is not present.
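For reference, the behavior wanted here can be sketched in plain Go over a map. This is only an illustration of the intended semantics, not the agent's implementation; the moveIfPresent helper and the sample records are hypothetical.

```go
package main

import "fmt"

// moveIfPresent overwrites record["msg"] with record["attr"]["message"] only
// when the nested field exists. If it is absent, msg is left untouched.
func moveIfPresent(record map[string]any) {
	attr, ok := record["attr"].(map[string]any)
	if !ok {
		return
	}
	message, ok := attr["message"]
	if !ok {
		return
	}
	record["msg"] = message
	delete(attr, "message")
}

func main() {
	// Assumed shapes for a wiredTiger entry and a regular entry.
	withNested := map[string]any{
		"msg":  "WiredTiger message",
		"attr": map[string]any{"message": "nested text"},
	}
	withoutNested := map[string]any{
		"msg":  "Connection accepted",
		"attr": map[string]any{"remote": "127.0.0.1"},
	}
	moveIfPresent(withNested)
	moveIfPresent(withoutNested)
	fmt.Println(withNested["msg"], "|", withoutNested["msg"])
}
```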

Contributor

@franciscovalentecastro franciscovalentecastro Aug 25, 2025

I think we can use the OmitIf feature for this situation. Here the condition jsonPayload.attr.message :* checks if the field attr.message is present in the log. AFAIU, using "OmitIf: NOT jsonPayload.attr.message :*" will skip the MoveFrom if the field is not present.

confgenerator.LoggingProcessorModifyFields{
	Fields: map[string]*confgenerator.ModifyField{
		"jsonPayload.message": {
			MoveFrom: "jsonPayload.attr.message",
			OmitIf:   `NOT jsonPayload.attr.message :*`,
		},
	},
}

Note: I'm partially guiding myself with the modify_fields public documentation [1], so this may not be fully correct, but I think that if this can be done with ModifyFields, it would use OmitIf in some capacity. We can also explore the ModifyFields implementation to get more clarity.

Footnotes

  1. https://cloud.google.com/stackdriver/docs/solutions/agents/ops-agent/configuration#logging-processor-modify-fields

Collaborator Author

I tried that, and OmitIf actually omits the destination field when the filter matches.

From the docs you linked:

  "If the filter matches the input log record, the output field will be unset."

So we end up deleting the jsonPayload.message field entirely if we don't have the nested message.
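The quoted rule can be modeled in a few lines of Go to show why the original message disappears. This is a toy model of the documented sentence only, not the ModifyFields implementation; applyOmitIf and attrMessageMissing are hypothetical names.

```go
package main

import "fmt"

// applyOmitIf models the documented rule: when the filter matches the input
// record, the output field is unset.
func applyOmitIf(record map[string]any, output string, filter func(map[string]any) bool) {
	if filter(record) {
		delete(record, output)
	}
}

// attrMessageMissing stands in for the filter `NOT jsonPayload.attr.message :*`.
func attrMessageMissing(record map[string]any) bool {
	attr, ok := record["attr"].(map[string]any)
	if !ok {
		return true
	}
	_, present := attr["message"]
	return !present
}

func main() {
	// Assumed record without the nested attr.message field.
	record := map[string]any{
		"msg":  "Connection accepted",
		"attr": map[string]any{"remote": "127.0.0.1"},
	}
	applyOmitIf(record, "msg", attrMessageMissing)
	_, stillThere := record["msg"]
	fmt.Println("msg still present:", stillThere)
}
```

Under this reading, the filter does not skip the operation; it removes the output field, so the pre-existing value cannot survive.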

Contributor

@franciscovalentecastro franciscovalentecastro Aug 25, 2025

Ohhh! Ok! 🤔 Please try other variations of the OmitIf to see if there is a way to make it work.

Alternative

If this doesn't work, we can try a solution that uses a custom Lua function (or keep using HardRename) for fluent-bit and a ModifyFields + CustomConverFunc + OTTL solution for Otel.

I outlined a similar idea here: #2014 (comment). In summary, you would create a struct ConditionalMoveFromProcessor which implements both InternalLoggingProcessor (implements Components() for fluent-bit) and InternalOtelProcessor (implements Processors() for otel) and returns the custom logic for each subagent.

You can probably also reuse LoggingProcessorHardRename (though I would rather have a more specific name) and implement a Processors() method that uses ModifyFields + CustomConverFunc + OTTL.

Collaborator Author

I was not able to come up with anything that worked with OmitIf. I renamed HardRename to something clearer and added the otel Processors() logic.

I will wait for @quentinmit to see if he has a better solution.

apps/mongodb.go Outdated
type LoggingReceiverMongodb struct {
LoggingProcessorMongodb `yaml:",inline"`
ReceiverMixin confgenerator.LoggingReceiverFilesMixin `yaml:",inline" validate:"structonly"`
type LoggingReceiverMacroMongodb struct {
Contributor

After registering with RegisterLoggingFilesProcessorMacro there is no need to have a LoggingReceiverMacroMongodb.

@github-actions

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Oct 26, 2025
2 participants