@alkampfergit (Contributor) commented Jul 25, 2024

Motivation and Context (Why the change? What's the scenario?)

I'd like to address discussion #669: the ability to include the page number in a Memory Record, via the Section property.

High level description (Approach, Design)

Introduced a new TextChunker2 that accepts a series of lines, each carrying a tag. It splits the text based on token count, but instead of emitting plain lines in the output it preserves the tag of the original line. I've chosen a tag object so that we can add more information later if needed; in this PR the tag carries only the page number.
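To make the idea concrete, here is a minimal Python sketch of tag-preserving chunking. It is not the code in this PR: the names `ChunkTag`, `TaggedLine`, `TaggedChunk`, and `chunk_tagged_lines` are hypothetical, the whitespace token counter stands in for a real tokenizer, and letting a chunk that spans several pages keep every distinct tag it touched is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ChunkTag:
    """Hypothetical tag object; only the page number is tracked here."""
    page_number: int


@dataclass
class TaggedLine:
    text: str
    tag: ChunkTag


@dataclass
class TaggedChunk:
    text: str
    tags: List[ChunkTag]  # distinct tags of the lines merged into this chunk


def count_tokens(text: str) -> int:
    """Stand-in token counter (whitespace split); a real tokenizer would differ."""
    return len(text.split())


def chunk_tagged_lines(
    lines: List[TaggedLine],
    max_tokens: int,
    token_counter: Callable[[str], int] = count_tokens,
) -> List[TaggedChunk]:
    """Greedily merge lines into chunks of at most max_tokens tokens,
    keeping the tags of the source lines. A single line longer than
    max_tokens is emitted as its own chunk and is not split further."""
    chunks: List[TaggedChunk] = []
    texts: List[str] = []
    tags: List[ChunkTag] = []
    tokens = 0

    for line in lines:
        n = token_counter(line.text)
        # Close the current chunk if adding this line would exceed the budget.
        if texts and tokens + n > max_tokens:
            chunks.append(TaggedChunk("\n".join(texts), tags))
            texts, tags, tokens = [], [], 0
        texts.append(line.text)
        if line.tag not in tags:
            tags.append(line.tag)
        tokens += n

    # Flush whatever remains after the last line.
    if texts:
        chunks.append(TaggedChunk("\n".join(texts), tags))
    return chunks


if __name__ == "__main__":
    sample = [
        TaggedLine("alpha beta", ChunkTag(1)),
        TaggedLine("gamma delta", ChunkTag(1)),
        TaggedLine("epsilon zeta", ChunkTag(2)),
    ]
    for chunk in chunk_tagged_lines(sample, max_tokens=4):
        print([t.page_number for t in chunk.tags], repr(chunk.text))
```

Because the tag is an object rather than a bare integer, further fields could be added later without changing the chunking loop.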

I need to check whether the overall design of the PR is OK; then I'll finish the implementation and testing of TextChunk2. In fact, we could even consider making TextChunk2 the only chunker, since it works exactly like TextChunk, with the sole addition of keeping track of a tag associated with each piece of text.

@alkampfergit alkampfergit force-pushed the feature/pages-in-memoryrecords branch 2 times, most recently from 47ee726 to 045cc1b Compare July 26, 2024 10:05
@alkampfergit alkampfergit marked this pull request as ready for review July 26, 2024 10:17
@alkampfergit alkampfergit requested a review from dluc as a code owner July 26, 2024 10:17
@alkampfergit alkampfergit changed the title Spike - add page number. Ask for OK to proceed. Add page number to Memory Records. Jul 26, 2024
@alkampfergit alkampfergit force-pushed the feature/pages-in-memoryrecords branch 3 times, most recently from 2e331c1 to ec6cfdd Compare July 31, 2024 07:42
@alkampfergit alkampfergit force-pushed the feature/pages-in-memoryrecords branch from e20d3a2 to c220010 Compare January 13, 2025 10:09
This branch introduces a new TextPartitioning Handler
that supports page numbering in PDFs.
@alkampfergit alkampfergit force-pushed the feature/pages-in-memoryrecords branch from 37bc0da to 48e4aca Compare January 17, 2025 09:53
@dluc (Collaborator) commented Nov 3, 2025

Closing as part of repository maintenance - no further action planned on this issue.

@dluc dluc closed this Nov 3, 2025
@microsoft microsoft locked and limited conversation to collaborators Nov 3, 2025