Can it be used for non structured docs?

Greetings to the creators of PageIndex,

I recently discovered your library, and it seems to be an excellent solution.

From the description, it looks ideal for structured documents such as technical or financial reports. However, I would like to know if it can serve these use cases:

1. Can it fetch context relevant to a user query when the context includes previous chats with LLMs or uploaded documents? Does this work out of the box, or is manual preprocessing needed? These sources are mostly unstructured and may contain spelling and grammatical errors, especially in chat logs.

2. Does it rely on the model’s context limit? For example, if the model has a maximum input of 128k tokens but the context is over 2 million tokens, does the library handle this automatically, or would manual chunking and splitting of the context be required?

3. Is it limited to processing PDF or text-based content, or can it also handle images and screenshots, whether they are provided as separate files or embedded within PDFs?

4. Is it possible to save the tree structure for future use, to enable faster processing and retrieval?
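
To make questions 2 and 4 concrete, here is a rough sketch of the manual work I would like to avoid. None of this is the library's API: `chunk_by_budget`, `save_tree`, and `load_tree` are hypothetical helpers, whitespace-separated words stand in for real tokens, and the `title`/`nodes` tree layout is only my assumption about what such a tree might look like.

```python
import json
from pathlib import Path


def chunk_by_budget(text: str, max_tokens: int) -> list[str]:
    """Split text into pieces of at most max_tokens words.

    Assumption: words approximate tokens. A real tokenizer would
    produce different counts, so treat the budget as a rough bound.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


def save_tree(tree: dict, path: Path) -> None:
    """Persist a tree (assumed to be a JSON-serializable nested dict)."""
    path.write_text(
        json.dumps(tree, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )


def load_tree(path: Path) -> dict:
    """Reload a previously saved tree so it does not have to be rebuilt."""
    return json.loads(path.read_text(encoding="utf-8"))
```

If the library already does the equivalent of `chunk_by_budget` internally and exposes something like `save_tree`/`load_tree`, then I would not need any of this glue code.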

I am considering using this library in my project because it appears to address context relevance challenges in long chats with multiple documents.

I look forward to your response!

Thanks and regards,
Vineet Mangal

Can it be used for non structured docs? #34

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Can it be used for non structured docs? #34

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions