WIP: Local Upload Documents as Input for Models instead of using RAG incas… #4503

Open
wants to merge 1 commit into
base: main

schnaker85 commented Oct 22, 2024

Summary

Related Issue: This is the PR for #4502. I marked it WIP because the discussion on the feature request might lead to changes.

Changes in the code:

  • processFileUpload uses the local upload when RAG_API_URL is not defined and the upload is not an assistant upload
  • enhanced the strategies to support non-vector upload for generic files
  • refactored the (image) encode method so it can be used for all kinds of files
  • enhanced uploadLocalFile in the local strategy so a target directory can be specified
  • moved all relevant file operations from the local strategy to file.js
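
The fallback described in the first bullet can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function `chooseUploadStrategy`, its parameter names, and the returned labels are hypothetical. Only `RAG_API_URL` comes from the description above.

```javascript
// Hypothetical sketch: decide which upload path a file should take.
// Only RAG_API_URL is taken from the PR description; all other names are illustrative.
function chooseUploadStrategy({ ragApiUrl, isAssistantUpload }) {
  // Assistant uploads keep their own path regardless of RAG configuration.
  if (isAssistantUpload) {
    return 'assistant';
  }
  // A missing or empty URL is treated as "RAG_API_URL is not defined".
  if (!ragApiUrl) {
    return 'local';
  }
  // Otherwise the file goes through the vector (RAG) upload.
  return 'vector';
}
```

A caller would pass `process.env.RAG_API_URL` as `ragApiUrl`, so leaving the variable unset selects the local path without any other configuration.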

Change Type

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Checklist

  • My code adheres to this project's style guidelines
  • I have performed a self-review of my own code
  • I have commented in any complex areas of my code
  • My changes do not introduce new warnings
  • Local unit tests pass with my changes

… in case RAG_API_URL is not set

- introduced a new client public path for non-image files

kukjun commented Nov 14, 2024

This seems like a good approach. However, when I actually ran it, the token count became NaN once a file was encoded.

There is logic that counts tokens for images, but none for other files, so I suggest adding handling for that case.

// OpenAIClient.js
async buildMessage() {
  ...
  // after line 530
  // Token counting exists for images but not for other files, so skip non-image files here.
  if (!file.type.startsWith('image')) {
    continue;
  }
  ...
}
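
To illustrate why the guard helps, here is a minimal standalone sketch. It is not the actual OpenAIClient.js code: the function `sumImageTokens` and the assumed file shape `{ type, tokens }` are hypothetical. The point is that adding a missing (`undefined`) token count to a number yields NaN in JavaScript, so files without a count have to be skipped.

```javascript
// Hypothetical sketch of the suggested guard; not the actual OpenAIClient.js code.
// Assumed file shape: { type: string, tokens?: number }.
function sumImageTokens(files) {
  let total = 0;
  for (const file of files) {
    // Non-image files carry no token count here; adding undefined would make total NaN.
    if (!file.type.startsWith('image')) {
      continue;
    }
    total += file.tokens;
  }
  return total;
}
```

Skipping non-image files keeps the total numeric, but it also means their content is not counted at all, so a separate estimate for those files may still be needed.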

Thanks :)
