Got error when handling a large file #73

Open
tommy04062019 opened this issue Jun 28, 2024 · 5 comments

Comments

@tommy04062019

  • Log file size: 11 MB
  • When asking with a file, an error is raised:
BadRequestError: Error code: 400 - {'error': {'message': "Invalid 'messages[1].content': string too long. Expected a string with maximum length 1048576, but got a string with length 11282561 instead.",
'type': 'invalid_request_error', 'param': 'messages[1].content', 'code': 'string_above_max_length'}}

What is the maximum size allowed? How can I handle a large file?
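
For reference, a minimal sketch of a pre-flight check against the limit quoted in the error above. The value 1048576 comes from that error message and may differ per model or provider; the function name `fits_in_one_message` is hypothetical, and the sketch assumes the reported length is comparable to Python's `len()` of the string.

```python
# Limit copied from the error message above (assumption: it applies to
# the character length of a single message's content string).
MAX_CONTENT_CHARS = 1_048_576


def fits_in_one_message(text: str, limit: int = MAX_CONTENT_CHARS) -> bool:
    """Return True if `text` is short enough to send as one message."""
    return len(text) <= limit
```

With the numbers from this report, a string of length 11282561 fails the check, so the file would need to be reduced or split before sending.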

@tommy04062019
Author

[Screenshot 2024-06-28 at 11 55 30]

And this too.

@aantn
Contributor

aantn commented Jun 30, 2024

Thanks, we're looking into better ways to handle this (and to report the max sizes).

What are you trying to accomplish with the log analysis? Instead of downloading the file and then analyzing it, are you able to just run a holmes ask command without a file and let holmes itself choose which logs to look at? That tends to handle size limits much better.

@tommy04062019
Author

tommy04062019 commented Jun 30, 2024

> Thanks, we're looking into better ways to handle this (and to report the max sizes).
>
> What are you trying to accomplish with the log analysis? Instead of downloading the file and then analyzing it, are you able to just run a holmes ask command without a file and let holmes itself choose which logs to look at? That tends to handle size limits much better.

As you can see, in the case of nginx ingress, requests and errors can originate from any of the nginx ingress pods. If a deployment has 10 pods, each one has to be investigated individually, which wastes time. Our primary goal is to minimize the time spent on investigation, isn't it? Errors might also be located anywhere within the log. Even if you limit log reading to the last 10000 lines, the errors might only appear 10500 lines from the end, so the result returned would be something like: "AI: everything is ok".

@tommy04062019
Author

@aantn: I think we should implement two things:

  • First, enforce a size limit.
  • Second, for large files, read the file in chunks and use AI to dig into each chunk bit by bit. Then bring all those small findings together to get the big picture.
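
The chunk-then-combine idea above could be sketched roughly as follows. This is only an illustration, not how holmes works: `iter_chunks` and `analyze_in_chunks` are hypothetical names, and `analyze` and `summarize` are placeholder callables standing in for whatever LLM calls would be made per chunk and for the final summary. The default chunk size is an assumption, chosen to stay below the 1048576 limit from the error message and leave room for the prompt.

```python
from typing import Callable, Iterable, Iterator, List

# Assumed budget per chunk, kept below the 1048576 limit from the error.
DEFAULT_CHUNK_CHARS = 500_000


def iter_chunks(lines: Iterable[str], max_chars: int) -> Iterator[str]:
    """Yield chunks of consecutive lines, each at most `max_chars` long."""
    buf: List[str] = []
    size = 0
    for line in lines:
        # Hard-split any single line that is longer than the budget.
        pieces = [line[i:i + max_chars]
                  for i in range(0, len(line), max_chars)] or [line]
        for piece in pieces:
            if buf and size + len(piece) > max_chars:
                yield "".join(buf)
                buf, size = [], 0
            buf.append(piece)
            size += len(piece)
    if buf:
        yield "".join(buf)


def analyze_in_chunks(
    lines: Iterable[str],
    analyze: Callable[[str], str],
    summarize: Callable[[List[str]], str],
    max_chars: int = DEFAULT_CHUNK_CHARS,
) -> str:
    """Analyze each chunk separately, then combine the findings."""
    findings = [analyze(chunk) for chunk in iter_chunks(lines, max_chars)]
    return summarize(findings)
```

Because every chunk is analyzed, an error anywhere in the file reaches the final summary, at the cost of one `analyze` call per chunk. Passing an open file object as `lines` avoids loading the whole log into memory.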

@aantn
Contributor

aantn commented Jun 30, 2024

What is the trigger for the investigation? (Meaning what is the initial breadcrumb that you saw which caused you to start investigating at all?)

Your ideas are good. I would also be interested in exploring a third option: give the AI a tool to search the logs (e.g. grep).
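
To make the grep idea concrete, here is a rough sketch of what such a search tool might look like. It is an assumption for illustration only, not an existing holmes tool: the name `grep_lines` and its parameters are hypothetical. It returns matching lines with their line numbers and caps the number of matches so the result stays small enough to pass back to the model.

```python
import re
from typing import Iterable, List, Tuple


def grep_lines(
    lines: Iterable[str],
    pattern: str,
    max_matches: int = 100,
    ignore_case: bool = True,
) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for lines matching `pattern`."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    matches: List[Tuple[int, str]] = []
    for number, line in enumerate(lines, start=1):
        if regex.search(line):
            matches.append((number, line.rstrip("\n")))
            # Stop early so the output cannot exceed the size limit.
            if len(matches) >= max_matches:
                break
    return matches
```

The model would choose the pattern (for example `error| 5\d\d` for nginx logs) and could call the tool several times, so only the matching lines count against the message size limit instead of the whole file.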

We have an internal framework for benchmarking the results of different approaches. If you're interested in jumping on a short call, I'd love to explore this use case in a little more depth. That might help us find the best solution.
