feat(weave): On Demand Evals #3877

tssweeney · 2025-03-15T02:33:15Z

Pairs with https://github.com/wandb/core/pull/28049

Summary by CodeRabbit

New Features
- Introduced a dedicated EvalStudio page accessible from the sidebar, allowing users to easily manage and view evaluation results.
- Added new components for creating datasets, scorers, and evaluations within the EvalStudio interface.
- Implemented asynchronous API functions for fetching evaluations, datasets, scorers, and models.
- Added new methods for handling evaluation lifecycle and callbacks during evaluations.
- Enhanced the useProjectSidebar function with a new button for accessing the EvalStudio.
- Introduced a new context provider for managing evaluation studio state and interactions.
- Added a new test suite for validating the complete lifecycle of on-demand evaluations through an API.
- Added a new test file to validate the publishing of evaluation objects using the Weave framework.
Improvements
- Enhanced backend operations for evaluation, scoring, and model predictions to ensure more consistent and reliable performance.
- Upgraded API endpoints to streamline method invocation and feedback integration, creating a smoother, more robust user experience.
- Improved the handling of model input arguments for streamlined processing.
- Updated the default database server for tests to improve test environment consistency.

circle-job-mirror · 2025-03-15T02:40:17Z

Preview this PR with FeatureBee: https://beta.wandb.ai/?betaVersion=466ea4813193671ea5c55a0364f7e7b37483c894

coderabbitai · 2025-03-15T02:50:11Z

Walkthrough

This update introduces a new testing dependency and several new test suites covering local and remote interactions for completion models and scorers. New built-in classes and registration mechanisms for a lightweight LLM completion model and an LLM judge scorer are added. The patch also updates argument handling for scorers, enhances serialization behavior, modifies the client’s scorer application method to return a new result structure, and adds several new methods in various trace servers to support calling methods and scoring calls.

Changes

Files	Change Summary
`pyproject.toml`	Added new dependency `"litellm>=1.49.1"` under the test section with a `# Bultins` comment.
`tests/trace/builtin_objects/test_builtin_model.py` `tests/trace/builtin_objects/test_builtin_scorer.py` `tests/trace/test_on_demand_evals.py`	Introduced new test suites and functions for verifying the functionality of `LiteLLMCompletionModel` and `LLMJudgeScorer`, along with end-to-end evaluations via the API.
`weave/builtin_objects/builtin_registry.py` `weave/builtin_objects/models/CompletionModel.py` `weave/builtin_objects/scorers/LLMJudgeScorer.py`	Added new built-in classes for the lightweight LLM completion model and judge scorer along with a registry mechanism to register and retrieve these built-in objects.
`weave/flow/scorer.py`	Modified the `apply_scorer_async` function to check for "inputs" in scorer argument names and map arguments accordingly.
`weave/trace/serialization/serialize.py`	Enhanced the `from_json` function to validate objects using a built-in registry when the object’s class name matches the expected type.
`weave/trace/weave_client.py`	Added a new data class `ClientApplyScorerResult` and updated the `apply_scorer` method to return this new result type with extra feedback information.
`weave/trace_server/...` (includes: `clickhouse_trace_server_batched.py`, `external_to_internal_trace_server_adapter.py`, `server_side_object_saver.py`, `sqlite_trace_server.py`, `trace_server_interface.py`, `trace_server_bindings/remote_http_trace_server.py`)	Expanded trace server functionalities by adding new methods `call_method` and `score_call`, implementing request/response classes for trace operations, and incorporating built-in object support. Additionally, the server-side object saver now supports isolated multiprocessing with error handling via custom exceptions and ID conversions.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant WeaveClient
    participant TraceServer
    participant RunAsUser

    Client->>WeaveClient: apply_scorer(scorer, kwargs)
    WeaveClient->>TraceServer: score_call(ScoreCallReq)
    TraceServer->>RunAsUser: run_score_call(req)
    RunAsUser-->>TraceServer: ScoreCallRes (feedback_id, result)
    TraceServer-->>WeaveClient: ScoreCallRes
    WeaveClient-->>Client: ClientApplyScorerResult (result, score_call, feedback_id)

sequenceDiagram
    participant Client
    participant ExternalAdapter
    participant InternalTraceServer
    participant RunAsUser

    Client->>ExternalAdapter: call_method(CallMethodReq)
    ExternalAdapter->>InternalTraceServer: call_method(CallMethodReq)
    InternalTraceServer->>RunAsUser: run_call_method(req)
    RunAsUser-->>InternalTraceServer: CallMethodRes
    InternalTraceServer-->>ExternalAdapter: CallMethodRes
    ExternalAdapter-->>Client: CallMethodRes

Suggested reviewers

andrewtruong

coderabbitai

Actionable comments posted: 9

🔭 Outside diff range comments (1)

weave/builtin_objects/models/CompletionModel.py (1)

14-28: ⚠️ Potential issue

Add error handling for JSON parsing and method documentation

The predict method assumes the response will be valid JSON and has no error handling. Additionally, it lacks a docstring explaining parameters and return value.

Consider adding error handling and documentation:

@weave.op()
def predict(self, **kwargs: Any) -> str:
+    """Generate a completion based on the template messages.
+    
+    Args:
+        **kwargs: Values to format into the message templates
+        
+    Returns:
+        Parsed JSON response from the model
+        
+    Raises:
+        json.JSONDecodeError: If the response cannot be parsed as JSON
+    """
    messages: list[dict] = [
        {**m, "content": m["content"].format(**kwargs)}
        for m in self.messages_template
    ]

    res = litellm.completion(
        model=self.model,
        messages=messages,
        response_format=self.response_format,
    )

-    return json.loads(res.choices[0].message.content)
+    try:
+        return json.loads(res.choices[0].message.content)
+    except json.JSONDecodeError as e:
+        raise json.JSONDecodeError(
+            f"Failed to parse response as JSON: {res.choices[0].message.content}",
+            e.doc, e.pos
+        ) from e

🧹 Nitpick comments (22)

weave/builtin_objects/scorers/LLMJudgeScorer.py (2)
12-16: Class definition looks good, but could benefit from documentation.

The LLMJudgeScorer class properly inherits from weave.Scorer and declares appropriate attributes. Consider adding docstrings to describe the purpose of the class and the expected values for each attribute.
 class LLMJudgeScorer(weave.Scorer):
+    """A scorer that uses an LLM to judge the quality of model outputs.
+    
+    This scorer sends the inputs and outputs to an LLM model with a system prompt
+    and returns the LLM's judgment as a structured response.
+    """
     model: str
+    """The LLM model identifier to use for scoring."""
     system_prompt: str
+    """System prompt that instructs the LLM how to evaluate outputs."""
     response_format: dict
+    """Structure defining the expected format of the LLM response."""
31-36: Consider adding timeout and retry logic for API calls.

API calls to external LLM services can sometimes fail due to timeouts or rate limiting. Adding timeout parameters and retry logic would make the code more robust in production environments.
         res = litellm.completion(
             model=self.model,
             messages=messages,
             response_format=self.response_format,
+            timeout=30,  # Set a reasonable timeout
         )
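The suggestion above adds only the timeout. The retry half of the comment could be sketched as follows. This is a minimal sketch, not the PR's implementation: `call_with_retries` and its parameters are hypothetical names, and the wrapped callable stands in for the `litellm.completion` call.

```python
import time
from typing import Any, Callable


def call_with_retries(
    fn: Callable[[], Any],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: tuple[type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fn(), retrying with exponential backoff on the given exceptions.

    Hypothetical helper: none of these names come from weave or litellm.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            # Re-raise once the attempt budget is exhausted.
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

A call site might wrap the completion in a lambda, for example `call_with_retries(lambda: litellm.completion(model=self.model, messages=messages, timeout=30))`, ideally narrowing `retry_on` to timeout and rate-limit errors rather than every `Exception`.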
weave/builtin_objects/models/CompletionModel.py (1)
9-12: Add docstring to clarify class purpose and attributes

The class lacks a docstring explaining its purpose and the meaning of its attributes. Adding documentation would help users understand how to use this model correctly.

Consider adding a class docstring like:
class LiteLLMCompletionModel(weave.Model):
+    """A lightweight completion model wrapper for litellm.
+    
+    This model formats messages according to a template and sends them to the
+    specified LLM via litellm, returning the parsed JSON response.
+    
+    Attributes:
+        model: The model identifier to use with litellm
+        messages_template: List of message objects with content templates
+        response_format: Optional response format specification
+    """
    model: str
    messages_template: list[dict[str, str]]
    response_format: Optional[dict] = None
🧰 Tools

🪛 GitHub Actions: test

[error] 9-9: AttributeError: partially initialized module 'weave' has no attribute 'Model' (most likely due to a circular import)
weave/trace/weave_client.py (1)
658-676: Update docstring to reflect new return type

The docstring for apply_scorer still mentions the old return type and structure, but the implementation now returns a ClientApplyScorerResult.

Update the docstring to reflect the new return type:
Returns:
-    The result of the scorer application in the form of an `ClientApplyScorerResult`.
+    The result of the scorer application in the form of a `ClientApplyScorerResult`.

-```python
-class ApplyScorerSuccess:
-    result: Any
-    score_call: Call
-```
+```python
+@dataclasses.dataclass
+class ClientApplyScorerResult:
+    result: Any
+    score_call: Call
+    feedback_id: Future[str] | None = None
+```
weave/trace_server/trace_server_interface.py (2)
903-914: LGTM: Well-structured request/response classes for method calls

The CallMethodReq and CallMethodRes classes follow the established patterns in the codebase. Consider adding docstrings for better documentation.

Add docstrings to explain the purpose of these classes:
class CallMethodReq(BaseModel):
+    """Request to call a method on an object via the trace server.
+    
+    This enables remotely executing methods on objects stored in the Weave system.
+    """
    project_id: str
    object_ref: str
    method_name: str
    args: dict[str, Any]
    wb_user_id: Optional[str] = Field(None, description=WB_USER_ID_DESCRIPTION)


class CallMethodRes(BaseModel):
+    """Response from a method call containing the execution result and call ID."""
    call_id: str
    output: Any
916-926: LGTM: Well-structured request/response classes for score calls

The ScoreCallReq and ScoreCallRes classes follow the established patterns in the codebase. Consider adding docstrings for better documentation.

Add docstrings to explain the purpose of these classes:
class ScoreCallReq(BaseModel):
+    """Request to score a call using a specified scorer.
+    
+    This enables on-demand evaluation of previous calls using scoring models.
+    """
    project_id: str
    call_ref: str
    scorer_ref: str
    wb_user_id: Optional[str] = Field(None, description=WB_USER_ID_DESCRIPTION)


class ScoreCallRes(BaseModel):
+    """Response from a score call containing the feedback ID and the score call details."""
    feedback_id: str
    score_call: CallSchema
tests/trace/test_on_demand_evals.py (9)
1-4: Correct minor spelling issue in docstring.

"inteded" should be spelled "intended" to maintain professionalism and clarity in documentation.
-This test suite is inteded to test the end...
+This test suite is intended to test the end...
13-66: Consider additional negative or edge-case checks in do_test_dataset_create.

While the test verifies a successful dataset creation, testing edge cases (e.g., empty rows, invalid row structures, special characters) can help ensure robustness.

68-70: Implement the empty test for test_llm_judge_scorer_create.

This test is currently empty. Either implement assertions validating LLM judge scorer creation or remove the placeholder to avoid confusion.

72-75: Provide validation logic in test_evaluation_create.

This test calls do_test_dataset_create but ends without actual evaluation verification. Add relevant assertions or remove this placeholder to maintain test clarity.

77-79: Fill in or remove the placeholder test test_model_create.

Without any assertions or logic, this test currently serves no purpose.

81-83: Complete or remove test_evaluation_run_remotely.

The test stands empty. Either add logic for remote evaluation or remove it to avoid confusion.

85-87: Implement logic in test_evaluation_run_locally.

Currently, it's only a placeholder. Either make it test local evaluation or remove it altogether.

89-91: Add testing steps to test_list_evaluations.

There is no logic yet. Provide checks that confirm evaluation listing works as expected or remove this empty stub.

93-95: Finalize or remove test_list_evaluation_runs.

Implement real assertions to verify evaluation runs or remove the placeholder test.
tests/trace/builtin_objects/test_builtin_scorer.py (3)

1-7: Add high-level docstring describing test coverage.

While the inline comments list scenarios, consider placing an overarching docstring at the file top to explain the suite’s purpose more formally.

36-57: Test flow is consistent for test_scorer_publishing_alignment.

Key checks (digest equality and type) look good. Consider adding negative tests (e.g., mismatched digests).

68-85: Consider robust error handling in assert_expected_outcome.

If scorer_res is missing expected fields, KeyError could occur. You might add a fallback or more explicit checks for a clearer test failure message.
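One way to get the clearer failure message is a small guard that runs before any field is read. This is a sketch under assumptions: `require_fields` is a hypothetical helper, and `scorer_res` is assumed to behave like a mapping, which the comment implies but the thread does not show.

```python
from typing import Any, Mapping


def require_fields(payload: Mapping[str, Any], *fields: str) -> None:
    """Fail with a readable message if any expected key is missing.

    Hypothetical test helper; the payload shape is an assumption.
    """
    missing = [name for name in fields if name not in payload]
    assert not missing, (
        f"scorer result is missing {missing}; got keys {sorted(payload)}"
    )
```

Calling it at the top of `assert_expected_outcome` would turn a bare `KeyError` into an assertion that names the missing fields and lists the keys that were actually returned.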

tests/trace/builtin_objects/test_builtin_model.py (2)

59-83: Consider adding error-handling or negative-path tests.

This test covers a happy path flow using the remote server call. Adding scenarios with invalid inputs or handling server exceptions could further strengthen reliability.

85-108: DRY opportunity for remote object creation logic.

The creation pattern repeats throughout the tests. Extracting a helper function for object creation could reduce duplication and improve maintainability.
weave/trace_server/server_side_object_saver.py (2)
50-69: Preserve traceback details for easier debugging.

Currently, the code sends only str(e) to the result queue. Including the entire traceback would help diagnose issues in the child process.
+import traceback  # belongs with the module-level imports, not inside the try block
-except Exception as e:
-    result_queue.put(("error", str(e)))
+except Exception:
+    tb = traceback.format_exc()
+    result_queue.put(("error", tb))
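A self-contained version of the same pattern is sketched below. `run_and_report` and `_boom` are hypothetical names, and a `queue.Queue` stands in for the multiprocessing queue, assuming only that the real queue offers the same `put`/`get` interface.

```python
import queue
import traceback
from typing import Any, Callable


def run_and_report(fn: Callable[[], Any], result_queue: Any) -> None:
    """Run fn and put ("ok", value) or ("error", traceback_text) on the queue."""
    try:
        value = fn()
    except Exception:
        # format_exc() keeps the exception type, message, and stack frames,
        # which str(e) would discard.
        result_queue.put(("error", traceback.format_exc()))
    else:
        result_queue.put(("ok", value))


def _boom() -> None:
    raise ValueError("bad input")


q: "queue.Queue[tuple[str, Any]]" = queue.Queue()
run_and_report(_boom, q)
status, detail = q.get()
```

Here `detail` holds the full formatted traceback as a string, so the parent process can log or re-raise it with the failing frame visible.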
112-155: Consider using a context manager to ensure consistent patching.

autopatch.autopatch() and autopatch.reset_autopatch() are invoked here, but a finally block or context manager could guarantee the patch is always reset even if an exception occurs.
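The context-manager variant could look like the sketch below. `patched` is a hypothetical name; in this PR the two callables would be `autopatch.autopatch` and `autopatch.reset_autopatch`, assuming both can be called with no arguments.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def patched(apply: Callable[[], None], reset: Callable[[], None]) -> Iterator[None]:
    """Apply a patch on entry and always reset it on exit."""
    apply()
    try:
        yield
    finally:
        # Runs whether the body returned normally or raised.
        reset()
```

Wrapping the body in `with patched(autopatch.autopatch, autopatch.reset_autopatch):` keeps the reset next to the patch and removes the need for a hand-written `finally` at each call site.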

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 34defc9 and 2fa60e9.

📒 Files selected for processing (16)

pyproject.toml (1 hunks)
tests/trace/builtin_objects/test_builtin_model.py (1 hunks)
tests/trace/builtin_objects/test_builtin_scorer.py (1 hunks)
tests/trace/test_on_demand_evals.py (1 hunks)
weave/builtin_objects/builtin_registry.py (1 hunks)
weave/builtin_objects/models/CompletionModel.py (1 hunks)
weave/builtin_objects/scorers/LLMJudgeScorer.py (1 hunks)
weave/flow/scorer.py (1 hunks)
weave/trace/serialization/serialize.py (2 hunks)
weave/trace/weave_client.py (5 hunks)
weave/trace_server/clickhouse_trace_server_batched.py (3 hunks)
weave/trace_server/external_to_internal_trace_server_adapter.py (1 hunks)
weave/trace_server/server_side_object_saver.py (1 hunks)
weave/trace_server/sqlite_trace_server.py (1 hunks)
weave/trace_server/trace_server_interface.py (2 hunks)
weave/trace_server_bindings/remote_http_trace_server.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py: Focus on pythonic code patterns.
Check for proper exception handling.
Verify type hints usage where applicable.
Look for potential performance improvements.
Don't comment on formatting if black/isort is configured.
Check for proper dependency injection patterns.
Verify proper async handling if applicable.

weave/trace/serialization/serialize.py
weave/builtin_objects/builtin_registry.py
weave/trace_server_bindings/remote_http_trace_server.py
weave/flow/scorer.py
weave/builtin_objects/models/CompletionModel.py
weave/builtin_objects/scorers/LLMJudgeScorer.py
weave/trace_server/external_to_internal_trace_server_adapter.py
weave/trace_server/sqlite_trace_server.py
tests/trace/test_on_demand_evals.py
weave/trace/weave_client.py
tests/trace/builtin_objects/test_builtin_model.py
weave/trace_server/trace_server_interface.py
tests/trace/builtin_objects/test_builtin_scorer.py
weave/trace_server/server_side_object_saver.py
weave/trace_server/clickhouse_trace_server_batched.py

🪛 GitHub Actions: test

weave/builtin_objects/models/CompletionModel.py

[error] 9-9: AttributeError: partially initialized module 'weave' has no attribute 'Model' (most likely due to a circular import)

⏰ Context from checks skipped due to timeout of 90000ms (519)

GitHub Check: notify-wandb-core
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, langchain)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: test-query-service-matrix-check
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, langchain)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: test-query-service-matrix-check
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, langchain)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, langchain)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)

🔇 Additional comments (33)

weave/trace/serialization/serialize.py (1)

10-10: LGTM: Import for builtin registry.

The import for get_builtin from the new registry is appropriately added.

weave/flow/scorer.py (1)

314-317: LGTM: Support for scorers with 'inputs' parameter.

This enhancement adds support for scorers that accept an 'inputs' parameter by passing the complete example dictionary as input. The implementation is clean and maintainable.

The change enables the new LLMJudgeScorer to function correctly by passing all example data in a single parameter, maintaining backward compatibility with existing scorers.
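
To illustrate the dispatch described above, here is a minimal sketch of signature-based argument selection. It is not the code in weave/flow/scorer.py; `build_score_kwargs`, `judge`, and `exact_match` are hypothetical names, and the fallback of matching example columns by parameter name is an assumption.

```python
import inspect
from typing import Any, Callable


def build_score_kwargs(
    score_fn: Callable[..., Any], example: dict, output: Any
) -> dict:
    """Choose keyword arguments for a scorer based on its signature."""
    params = inspect.signature(score_fn).parameters
    kwargs: dict = {"output": output}
    if "inputs" in params:
        # The scorer asked for everything: pass the whole example row.
        kwargs["inputs"] = example
    else:
        # Otherwise pass only the example columns the scorer names.
        kwargs.update(
            {k: v for k, v in example.items() if k in params and k != "output"}
        )
    return kwargs


def judge(inputs: dict, output: Any) -> bool:
    # Hypothetical scorer that takes the full example.
    return output == inputs.get("expected")


def exact_match(expected: str, output: Any) -> bool:
    # Hypothetical scorer that takes a single named column.
    return output == expected
```

Scorers written against named columns keep working under this scheme, which is the backward-compatibility property the comment refers to.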

weave/builtin_objects/builtin_registry.py (3)

1-6: Good implementation of the registry pattern.

The registry implementation uses appropriate typing and imports the necessary classes. The dictionary definition is clean and well-typed.

8-16: LGTM: Robust validation in registration function.

The register_builtin function correctly validates that registered classes are subclasses of weave.Object and prevents duplicate registrations.

22-24: LGTM: Initial builtin registrations.

The initial registrations for LiteLLMCompletionModel and LLMJudgeScorer are correctly implemented.

weave/trace_server_bindings/remote_http_trace_server.py (2)

549-552: Consistent implementation of the call_method API endpoint.

The implementation follows the established pattern in the codebase for HTTP API endpoints, using the _generic_request method with proper typing and request/response objects.

554-557: Consistent implementation of the score_call API endpoint.

The implementation follows the same pattern as other API methods in this class, utilizing the _generic_request method with appropriate types and endpoint path.

weave/trace_server/sqlite_trace_server.py (2)

1253-1254: Appropriate NotImplementedError for local SQLite implementation.

The stub implementation correctly raises NotImplementedError with a descriptive message, clearly indicating that this functionality is not available in the local SQLite implementation.

1256-1257: Appropriate NotImplementedError for local SQLite implementation.

The stub implementation correctly raises NotImplementedError with a descriptive message, clearly indicating that this functionality is not available in the local SQLite implementation.

weave/trace_server/external_to_internal_trace_server_adapter.py (2)

386-392: Implementation follows adapter pattern with proper ID conversion.

The call_method implementation correctly:

Converts the project_id from external to internal representation

Validates and converts the user ID

Uses the _ref_apply method to handle reference transformations

This implementation aligns with the pattern used throughout the adapter class.

394-400: Implementation follows adapter pattern with proper ID conversion.

The score_call implementation correctly follows the same pattern as other methods in this adapter class, with proper ID conversions and validation.

weave/builtin_objects/scorers/LLMJudgeScorer.py (1)

8-10: Address TODO comments before finalizing.

The TODO comment suggests there are unresolved questions about implementation details. Consider resolving these questions and removing the TODO before finalizing the PR.

weave/trace/weave_client.py (2)

498-503: LGTM: Well-structured dataclass for scorer results

The new ClientApplyScorerResult dataclass provides a clear structure for scorer application results. This is a good addition that improves code clarity.

710-720: LGTM: Proper handling of feedback ID

The implementation properly handles the feedback ID, whether the client exists or not. The code follows good patterns for handling async operations.

weave/trace_server/trace_server_interface.py (1)

980-982: LGTM: Protocol methods added consistently

The new methods for the TraceServerInterface protocol are added appropriately under the "Execute API" section, maintaining a clean interface organization.

tests/trace/builtin_objects/test_builtin_scorer.py (9)

9-11: Imports look appropriate.

No concerns: the relevant modules are imported correctly.

14-29: No changes needed; scorer_args is clear.

This dictionary’s purpose and layout appear neatly structured for test usage.

31-34: Score input and expected score are straightforward.

No further improvements needed; these constants serve the test scenario well.

59-66: make_simple_call is concise and functional.

No changes needed; this helper adequately simulates a simple call scenario.

87-99: do_remote_score is well-defined.

The function is minimal and suitable for this testing approach. No actionable issues.

124-129: test_scorer_local_create_local_use fulfills its intention.

No improvements necessary. The test scenario is straightforward and well-implemented.

131-137: test_scorer_local_create_remote_use effectively covers local creation with remote usage.

Test seems sufficient to ensure coverage of cross-boundary usage.

139-146: test_scorer_remote_create_local_use checks correct retrieval from server.

No issues: the test is straightforward and aligns with its stated purpose.

148-153: test_scorer_remote_create_remote_use thoroughly tests end-to-end remote usage.

Implementation looks correct; no further concerns.

tests/trace/builtin_objects/test_builtin_model.py (4)

7-23: Confirm validity of model name and consider negative test scenarios.

Currently, "gpt-4o" is used as the model name, which might be non-standard. Also, the test suite doesn't check negative or edge-case scenarios (e.g., invalid inputs or incomplete model_args). Consider adding those tests to improve robustness.

30-50: Looks good for publication alignment test.

Ensures the local and server-saved objects match digests correctly; the logic appears solid.

53-57: Local create and local use test logic looks fine.

Straightforward verification of the model’s output with no apparent issues.

110-150: Ensure stable test environment for LLM predictions.

Because the tests depend on a consistent model output, consider mocking or stubbing the LLM calls if the environment can vary. This helps avoid flakiness in CI/CD pipelines.
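
One way to stub the call, sketched with the standard library only. `FakeLLM` is a hypothetical stand-in for the client module, and the response shape mirrors the `res.choices[0].message.content` access used in the scorer; in the real tests the patch target would be whatever the model actually calls.

```python
from types import SimpleNamespace
from unittest import mock


class FakeLLM:
    """Hypothetical stand-in for an LLM client; never hits the network."""

    @staticmethod
    def completion(model, messages):
        raise RuntimeError("network call attempted in a test")


def predict(llm, prompt: str) -> str:
    res = llm.completion(
        model="gpt-4o", messages=[{"role": "user", "content": prompt}]
    )
    return res.choices[0].message.content


def canned_response(text: str) -> SimpleNamespace:
    # Build an object with the same attribute path the caller reads.
    message = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


def run_stubbed_prediction() -> str:
    with mock.patch.object(
        FakeLLM, "completion", return_value=canned_response("hello")
    ) as stub:
        result = predict(FakeLLM, "hi")
        stub.assert_called_once()
    return result
```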

weave/trace_server/server_side_object_saver.py (5)

70-75: Clarify handling of a missing user_id.

This method signature allows user_id to be None, but the remote server logic (in UserInjectingExternalTraceServer) raises an error if user_id is missing. Confirm whether run_save_object should similarly enforce a non-null user_id.

180-229: Revisit the TODO for project alignment validation.

The comment at line 208 indicates a missing project-alignment check. Implementing or removing stubs helps maintain clarity and avoids shipping half-finished features.

230-315: Caution with asyncio.run in spawned processes.

While this is likely safe in a separate process, confirm there are no unexpected runtime conflicts when nesting event loops.
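
The conflict in question can be reproduced in isolation. The sketch below shows that `asyncio.run` raises `RuntimeError` when a loop is already running in the current thread, and works when none is; `work` and the two helper functions are illustrative names, not code from the PR.

```python
import asyncio


async def work() -> int:
    return 42


async def nested_run_fails() -> bool:
    """Return True if asyncio.run refuses to start inside a running loop."""
    coro = work()
    try:
        asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return True
    return False


def run_without_existing_loop() -> int:
    # No loop is running in this thread, so asyncio.run is fine here.
    return asyncio.run(work())
```

A freshly spawned process normally starts without a running loop, so the second case is the one that should apply, provided nothing in the child starts a loop before the call.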

319-338: Simple project and user ID converters look good.

Implementation is straightforward with no apparent issues or extraneous complexity.

340-401: UserInjectingExternalTraceServer properly enforces user context.

The approach to enforce user_id in each method invocation is consistent and appears robust. No additional concerns.

coderabbitai · 2025-03-15T02:50:14Z

pyproject.toml

+
+  # Bultins
+  "litellm>=1.49.1",


💡 Verification agent

🧩 Analysis chain

Consider version constraint conflicts with litellm package.

The newly added test dependency litellm>=1.49.1 should be evaluated for potential conflicts with existing constraints:

Line 72: litellm<=1.49.1 for dspy

Line 92: litellm>=1.36.1 for the litellm optional dependency

Line 98: litellm>=1.58 for scorers

These varying version constraints might lead to dependency resolution issues.

🏁 Script executed:

#!/bin/bash
# Check for potential version conflicts with litellm
echo "Checking for litellm version constraints in pyproject.toml:"
grep -n "litellm" pyproject.toml

Length of output: 345

Attention: Resolve litellm Version Conflicts

The new test dependency ("litellm>=1.49.1", line 142) is inconsistent with other version constraints:

dspy (line 72) and notdiamond (line 108) restrict litellm to <=1.49.1, effectively forcing version 1.49.1.

Scorers (line 98) require litellm>=1.58, which makes a version of 1.49.1 unacceptable.

Optional dependency (line 92) demands litellm>=1.36.1, which adds another layer to the mix.

These conflicts create a scenario where no single litellm version can satisfy all constraints.

Action Required: Re-assess and align the constraints (or refactor dependency usage) to ensure that a compatible version can be chosen across all modules.
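
To make the conflict concrete, here is a small sketch that checks which candidate versions satisfy a set of bounds. It only understands `>=` and `<=` on dotted numeric versions; real resolution should use a proper tool such as `packaging.specifiers.SpecifierSet`. The candidate list is illustrative, and the conflict only applies when the affected extras are installed into the same environment.

```python
import operator
from typing import List, Tuple

Version = Tuple[int, int, int]
_OPS = {">=": operator.ge, "<=": operator.le}


def parse_version(text: str) -> Version:
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))  # pad "1.58" to (1, 58, 0)
    return (parts[0], parts[1], parts[2])


def satisfies(version: Version, spec: str) -> bool:
    symbol, bound = spec[:2], spec[2:]
    return _OPS[symbol](version, parse_version(bound))


def compatible(candidates: List[str], specs: List[str]) -> List[str]:
    """Return the candidates that satisfy every spec."""
    return [
        c for c in candidates
        if all(satisfies(parse_version(c), s) for s in specs)
    ]


CANDIDATES = ["1.36.1", "1.49.1", "1.58.0"]
# dspy / notdiamond, the new test dependency, and the litellm extra:
PINNED = ["<=1.49.1", ">=1.49.1", ">=1.36.1"]
# The same set plus the scorers requirement:
WITH_SCORERS = PINNED + [">=1.58"]
```

With these inputs, `compatible(CANDIDATES, PINNED)` leaves only 1.49.1, and adding the scorers bound leaves nothing.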

coderabbitai · 2025-03-15T02:50:14Z

weave/trace/serialization/serialize.py

+        elif (
+            isinstance(val_type, str)
+            and obj.get("_class_name") == val_type
+            and (baseObject := get_builtin(val_type))
+        ):
+            valid_keys = baseObject.model_fields.keys()
+            return baseObject.model_validate(
+                {k: v for k, v in obj.items() if k in valid_keys}
+            )


⚠️ Potential issue

Add error handling for missing registry entries.

The new code adds support for serializing objects from the builtin registry, which is good. However, there's no error handling if get_builtin(val_type) fails to find an entry in the registry.

Consider adding try/except to handle potential KeyError:

         elif (
             isinstance(val_type, str)
             and obj.get("_class_name") == val_type
-            and (baseObject := get_builtin(val_type))
+            and val_type in _BUILTIN_REGISTRY
         ):
-            valid_keys = baseObject.model_fields.keys()
-            return baseObject.model_validate(
-                {k: v for k, v in obj.items() if k in valid_keys}
-            )
+            try:
+                baseObject = get_builtin(val_type)
+                valid_keys = baseObject.model_fields.keys()
+                return baseObject.model_validate(
+                    {k: v for k, v in obj.items() if k in valid_keys}
+                )
+            except KeyError:
+                # Fall through to the next condition if the builtin isn't found
+                pass

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

        elif (
            isinstance(val_type, str)
            and obj.get("_class_name") == val_type
            and (baseObject := get_builtin(val_type))
        ):
            valid_keys = baseObject.model_fields.keys()
            return baseObject.model_validate(
                {k: v for k, v in obj.items() if k in valid_keys}
            )

        elif (
            isinstance(val_type, str)
            and obj.get("_class_name") == val_type
            and val_type in _BUILTIN_REGISTRY
        ):
            try:
                baseObject = get_builtin(val_type)
                valid_keys = baseObject.model_fields.keys()
                return baseObject.model_validate(
                    {k: v for k, v in obj.items() if k in valid_keys}
                )
            except KeyError:
                # Fall through to the next condition if the builtin isn't found
                pass
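
An alternative that keeps the original walrus condition is a lookup that returns `None` for unknown names, so unregistered types fall through to the next branch without a try/except. The sketch below is self-contained and hypothetical: `find_builtin`, `revive`, `Greeter`, and the `_type` key are illustrative and not taken from serialize.py.

```python
from typing import Any, Dict, Optional


class Greeter:
    """Hypothetical registered class used only for this example."""

    def __init__(self, name: str = "world") -> None:
        self.name = name


_REGISTRY: Dict[str, type] = {"Greeter": Greeter}


def find_builtin(name: str) -> Optional[type]:
    # dict.get returns None instead of raising KeyError.
    return _REGISTRY.get(name)


def revive(obj: dict) -> Any:
    val_type = obj.get("_type")
    if (
        isinstance(val_type, str)
        and obj.get("_class_name") == val_type
        and (cls := find_builtin(val_type))
    ):
        return cls(name=obj.get("name", "world"))
    return obj  # unknown types fall through untouched
```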

coderabbitai · 2025-03-15T02:50:15Z

weave/builtin_objects/builtin_registry.py

+def get_builtin(name: str) -> type[weave.Object]:
+    return _BUILTIN_REGISTRY[name]
+


🛠️ Refactor suggestion

Add error handling for missing registry entries.

The get_builtin function doesn't handle the case where a name isn't in the registry, which could lead to uncaught KeyError exceptions.

 def get_builtin(name: str) -> type[weave.Object]:
-    return _BUILTIN_REGISTRY[name]
+    try:
+        return _BUILTIN_REGISTRY[name]
+    except KeyError:
+        raise KeyError(f"No builtin object found with name '{name}'")

Alternatively, you could use the dictionary's get method with a more informative error:

 def get_builtin(name: str) -> type[weave.Object]:
-    return _BUILTIN_REGISTRY[name]
+    cls = _BUILTIN_REGISTRY.get(name)
+    if cls is None:
+        raise KeyError(f"No builtin object found with name '{name}'")
+    return cls

📝 Committable suggestion


Suggested change

def get_builtin(name: str) -> type[weave.Object]:
    return _BUILTIN_REGISTRY[name]

def get_builtin(name: str) -> type[weave.Object]:
    try:
        return _BUILTIN_REGISTRY[name]
    except KeyError:
        raise KeyError(f"No builtin object found with name '{name}'")

coderabbitai · 2025-03-15T02:50:15Z

weave/builtin_objects/scorers/LLMJudgeScorer.py

+    @weave.op()
+    def score(self, inputs: dict, output: Any) -> str:
+        user_prompt = json.dumps(
+            {
+                "inputs": inputs,
+                "output": output,
+            }
+        )
+
+        messages = [
+            {"role": "system", "content": self.system_prompt},
+            {"role": "user", "content": user_prompt},
+        ]
+
+        res = litellm.completion(
+            model=self.model,
+            messages=messages,
+            response_format=self.response_format,
+        )
+
+        return json.loads(res.choices[0].message.content)


⚠️ Potential issue

Add error handling to the score method.

The current implementation lacks error handling for API calls and JSON parsing, which could lead to unhandled exceptions in production. Consider adding try/except blocks to handle potential errors gracefully.

Apply this diff to improve error handling:

     @weave.op()
     def score(self, inputs: dict, output: Any) -> str:
-        user_prompt = json.dumps(
-            {
-                "inputs": inputs,
-                "output": output,
-            }
-        )
+        try:
+            user_prompt = json.dumps(
+                {
+                    "inputs": inputs,
+                    "output": output,
+                }
+            )

-        messages = [
-            {"role": "system", "content": self.system_prompt},
-            {"role": "user", "content": user_prompt},
-        ]
+            messages = [
+                {"role": "system", "content": self.system_prompt},
+                {"role": "user", "content": user_prompt},
+            ]

-        res = litellm.completion(
-            model=self.model,
-            messages=messages,
-            response_format=self.response_format,
-        )
+            res = litellm.completion(
+                model=self.model,
+                messages=messages,
+                response_format=self.response_format,
+            )

-        return json.loads(res.choices[0].message.content)
+            return json.loads(res.choices[0].message.content)
+        except Exception as e:
+            # Log the error
+            print(f"Error in LLMJudgeScorer: {str(e)}")
+            # Return a structured error response
+            return {"error": str(e), "status": "failed"}
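
A narrower variant is also worth considering: a blanket `except Exception` that returns an error dict may end up recorded as if it were a score. The sketch below isolates only the parsing step and raises a typed error instead; `JudgeResponseError` and `parse_judge_output` are hypothetical names, and the requirement that the judge return a JSON object is an assumption. Note also that `score` is annotated `-> str` while `json.loads` on a JSON object yields a dict.

```python
import json
from typing import Any


class JudgeResponseError(ValueError):
    """Hypothetical error type for unparseable judge output."""


def parse_judge_output(raw: Any) -> dict:
    """Parse the judge's message content, failing loudly on bad input."""
    if not isinstance(raw, str):
        raise JudgeResponseError(
            f"expected text from the judge, got {type(raw).__name__}"
        )
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise JudgeResponseError(f"judge returned invalid JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        raise JudgeResponseError("judge output must be a JSON object")
    return parsed
```

Callers can then decide whether a failed parse should abort the evaluation or be logged and skipped, rather than having that choice made inside the scorer.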

coderabbitai · 2025-03-15T02:50:15Z

weave/builtin_objects/models/CompletionModel.py

+import json
+from typing import Any, Optional
+
+import litellm
+
+import weave
+


⚠️ Potential issue

Fix circular import issue with weave.Model

There's a pipeline failure indicating a circular import: AttributeError: partially initialized module 'weave' has no attribute 'Model'. This needs to be fixed to prevent runtime errors.

Consider using a relative import or restructuring the imports to avoid this circular dependency. For example:

-import weave
+from weave.flow.object import Model

Or delay the import using:

import weave
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from weave.flow.object import Model


coderabbitai · 2025-03-15T02:50:15Z

weave/trace_server/clickhouse_trace_server_batched.py

+    def call_method(self, req: tsi.CallMethodReq) -> tsi.CallMethodRes:
+        from weave.trace_server.server_side_object_saver import RunAsUser
+
+        if req.wb_user_id is None:
+            raise ValueError("User ID is required")
+
+        runner = RunAsUser(ch_server_dump=self.model_dump())
+        # TODO: handle errors here
+        res = runner.run_call_method(
+            req.object_ref, req.project_id, req.wb_user_id, req.method_name, req.args
+        )
+        return tsi.CallMethodRes.model_validate(res)
+
+    def score_call(self, req: tsi.ScoreCallReq) -> tsi.ScoreCallRes:
+        from weave.trace_server.server_side_object_saver import RunAsUser
+
+        runner = RunAsUser(ch_server_dump=self.model_dump())
+        res = runner.run_score_call(req)
+
+        return tsi.ScoreCallRes(
+            feedback_id=res["feedback_id"],
+            score_call=self.call_read(
+                tsi.CallReadReq(project_id=req.project_id, id=res["scorer_call_id"])
+            ).call,
+        )
+


🛠️ Refactor suggestion

Handle method invocation errors for call_method and score_call.

The # TODO: handle errors here note highlights a missing error-handling strategy. Wrap runner.run_call_method and runner.run_score_call in a try-except block, returning appropriate error responses or raising well-defined exceptions instead of relying on unhandled runtime errors.
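
As a rough shape for what "well-defined exceptions" could look like, the sketch below wraps an arbitrary callable and re-raises failures with context while preserving the original cause. `MethodInvocationError` and `invoke_with_context` are hypothetical; how the trace server should surface such errors over HTTP is left open.

```python
from typing import Any, Callable


class MethodInvocationError(RuntimeError):
    """Hypothetical wrapper for failures while running a remote method."""


def invoke_with_context(
    fn: Callable[..., Any], *args: Any, method_name: str, **kwargs: Any
) -> Any:
    """Call fn and convert any failure into a MethodInvocationError."""
    try:
        return fn(*args, **kwargs)
    except Exception as exc:
        # Chain the original exception so the root cause stays visible.
        raise MethodInvocationError(
            f"call to {method_name!r} failed: {exc}"
        ) from exc
```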

coderabbitai · 2025-03-15T02:50:15Z

weave/trace_server/clickhouse_trace_server_batched.py

+    def model_dump(self) -> dict[str, Any]:
+        return {
+            "host": self._host,
+            "port": self._port,
+            "user": self._user,
+            "password": self._password,
+            "database": self._database,
+            "use_async_insert": self._use_async_insert,
+        }
+


⚠️ Potential issue

Avoid returning the raw database password from model_dump.

Exposing credentials (even just in memory or logs) can pose a security risk. Consider returning masked credentials or removing the password field altogether if not strictly necessary.

-            "password": self._password,
+            "password": "*****",  # or omit entirely

📝 Committable suggestion


Suggested change

    def model_dump(self) -> dict[str, Any]:
        return {
            "host": self._host,
            "port": self._port,
            "user": self._user,
            "password": self._password,
            "database": self._database,
            "use_async_insert": self._use_async_insert,
        }

    def model_dump(self) -> dict[str, Any]:
        return {
            "host": self._host,
            "port": self._port,
            "user": self._user,
            "password": "*****",  # or omit entirely
            "database": self._database,
            "use_async_insert": self._use_async_insert,
        }
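
One caveat: the dump is passed to `RunAsUser(ch_server_dump=self.model_dump())`, so if the runner uses it to reconnect, masking the password inside `model_dump` itself would likely break that path. A possible middle ground is to keep the full dump for the runner and use a separate redacted view for anything that is logged or printed. The helper below is a hypothetical sketch of that idea, not code from the PR.

```python
from typing import Any, Dict

SENSITIVE_KEYS = frozenset({"password"})


def redact(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config that is safe to log; the input is untouched."""
    return {
        key: ("*****" if key in SENSITIVE_KEYS and value is not None else value)
        for key, value in config.items()
    }
```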

coderabbitai · 2025-03-15T02:50:15Z

tests/trace/builtin_objects/test_builtin_scorer.py

+def make_remote_scorer(client: WeaveClient):
+    obj_create_res = client.server.obj_create(
+        tsi.ObjCreateReq.model_validate(
+            {
+                "obj": {
+                    "project_id": client._project_id(),
+                    "object_id": "CorrectnessJudge",
+                    "val": scorer_args,
+                    "set_leaf_object_class": "LLMJudgeScorer",
+                }
+            }
+        )
+    )
+    client._flush()
+    obj_ref = weave.ObjectRef(
+        entity=client._project_id().split("/")[0],
+        project=client._project_id().split("/")[1],
+        name="CorrectnessJudge",
+        _digest=obj_create_res.digest,
+    )
+    return obj_ref
+


🛠️ Refactor suggestion

Ensure make_remote_scorer handles creation errors.

If the server-side creation fails (e.g., permission or invalid data), the test might crash. Consider verifying the server response to better diagnose issues.
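One way to do this is a small guard that fails with a descriptive message before the ref is built. This is a sketch under assumptions: require_digest is a hypothetical helper, and the only thing it relies on is that the response exposes a digest attribute, as obj_create_res does in the test above.

```python
from typing import Any


def require_digest(res: Any, what: str) -> str:
    """Return res.digest, or raise a descriptive error if it is missing or empty."""
    digest = getattr(res, "digest", None)
    if not digest:
        raise RuntimeError(f"Failed to create {what}: server returned no digest")
    return digest
```

In make_remote_scorer this would replace the direct obj_create_res.digest access, e.g. _digest=require_digest(obj_create_res, "CorrectnessJudge").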

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (10)

tests/trace/builtin_objects/test_builtin_evaluation.py (3)

10-16: Consider using dynamically generated references instead of hardcoded ones.

The test uses hardcoded weave URIs which could make the test brittle if the referenced resources change. Consider creating the necessary resources as part of the test setup or using more generic reference patterns.

-    eval_args = {
-        "dataset": f"weave:///{client.entity}/{client.project}/object/Dataset:CObSXNGcvzDYN4LvAjXPux46YNMF2CJ5SZKTBbbEJy0",
-        "scorers": [
-            f"weave:///{client.entity}/{client.project}/object/LLMJudgeScorer:KlrVBCHEcKqPzl6zYCXuODBnEa8MTxYx9JOvPpS9gI0"
-        ],
-        "trials": 1,
-    }
+    # Create and publish dataset and scorer first, then use their URIs
+    dataset = weave.Dataset(rows=[{"input": "hi", "output": "hello"}])
+    dataset_ref = weave.publish(dataset)
+    
+    scorer = LLMJudgeScorer(
+        model="gpt-4o",
+        system_prompt="You are a judge that scores the correctness of a response.",
+        response_format={
+            "type": "json_schema",
+            "json_schema": {
+                "name": "Correctness",
+                "schema": {
+                    "type": "object",
+                    "properties": {
+                        "is_correct": {"type": "boolean"},
+                    },
+                },
+            },
+        },
+    )
+    scorer_ref = weave.publish(scorer)
+    
+    eval_args = {
+        "dataset": dataset_ref.uri(),
+        "scorers": [scorer_ref.uri()],
+        "trials": 1,
+    }

55-57: Consider using conditional assertion for nested refs feature.

The comment indicates that this assertion will fail until nested refs are supported. Consider using a conditional assertion or skip mechanism that can be easily enabled once the feature is available.

-    # This is going to fail until we support nested refs
-    # Not required for MVP
-    assert obj_create_res.digest == publish_ref.digest
+    # TODO: Enable this assertion when nested refs are supported
+    import os
+    if os.environ.get("ENABLE_NESTED_REFS_TEST", "").lower() in ("1", "true"):
+        assert obj_create_res.digest == publish_ref.digest, "Nested refs support is now available, but digests don't match"

62-71: Consider implementing remaining test cases.

The commented-out section indicates there are additional test cases to be implemented. Consider adding these as separate test functions (possibly with pytest.mark.skip) to make it clear what's still pending implementation.

Would you like help implementing these additional test functions that would be skipped until the required functionality is available?

weave/trace_server/clickhouse_trace_server_batched.py (1)

1734-1737: Add implementation timeline for evaluate_stream.

The method is currently stubbed with a NotImplementedError. Consider adding a comment with the expected implementation timeline, or track it in your issue tracker.

     async def evaluate_stream(
         self, req: tsi.EvaluateReq
     ) -> AsyncIterator[tsi.EvaluateStepRes]:
-        raise NotImplementedError("Evaluate stream is not yet implemented")
+        # TODO(issue-123): Implement evaluate_stream by [DATE]
+        raise NotImplementedError("Evaluate stream will be implemented in a future PR")

tests/trace/builtin_objects/test_builtin_model.py (2)

85-108: Add error handling for object creation in remote tests.

Similar to the scorer tests, consider adding error handling for the server-side object creation to make tests more robust.

     obj_create_res = client.server.obj_create(
         tsi.ObjCreateReq.model_validate(
             {
                 "obj": {
                     "project_id": client._project_id(),
                     "object_id": "LiteLLMCompletionModel",
                     "val": model_args,
                     "builtin_object_class": "LiteLLMCompletionModel",
                 }
             }
         )
     )
+    # Verify that the object was created successfully
+    if not obj_create_res or not obj_create_res.digest:
+        raise ValueError("Failed to create remote model object")

111-149: Consider extracting common object creation code.

The remote object creation code is duplicated in multiple test functions. Consider extracting it to a helper function to reduce duplication.

def make_remote_model(client: WeaveClient):
    obj_create_res = client.server.obj_create(
        tsi.ObjCreateReq.model_validate(
            {
                "obj": {
                    "project_id": client._project_id(),
                    "object_id": "LiteLLMCompletionModel",
                    "val": model_args,
                    "builtin_object_class": "LiteLLMCompletionModel",
                }
            }
        )
    )
    if not obj_create_res or not obj_create_res.digest:
        raise ValueError("Failed to create remote model object")
    
    return ObjectRef(
        entity=client._project_id().split("/")[0],
        project=client._project_id().split("/")[1],
        name="LiteLLMCompletionModel",
        _digest=obj_create_res.digest,
    )

weave/trace_server/trace_server_interface.py (1)

935-949: Consider adding more details to empty response classes

The evaluation response classes (EvaluateStartRes, EvaluatePredictAndScoreRes, EvaluateSummaryRes) are currently empty. Consider adding documentation comments or example fields that might be needed in the future.

 class EvaluateStartRes(BaseModel):
-    pass
+    """Indicates that an evaluation has started."""
+    # Reserved for future use
+    pass

 class EvaluatePredictAndScoreRes(BaseModel):
-    pass
+    """Contains prediction and scoring information for an evaluation step."""
+    # Reserved for future use
+    pass

 class EvaluateSummaryRes(BaseModel):
-    pass
+    """Contains the summary of a completed evaluation."""
+    # Reserved for future use
+    pass

weave/trace_server/server_side_object_saver.py (3)

157-230: Potential hang in process handling

The process is joined after the result is read, but result_queue.get() blocks indefinitely if the child process hangs or exits without posting a result. Consider adding a timeout parameter and terminating the process when it expires.

 def run_call_method(
     self,
     obj_ref: str,
     project_id: str,
     user_id: str,
     method_name: str,
     args: dict[str, Any],
+    timeout: float = 300.0,
 ) -> str:
     result_queue: multiprocessing.Queue[tuple[str, Any]] = multiprocessing.Queue()

     process = multiprocessing.Process(
         target=self._call_method,
         args=(obj_ref, project_id, user_id, method_name, args, result_queue),
     )

     process.start()
-    status, result = result_queue.get()
-    process.join()
+    try:
+        status, result = result_queue.get(timeout=timeout)
+        process.join(timeout=5.0)  # Give it 5 seconds to join
+        if process.is_alive():
+            process.terminate()
+            process.join()
+    except Exception as e:
+        process.terminate()
+        process.join()
+        raise RunCallMethodException(f"Process execution timed out or failed: {str(e)}")

     if status == "error":
         raise RunCallMethodException(f"Process execution failed: {result}")

     return result

231-318: Comprehensive score call implementation

The run_score_call and _score_call methods include:

- Proper type validations
- Error handling
- Resource cleanup
- Proper async handling with asyncio.run()

However, as with run_call_method, a timeout mechanism would be beneficial.
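Since both run_call_method and run_score_call follow the same start/get/join pattern, the timeout logic could live in one shared helper. The following is a rough sketch, not the PR's implementation: run_with_timeout, _worker, and ProcessTimeoutError are hypothetical names, and the "fork" start method is assumed purely to keep the example simple (it is POSIX-only).

```python
import multiprocessing
import queue
from typing import Any, Callable, Tuple


class ProcessTimeoutError(Exception):
    """Hypothetical error raised when the worker posts no result in time."""


def _worker(fn: Callable[..., Any], args: Tuple[Any, ...], result_queue: Any) -> None:
    # Runs in the child: always post either a result or an error description.
    try:
        result_queue.put(("ok", fn(*args)))
    except Exception as exc:
        result_queue.put(("error", repr(exc)))


def run_with_timeout(
    fn: Callable[..., Any], args: Tuple[Any, ...] = (), timeout: float = 300.0
) -> Any:
    """Run fn(*args) in a child process, giving up after `timeout` seconds."""
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX host
    result_queue = ctx.Queue()
    process = ctx.Process(target=_worker, args=(fn, args, result_queue))
    process.start()
    try:
        status, result = result_queue.get(timeout=timeout)
    except queue.Empty:
        # No answer in time: stop the child so it cannot linger.
        process.terminate()
        process.join()
        raise ProcessTimeoutError(f"no result within {timeout} seconds") from None
    # Result received: give the child a moment to exit, then force it.
    process.join(timeout=5.0)
    if process.is_alive():
        process.terminate()
        process.join()
    if status == "error":
        raise RuntimeError(f"worker failed: {result}")
    return result
```

Results still have to be picklable to cross the queue, and a terminated child gets no chance to run cleanup code, so anything it must release should be handled by the parent.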

341-410: Consistent user ID injection

The UserInjectingExternalTraceServer class properly validates and injects user IDs across all operations, maintaining consistency. The implementation pattern is repeated across all methods, which ensures reliability but creates some code duplication.

Consider using a decorator or helper method to reduce duplication in the user ID validation and injection pattern:

+    def _inject_user_id(self, req, field_name='wb_user_id'):
+        if self._user_id is None:
+            raise ValueError("User ID is required")
+        setattr(req, field_name, self._user_id)
+        return req

     def call_start(self, req: tsi.CallStartReq) -> tsi.CallStartRes:
-        if self._user_id is None:
-            raise ValueError("User ID is required")
-        req.start.wb_user_id = self._user_id
+        self._inject_user_id(req.start)
         return super().call_start(req)
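The decorator variant mentioned above might look roughly like this. It is a self-contained sketch with stand-in request and server classes; with_user_id and the simplified CallStartReq, Start, and BaseServer types are assumptions made for illustration, not the PR's actual classes.

```python
import functools
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Start:
    wb_user_id: Optional[str] = None


@dataclass
class CallStartReq:
    start: Start


class BaseServer:
    def call_start(self, req: CallStartReq) -> Optional[str]:
        # Stand-in for the real server: echo back the injected user ID.
        return req.start.wb_user_id


def with_user_id(get_target: Callable[[Any], Any]) -> Callable:
    """Validate self._user_id and set it on the object chosen by get_target."""

    def decorator(method: Callable) -> Callable:
        @functools.wraps(method)
        def wrapper(self: Any, req: Any) -> Any:
            if self._user_id is None:
                raise ValueError("User ID is required")
            get_target(req).wb_user_id = self._user_id
            return method(self, req)

        return wrapper

    return decorator


class UserInjectingServer(BaseServer):
    def __init__(self, user_id: Optional[str]):
        self._user_id = user_id

    @with_user_id(lambda req: req.start)
    def call_start(self, req: CallStartReq) -> Optional[str]:
        return super().call_start(req)
```

The get_target callable covers requests where the field lives on a nested object (req.start here) as well as on the request itself (lambda req: req), which keeps each overridden method down to its decorator and the super() call.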

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2fa60e9 and ce707ac.

📒 Files selected for processing (16)

tests/conftest.py (2 hunks)
tests/trace/builtin_objects/test_builtin_evaluation.py (1 hunks)
tests/trace/builtin_objects/test_builtin_model.py (1 hunks)
tests/trace/builtin_objects/test_builtin_scorer.py (1 hunks)
weave-js/src/components/FancyPage/useProjectSidebar.ts (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3.tsx (3 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/context.tsx (4 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/EvalStudio.tsx (1 hunks)
weave/builtin_objects/builtin_registry.py (1 hunks)
weave/trace/serialization/serialize.py (2 hunks)
weave/trace_server/clickhouse_trace_server_batched.py (4 hunks)
weave/trace_server/external_to_internal_trace_server_adapter.py (2 hunks)
weave/trace_server/server_side_object_saver.py (1 hunks)
weave/trace_server/sqlite_trace_server.py (2 hunks)
weave/trace_server/trace_server_interface.py (3 hunks)
weave/trace_server_bindings/remote_http_trace_server.py (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

weave/builtin_objects/builtin_registry.py
weave/trace/serialization/serialize.py

🧰 Additional context used

📓 Path-based instructions (2)

**/*.{js,jsx,ts,tsx}: Focus on architectural and logical issues rather than style (assuming ESLint is in place).
Flag potential memory leaks and performance bottlenecks.
Check for proper error handling and async/await usage.
Avoid strict enforcement of try/catch blocks - accept Promise chains, early returns, and other clear error handling patterns. These are acceptable as long as they maintain clarity and predictability.
Ensure proper type usage in TypeScript files.
Look for security vulnerabilities in data handling.
Don't comment on formatting if prettier is configured.
Verify proper React hooks usage and component lifecycle.
Check for proper state management patterns.

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/EvalStudio.tsx
weave-js/src/components/FancyPage/useProjectSidebar.ts
weave-js/src/components/PagePanelComponents/Home/Browse3.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/context.tsx

**/*.py: Focus on pythonic code patterns.
Check for proper exception handling.
Verify type hints usage where applicable.
Look for potential performance improvements.
Don't comment on formatting if black/isort is configured.
Check for proper dependency injection patterns.
Verify proper async handling if applicable.

tests/trace/builtin_objects/test_builtin_evaluation.py
tests/conftest.py
weave/trace_server_bindings/remote_http_trace_server.py
tests/trace/builtin_objects/test_builtin_scorer.py
weave/trace_server/external_to_internal_trace_server_adapter.py
weave/trace_server/clickhouse_trace_server_batched.py
weave/trace_server/sqlite_trace_server.py
tests/trace/builtin_objects/test_builtin_model.py
weave/trace_server/trace_server_interface.py
weave/trace_server/server_side_object_saver.py

⏰ Context from checks skipped due to timeout of 90000ms (671)

GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, scorers)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, notdiamond)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, dspy)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: test-query-service-matrix-check
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)

🔇 Additional comments (44)

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/EvalStudio.tsx (1)

1-10: New component looks good but is currently a placeholder

The component follows good React and TypeScript patterns with proper typing of props. However, it's currently just a placeholder that will need to be implemented with actual functionality in a future PR.

Is this intentionally a minimal implementation to be expanded upon in future PRs?

weave-js/src/components/FancyPage/useProjectSidebar.ts (1)

177-183: Sidebar item for EvalStudio added correctly

The new EvalStudio sidebar button follows the established pattern of other sidebar items. It's correctly positioned in the navigation hierarchy before the Prompts item and uses appropriate visibility controls.

weave-js/src/components/PagePanelComponents/Home/Browse3.tsx (3)

69-69: Import for EvalStudioPage added properly

The import statement for the new EvalStudioPage component is added correctly.

446-448: Route for EvalStudio added correctly

The new route for the EvalStudio page follows the existing pattern and correctly uses the project root path.

965-968: EvalStudioPageBinding component implementation looks good

The binding component correctly extracts and passes the entity and project parameters to the EvalStudioPage component, following the pattern established by other similar binding components in the file.

weave-js/src/components/PagePanelComponents/Home/Browse3/context.tsx (4)

223-225: Method stub added to browse2Context

The placeholder method for evalStudioUri in browse2Context follows the established pattern. Note that it just throws an error, which is expected for the browse2Context implementation.

499-501: Implementation of evalStudioUri in browse3ContextGen

The implementation correctly constructs the URI using the projectRoot function, consistent with other URI generation methods in the codebase.

611-612: Type definition updated for RouteType

The RouteType interface is correctly updated to include the new evalStudioUri method, maintaining type safety throughout the application.

745-747: Peeking router implementation for evalStudioUri

The implementation in the peeking router correctly follows the pattern established for other routes, ensuring consistent behavior with the peek functionality.

tests/conftest.py (3)

86-86: Default weave-server changed from sqlite to clickhouse.

The default database server for testing is now set to "clickhouse" instead of "sqlite". This change aligns with the PR objective of supporting on-demand evaluations, which may rely on clickhouse functionality.

397-401: Added evaluate_stream method to TestOnlyUserInjectingExternalTraceServer.

The implementation follows the established pattern in the class where the user ID is injected into the request before calling the parent class implementation. This is consistent with other methods in this class.
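
The injection pattern described here can be sketched in a few lines. This is a minimal illustration, not the code in tests/conftest.py: the class names, the dataclass request, and the dict return value are hypothetical stand-ins, and only the idea of setting wb_user_id before delegating to the parent comes from the comment above.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvaluateReqSketch:
    # Hypothetical request type; the real request model is not shown in this thread.
    project_id: str
    wb_user_id: Optional[str] = None


class ParentServerSketch:
    async def evaluate_stream(self, req: EvaluateReqSketch) -> dict:
        # Stand-in for the parent implementation: report who the request ran as.
        return {"project_id": req.project_id, "ran_as": req.wb_user_id}


class UserInjectingServerSketch(ParentServerSketch):
    def __init__(self, user_id: str) -> None:
        self._user_id = user_id

    async def evaluate_stream(self, req: EvaluateReqSketch) -> dict:
        # Inject the test user, then defer to the parent implementation.
        req.wb_user_id = self._user_id
        return await super().evaluate_stream(req)


result = asyncio.run(
    UserInjectingServerSketch("test-user").evaluate_stream(
        EvaluateReqSketch("entity/project")
    )
)
```

The caller never sets wb_user_id; the subclass fills it in, which is why the override can stay a thin wrapper.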

395-395:
❓ Verification inconclusive

Informative comment indicating potential implementation gaps.

This comment correctly identifies that implementations for call_method and score_call might be missing in the TestOnlyUserInjectingExternalTraceServer class. Consider implementing these methods to ensure complete test coverage of the new functionality.

🏁 Script executed:
#!/bin/bash
# Check if similar methods exist in the parent ExternalTraceServer class
grep -A 5 "def call_method" weave/trace_server/external_to_internal_trace_server_adapter.py
grep -A 5 "def score_call" weave/trace_server/external_to_internal_trace_server_adapter.py
Length of output: 891

Action Required: Clarify the Test Class Implementation for call_method and score_call

Although the parent class in weave/trace_server/external_to_internal_trace_server_adapter.py implements both call_method and score_call, the test-only class (TestOnlyUserInjectingExternalTraceServer in tests/conftest.py) does not override or explicitly implement these methods. This might be intentional if the tests are meant to use the inherited logic; however, if specialized behavior (or dedicated instrumentation for testing) is required, please consider either:

- Overriding these methods in the test class with custom stubs or wrappers that call the parent implementations, or
- Adding documentation to clarify that the test class is designed to rely on the parent’s implementations.

Please verify which approach is intended to ensure complete and clear test coverage.

weave/trace_server/sqlite_trace_server.py (3)

1253-1254: Appropriate stub for call_method in SqliteTraceServer.

The method raises a NotImplementedError with a clear message indicating that this functionality is not supported in the SQLite implementation. This is a correct approach for maintaining interface compatibility while clearly indicating lack of support.

1256-1257: Appropriate stub for score_call in SqliteTraceServer.

Similar to call_method, this method correctly raises NotImplementedError with a descriptive message indicating that the functionality is not implemented for the local SQLite trace server.

1259-1262: Proper async signature for evaluate_stream stub.

The method is correctly defined as async and raises a NotImplementedError with a descriptive message. The return type signature using AsyncIterator is appropriate for the streaming nature of this method.
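
For readers unfamiliar with the stub pattern praised in these three comments, here is a minimal sketch. The class name, the dict-typed requests, and the error messages are assumptions for illustration; the actual SqliteTraceServer signatures use the tsi request and response types.

```python
import asyncio
from collections.abc import AsyncIterator


class LocalOnlyServerSketch:
    """Hypothetical server that keeps the interface but rejects unsupported calls."""

    def call_method(self, req: dict) -> dict:
        raise NotImplementedError("call_method is not supported in this sketch")

    def score_call(self, req: dict) -> dict:
        raise NotImplementedError("score_call is not supported in this sketch")

    async def evaluate_stream(self, req: dict) -> AsyncIterator[dict]:
        # Without a `yield` in the body this is a coroutine function,
        # so the error surfaces when the call is awaited.
        raise NotImplementedError("evaluate_stream is not supported in this sketch")


server = LocalOnlyServerSketch()
try:
    asyncio.run(server.evaluate_stream({}))
except NotImplementedError as exc:
    message = str(exc)
```

Callers that depend on these methods fail fast with a descriptive message instead of hitting an AttributeError on a missing method.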

tests/trace/builtin_objects/test_builtin_evaluation.py (1)

18-38: LGTM: Well-structured Evaluation object creation.

The evaluation is properly constructed with a dataset and a scorer, following the expected pattern for the Weave Evaluation API.

weave/trace_server_bindings/remote_http_trace_server.py (3)

549-552: Correct implementation of call_method for RemoteHTTPTraceServer.

The method properly delegates to _generic_request with the appropriate endpoint and request/response types.

554-557: Correct implementation of score_call for RemoteHTTPTraceServer.

Similar to call_method, this method properly uses _generic_request with the appropriate endpoint and types.

559-562:
✅ Verification successful

NotImplementedError for evaluate_stream is appropriate for now.

The error message clearly indicates that the functionality is not yet implemented. Since this is an async method for streaming results, it likely requires a different implementation approach than the other methods.

🏁 Script executed:
#!/bin/bash
# Check if a streaming method implementation pattern exists in this codebase
grep -r "async def" --include="*.py" weave/trace_server_bindings/
grep -r "AsyncIterator" --include="*.py" weave/trace_server_bindings/
Length of output: 433

Async Streaming Stub Review: No Further Action Required

The NotImplementedError for evaluate_stream is appropriate given that no other streaming method implementations exist in our codebase. The grep results confirm that aside from this stub in remote_http_trace_server.py, no alternative implementations are present, so indicating that this functionality is "not yet implemented" is correct.

weave/trace_server/external_to_internal_trace_server_adapter.py (4)

3-3: Added import is appropriate for the new async method.

The import of AsyncIterator from collections.abc properly supports the return type annotation for the new evaluate_stream async method.

386-392: LGTM: New method follows existing class patterns.

The implementation correctly converts project and user IDs from external to internal representations before calling the internal server. It also properly validates the user ID, ensuring it's not None.

394-400: LGTM: Consistent implementation with other methods.

The score_call method follows the same pattern as other methods in this class, handling ID conversion and validation before delegating to the internal server.

402-410: Async method implementation correctly handles direct server call.

Unlike the other methods, this async method directly awaits the internal server call rather than using _ref_apply, which is appropriate for async streaming responses. The validation and ID conversion logic is consistent with other methods.
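
The validate-then-convert flow described in these comments might look roughly like the sketch below. Everything named here (the converter methods, the "int:" prefix, the dict-based requests) is hypothetical; only the sequence of rejecting a missing user ID, converting project and user IDs, and delegating to the internal server is taken from the review.

```python
from typing import Any, Dict


class IdConverterSketch:
    # Hypothetical converter; the real external-to-internal mapping is not shown here.
    def ext_to_int_project_id(self, project_id: str) -> str:
        return "int:" + project_id

    def ext_to_int_user_id(self, user_id: str) -> str:
        return "int:" + user_id


class InternalServerSketch:
    def call_method(self, req: Dict[str, Any]) -> Dict[str, Any]:
        # Echo the request so the converted IDs are visible to the caller.
        return {"received": req}


class ExternalAdapterSketch:
    def __init__(self, internal: InternalServerSketch, converter: IdConverterSketch) -> None:
        self._internal = internal
        self._converter = converter

    def call_method(self, req: Dict[str, Any]) -> Dict[str, Any]:
        user_id = req.get("wb_user_id")
        if user_id is None:
            raise ValueError("wb_user_id cannot be None")
        # Copy the request so the caller's external IDs are left untouched.
        internal_req = dict(req)
        internal_req["project_id"] = self._converter.ext_to_int_project_id(req["project_id"])
        internal_req["wb_user_id"] = self._converter.ext_to_int_user_id(user_id)
        return self._internal.call_method(internal_req)


adapter = ExternalAdapterSketch(InternalServerSketch(), IdConverterSketch())
response = adapter.call_method({"project_id": "entity/project", "wb_user_id": "alice"})
```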
tests/trace/builtin_objects/test_builtin_scorer.py (4)

101-121: Ensure make_remote_scorer handles creation errors.

If the server-side creation fails (e.g., permission or invalid data), the test might crash. Consider verifying the server response to better diagnose issues.
 def make_remote_scorer(client: WeaveClient):
     obj_create_res = client.server.obj_create(
         tsi.ObjCreateReq.model_validate(
             {
                 "obj": {
                     "project_id": client._project_id(),
                     "object_id": "CorrectnessJudge",
                     "val": scorer_args,
                     "builtin_object_class": "LLMJudgeScorer",
                 }
             }
         )
     )
+    # Verify that the object was created successfully
+    if not obj_create_res or not obj_create_res.digest:
+        raise ValueError("Failed to create remote scorer object")
     client._flush()
     obj_ref = weave.ObjectRef(
         entity=client._project_id().split("/")[0],
         project=client._project_id().split("/")[1],
         name="CorrectnessJudge",
         _digest=obj_create_res.digest,
     )
     return obj_ref
7-29: LGTM: Comprehensive test configuration.

Test configuration properly defines the scorer arguments with a clear JSON schema for the expected response format.

87-98: LGTM: Well-structured helper for remote scoring.

The do_remote_score function is cleanly implemented to handle remote scoring, appropriately converting to model-validated request objects.

124-152: LGTM: Comprehensive test coverage for all scenarios.

The test cases provide excellent coverage for all combinations of local/remote creation and usage, ensuring the scorer works consistently across environments.

weave/trace_server/clickhouse_trace_server_batched.py (4)

225-234: Avoid returning the raw database password from model_dump.

Exposing credentials (even just in memory or logs) can pose a security risk. Consider returning masked credentials or removing the password field altogether if not strictly necessary.
-            "password": self._password,
+            "password": "*****",  # or omit entirely
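
One way to keep the credential out of logs without changing the dump itself is to mask a copy. The sketch below is an assumption-laden illustration: the helper name and the host and user fields are hypothetical, and since call_method passes self.model_dump() to RunAsUser, a masked copy like this would presumably suit logging only, not whatever consumes that dump.

```python
from typing import Any, Dict, Iterable


def masked_copy(
    config: Dict[str, Any], secret_keys: Iterable[str] = ("password",)
) -> Dict[str, Any]:
    """Return a copy of config with secret values replaced by a placeholder."""
    secrets = set(secret_keys)
    return {key: ("*****" if key in secrets else value) for key, value in config.items()}


# Hypothetical connection settings; only "password" is named in the review.
config = {"host": "localhost", "user": "default", "password": "hunter2"}
safe_for_logs = masked_copy(config)
```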
589-602: Add fallback handling and error checks for unrecognized builtin object classes.

Currently, if get_builtin(req.obj.builtin_object_class) returns None or raises an exception, the code does not handle it, potentially causing a runtime error.
         if req.obj.builtin_object_class is not None:
             from weave.trace_server.server_side_object_saver import RunAsUser

             object_class_type = get_builtin(req.obj.builtin_object_class)
+            if object_class_type is None:
+                raise ValueError(f"Unknown builtin object class: {req.obj.builtin_object_class}")

             # PROBLEM: If req.obj.val contains refs, this is going to break due to
             # pydantic validation. No clear solution.
             new_obj = object_class_type.model_validate(req.obj.val, strict=False)
1708-1719: Handle method invocation errors for call_method.

The # TODO: handle errors here note highlights a missing error-handling strategy. Wrap runner.run_call_method in a try-except block, returning appropriate error responses instead of relying on unhandled runtime errors.
     def call_method(self, req: tsi.CallMethodReq) -> tsi.CallMethodRes:
         from weave.trace_server.server_side_object_saver import RunAsUser

         if req.wb_user_id is None:
             raise ValueError("User ID is required")

         runner = RunAsUser(ch_server_dump=self.model_dump())
-        # TODO: handle errors here
-        res = runner.run_call_method(
-            req.object_ref, req.project_id, req.wb_user_id, req.method_name, req.args
-        )
-        return tsi.CallMethodRes.model_validate(res)
+        try:
+            res = runner.run_call_method(
+                req.object_ref, req.project_id, req.wb_user_id, req.method_name, req.args
+            )
+            return tsi.CallMethodRes.model_validate(res)
+        except Exception as e:
+            # Log the error and return an appropriate error response
+            logger.error(f"Error running method {req.method_name}: {str(e)}")
+            raise ValueError(f"Error running method {req.method_name}: {str(e)}")
1721-1733: Handle method invocation errors for score_call.

The same error handling issue exists for the score_call method as with call_method. Implement proper error handling for a more robust implementation.
     def score_call(self, req: tsi.ScoreCallReq) -> tsi.ScoreCallRes:
         from weave.trace_server.server_side_object_saver import RunAsUser

         runner = RunAsUser(ch_server_dump=self.model_dump())
-        res = runner.run_score_call(req)
-
-        return tsi.ScoreCallRes(
-            feedback_id=res["feedback_id"],
-            score_call=self.call_read(
-                tsi.CallReadReq(project_id=req.project_id, id=res["scorer_call_id"])
-            ).call,
-        )
+        try:
+            res = runner.run_score_call(req)
+            return tsi.ScoreCallRes(
+                feedback_id=res["feedback_id"],
+                score_call=self.call_read(
+                    tsi.CallReadReq(project_id=req.project_id, id=res["scorer_call_id"])
+                ).call,
+            )
+        except Exception as e:
+            logger.error(f"Error during score_call: {str(e)}")
+            raise ValueError(f"Error during score_call: {str(e)}")
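
A variant of the error handling suggested for call_method and score_call is to re-raise with exception chaining so the original traceback is preserved. This is a sketch only: the exception class and the runner callable are hypothetical, and nothing here is confirmed as the approach taken in the PR.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


class CallMethodErrorSketch(Exception):
    """Hypothetical error type for failures while running a remote method."""


def run_with_context(runner: Callable[..., Any], method_name: str, **args: Any) -> Any:
    try:
        return runner(method_name, **args)
    except Exception as exc:
        logger.error("Error running method %s: %s", method_name, exc)
        # `from exc` keeps the original exception available as __cause__.
        raise CallMethodErrorSketch(
            f"Error running method {method_name}: {exc}"
        ) from exc


def echo_runner(method_name: str, **args: Any) -> dict:
    # Stand-in for the real runner; it only echoes what it was asked to do.
    return {"method": method_name, "args": args}


outcome = run_with_context(echo_runner, "predict", x=1)
```

A dedicated exception type lets callers distinguish a failed method invocation from a malformed request, which a bare ValueError would blur.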
tests/trace/builtin_objects/test_builtin_model.py (2)

7-23: LGTM: Well-structured model configuration.

The model arguments are clearly defined with an appropriate JSON schema for structured output.

30-51: LGTM: Thorough test for model publishing.

The test properly verifies that publishing the model and creating it directly on the server result in the same object.

weave/trace_server/trace_server_interface.py (6)

903-909: Good implementation of callable method request structure

The CallMethodReq model has a clear structure with well-defined fields and proper type hints. The use of Field with description for wb_user_id is a good practice.

911-914: Good implementation of method call response structure

The CallMethodRes model provides the essential information needed after a method call - the call ID for tracing and the actual output.

916-921: Clear scoring request implementation

The ScoreCallReq model is well-structured with appropriate fields for identifying the project, call to score, and scorer to use. The wb_user_id field uses the standard description pattern seen elsewhere in the file.

923-926: Structured score call response

The response correctly includes both the feedback ID for reference and the complete call schema which provides all details about the scoring operation.

928-933: Well-defined evaluation request structure

The EvaluateReq model follows the same pattern as other request models, maintaining consistency in the codebase.

1004-1009: New methods align with interface pattern

The new methods added to the TraceServerInterface protocol maintain consistency with the existing pattern. The async method for evaluation streaming is a good choice for handling potentially long-running operations.

weave/trace_server/server_side_object_saver.py (5)

22-25: Well-structured result type

Using TypedDict for ScoreCallResult is appropriate for defining a static structure that will be serialized/deserialized.
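
As a rough illustration of why a TypedDict fits here: it is a plain dict at runtime, so it serializes without any adapter. The two keys below match the ones read in score_call (feedback_id and scorer_call_id); the str field types and the sample values are assumptions.

```python
import json
from typing import TypedDict


class ScoreCallResult(TypedDict):
    # Field types are assumed; the thread only shows the key names.
    feedback_id: str
    scorer_call_id: str


result: ScoreCallResult = {"feedback_id": "fb-1", "scorer_call_id": "call-1"}

# A TypedDict instance is an ordinary dict, so a JSON round trip needs no custom encoder.
round_tripped = json.loads(json.dumps(result))
```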

27-37: Good separation of exception types

Creating specific exception classes for each operation type improves error handling and debugging.

39-70: Excellent implementation of process isolation

The RunAsUser class provides a robust way to execute functions in isolated memory spaces using multiprocessing. The docstrings are comprehensive and the static method _process_runner is well-implemented.
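
The isolation idea can be shown with a small standalone sketch. It is not the RunAsUser implementation: the helper names are hypothetical, and it assumes a POSIX platform where the "fork" start method is available.

```python
import multiprocessing as mp
from typing import Any, Callable

COUNTER = {"value": 0}


def _child(conn: Any, func: Callable[..., Any], args: tuple) -> None:
    # Runs in the child process; send back either the result or the error text.
    try:
        conn.send(("ok", func(*args)))
    except Exception as exc:
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def run_isolated(func: Callable[..., Any], *args: Any) -> Any:
    """Run func in a separate process and return its result."""
    ctx = mp.get_context("fork")  # assumption: POSIX only
    recv_conn, send_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_child, args=(send_conn, func, args))
    proc.start()
    send_conn.close()  # the parent only reads
    status, payload = recv_conn.recv()
    proc.join()
    if status == "error":
        raise RuntimeError(payload)
    return payload


def bump_and_add(a: int, b: int) -> int:
    # Mutates module state; the change stays inside the child process.
    COUNTER["value"] += 1
    return a + b


total = run_isolated(bump_and_add, 2, 3)
```

The parent's COUNTER is unchanged after the call, which is the property that makes a separate process attractive for running work on behalf of different users.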

71-156: Well-structured object saving functionality

The run_save_object and _save_object methods have:

- Clear separation of concerns
- Good error handling
- Proper cleanup of resources
- Detailed docstrings

320-339: Simple and effective ID conversion

The IdConverter class follows the interface defined in the parent class and has simple implementations for the required methods.

coderabbitai · 2025-03-15T17:35:02Z

weave/trace_server/server_side_object_saver.py

+            # TODO: validate project alignment?
+            int_ref = parse_internal_uri(obj_ref)
+            assert isinstance(int_ref, InternalObjectRef)
+            ref = ObjectRef(
+                entity="_SERVER_",
+                project=int_ref.project_id,
+                name=int_ref.name,
+                _digest=int_ref.version,
+            )
+            obj = client.get(ref)
+            method = getattr(obj, method_name)
+            # TODO: Self might be wrong
+            res, call = method.call(self=obj, **args)
+            autopatch.reset_autopatch()


💡 Verification agent

🧩 Analysis chain

TODOs should be addressed before production release

The code contains TODOs at lines 209 and 220 for validating project alignment and fixing potential self parameter issues. These should be addressed before releasing to production.

Are these TODOs still relevant? Consider either addressing them or creating specific issues to track them.

🏁 Script executed:

#!/bin/bash
# Find all TODOs in the codebase related to this functionality
grep -n "TODO" --include="*.py" --include="*.md" ./weave/trace_server/server_side_object_saver.py

Length of output: 193

Attention: Pending TODO Items in server_side_object_saver.py

The verification confirms that the two TODOs remain in the file:

- Line 209: Validate project alignment?
- Line 220: Potential issues with the self parameter.

Please either address these items or create corresponding issues to track them before the production release.

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (23)

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/EvalStudio.tsx (2)
16-57: Consider extracting inline styles to improve maintainability.

The EvalStudioContent component uses many inline styles that would be better managed with a CSS-in-JS solution or separate stylesheet.

Consider using styled-components or extracting these styles to a separate constant:
+ const styles = {
+   container: { display: 'flex', height: '100%' },
+   sidebar: {
+     width: '300px',
+     borderRight: '1px solid #ccc',
+     overflow: 'auto',
+   },
+   content: { flex: 1, overflow: 'auto' },
+   emptyState: { padding: '1rem' }
+ };

// In the JSX:
- <div style={{display: 'flex', height: '100%'}}>
+ <div style={styles.container}>
59-67: Setting a fixed viewport height can cause layout issues.

Using height: '100vh' can lead to layout problems on mobile browsers where the viewport height changes as the address bar appears/disappears.

Consider using a more flexible approach:
- <div style={{height: '100vh', display: 'flex', flexDirection: 'column'}}>
+ <div style={{height: '100%', minHeight: '100vh', display: 'flex', flexDirection: 'column'}}>
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvaluationResults.tsx (3)
44-66: Error handling could be improved for more specific error messages.

The error message is generic and doesn't provide specific information about what went wrong during evaluation.

Consider providing more specific error handling:
try {
  const result = await runEvaluation(
    selectedEvaluation.id,
    selectedModel.id
  );
  setResults(prev => [...prev, result]);
  setSelectedModel(null);
} catch (error) {
  console.error('Failed to run evaluation:', error);
-  setError('Failed to run evaluation');
+  setError(`Failed to run evaluation: ${error instanceof Error ? error.message : 'Unknown error'}`);
} finally {
  setIsRunning(false);
}
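A small helper can keep that message logic out of the catch block. This is only a sketch: `toErrorMessage` is a hypothetical name that does not exist in the PR, and it assumes the component keeps using a plain string for its error state.

```typescript
// Hypothetical helper (not part of the PR): turn an unknown caught value
// into a user-facing message with a fixed prefix.
function toErrorMessage(prefix: string, error: unknown): string {
  if (error instanceof Error && error.message) {
    return `${prefix}: ${error.message}`;
  }
  if (typeof error === 'string' && error.length > 0) {
    return `${prefix}: ${error}`;
  }
  // Anything else (objects, undefined, empty messages) falls back to a generic text.
  return `${prefix}: Unknown error`;
}

// Possible use inside the catch block:
// setError(toErrorMessage('Failed to run evaluation', error));
```

Keeping the fallback in one place means every catch block in the studio would report errors in the same shape.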
80-110: Consider adding confirmation before running evaluations.

Running an evaluation might be resource-intensive or have costs associated with it, but there's no confirmation step.

Add a confirmation step before running the evaluation:
<button
- onClick={handleRunEvaluation}
+ onClick={() => {
+   if (window.confirm('Are you sure you want to run this evaluation?')) {
+     handleRunEvaluation();
+   }
+ }}
  disabled={isRunning || !selectedModel}>
  {isRunning ? 'Running...' : 'Run Evaluation'}
</button>
111-189: Consider extracting the result item to a separate component.

The result item rendering is complex and would benefit from being extracted to a separate component for better maintainability.

Extract the result item to a separate component:
type ResultItemProps = {
  result: EvaluationResult;
  onSelect: (result: EvaluationResult) => void;
};

const ResultItem: React.FC<ResultItemProps> = ({ result, onSelect }) => {
  return (
    <div
      key={result.id}
      style={{
        padding: '1rem',
        border: '1px solid #ccc',
        borderRadius: '4px',
        cursor: result.status === 'completed' ? 'pointer' : 'default',
        backgroundColor: 'white',
        transition: 'background-color 0.2s',
      }}
      onMouseEnter={e => {
        if (result.status === 'completed') {
          e.currentTarget.style.backgroundColor = '#f5f5f5';
        }
      }}
      onMouseLeave={e => {
        if (result.status === 'completed') {
          e.currentTarget.style.backgroundColor = 'white';
        }
      }}
      onClick={() => {
        if (result.status === 'completed') {
          onSelect(result);
        }
      }}>
      {/* Item content */}
    </div>
  );
};

// Then in the main component:
{results.map(result => (
  <ResultItem 
    key={result.id} 
    result={result} 
    onSelect={setSelectedResult} 
  />
))}
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvaluationsList.tsx (2)
12-25: Error handling should include user-facing error feedback.

While console.error is good for debugging, the component doesn't show any error message to the user when evaluations can't be loaded.
  useEffect(() => {
    const loadEvaluations = async () => {
      try {
        const data = await fetchEvaluations();
        setEvaluations(data);
      } catch (error) {
        console.error('Failed to fetch evaluations:', error);
+       setError('Failed to load evaluations. Please try again later.');
      } finally {
        setLoading(false);
      }
    };

    loadEvaluations();
  }, []);
You'll also need to add an error state:
const [error, setError] = useState<string | null>(null);
And render it when present:
{error && <div style={{ color: 'red', marginBottom: '1rem' }}>{error}</div>}
31-74: Consider extracting inline styles to a dedicated CSS/SCSS file.

Inline styles make the component harder to maintain as it grows. Consider moving these styles to a dedicated CSS module or styled-components.
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/CreateEvaluation.tsx (2)
45-67: Add loading state during form submission.

The component doesn't show a loading indicator during evaluation creation, which might confuse users if the operation takes time. Consider adding a submitting state and disabling the form during submission.
export const CreateEvaluation: React.FC = () => {
  const [datasets, setDatasets] = useState<Dataset[]>([]);
  const [scorers, setScorers] = useState<Scorer[]>([]);
  const [loading, setLoading] = useState(true);
+ const [submitting, setSubmitting] = useState(false);
  const [error, setError] = useState<string | null>(null);

  // ...existing code...

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    if (
      !selectedDataset ||
      selectedScorers.length === 0 ||
      !evaluationName.trim()
    ) {
      setError('Please fill in all required fields');
      return;
    }

+   setSubmitting(true);
    try {
      await createEvaluation(
        evaluationName,
        selectedDataset.id,
        selectedScorers.map(s => s.id)
      );
      setIsCreatingNewEval(false);
    } catch (error) {
      console.error('Failed to create evaluation:', error);
      setError('Failed to create evaluation');
+   } finally {
+     setSubmitting(false);
    }
  };
And then disable the submit button when submitting:
- <button type="submit">Create Evaluation</button>
+ <button type="submit" disabled={submitting}>
+   {submitting ? 'Creating...' : 'Create Evaluation'}
+ </button>
142-150: Consider extracting the scorer selection logic to a separate function.

The inline onClick handler is somewhat complex. Extracting it to a named function would improve readability.
+ const toggleScorerSelection = (scorer: Scorer) => {
+   if (selectedScorers.some(s => s.id === scorer.id)) {
+     setSelectedScorers(selectedScorers.filter(s => s.id !== scorer.id));
+   } else {
+     setSelectedScorers([...selectedScorers, scorer]);
+   }
+ };

// Then in the JSX:
- onClick={() => {
-   if (selectedScorers.some(s => s.id === scorer.id)) {
-     setSelectedScorers(
-       selectedScorers.filter(s => s.id !== scorer.id)
-     );
-   } else {
-     setSelectedScorers([...selectedScorers, scorer]);
-   }
- }}>
+ onClick={() => toggleScorerSelection(scorer)}>
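If the toggle logic is needed in more than one place, it could also live outside the component as a pure function. The sketch below assumes only that selectable items carry a string `id`; `toggleById` is a hypothetical name.

```typescript
// Hypothetical helper: returns a new array with `item` removed when an entry
// with the same id is already selected, or appended when it is not.
// The input array is never mutated.
function toggleById<T extends {id: string}>(
  selected: readonly T[],
  item: T
): T[] {
  return selected.some(s => s.id === item.id)
    ? selected.filter(s => s.id !== item.id)
    : [...selected, item];
}

// Possible use in the component:
// onClick={() => setSelectedScorers(toggleById(selectedScorers, scorer))}
```

A pure function like this is easy to unit test without rendering the form.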
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/DetailedResults.tsx (2)
105-109: Use useMemo for finding samples to avoid unnecessary calculations during render.

Finding a sample for each prediction inside the render function can be inefficient, especially for large datasets. Consider using useMemo to optimize this operation.
+ const sampleMap = React.useMemo(() => {
+   if (!selectedEvaluation?.dataset?.samples) return {};
+   return selectedEvaluation.dataset.samples.reduce((acc, sample) => {
+     acc[sample.id] = sample;
+     return acc;
+   }, {} as Record<string, typeof selectedEvaluation.dataset.samples[0]>);
+ }, [selectedEvaluation?.dataset?.samples]);

// Then in the render function:
- const sample = selectedEvaluation.dataset.samples.find(
-   s => s.id === prediction.sampleId
- );
+ const sample = sampleMap[prediction.sampleId];
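The indexing step itself can be written as a standalone function and then wrapped in `useMemo`. A minimal sketch, assuming samples have a string `id` as in the PR's `DatasetSample` type; `indexById` is a hypothetical name.

```typescript
// Hypothetical helper: build an id -> item lookup in a single pass so each
// per-row lookup is a property access instead of a linear scan.
function indexById<T extends {id: string}>(
  items: readonly T[]
): Record<string, T> {
  const byId: Record<string, T> = {};
  for (const item of items) {
    byId[item.id] = item;
  }
  return byId;
}

// Possible use in the component:
// const sampleMap = React.useMemo(
//   () => indexById(selectedEvaluation?.dataset?.samples ?? []),
//   [selectedEvaluation?.dataset?.samples]
// );
```

With n predictions and m samples this replaces roughly n × m comparisons per render with one pass over the samples plus n lookups.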
129-129: Add type guard for score values.

The current code assumes scores are numbers when they exist, but doesn't validate this assumption. Adding a type check would make the code more robust.
- {prediction.scores[scorer.id]?.toFixed(4) || 'N/A'}
+ {typeof prediction.scores[scorer.id] === 'number' 
+   ? prediction.scores[scorer.id].toFixed(4) 
+   : 'N/A'}
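The same check can be pulled into a formatter so the JSX stays short. This is a sketch under the assumption that scores arrive as `unknown` at runtime; `formatScore` is a hypothetical name.

```typescript
// Hypothetical helper: format a score for display, falling back to 'N/A'
// for anything that is not a finite number (undefined, null, strings, NaN).
function formatScore(value: unknown, digits: number = 4): string {
  return typeof value === 'number' && isFinite(value)
    ? value.toFixed(digits)
    : 'N/A';
}

// Possible use in the table cell:
// {formatScore(prediction.scores[scorer.id])}
```

Unlike the `?.toFixed(4) || 'N/A'` form, this also covers `NaN`, which would otherwise render as the string "NaN".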
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/api.ts (6)
138-145: Add validation for non-existent evaluation IDs

The fetchEvaluationResults function filters results but doesn't handle the case where the evaluation ID doesn't exist. If no results match the filter, it returns an empty array without any error or notification.
export const fetchEvaluationResults = async (
  evaluationId: string
): Promise<EvaluationResult[]> => {
  await new Promise(resolve => setTimeout(resolve, 500));
+ const evaluation = MOCK_EVALUATIONS.find(e => e.id === evaluationId);
+ 
+ if (!evaluation) {
+   throw new Error(`Evaluation with ID ${evaluationId} not found`);
+ }
+
  return MOCK_EVALUATION_RESULTS.filter(
    r => r.evaluationDefinition.id === evaluationId
  );
};
156-158: Improve error message specificity

The current error message doesn't clearly indicate which specific input was invalid. More detailed error messages would help with debugging and provide better feedback.
  if (!dataset || scorers.length === 0) {
-   throw new Error('Invalid dataset or scorers');
+   if (!dataset) {
+     throw new Error(`Dataset with ID "${datasetId}" not found`);
+   } else {
+     throw new Error(`No valid scorers found for IDs: ${scorerIds.join(', ')}`);
+   }
  }
178-180: Improve error message specificity

Similar to the previous comment, this error message could be more specific about which input (evaluation or model) was invalid.
  if (!evaluation || !model) {
-   throw new Error('Invalid evaluation or model');
+   if (!evaluation) {
+     throw new Error(`Evaluation with ID "${evaluationId}" not found`);
+   } else {
+     throw new Error(`Model with ID "${modelId}" not found`);
+   }
  }
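Both "not found" checks above follow the same pattern, so a shared lookup helper might be worth considering. A minimal sketch, assuming the mock collections are arrays of objects with a string `id`; `requireById` is a hypothetical name.

```typescript
// Hypothetical helper: return the item with the given id, or throw an error
// that names the kind of entity and the id that was requested.
function requireById<T extends {id: string}>(
  items: readonly T[],
  id: string,
  kind: string
): T {
  for (const item of items) {
    if (item.id === id) {
      return item;
    }
  }
  throw new Error(`${kind} with ID "${id}" not found`);
}

// Possible use in the mock API:
// const evaluation = requireById(MOCK_EVALUATIONS, evaluationId, 'Evaluation');
// const model = requireById(MOCK_MODELS, modelId, 'Model'); // MOCK_MODELS is assumed
```

Because each lookup throws on its own, the combined `if (!evaluation || !model)` branch is no longer needed.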
182-193: Consider adding error case simulation

The current implementation always returns a successful result. For more realistic testing, consider adding logic to occasionally simulate failures or partial failures.
+ // Simulate occasional failures (e.g., 10% chance)
+ if (Math.random() < 0.1) {
+   return {
+     id: `result-${Date.now()}`,
+     evaluationDefinition: evaluation,
+     model,
+     metrics: {},
+     status: 'failed',
+     createdAt: new Date().toISOString(),
+   };
+ }

  return {
    id: `result-${Date.now()}`,
    evaluationDefinition: evaluation,
    model,
    metrics: {
      accuracy: Math.random() * 0.2 + 0.8, // Random accuracy between 0.8 and 1.0
      f1_score: Math.random() * 0.2 + 0.8,
    },
    status: 'completed',
    createdAt: new Date().toISOString(),
  };
10-32: Consider marking mock data as readonly

The mock data constants already carry type annotations; marking them readonly would additionally guard the shared fixtures against accidental mutation.
- const MOCK_DATASETS: Dataset[] = [
+ const MOCK_DATASETS: readonly Dataset[] = [
  {
    id: 'dataset-1',
    name: 'MNIST Test Set',
    createdAt: '2024-03-20T10:00:00Z',
    samples: [
      {id: 'sample-1', input: 'Image of digit 7'},
      {id: 'sample-2', input: 'Image of digit 3'},
      {id: 'sample-3', input: 'Image of digit 5'},
    ],
  },
  // ...
];
Applying similar changes to other mock data constants would provide additional type safety.

117-136: Consider adding error handling for network failures

The mock API functions simulate network delay but don't simulate network failures. Adding error simulation would make the mock more realistic and help test error handling in components.

For example, you could add a utility function:
const simulateNetworkRequest = async <T>(data: T, failureRate = 0.05): Promise<T> => {
  await new Promise(resolve => setTimeout(resolve, 500 + Math.random() * 500));
  
  // Simulate occasional network failures
  if (Math.random() < failureRate) {
    throw new Error('Network request failed');
  }
  
  return data;
};
Then use it in your API functions:
export const fetchEvaluations = async (): Promise<EvaluationDefinition[]> => {
-  await new Promise(resolve => setTimeout(resolve, 500)); // Simulate network delay
-  return MOCK_EVALUATIONS;
+  return simulateNetworkRequest(MOCK_EVALUATIONS);
};
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/types.ts (6)
1-4: Consider adding more specific input type

The input field is currently typed as string with a comment suggesting it could be more specific. Depending on your actual use case, consider creating a more specific type that matches your domain model.
export interface DatasetSample {
  id: string;
-  input: string; // This could be more specific based on your needs
+  input: {
+    text?: string;
+    imageUrl?: string;
+    // Add other potential input types
+  };
}
35-39: Add typing for model prediction based on expected output format

Similar to the input field, modelPrediction is currently typed as a string but could be more specific based on your domain model.
export interface PredictionResult {
  sampleId: string;
-  modelPrediction: string; // This could be more specific based on your needs
+  modelPrediction: {
+    label?: string;
+    confidence?: number;
+    rawOutput?: unknown;
+  };
  scores: Record<string, number>; // Keyed by scorer ID
}
46-53: Consider adding error information to EvaluationResult

When an evaluation fails (status: 'failed'), there's no way to understand why it failed. Consider adding an error field to provide more context.
export interface EvaluationResult {
  id: string;
  evaluationDefinition: EvaluationDefinition;
  model: Model;
  metrics: Record<string, number>; // Placeholder for summary metrics
  status: 'running' | 'completed' | 'failed';
  createdAt: string;
+  error?: {
+    message: string;
+    code?: string;
+    details?: unknown;
+  };
}
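An alternative to an optional `error` field is a discriminated union keyed on `status`, so the compiler only allows `error` on failed results. The sketch below uses standalone, hypothetical types (`EvaluationOutcome`, `EvaluationError`) rather than the PR's `EvaluationResult`.

```typescript
// Hypothetical types: each status carries only the fields that make sense for it.
type EvaluationError = {message: string; code?: string};

type EvaluationOutcome =
  | {status: 'running'}
  | {status: 'completed'; metrics: Record<string, number>}
  | {status: 'failed'; error: EvaluationError};

// Narrowing on `status` gives typed access to `metrics` or `error`.
function describeOutcome(outcome: EvaluationOutcome): string {
  switch (outcome.status) {
    case 'running':
      return 'Evaluation is still running';
    case 'completed':
      return `Completed with ${Object.keys(outcome.metrics).length} metric(s)`;
    case 'failed':
      return outcome.error.code
        ? `Failed (${outcome.error.code}): ${outcome.error.message}`
        : `Failed: ${outcome.error.message}`;
  }
}
```

The trade-off is a slightly more verbose type in exchange for not being able to construct a "completed" result that also carries an error.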
55-65: Consider adding loading states to context

The context state tracks various selection states but doesn't track loading states for async operations. This could lead to a poor user experience if operations take time.
export interface EvalStudioContextState {
  selectedEvaluation: EvaluationDefinition | null;
  selectedDataset: Dataset | null;
  selectedScorers: Scorer[];
  evaluationName: string;
  isCreatingNewEval: boolean;
  isCreatingNewDataset: boolean;
  isCreatingNewScorer: boolean;
  selectedResult: EvaluationResult | null;
+  isLoading: {
+    evaluations?: boolean;
+    datasets?: boolean;
+    scorers?: boolean;
+    results?: boolean;
+    creation?: boolean;
+  };
+  error: {
+    message?: string;
+    source?: string;
+  } | null;
}
67-76: Add methods for error handling in context

To complement the suggested error state in the previous comment, add methods to handle error state management.
export interface EvalStudioContextValue extends EvalStudioContextState {
  setSelectedEvaluation: (evaluation: EvaluationDefinition | null) => void;
  setSelectedDataset: (dataset: Dataset | null) => void;
  setSelectedScorers: (scorers: Scorer[]) => void;
  setEvaluationName: (name: string) => void;
  setIsCreatingNewEval: (isCreating: boolean) => void;
  setIsCreatingNewDataset: (isCreating: boolean) => void;
  setIsCreatingNewScorer: (isCreating: boolean) => void;
  setSelectedResult: (result: EvaluationResult | null) => void;
+  setLoading: (key: keyof EvalStudioContextState['isLoading'], isLoading: boolean) => void;
+  setError: (error: EvalStudioContextState['error']) => void;
+  clearError: () => void;
}
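The loading and error fields suggested in the two comments above could be updated with small immutable helpers. This is a sketch only: `AsyncUiState`, `withLoading`, `withError`, and `isAnyLoading` are hypothetical names, and the state shape is a standalone copy of the suggested fields rather than the PR's context type.

```typescript
// Hypothetical, standalone version of the suggested loading/error fields.
type LoadingKey = 'evaluations' | 'datasets' | 'scorers' | 'results' | 'creation';

interface AsyncUiState {
  isLoading: Partial<Record<LoadingKey, boolean>>;
  error: {message?: string; source?: string} | null;
}

const initialAsyncUiState: AsyncUiState = {isLoading: {}, error: null};

// Return a new state with one loading flag changed; the input is not mutated.
function withLoading(
  state: AsyncUiState,
  key: LoadingKey,
  isLoading: boolean
): AsyncUiState {
  return {...state, isLoading: {...state.isLoading, [key]: isLoading}};
}

// Return a new state with the error replaced (pass null to clear it).
function withError(
  state: AsyncUiState,
  error: AsyncUiState['error']
): AsyncUiState {
  return {...state, error};
}

// True when at least one tracked operation is in flight.
function isAnyLoading(state: AsyncUiState): boolean {
  return Object.keys(state.isLoading).some(
    key => state.isLoading[key as LoadingKey] === true
  );
}
```

In the provider, `setLoading`, `setError`, and `clearError` could then be thin wrappers that pass these helpers to the state setter.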
50-50: Add typing for metrics instead of using generic Record

The metrics field uses a generic Record<string, number> which doesn't provide any information about the expected metrics. Consider creating a more specific type.
- metrics: Record<string, number>; // Placeholder for summary metrics
+ metrics: {
+   accuracy?: number;
+   f1_score?: number;
+   precision?: number;
+   recall?: number;
+   [key: string]: number | undefined; // Allow for custom metrics
+ };

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ce707ac and eed3f00.

📒 Files selected for processing (12)

tests/conftest.py (2 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/EvalStudio.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/api.ts (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/CreateDataset.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/CreateEvaluation.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/CreateScorer.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/DetailedResults.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvaluationResults.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvaluationsList.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/context.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/types.ts (1 hunks)
weave/builtin_objects/models/CompletionModel.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

weave/builtin_objects/models/CompletionModel.py
tests/conftest.py

🧰 Additional context used

📓 Path-based instructions (1)

`**/*.{js,jsx,ts,tsx}`: Focus on architectural and logical i...

**/*.{js,jsx,ts,tsx}: Focus on architectural and logical issues rather than style (assuming ESLint is in place).
Flag potential memory leaks and performance bottlenecks.
Check for proper error handling and async/await usage.
Avoid strict enforcement of try/catch blocks - accept Promise chains, early returns, and other clear error handling patterns. These are acceptable as long as they maintain clarity and predictability.
Ensure proper type usage in TypeScript files.
Look for security vulnerabilities in data handling.
Don't comment on formatting if prettier is configured.
Verify proper React hooks usage and component lifecycle.
Check for proper state management patterns.

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/CreateDataset.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvaluationResults.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/CreateEvaluation.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/context.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvaluationsList.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/EvalStudio.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/DetailedResults.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/api.ts
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/CreateScorer.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/types.ts

⏰ Context from checks skipped due to timeout of 90000ms (42)

GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 10, huggingface)

🔇 Additional comments (3)

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/CreateScorer.tsx (1)

1-15:
❓ Verification inconclusive

Placeholder component structure looks good, but lacks implementation.

The component correctly uses the useEvalStudio hook to access context and provides a basic UI structure with a close button. However, this is just a placeholder with no actual scorer creation functionality implemented yet.

Since this is part of a new feature, verify that there's a plan for implementing the actual scorer creation UI:

🌐 Web query:
What are common design patterns for form creation in React applications?
💡 Result:

Creating forms in React applications involves several design patterns that enhance maintainability, scalability, and user experience. Below are the most common patterns and their use cases:

1. 3-Layer Architecture (Apollo/Logic/View)

A modular approach for complex forms, separating concerns into distinct components:

Apollo/Network Layer: Handles API interactions (data fetching/submission) and transforms data between API and form schemas[1][13].

Logic Layer: Manages form state, validation (e.g., Zod or Yup), and submission logic via libraries like react-hook-form[2][13].

View Layer: Stateless UI rendering with optional local state for visual interactions (e.g., collapsible sections)[1][13].

Example Flow:
// Apollo Layer (CreateUserApollo.tsx)
const fetchData = () => api.getUser().then(transformForForm);
const handleSubmit = (data) => api.postUser(transformForAPI(data));

// Logic Layer (CreateUserLogic.tsx)
const form = useForm({ defaultValues, resolver: zodResolver(schema) });
const onSubmit = (data) => handleSubmit(data).then(resetForm);

// View Layer (CreateUserView.tsx)
<Form {...form}>
  <FormField name="email" render={({ field }) => <Input {...field} />} />
</Form>
2. Controlled vs. Uncontrolled Components
Controlled: Form state managed via React (ideal for validation/real-time updates)[5][6][13]:
const [value, setValue] = useState('');
<input value={value} onChange={(e) => setValue(e.target.value)} />
Uncontrolled: DOM manages state (better performance for large forms)[5][13]:
const handleSubmit = (e) => {
  e.preventDefault();
  const data = new FormData(e.target);
};
3. Form Library Integration

Libraries like React Hook Form + Zod streamline validation and state management[3][14]:
const formSchema = z.object({ username: z.string().min(2) });
const form = useForm({ resolver: zodResolver(formSchema) });

<FormField
  control={form.control}
  name="username"
  render={({ field }) => <Input {...field} />}
/>
4. Compound Components

Share state across form elements using context (e.g., custom Form components):
<Form>
  <Form.Field name="email">
    <Form.Label>Email</Form.Label>
    <Form.Input />
  </Form.Field>
</Form>
This pattern is used in libraries like Shadcn UI and Radix[2][7][9].

5. Container-Presentational Pattern

Separates business logic (Container) from UI (Presentational Component)[4][10]:
// Container
const UserContainer = () => {
  const [user, setUser] = useState(null);
  return <UserProfile user={user} />;
};

// Presentational Component
const UserProfile = ({ user }) => <div>{user?.name}</div>;
6. Provider-Consumer Pattern

Uses React Context for global state (e.g., themes/auth)[4][7][8]:
const FormContext = createContext();
const FormProvider = ({ children }) => {
  const [formState, setFormState] = useState();
  return (
    <FormContext.Provider value={{ formState, setFormState }}>
      {children}
    </FormContext.Provider>
  );
};
7. Higher-Order Components (HOCs)

Reuse logic across forms (e.g., auth checks)[4][7][10]:
const withFormLogging = (Component) => (props) => {
  useEffect(() => console.log('Form mounted'), []);
  return <Component {...props} />;
};
Comparison Table

| Pattern | Use Case | Complexity | Key Benefit |
| --- | --- | --- | --- |
| 3-Layer Architecture | Large-scale forms | High | Clear separation of concerns |
| Controlled Components | Real-time validation | Medium | Granular state control |
| React Hook Form | Rapid development | Low | Built-in validation/optimization |
| Compound Components | Reusable form systems | Medium | Shared state management |

These patterns address different needs, from simple forms (uncontrolled components) to enterprise applications (3-layer architecture). For modern projects, combining React Hook Form with Zod validation and Compound Components offers a balanced approach for maintainability and scalability[2][3][13].

Citations:

1: https://spencerpauly.com/tech/best-form-design-pattern-for-react/

2: https://ui.shadcn.com/docs/forms/react-hook-form

3: https://www.sayonetech.com/blog/react-design-patterns/

4: https://dev.to/ajones_codes/a-better-guide-to-forms-in-react-47f0

5: https://www.telerik.com/blogs/react-basics-react-forms-examples

6: https://www.syncfusion.com/blogs/post/react-design-patterns

7: https://refine.dev/blog/react-design-patterns/

8: https://www.w3schools.com/react/react_forms.asp

9: https://www.uxpin.com/studio/blog/react-design-patterns/

10: https://react.dev/reference/react-dom/components/common

11: https://demo.mobiscroll.com/react/forms

12: https://legacy.reactjs.org/docs/forms.html

13: https://dev.to/spencerpauly/the-1-best-design-pattern-for-managing-forms-in-react-4215

14: https://react.dev/reference/react-dom/components/form

15: https://www.youtube.com/watch?v=Lbj3vmp8spI

16: https://github.com/brillout/awesome-react-components
Placeholder Scorer Creation – Plan for Full Implementation Needed

The CreateScorer component is a good structural start and correctly uses the useEvalStudio context to manage UI state. However, as a placeholder it lacks the actual functionality for creating a scorer.

For the upcoming implementation, please ensure there’s a clear roadmap addressing:

Form State & Validation: Consider using a well-established pattern like React Hook Form combined with a validation library (e.g., Zod) for streamlined state management and validation.

Component Architecture: Evaluate the benefits of a Compound Components or 3-Layer Architecture pattern to separate concerns (data fetching, business logic, and presentation) effectively.

UI/UX Consistency: Align the form creation strategy with existing standards within the application to maintain consistency and scalability.

Please confirm that these design directions are documented and planned as part of this new feature’s roadmap.
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/CreateDataset.tsx (1)

1-15: Placeholder component structure looks good, but lacks implementation.

The component correctly uses the useEvalStudio hook to access context and provides a basic UI structure with a close button. However, this is just a placeholder with no actual dataset creation functionality implemented yet.

Consider adding a more descriptive comment or TODO to indicate the expected implementation timeline:
-      <p>Dataset creation UI will be implemented here.</p>
+      <p>Dataset creation UI will be implemented here. </p>
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/context.tsx (1)

1-58: Solid implementation of a React context provider for the Eval Studio.

The context implementation follows good React patterns with proper TypeScript typing, immutable state updates using the spread operator, and a custom hook with appropriate error handling.

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/EvalStudio.tsx

...omponents/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvaluationResults.tsx

coderabbitai · 2025-03-15T18:44:11Z

.../components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/DetailedResults.tsx

+                  {selectedEvaluation.scorers.map(scorer => (
+                    <td
+                      key={scorer.id}
+                      style={{
+                        padding: '0.5rem',
+                        borderBottom: '1px solid #eee',
+                        color:
+                          prediction.scores[scorer.id] >= 0.5 ? 'green' : 'red',
+                      }}>
+                      {prediction.scores[scorer.id]?.toFixed(4) || 'N/A'}
+                    </td>
+                  ))}


🛠️ Refactor suggestion

Parameterize the score threshold for color-coding.

The component hard-codes a threshold of 0.5 for determining if a score is "good" (green) or "bad" (red). This might not be appropriate for all scorers which could have different scales or meanings for their scores.

Consider one of these approaches:

Get threshold values from each scorer's configuration

Allow for customizable thresholds based on scorer type

Use a more gradual color scale

Example implementation:

+ // Function to determine color based on score and scorer
+ const getScoreColor = (score: number, scorerId: string) => {
+   const scorer = selectedEvaluation.scorers.find(s => s.id === scorerId);
+   // Get threshold from scorer config or use default
+   const threshold = scorer?.config?.threshold || 0.5;
+   return score >= threshold ? 'green' : 'red';
+ };

// Then in the JSX:
- color: prediction.scores[scorer.id] >= 0.5 ? 'green' : 'red',
+ color: getScoreColor(prediction.scores[scorer.id] || 0, scorer.id),
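A variant worth considering treats a missing score as neutral rather than as a failing zero, and uses `??` so that a configured threshold of 0 is respected. This is a sketch only: `ScorerLike` and its optional `config.threshold` field are assumptions, since the PR's `Scorer` type may not expose a threshold.

```typescript
// Assumed shape: the PR's Scorer type may not have a `config.threshold` field.
interface ScorerLike {
  id: string;
  config?: {threshold?: number};
}

const DEFAULT_THRESHOLD = 0.5;

// Missing or non-numeric scores render gray instead of being coerced to 0.
function getScoreColor(
  score: number | undefined,
  scorer: ScorerLike
): 'green' | 'red' | 'gray' {
  if (typeof score !== 'number' || isNaN(score)) {
    return 'gray';
  }
  // `??` keeps an explicit threshold of 0, which `||` would replace with the default.
  const threshold = scorer.config?.threshold ?? DEFAULT_THRESHOLD;
  return score >= threshold ? 'green' : 'red';
}

// Possible use in the cell style:
// color: getScoreColor(prediction.scores[scorer.id], scorer),
```

Passing the scorer object directly also avoids a lookup over `selectedEvaluation.scorers` for every cell.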

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (2)

weave/trace_server/clickhouse_trace_server_batched.py (2)
1712-1724: ⚠️ Potential issue

Handle method invocation errors for call_method and score_call.

The # TODO: handle errors here note highlights a missing error-handling strategy. Wrap runner.run_call_method and runner.run_score_call in a try-except block, returning appropriate error responses or raising well-defined exceptions instead of relying on unhandled runtime errors.

225-234: ⚠️ Potential issue

Avoid returning the raw database password from model_dump

Exposing credentials (even just in memory or logs) can pose a security risk. Consider returning masked credentials or removing the password field altogether if not strictly necessary.
-            "password": self._password,
+            "password": "*****"  # or omit entirely

🧹 Nitpick comments (10)

PR.md (1)

3-3: Improve clarity in documentation text.

To make your writing more confident, consider removing the phrase “sort of” (the suggestion below also corrects the “halphazardly” typo):
- * [ ] Resolve the Scorer and Model input args (these were sort of halphazardly handled and not consistent.)
+ * [ ] Resolve the Scorer and Model input args (these were haphazardly handled and not consistent.)
🧰 Tools

🪛 LanguageTool

[style] ~3-~3: To make your writing sound more confident, consider removing the phrase ‘sort of’.
Context: ...Scorer and Model input args (these were sort of halphazardly handled and not consistent.)

(SORT_OF_NP)

weave/builtin_objects/models/CompletionModel.py (1)

11-11: Follow up on TODO.

There's a TODO to add prior messages input. Please let me know if you'd like help implementing this or if I should open a new issue to track it.

weave/trace_server/server_side_object_saver.py (8)

101-101: Unnecessary comment

The comment "# Pass result_queue here" is redundant as it's already clear from the code that the result_queue is being passed as an argument.

-            ),  # Pass result_queue here
+            ),

327-329: Inefficient fixed sleep time in async loop

The fixed sleep time of 1.0 second can cause inefficiency, potentially making the operation unnecessarily slow or unresponsive. Consider using a more efficient approach like asyncio.Queue or implementing a backoff mechanism.

-            while True:
-                await asyncio.sleep(1.0)
-                status, result = result_queue.get()
+            # Create a queue adapter for multiprocessing.Queue
+            class QueueAdapter:
+                def __init__(self, queue):
+                    self.queue = queue
+                    self._loop = asyncio.get_running_loop()
+                
+                async def get(self):
+                    return await self._loop.run_in_executor(None, self.queue.get)
+            
+            queue_adapter = QueueAdapter(result_queue)
+            while True:
+                status, result = await queue_adapter.get()

402-402: Remove debugging print statement

The print(eval_results) statement appears to be for debugging purposes and should be removed or replaced with proper logging.

-            print(eval_results)  # add eval_results to summary
-            result_queue.put(("success", tsi.EvaluateSummaryRes().model_dump()))
+            # Include eval_results in summary response
+            result_queue.put(
+                ("success", tsi.EvaluateSummaryRes(results=eval_results).model_dump())
+            )

48-48: Add validation for ch_server_dump

The constructor accepts a dictionary without validating its contents, which could lead to runtime errors if required keys are missing.

     def __init__(self, ch_server_dump: dict[str, Any]):
+        required_keys = ["host", "port", "database", "user", "password"]  # Adjust based on actual requirements
+        missing_keys = [key for key in required_keys if key not in ch_server_dump]
+        if missing_keys:
+            raise ValueError(f"Missing required keys in ch_server_dump: {', '.join(missing_keys)}")
         self.ch_server_dump = ch_server_dump

71-77: Update docstring to include all parameters

The docstring is missing documentation for the object_name parameter.

     def run_save_object(
         self,
         new_obj: Any,
         project_id: str,
         object_name: str | None,
         user_id: str | None,
     ) -> str:
         """Run the save_object operation in a separate process.
 
         Args:
             new_obj: The object to save
             project_id: The project identifier
+            object_name: The name to assign to the saved object
             user_id: The user identifier
 
         Returns:
             str: The digest of the saved object
 
         Raises:
             Exception: If the save operation fails in the child process
         """

313-347: Add resource cleanup for processes

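The run_evaluate_stream method starts a child process but only calls process.join() on the success path. If process_results raises (for example on an "error" status) or the caller stops iterating early, the process is never joined, which can leave orphaned or zombie processes.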

Consider implementing a try-finally block to ensure the process is terminated properly even if an exception occurs:

     async def run_evaluate_stream(
         self,
         req: tsi.EvaluateReq,
     ) -> AsyncIterator[tsi.EvaluateStepRes]:
         result_queue: multiprocessing.Queue[tuple[str, dict]] = multiprocessing.Queue()
 
         process = multiprocessing.Process(
             target=self._evaluate_stream,
             args=(req, result_queue),
         )
 
         process.start()
+        try:
-        async def process_results() -> AsyncIterator[tsi.EvaluateStepRes]:
-            while True:
-                await asyncio.sleep(1.0)
-                status, result = result_queue.get()
-                if status == "error":
-                    raise ValueError(f"Process execution failed: {result}")
-                if status == "success":
-                    if result["step_type"] == "start":
-                        yield tsi.EvaluateStartRes.model_validate(result)
-                    elif result["step_type"] == "predict_and_score":
-                        yield tsi.EvaluatePredictAndScoreRes.model_validate(result)
-                    elif result["step_type"] == "summary":
-                        yield tsi.EvaluateSummaryRes.model_validate(result)
-                        return
-                    else:
-                        raise ValueError(f"Unexpected result: {result}")
-                else:
-                    raise ValueError(f"Unexpected result: {result}")
-
-        async for res in process_results():
-            yield res
-        process.join()
-        return
+            async def process_results() -> AsyncIterator[tsi.EvaluateStepRes]:
+                while True:
+                    await asyncio.sleep(1.0)
+                    status, result = result_queue.get()
+                    if status == "error":
+                        raise ValueError(f"Process execution failed: {result}")
+                    if status == "success":
+                        if result["step_type"] == "start":
+                            yield tsi.EvaluateStartRes.model_validate(result)
+                        elif result["step_type"] == "predict_and_score":
+                            yield tsi.EvaluatePredictAndScoreRes.model_validate(result)
+                        elif result["step_type"] == "summary":
+                            yield tsi.EvaluateSummaryRes.model_validate(result)
+                            return
+                        else:
+                            raise ValueError(f"Unexpected result: {result}")
+                    else:
+                        raise ValueError(f"Unexpected result: {result}")
+
+            async for res in process_results():
+                yield res
+        finally:
+            # Ensure process is properly terminated
+            if process.is_alive():
+                process.terminate()
+            process.join()

80-90: Update exception type in docstring

The docstring indicates that a generic Exception is raised, but the method specifically raises RunSaveObjectException. The docstring should be updated to match the actual implementation.

         Raises:
-            Exception: If the save operation fails in the child process
+            RunSaveObjectException: If the save operation fails in the child process

248-251: Improve validation of result type

The current implementation returns the result if it is a dict and raises for any other type, but it never checks the dict's contents. Validating the expected keys and naming them in the error message would make failures easier to diagnose.

-        if isinstance(result, dict):
+        if isinstance(result, dict) and "feedback_id" in result and "scorer_call_id" in result:
             return result
         else:
-            raise RunScoreCallException(f"Unexpected result: {result}")
+            raise RunScoreCallException(f"Expected ScoreCallResult dict with feedback_id and scorer_call_id, got: {result}")

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eed3f00 and 7b593dc.

📒 Files selected for processing (15)

PR.md (1 hunks)
tests/conftest.py (2 hunks)
tests/trace/test_on_demand_evals_2.py (1 hunks)
weave/builtin_objects/builtin_registry.py (1 hunks)
weave/builtin_objects/models/CompletionModel.py (1 hunks)
weave/builtin_objects/scorers/LLMJudgeScorer.py (1 hunks)
weave/flow/casting.py (2 hunks)
weave/flow/eval.py (3 hunks)
weave/flow/model.py (1 hunks)
weave/trace/weave_client.py (6 hunks)
weave/trace_server/clickhouse_trace_server_batched.py (4 hunks)
weave/trace_server/external_to_internal_trace_server_adapter.py (3 hunks)
weave/trace_server/server_side_object_saver.py (1 hunks)
weave/trace_server/trace_server_interface.py (3 hunks)
weave/trace_server_bindings/caching_middleware_trace_server.py (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

weave/builtin_objects/builtin_registry.py
weave/builtin_objects/scorers/LLMJudgeScorer.py
tests/conftest.py

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py: Focus on pythonic code patterns.
Check for proper exception handling.
Verify type hints usage where applicable.
Look for potential performance improvements.
Don't comment on formatting if black/isort is configured.
Check for proper dependency injection patterns.
Verify proper async handling if applicable.

weave/flow/casting.py
weave/flow/model.py
weave/trace/weave_client.py
weave/builtin_objects/models/CompletionModel.py
tests/trace/test_on_demand_evals_2.py
weave/flow/eval.py
weave/trace_server_bindings/caching_middleware_trace_server.py
weave/trace_server/external_to_internal_trace_server_adapter.py
weave/trace_server/server_side_object_saver.py
weave/trace_server/clickhouse_trace_server_batched.py
weave/trace_server/trace_server_interface.py

**/*.{md,mdx}: Focus on technical accuracy.
Check for broken links.
Verify code examples are up-to-date.
Look for clarity and completeness.
Don't focus on grammar/spelling unless significant.

PR.md

⏰ Context from checks skipped due to timeout of 90000ms (333)

GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, llamaindex)

🔇 Additional comments (29)

weave/builtin_objects/models/CompletionModel.py (1)

16-34: Fix the return type mismatch.

The method signature declares -> str, but it returns a JSON-parsed object (dict) if response_format is "json_object" or "json_schema". This can cause type confusion or errors for downstream callers.

Consider updating the return type or consistently returning a string. For example:

- def predict(self, **kwargs: Any) -> str:
+ def predict(self, **kwargs: Any) -> Any:

Additionally, consider wrapping the JSON parse call in try-except to handle malformed or unexpected JSON responses.

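
A minimal sketch of the guarded parse, assuming the raw completion text is available as a string. The helper name parse_completion is hypothetical; only the "json_object" and "json_schema" format names come from the comment above.

```python
import json
from typing import Any


def parse_completion(content: str, response_format: str = "text") -> Any:
    """Return parsed JSON for JSON response formats, otherwise the raw string."""
    if response_format in ("json_object", "json_schema"):
        try:
            return json.loads(content)
        except json.JSONDecodeError as exc:
            # Surface a clear error instead of an opaque decode failure.
            raise ValueError(f"Model returned malformed JSON: {exc}") from exc
    return content
```

Declaring the return type as Any (or a union) keeps the annotation honest for both branches.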
weave/flow/casting.py (1)

43-44:
❓ Verification inconclusive

Guard against potential infinite recursion.

When obj.get() also returns an ObjectRef, the recursion could continue indefinitely. Make sure there are no cyclic references leading to unbounded recursion in cast_to_scorer.

🏁 Script executed:
#!/bin/bash
# Description: Search the codebase for `ObjectRef.get()` usage to detect potential cyclical references.
rg -A 5 "ObjectRef.get" .
Length of output: 25

I'm not seeing output from the initial search, so let's dig deeper into the definition and implementation of ObjectRef and its get() method to verify whether there's any risk of infinite recursion. Please run the following additional shell script to search for the declaration of the ObjectRef class and its get() method within the codebase:

This script will help determine if get() has any logic that could lead to cyclic behavior (e.g., by returning another ObjectRef). Once we have these details, we can better decide if additional safeguards are needed in weave/flow/casting.py.

🏁 Script executed:
#!/bin/bash
echo "Searching for the definition of ObjectRef and its context..."
rg -A 10 "class ObjectRef" .

echo "---------------------------------------"
echo "Searching for the implementation of the get() method..."
rg -A 10 "def get(" .
Length of output: 2819

Review Update: Verify Cyclic Reference Handling in cast_to_scorer

The search confirms that ObjectRef is defined in weave/trace/refs.py, but we couldn’t conclusively locate an explicit implementation of the get() method that might indicate whether it can return another ObjectRef. Given this uncertainty, please manually verify that:

ObjectRef.get() (or its inherited implementations) does not return another ObjectRef in a way that would cause an infinite recursive call in cast_to_scorer.

cast_to_scorer includes safeguards (such as cycle detection) to prevent unbounded recursion if cyclic references arise.

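
If a safeguard turns out to be needed, one possible shape is an iterative resolver with cycle detection and a depth cap. This is only a sketch: FakeRef is a hypothetical stand-in for ObjectRef, and resolve_ref is not a function from the codebase.

```python
from typing import Any


class FakeRef:
    """Hypothetical stand-in for ObjectRef: get() returns the referenced value."""

    def __init__(self, target: Any = None) -> None:
        self.target = target

    def get(self) -> Any:
        return self.target


def resolve_ref(obj: Any, max_depth: int = 32) -> Any:
    """Follow refs until a non-ref value is reached, refusing cycles and deep chains."""
    seen: set[int] = set()
    depth = 0
    while isinstance(obj, FakeRef):
        if id(obj) in seen:
            raise ValueError("Cyclic reference detected while resolving ref")
        if depth >= max_depth:
            raise ValueError(f"Ref chain exceeds max depth of {max_depth}")
        seen.add(id(obj))
        obj = obj.get()
        depth += 1
    return obj
```

Using a loop instead of recursion also avoids hitting Python's recursion limit on long but legitimate chains.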
weave/flow/model.py (1)

129-134: Improved argument binding for prediction function

The change replaces manual argument filtering with Python's native signature.bind() method, which is a more robust way to match arguments to function parameters. This approach will correctly handle default values, keyword-only arguments, and other parameter specifications.

The improvement eliminates potential edge cases where argument filtering might miss complex parameter specifications and makes the code more maintainable.
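
For reference, a small standalone illustration of inspect.Signature.bind. The predict function here is a made-up example, not the model code from the PR, and how the PR handles surplus arguments is not shown in this review.

```python
import inspect


def predict(question: str, *, temperature: float = 0.0) -> str:
    """Toy prediction function with one required and one keyword-only parameter."""
    return f"{question}|{temperature}"


sig = inspect.signature(predict)

# bind() validates the supplied arguments against the signature.
bound = sig.bind(question="2+2?")
# apply_defaults() fills in parameters that were not supplied.
bound.apply_defaults()

result = predict(*bound.args, **bound.kwargs)
```

Note that bind() raises TypeError when given a keyword the function does not accept, so callers that want to drop surplus inputs still need to filter them (or catch the error) before binding.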

weave/trace_server_bindings/caching_middleware_trace_server.py (3)

4-4: Import addition for new async method

Added AsyncIterator to imports to support the new async streaming method.

436-441: New methods for call method and score call operations

These two new methods extend the middleware server to forward call and score requests to the underlying trace server.

The implementation correctly follows the existing pattern in this middleware class of forwarding non-cacheable operations.

442-445: Added async evaluation stream method

This new async method allows for streaming evaluation results asynchronously.

The implementation correctly awaits the async method from the next trace server, ensuring proper async handling.
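
The forwarding pattern can be sketched as follows. InnerServer and Middleware are hypothetical toy classes; whether the PR re-yields items or hands back the inner iterator directly is not visible in this review.

```python
import asyncio
from typing import AsyncIterator


class InnerServer:
    """Toy downstream server that streams integers."""

    async def evaluate_stream(self, n: int) -> AsyncIterator[int]:
        for i in range(n):
            await asyncio.sleep(0)  # yield control, as a real stream would
            yield i


class Middleware:
    """Toy middleware that forwards the stream without caching it."""

    def __init__(self, next_server: InnerServer) -> None:
        self._next = next_server

    async def evaluate_stream(self, n: int) -> AsyncIterator[int]:
        # An async generator is consumed with `async for`, not awaited.
        async for item in self._next.evaluate_stream(n):
            yield item


async def collect(n: int) -> list[int]:
    return [x async for x in Middleware(InnerServer()).evaluate_stream(n)]
```

Running asyncio.run(collect(3)) drives the whole chain and gathers the forwarded items into a list.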

tests/trace/test_on_demand_evals_2.py (6)

1-4: Well-documented test purpose

The docstring clearly explains the purpose of this test suite, which is to validate the end-to-end lifecycle of on-demand evaluations through the API.

15-16: Proper async test setup

The test is correctly marked with pytest.mark.asyncio to indicate it's an asynchronous test function, which is necessary for testing async functionality.

17-49: Test setup and evaluation configuration

This section creates test data, a scorer, and an evaluation object with appropriate configuration for the test.

The test dataset is simple but effective for validation purposes, and the LLMJudgeScorer is configured with appropriate parameters including response format schema.

50-59: Model configuration

The LiteLLMCompletionModel is properly configured with a system prompt and message template for the test.

61-84: Local execution testing

This section tests the local execution path with callback tracking and result validation.

The callbacks are properly implemented to track execution progress, and the assertions correctly validate both the callback invocations and the structure of evaluation results.

86-112: Remote execution testing

This section tests the remote execution path through a server using the streaming API.

The test properly sets up the stream, processes events asynchronously, and verifies the events follow the expected sequence and structure. The final assertion ensures all expected events have been processed.

weave/trace/weave_client.py (3)

498-503: Added ClientApplyScorerResult dataclass

This new dataclass provides a more structured way to return results from applying a scorer, including the result, score call, and feedback ID.

This improves the API by providing a richer return type with all related information in a single object.

660-660: Updated return type for apply_scorer method

The method signature has been updated to return the new ClientApplyScorerResult type.

This change correctly updates the type annotation to match the actual return value.

693-722: Enhanced return handling in apply_scorer method

The method implementation has been updated to construct and return a ClientApplyScorerResult object with the appropriate values.

The implementation correctly handles both cases where a WeaveClient is available (returning a feedback_id) and where it's not (returning None for the feedback_id). This ensures consistent behavior across different contexts.
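
A simplified picture of such a result type, based only on the three pieces of information named above. The class name ApplyScorerResult and the field types are assumptions; the real ClientApplyScorerResult may be defined differently.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ApplyScorerResult:
    """Hypothetical stand-in for ClientApplyScorerResult."""

    result: Any                        # the scorer's output
    score_call: Any                    # the call object recorded for the scorer
    feedback_id: Optional[str] = None  # None when no client is available


# With a client, a feedback ID is attached; without one, it stays None.
with_client = ApplyScorerResult(result={"score": 1}, score_call="call-1", feedback_id="fb-1")
without_client = ApplyScorerResult(result={"score": 1}, score_call="call-2")
```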

weave/trace_server/external_to_internal_trace_server_adapter.py (4)

96-119: Well-implemented async streaming adapter method

This new async method properly adapts the existing pattern to handle asynchronous iterators, maintaining the consistency of the adapter's conversion responsibilities while adding support for streaming.

411-418: LGTM - Method call adapter follows established pattern

This implementation correctly handles project_id conversion, user validation, and reference conversion, maintaining consistency with other adapter methods.

419-426: LGTM - Score call adapter maintains consistency

The implementation correctly follows the established pattern for trace server adapter methods, properly handling ID conversions and validation.

427-437: LGTM - Evaluate stream adapter leverages async pattern

Good implementation that properly uses the new _stream_ref_apply_async helper method to handle the streaming evaluation results.
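
As a rough picture of the streaming adapter pattern praised above, an async generator can wrap another async iterator and apply a conversion to each item as it passes through. The function name `stream_apply_async`, the demo source, and the string-based "conversion" are hypothetical; the real `_stream_ref_apply_async` helper converts references and IDs and may differ in signature.

```python
import asyncio
from typing import AsyncIterator, Callable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


async def stream_apply_async(
    source: AsyncIterator[T], convert: Callable[[T], U]
) -> AsyncIterator[U]:
    """Yield each item from `source` after passing it through `convert`."""
    async for item in source:
        yield convert(item)


async def _demo_source() -> AsyncIterator[str]:
    # Stand-in for a stream of results that carry internal references.
    for ref in ["internal:///a", "internal:///b"]:
        yield ref


async def collect() -> List[str]:
    # Hypothetical conversion: rewrite the internal scheme to an external one.
    to_external = lambda ref: ref.replace("internal", "weave")
    return [item async for item in stream_apply_async(_demo_source(), to_external)]
```

Because the wrapper is itself an async generator, items are converted lazily, one at a time, rather than after the whole stream has been buffered.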

weave/flow/eval.py (4)

56-57: Clear and well-typed callback interfaces

These type definitions provide good type hints for the callback functions, making the API easier to use and understand.

199-203: LGTM - Callback support enhances evaluation flexibility

The added callback parameter allows for progress tracking during evaluation, which is valuable for long-running evaluations.

213-219: Good error handling in callback execution

The implementation properly catches and logs exceptions from callbacks, ensuring that callback failures don't disrupt the main evaluation process.

252-264: Well-designed lifecycle callback addition

The addition of on_start and on_row_complete callbacks with proper error handling provides a clean way to track evaluation progress and integrate with external systems.
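
The callback handling described in these comments might look roughly like the sketch below. It assumes callback signatures (`on_start(call_id)` and `on_row_complete(call_id, eval_row)`) and invents the helper names `safe_invoke` and `run_rows` and the call ID strings; the actual definitions live in weave/flow/eval.py. The point is only that an exception raised by a callback is logged and does not interrupt the evaluation loop.

```python
import logging
from typing import Any, Callable, List, Optional

logger = logging.getLogger(__name__)

# Hypothetical aliases; the real callback types are defined in weave/flow/eval.py.
OnStart = Callable[[str], None]
OnRowComplete = Callable[[str, dict], None]


def safe_invoke(callback: Optional[Callable[..., None]], *args: Any) -> bool:
    """Run a callback, logging any exception; return True if it completed."""
    if callback is None:
        return False
    try:
        callback(*args)
    except Exception:
        logger.exception("Evaluation callback failed; continuing")
        return False
    return True


def run_rows(
    rows: List[Any],
    on_start: Optional[OnStart] = None,
    on_row_complete: Optional[OnRowComplete] = None,
) -> List[dict]:
    """Toy evaluation loop that reports progress through optional callbacks."""
    safe_invoke(on_start, "eval-call-id")  # placeholder call ID
    results = []
    for i, row in enumerate(rows):
        result = {"row": row, "index": i}
        results.append(result)
        safe_invoke(on_row_complete, f"row-call-{i}", result)
    return results
```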

weave/trace_server/clickhouse_trace_server_batched.py (1)

1738-1746: LGTM - Async evaluation stream method correctly delegates to runner

The evaluate_stream method properly creates a runner instance and delegates to it, maintaining the pattern established in the other methods.

weave/trace_server/trace_server_interface.py (4)

903-914: Well-defined method invocation request/response models

The CallMethodReq and CallMethodRes models provide a clear interface for method invocation with appropriate fields and validation.

916-926: LGTM - Score call models follow established pattern

The ScoreCallReq and ScoreCallRes models follow the established pattern for request/response models in the trace server interface.

928-953: Good design for evaluation response hierarchy

The evaluation response models are well-designed with a clear hierarchy:

Base class with step_type discriminator field

Specific subclasses for different step types

Union type for comprehensive typing

This makes the API easy to understand and use while maintaining type safety.
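
To make the hierarchy concrete, here is a minimal sketch of a discriminated union keyed on `step_type`. It uses plain dataclasses as a stand-in for whatever model base class the interface actually uses, and the class names, payload fields, and `describe` helper are assumptions. The three `step_type` values are the ones checked in the test suite for this PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Literal, Union


@dataclass
class EvaluateStepBase:
    step_type: str  # discriminator shared by every step


@dataclass
class StartStep(EvaluateStepBase):
    step_type: Literal["start"] = "start"


@dataclass
class PredictAndScoreStep(EvaluateStepBase):
    step_type: Literal["predict_and_score"] = "predict_and_score"
    row: Dict[str, Any] = field(default_factory=dict)  # hypothetical payload


@dataclass
class SummaryStep(EvaluateStepBase):
    step_type: Literal["summary"] = "summary"
    summary: Dict[str, Any] = field(default_factory=dict)  # hypothetical payload


# Union type so consumers can annotate "any evaluation step".
EvaluateStep = Union[StartStep, PredictAndScoreStep, SummaryStep]


def describe(step: EvaluateStep) -> str:
    """Dispatch on the discriminator rather than on isinstance checks."""
    if step.step_type == "start":
        return "evaluation started"
    if step.step_type == "predict_and_score":
        return "row scored"
    if step.step_type == "summary":
        return "evaluation finished"
    raise ValueError(f"Unknown step type: {step.step_type}")
```

A consumer of the stream can branch on `step.step_type` alone, which keeps the wire format self-describing when steps are serialized.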

1008-1013: LGTM - Interface methods complete on-demand evaluation support

The three new interface methods complete the implementation pattern established in the other files, providing a consistent interface across the system.

weave/trace_server/server_side_object_saver.py (1)

209-209: TODOs should be addressed before production release

The code contains TODOs at lines 209 and 220 for validating project alignment and fixing potential self parameter issues, which should be addressed or have corresponding issues created to track them.

Also applies to: 220-220

coderabbitai · 2025-03-15T21:05:49Z

weave/trace_server/server_side_object_saver.py

+            assert isinstance(int_ref, InternalObjectRef)
+            ref = ObjectRef(


🛠️ Refactor suggestion

Replace assert with proper error handling

Using assert is not recommended for validation in production code as assertions can be disabled with the -O flag when running Python. Replace with proper error handling.

-            assert isinstance(int_ref, InternalObjectRef)
+            if not isinstance(int_ref, InternalObjectRef):
+                raise TypeError("Invalid object reference")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

            assert isinstance(int_ref, InternalObjectRef)
            ref = ObjectRef(

            if not isinstance(int_ref, InternalObjectRef):
                raise TypeError("Invalid object reference")
            ref = ObjectRef(

coderabbitai · 2025-03-15T21:05:50Z

weave/trace_server/server_side_object_saver.py

+        assert project_id.startswith("_SERVER_/")
+        return project_id[len("_SERVER_/") :]


🛠️ Refactor suggestion

Replace assert with proper error handling for project_id

Using assert for validation is not recommended in production code. Additionally, string manipulation for project_id could be made more robust.

-        assert project_id.startswith("_SERVER_/")
-        return project_id[len("_SERVER_/") :]
+        prefix = "_SERVER_/"
+        if not project_id.startswith(prefix):
+            raise ValueError(f"Project ID must start with '{prefix}'")
+        return project_id[len(prefix):]

📝 Committable suggestion


Suggested change

        assert project_id.startswith("_SERVER_/")
        return project_id[len("_SERVER_/") :]

        prefix = "_SERVER_/"
        if not project_id.startswith(prefix):
            raise ValueError(f"Project ID must start with '{prefix}'")
        return project_id[len(prefix):]

weave/trace_server/server_side_object_saver.py

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (14)

PR.md (2)

3-3: Improve Clarity in TODO Item
Consider removing the phrase “sort of” from the description. A more assertive phrasing such as “Resolve the Scorer and Model input args (which were handled inconsistently)” would enhance clarity and professionalism.

🧰 Tools

🪛 LanguageTool

[style] ~3-~3: To make your writing sound more confident, consider removing the phrase ‘sort of’.
Context: ...Scorer and Model input args (these were sort of halphazardly handled and not consistent.) * [ ] Stop...

(SORT_OF_NP)

4-4: Action Item Consistency Check
Ensure that the instruction to “Stop tracking on_row_complete and on_start for ops” is aligned with the new callback types introduced in weave/flow/eval.py. Clarifying how this change integrates with the overall evaluation process might help future maintainers.

weave/trace_server/server_side_object_saver.py (3)

27-37: Consider adding docstrings for custom exceptions.

While these custom exception classes are straightforward, adding brief docstrings (e.g., describing when they are raised and what conditions trigger them) fosters clarity and maintainability.

39-419: Evaluate alternative concurrency approaches for frequent operations.

Each operation spawns a dedicated multiprocessing.Process, which can be expensive if these methods are called repeatedly. Using a pool-based approach or a shared process for certain operations might reduce overhead.
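
One possible shape for the pool-based alternative is sketched below using the standard library's `ProcessPoolExecutor`. The worker `do_work` and the wrapper `run_batch` are hypothetical placeholders, not functions from server_side_object_saver.py. Note the trade-off: a pool reuses worker processes, so any per-operation isolation that the dedicated-process design relies on would need to be re-examined.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Iterable, List


def do_work(payload: int) -> int:
    """Placeholder for one operation that currently gets its own Process."""
    return payload * 2


def run_batch(payloads: Iterable[int], max_workers: int = 2) -> List[int]:
    """Run many operations on a shared pool instead of one Process each."""
    # Workers are started once and reused for every payload in the batch.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(do_work, payloads))
```

Calling `run_batch([1, 2, 3])` from a script's `if __name__ == "__main__":` guard would dispatch the three payloads to at most two long-lived workers.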

407-407: Replace or remove print statements in production code.

print(eval_results) may be a leftover debug statement. Consider using a logger or removing it altogether for cleaner production output.
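
A minimal sketch of the logger-based replacement, assuming a module-level logger; the logger name and the `report_results` helper are illustrative only.

```python
import logging

# Hypothetical logger name; a real module would typically use __name__.
logger = logging.getLogger("eval_runner_sketch")


def report_results(eval_results: dict) -> None:
    """Emit results at DEBUG level so production output stays quiet by default."""
    # Lazy %-formatting: the message is only built if DEBUG is enabled.
    logger.debug("Evaluation results: %s", eval_results)
```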

weave/trace_server/trace_server_interface.py (2)

903-909: Good definition for CallMethodReq.

Fields are clearly defined and typed. Consider adding docstrings to clarify method usage.

928-933: EvaluateReq schema is clear.

Optionally, add a docstring to clarify how evaluation_ref and model_ref should be used.

tests/trace/test_on_demand_evals_2.py (5)

1-4: Fix typo in docstring.

There is a small typo in the docstring: "inteded" should be "intended".
-This test suite is inteded to test the end to end lifecycle of on demand evaluations.
+This test suite is intended to test the end to end lifecycle of on demand evaluations.

26-42: Consider parameterizing the model name.

The model name "gpt-4o-mini" is hardcoded here and in other places throughout the test. This could cause issues if the model becomes unavailable or changes.

Consider parameterizing the model name, either with a constant at the top of the file or through a fixture:
+MODEL_NAME = "gpt-4o-mini"  # Define at the top of the file

 scorer = LLMJudgeScorer(
-    model="gpt-4o-mini",
+    model=MODEL_NAME,
     system_prompt="Judge the correctness of the output.",
     response_format={
         # ...
     },
 )

69-71: Add type hint for call_id parameter.

The call_id parameter in the on_row_complete function is missing a type hint, while eval_row has one.
-def on_row_complete(call_id, eval_row: dict):
+def on_row_complete(call_id: str, eval_row: dict):
     nonlocal on_row_complete_called
     on_row_complete_called.append((call_id, eval_row))

77-84: Make assertion more robust against structural changes.

The current assertion is brittle because it assumes a specific structure for the evaluation results. If the structure changes, the test will break even if the functionality is correct.

Consider testing key parts of the structure separately:
 true_count = eval_results["LLMJudgeScorer"]["passed"]["true_count"]
 latency = eval_results["model_latency"]["mean"]
-assert eval_results == {
-    "LLMJudgeScorer": {
-        "passed": {"true_count": true_count, "true_fraction": true_count / 3}
-    },
-    "model_latency": {"mean": latency},
-}
+# Check individual components separately
+assert "LLMJudgeScorer" in eval_results
+assert "passed" in eval_results["LLMJudgeScorer"]
+assert "true_count" in eval_results["LLMJudgeScorer"]["passed"]
+assert "true_fraction" in eval_results["LLMJudgeScorer"]["passed"]
+assert eval_results["LLMJudgeScorer"]["passed"]["true_fraction"] == true_count / 3
+assert "model_latency" in eval_results
+assert "mean" in eval_results["model_latency"]

97-111: Add timeout and error handling for async stream processing.

The async stream processing doesn't have a timeout or error handling for unexpected stream termination, which could lead to the test hanging indefinitely.

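Consider adding timeout and error handling. Note that asyncio.timeout, used in the suggestion, requires Python 3.11 or later; the CI matrix for this PR also appears to run Python 3.9 and 3.10, where asyncio.wait_for is the portable alternative: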
 index = 0
+import asyncio
+try:
+    async with asyncio.timeout(30):  # Set an appropriate timeout
+        async for event in stream:
+            if event.item.step_type == "start":
+                assert index == 0
+                index += 1
+            elif event.item.step_type == "predict_and_score":
+                assert index > 0 and index <= 3
+                index += 1
+            elif event.item.step_type == "summary":
+                assert index == 4
+                index += 1
+            else:
+                raise ValueError(f"Unknown event type: {event.item.step_type}")
+except asyncio.TimeoutError:
+    pytest.fail("Stream processing timed out")
+except Exception as e:
+    pytest.fail(f"Unexpected error during stream processing: {e}")
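
For interpreters older than Python 3.11, where `asyncio.timeout` is unavailable, the same guard can be written with `asyncio.wait_for`. The sketch below is an assumption-laden illustration: `fake_stream` stands in for the server's evaluation stream, items are reduced to bare step-type strings, and the helper names are hypothetical.

```python
import asyncio
from typing import AsyncIterator, List


async def consume_step_types(stream: AsyncIterator[str]) -> List[str]:
    """Drain the stream, collecting the step type of each event in order."""
    seen = []
    async for step_type in stream:
        seen.append(step_type)
    return seen


async def consume_with_timeout(stream: AsyncIterator[str], timeout_s: float) -> List[str]:
    # wait_for cancels the consumer and raises a timeout error if the stream
    # has not finished within timeout_s seconds.
    return await asyncio.wait_for(consume_step_types(stream), timeout=timeout_s)


async def fake_stream(delay_s: float = 0.0) -> AsyncIterator[str]:
    # Stand-in for the evaluation stream; emits the three step types in order.
    for step_type in ["start", "predict_and_score", "summary"]:
        await asyncio.sleep(delay_s)
        yield step_type
```

In a test, catching `asyncio.TimeoutError` around `consume_with_timeout(...)` and calling `pytest.fail` keeps a stalled stream from hanging the run.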

tests/trace/eval_demo.ipynb (2)

50-94: Extract common code into reusable functions.

The dataset creation process involves multiple steps that are repeated for other objects like scorers and evaluations. This pattern makes the code verbose and harder to maintain.

Consider extracting common code into reusable functions:
def create_object(server_client, project_id, object_id, object_type, object_bases, **properties):
    """Create an object and return its reference URI."""
    obj = server_client.obj_create({
        "obj": {
            "project_id": project_id,
            "object_id": object_id,
            "val": {
                "_type": object_type,
                "_class_name": object_type,
                "_bases": object_bases,
                **properties
            }
        }
    })
    
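    obj_digest = obj.digest
    # entity and project are not parameters here; this assumes they are
    # defined earlier in the notebook.
    return ObjectRef(
        entity=entity,
        project=project,
        name=object_id,
        _digest=obj_digest
    ).uri()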

# Usage example
dataset_ref = create_object(
    server_client, 
    project_id, 
    dataset_name, 
    "Dataset", 
    ["Object", "BaseModel"], 
    rows=table_ref
)

264-275: Add error handling for evaluation streaming.

The evaluation streaming code lacks error handling, which could make debugging difficult if issues occur.
-results = server_client.evaluate_stream(
-    {
-        "project_id": project_id,
-        "evaluation_ref": evaluation_ref,
-        "model_ref": model_ref,
-    }
-)
-
-async for result in await results:
-    print(result)
+try:
+    results = server_client.evaluate_stream(
+        {
+            "project_id": project_id,
+            "evaluation_ref": evaluation_ref,
+            "model_ref": model_ref,
+        }
+    )
+    
+    async for result in await results:
+        print(result)
+except Exception as e:
+    print(f"Error during evaluation streaming: {e}")
+    # Consider additional error handling or retry logic

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7b593dc and 8f1c172.

📒 Files selected for processing (6)

PR.md (1 hunks)
tests/trace/eval_demo.ipynb (1 hunks)
tests/trace/test_on_demand_evals_2.py (1 hunks)
weave/trace_server/server_side_object_saver.py (1 hunks)
weave/trace_server/trace_server_interface.py (3 hunks)
weave/trace_server_bindings/remote_http_trace_server.py (3 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py: Focus on pythonic code patterns.
Check for proper exception handling.
Verify type hints usage where applicable.
Look for potential performance improvements.
Don't comment on formatting if black/isort is configured.
Check for proper dependency injection patterns.
Verify proper async handling if applicable.

tests/trace/test_on_demand_evals_2.py
weave/trace_server_bindings/remote_http_trace_server.py
weave/trace_server/trace_server_interface.py
weave/trace_server/server_side_object_saver.py

**/*.{md,mdx}: Focus on technical accuracy.
Check for broken links.
Verify code examples are up-to-date.
Look for clarity and completeness.
Don't focus on grammar/spelling unless significant.

PR.md


⏰ Context from checks skipped due to timeout of 90000ms (890)

GitHub Check: notify-wandb-core
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, litellm)
GitHub Check: Trace nox tests (3, 13, langchain)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile

🔇 Additional comments (19)

PR.md (1)

1-1: General Note on TODO Section
This section outlines pending tasks. Please ensure that each TODO item is later linked with a specific issue or additional details to facilitate tracking.

weave/trace_server/server_side_object_saver.py (5)

209-210: Unresolved TODO remains.
This line contains a TODO comment (“validate project alignment?”) from a previous review that still appears unresolved.

211-212: Replace assert with proper error handling.
Using assert for runtime validation in production code can be bypassed if Python is launched with -O. Use a conditional check and raise a TypeError or ValueError.

220-221: Unresolved TODO remains.
A TODO comment here refers to a potential “self” parameter issue. This was mentioned in a past review and appears unresolved.

389-398: Callback doesn't utilize the available arguments fully.
The callback ignores call_id and eval_row when generating EvaluatePredictAndScoreRes. Including these could provide additional context and traceability in the result.

423-424: Replace assert with proper error handling for project_id.

Production code should not rely on assert for critical checks. Use explicit conditional checks and raise a suitable exception.
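The two assert comments above can be illustrated with a minimal sketch of an explicit check. The helper name `require_project_id` is hypothetical and does not appear in the PR; it only shows the general pattern of raising a TypeError or ValueError instead of relying on assert.

```python
from typing import Optional


def require_project_id(project_id: Optional[str]) -> str:
    """Return project_id unchanged, or raise if it is unusable.

    Hypothetical helper for illustration; not part of the PR.
    """
    # Unlike `assert`, these checks still run when Python is started with -O.
    if not isinstance(project_id, str):
        raise TypeError(
            f"project_id must be a str, got {type(project_id).__name__}"
        )
    if not project_id:
        raise ValueError("project_id must not be empty")
    return project_id
```

A caller would wrap the value once at the boundary, for example `project_id = require_project_id(req_project_id)`, and use the returned string afterwards.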

weave/trace_server_bindings/remote_http_trace_server.py (4)

4-4: No issues with the new import.

563-567: Implementation looks consistent with existing code patterns.

568-572: Implementation of this method is straightforward and consistent with _generic_request.

573-580: Verify asynchronous behavior of the helper method.

evaluate_stream relies on _generic_stream_request_async, which currently uses a synchronous request under the hood. If the helper remains synchronous, this function might not truly be async and could block the event loop. Please ensure the helper is refactored for genuine async I/O.
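If switching HTTP clients is not an option, one possible stopgap is to pull each item from the blocking iterator in a worker thread. The sketch below uses only the standard library (`asyncio.to_thread`, available from Python 3.9); the name `iterate_in_thread` is an assumption for illustration and is not part of the PR.

```python
import asyncio
from collections.abc import AsyncIterator, Iterator

# Sentinel returned by next() when the blocking iterator is exhausted.
_DONE = object()


async def iterate_in_thread(blocking_iter: Iterator[str]) -> AsyncIterator[str]:
    """Yield items from a blocking iterator without stalling the event loop.

    Hypothetical helper: each next() call runs in a worker thread, so other
    tasks on the loop can make progress while the call waits on I/O.
    """
    while True:
        item = await asyncio.to_thread(next, blocking_iter, _DONE)
        if item is _DONE:
            break
        yield item
```

In the helper under review, this would wrap something like the iterator returned by `r.iter_lines()`. It keeps the loop responsive but still ties up one thread per stream, so a natively async client remains the cleaner fix.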

weave/trace_server/trace_server_interface.py (9)

2-2: Import usage looks appropriate.

911-914: Return schema looks fine.

916-921: ScoreCallReq looks well-structured.

923-926: No issues with ScoreCallRes.

935-937: EvaluateStartRes is straightforward.

939-941: EvaluatePredictAndScoreRes usage is consistent.

943-945: EvaluateSummaryRes structure is correct.

947-951: Great use of a discriminated union for EvaluateStepRes.

1005-1010: New interface methods aptly match the request/response classes.
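For readers unfamiliar with the discriminated-union pattern praised above, here is a minimal standard-library sketch of the idea: a tag field selects which class a streamed JSON line is parsed into. All names here (`StartStep`, `SummaryStep`, `step_type`, `parse_step`) are hypothetical stand-ins; the real EvaluateStepRes fields are not shown in this review, and the PR uses pydantic models rather than dataclasses.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass
class StartStep:
    call_id: str
    step_type: str = "start"  # discriminator value (hypothetical)


@dataclass
class SummaryStep:
    summary: dict
    step_type: str = "summary"  # discriminator value (hypothetical)


Step = Union[StartStep, SummaryStep]

# Map each discriminator value to the class that handles it.
_STEP_CLASSES = {"start": StartStep, "summary": SummaryStep}


def parse_step(line: str) -> Step:
    """Parse one JSON line into the step class named by its tag."""
    payload = json.loads(line)
    cls = _STEP_CLASSES[payload["step_type"]]  # KeyError on an unknown tag
    return cls(**payload)
```

The benefit of the tag is that a consumer of the stream can branch on the concrete type of each step instead of probing for optional fields.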

coderabbitai · 2025-03-16T00:06:13Z

weave/trace_server_bindings/remote_http_trace_server.py

+    async def _generic_stream_request_async(
+        self,
+        url: str,
+        req: BaseModel,
+        req_model: type[BaseModel],
+        res_model: type[BaseModel],
+    ) -> AsyncIterator[BaseModel]:
+        if isinstance(req, dict):
+            req = req_model.model_validate(req)
+        r = self._generic_request_executor(url, req, stream=True)
+        for line in r.iter_lines():
+            if line:
+                yield res_model.model_validate_json(line)
+


🛠️ Refactor suggestion

Consider using a fully asynchronous request approach.

Inside this async def method, the code calls synchronous requests.post and iterates over r.iter_lines(), which blocks the event loop and undermines asynchronous concurrency. Replacing requests with an async-capable HTTP client (e.g., aiohttp or httpx) would ensure non-blocking I/O and truly asynchronous behavior.

Below is an example diff using httpx:

-        r = self._generic_request_executor(url, req, stream=True)
-        for line in r.iter_lines():
-            if line:
-                yield res_model.model_validate_json(line)
+        import httpx
+        async with httpx.AsyncClient() as client:
+            async with client.stream(
+                "POST",
+                self.trace_server_url + url,
+                content=req.model_dump_json(by_alias=True).encode("utf-8"),
+            ) as r:
+                async for line in r.aiter_lines():
+                    if line:
+                        yield res_model.model_validate_json(line)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async def _generic_stream_request_async(
    self,
    url: str,
    req: BaseModel,
    req_model: type[BaseModel],
    res_model: type[BaseModel],
) -> AsyncIterator[BaseModel]:
    if isinstance(req, dict):
        req = req_model.model_validate(req)
    r = self._generic_request_executor(url, req, stream=True)
    for line in r.iter_lines():
        if line:
            yield res_model.model_validate_json(line)

async def _generic_stream_request_async(
    self,
    url: str,
    req: BaseModel,
    req_model: type[BaseModel],
    res_model: type[BaseModel],
) -> AsyncIterator[BaseModel]:
    if isinstance(req, dict):
        req = req_model.model_validate(req)
    import httpx
    async with httpx.AsyncClient() as client:
        async with client.stream(
            "POST",
            self.trace_server_url + url,
            content=req.model_dump_json(by_alias=True).encode("utf-8"),
        ) as r:
            async for line in r.aiter_lines():
                if line:
                    yield res_model.model_validate_json(line)

coderabbitai · 2025-03-16T00:06:13Z

tests/trace/eval_demo.ipynb

+    "os.environ[\"WF_TRACE_SERVER_URL\"] = \"http://127.0.01:6345\"\n",
+    "correctness_model_2 = weave.ref(\n",
+    "    \"weave:///timssweeney/on_demand_eval_demo_5/object/correctness_model_2:XlAizts4EuFNA8DQ8OU1xXs2rYR7jnNnZ2FWHrt5YFM\"\n",
+    ").get()\n",
+    "correctness_model_2.predict(input=\"United States\")"


⚠️ Potential issue

Fix URL typo and avoid hardcoded references.

The URL has a typo (127.0.01 instead of 127.0.0.1), and the model reference is hardcoded, so the cell could break if the model is recreated.

-os.environ["WF_TRACE_SERVER_URL"] = "http://127.0.01:6345"
-correctness_model_2 = weave.ref(
-    "weave:///timssweeney/on_demand_eval_demo_5/object/correctness_model_2:XlAizts4EuFNA8DQ8OU1xXs2rYR7jnNnZ2FWHrt5YFM"
-).get()
+os.environ["WF_TRACE_SERVER_URL"] = "http://127.0.0.1:6345"
+# Use the model reference created earlier in the notebook
+correctness_model_2 = weave.ref(model_ref).get()
 correctness_model_2.predict(input="United States")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

"os.environ[\"WF_TRACE_SERVER_URL\"] = \"http://127.0.01:6345\"\n",
"correctness_model_2 = weave.ref(\n",
"    \"weave:///timssweeney/on_demand_eval_demo_5/object/correctness_model_2:XlAizts4EuFNA8DQ8OU1xXs2rYR7jnNnZ2FWHrt5YFM\"\n",
").get()\n",
"correctness_model_2.predict(input=\"United States\")"

os.environ["WF_TRACE_SERVER_URL"] = "http://127.0.0.1:6345"
# Use the model reference created earlier in the notebook
correctness_model_2 = weave.ref(model_ref).get()
correctness_model_2.predict(input="United States")

coderabbitai · 2025-03-16T00:06:13Z

tests/trace/eval_demo.ipynb

+    "server_client = RemoteHTTPTraceServer(\"http://127.0.01:6345\")\n",
+    "server_client.set_auth((\"\", wandb_context.api_key))"


⚠️ Potential issue

Fix typo in URL and improve security practices.

There's a typo in the server URL and the authentication method uses an empty string with an API key directly in the notebook.

-server_client = RemoteHTTPTraceServer("http://127.0.01:6345")
-server_client.set_auth(("", wandb_context.api_key))
+server_client = RemoteHTTPTraceServer("http://127.0.0.1:6345")
+# Use environment variables for credentials or a more secure authentication method
+import os
+api_key = os.environ.get("WANDB_API_KEY", wandb_context.api_key)
+server_client.set_auth(("", api_key))

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    "server_client = RemoteHTTPTraceServer(\"http://127.0.01:6345\")\n",
-    "server_client.set_auth((\"\", wandb_context.api_key))"
+    "server_client = RemoteHTTPTraceServer(\"http://127.0.0.1:6345\")\n",
+    "# Use environment variables for credentials or a more secure authentication method\n",
+    "import os\n",
+    "api_key = os.environ.get(\"WANDB_API_KEY\", wandb_context.api_key)\n",
+    "server_client.set_auth((\"\", api_key))"

coderabbitai

Actionable comments posted: 8

🧹 Nitpick comments (14)

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvalStudioPage.tsx (2)
47-47: Remove unused import.

ModelReport is imported but never used. Consider removing this to keep the codebase clean.
- import {ModelReport} from './ModelReport';
🧰 Tools

🪛 GitHub Check: WeaveJS Lint and Compile

[warning] 47-47:
'ModelReport' is defined but never used

358-366: Unify fetch and error handling logic.

Multiple fetch operations exhibit a repeated pattern without user-facing error feedback (only console errors). Consider extracting a common utility/hook to handle loading states, caching, and error handling with user-visible alerts or fallback UI.

Also applies to: 484-490, 540-549, 604-607
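
One way to read the suggestion above is a single helper that owns the loading, success, and error transitions, so each call site only supplies the request and a callback that renders the state. The sketch below is framework-agnostic TypeScript; `LoadState` and `loadWithState` are hypothetical names used for illustration and are not part of this PR.

```typescript
// Hypothetical shared helper; names are illustrative, not taken from the PR.
type LoadState<T> =
  | {status: 'loading'}
  | {status: 'success'; data: T}
  | {status: 'error'; message: string};

// Runs `fetcher` and reports every state transition through `onChange`,
// so callers can show a spinner, the data, or a user-visible error.
async function loadWithState<T>(
  fetcher: () => Promise<T>,
  onChange: (state: LoadState<T>) => void
): Promise<void> {
  onChange({status: 'loading'});
  try {
    const data = await fetcher();
    onChange({status: 'success', data});
  } catch (err: unknown) {
    // Normalize anything thrown into a message the UI can display.
    const message = err instanceof Error ? err.message : String(err);
    onChange({status: 'error', message});
  }
}
```

In a React component, `onChange` could simply be a `useState` setter, which would let the four flagged fetch sites share one code path for spinners and error banners.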
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/forms/NewModelForm.tsx (2)

16-17: Offer assistance with model creation.

There is a TODO comment about implementing actual model creation. If you would like, I can assist in hooking up a real API call or server-side logic to complete this functionality.

18-19: Use a robust ID generation method.

Relying on Math.random().toString(36) may lead to collisions. For production-grade applications, consider using a library like uuid or nanoid for generating truly unique IDs.
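
As a dependency-free alternative to the libraries named above, a random (version 4) UUID from the platform's crypto module may be enough. The sketch below uses Node's `randomUUID` so it can run standalone; in browser code the Web Crypto equivalent is `crypto.randomUUID()`, which is only exposed in secure contexts. `newModelId` and its `prefix` parameter are hypothetical names, not part of this PR.

```typescript
import {randomUUID} from 'node:crypto';

// Hypothetical helper: derives the ID from a random (version 4) UUID
// instead of Math.random().toString(36).
function newModelId(prefix: string = 'model'): string {
  return `${prefix}-${randomUUID()}`;
}
```

Keeping the prefix makes IDs easier to recognize in logs while the UUID part carries the uniqueness.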

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/EvalStudio.tsx (1)

41-47: Consider extracting inline styles to an external stylesheet or styled-components.
Inline style blocks can clutter the component body and make maintenance or reuse challenging.

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/forms/NewDatasetForm.tsx (1)

13-23: Address the “file” state usage and file upload logic.
You assign a file to state but never use it, and there’s a TODO indicating future file upload implementation. Consider removing the unused state if it’s not needed or incorporating file parsing to build samples from CSV/JSON.

Would you like help adding this logic or opening a tracking issue?
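
If the file state is kept, the parsing step could look roughly like the sketch below. It assumes the uploaded file has already been read as text, that JSON files hold an array of objects, and that CSV files have a header row with no quoted or escaped commas. `Sample`, `parseSamples`, and the two helpers are hypothetical names, not part of this PR.

```typescript
// Hypothetical sample shape: one object per dataset row.
type Sample = Record<string, unknown>;

// Expects a JSON array of plain objects, e.g. [{"input": "..."}].
function parseJsonSamples(text: string): Sample[] {
  const parsed: unknown = JSON.parse(text);
  if (!Array.isArray(parsed)) {
    throw new Error('Expected a JSON array of sample objects');
  }
  return parsed.map((row: unknown, i: number) => {
    if (row === null || typeof row !== 'object' || Array.isArray(row)) {
      throw new Error(`Row ${i} is not an object`);
    }
    return row as Sample;
  });
}

// Naive CSV parsing: header row first, comma-separated, no quoted fields.
function parseCsvSamples(text: string): Sample[] {
  const lines = text.split(/\r?\n/).filter(line => line.trim() !== '');
  const headerLine = lines[0];
  if (headerLine === undefined) {
    return [];
  }
  const headers = headerLine.split(',').map(h => h.trim());
  return lines.slice(1).map(line => {
    const cells = line.split(',');
    const sample: Sample = {};
    headers.forEach((header, i) => {
      sample[header] = (cells[i] ?? '').trim();
    });
    return sample;
  });
}

// Chooses the parser from the file extension; anything not .json is treated as CSV.
function parseSamples(fileName: string, text: string): Sample[] {
  return fileName.toLowerCase().endsWith('.json')
    ? parseJsonSamples(text)
    : parseCsvSamples(text);
}
```

Real CSV files with quoted fields would need a proper parser; the split-on-comma approach here is only meant to show where the logic would sit.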

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvaluationResults.tsx (1)

150-154: Implement the “Run Evaluation” action.
The TODO comment signals missing logic for triggering evaluations with selected data.

Do you want me to propose an approach to implement this action or open a new issue for future reference?
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/forms/NewEvaluationForm.tsx (2)
15-33: Replace mock data with API integration.

The scorers are currently hardcoded as mock data. This should be replaced with actual API calls to fetch available scorers.
- // Mock scorers for now - would come from API
- const availableScorers: Scorer[] = [
-   {id: 'scorer1', name: 'Accuracy', description: 'Basic accuracy scorer'},
-   {
-     id: 'scorer2',
-     name: 'F1 Score',
-     description: 'F1 score for balanced evaluation',
-   },
-   {
-     id: 'scorer3',
-     name: 'BLEU',
-     description: 'BLEU score for text generation',
-   },
-   {
-     id: 'scorer4',
-     name: 'ROUGE',
-     description: 'ROUGE score for summarization',
-   },
- ];
+ const [availableScorers, setAvailableScorers] = useState<Scorer[]>([]);
+ 
+ useEffect(() => {
+   const fetchScorers = async () => {
+     try {
+       const scorers = await fetchAvailableScorers(); // API function to be implemented
+       setAvailableScorers(scorers);
+     } catch (error) {
+       console.error('Failed to fetch scorers:', error);
+       // Handle error state
+     }
+   };
+   
+   fetchScorers();
+ }, []);
53-137: Consider extracting inline styles to improve maintainability.

The component uses numerous inline styles which can make the code harder to maintain. Consider extracting these into styled components or CSS modules.

You could use emotion, styled-components, or a similar CSS-in-JS library to extract these styles:
// Example using styled-components
import styled from 'styled-components';

const FormContainer = styled.div`
  padding: 1rem;
`;

const FormSection = styled.div`
  margin-bottom: 1rem;
`;

const FormLabel = styled.label`
  display: block;
  margin-bottom: 0.5rem;
`;

// etc...

// Then in your component:
return (
  <FormContainer>
    <h2>Create New Evaluation</h2>
    <form onSubmit={handleSubmit}>
      <FormSection>
        <FormLabel>Evaluation Name</FormLabel>
        {/* ... */}
      </FormSection>
      {/* ... */}
    </form>
  </FormContainer>
);
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/README.md (1)

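9-16: Verify the flagged repetition in the navigation tabs list.

The static analyzer reports a duplicated phrase near line 13. In the quoted context the tabs read Data Preview, Evaluation Details, Run Details, Run Report, so the match looks like it comes from the words "Details" and "Run" recurring across adjacent items rather than from one tab being listed twice.

Verify the tab names and ensure they are all distinct. If they already are, this warning is likely a false positive.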

🧰 Tools

🪛 LanguageTool

[grammar] ~13-~13: This phrase is duplicated. You should probably use “Details Run” only once.
Context: ...: - Data Preview - Evaluation Details - Run Details - Run Report 2. List Views (Three collap...

(PHRASE_REPETITION)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/ModelReport.tsx (1)
151-156: Improve handling of non-numeric metrics.

The current implementation simply converts non-numeric metrics to strings without special formatting. Consider adding more specific handling for different data types.
{Object.entries(result.metrics).map(([key, value]) => (
  <td key={key} style={{padding: '0.75rem'}}>
-    {typeof value === 'number'
-      ? value.toFixed(3)
-      : value.toString()}
+    {(() => {
+      if (typeof value === 'number') {
+        return value.toFixed(3);
+      } else if (typeof value === 'boolean') {
+        return value ? 'Yes' : 'No';
+      } else if (value === null || value === undefined) {
+        return 'N/A';
+      } else if (typeof value === 'object') {
+        return JSON.stringify(value);
+      } else {
+        return String(value);
+      }
+    })()}
  </td>
))}
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/DetailedResults.tsx (1)
16-35: Add retry mechanism for failed data fetches.

The component logs errors to the console but doesn't provide a mechanism for retrying the fetch operation.
export const DetailedResults: React.FC = () => {
  const [detailedResults, setDetailedResults] =
    useState<DetailedEvaluationResult | null>(null);
  const [loading, setLoading] = useState(true);
  const [error, setError] = useState<string | null>(null);

  const {selectedResult, selectedEvaluation, setSelectedResult} =
    useEvalStudio();

+  const loadData = async () => {
+    if (!selectedResult) {
+      return;
+    }
+
+    setLoading(true);
+    setError(null);
+    try {
+      const data = await fetchDetailedResults(selectedResult.callId);
+      setDetailedResults(data);
+    } catch (error) {
+      console.error('Failed to fetch detailed results:', error);
+      setError('Failed to load detailed results');
+    } finally {
+      setLoading(false);
+    }
+  };

  useEffect(() => {
-    const loadData = async () => {
-      if (!selectedResult) {
-        return;
-      }
-
-      setLoading(true);
-      try {
-        const data = await fetchDetailedResults(selectedResult.callId);
-        setDetailedResults(data);
-      } catch (error) {
-        console.error('Failed to fetch detailed results:', error);
-        setError('Failed to load detailed results');
-      } finally {
-        setLoading(false);
-      }
-    };
-
    loadData();
  }, [selectedResult]);

  // In the error display section
  if (error) {
    return (
      <div>
        <div style={{color: 'red', marginBottom: '1rem'}}>{error}</div>
-       <button onClick={() => setSelectedResult(null)}>Back to Results</button>
+       <div style={{ marginBottom: '1rem' }}>
+         <button onClick={loadData} style={{ marginRight: '1rem' }}>Retry</button>
+         <button onClick={() => setSelectedResult(null)}>Back to Results</button>
+       </div>
      </div>
    );
  }
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/mockData.ts (2)
12-23: Ensure model filtering or confirm intended behavior.

The fetchModels function accepts an evaluationId but simply returns mockModels without filtering by this ID. If the intention is to map models to a specific evaluation, consider filtering by evaluationId within mockModels. Otherwise, clarify why all models are returned.
-export const fetchModels = async (evaluationId: string): Promise<Model[]> => {
-  await new Promise(resolve => setTimeout(resolve, 500));
-  return mockModels;
+export const fetchModels = async (evaluationId: string): Promise<Model[]> => {
+  await new Promise(resolve => setTimeout(resolve, 500));
+  return mockModels.filter(model => model.evaluationId === evaluationId);
}
109-140: Consider error handling for client.objsQuery.

Currently, there is no handling for potential network or server errors. Although this is mock code, it could be beneficial to wrap the server call in a try/catch to gracefully handle errors, ensuring consistent state in case of query failures.
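
A small wrapper can turn failures into data without forcing try/catch at every call site. The sketch below uses a promise chain; `QueryResult` and `safeQuery` are hypothetical names, and the `query` argument stands in for a call such as `client.objsQuery(...)`, whose exact signature is not shown in this review.

```typescript
// Hypothetical result type: either the query value or a readable error message.
type QueryResult<T> = {ok: true; value: T} | {ok: false; error: string};

// Wraps any promise-returning query so that failures become data instead of
// unhandled rejections. Starting from Promise.resolve() also captures
// errors thrown synchronously by `query`.
function safeQuery<T>(query: () => Promise<T>): Promise<QueryResult<T>> {
  return Promise.resolve()
    .then(query)
    .then(
      (value): QueryResult<T> => ({ok: true, value}),
      (err: unknown): QueryResult<T> => ({
        ok: false,
        error: err instanceof Error ? err.message : String(err),
      })
    );
}
```

A caller can then branch on `result.ok` and keep its existing state untouched when the query fails.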

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f1c172 and 794a33b.

📒 Files selected for processing (14)

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/EvalStudio.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/README.md (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/api.ts (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/DetailedResults.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvalStudioPage.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvaluationResults.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvaluationsList.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/ModelReport.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/forms/NewDatasetForm.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/forms/NewEvaluationForm.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/forms/NewModelForm.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/mockData.ts (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/types.ts (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/tsDataModelHooks.ts (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvaluationsList.tsx

🧰 Additional context used

📓 Path-based instructions (2)

**/*.{js,jsx,ts,tsx}: Focus on architectural and logical issues rather than style (assuming ESLint is in place).
Flag potential memory leaks and performance bottlenecks.
Check for proper error handling and async/await usage.
Avoid strict enforcement of try/catch blocks - accept Promise chains, early returns, and other clear error handling patterns. These are acceptable as long as they maintain clarity and predictability.
Ensure proper type usage in TypeScript files.
Look for security vulnerabilities in data handling.
Don't comment on formatting if prettier is configured.
Verify proper React hooks usage and component lifecycle.
Check for proper state management patterns.

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/forms/NewEvaluationForm.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvalStudioPage.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvaluationResults.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/EvalStudio.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/forms/NewModelForm.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/forms/NewDatasetForm.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/ModelReport.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/DetailedResults.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/mockData.ts
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/tsDataModelHooks.ts
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/types.ts
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/api.ts

**/*.{md,mdx}: Focus on technical accuracy.
Check for broken links.
Verify code examples are up-to-date.
Look for clarity and completeness.
Don't focus on grammar/spelling unless significant.

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/README.md

🪛 GitHub Check: WeaveJS Lint and Compile

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvalStudioPage.tsx

[warning] 47-47:
'ModelReport' is defined but never used

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/forms/NewDatasetForm.tsx

[warning] 11-11:
'file' is assigned a value but never used

🪛 LanguageTool

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/README.md

[grammar] ~13-~13: This phrase is duplicated. You should probably use “Details Run” only once.
Context: ...: - Data Preview - Evaluation Details - Run Details - Run Report 2. List Views (Three collap...

(PHRASE_REPETITION)

⏰ Context from checks skipped due to timeout of 90000ms (879)

GitHub Check: notify-wandb-core
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, litellm)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile

🔇 Additional comments (5)

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/tsDataModelHooks.ts (1)

1819-1819: Great enhancement for broader usage.

Exporting the StatusCodeType allows it to be reused in other modules and improves maintainability.

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvaluationResults.tsx (1)

28-61: Guard against state updates after unmount in async data fetching.

If the component unmounts before loadResults() completes, its pending setState calls still run against the unmounted component, which may leak memory or trigger React warnings. A previous review flagged a similar pattern.
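
One common way to handle this is to drop results that arrive after the component has gone away. The sketch below is framework-free so it can run on its own; `createUpdateGuard` and `applied` are hypothetical names, not part of this PR, and in a React component the `cancel` call would presumably live in the `useEffect` cleanup function.

```typescript
// Hypothetical helper: wraps a state setter so that calls made after
// cancel() are silently ignored.
type UpdateGuard<T> = {
  set: (value: T) => void;
  cancel: () => void;
};

function createUpdateGuard<T>(apply: (value: T) => void): UpdateGuard<T> {
  let active = true;
  return {
    set: (value: T) => {
      if (active) {
        apply(value);
      }
    },
    cancel: () => {
      active = false;
    },
  };
}

// Usage sketch: `applied` stands in for component state.
const applied: string[] = [];
const guard = createUpdateGuard<string>(value => {
  applied.push(value);
});

guard.set('results-before-unmount'); // applied
guard.cancel(); // what a cleanup function would call
guard.set('results-after-unmount'); // ignored
```

The guard does not abort the underlying request; it only prevents the late result from being written into state.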

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/README.md (1)

1-149: Well-structured documentation with comprehensive details.

The README is well-organized, providing clear information about the component structure, key features, data flow, dependencies, and future improvements. The inclusion of TypeScript interfaces and usage examples is particularly helpful.

🧰 Tools

🪛 LanguageTool

[grammar] ~13-~13: This phrase is duplicated. You should probably use “Details Run” only once.
Context: ...: - Data Preview - Evaluation Details - Run Details - Run Report 2. List Views (Three collap...

(PHRASE_REPETITION)

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/DetailedResults.tsx (1)

116-126: Parameterize the score threshold for color-coding.

The component hard-codes a threshold of 0.5 to decide whether a score is "good" (green) or "bad" (red). That may not suit every scorer, since scorers can use different scales or attach different meanings to their scores.

Consider one of these approaches:

Get threshold values from each scorer's configuration

Allow for customizable thresholds based on scorer type

Use a more gradual color scale

Example implementation:
+ // Function to determine color based on score and scorer
+ const getScoreColor = (score: number, scorerId: string) => {
+   const scorer = selectedEvaluation.scorers.find(s => s.id === scorerId);
+   // Get threshold from scorer config or use default (?? keeps an explicit threshold of 0)
+   const threshold = scorer?.config?.threshold ?? 0.5;
+   return score >= threshold ? 'green' : 'red';
+ };

// Then in the JSX:
- color:
-   prediction.scores[scorerRef] >= 0.5 ? 'green' : 'red',
+ color: getScoreColor(prediction.scores[scorerRef] || 0, scorerRef),
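
The third option listed above, a more gradual color scale, could look roughly like the following. This is only a sketch: `scoreToColor` is a hypothetical helper, the default 0 to 1 range is an assumption about the scorer's scale, and the saturation and lightness values are arbitrary.

```typescript
// Hypothetical helper: maps a score to an HSL color, moving from red
// (hue 0) at `min` to green (hue 120) at `max`.
function scoreToColor(score: number, min = 0, max = 1): string {
  if (!isFinite(score) || max <= min) {
    return 'gray'; // fallback for missing or unusable values
  }
  // Clamp so out-of-range scores map to the ends of the scale.
  const clamped = Math.min(Math.max(score, min), max);
  const fraction = (clamped - min) / (max - min);
  const hue = Math.round(fraction * 120);
  return `hsl(${hue}, 70%, 40%)`;
}

const lowColor = scoreToColor(0); // 'hsl(0, 70%, 40%)'
const midColor = scoreToColor(0.5); // 'hsl(60, 70%, 40%)'
const highColor = scoreToColor(1); // 'hsl(120, 70%, 40%)'
```

Passing `min` and `max` per scorer would let scorers with other ranges share the same scale.
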
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/types.ts (1)

59-68: Confirm Date usage for serialization.

Using Date objects within interfaces is convenient internally but can complicate JSON serialization/deserialization. Ensure call sites handle date conversions properly, especially when persisting or transferring data across the network.

Could you verify downstream usage of these interfaces to ensure consistent handling of these Date fields?
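
To make the concern concrete: JSON.stringify writes a Date as an ISO 8601 string, and JSON.parse hands it back as a plain string, so the conversion has to be explicit at the boundary. The sketch below assumes an ISO string wire format; `EvaluationRecord`, `EvaluationRecordJson`, `toJson`, and `fromJson` are hypothetical names, with `createdAt` standing in for the Date fields in these interfaces.

```typescript
// Hypothetical in-memory shape with a real Date.
interface EvaluationRecord {
  displayName: string;
  createdAt: Date;
}

// Hypothetical wire shape: the same field as an ISO 8601 string.
interface EvaluationRecordJson {
  displayName: string;
  createdAt: string;
}

function toJson(record: EvaluationRecord): EvaluationRecordJson {
  return {
    displayName: record.displayName,
    createdAt: record.createdAt.toISOString(),
  };
}

function fromJson(raw: EvaluationRecordJson): EvaluationRecord {
  const createdAt = new Date(raw.createdAt);
  // Reject strings that do not parse to a valid date.
  if (isNaN(createdAt.getTime())) {
    throw new Error(`Invalid createdAt: ${raw.createdAt}`);
  }
  return {displayName: raw.displayName, createdAt};
}

const original: EvaluationRecord = {
  displayName: 'demo',
  createdAt: new Date('2025-03-16T08:06:16.000Z'),
};
const wire = JSON.stringify(toJson(original));
const restored = fromJson(JSON.parse(wire) as EvaluationRecordJson);
```

Keeping separate in-memory and wire types lets the compiler flag call sites that forget the conversion.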

coderabbitai · 2025-03-16T08:06:16Z

...nts/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/forms/NewEvaluationForm.tsx

+    const newEvaluation: Evaluation = {
+      entity: 'wandb',
+      project: 'eval-studio',
+      objectId: Math.random().toString(36).substr(2, 9),
+      objectDigest: Math.random().toString(36).substr(2, 9),
+      evaluationRef: Math.random().toString(36).substr(2, 9),
+      displayName: name,
+      createdAt: new Date(),
+      datasetRef: 'dataset1', // Would come from context
+      scorerRefs: selectedScorers,
+    };


🛠️ Refactor suggestion

Avoid using Math.random() for IDs in production code.

Using Math.random() for generating IDs is not recommended for production as it doesn't guarantee uniqueness. Consider using a proper UUID library instead.

+ import { v4 as uuidv4 } from 'uuid';

// Then in the handleSubmit function:
  const newEvaluation: Evaluation = {
    entity: 'wandb',
    project: 'eval-studio',
-   objectId: Math.random().toString(36).substr(2, 9),
-   objectDigest: Math.random().toString(36).substr(2, 9),
-   evaluationRef: Math.random().toString(36).substr(2, 9),
+   objectId: uuidv4(),
+   objectDigest: uuidv4(),
+   evaluationRef: uuidv4(),
    displayName: name,
    createdAt: new Date(),
    datasetRef: 'dataset1', // Would come from context
    scorerRefs: selectedScorers,
  };

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- const newEvaluation: Evaluation = {
-   entity: 'wandb',
-   project: 'eval-studio',
-   objectId: Math.random().toString(36).substr(2, 9),
-   objectDigest: Math.random().toString(36).substr(2, 9),
-   evaluationRef: Math.random().toString(36).substr(2, 9),
-   displayName: name,
-   createdAt: new Date(),
-   datasetRef: 'dataset1', // Would come from context
-   scorerRefs: selectedScorers,
- };
+ import { v4 as uuidv4 } from 'uuid';
+
+ // ... other imports and context
+
+ const newEvaluation: Evaluation = {
+   entity: 'wandb',
+   project: 'eval-studio',
+   objectId: uuidv4(),
+   objectDigest: uuidv4(),
+   evaluationRef: uuidv4(),
+   displayName: name,
+   createdAt: new Date(),
+   datasetRef: 'dataset1', // Would come from context
+   scorerRefs: selectedScorers,
+ };

coderabbitai · 2025-03-16T08:06:16Z

...nts/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/forms/NewEvaluationForm.tsx

+      evaluationRef: Math.random().toString(36).substr(2, 9),
+      displayName: name,
+      createdAt: new Date(),
+      datasetRef: 'dataset1', // Would come from context


🛠️ Refactor suggestion

Retrieve datasetRef from context instead of hardcoding.

The comment indicates that datasetRef should come from context, but it's currently hardcoded as 'dataset1'.

+ import { useEvalStudio } from '../../context';

  // Inside the component:
+ const { selectedDatasetVersion } = useEvalStudio();

  // Then in the handleSubmit function:
  const newEvaluation: Evaluation = {
    entity: 'wandb',
    project: 'eval-studio',
    objectId: Math.random().toString(36).substr(2, 9),
    objectDigest: Math.random().toString(36).substr(2, 9),
    evaluationRef: Math.random().toString(36).substr(2, 9),
    displayName: name,
    createdAt: new Date(),
-   datasetRef: 'dataset1', // Would come from context
+   datasetRef: selectedDatasetVersion?.id || '',
    scorerRefs: selectedScorers,
  };

📝 Committable suggestion

Suggested change

- datasetRef: 'dataset1', // Would come from context
+ // At the top of the file, add the import:
+ import { useEvalStudio } from '../../context';
+ ...
+ const NewEvaluationForm = (props) => {
+   // Retrieve selectedDatasetVersion from context
+   const { selectedDatasetVersion } = useEvalStudio();
+   // ... other component logic
+   const handleSubmit = () => {
+     const newEvaluation: Evaluation = {
+       entity: 'wandb',
+       project: 'eval-studio',
+       objectId: Math.random().toString(36).substr(2, 9),
+       objectDigest: Math.random().toString(36).substr(2, 9),
+       evaluationRef: Math.random().toString(36).substr(2, 9),
+       displayName: name,
+       createdAt: new Date(),
+       // Retrieve datasetRef from context instead of hardcoding
+       datasetRef: selectedDatasetVersion?.id || '',
+       scorerRefs: selectedScorers,
+     };
+     // ... further handling logic
+   };
+   // ... rest of the component
+   return (
+     // ... JSX for the component
+   );
+ };
+ export default NewEvaluationForm;

coderabbitai · 2025-03-16T08:06:16Z

.../src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/ModelReport.tsx

+export const ModelReport: React.FC<ModelReportProps> = ({results}) => {
+  // Calculate summary statistics
+  const summaryStats = results.reduce((acc, result) => {
+    Object.entries(result.metrics).forEach(([key, value]) => {
+      if (typeof value === 'number') {
+        if (!acc[key]) {
+          acc[key] = {
+            min: value,
+            max: value,
+            sum: value,
+            count: 1,
+          };
+        } else {
+          acc[key].min = Math.min(acc[key].min, value);
+          acc[key].max = Math.max(acc[key].max, value);
+          acc[key].sum += value;
+          acc[key].count += 1;
+        }
+      }
+    });
+    return acc;
+  }, {} as Record<string, {min: number; max: number; sum: number; count: number}>);


🛠️ Refactor suggestion

Add handling for empty results array.

The component doesn't handle the case where the results array might be empty, which could lead to rendering errors. Add a check at the beginning of the component to handle this case.

  export const ModelReport: React.FC<ModelReportProps> = ({results}) => {
+   if (!results || results.length === 0) {
+     return <div>No results available to display</div>;
+   }
    // Calculate summary statistics
    const summaryStats = results.reduce((acc, result) => {
      // ...

📝 Committable suggestion

Suggested change

  export const ModelReport: React.FC<ModelReportProps> = ({results}) => {
+   if (!results || results.length === 0) {
+     return <div>No results available to display</div>;
+   }
    // Calculate summary statistics
    const summaryStats = results.reduce((acc, result) => {
      Object.entries(result.metrics).forEach(([key, value]) => {
        if (typeof value === 'number') {
          if (!acc[key]) {
            acc[key] = {
              min: value,
              max: value,
              sum: value,
              count: 1,
            };
          } else {
            acc[key].min = Math.min(acc[key].min, value);
            acc[key].max = Math.max(acc[key].max, value);
            acc[key].sum += value;
            acc[key].count += 1;
          }
        }
      });
      return acc;
    }, {} as Record<string, {min: number; max: number; sum: number; count: number}>);
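
One way to make the empty case easy to verify is to pull the aggregation into a pure function that can be tested without rendering. Note that reduce with an initial value already returns that value for an empty array, so the guard matters mostly when results is null or undefined. The sketch below assumes a simplified result shape; summarizeMetrics, ResultLike, and MetricSummary are hypothetical names, not identifiers from this PR.

```typescript
// Hypothetical, simplified stand-ins for the PR's result types.
type MetricValue = number | string | boolean | null;

interface ResultLike {
  metrics: Record<string, MetricValue>;
}

interface MetricSummary {
  min: number;
  max: number;
  sum: number;
  count: number;
}

// Aggregates numeric metrics; returns an empty object for empty or missing input.
function summarizeMetrics(
  results: ResultLike[] | null | undefined
): Record<string, MetricSummary> {
  const summary: Record<string, MetricSummary> = {};
  for (const result of results ?? []) {
    for (const [key, value] of Object.entries(result.metrics)) {
      if (typeof value !== 'number') continue; // skip non-numeric metrics
      const entry = summary[key];
      if (entry === undefined) {
        summary[key] = { min: value, max: value, sum: value, count: 1 };
      } else {
        entry.min = Math.min(entry.min, value);
        entry.max = Math.max(entry.max, value);
        entry.sum += value;
        entry.count += 1;
      }
    }
  }
  return summary;
}

const stats = summarizeMetrics([
  { metrics: { accuracy: 0.5, label: 'a' } },
  { metrics: { accuracy: 1 } },
]);
console.log(stats.accuracy.sum / stats.accuracy.count); // mean accuracy
```

The component could then render the "no results" message whenever the returned object has no keys.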

coderabbitai · 2025-03-16T08:06:16Z

.../src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/ModelReport.tsx

+                {Object.keys(results[0]?.metrics || {}).map(metric => (
+                  <th
+                    key={metric}
+                    style={{
+                      padding: '0.75rem',
+                      textAlign: 'left',
+                      borderBottom: '1px solid #eee',
+                    }}>
+                    {metric.charAt(0).toUpperCase() + metric.slice(1)}
+                  </th>
+                ))}


⚠️ Potential issue

Guard against undefined first result.

The table header rendering depends on the first result having metrics, but doesn't handle cases where results[0] might be undefined.

- {Object.keys(results[0]?.metrics || {}).map(metric => (
+ {results[0] && Object.keys(results[0].metrics || {}).map(metric => (
    <th
      key={metric}
      style={{
        padding: '0.75rem',
        textAlign: 'left',
        borderBottom: '1px solid #eee',
      }}>
      {metric.charAt(0).toUpperCase() + metric.slice(1)}
    </th>
  ))}

📝 Committable suggestion

Suggested change

- {Object.keys(results[0]?.metrics || {}).map(metric => (
+ {results[0] && Object.keys(results[0].metrics || {}).map(metric => (
    <th
      key={metric}
      style={{
        padding: '0.75rem',
        textAlign: 'left',
        borderBottom: '1px solid #eee',
      }}>
      {metric.charAt(0).toUpperCase() + metric.slice(1)}
    </th>
  ))}
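
A related option, sketched here under assumptions, is to derive the header columns from every result rather than only the first, so a missing or sparse first entry cannot hide columns. metricColumns, toHeaderLabel, and ResultWithMetrics are hypothetical names used for illustration only.

```typescript
// Hypothetical, simplified result shape; metrics may be absent.
interface ResultWithMetrics {
  metrics?: Record<string, unknown>;
}

// Collects the union of metric names across all results, in first-seen order.
function metricColumns(
  results: ReadonlyArray<ResultWithMetrics | undefined>
): string[] {
  const seen = new Set<string>();
  for (const result of results) {
    for (const key of Object.keys(result?.metrics ?? {})) {
      seen.add(key);
    }
  }
  return Array.from(seen);
}

// Same capitalization rule as the header cell in the reviewed code.
function toHeaderLabel(metric: string): string {
  return metric.charAt(0).toUpperCase() + metric.slice(1);
}

console.log(metricColumns([{ metrics: { accuracy: 1 } }]).map(toHeaderLabel));
```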

coderabbitai · 2025-03-16T08:06:16Z

.../components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/DetailedResults.tsx

+                <tr key={prediction.sampleId}>
+                  <td
+                    style={{padding: '0.5rem', borderBottom: '1px solid #eee'}}>
+                    {'Unknown input'}


🛠️ Refactor suggestion

Replace hardcoded 'Unknown input' with actual input data.

The component displays 'Unknown input' rather than showing the actual input data from the sample.

  <td style={{padding: '0.5rem', borderBottom: '1px solid #eee'}}>
-   {'Unknown input'}
+   {prediction.input || 'Unknown input'}
  </td>

This assumes that the DetailedEvaluationResult type includes an input field in each prediction. If not, the type definition should be updated to include this information.

📝 Committable suggestion

Suggested change

  <td style={{padding: '0.5rem', borderBottom: '1px solid #eee'}}>
-   {'Unknown input'}
+   {prediction.input || 'Unknown input'}
  </td>
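
If input can be something other than a string, rendering it directly may not work: React cannot render a plain object as a child, and the || fallback would also treat a legitimate 0 or empty value as missing. A small formatter is one way around this. The sketch below assumes an optional input field of unknown type; PredictionLike and formatInput are hypothetical names, and the real DetailedEvaluationResult type may differ.

```typescript
// Hypothetical shape: the PR's prediction type may not define `input` at all.
interface PredictionLike {
  sampleId: string;
  input?: unknown;
}

// Converts a prediction's input into display text, with a fallback when absent.
function formatInput(prediction: PredictionLike): string {
  const { input } = prediction;
  if (input === undefined || input === null || input === '') {
    return 'Unknown input';
  }
  if (typeof input === 'string') {
    return input;
  }
  try {
    // JSON.stringify can return undefined at runtime (e.g. for functions).
    const serialized: string | undefined = JSON.stringify(input);
    return serialized ?? String(input);
  } catch {
    // Circular structures and BigInt values make JSON.stringify throw.
    return String(input);
  }
}

console.log(formatInput({ sampleId: 's1', input: { question: 'What is 2 + 2?' } }));
```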

coderabbitai · 2025-03-16T08:06:16Z

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/api.ts

+export const runEvaluation = async (
+  evaluationId: string,
+  modelId: string
+): Promise<EvaluationResult> => {
+  await new Promise(resolve => setTimeout(resolve, 2000));
+  const evaluation = MOCK_EVALUATIONS.find(e => e.id === evaluationId);
+  const model = MOCK_MODELS.find(m => m.id === modelId);
+
+  if (!evaluation || !model) {
+    throw new Error('Invalid evaluation or model');
+  }
+
+  return {
+    id: `result-${Date.now()}`,
+    evaluationDefinition: evaluation,
+    model,
+    metrics: {
+      accuracy: Math.random() * 0.2 + 0.8, // Random accuracy between 0.8 and 1.0
+      f1_score: Math.random() * 0.2 + 0.8,
+    },
+    status: 'completed',
+    createdAt: new Date(),
+  };
+};


⚠️ Potential issue

Resolve mismatched evaluation lookups in runEvaluation.

Similarly, the code references MOCK_EVALUATIONS.find(e => e.id === evaluationId), but EvaluationDefinition defines objectId instead of id. These lookups are likely to fail. Align naming or references appropriately.
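
As an illustration of aligning the lookup with the interface, the sketch below matches evaluations on objectId and fails with a specific message when nothing is found. The mock shapes and the findOrThrow helper are hypothetical; in particular, whether the model type exposes id is an assumption.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface EvaluationDefinitionLike {
  objectId: string;
  displayName: string;
}

interface ModelLike {
  id: string; // assumption: the real Model type may use a different key
  name: string;
}

// Returns the first item matching the predicate, or throws a labelled error.
function findOrThrow<T>(
  items: readonly T[],
  matches: (item: T) => boolean,
  label: string
): T {
  const found = items.find(matches);
  if (found === undefined) {
    throw new Error(`Invalid ${label}`);
  }
  return found;
}

const mockEvaluations: EvaluationDefinitionLike[] = [
  { objectId: 'eval-1', displayName: 'Baseline' },
];
const mockModels: ModelLike[] = [{ id: 'model-1', name: 'Small model' }];

// Look up by objectId, the field the interface actually defines.
const evaluation = findOrThrow(mockEvaluations, e => e.objectId === 'eval-1', 'evaluation');
const model = findOrThrow(mockModels, m => m.id === 'model-1', 'model');
console.log(evaluation.displayName, model.name);
```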

coderabbitai · 2025-03-16T08:06:17Z

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/api.ts

+  const res = await client.objsQuery({
+    project_id: `${entity}/${project}`,
+    filter: {
+      base_object_classes: ['EvaluationResult'],
+      // evaluation_ref: evaluationRef,
+      // model_id: modelId,
+    },
+  });
+  return res.objs.map(obj => ({
+    id: obj.object_id,
+    status: obj.val.status,
+    created: new Date(obj.created_at),
+    accuracy: obj.val.accuracy,
+    f1Score: obj.val.f1_score,
+    latency: obj.val.latency,
+  }));
+};


🛠️ Refactor suggestion

Improve filtering logic in fetchModelResults.

The filter in fetchModelResults is commented out. If we truly need to narrow results by evaluationRef and modelId, consider implementing the filter as intended, or remove unused fields to avoid confusion.

  filter: {
    base_object_classes: ['EvaluationResult'],
-   // evaluation_ref: evaluationRef,
-   // model_id: modelId,
  },
  ...

📝 Committable suggestion

Suggested change

  const res = await client.objsQuery({
    project_id: `${entity}/${project}`,
    filter: {
      base_object_classes: ['EvaluationResult'],
-     // evaluation_ref: evaluationRef,
-     // model_id: modelId,
    },
  });
  return res.objs.map(obj => ({
    id: obj.object_id,
    status: obj.val.status,
    created: new Date(obj.created_at),
    accuracy: obj.val.accuracy,
    f1Score: obj.val.f1_score,
    latency: obj.val.latency,
  }));
  };
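
If the narrowing is still wanted and the query filter cannot express it, one fallback is to filter the returned objects on the client. The sketch below is an assumption-heavy illustration: it presumes the stored value carries evaluation_ref and model_id fields (names taken from the commented-out filter), and selectModelResults, StoredResult, and ModelResultRow are hypothetical.

```typescript
// Hypothetical shape of one object returned by the query.
interface StoredResult {
  object_id: string;
  created_at: string;
  val: {
    status: string;
    evaluation_ref?: string; // assumption: stored alongside the result
    model_id?: string; // assumption: stored alongside the result
    accuracy?: number;
  };
}

interface ModelResultRow {
  id: string;
  status: string;
  created: Date;
  accuracy: number | undefined;
}

// Keeps only results for the given evaluation and model, then maps to rows.
function selectModelResults(
  objs: readonly StoredResult[],
  evaluationRef: string,
  modelId: string
): ModelResultRow[] {
  return objs
    .filter(
      obj =>
        obj.val.evaluation_ref === evaluationRef &&
        obj.val.model_id === modelId
    )
    .map(obj => ({
      id: obj.object_id,
      status: obj.val.status,
      created: new Date(obj.created_at),
      accuracy: obj.val.accuracy,
    }));
}

const rows = selectModelResults(
  [
    {
      object_id: 'r1',
      created_at: '2025-03-16T08:06:17Z',
      val: { status: 'completed', evaluation_ref: 'e1', model_id: 'm1', accuracy: 0.9 },
    },
  ],
  'e1',
  'm1'
);
console.log(rows.length);
```

Client-side filtering transfers more data than a server-side filter would, so it is best treated as a stopgap.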

coderabbitai · 2025-03-16T08:06:17Z

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/api.ts

+export const createEvaluation = async (
+  name: string,
+  datasetId: string,
+  scorerIds: string[]
+): Promise<EvaluationDefinition> => {
+  await new Promise(resolve => setTimeout(resolve, 1000));
+  const dataset = MOCK_DATASETS.find(d => d.id === datasetId);
+  const scorers = MOCK_SCORERS.filter(s => scorerIds.includes(s.id));
+
+  if (!dataset || scorers.length === 0) {
+    throw new Error('Invalid dataset or scorers');
+  }


⚠️ Potential issue

Address mismatched fields in createEvaluation.

This code references dataset.id, but the Dataset interface does not have an id property (it uses name instead). This mismatch causes lookups to fail. We also assume that the dataset must exist and at least one scorer must match, but we’re searching empty arrays:
• MOCK_DATASETS.find(d => d.id === datasetId) will always yield undefined.
• MOCK_SCORERS is empty, so it never contains scorers to match.

To fix, align the interface or the mock usage:

- const dataset = MOCK_DATASETS.find(d => d.id === datasetId);
+ const dataset = MOCK_DATASETS.find(d => d.name === datasetId);
- const scorers = MOCK_SCORERS.filter(s => scorerIds.includes(s.id));
+ const scorers = MOCK_SCORERS.filter(s => scorerIds.includes(s.id || ''));

📝 Committable suggestion

Suggested change

  export const createEvaluation = async (
    name: string,
    datasetId: string,
    scorerIds: string[]
  ): Promise<EvaluationDefinition> => {
    await new Promise(resolve => setTimeout(resolve, 1000));
-   const dataset = MOCK_DATASETS.find(d => d.id === datasetId);
+   const dataset = MOCK_DATASETS.find(d => d.name === datasetId);
-   const scorers = MOCK_SCORERS.filter(s => scorerIds.includes(s.id));
+   const scorers = MOCK_SCORERS.filter(s => scorerIds.includes(s.id || ''));
    if (!dataset || scorers.length === 0) {
      throw new Error('Invalid dataset or scorers');
    }
    // remaining code...
  }
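
Because the combined error message does not say which lookup failed, it may help to resolve scorers separately and report the IDs that were not found. The following is a minimal sketch under assumptions: ScorerLike and resolveScorers are hypothetical, and the optional id mirrors the s.id || '' workaround in the diff above.

```typescript
// Hypothetical scorer shape with an optional id.
interface ScorerLike {
  id?: string;
  name: string;
}

// Splits the requested IDs into scorers that exist and IDs that do not.
function resolveScorers(
  available: readonly ScorerLike[],
  scorerIds: readonly string[]
): { found: ScorerLike[]; missing: string[] } {
  const byId = new Map<string, ScorerLike>();
  for (const scorer of available) {
    if (scorer.id !== undefined) {
      byId.set(scorer.id, scorer);
    }
  }
  const found: ScorerLike[] = [];
  const missing: string[] = [];
  for (const id of scorerIds) {
    const scorer = byId.get(id);
    if (scorer === undefined) {
      missing.push(id);
    } else {
      found.push(scorer);
    }
  }
  return { found, missing };
}

const resolved = resolveScorers([{ id: 's1', name: 'Exact match' }], ['s1', 's2']);
console.log(resolved.missing); // IDs that could not be resolved
```

With an empty mock list every requested ID lands in missing, which makes the always-failing case described above visible in the error message.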

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (6)

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvalStudioPage.tsx (4)
25-25: Remove unused import for CompareEvaluationsPageContent.

This import is never utilized in the component. Keeping unused imports can lead to confusion and minor performance overhead due to additional bundle size.

Consider removing it with a diff like:
- import {CompareEvaluationsPageContent} from '../../CompareEvaluationsPage/CompareEvaluationsPage';
🧰 Tools

🪛 GitHub Check: WeaveJS Lint and Compile

[warning] 25-25:
'CompareEvaluationsPageContent' is defined but never used

47-47: Remove unused import for ModelReport.

This import also remains unused, similar to CompareEvaluationsPageContent. Cleaning it up improves overall clarity.
- import {ModelReport} from './ModelReport';
🧰 Tools

🪛 GitHub Check: WeaveJS Lint and Compile

[warning] 47-47:
'ModelReport' is defined but never used

935-1019: Remove unnecessary dependencies from useMemo.

The hook currently includes selectedMetrics and setSelectedMetrics in its dependency array, but neither is referenced inside the memoized callback. Unneeded dependencies force availableTabs to be recomputed whenever they change, which defeats the purpose of the memoization. Consider removing them from the array.
  const availableTabs = useMemo(() => {
    const tabs: TabConfig[] = [
      ...
    ];
    return tabs;
- }, [selectedDataset, selectedEvaluation, selectedRun, objectVersion.result, loadingRuns, entity, project, selectedMetrics, setSelectedMetrics]);
+ }, [selectedDataset, selectedEvaluation, selectedRun, objectVersion.result, loadingRuns, entity, project]);
🧰 Tools

🪛 GitHub Check: WeaveJS Lint and Compile

[warning] 1019-1019:
React Hook useMemo has unnecessary dependencies: 'selectedMetrics' and 'setSelectedMetrics'. Either exclude them or remove the dependency array

221-234: Consider using styled components or CSS classes for hover effects.

Inline styles modified via onMouseEnter and onMouseLeave can become cumbersome over time. Adopting CSS or styled components for hover states would help maintain separation of concerns.
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/RunReport.tsx (2)
1-1: Sort imports to match lint rules.

The lint warning suggests sorting imports for better consistency. You may run the autofix or manually reorder them according to your project's style conventions.

🧰 Tools

🪛 GitHub Check: WeaveJS Lint and Compile

[warning] 1-1:
Run autofix to sort these imports!

186-186: Remove unused variables outputKeys and flattenedOutput.

They are assigned but never used, introducing unnecessary clutter. Removing them clarifies the code.
- const outputKeys = getUniqueKeys(allOutputs);
...
- const flattenedOutput = flattenObject(predictOutput);
Also applies to: 200-200

🧰 Tools

🪛 GitHub Check: WeaveJS Lint and Compile

[warning] 186-186:
'outputKeys' is assigned a value but never used

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 794a33b and 89f168d.

📒 Files selected for processing (2)

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvalStudioPage.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/RunReport.tsx (1 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.{js,jsx,ts,tsx}: Focus on architectural and logical issues rather than style (assuming ESLint is in place).
Flag potential memory leaks and performance bottlenecks.
Check for proper error handling and async/await usage.
Avoid strict enforcement of try/catch blocks - accept Promise chains, early returns, and other clear error handling patterns. These are acceptable as long as they maintain clarity and predictability.
Ensure proper type usage in TypeScript files.
Look for security vulnerabilities in data handling.
Don't comment on formatting if prettier is configured.
Verify proper React hooks usage and component lifecycle.
Check for proper state management patterns.

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvalStudioPage.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/RunReport.tsx

🪛 GitHub Check: WeaveJS Lint and Compile

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/EvalStudioPage.tsx

[warning] 25-25:
'CompareEvaluationsPageContent' is defined but never used

[warning] 47-47:
'ModelReport' is defined but never used

[warning] 1019-1019:
React Hook useMemo has unnecessary dependencies: 'selectedMetrics' and 'setSelectedMetrics'. Either exclude them or remove the dependency array

weave-js/src/components/PagePanelComponents/Home/Browse3/pages/EvalStudio/components/RunReport.tsx

[warning] 1-1:
Run autofix to sort these imports!

[warning] 186-186:
'outputKeys' is assigned a value but never used

[warning] 200-200:
'flattenedOutput' is assigned a value but never used

⏰ Context from checks skipped due to timeout of 90000ms (933)

GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, scorers)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, notdiamond)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, litellm)
GitHub Check: Trace nox tests (3, 13, langchain)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, google_ai_studio)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, dspy)
GitHub Check: Trace nox tests (3, 13, cohere)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, bedrock)
GitHub Check: Trace nox tests (3, 13, anthropic)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Trace nox tests (3, 9, instructor)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Trace nox tests (3, 9, instructor)
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, notdiamond)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, litellm)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, google_ai_studio)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, bedrock)
GitHub Check: Trace nox tests (3, 13, anthropic)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, notdiamond)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, litellm)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, google_ai_studio)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, anthropic)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: notify-wandb-core
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, litellm)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, google_ai_studio)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (1)

weave/trace_server/clickhouse_trace_server_batched.py (1)
225-234: ⚠️ Potential issue

Avoid returning the raw database password from model_dump.

Exposing credentials (even just in memory or logs) can pose a security risk. Consider masking the password or removing the field entirely if not strictly necessary.
-            "password": self._password,
+            "password": "*****",  # or omit entirely
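
A minimal sketch of the masking approach, assuming a hypothetical `ConnectionSettings` dataclass as a stand-in for the real server object. The class name, its fields, and the `REDACTED` constant are illustrative and not taken from the PR.

```python
from dataclasses import asdict, dataclass

REDACTED = "*****"  # placeholder emitted instead of the real secret


@dataclass
class ConnectionSettings:
    """Hypothetical stand-in for the server's connection parameters."""

    host: str
    port: int
    user: str
    password: str

    def model_dump(self) -> dict:
        """Return the settings as a dict with the password masked."""
        data = asdict(self)
        data["password"] = REDACTED
        return data


settings = ConnectionSettings("localhost", 8123, "default", "s3cret")
dumped = settings.model_dump()
```

If the dumped dict is later used to rebuild a connection, masking would break that path, so the secret would then need to travel through a separate channel.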

🧹 Nitpick comments (1)

weave/trace_server/clickhouse_trace_server_batched.py (1)
599-615: Consider moving imports to the module level.

Imports inside functions make dependencies harder to see and add a small per-call cost: the import statement runs on every call, although Python serves it from the `sys.modules` cache after the first load. Unless the local import is needed to avoid a circular import, consider moving it to the top of the file.
+from weave.trace_server.server_side_object_saver import RunAsUser
+
 def obj_create(self, req: tsi.ObjCreateReq) -> tsi.ObjCreateRes:
     from weave.builtin_objects.builtin_registry import get_builtin

     if req.obj.builtin_object_class is not None:
-        from weave.trace_server.server_side_object_saver import RunAsUser
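
A small illustration of the trade-off, using the standard-library `json` module as a stand-in for the real dependency. The function names are hypothetical. Both styles end up using the same cached module object, and the difference is only where the import statement runs.

```python
import json as module_level_json  # resolved once, when this file is imported
import sys


def dumps_with_local_import(value):
    # The import statement runs on every call; after the first load it is
    # a lookup in the sys.modules cache rather than a fresh module load.
    import json

    return json.dumps(value)


def dumps_with_module_import(value):
    # Uses the name bound at import time; no import work per call.
    return module_level_json.dumps(value)


first = dumps_with_local_import({"a": 1})
second = dumps_with_module_import({"a": 1})
same_module = sys.modules["json"] is module_level_json
```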

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 89f168d and 566b9bb.

📒 Files selected for processing (3)

pyproject.toml (1 hunks)
weave/trace/weave_client.py (6 hunks)
weave/trace_server/clickhouse_trace_server_batched.py (4 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

pyproject.toml

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py: Focus on pythonic code patterns.
Check for proper exception handling.
Verify type hints usage where applicable.
Look for potential performance improvements.
Don't comment on formatting if black/isort is configured.
Check for proper dependency injection patterns.
Verify proper async handling if applicable.

weave/trace/weave_client.py
weave/trace_server/clickhouse_trace_server_batched.py

⏰ Context from checks skipped due to timeout of 90000ms (890)

GitHub Check: notify-wandb-core
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, litellm)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, litellm)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, litellm)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, litellm)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, litellm)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, litellm)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)
GitHub Check: Legacy (Query Service) Python unit tests (1)
GitHub Check: Legacy (Query Service) Python unit tests (0)
GitHub Check: WeaveJS Lint and Compile
GitHub Check: Trace nox tests (3, 13, pandas-test)
GitHub Check: Trace nox tests (3, 13, vertexai)
GitHub Check: Trace nox tests (3, 13, openai)
GitHub Check: Trace nox tests (3, 13, mistral1)
GitHub Check: Trace nox tests (3, 13, mistral0)
GitHub Check: Trace nox tests (3, 13, llamaindex)
GitHub Check: Trace nox tests (3, 13, litellm)
GitHub Check: Trace nox tests (3, 13, instructor)
GitHub Check: Trace nox tests (3, 13, huggingface)
GitHub Check: Trace nox tests (3, 13, groq)
GitHub Check: Trace nox tests (3, 13, cerebras)
GitHub Check: Trace nox tests (3, 13, trace_server)
GitHub Check: Trace nox tests (3, 13, trace)
GitHub Check: Trace nox tests (3, 12, pandas-test)
GitHub Check: Trace nox tests (3, 12, scorers)
GitHub Check: Trace nox tests (3, 12, vertexai)
GitHub Check: Trace nox tests (3, 12, openai)
GitHub Check: Trace nox tests (3, 12, notdiamond)
GitHub Check: Trace nox tests (3, 12, mistral1)
GitHub Check: Trace nox tests (3, 12, mistral0)
GitHub Check: Trace nox tests (3, 12, llamaindex)
GitHub Check: Trace nox tests (3, 12, litellm)
GitHub Check: Trace nox tests (3, 12, langchain)
GitHub Check: Trace nox tests (3, 12, instructor)
GitHub Check: Trace nox tests (3, 12, google_ai_studio)
GitHub Check: Trace nox tests (3, 12, huggingface)
GitHub Check: Trace nox tests (3, 12, groq)
GitHub Check: Trace nox tests (3, 12, dspy)
GitHub Check: Trace nox tests (3, 12, cohere)
GitHub Check: Trace nox tests (3, 12, cerebras)
GitHub Check: Trace nox tests (3, 12, bedrock)
GitHub Check: Trace nox tests (3, 12, anthropic)
GitHub Check: Trace nox tests (3, 12, trace_server)
GitHub Check: Trace nox tests (3, 12, trace)
GitHub Check: Trace nox tests (3, 11, pandas-test)
GitHub Check: Trace nox tests (3, 11, scorers)
GitHub Check: Trace nox tests (3, 11, vertexai)
GitHub Check: Trace nox tests (3, 11, openai)
GitHub Check: Trace nox tests (3, 11, notdiamond)
GitHub Check: Trace nox tests (3, 11, mistral1)
GitHub Check: Trace nox tests (3, 11, mistral0)
GitHub Check: Trace nox tests (3, 11, llamaindex)
GitHub Check: Trace nox tests (3, 11, litellm)
GitHub Check: Trace nox tests (3, 11, langchain)
GitHub Check: Trace nox tests (3, 11, instructor)
GitHub Check: Trace nox tests (3, 11, google_ai_studio)
GitHub Check: Trace nox tests (3, 11, huggingface)
GitHub Check: Trace nox tests (3, 11, groq)
GitHub Check: Trace nox tests (3, 11, dspy)
GitHub Check: Trace nox tests (3, 11, cohere)
GitHub Check: Trace nox tests (3, 11, cerebras)
GitHub Check: Trace nox tests (3, 11, bedrock)
GitHub Check: Trace nox tests (3, 11, anthropic)
GitHub Check: Trace nox tests (3, 11, trace_server)
GitHub Check: Trace nox tests (3, 11, trace)
GitHub Check: Trace nox tests (3, 10, pandas-test)
GitHub Check: Trace nox tests (3, 10, scorers)
GitHub Check: Trace nox tests (3, 10, vertexai)
GitHub Check: Trace nox tests (3, 10, openai)
GitHub Check: Trace nox tests (3, 10, notdiamond)
GitHub Check: Trace nox tests (3, 10, mistral1)
GitHub Check: Trace nox tests (3, 10, mistral0)
GitHub Check: Trace nox tests (3, 10, llamaindex)
GitHub Check: Trace nox tests (3, 10, litellm)
GitHub Check: Trace nox tests (3, 10, langchain)
GitHub Check: Trace nox tests (3, 10, instructor)
GitHub Check: Trace nox tests (3, 10, google_ai_studio)
GitHub Check: Trace nox tests (3, 10, huggingface)
GitHub Check: Trace nox tests (3, 10, groq)
GitHub Check: Trace nox tests (3, 10, dspy)
GitHub Check: Trace nox tests (3, 10, cohere)
GitHub Check: Trace nox tests (3, 10, cerebras)
GitHub Check: Trace nox tests (3, 10, bedrock)
GitHub Check: Trace nox tests (3, 10, anthropic)
GitHub Check: Trace nox tests (3, 10, trace_server)
GitHub Check: Trace nox tests (3, 10, trace)
GitHub Check: Trace nox tests (3, 9, pandas-test)
GitHub Check: Trace nox tests (3, 9, scorers)
GitHub Check: Trace nox tests (3, 9, vertexai)
GitHub Check: Trace nox tests (3, 9, openai)
GitHub Check: Trace nox tests (3, 9, notdiamond)
GitHub Check: Trace nox tests (3, 9, mistral1)
GitHub Check: Trace nox tests (3, 9, mistral0)
GitHub Check: Trace nox tests (3, 9, llamaindex)
GitHub Check: Trace nox tests (3, 9, litellm)
GitHub Check: Trace nox tests (3, 9, langchain)

🔇 Additional comments (6)

weave/trace/weave_client.py (5)

127-127: Import addition looks good.

Adding the Scorer import for type hinting is appropriate and necessary for the changes made in this file.

499-504: Well-structured new dataclass for scorer results.

The new ClientApplyScorerResult dataclass effectively encapsulates the result of applying a scorer to a call, with appropriate typing for the result, score_call, and optional feedback_id.
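
As a rough illustration of the shape described above, a dataclass along these lines would carry the three fields. This is a minimal sketch: the field names come from the review comment, but the field types (`Any` here) and the example values are assumptions, not the PR's actual definitions.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ClientApplyScorerResult:
    """Sketch of the result returned after applying a scorer to a call."""

    result: Any  # the scorer's output (type assumed)
    score_call: Any  # the call record produced by running the scorer (type assumed)
    feedback_id: Optional[str] = None  # None when no feedback was recorded


# Hypothetical values, for illustration only.
res = ClientApplyScorerResult(result={"score": 1.0}, score_call="call-123")
print(res.feedback_id)
```

Defaulting `feedback_id` to `None` lets callers construct the same type whether or not feedback was written.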

661-661: Return type annotation update is consistent with implementation changes.

The method signature has been correctly updated to return the newly created ClientApplyScorerResult type.

694-696: Input extraction logic improvement.

The approach for extracting model inputs has been refined to explicitly exclude 'self' and 'kwargs' keys.
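
The filtering described here might look roughly like the following. The function name `extract_model_inputs` and the sample inputs are hypothetical; the PR's actual lines 694-696 are not shown in this thread, so treat this as a sketch of the idea only.

```python
from typing import Any


def extract_model_inputs(call_inputs: dict[str, Any]) -> dict[str, Any]:
    """Return the call inputs without the 'self' and 'kwargs' keys."""
    excluded = {"self", "kwargs"}
    return {k: v for k, v in call_inputs.items() if k not in excluded}


# Hypothetical call inputs, for illustration only.
inputs = {"self": "<model>", "question": "2 + 2?", "kwargs": {}}
print(extract_model_inputs(inputs))  # {'question': '2 + 2?'}
```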

714-723: Return structure enhancement with proper error handling.

The implementation now correctly returns a ClientApplyScorerResult instance in both paths (with and without a feedback ID), ensuring consistent return types and proper error handling.

The method handles the case where no WeaveClient is available by returning a result with a None feedback_id, which is a good defensive programming practice.

weave/trace_server/clickhouse_trace_server_batched.py (1)

1722-1734: Handle method invocation errors for call_method.

The # TODO: handle errors here note highlights a missing error-handling strategy. Wrap runner.run_call_method in a try-except block, returning appropriate error responses or raising well-defined exceptions instead of relying on unhandled runtime errors.
     runner = RunAsUser(ch_server_dump=self.model_dump())
-    # TODO: handle errors here
-    res = runner.run_call_method(
-        req.object_ref, req.project_id, req.wb_user_id, req.method_name, req.args
-    )
-    return tsi.CallMethodRes.model_validate(res)
+    try:
+        res = runner.run_call_method(
+            req.object_ref, req.project_id, req.wb_user_id, req.method_name, req.args
+        )
+        return tsi.CallMethodRes.model_validate(res)
+    except Exception as e:
+        # Handle specific exceptions with appropriate responses
+        logger.error(f"Error calling method {req.method_name}: {str(e)}")
+        raise
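
One way to read "raising well-defined exceptions" is to wrap the failure in a dedicated error type while keeping the original cause attached. The sketch below is an assumption about how that could look; `MethodCallError` and `call_with_handling` are hypothetical names that do not appear in the PR.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


class MethodCallError(RuntimeError):
    """Hypothetical error type raised when a remote method call fails."""


def call_with_handling(method_name: str, fn: Callable[[], Any]) -> Any:
    """Run fn, logging any failure and re-raising it as MethodCallError."""
    try:
        return fn()
    except Exception as e:
        logger.error("Error calling method %s: %s", method_name, e)
        # Chain the original exception so the root cause stays visible.
        raise MethodCallError(f"call_method failed for {method_name!r}") from e


def boom() -> None:
    raise KeyError("missing")


try:
    call_with_handling("predict", boom)
except MethodCallError as err:
    caught = err
```

Callers can then catch one error type at the API boundary and still inspect `__cause__` when debugging.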

coderabbitai · 2025-03-17T20:19:09Z

weave/trace_server/clickhouse_trace_server_batched.py

+    async def evaluate_stream(
+        self, req: tsi.EvaluateReq
+    ) -> AsyncIterator[tsi.EvaluateStepRes]:
+        from weave.trace_server.server_side_object_saver import RunAsUser
+
+        runner = RunAsUser(ch_server_dump=self.model_dump())
+
+        return runner.run_evaluate_stream(req)
+


⚠️ Potential issue

Add error handling and user validation to evaluate_stream.

This async method lacks both user ID validation and error handling, similar to the issues identified in score_call. Proper error handling is especially important in async methods to prevent unhandled exceptions from breaking the async flow.

     async def evaluate_stream(
         self, req: tsi.EvaluateReq
     ) -> AsyncIterator[tsi.EvaluateStepRes]:
         from weave.trace_server.server_side_object_saver import RunAsUser
 
+        if req.wb_user_id is None:
+            raise ValueError("User ID is required")
+
         runner = RunAsUser(ch_server_dump=self.model_dump())
 
-        return runner.run_evaluate_stream(req)
+        try:
+            return await runner.run_evaluate_stream(req)
+        except Exception as e:
+            logger.error(f"Error during evaluate_stream: {str(e)}")
+            raise

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    async def evaluate_stream(
-        self, req: tsi.EvaluateReq
-    ) -> AsyncIterator[tsi.EvaluateStepRes]:
-        from weave.trace_server.server_side_object_saver import RunAsUser
-
-        runner = RunAsUser(ch_server_dump=self.model_dump())
-
-        return runner.run_evaluate_stream(req)
+    async def evaluate_stream(
+        self, req: tsi.EvaluateReq
+    ) -> AsyncIterator[tsi.EvaluateStepRes]:
+        from weave.trace_server.server_side_object_saver import RunAsUser
+
+        if req.wb_user_id is None:
+            raise ValueError("User ID is required")
+
+        runner = RunAsUser(ch_server_dump=self.model_dump())
+
+        try:
+            return await runner.run_evaluate_stream(req)
+        except Exception as e:
+            logger.error(f"Error during evaluate_stream: {str(e)}")
+            raise
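
One caveat about the suggestion above: if `run_evaluate_stream` produces an async iterator, a `try`/`except` around the `return` only covers creating the iterator, not errors raised while it is consumed. The sketch below wraps the iteration itself. `fake_stream`, `logged_stream`, and `collect` are hypothetical stand-ins, and whether `run_evaluate_stream` is an async generator is an assumption, since its body is not shown here.

```python
import asyncio
import logging
from typing import AsyncIterator

logger = logging.getLogger(__name__)


async def fake_stream() -> AsyncIterator[int]:
    """Hypothetical stand-in for a streaming evaluation that fails midway."""
    yield 1
    yield 2
    raise RuntimeError("step failed")


async def logged_stream(source: AsyncIterator[int]) -> AsyncIterator[int]:
    """Re-yield items from source, logging errors raised during iteration."""
    try:
        async for item in source:
            yield item
    except Exception as e:
        logger.error("Error during evaluate_stream: %s", e)
        raise


async def collect() -> tuple[list[int], str]:
    """Consume the wrapped stream and report what arrived before the failure."""
    seen: list[int] = []
    try:
        async for item in logged_stream(fake_stream()):
            seen.append(item)
    except RuntimeError as e:
        return seen, str(e)
    return seen, ""


print(asyncio.run(collect()))  # ([1, 2], 'step failed')
```

Because the wrapper is itself an async generator, the error is logged at the point where it actually occurs and then re-raised to the consumer.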

coderabbitai · 2025-03-17T20:19:09Z

weave/trace_server/clickhouse_trace_server_batched.py

+        from weave.builtin_objects.builtin_registry import get_builtin
+
+        if req.obj.builtin_object_class is not None:
+            from weave.trace_server.server_side_object_saver import RunAsUser
+
+            object_class_type = get_builtin(req.obj.builtin_object_class)
+            if object_class_type is None:
+                raise ValueError(
+                    f"Unknown builtin object class: {req.obj.builtin_object_class}"
+                )
+
+            # PROBLEM: If req.obj.val contains refs, this is going to break due to
+            # pydantic validation. No clear solution.
+            new_obj = object_class_type.model_validate(req.obj.val, strict=False)
+            runner = RunAsUser(ch_server_dump=self.model_dump())
+            digest = runner.run_save_object(
+                new_obj, req.obj.project_id, req.obj.object_id, None
+            )
+            return tsi.ObjCreateRes(digest=digest)
+


⚠️ Potential issue

Handle the known issue with references in req.obj.val.

The code comment indicates an unresolved issue with Pydantic validation when req.obj.val contains refs. This could lead to runtime failures in production.

-            # PROBLEM: If req.obj.val contains refs, this is going to break due to
-            # pydantic validation. No clear solution.
+            # Handle the case where req.obj.val contains refs
             new_obj = object_class_type.model_validate(req.obj.val, strict=False)

Consider implementing a pre-processing step to handle references before validation, or add explicit error handling to gracefully manage this scenario.
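
A pre-processing step of this kind could start by locating ref-like strings in the payload, so the server can either resolve them or fail with a clear message before validation. This is a minimal sketch under stated assumptions: the `REF_PREFIXES` values are an assumed URI scheme, and `find_refs` and the sample payload are hypothetical and not part of the PR.

```python
from typing import Any, Iterator

# Assumed ref URI prefixes; adjust to whatever scheme the server really uses.
REF_PREFIXES = ("weave:///", "weave-trace-internal:///")


def find_refs(val: Any, path: str = "$") -> Iterator[tuple[str, str]]:
    """Yield (path, ref) pairs for every ref-like string nested in val."""
    if isinstance(val, str):
        if val.startswith(REF_PREFIXES):
            yield path, val
    elif isinstance(val, dict):
        for key, item in val.items():
            yield from find_refs(item, f"{path}.{key}")
    elif isinstance(val, (list, tuple)):
        for i, item in enumerate(val):
            yield from find_refs(item, f"{path}[{i}]")


# Hypothetical payload, for illustration only.
payload = {"name": "judge", "model": "weave:///entity/project/object/m:abc123"}
refs = list(find_refs(payload))
print(refs)  # [('$.model', 'weave:///entity/project/object/m:abc123')]
```

Reporting the path alongside each ref makes it possible to name the offending field in the error message.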

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-        from weave.builtin_objects.builtin_registry import get_builtin
-
-        if req.obj.builtin_object_class is not None:
-            from weave.trace_server.server_side_object_saver import RunAsUser
-
-            object_class_type = get_builtin(req.obj.builtin_object_class)
-            if object_class_type is None:
-                raise ValueError(
-                    f"Unknown builtin object class: {req.obj.builtin_object_class}"
-                )
-
-            # PROBLEM: If req.obj.val contains refs, this is going to break due to
-            # pydantic validation. No clear solution.
-            new_obj = object_class_type.model_validate(req.obj.val, strict=False)
-            runner = RunAsUser(ch_server_dump=self.model_dump())
-            digest = runner.run_save_object(
-                new_obj, req.obj.project_id, req.obj.object_id, None
-            )
-            return tsi.ObjCreateRes(digest=digest)
+        from weave.builtin_objects.builtin_registry import get_builtin
+
+        if req.obj.builtin_object_class is not None:
+            from weave.trace_server.server_side_object_saver import RunAsUser
+
+            object_class_type = get_builtin(req.obj.builtin_object_class)
+            if object_class_type is None:
+                raise ValueError(
+                    f"Unknown builtin object class: {req.obj.builtin_object_class}"
+                )
+
+            # Handle the case where req.obj.val contains refs
+            new_obj = object_class_type.model_validate(req.obj.val, strict=False)
+            runner = RunAsUser(ch_server_dump=self.model_dump())
+            digest = runner.run_save_object(
+                new_obj, req.obj.project_id, req.obj.object_id, None
+            )
+            return tsi.ObjCreateRes(digest=digest)

coderabbitai · 2025-03-17T20:19:09Z

weave/trace_server/clickhouse_trace_server_batched.py

+    def score_call(self, req: tsi.ScoreCallReq) -> tsi.ScoreCallRes:
+        from weave.trace_server.server_side_object_saver import RunAsUser
+
+        runner = RunAsUser(ch_server_dump=self.model_dump())
+        res = runner.run_score_call(req)
+
+        return tsi.ScoreCallRes(
+            feedback_id=res["feedback_id"],
+            score_call=self.call_read(
+                tsi.CallReadReq(project_id=req.project_id, id=res["scorer_call_id"])
+            ).call,
+        )
+


⚠️ Potential issue

Add error handling and user ID validation to score_call.

Unlike call_method, this method doesn't validate req.wb_user_id and lacks error handling for the call to runner.run_score_call. This inconsistency could lead to cryptic errors or security issues.

     def score_call(self, req: tsi.ScoreCallReq) -> tsi.ScoreCallRes:
         from weave.trace_server.server_side_object_saver import RunAsUser
 
+        if req.wb_user_id is None:
+            raise ValueError("User ID is required")
+
         runner = RunAsUser(ch_server_dump=self.model_dump())
-        res = runner.run_score_call(req)
-
-        return tsi.ScoreCallRes(
-            feedback_id=res["feedback_id"],
-            score_call=self.call_read(
-                tsi.CallReadReq(project_id=req.project_id, id=res["scorer_call_id"])
-            ).call,
-        )
+        try:
+            res = runner.run_score_call(req)
+            return tsi.ScoreCallRes(
+                feedback_id=res["feedback_id"],
+                score_call=self.call_read(
+                    tsi.CallReadReq(project_id=req.project_id, id=res["scorer_call_id"])
+                ).call,
+            )
+        except Exception as e:
+            logger.error(f"Error during score_call: {str(e)}")
+            raise

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    def score_call(self, req: tsi.ScoreCallReq) -> tsi.ScoreCallRes:
-        from weave.trace_server.server_side_object_saver import RunAsUser
-
-        runner = RunAsUser(ch_server_dump=self.model_dump())
-        res = runner.run_score_call(req)
-
-        return tsi.ScoreCallRes(
-            feedback_id=res["feedback_id"],
-            score_call=self.call_read(
-                tsi.CallReadReq(project_id=req.project_id, id=res["scorer_call_id"])
-            ).call,
-        )
+    def score_call(self, req: tsi.ScoreCallReq) -> tsi.ScoreCallRes:
+        from weave.trace_server.server_side_object_saver import RunAsUser
+
+        if req.wb_user_id is None:
+            raise ValueError("User ID is required")
+
+        runner = RunAsUser(ch_server_dump=self.model_dump())
+        try:
+            res = runner.run_score_call(req)
+            return tsi.ScoreCallRes(
+                feedback_id=res["feedback_id"],
+                score_call=self.call_read(
+                    tsi.CallReadReq(project_id=req.project_id, id=res["scorer_call_id"])
+                ).call,
+            )
+        except Exception as e:
+            logger.error(f"Error during score_call: {str(e)}")
+            raise
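
Since the same user ID check is suggested for both `score_call` and `evaluate_stream`, it could be factored into a small helper. The sketch below is an assumption about how that might look; `FakeReq` is a hypothetical stand-in for the real request types and `require_user_id` does not appear in the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeReq:
    """Hypothetical stand-in for a request carrying an optional user ID."""

    project_id: str
    wb_user_id: Optional[str] = None


def require_user_id(req: FakeReq) -> str:
    """Return the request's user ID, or raise ValueError if it is missing."""
    if req.wb_user_id is None:
        raise ValueError("User ID is required")
    return req.wb_user_id


# Hypothetical requests, for illustration only.
ok_user = require_user_id(FakeReq(project_id="proj", wb_user_id="user-1"))
try:
    require_user_id(FakeReq(project_id="proj"))
    missing_error = ""
except ValueError as e:
    missing_error = str(e)
```

Returning the narrowed `str` also helps type checkers, since later code no longer has to handle `None`.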

tssweeney added 2 commits March 14, 2025 19:10

Initialization - brining over prior art

3983d76

Initialization - brining over prior art

2fa60e9

coderabbitai bot reviewed Mar 15, 2025

getting the basics going

ce707ac

coderabbitai bot reviewed Mar 15, 2025

tssweeney added 4 commits March 15, 2025 10:56

Some initial cursor scaffolding

c6a95c8

Lint

c7981cf

Lint

57906a3

Lint

eed3f00

coderabbitai bot reviewed Mar 15, 2025

tssweeney added 3 commits March 15, 2025 11:59

ok, initial model definition working

e0fe18b

Probably a lost of natyness

ce9aa1d

Ok getting the basics complete

7b593dc

coderabbitai bot reviewed Mar 15, 2025

tssweeney added 3 commits March 15, 2025 14:18

Little cleanup

a1e01a0

checkpoint

4f3d9c3

checkpoint

8f1c172

coderabbitai bot reviewed Mar 16, 2025

tssweeney added 6 commits March 15, 2025 17:47

starting the UI

03f48db

Checkpoint before bigger refactor

0895cef

Complete UI revamp

d905ea2

Ok, rudamentary navigation complete

6c7a80a

a bunch more stuff

a645b63

Checkpoint

794a33b

coderabbitai bot reviewed Mar 16, 2025

Checkpoint

89f168d

coderabbitai bot reviewed Mar 16, 2025

Merge branch 'master' into tim/on_demand_evals

566b9bb

coderabbitai bot reviewed Mar 17, 2025

def get_builtin(name: str) -> type[weave.Object]:
    return _BUILTIN_REGISTRY[name]

Pattern                 Use Case                Complexity  Key Benefit
3-Layer Architecture    Large-scale forms       High        Clear separation of concerns
Controlled Components   Real-time validation    Medium      Granular state control
React Hook Form         Rapid development       Low         Built-in validation/optimization
Compound Components     Reusable form systems   Medium      Shared state management

		assert isinstance(int_ref, InternalObjectRef)
		ref = ObjectRef(

		assert project_id.startswith("_SERVER_/")
		return project_id[len("_SERVER_/") :]

-        assert project_id.startswith("_SERVER_/")
-        return project_id[len("_SERVER_/") :]
+        prefix = "_SERVER_/"
+        if not project_id.startswith(prefix):
+            raise ValueError(f"Project ID must start with '{prefix}'")
+        return project_id[len(prefix):]

		"server_client = RemoteHTTPTraceServer(\"http://127.0.01:6345\")\n",
		"server_client.set_auth((\"\", wandb_context.api_key))"

-    "server_client = RemoteHTTPTraceServer(\"http://127.0.01:6345\")\n",
-    "server_client.set_auth((\"\", wandb_context.api_key))"
+"server_client = RemoteHTTPTraceServer(\"http://127.0.0.1:6345\")\n",
+"# Use environment variables for credentials or a more secure authentication method\n",
+"import os\n",
+"api_key = os.environ.get(\"WANDB_API_KEY\", wandb_context.api_key)\n",
+"server_client.set_auth((\"\", api_key))"

-      datasetRef: 'dataset1', // Would come from context
+// At the top of the file, add the import:
+import { useEvalStudio } from '../../context';
+...
+const NewEvaluationForm = (props) => {
+  // Retrieve selectedDatasetVersion from context
+  const { selectedDatasetVersion } = useEvalStudio();
+  // ... other component logic
+  const handleSubmit = () => {
+    const newEvaluation: Evaluation = {
+      entity: 'wandb',
+      project: 'eval-studio',
+      objectId: Math.random().toString(36).substr(2, 9),
+      objectDigest: Math.random().toString(36).substr(2, 9),
+      evaluationRef: Math.random().toString(36).substr(2, 9),
+      displayName: name,
+      createdAt: new Date(),
+      // Retrieve datasetRef from context instead of hardcoding
+      datasetRef: selectedDatasetVersion?.id || '',
+      scorerRefs: selectedScorers,
+    };
+    // ... further handling logic
+  };
+  // ... rest of the component
+  return (
+    // ... JSX for the component
+  );
+};
+export default NewEvaluationForm;

-                    {'Unknown input'}
+<td style={{padding: '0.5rem', borderBottom: '1px solid #eee'}}>
+  {prediction.input || 'Unknown input'}
+</td>


