
Conversation

shaohuzhang1
Contributor

fix: Calling the model in non-stream mode cannot obtain the token usage

@shaohuzhang1 shaohuzhang1 merged commit bd589e5 into v2 Aug 6, 2025
3 of 4 checks passed
@shaohuzhang1 shaohuzhang1 deleted the pr@v2@fix_model_call branch August 6, 2025 03:00

f2c-ci-robot bot commented Aug 6, 2025

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

```diff
@@ -133,7 +134,7 @@ def _convert_chunk_to_generation_chunk(
         )

         usage_metadata: Optional[UsageMetadata] = (
-            _create_usage_metadata(token_usage) if token_usage else None
+            _create_usage_metadata(token_usage) if token_usage and token_usage.get("prompt_tokens") else None
         )
         if len(choices) == 0:
             # logprobs is implicitly None
```
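
A minimal sketch of how the new guard behaves (not the project's code; `_create_usage_metadata` below is a simplified stand-in, and `UsageMetadata` is the `langchain_core` TypedDict):

```python
from typing import Optional

from langchain_core.messages.ai import UsageMetadata


def _create_usage_metadata(token_usage: dict) -> UsageMetadata:
    # Simplified stand-in for the helper referenced in the diff above.
    input_tokens = token_usage.get("prompt_tokens", 0)
    output_tokens = token_usage.get("completion_tokens", 0)
    return UsageMetadata(
        input_tokens=input_tokens,
        output_tokens=output_tokens,
        total_tokens=token_usage.get("total_tokens", input_tokens + output_tokens),
    )


def build_usage(token_usage: Optional[dict]) -> Optional[UsageMetadata]:
    # Old condition was `if token_usage`: a usage block without a real
    # "prompt_tokens" count still produced metadata with zero counts.
    # New condition also requires a truthy "prompt_tokens" value.
    if token_usage and token_usage.get("prompt_tokens"):
        return _create_usage_metadata(token_usage)
    return None


print(build_usage({"prompt_tokens": 12, "completion_tokens": 34}))  # full metadata
print(build_usage({"prompt_tokens": 0}))                            # None
print(build_usage(None))                                            # None
```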
Contributor Author


Here are some suggestions to improve the readability and error handling of the provided code, and to address a few logical issues:

1. **Remove Redundant Error Handling**: The exception blocks in both `get_num_tokens_from_messages` and `get_num_tokens` seem unnecessary, given that the `_create_usage_metadata` function would handle exceptions. For example:

    # Sketch only: TokenizerManage is assumed to be importable from the
    # project's utilities; UsageMetadata is the langchain_core TypedDict.
    from typing import List, Union

    from langchain_core.messages.ai import UsageMetadata

    def get_sum_of_tokens_in_chat_history(messages: List[str]) -> int:
        try:
            tokenizer = TokenizerManage.get_tokenizer()
            return sum(len(tokenizer.encode(s)) for s in messages)
        except Exception as e:
            raise RuntimeError(f"Failed to count tokens: {e}")

    def get_number_of_tokens(input_text: str) -> int:
        try:
            tokenizer = TokenizerManage.get_tokenizer()
            return len(tokenizer.encode(input_text))
        except Exception as e:
            raise RuntimeError(f"Failed to count tokens: {e}")

    def set_usage_metadata(input_message: Union[str, List[str]], output_message: str = "") -> UsageMetadata:
        # Count tokens locally when the provider response carries no usage block.
        input_tokens = (
            get_sum_of_tokens_in_chat_history(input_message)
            if isinstance(input_message, list)
            else get_number_of_tokens(input_message)
        )
        output_tokens = get_number_of_tokens(output_message) if output_message else 0
        return UsageMetadata(
            input_tokens=input_tokens,
            output_tokens=output_tokens,
            total_tokens=input_tokens + output_tokens,
        )
2. **Optimize String Concatenation**: Repeated string concatenation with the `+` operator can be inefficient because strings are immutable, so each concatenation creates a new string object. Prefer `str.join()` or an f-string instead (see the short sketch after this list).

3. **Error Handling in Method Calls**: Ensure adequate error handling when calling methods like `TokenizerManage.get_tokenizer()` (as in the `try`/`except` blocks in the sketch under point 1).

4. **Type Annotations**: Consider adding type annotations to parameters and return values where possible to improve clarity and maintainability.

5. **Code Readability**: Refactor complex expressions into smaller, well-named functions for better organization, for example:

```python
def get_summary_statistics(all_scores):
    stats = {}

    def calculate_average(numbers):
        if not numbers:
            return 0
        return sum(numbers) / len(numbers)

    stats["average_score"] = calculate_average(all_scores)
    stats["max_score"] = max(all_scores, default=0)
    stats["min_score"] = min(all_scores, default=0)
    return stats


print(get_summary_statistics([]))  # replace [] with your data
```

This approach helps organize the code more effectively and makes it easier to follow the logic behind the different parts.
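
As referenced in point 2, a minimal sketch contrasting `+` concatenation with `str.join()` and an f-string (illustrative strings only):

```python
parts = ["prompt", "tokens", "counted"]

# Building a string with `+` in a loop creates a new string object on every pass.
joined = ""
for p in parts:
    joined = joined + " " + p

# Preferred: join once, then interpolate with an f-string if needed.
joined = " ".join(parts)
message = f"Result: {joined}"
print(message)
```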


f2c-ci-robot bot commented Aug 6, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
