Tags: jannikmaierhoefer/ragas
fix: pin langchain_core to <0.3 (explodinggradients#1329)

The new langchain v0.3 will break the current usage of metrics. The plan of action is as follows:

1. For ragas<0.2 we will pin langchain_core to <0.3.
2. For ragas>0.2 we will depend directly on pydantic>=2.

Fixes: explodinggradients#1328
fix: make score nested if loop_is_running (explodinggradients#1276)
feat: set and get prompts for metrics (explodinggradients#1259)

```python
from ragas.experimental.metrics._faithfulness import (
    FaithfulnessExperimental,
    LongFormAnswerPrompt,
)

faithfulness = FaithfulnessExperimental()
faithfulness.get_prompts()
# {'long_form_answer_prompt': <ragas.experimental.metrics._faithfulness.LongFormAnswerPrompt at 0x7fd7baa8efb0>,
#  'nli_statement_prompt': <ragas.experimental.metrics._faithfulness.NLIStatementPrompt at 0x7fd7baa8f010>}

long_form_prompt = LongFormAnswerPrompt()
long_form_prompt.instruction = "my new instruction"
prompts = {"long_form_answer_prompt": long_form_prompt}
faithfulness.set_prompts(**prompts)
```

Co-authored-by: Jithin James <[email protected]>
feat (experimental): added new prompt and metric into `ragas.experimental` (explodinggradients#1240)

You can use it like:

```py
from ragas import evaluate
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import faithfulness
from ragas.experimental.metrics import FaithfulnessExperimental

# gpt4o is a Langchain chat model and amnesty_qa is a loaded dataset (both assumed defined).
f = FaithfulnessExperimental(llm=LangchainLLMWrapper(gpt4o))
faithfulness.llm = LangchainLLMWrapper(gpt4o)

# Score a single row directly:
# row = amnesty_qa["eval"][0]
# await f.ascore(row)
# await faithfulness.ascore(row)

r = evaluate(
    amnesty_qa["eval"].select(range(10)),
    metrics=[f, faithfulness],
    raise_exceptions=True,
    callbacks=[],
)
```
[FIX] - Fix for summarization edge case (explodinggradients#1201)

This PR adds a fix for the issue mentioned in explodinggradients#1108. However, I have a few points to discuss @shahules786:

- I added `conciseness_score` to penalize long summaries, but I also do not want to reward very short, skimpy summaries; we need to find a middle ground.
- Is averaging a good way to combine `QA_score` and `conciseness_score`? (See the sketch after this entry.)
- Ranking-based metrics to measure the quality of summarization (as mentioned by shahul in the issue above).

Depending on the conclusions we reach on these points, I will push more commits; let's keep this PR open until they are resolved.

Co-authored-by: Shahules786 <[email protected]>
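For reference, a minimal sketch of what averaging `QA_score` and `conciseness_score` could look like; the function names and the length-ratio penalty below are illustrative assumptions, not the exact ragas implementation.

```python
# Minimal sketch (not the ragas implementation): combine a QA-based score
# with a length-based conciseness penalty by simple averaging.

def conciseness_score(summary: str, source_text: str) -> float:
    """Approach 1.0 for short summaries, 0.0 for summaries as long as the source."""
    if not source_text:
        return 0.0
    return max(0.0, 1.0 - len(summary) / len(source_text))


def summarization_score(qa_score: float, summary: str, source_text: str) -> float:
    """Average the QA score with the conciseness score."""
    return (qa_score + conciseness_score(summary, source_text)) / 2


# A summary that answers questions well (qa_score=0.9) but is half the
# length of the source scores (0.9 + 0.5) / 2 = 0.7.
print(summarization_score(0.9, "x" * 50, "x" * 100))
```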
Allow Metric.score to work within an existing asyncio loop (explodinggradients#1161)

I got errors when running metrics within an existing asyncio loop; the `asyncio.run` call fails in those cases. Submitting a bugfix.

Co-authored-by: jjmachan <[email protected]>
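For context, a minimal sketch of the underlying problem: `asyncio.run` raises a `RuntimeError` when called from a thread that already has a running event loop (for example, inside a Jupyter notebook). The scoring function below is a hypothetical stand-in, not the ragas code; it only illustrates detecting the running loop before falling back to `asyncio.run`.

```python
import asyncio


async def _ascore() -> float:
    # Stand-in for a metric's async scoring coroutine.
    return 1.0


def score():
    """Run _ascore() whether or not an event loop is already running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running (plain script): asyncio.run is safe here.
        return asyncio.run(_ascore())
    # A loop is already running (e.g. a notebook): asyncio.run would raise
    # "RuntimeError: asyncio.run() cannot be called from a running event loop",
    # so schedule the coroutine on the current loop and let the caller await it.
    return asyncio.create_task(_ascore())
```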
fix: temperature parameter in generate_text not ignored (explodinggradients#887)

Addresses explodinggradients#886. The temperature parameter is no longer overwritten by the call to `self.get_temperature(n=n)`; that method is now called only when no temperature was given.

Co-authored-by: Ivan Herreros <[email protected]>
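A minimal sketch of the fix pattern described above: derive a default temperature only when the caller did not supply one. The signature and the default heuristic below are illustrative assumptions, not the exact ragas API.

```python
from typing import Optional


def get_temperature(n: int) -> float:
    # Illustrative default: sample more freely when asking for several completions.
    return 0.3 if n > 1 else 1e-8


def generate_text(prompt: str, n: int = 1, temperature: Optional[float] = None) -> str:
    if temperature is None:
        # Before the fix, this call unconditionally replaced a caller-supplied value.
        temperature = get_temperature(n)
    return f"calling the LLM with n={n}, temperature={temperature}"
```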