Commit

Use default prompt with Vicuna
rlancemartin committed May 29, 2023
1 parent 65d38e0 commit 724dea0
Showing 3 changed files with 5 additions and 4 deletions.
2 changes: 1 addition & 1 deletion api/docs/gpt3/gpt3-eval.csv
@@ -3,4 +3,4 @@
"What is in-context learning?","In-context learning is an approach to meta-learning, which means the model develops a broad set of skills and pattern recognition abilities at training time, and then uses those abilities at inference time to rapidly adapt to or recognize the desired task when given examples. This involves absorbing many skills and tasks within the parameters of the model.",
"On what NLP tasks does GPT3 report state-of-the-art performance using zero or few shot learning relative to fine-tuned benchmarks?","GPT3 achieves 71.2% on TriviaQA in the few-shot setting, which is state of the art relative to fine-tuned models operating in the same closed-book setting.",
"What are the pros and cons of fine-tuning, zero-shot learning, and few-shot learning?","Fine-tuning involves updating the weights of a pre-trained model by training on a supervised dataset specific to the desired task. It benefits from strong performance on many benchmarks, but requires a new large dataset for every task. Few shot learning gives the model a few demonstrations of the task at inference time as conditioning, but no weight updates are done. It benefits from a major reduction in the need for task-specific data. But results from this method have so far been much worse than state-of-the-art fine-tuned models. In zero-shot learning, the model is only given a natural language instruction describing the task without any examples. It is the most convent and potentially robust approach, but the most challenges (especially for tasks that are difficult to describe).",
"How is the batch size increased for the largest GPT3 175B parameter model?","The batch size is increased linearly from a small value (32k tokens) to the full value of 3.2M token over the first 2 billion tokens of training.",
"How is the batch size increased over the course of training?","The batch size is increased linearly from a small value (32k tokens) to the full value over the first 4-12 billion tokens of training, depending on the model size.",
3 changes: 2 additions & 1 deletion api/evaluator_app.py
@@ -150,7 +150,8 @@ def make_chain(llm, retriever, retriever_type, model):

     # Select prompt
     if model == "vicuna-13b":
-        chain_type_kwargs = {"prompt": QA_CHAIN_PROMPT_LLAMA}
+        # chain_type_kwargs = {"prompt": QA_CHAIN_PROMPT_LLAMA}
+        chain_type_kwargs = {"prompt": QA_CHAIN_PROMPT}
     else:
         chain_type_kwargs = {"prompt": QA_CHAIN_PROMPT}

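For context on where these kwargs land: a minimal sketch of how a prompt selected this way is typically wired into a retrieval QA chain, assuming the LangChain RetrievalQA API that make_chain appears to target (llm and retriever come from the function signature in the hunk header; the exact chain construction in this repo may differ):

```python
from langchain.chains import RetrievalQA

def build_qa_chain(llm, retriever, chain_type_kwargs):
    # "stuff" concatenates the retrieved documents into the prompt's
    # {context} slot before calling the LLM.
    return RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=retriever,
        chain_type_kwargs=chain_type_kwargs,  # e.g. {"prompt": QA_CHAIN_PROMPT}
    )
```

The net effect of this change is that vicuna-13b now falls through to the same QA_CHAIN_PROMPT as every other model; the Llama-style prompt assignment is left commented out.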
4 changes: 2 additions & 2 deletions api/text_utils.py
Expand Up @@ -146,7 +146,7 @@ def remove_citations(text: str) -> str:

template = """
### Human
You are question+answering assistant tasked with answering questions based on the provided context.
You are question-answering assistant tasked with answering questions based on the provided context.
Here is the question: \
{question}
@@ -155,6 +155,6 @@ def remove_citations(text: str) -> str:
 {context}
 ### Assistant
-Answer: Think step by step """
+Answer: Think step by step. """
 QA_CHAIN_PROMPT_LLAMA = PromptTemplate(input_variables=["context", "question"],template=template,)
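To see exactly what text the model receives, the template can be rendered with sample values. A sketch assuming QA_CHAIN_PROMPT_LLAMA is imported from api/text_utils.py and using LangChain's PromptTemplate.format; the sample strings are placeholders:

```python
# Sketch: inspect the rendered Llama-style prompt.
from text_utils import QA_CHAIN_PROMPT_LLAMA  # assumes api/ is on sys.path

prompt_text = QA_CHAIN_PROMPT_LLAMA.format(
    question="What is in-context learning?",
    context="(retrieved passages would go here)",
)
print(prompt_text)
```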
