Salesforce open-source LLMs with 8k sequence length.

xGen

Official research release for the family of xGen models (7B) by Salesforce AI Research:

Title: Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length

Authors: todo.

Usage

Model cards are published on the Hugging Face Hub:

xGen-7B

The models can be used as auto-regressive samplers as follows:

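import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the model weights (in bfloat16) from the checkpoint.
tokenizer = AutoTokenizer.from_pretrained("checkpoints/xgen-7B")
model = AutoModelForCausalLM.from_pretrained("checkpoints/xgen-7B", torch_dtype=torch.bfloat16, revision="sharded")
# Tokenize the prompt and generate up to 128 tokens in total (prompt included).
inputs = tokenizer("The world is", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
# Decode the first (and only) generated sequence back to text.
print(tokenizer.decode(sample[0]))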
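
In the snippet above, `max_length=128` bounds the total sequence length, so the prompt tokens count against it. The following is a minimal sketch of that arithmetic, not part of the xGen release: `new_token_budget` is a hypothetical helper, and `CONTEXT_WINDOW = 8192` assumes that "8K" means 8192 tokens. The prompt length of 3 is illustrative only; the real count depends on the tokenizer.

```python
# Assumption: "8K input sequence length" is taken to mean 8192 tokens.
CONTEXT_WINDOW = 8192


def new_token_budget(prompt_len: int, max_length: int,
                     context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens can be generated after a prompt.

    Hypothetical helper: `max_length` is the cap on prompt plus generated
    tokens, and the result is additionally clipped to `context_window`.
    """
    if prompt_len < 0 or max_length < 0:
        raise ValueError("lengths must be non-negative")
    total_cap = min(max_length, context_window)
    # A prompt that already exceeds the cap leaves no room for new tokens.
    return max(total_cap - prompt_len, 0)


# Illustrative prompt length; the actual value comes from the tokenizer,
# e.g. inputs["input_ids"].shape[1] in the snippet above.
print(new_token_budget(prompt_len=3, max_length=128))
```

If the number of generated tokens should be fixed regardless of prompt length, `generate` also accepts `max_new_tokens` in place of `max_length`.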

Citation

@misc{xGen,
  title={Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length},
  author={Salesforce AI Research},
  howpublished={Salesforce AI Research Blog},
  year={2023},
  url={https://blog.salesforceairesearch.com/xgen/}
}