PreRelease: v0.1.0 #2

Merged 41 commits on Jul 2, 2024
Changes from 1 commit
Commits
f841592
Feature(MInference): build the basic MInference framework
iofu728 Jun 5, 2024
37cd987
Feature(MInference): support InfiniteBench
iofu728 Jun 5, 2024
f5e3595
Feature(MInference): add Needle in A Haystack script
iofu728 Jun 5, 2024
847776d
Feature(MInference): add ppl script
iofu728 Jun 5, 2024
5347d8c
Feature(MInference): add RULER
iofu728 Jun 5, 2024
adc194f
Feature(MInference): add benchmark e2e
iofu728 Jun 5, 2024
e48038e
Feature(MInference): support vLLM and add examples
iofu728 Jun 6, 2024
e28a73c
Feature(MInference): Use CUDAExtention to build indexing kernel inste…
iofu728 Jun 6, 2024
e3569ed
Feature(MInference): add benchmark experiments
iofu728 Jun 7, 2024
ce133f3
Feature(MInference): add onepage
iofu728 Jun 8, 2024
db8998e
experiments documents
liyucheng09 Jun 13, 2024
c354d98
Feature(MInference): add search scripts
iofu728 Jun 13, 2024
7022371
Feature(MInference): add streaming example
iofu728 Jun 14, 2024
a7ac35d
Feature(MInference): update logo and demo
iofu728 Jun 14, 2024
ed7a439
experiment commands test passed - warning disabled
liyucheng09 Jun 15, 2024
6053545
Feature(MInference): add FAQ
iofu728 Jun 16, 2024
834c6ec
Feature(MInference): fix the bibtex
iofu728 Jun 16, 2024
538e476
Feature(MInference): add license
iofu728 Jun 16, 2024
c754284
Feature(MInference): support GLM-4 and Qwen2
iofu728 Jun 23, 2024
70f05f9
Feature(MInference): fix the kv cache cpu bias issue
iofu728 Jun 24, 2024
a206fb2
Feature(MInference): support kv cache cpu device
iofu728 Jun 24, 2024
abc5aba
Fix(MInference): fix config get issue
iofu728 Jun 25, 2024
010b708
Feature(MInference): update experiments details
iofu728 Jun 26, 2024
a4cf850
add GLM-4 to RULER
liyucheng09 Jun 26, 2024
3cf46d4
Feature(MInference): update logo and T5 sparsity
iofu728 Jun 26, 2024
ce1c243
Feature(MInference): update the logo
iofu728 Jun 26, 2024
8341e4f
Feature(MInference): update the logo
iofu728 Jun 26, 2024
264ea01
Feature(MInference): update the title
iofu728 Jun 26, 2024
6b401a8
bug fixed - GLM-4
liyucheng09 Jun 27, 2024
be92526
bug fix - ruler with StreamingLLM
liyucheng09 Jun 28, 2024
3743bbe
patch GLM-4 with InfLLM
liyucheng09 Jun 29, 2024
3581688
Feature(MInference): fix the KV retrieval and math find evaluation
iofu728 Jun 29, 2024
9c4b960
Feature(MInference): update FAQ
iofu728 Jun 29, 2024
db22985
Feature(MInference): update logo
iofu728 Jun 30, 2024
0aff9c3
Feature(MInference): update FAQ
iofu728 Jun 30, 2024
fbfd9fb
Feature(MInference): update the pip release script, logo, and copyright
iofu728 Jul 1, 2024
882dcc6
Feature(MInference): update the example
iofu728 Jul 1, 2024
2c48613
Feature(MInference): add supported models
iofu728 Jul 2, 2024
ceba21b
fix bugs in vllm patch
liyucheng09 Jul 2, 2024
a9583e0
Feature(MInference): prepare for release
iofu728 Jul 2, 2024
1c414a4
Feature(MInference): remove unittest
iofu728 Jul 2, 2024
Feature(MInference): add license
Co-authored-by: Yucheng Li
Co-authored-by: Chengruidong Zhang
3 people committed Jun 16, 2024
commit 538e47611b78843b2bf29469aba28d3d5bc717a0
3 changes: 3 additions & 0 deletions Makefile
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
.PHONY: install style test

PYTHON := python
3 changes: 3 additions & 0 deletions examples/run_hf.py
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
from transformers import AutoModelForCausalLM, AutoTokenizer

from minference import MInference
3 changes: 3 additions & 0 deletions examples/run_hf_streaming.py
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
import warnings

warnings.filterwarnings("ignore")
3 changes: 3 additions & 0 deletions examples/run_hf_streaming.sh
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
# Load data
wget https://gutenberg.org/cache/epub/2600/pg2600.txt -O ./data/pg2600.txt

3 changes: 3 additions & 0 deletions examples/run_vllm.py
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
from vllm import LLM, SamplingParams

from minference import MInference
3 changes: 3 additions & 0 deletions experiments/benchmarks/benchmark_e2e.py
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
import argparse
import sys
import time
3 changes: 3 additions & 0 deletions experiments/benchmarks/run_e2e.sh
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
# Load data
wget https://raw.githubusercontent.com/FranxYao/chain-of-thought-hub/main/gsm8k/lib_prompt/prompt_hardest.txt

3 changes: 3 additions & 0 deletions experiments/infinite_bench/args.py
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
from argparse import ArgumentParser, Namespace

from eval_utils import DATA_NAME_TO_MAX_NEW_TOKENS
3 changes: 3 additions & 0 deletions experiments/infinite_bench/compute_scores.py
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
from __future__ import annotations

import json
3 changes: 3 additions & 0 deletions experiments/infinite_bench/eval_utils.py
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
from __future__ import annotations

import json
3 changes: 3 additions & 0 deletions experiments/infinite_bench/run_infinitebench.py
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
from __future__ import annotations

import json
3 changes: 3 additions & 0 deletions experiments/infinite_bench/run_infinitebench.sh
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
TASKS=("kv_retrieval" "longbook_choice_eng" "math_find" "longbook_qa_chn" "longbook_qa_eng" "longdialogue_qa_eng" "code_debug" "longbook_sum_eng" "number_string" "passkey")

export TOKENIZERS_PARALLELISM=false
3 changes: 3 additions & 0 deletions experiments/needle_in_a_haystack/needle_summary.py
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
import argparse
import json
import os
3 changes: 3 additions & 0 deletions experiments/needle_in_a_haystack/needle_test.py
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
import argparse
import os
from dataclasses import dataclass
3 changes: 3 additions & 0 deletions experiments/needle_in_a_haystack/needle_tools.py
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
import json
import math
import os
3 changes: 3 additions & 0 deletions experiments/needle_in_a_haystack/needle_viz.py
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
import argparse
import glob
import json
3 changes: 3 additions & 0 deletions experiments/needle_in_a_haystack/run_needle.sh
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
export TOKENIZERS_PARALLELISM=false

# Load Haystack
3 changes: 3 additions & 0 deletions experiments/ppl/run_ppl.py
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
import argparse
import gc
import json
3 changes: 3 additions & 0 deletions experiments/ppl/run_ppl.sh
@@ -1,3 +1,6 @@
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]
+
export TOKENIZERS_PARALLELISM=false

mkdir -p results/long-ppl/
15 changes: 2 additions & 13 deletions experiments/ruler/config_models.sh
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]

TEMPERATURE="0.0" # greedy
TOP_P="1.0"
15 changes: 2 additions & 13 deletions experiments/ruler/config_tasks.sh
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]

NUM_SAMPLES=80
REMOVE_NEWLINE_TAB=false
15 changes: 2 additions & 13 deletions experiments/ruler/data/prepare.py
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]

"""
Prepare jsonl with field `input` and `outputs`.
15 changes: 2 additions & 13 deletions experiments/ruler/data/synthetic/common_words_extraction.py
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]

"""
Create a dataset jsonl file for common words extraction.
15 changes: 2 additions & 13 deletions experiments/ruler/data/synthetic/constants.py
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]

"""
Add a new task (required arguments):
15 changes: 2 additions & 13 deletions experiments/ruler/data/synthetic/freq_words_extraction.py
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]

"""
Create a dataset jsonl file for frequent words extraction.
15 changes: 2 additions & 13 deletions experiments/ruler/data/synthetic/json/download_paulgraham_essay.py
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]

import glob
import json
15 changes: 2 additions & 13 deletions experiments/ruler/data/synthetic/json/download_qa_dataset.sh
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]

wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json -O squad.json
wget http://curtis.ml.cmu.edu/datasets/hotpot/hotpot_dev_distractor_v1.json -O hotpotqa.json
15 changes: 2 additions & 13 deletions experiments/ruler/data/synthetic/niah.py
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]

"""
Create a dataset jsonl file for needle in a haystack.
15 changes: 2 additions & 13 deletions experiments/ruler/data/synthetic/qa.py
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]

"""
Create a dataset jsonl file for QA task.
15 changes: 2 additions & 13 deletions experiments/ruler/data/synthetic/variable_tracking.py
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]

"""
Create a dataset jsonl file for variable tracking.
15 changes: 2 additions & 13 deletions experiments/ruler/data/template.py
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]

Templates = {
"base": "{task_template}",
15 changes: 2 additions & 13 deletions experiments/ruler/data/tokenizer.py
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]


import os
15 changes: 2 additions & 13 deletions experiments/ruler/eval/evaluate.py
@@ -1,16 +1,5 @@
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
+# Copyright (c) 2024 Microsoft
+# Licensed under The MIT License [see LICENSE for details]

"""
Get summary.csv with score and null predictions amount.