Fm/stg 474 more tests #83

Merged 61 commits on Jun 9, 2025
Commits
802b387
update readme
filip-michalsky Jun 2, 2025
76c5a1c
update env example
filip-michalsky Jun 2, 2025
f0d15cf
update
filip-michalsky Jun 2, 2025
9e82aba
update to pip
filip-michalsky Jun 3, 2025
89164ee
update examples and README
filip-michalsky Jun 3, 2025
33ef876
do not require bb key for local runs
filip-michalsky Jun 3, 2025
37d5510
update example
filip-michalsky Jun 3, 2025
8ca7751
format
filip-michalsky Jun 3, 2025
93a9011
formatting
filip-michalsky Jun 3, 2025
81032b2
Merge branch 'main' into fm/stg-468-update-readme-and-example
filip-michalsky Jun 3, 2025
0ffca19
merge commit
filip-michalsky Jun 3, 2025
a8fe211
format;
filip-michalsky Jun 3, 2025
9ee8096
remove saving the tree
filip-michalsky Jun 4, 2025
957dbeb
one shot test structure
filip-michalsky Jun 4, 2025
503876a
fixing tests
filip-michalsky Jun 4, 2025
fa68005
fixing more tests
filip-michalsky Jun 5, 2025
ba4ffcd
fix more tests
filip-michalsky Jun 5, 2025
b37bba1
update tests
filip-michalsky Jun 5, 2025
20605bb
fixing tests
filip-michalsky Jun 5, 2025
324277e
all tests pass
filip-michalsky Jun 5, 2025
ba317a4
accept deps from miguel
filip-michalsky Jun 5, 2025
3392d57
remove warnings
filip-michalsky Jun 5, 2025
cc9fd0f
fix formatting
filip-michalsky Jun 5, 2025
e3ebdac
fix: update deprecated GitHub Actions upload/download-artifact from v…
filip-michalsky Jun 6, 2025
1e8c7c8
Merge branch 'main' into fm/stg-474-more-tests
filip-michalsky Jun 6, 2025
f729c5c
make google cua optional import, fix stagehand import
filip-michalsky Jun 6, 2025
7421e4a
update tests
filip-michalsky Jun 6, 2025
7f4b7e4
format
filip-michalsky Jun 6, 2025
3d1b604
update cua to CI
filip-michalsky Jun 6, 2025
6637c77
fix linter
filip-michalsky Jun 6, 2025
69117de
remove min coverage threshold
filip-michalsky Jun 6, 2025
baae62b
run tests in CI attempt
filip-michalsky Jun 6, 2025
76c4356
remove tests from ruff
filip-michalsky Jun 6, 2025
ae1ac0b
more debug to pass ci
filip-michalsky Jun 6, 2025
8ed42e9
more ci fixes
filip-michalsky Jun 6, 2025
4475f7e
cleaning up some unit tests
miguelg719 Jun 6, 2025
ae11459
remove stuff from publish yaml
filip-michalsky Jun 6, 2025
907d542
revert
filip-michalsky Jun 6, 2025
227badf
fix test
filip-michalsky Jun 7, 2025
9a4cdd5
revert types back from schema
filip-michalsky Jun 7, 2025
0471b55
add note todo
filip-michalsky Jun 7, 2025
6a7c449
select tests based on PR labels
filip-michalsky Jun 7, 2025
af9e033
first pass on integration tests
filip-michalsky Jun 7, 2025
503d64e
updates to integration tests
filip-michalsky Jun 7, 2025
470938a
fix unit tests
filip-michalsky Jun 7, 2025
b3ea794
revert pr template, extract handler, remove test readme
filip-michalsky Jun 8, 2025
7e0cf6f
reverting more files except test folder
filip-michalsky Jun 8, 2025
8463309
revert more files
filip-michalsky Jun 8, 2025
4c93b0b
revert more files
filip-michalsky Jun 8, 2025
afdb3c1
revert examples
filip-michalsky Jun 8, 2025
574c12c
revert example
filip-michalsky Jun 8, 2025
1b9159d
trim unit tests
filip-michalsky Jun 8, 2025
e069d0f
trim tests
filip-michalsky Jun 8, 2025
3636016
fix unit test
filip-michalsky Jun 8, 2025
86005ca
fix smoke test warnings
filip-michalsky Jun 8, 2025
5dca7e8
update tests
filip-michalsky Jun 8, 2025
0e21f24
update test CI workflow
filip-michalsky Jun 8, 2025
a0cebee
updates
filip-michalsky Jun 8, 2025
291afe9
update core integration test
filip-michalsky Jun 9, 2025
50a7cfe
change local integration test
filip-michalsky Jun 9, 2025
d5b37cb
ci passing
filip-michalsky Jun 9, 2025
3 changes: 2 additions & 1 deletion .env.example → .env.example
@@ -1,4 +1,5 @@
-MODEL_API_KEY = "anthropic-or-openai-api-key"
+MODEL_API_KEY = "your-favorite-llm-api-key"
BROWSERBASE_API_KEY = "browserbase-api-key"
BROWSERBASE_PROJECT_ID = "browserbase-project-id"
STAGEHAND_API_URL = "api_url"
STAGEHAND_ENV= "LOCAL or BROWSERBASE"
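
The variables in this `.env.example` could be consumed along the following lines. This is a minimal sketch, not Stagehand's actual configuration loader: `Settings` and `load_settings` are hypothetical names, the `LOCAL` default is an assumption, and the rule that Browserbase credentials are only required when `STAGEHAND_ENV` is `BROWSERBASE` is inferred from the commit message "do not require bb key for local runs".

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Settings:
    """Hypothetical container for the variables listed in .env.example."""
    env: str
    model_api_key: Optional[str]
    browserbase_api_key: Optional[str]
    browserbase_project_id: Optional[str]
    api_url: Optional[str]


def load_settings(environ: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping (hypothetical helper)."""
    # Assumption: fall back to LOCAL when STAGEHAND_ENV is unset.
    env = environ.get("STAGEHAND_ENV", "LOCAL").upper()
    if env not in ("LOCAL", "BROWSERBASE"):
        raise ValueError(f"STAGEHAND_ENV must be LOCAL or BROWSERBASE, got {env!r}")

    settings = Settings(
        env=env,
        model_api_key=environ.get("MODEL_API_KEY"),
        browserbase_api_key=environ.get("BROWSERBASE_API_KEY"),
        browserbase_project_id=environ.get("BROWSERBASE_PROJECT_ID"),
        api_url=environ.get("STAGEHAND_API_URL"),
    )

    # Assumption: Browserbase credentials are mandatory only for remote runs.
    if env == "BROWSERBASE":
        required = {
            "BROWSERBASE_API_KEY": settings.browserbase_api_key,
            "BROWSERBASE_PROJECT_ID": settings.browserbase_project_id,
        }
        missing = [name for name, value in required.items() if not value]
        if missing:
            raise ValueError("Missing required variables: " + ", ".join(missing))

    return settings
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to exercise in unit tests without touching the real environment.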
292 changes: 292 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,292 @@
name: Test Suite

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main, develop ]
    types: [opened, synchronize, reopened, labeled, unlabeled]

jobs:
  test-unit:
    name: Unit Tests
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.9", "3.10", "3.11", "3.12"]

    steps:
    - uses: actions/checkout@v4

    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}

    - name: Cache pip dependencies
      uses: actions/cache@v3
      with:
        path: ~/.cache/pip
        key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements*.txt', '**/pyproject.toml') }}
        restore-keys: |
          ${{ runner.os }}-pip-

    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -e ".[dev]"
        # Install jsonschema for schema validation tests
        pip install jsonschema
        # Install temporary Google GenAI wheel
        pip install temp/google_genai-1.14.0-py3-none-any.whl

    - name: Run unit tests
      run: |
        pytest tests/unit/ -v --junit-xml=junit-unit-${{ matrix.python-version }}.xml

    - name: Upload unit test results
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: unit-test-results-${{ matrix.python-version }}
        path: junit-unit-${{ matrix.python-version }}.xml

  test-integration-local:
    name: Local Integration Tests
    runs-on: ubuntu-latest
    needs: test-unit

    steps:
    - uses: actions/checkout@v4

    - name: Set up Python 3.11
      uses: actions/setup-python@v4
      with:
        python-version: "3.11"

    - name: Install system dependencies
      run: |
        sudo apt-get update
        sudo apt-get install -y xvfb

    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -e ".[dev]"
        pip install jsonschema
        # Install temporary Google GenAI wheel
        pip install temp/google_genai-1.14.0-py3-none-any.whl
        # Install Playwright browsers for integration tests
        playwright install chromium
        playwright install-deps chromium

    - name: Run local integration tests
      run: |
        # Run integration tests marked as 'integration' and 'local'
        xvfb-run -a pytest tests/integration/ -v \
          --cov=stagehand \
          --cov-report=xml \
          --junit-xml=junit-integration-local.xml \
          -m "integration and local" \
          --tb=short \
          --maxfail=5
      env:
        MODEL_API_KEY: ${{ secrets.MODEL_API_KEY || 'mock-model-key' }}
        OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY || 'mock-openai-key' }}
        DISPLAY: ":99"

    - name: Upload integration test results
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: integration-test-results-local
        path: junit-integration-local.xml

    - name: Upload coverage data
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: coverage-data-integration-local
        path: |
          .coverage
          coverage.xml

  test-integration-api:
    name: API Integration Tests
    runs-on: ubuntu-latest
    needs: test-unit

    steps:
    - uses: actions/checkout@v4

    - name: Set up Python 3.11
      uses: actions/setup-python@v4
      with:
        python-version: "3.11"

    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -e ".[dev]"
        pip install jsonschema
        # Install temporary Google GenAI wheel
        pip install temp/google_genai-1.14.0-py3-none-any.whl

    - name: Run API integration tests
      run: |
        pytest tests/integration/ -v \
          --cov=stagehand \
          --cov-report=xml \
          --junit-xml=junit-integration-api.xml \
          -m "integration and api" \
          --tb=short
      env:
        BROWSERBASE_API_KEY: ${{ secrets.BROWSERBASE_API_KEY }}
        BROWSERBASE_PROJECT_ID: ${{ secrets.BROWSERBASE_PROJECT_ID }}
        MODEL_API_KEY: ${{ secrets.MODEL_API_KEY }}
        STAGEHAND_API_URL: ${{ secrets.STAGEHAND_API_URL }}

    - name: Upload API integration test results
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: api-integration-test-results
        path: junit-integration-api.xml

  smoke-tests:
    name: Smoke Tests
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v4

    - name: Set up Python 3.11
      uses: actions/setup-python@v4
      with:
        python-version: "3.11"

    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -e ".[dev]"
        pip install jsonschema
        pip install temp/google_genai-1.14.0-py3-none-any.whl

    - name: Run smoke tests
      run: |
        pytest tests/ -v \
          --junit-xml=junit-smoke.xml \
          -m "smoke" \
          --tb=line \
          --maxfail=5

    - name: Upload smoke test results
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: smoke-test-results
        path: junit-smoke.xml

  test-e2e:
    name: End-to-End Tests
    runs-on: ubuntu-latest
    needs: test-unit
    if: |
      contains(github.event.pull_request.labels.*.name, 'test-e2e') ||
      contains(github.event.pull_request.labels.*.name, 'e2e')

    steps:
    - uses: actions/checkout@v4

    - name: Set up Python 3.11
      uses: actions/setup-python@v4
      with:
        python-version: "3.11"

    - name: Install system dependencies
      run: |
        sudo apt-get update
        sudo apt-get install -y xvfb

    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -e ".[dev]"
        pip install jsonschema
        # Install temporary Google GenAI wheel
        pip install temp/google_genai-1.14.0-py3-none-any.whl
        playwright install chromium
        playwright install-deps chromium

    - name: Run E2E tests
      run: |
        xvfb-run -a pytest tests/ -v \
          --cov=stagehand \
          --cov-report=xml \
          --junit-xml=junit-e2e.xml \
          -m "e2e" \
          --tb=short
      env:
        BROWSERBASE_API_KEY: ${{ secrets.BROWSERBASE_API_KEY || 'mock-api-key' }}
        BROWSERBASE_PROJECT_ID: ${{ secrets.BROWSERBASE_PROJECT_ID || 'mock-project-id' }}
        MODEL_API_KEY: ${{ secrets.MODEL_API_KEY || 'mock-model-key' }}
        STAGEHAND_API_URL: ${{ secrets.STAGEHAND_API_URL || 'http://localhost:3000' }}
        DISPLAY: ":99"

    - name: Upload E2E test results
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: e2e-test-results
        path: junit-e2e.xml

  test-all:
    name: Complete Test Suite
    runs-on: ubuntu-latest
    needs: test-unit
    if: |
      contains(github.event.pull_request.labels.*.name, 'test-all') ||
      contains(github.event.pull_request.labels.*.name, 'full-test')

    steps:
    - uses: actions/checkout@v4

    - name: Set up Python 3.11
      uses: actions/setup-python@v4
      with:
        python-version: "3.11"

    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -e ".[dev]"
        pip install jsonschema
        # Install temporary Google GenAI wheel
        pip install temp/google_genai-1.14.0-py3-none-any.whl
        playwright install chromium

    - name: Run complete test suite
      run: |
        pytest tests/ -v \
          --cov=stagehand \
          --cov-report=xml \
          --cov-report=html \
          --junit-xml=junit-all.xml \
          --maxfail=10 \
          --tb=short
      env:
        BROWSERBASE_API_KEY: ${{ secrets.BROWSERBASE_API_KEY }}
        BROWSERBASE_PROJECT_ID: ${{ secrets.BROWSERBASE_PROJECT_ID }}
        MODEL_API_KEY: ${{ secrets.MODEL_API_KEY }}
        OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        STAGEHAND_API_URL: ${{ secrets.STAGEHAND_API_URL }}

    - name: Upload complete test results
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: complete-test-results
        path: |
          junit-all.xml
          htmlcov/
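
The `if:` conditions on `test-e2e` and `test-all` gate those jobs on PR labels, while the remaining jobs carry no condition. The selection rule can be sketched in Python as follows. This illustrates the rule only and is not code from the PR: `LABEL_GATES`, `ALWAYS_RUN`, and `jobs_to_run` are hypothetical names, and the `needs: test-unit` dependency (which makes dependent jobs wait for the unit tests) is not modelled.

```python
from typing import Dict, Iterable, List, Set

# Labels that enable each gated job, copied from the `if:` conditions in test.yml.
LABEL_GATES: Dict[str, Set[str]] = {
    "test-e2e": {"test-e2e", "e2e"},
    "test-all": {"test-all", "full-test"},
}

# Jobs without an `if:` condition are scheduled on every trigger.
ALWAYS_RUN: List[str] = [
    "test-unit",
    "test-integration-local",
    "test-integration-api",
    "smoke-tests",
]


def jobs_to_run(pr_labels: Iterable[str]) -> List[str]:
    """Return the job ids that would be scheduled for the given PR labels."""
    # Labels are lowercased because GitHub's contains() is documented
    # as not case sensitive.
    labels = {label.lower() for label in pr_labels}
    selected = list(ALWAYS_RUN)
    for job, gate in LABEL_GATES.items():
        # A gated job runs when the PR carries at least one of its labels.
        if labels & gate:
            selected.append(job)
    return selected
```

For example, `jobs_to_run(["e2e"])` adds `test-e2e` to the default jobs but leaves `test-all` out. On a `push` event there is no pull request payload, so both label checks should evaluate to false and the gated jobs are skipped, which corresponds to calling `jobs_to_run([])`.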