Installation¶

Core Library¶

The core library handles parsing, validation, writing, and export with no LLM dependencies:

pip install pyGAEB

This installs the parser, writer, validator, and JSON/CSV export — everything you need to work with GAEB files programmatically.

With LLM Classification¶

To use the LLM-powered classification and structured extraction features, install the llm extra:

pip install pyGAEB[llm]

This adds LiteLLM (100+ LLM providers) and Instructor (structured LLM output).

Development Setup¶

For contributing or running the test suite:

git clone https://github.com/frameiq/pygaeb.git
cd pygaeb
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev,llm]"

Run the tests:

pytest -v

Building Documentation Locally¶

pip install -e ".[docs]"
mkdocs serve

Then open http://localhost:8000.

Requirements¶

Python 3.9+
Core dependencies: lxml, pydantic v2, beautifulsoup4, ftfy, charset-normalizer
LLM extras: litellm, instructor

LLM Provider Setup¶

pyGAEB uses LiteLLM under the hood, so any provider that LiteLLM supports works out of the box. Set the appropriate API key as an environment variable:

# Anthropic
export ANTHROPIC_API_KEY=sk-ant-...

# OpenAI
export OPENAI_API_KEY=sk-...

# Local (Ollama) — no key needed
ollama pull llama3

Then pass the model name when creating a classifier:

from pygaeb import LLMClassifier

classifier = LLMClassifier(model="anthropic/claude-sonnet-4-6")
# or: LLMClassifier(model="gpt-4o")
# or: LLMClassifier(model="ollama/llama3")  # local, free, private

See the LiteLLM docs for the full provider list.