Caching
pyGAEB caches LLM classification and structured extraction results to avoid redundant API calls. The cache architecture is pluggable, with three options: an in-memory cache (the default), a persistent SQLite cache, or a custom backend.
In-Memory Cache (Default)
The default cache lives in memory and lasts for the current Python session:
from pygaeb import LLMClassifier, StructuredExtractor
classifier = LLMClassifier(model="gpt-4o")
extractor = StructuredExtractor(model="gpt-4o")
No configuration needed. When the process exits, the cache is lost.
The in-memory cache uses LRU (Least Recently Used) eviction with a default capacity of 10,000 entries. When the cache is full, the least recently used entry is evicted to make room for the new one. You can adjust the capacity:
from pygaeb import InMemoryCache
cache = InMemoryCache(maxsize=50_000) # larger for big projects
classifier = LLMClassifier(model="gpt-4o", cache=cache)
Best for: scripts, notebooks, one-off processing.
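To make the eviction rule concrete, here is a minimal LRU sketch built on the standard library's OrderedDict. TinyLRU is a hypothetical illustration of the policy, not pyGAEB's InMemoryCache implementation.

```python
from collections import OrderedDict


class TinyLRU:
    """Illustrative LRU store (hypothetical, not pyGAEB's InMemoryCache)."""

    def __init__(self, maxsize: int) -> None:
        self._maxsize = maxsize
        self._data = OrderedDict()  # insertion order doubles as recency order

    def get(self, key: str):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key: str, value: str) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._maxsize:
            self._data.popitem(last=False)  # evict the least recently used entry


lru = TinyLRU(maxsize=2)
lru.put("a", "1")
lru.put("b", "2")
lru.get("a")       # "a" is now more recent than "b"
lru.put("c", "3")  # capacity exceeded, so "b" is evicted
remaining = list(lru._data)
```

Reading an entry counts as a use, which is why "a" survives here even though it was inserted first.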
SQLite Cache (Persistent)
For caching across runs (e.g., during development or repeated processing):
from pygaeb import LLMClassifier, StructuredExtractor, SQLiteCache
cache = SQLiteCache("~/.pygaeb/cache")
classifier = LLMClassifier(model="gpt-4o", cache=cache)
extractor = StructuredExtractor(model="gpt-4o", cache=cache)
The SQLite database is created automatically in the specified directory. It uses WAL (write-ahead logging) mode, so readers are not blocked while another connection is writing.
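For readers unfamiliar with WAL mode, the sketch below shows the underlying SQLite mechanism using only the standard sqlite3 module. The file name and table layout are assumptions made for illustration; they are not pyGAEB's actual schema.

```python
import sqlite3
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "cache.db"  # hypothetical file name
    writer = sqlite3.connect(db_path)

    # Switch the database to write-ahead logging; SQLite reports the active mode.
    mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]

    # Hypothetical key/value table, only for this demonstration.
    writer.execute("CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT)")
    writer.execute("INSERT OR REPLACE INTO cache VALUES (?, ?)", ("k", "v"))
    writer.commit()

    # A second connection can read while the writer connection stays open.
    reader = sqlite3.connect(db_path)
    row = reader.execute("SELECT value FROM cache WHERE key = ?", ("k",)).fetchone()

    reader.close()
    writer.close()
```

WAL mode is a property of the database file, so it stays in effect for later connections until it is changed explicitly.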
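Context Manager
SQLiteCache can be used as a context manager, so the connection is closed automatically when the with block exits:
from pygaeb import SQLiteCache, LLMClassifier

# doc is assumed to be a document parsed earlier; await requires an async
# context (an async function or a notebook cell).
with SQLiteCache("/tmp/project-cache") as cache:
    classifier = LLMClassifier(model="gpt-4o", cache=cache)
    await classifier.enrich(doc)
# Connection closed automatically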
Shared Cache
Use a single cache backend for both classifier and extractor:
shared = SQLiteCache("/tmp/project-cache")
classifier = LLMClassifier(model="gpt-4o", cache=shared)
extractor = StructuredExtractor(model="gpt-4o", cache=shared)
Classification and extraction keys are namespaced internally, so they never collide.
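The idea behind namespacing can be sketched with a plain dictionary. The prefix format and the namespaced helper are hypothetical; pyGAEB's internal key layout may differ.

```python
def namespaced(kind: str, key: str) -> str:
    """Prefix a key with its namespace (hypothetical format)."""
    return f"{kind}:{key}"


store = {}
# The same raw key is used by both components without overwriting either entry.
store[namespaced("classification", "abc123")] = '{"label": "door"}'
store[namespaced("extraction", "abc123")] = '{"width_mm": 900}'
```

Because the namespace is part of the stored key, one backend can hold both kinds of entries side by side.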
Custom Cache Backend
Implement the CacheBackend protocol to use Redis, DynamoDB, or any other store:
from pygaeb import CacheBackend
class RedisCache:
    """Example custom cache backend."""

    def __init__(self, redis_client):
        self._r = redis_client

    def get(self, key: str) -> str | None:
        val = self._r.get(f"pygaeb:{key}")
        # Compare against None so an empty cached value is not treated as a miss
        return val.decode() if val is not None else None

    def put(self, key: str, value: str) -> None:
        self._r.set(f"pygaeb:{key}", value)

    def delete(self, key: str) -> None:
        self._r.delete(f"pygaeb:{key}")

    def clear(self) -> None:
        # scan_iter walks the keyspace incrementally instead of blocking Redis like KEYS
        for key in self._r.scan_iter(match="pygaeb:*"):
            self._r.delete(key)

    def keys(self) -> list[str]:
        return [k.decode().removeprefix("pygaeb:") for k in self._r.scan_iter(match="pygaeb:*")]

    def close(self) -> None:
        pass  # Redis client manages its own lifecycle
Then use it:
import redis
from pygaeb import LLMClassifier
r = redis.Redis(host="localhost")
classifier = LLMClassifier(model="gpt-4o", cache=RedisCache(r))
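When no Redis server is available, for example in unit tests, a dictionary-backed stand-in with the same six methods can be handy. DictCache is a hypothetical sketch that mirrors the method names from the RedisCache example; check the CacheBackend protocol itself for the authoritative signatures.

```python
from typing import Optional


class DictCache:
    """Dict-backed stand-in exposing the same methods as the RedisCache example."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)  # deleting a missing key is a no-op

    def clear(self) -> None:
        self._data.clear()

    def keys(self) -> list:
        return list(self._data)

    def close(self) -> None:
        pass  # nothing to release


cache = DictCache()
cache.put("k1", "cached response")
hit = cache.get("k1")
miss = cache.get("unknown")
cache.delete("k1")
after_delete = cache.keys()
```

Assuming the protocol is satisfied structurally, as the RedisCache example suggests, an instance could be passed through the cache argument in the same way.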
Cache Statistics
Inspect how many entries are cached for each prompt version:
stats = classifier.cache.stats()
for entry in stats:
    print(f"Prompt v{entry['prompt_version']}: {entry['count']} entries")
Clearing the Cache
# Clear all entries (except manual overrides)
classifier.cache.clear()
# Clear extraction cache for a specific schema
extractor.cache.clear(schema_name="DoorSpec")
How Cache Keys Work
Cache keys are deterministic hashes of the input data:
- Classification: hash of (short_text, long_text, unit, hierarchy_path, prompt_version)
- Extraction: hash of (item_content_hash, schema_json_hash)
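Because a key depends only on the item's content plus the prompt version or schema, identical items hit the cache even when they appear in different documents. Changing the prompt version or the schema produces different keys, so older entries are no longer matched.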
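A deterministic key of this kind can be sketched as follows. The use of SHA-256 over a JSON-serialized list is an assumption made for illustration, and classification_key is a hypothetical helper; pyGAEB's actual hashing scheme may differ.

```python
import hashlib
import json


def classification_key(short_text, long_text, unit, hierarchy_path, prompt_version):
    """Hash the classification inputs into a stable hex digest (illustrative only)."""
    payload = json.dumps(
        [short_text, long_text, unit, hierarchy_path, prompt_version],
        ensure_ascii=False,
        separators=(",", ":"),  # fixed separators keep the serialization stable
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# The same item seen in two different documents yields the same key.
k1 = classification_key("Innentür", "Holztür 885x2010", "Stk", "01.02", 1)
k2 = classification_key("Innentür", "Holztür 885x2010", "Stk", "01.02", 1)

# Bumping the prompt version changes the key, so old entries are not reused.
k3 = classification_key("Innentür", "Holztür 885x2010", "Stk", "01.02", 2)
```

Nothing document-specific, such as a file name or position, enters the hash, which is what allows reuse across documents.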