AI Providers

MethodAtlas supports ten AI providers. This page covers each one in detail: the provider value string, required CLI flags, authentication method, recommended model, and a complete example command.

Each provider differs in where requests are sent, what authentication is required, and what data-sovereignty guarantees it offers. For a high-level comparison table and a quick decision guide, see the AI Enrichment overview.

Which AI product uses which provider?

Many well-known AI assistants are end-user products, not developer APIs. If you know a tool only by its product name (ChatGPT, Claude, Copilot, Grok), it may not be obvious which underlying API to configure in MethodAtlas. This table maps the most common AI products to the correct provider value.

| AI assistant / product | Underlying platform | MethodAtlas provider | Free tier |
|---|---|---|---|
| ChatGPT (OpenAI) | OpenAI API | openai | No |
| Claude (Anthropic) | Anthropic API | anthropic | No |
| Grok (xAI) | xAI API (OpenAI-compatible) | xai | Limited |
| GitHub Copilot | GitHub proprietary (IDE tool) | not applicable | n/a |
| Microsoft Copilot / M365 Copilot | Microsoft Graph (enterprise product) | not applicable | n/a |
| Gemini (Google) | Google AI / Vertex AI | not yet supported | Yes |
| GitHub Models | OpenAI-compatible inference by GitHub | github_models | Yes (GitHub account) |
| Meta Llama (any platform) | Varies; use a hosting service | groq, openrouter, ollama | Via partner |
| Mistral Le Chat | Mistral AI API | mistral | Limited |
| Local models (LM Studio, Ollama, etc.) | Local HTTP | ollama | Yes (no account needed) |

GitHub Copilot and Microsoft Copilot / M365 Copilot are end-user productivity assistants embedded in IDEs, Office, and Teams. They do not expose a public inference API that external tools can call. There is no MethodAtlas provider for them because they cannot be used programmatically for custom classification tasks.

Google Gemini uses a proprietary API format that differs from the OpenAI-compatible interface used by most other providers on this list. A native Gemini client is not yet implemented in MethodAtlas. As a workaround, Gemini models are available through OpenRouter (use openrouter with a Gemini model identifier).

Meta Llama models have no official hosted API. They are available through third-party services: Groq hosts several Llama variants on free-tier LPU hardware (groq), OpenRouter aggregates Llama access across many providers (openrouter), and Ollama runs them locally (ollama).

Ollama — local inference

Provider value: ollama

What it is: Ollama is an open-source runtime that runs large language models entirely on your local machine. No data leaves the host, no API key is required, and no account is needed.

Data residency: Requests are sent to http://localhost:11434 and never leave the machine. This is the only provider where source code is never transmitted over a network.

Regulatory perspective: Fully compliant with any policy that prohibits transmitting source code outside the organization. Suitable for air-gapped environments.

When to use: Development workstations, CI runners with local GPUs, environments with strict data-egress policies.

Authentication: None required.

Recommended model: qwen2.5-coder:7b — suitable for code classification and runs on consumer hardware with 8 GB VRAM, or on CPU.

Setup:

  1. Download and install Ollama
  2. Pull a model: ollama pull qwen2.5-coder:7b
  3. Run MethodAtlas:
     ./methodatlas -ai -ai-provider ollama -ai-model qwen2.5-coder:7b /path/to/tests

Or via YAML:

ai:
  enabled: true
  provider: ollama
  model: qwen2.5-coder:7b
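
Before a run, you can confirm that the local endpoint is reachable and the model has been pulled. This is an optional sanity check using Ollama's standard REST API and CLI, assuming the default port:

# List locally available models via Ollama's REST API (default port 11434)
curl -s http://localhost:11434/api/tags

# Equivalent CLI check
ollama list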

Azure OpenAI — corporate cloud inference

Provider value: azure_openai

What it is: Azure OpenAI Service is a managed offering from Microsoft that hosts OpenAI models (GPT-4o, GPT-4, etc.) inside a customer's own Azure subscription. It is distinct from the public OpenAI API: requests go to infrastructure your organization controls, not to OpenAI's shared platform.

Data residency: Requests are sent to a resource endpoint within your Azure tenant (e.g. https://contoso.openai.azure.com). Data does not leave your Azure subscription boundary. Microsoft processes it under the terms of your enterprise agreement. The EU Data Boundary commitment applies when the resource is provisioned in an EU region.

Regulatory perspective: Recommended for organizations subject to GDPR, HIPAA, ISO 27001, or internal policies that prohibit sending source code to third-party cloud services. If your organization already has an Azure subscription, this is usually the most straightforward path to compliant cloud AI.

When to use: Corporate environments, regulated industries, any situation where source code must not leave the organization's cloud tenant.

Difference from Microsoft 365 Copilot

Microsoft 365 Copilot (the AI assistant in Teams, Word, etc.) is a separate product and is not accessible via this integration. MethodAtlas uses the Azure OpenAI Service API, which requires an Azure subscription and a dedicated resource provisioned by your IT or cloud team — independent of any M365 license.

Authentication: A resource-scoped API key generated in the Azure portal. Supply it via the environment variable named in -ai-api-key-env.

The -ai-model value is the deployment name, not the model family name. When you deploy a model in Azure, you assign it a name (e.g. gpt-4o-prod). That deployment name — not gpt-4o or any other OpenAI model identifier — is what MethodAtlas passes to the API. You must also supply -ai-base-url pointing to your Azure resource endpoint.

How to obtain credentials (coordinate with your IT or cloud team):

  1. Provision an Azure OpenAI resource in the Azure portal under your subscription. Choose a region close to your users (EU regions satisfy EU Data Boundary requirements).
  2. Deploy a model within that resource. The deployment gets a name you choose (e.g. gpt-4o-prod). This deployment name is what you supply as -ai-model.
  3. Copy the API key from Azure Portal → your resource → Keys and Endpoint. Two keys are available (Key 1 / Key 2); either works and both can be rotated independently.
  4. Copy the endpoint URL from the same page (e.g. https://contoso.openai.azure.com).

Configuration:

export AZURE_OPENAI_KEY=<your-key>
./methodatlas -ai \
  -ai-provider azure_openai \
  -ai-base-url https://contoso.openai.azure.com \
  -ai-model gpt-4o-prod \
  -ai-api-key-env AZURE_OPENAI_KEY \
  /path/to/tests

Or via YAML:

ai:
  enabled: true
  provider: azure_openai
  baseUrl: https://contoso.openai.azure.com
  model: gpt-4o-prod          # deployment name, not model family name
  apiKeyEnv: AZURE_OPENAI_KEY
  apiVersion: 2024-02-01      # optional; defaults to 2024-02-01
  timeoutSec: 120
  maxRetries: 2

The apiVersion YAML field (or -ai-api-version CLI flag) selects the Azure OpenAI REST API version. The default (2024-02-01) targets the generally-available Chat Completions API. Use 2024-08-01-preview to access preview features such as structured outputs.
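
To see why the deployment name matters, here is a minimal raw request sketched with the example values from above (the contoso resource and gpt-4o-prod deployment are placeholders): the deployment name appears in the URL path, and the API version is a query parameter.

# Minimal raw Chat Completions request against an Azure OpenAI deployment.
# Note: the deployment name (gpt-4o-prod) is in the path, not the request body.
curl -s "https://contoso.openai.azure.com/openai/deployments/gpt-4o-prod/chat/completions?api-version=2024-02-01" \
  -H "Content-Type: application/json" \
  -H "api-key: $AZURE_OPENAI_KEY" \
  -d '{"messages":[{"role":"user","content":"ping"}]}'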

OpenAI

Provider value: openai

What it is: The OpenAI API provides direct access to GPT-4o, GPT-4, and other models hosted on OpenAI's infrastructure in the United States.

Data residency: Requests are sent to https://api.openai.com and processed on OpenAI's infrastructure. Data leaves the organization's control and is governed by OpenAI's API data usage policy. By default, OpenAI does not use API data to train models, but confirm the current policy with your legal team before use in regulated contexts.

Regulatory perspective: Not suitable for environments where source code must not be transmitted to third-party services. Acceptable for teams with explicit approval for external cloud AI usage.

Authentication: An OpenAI platform API key.

How to obtain:

  1. Create an account at platform.openai.com
  2. Go to API keys → Create new secret key
  3. Add a payment method and, if desired, set a usage limit

Recommended model: gpt-4o-mini — strong classification accuracy at low cost. Use gpt-4o for the highest accuracy on complex test bodies.

Configuration:

export OPENAI_API_KEY=sk-...
./methodatlas -ai -ai-provider openai \
  -ai-api-key-env OPENAI_API_KEY \
  -ai-model gpt-4o-mini \
  /path/to/tests

Or via YAML:

ai:
  enabled: true
  provider: openai
  model: gpt-4o-mini
  apiKeyEnv: OPENAI_API_KEY
  timeoutSec: 60
  maxRetries: 2
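
To verify the key independently of MethodAtlas, a quick check is to list the models the key can access (a sketch using OpenAI's standard models endpoint):

# Returns the model list if the key is valid; a 401 response indicates a bad key
curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"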

Anthropic

Provider value: anthropic

What it is: The Anthropic API provides access to Claude models hosted on Anthropic's infrastructure in the United States.

Data residency: Requests are sent to https://api.anthropic.com and processed on Anthropic's infrastructure. Data leaves the organization's control.

Regulatory perspective: Same as OpenAI — not suitable where source code must stay within the organization. Acceptable for teams with explicit approval for external cloud AI.

Authentication: An Anthropic API key.

How to obtain:

  1. Create an account at console.anthropic.com
  2. Go to API keys → Create key

Recommended model: claude-3-haiku-20240307 for fast, cost-effective classification. Use claude-sonnet-4-5 for the highest accuracy.

Configuration:

export ANTHROPIC_API_KEY=sk-ant-...
./methodatlas -ai -ai-provider anthropic \
  -ai-api-key-env ANTHROPIC_API_KEY \
  -ai-model claude-3-haiku-20240307 \
  /path/to/tests

Or via YAML:

ai:
  enabled: true
  provider: anthropic
  model: claude-3-haiku-20240307
  apiKeyEnv: ANTHROPIC_API_KEY
  timeoutSec: 60
  maxRetries: 2
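
To smoke-test the key outside MethodAtlas, you can send a minimal request to Anthropic's Messages API (a sketch; note the x-api-key and anthropic-version headers, which differ from the Bearer-token style used by OpenAI-compatible providers):

# Minimal Messages API request; a valid key returns a short completion
curl -s https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model":"claude-3-haiku-20240307","max_tokens":16,"messages":[{"role":"user","content":"ping"}]}'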

OpenRouter

Provider value: openrouter

What it is: OpenRouter is an API aggregation service that routes requests to multiple underlying AI providers (OpenAI, Anthropic, Google, Meta, Mistral, and others) through a single endpoint using an OpenAI-compatible interface.

Data residency: Requests pass through OpenRouter's infrastructure and are forwarded to the underlying model provider. Data leaves the organization's control twice: once to OpenRouter, and again to the downstream provider.

Regulatory perspective: Not suitable for environments where source code must not leave the organization. Acceptable for development use where access to many models through one key is convenient.

Authentication: An OpenRouter API key.

How to obtain:

  1. Create an account at openrouter.ai
  2. Go to Keys → Create key

Recommended model: stepfun/step-3.5-flash:free for zero-cost experimentation. Use openai/gpt-4o-mini or anthropic/claude-3-haiku for production quality.

Configuration:

export OPENROUTER_API_KEY=sk-or-...
./methodatlas -ai -ai-provider openrouter \
  -ai-api-key-env OPENROUTER_API_KEY \
  -ai-model stepfun/step-3.5-flash:free \
  /path/to/tests

Or via YAML:

ai:
  enabled: true
  provider: openrouter
  model: stepfun/step-3.5-flash:free
  apiKeyEnv: OPENROUTER_API_KEY
  timeoutSec: 120
  maxRetries: 2
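
To confirm the key and model identifier before a full run, a minimal request against OpenRouter's OpenAI-compatible endpoint looks like this (a sketch using the example model from above):

# OpenRouter uses the standard OpenAI chat/completions shape with a Bearer key
curl -s https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"stepfun/step-3.5-flash:free","messages":[{"role":"user","content":"ping"}]}'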

Groq

Provider value: groq

What it is: Groq is a cloud inference service built on custom LPU (Language Processing Unit) hardware that delivers very low latency responses. It exposes an OpenAI-compatible REST API and offers a free tier suitable for CI pipelines and experimentation.

Data residency: Requests are sent to https://api.groq.com and processed on Groq's infrastructure. Data leaves the organization's control.

Regulatory perspective: Same considerations as OpenAI and Anthropic — not suitable where source code must remain within the organization. Acceptable for teams with explicit approval for external cloud AI usage. The free tier makes it particularly convenient for open-source projects and public CI pipelines.

Authentication: A Groq API key from console.groq.com.

How to obtain:

  1. Create an account at console.groq.com
  2. Go to API Keys → Create API key

The free tier provides generous rate limits for development and CI use.

Recommended model: llama-3.3-70b-versatile. See console.groq.com/docs/models for the current list.

Configuration:

export GROQ_API_KEY=gsk_...
./methodatlas -ai -ai-provider groq \
  -ai-api-key-env GROQ_API_KEY \
  -ai-model llama-3.3-70b-versatile \
  /path/to/tests

Or via YAML:

ai:
  enabled: true
  provider: groq
  model: llama-3.3-70b-versatile
  apiKeyEnv: GROQ_API_KEY
  timeoutSec: 60
  maxRetries: 1
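
Because Groq exposes its OpenAI-compatible API under the /openai/v1 path, a key can be checked by listing available models (a quick sketch, also a convenient way to see the current Llama variants):

# Returns the model list if the key is valid
curl -s https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer $GROQ_API_KEY"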

xAI — Grok models

Provider value: xai

What it is: xAI is the company behind the Grok family of language models. It exposes an OpenAI-compatible REST API at https://api.x.ai/v1. API keys are available at console.x.ai.

Data residency: Requests are sent to xAI's infrastructure. Data leaves the organization's control.

Regulatory perspective: Not suitable for environments that prohibit transmitting source code to external services.

Authentication: An xAI API key from console.x.ai.

Recommended model: grok-3-mini. See x.ai/api for the current list.

Configuration:

export XAI_API_KEY=xai-...
./methodatlas -ai -ai-provider xai \
  -ai-api-key-env XAI_API_KEY \
  -ai-model grok-3-mini \
  /path/to/tests

Or via YAML:

ai:
  enabled: true
  provider: xai
  model: grok-3-mini
  apiKeyEnv: XAI_API_KEY
  timeoutSec: 60
  maxRetries: 2
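
A minimal request against xAI's OpenAI-compatible endpoint can confirm the key and model name before a full run (a sketch):

# Standard OpenAI-style chat/completions request against api.x.ai
curl -s https://api.x.ai/v1/chat/completions \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"grok-3-mini","messages":[{"role":"user","content":"ping"}]}'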

GitHub Models — free inference with a GitHub account

Provider value: github_models

What it is: GitHub Models is a free inference service that makes a broad selection of models available to any GitHub account holder. It uses an OpenAI-compatible endpoint and authenticates with a standard GitHub personal access token or the GITHUB_TOKEN available in every GitHub Actions workflow — no separate API key or billing setup is required.

Data residency: Requests are processed on Microsoft Azure infrastructure on behalf of GitHub. Data leaves the organization's control.

Regulatory perspective: Suitable for open-source projects and public CI pipelines. Not suitable for environments that prohibit transmitting source code to cloud services.

Authentication: The GITHUB_TOKEN Actions secret (injected automatically in every GitHub Actions run), or a GitHub personal access token. Set the environment variable name with -ai-api-key-env GITHUB_TOKEN. No separate API key signup is required.

Recommended model: gpt-4o-mini. See github.com/marketplace/models for the full list, which includes GPT-4o, Meta Llama 3.x, Mistral Large, and Phi-4.

Configuration:

./methodatlas -ai -ai-provider github_models \
  -ai-api-key-env GITHUB_TOKEN \
  -ai-model gpt-4o-mini \
  /path/to/tests

In a GitHub Actions workflow, GITHUB_TOKEN is injected automatically:

- name: Run MethodAtlas
  run: |
    ./methodatlas -ai \
      -ai-provider github_models \
      -ai-api-key-env GITHUB_TOKEN \
      -ai-model gpt-4o-mini \
      ${{ github.workspace }}/src/test
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
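
If the repository restricts default token permissions, the workflow may also need to grant the token access to GitHub Models explicitly. A sketch of the permissions block (check GitHub's current documentation for the exact scope name):

permissions:
  contents: read
  models: read   # allows GITHUB_TOKEN to call GitHub Models inference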

Or via YAML:

ai:
  enabled: true
  provider: github_models
  model: gpt-4o-mini
  apiKeyEnv: GITHUB_TOKEN
  timeoutSec: 60
  maxRetries: 2

Mistral AI

Provider value: mistral

What it is: Mistral AI is a European AI company offering a family of open-weight models with an OpenAI-compatible API at https://api.mistral.ai/v1. A free tier is available at console.mistral.ai.

Data residency: Requests are processed within the European Union on Mistral's infrastructure. This makes Mistral one of the few cloud providers suitable for organizations with EU-only data residency requirements that cannot host models locally.

Regulatory perspective: The EU data residency may satisfy GDPR-based restrictions on data transfers outside the EU, but confirm with your legal team. Mistral's enterprise agreements may offer additional guarantees.

Authentication: A Mistral API key from console.mistral.ai.

Recommended model: mistral-small-latest for a balance of cost and accuracy. Use codestral-latest (code-specialized) for the best classification quality. See docs.mistral.ai/getting-started/models for the current list.

Configuration:

export MISTRAL_API_KEY=...
./methodatlas -ai -ai-provider mistral \
  -ai-api-key-env MISTRAL_API_KEY \
  -ai-model mistral-small-latest \
  /path/to/tests

Or via YAML:

ai:
  enabled: true
  provider: mistral
  model: mistral-small-latest
  apiKeyEnv: MISTRAL_API_KEY
  timeoutSec: 60
  maxRetries: 2
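
Mistral's API is OpenAI-compatible, so a key can be verified by listing models (a quick sketch):

# Returns the model list if the key is valid
curl -s https://api.mistral.ai/v1/models \
  -H "Authorization: Bearer $MISTRAL_API_KEY"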

Auto mode

Provider value: auto

The auto provider first probes the local Ollama endpoint (http://localhost:11434). If Ollama is reachable, it is used and no data leaves the machine. If Ollama is not available and an API key has been configured, an OpenAI-compatible cloud provider is used instead.

Auto mode is convenient for developer workstations where Ollama is typically running, with a cloud provider as fallback for CI environments that lack a local GPU.
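
The behavior is roughly equivalent to the following explicit shell check (a sketch only; the real probe is internal to MethodAtlas, and the fallback provider shown here is just an example):

# Mimic auto mode by hand: use Ollama if it answers, otherwise fall back to a cloud provider
if curl -s --max-time 2 http://localhost:11434/api/tags > /dev/null; then
  ./methodatlas -ai -ai-provider ollama -ai-model qwen2.5-coder:7b /path/to/tests
else
  ./methodatlas -ai -ai-provider openrouter -ai-api-key-env OPENROUTER_API_KEY \
    -ai-model stepfun/step-3.5-flash:free /path/to/tests
fi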

Data residency in auto mode

The provider selected at runtime determines where data goes. Do not use auto in configurations where data residency must be guaranteed — use an explicit provider value instead.

Configuration file

All AI options can be stored in a YAML configuration file so that teams share the same settings without repeating flags on every invocation:

ai:
  enabled: true
  provider: openrouter
  model: stepfun/step-3.5-flash:free
  apiKeyEnv: OPENROUTER_API_KEY
  taxonomyMode: optimized
  timeoutSec: 120
  maxRetries: 2
  confidence: true

Load it with -config:

./methodatlas -config ./methodatlas.yaml /path/to/tests

Command-line flags always override values from the file:

# Use the team config but switch to local Ollama for offline work
./methodatlas -config ./methodatlas.yaml -ai-provider ollama -ai-model qwen2.5-coder:7b /path/to/tests

See CLI reference for the complete YAML field reference.