OpenAI has released Privacy Filter, an open-weight model built to detect and redact personally identifiable information in text without sending that data to a server. The release, published on April 22 with wider coverage landing April 23, 2026, is one of the first major open-weight models OpenAI has shipped since its gpt-oss family, and it targets a concrete enterprise pain point rather than raw chat performance.
A small model aimed at a specific job
Privacy Filter is a bidirectional token classifier derived from the gpt-oss family. It has 1.5 billion total parameters with roughly 50 million active parameters per forward pass, supports a 128,000-token context window, and is distributed under a permissive Apache 2.0 license via GitHub and Hugging Face. Rather than generating text, the model labels each token in a document as belonging to one of eight sensitive categories: names, addresses, emails, phone numbers, URLs, dates, account numbers such as credit cards and bank accounts, and secrets including passwords and API keys.
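Token classification of this kind can be turned into redaction by collapsing each run of same-label tokens into a category placeholder. The sketch below is a minimal illustration of that post-processing step, not Privacy Filter's actual API: the label names and tagging scheme are assumptions, and the real model's output format may differ.

```python
# Hypothetical sketch: redact a document given per-token labels, where "O"
# marks non-sensitive tokens. Label names ("NAME", "EMAIL") are illustrative.

def redact(tokens, labels, mask="[{}]"):
    """Replace each labeled span with a single category placeholder."""
    out, prev = [], "O"
    for tok, lab in zip(tokens, labels):
        if lab == "O":
            out.append(tok)
        elif lab != prev:          # first token of a new sensitive span
            out.append(mask.format(lab))
        prev = lab                 # consecutive same-label tokens merge
    return " ".join(out)

tokens = ["Contact", "Jane", "Doe", "at", "jane@example.com", "today"]
labels = ["O", "NAME", "NAME", "O", "EMAIL", "O"]
print(redact(tokens, labels))
# → Contact [NAME] at [EMAIL] today
```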
On the PII-Masking-300k benchmark, OpenAI reports a 96% F1 score, with 94.04% precision and 98.04% recall. On a revised version of the same dataset, the numbers climb to 97.43% F1, 96.79% precision, and 98.08% recall. OpenAI positions the model as having "frontier personal data detection capability" at a size that can run locally in a browser or on a laptop.
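The reported F1 figures are consistent with the precision and recall numbers, since F1 is their harmonic mean; a two-line check reproduces both:

```python
# Verify that the reported F1 scores follow from precision and recall.

def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(round(f1(0.9404, 0.9804) * 100, 2))  # → 96.0
print(round(f1(0.9679, 0.9808) * 100, 2))  # → 97.43
```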
Why the on-device angle matters
Most de-identification pipelines today rely either on regex-based tools that miss contextual references or on server-side LLM calls that require sending raw sensitive data to a third-party API. OpenAI argues Privacy Filter sidesteps both problems. The company said the model "can remain on device, with less risk of exposure, rather than needing to be sent to a server for de-identification."

For regulated industries — healthcare, finance, legal — this is the difference between shipping AI features and tripping over compliance reviews. A 1.5B-parameter classifier is small enough to embed inside data pipelines, browser extensions, and local developer tools, letting teams scrub prompts and logs before they ever leave a machine.
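The deployment pattern described here — scrub locally, then forward only the redacted copy — can be sketched in a few lines. The `local_pii_filter` function below is a hypothetical stand-in for running Privacy Filter on-device; it is stubbed with a trivial email regex so the sketch stays self-contained.

```python
# Hypothetical pipeline shape: redact text on the local machine before any
# remote API call sees it. The regex stub stands in for model inference.
import re

def local_pii_filter(text):
    # Stand-in for on-device inference: redact email-shaped strings.
    return re.sub(r"\b\S+@\S+\.\S+\b", "[EMAIL]", text)

def safe_prompt(user_text, send):
    """Scrub user_text locally, then hand only the redacted copy to `send`."""
    return send(local_pii_filter(user_text))

# The "remote call" here just echoes what it received, showing that only
# the scrubbed text ever leaves the function boundary.
print(safe_prompt("Reach me at bob@corp.com", send=lambda t: t))
# → Reach me at [EMAIL]
```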
Caveats OpenAI is flagging up front
OpenAI is unusually candid about limitations. The model card notes that Privacy Filter "can make mistakes," citing under-detection of uncommon personal names, regional naming conventions, and domain-specific identifiers, as well as over-redaction of public entities, organizations, or common nouns when local context is ambiguous. For high-sensitivity settings such as medical, legal, and financial workflows, OpenAI advises teams to retain human review paths and to fine-tune the model when local policy differs from its default decision boundaries.
Implications
The release signals a pragmatic shift in how OpenAI is using its open-weight lane. Instead of trying to outflank Meta's Llama or Alibaba's Qwen on general-purpose chat, OpenAI is carving out utility models — small, task-specific, locally deployable — that plug directly into enterprise data workflows. Expect similar single-purpose open-weight drops to follow as OpenAI courts enterprise buyers who want AI infrastructure they can run behind their own firewalls.