🧪 Science-Backed Text Anonymisation

Anonymise Text.
Retain Data.

Research-grade PII detection for Mac, Windows, and Mobile. Scale from individual research papers to enterprise-wide cloud APIs.

Available for texts in English, Dutch, French, Spanish, German, Italian & many more

Read the paper
Input Text

"Jane Doe lives at 123 Baker St and works at Apple."

Textwash Pro
Anonymised Output

"PERSON_1 lives at ADDRESS_1 and works at ORG_1."

Developed for Smart Privacy

Zero Code Required: Intuitive GUI designed for researchers and non-technical staff.
Air-Gapped Ready: Local processing ensures data never leaves your infrastructure.
ML-Powered NER: Probabilistic entity detection outperforms static dictionary lists.
Intruder Tested: Empirically validated against human re-identification attempts.

Core Principles

Scientific Foundation

Based on the peer-reviewed Textwash project (GPL-3.0). Auditable, transparent, and built by academic researchers.

Contextual Privacy

Uses category probabilities to anonymise phrases based on linguistic context, not just simple keywords.

Local-First Architecture

Designed for sensitive institutional data. No internet connection required for the desktop application.

ISO 9001 Certified Development Company

🧩 Product family

Choose the setup that best fits your workflow, from a user-friendly desktop app to cloud APIs and the original open-source script.

All variants are built around the same research-grade anonymisation approach and evaluation framework.

Desktop & mobile app

Textwash Pro

Mac · Windows · iOS · Android

A user-friendly application that runs entirely on your devices. Import unstructured text data and export anonymised versions without sending anything to external servers.

Supports English, Dutch, French, Spanish, German, Italian, and many more languages. Designed to be easy to use for non-technical users.

Offline by default · GUI-based

API & integrations

Textwash Pro API

Cloud-based processing · Zapier-ready

Cloud API for integrating Textwash anonymisation into your own systems and workflows. Ideal for automated pipelines, web apps, and low-code tools such as Zapier.

Process text from forms, CRMs, or ticket systems before storage or analysis.

REST API · Integrations

Cloud workspace

Textwash Pro Cloud

Browser-based batch processing

Use Textwash in a managed cloud environment. Upload datasets, configure entity types, and run anonymisation jobs directly from your browser.

Ideal for teams who need shared project dashboards and result logs.

Team-ready · Job monitoring

Open-source foundations

Textwash Free

Original script · No GUI

The original open-source Textwash project that Textwash Pro builds on. A script-based anonymisation tool without a graphical interface, intended for technical users who want direct access to the underlying code.

Includes the full anonymisation pipeline and evaluation materials under GPL-3.0.

Source code & paper · The open-source original

🏒 Typical use cases

Textwash Pro is built to support real-world anonymisation workflows in research, industry, and the public sector.

If your use case involves unstructured text and personal data, Textwash Pro is likely relevant. Not sure? Reach out at textwash-pro@jocapps.eu

GDPR-compliant data anonymisation

Anonymise free-text fields that contain personal data before storing or sharing them:

  • Customer support logs and email archives
  • Contact forms and CRM notes
  • Internal reports with narrative descriptions

Open Science & data sharing

Prepare research datasets for sharing while protecting participants' identities:

  • Survey open-ended responses
  • Interview and focus group transcripts
  • Field notes and qualitative research data

Legal, health & social services

Remove direct and indirect identifiers from sensitive case descriptions:

  • Clinical notes and case vignettes
  • Legal case summaries and memos
  • Social work documentation and protocols

User research & UX feedback

Anonymise qualitative feedback before sharing within teams or with external partners:

  • User interviews and usability tests
  • App store reviews and support tickets
  • Internal product discovery notes

Logs & monitoring data

Remove PII from semi-structured logs before central storage or analysis:

  • Application and server logs containing user details
  • Chat logs from support systems
  • Exported audit trails and monitoring outputs

Proxy & preprocessing for LLM workflows

Route prompts and free-text inputs through anonymisation before they reach external or internal LLM systems:

  • PII-safe prompt proxy for shared AI assistants
  • Preprocess support tickets before summarisation
  • Mask identifiers prior to retrieval, ranking, and generation
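
A minimal Python sketch of such a prompt proxy. Everything in it is an assumption made for illustration: `make_prompt_proxy`, `toy_anonymise`, and `fake_llm` are hypothetical names, and the e-mail regex is only a stand-in for a real anonymisation step. It does not show how Textwash detects entities or how the Textwash Pro API is called.

```python
import re
from typing import Callable

# Stand-in anonymiser (assumption): masks e-mail addresses with a regex.
# A real deployment would call an actual anonymisation step here instead.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def toy_anonymise(text: str) -> str:
    return EMAIL.sub("[EMAIL_ADDRESS]", text)

def make_prompt_proxy(anonymise: Callable[[str], str],
                      llm: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap an LLM call so that only anonymised text is passed on."""
    def proxy(prompt: str) -> str:
        return llm(anonymise(prompt))
    return proxy

# Hypothetical LLM client: records what it receives and returns a reply.
received = []

def fake_llm(prompt: str) -> str:
    received.append(prompt)
    return f"Summary of: {prompt}"

ask = make_prompt_proxy(toy_anonymise, fake_llm)
reply = ask("Ticket from jane.doe@example.org: cannot log in.")
print(received[0])  # Ticket from [EMAIL_ADDRESS]: cannot log in.
```

Passing the anonymiser and the LLM client in as plain callables keeps the proxy independent of any particular anonymisation backend or model provider.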

🏛️ Custom institutional workflows

We design end-to-end anonymisation workflows that align with institutional governance, legal obligations, and research quality standards.

📋 Governance alignment

Policy mapping, retention definitions, and approval checkpoints for every data flow stage

🛡️ Controlled processing

Role-based access setup, secure review loops, and privacy controls for internal and external teams

📈 Audit readiness

Documented procedures, quality evidence, and repeatable validation protocols for compliance reviews

For institutional rollouts, integration planning, or compliance questions, contact textwash-pro@jocapps.eu

🀝 Optional Services

Textwash Pro works as a standalone product. We also offer optional implementation and consultancy support for research teams, companies, and public-sector organisations.

Advisory and implementation support

The optional service package combines operational design, integration planning, and quality assurance for sensitive text workflows.

  • Workflow assessment for privacy, compliance, and data utility
  • PII preprocessing and guardrail design before model usage
  • Output postprocessing checks to reduce leakage risk
  • Cross-checking between anonymisation quality and business requirements
  • Human-in-the-loop review strategies for high-impact datasets
  • Integration recommendations across on-prem and cloud stacks

🔎 Phase 1: Discovery

  • Data landscape review
  • Risk and exposure mapping
  • Target workflow definition

🧪 Phase 2: Pilot

  • Dataset onboarding
  • Entity type calibration
  • Human quality review setup

⚙️ Phase 3: Integration

  • System and API integration
  • Operational runbook creation
  • Monitoring dashboard rollout

✅ Phase 4: Governance

  • Audit & evidence reviews
  • Policy and training package
  • Continuous improvements

Optional services are available for SMEs, enterprise teams, universities, healthcare organisations, and the public sector.

Built for serious data protection work

Textwash Pro was designed to meet high standards for text anonymisation. The following principles guide its development.

1. Complete and transparent evaluation

The underlying anonymisation approach has been evaluated empirically. This includes tests of what the tool can and cannot do, as well as a motivated intruder test where humans attempt to re-identify persons in anonymised documents.

2. Data never leave your system

The Textwash Pro application does not require you to upload text data or use any remote API. You can disconnect from the internet and continue anonymising documents. This minimises leakage and reduces risks for sensitive data.

3. Transparent foundations

Textwash Pro is based on the open, research-driven Textwash project. The foundations can be inspected, tested, and extended by the community.

4. Learning-based anonymisation

Personal information is complex and context-dependent. Textwash therefore does not rely on simple dictionary lookups. Instead, it uses a machine learning model that assigns category probabilities to phrases and anonymises them accordingly.
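
To make the idea concrete, here is a minimal Python sketch of probability-based labelling followed by placeholder replacement. It is an illustration under stated assumptions, not the Textwash implementation: the function names, the 0.5 threshold, and the probability values are all made up, and the phrase scores that a trained model would produce are supplied by hand.

```python
import re
from collections import defaultdict

def pick_category(probs, threshold=0.5):
    """Return the most probable category, or None if it is below the threshold."""
    label, p = max(probs.items(), key=lambda item: item[1])
    return label if p >= threshold else None

def anonymise(text, scored_phrases, threshold=0.5):
    """Replace confidently labelled phrases with numbered category placeholders."""
    counters = defaultdict(int)
    placeholders = {}
    for phrase, probs in scored_phrases:
        label = pick_category(probs, threshold)
        if label is None or phrase in placeholders:
            continue
        counters[label] += 1
        placeholders[phrase] = f"{label}_{counters[label]}"
    if not placeholders:
        return text
    # Single pass, longest phrases first, so placeholders are never re-matched.
    pattern = re.compile("|".join(
        re.escape(p) for p in sorted(placeholders, key=len, reverse=True)))
    return pattern.sub(lambda m: placeholders[m.group(0)], text)

# Made-up probabilities for illustration only (not real model output).
scored = [
    ("Jane Doe", {"PERSON": 0.97, "ORG": 0.03}),
    ("123 Baker St", {"ADDRESS": 0.91, "LOCATION": 0.09}),
    ("Apple", {"ORG": 0.88, "PERSON": 0.12}),
    ("works", {"OCCUPATION": 0.20, "ORG": 0.05}),  # below threshold: kept as-is
]
result = anonymise("Jane Doe lives at 123 Baker St and works at Apple.", scored)
print(result)  # PERSON_1 lives at ADDRESS_1 and works at ORG_1.
```

The threshold is the point where context matters: a phrase is only replaced when one category is sufficiently probable, so an ambiguous word such as "works" stays in the text.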

Considering other anonymisation tools?

Even if you do not use Textwash Pro, we strongly encourage you to ask any tool provider for:

  1. An empirical evaluation that clearly shows what their tool can and cannot do (you can point them to the Textwash evaluation approach and dataset).
  2. A clear justification for why data must be sent to online services or APIs. In many cases, strong anonymisation does not require central data collection.

If this level of transparency is not available, treat risk claims with caution.

You can always reach us at textwash-pro@jocapps.eu if you have questions about evaluation details.

❓ FAQ

Common questions about deployment models, support levels, and governance requirements

❓ Is Textwash Pro usable without optional services?

Yes. The product is fully usable on its own, and services are optional

❓ Do you provide SLA options?

Yes. We can define service levels, support windows, response targets, and escalation paths for qualifying organisations

❓ Is Textwash Pro suitable for public sector programmes?

Yes. We support public sector, research, healthcare, and regulated environments with governance-aligned implementation plans

❓ Can on-premise and cloud setups be combined?

Yes. Hybrid architectures can combine local processing with API or cloud components, depending on policy and risk constraints

❓ How do you support audits and compliance reviews?

We provide documentation inputs, quality checkpoints, and implementation evidence to help internal governance and external audits

❓ Who should contact you for enterprise or institutional rollout?

Programme managers, data protection teams, and technical leads can contact us at textwash-pro@jocapps.eu to discuss fit and rollout options

🚀 Quick Start Guide

Textwash Pro offers a graphical user interface (GUI) for anonymising text files with no command line required:

  • Open the Textwash Pro app on your Mac, Windows, iOS, or Android device
  • Import data by selecting individual files or folders in the GUI
  • Set the language (supports English, Dutch, French, Spanish, German, Italian, and many more)
  • Choose the output folder where anonymised files should be saved
  • Start the anonymisation run; anonymised files are written to the chosen directory

Textwash Pro is designed to be user-friendly and works well for both small and large text collections. It can take advantage of powerful hardware where available, but does not require any technical setup.

Need a walkthrough?

If you would like a short demo or have specific questions about your use case, we are happy to help.

Examples & sample data

The original open-source Textwash project also includes detailed person descriptions and their anonymised counterparts. These examples illustrate how the underlying anonymisation behaves.

  • Original, detail-rich descriptions in the examples directory
  • Corresponding anonymised versions in the examples_anonymised directory

You can use these example files to understand how different entity types are treated, and as a starting point for your own evaluation.
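
As one possible starting point, the Python sketch below checks which known identifiers still appear in an anonymised text. It is a deliberately naive string match with hypothetical function names and made-up sample text. It is not the Textwash evaluation procedure and no substitute for a motivated intruder test, but it can flag obvious misses quickly.

```python
def surviving_identifiers(anonymised_text, known_identifiers):
    """Return the known identifiers that still occur in the anonymised text."""
    haystack = anonymised_text.casefold()
    return [ident for ident in known_identifiers
            if ident.casefold() in haystack]

def leakage_rate(anonymised_text, known_identifiers):
    """Share of known identifiers that survived anonymisation (0.0 to 1.0)."""
    if not known_identifiers:
        return 0.0
    survived = surviving_identifiers(anonymised_text, known_identifiers)
    return len(survived) / len(known_identifiers)

# Identifiers you know to be in the original text (illustrative values).
identifiers = ["Jane Doe", "123 Baker St", "Apple"]

# A deliberately imperfect anonymised version, to show a detected miss.
anonymised = "PERSON_1 lives at ADDRESS_1 and works at Apple."

survived = surviving_identifiers(anonymised, identifiers)
rate = leakage_rate(anonymised, identifiers)
print(survived)  # ['Apple']
```

A check like this only catches verbatim survivals. Indirect identifiers, paraphrases, and combinations of harmless-looking details still need human review.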

Browse Textwash Free on GitHub

🏷️ Fine-grained control over entity types

Textwash can anonymise a rich set of entity types and can be restricted to a subset as needed.

This allows you to align anonymisation with legal and methodological requirements while preserving as much non-identifying information as possible.

PRONOUNS · PHONE NUMBER · EMAIL ADDRESS · NUMERICS · MONTHS · DATE · PERSON · LOCATION · OCCUPATION · TITLE · AGE · CULTURAL IDENTITY · TIME · ADDRESS · ORGANISATION · OTHER IDENTIFIABLE ATTRIBUTE

By selecting only the entity types you need, you can tailor anonymisation to your context while keeping as much useful, non-identifying information as possible.
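
The effect of restricting entity types can be sketched in a few lines of Python. This is an illustration under assumptions, not Textwash code: `find_span` and `redact_selected` are hypothetical helpers, and the labelled spans that a detector would normally produce are written out by hand.

```python
def find_span(text, phrase, label):
    """Locate the first occurrence of phrase and return (start, end, label)."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)

def redact_selected(text, spans, allowed):
    """Replace only spans whose label is in `allowed`; leave the rest intact."""
    out, cursor, counts = [], 0, {}
    for start, end, label in sorted(spans):
        if label not in allowed or start < cursor:
            continue  # skip deselected types and spans overlapping a replaced one
        counts[label] = counts.get(label, 0) + 1
        out.append(text[cursor:start])
        out.append(f"{label}_{counts[label]}")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)

text = "Jane Doe, 34, moved to Utrecht in March."
# Hand-written spans standing in for detector output (assumption).
spans = [
    find_span(text, "Jane Doe", "PERSON"),
    find_span(text, "34", "AGE"),
    find_span(text, "Utrecht", "LOCATION"),
    find_span(text, "March", "MONTHS"),
]
result = redact_selected(text, spans, allowed={"PERSON", "LOCATION"})
print(result)  # PERSON_1, 34, moved to LOCATION_1 in March.
```

With only PERSON and LOCATION selected, the age and month stay in the text, which is the trade-off described above: fewer selected types keep more analytic detail but remove fewer potential identifiers.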

Researcher Info

Textwash Pro is a commercial product built on a research-driven and openly documented foundation. Whether you choose Textwash Pro or another provider, we recommend requesting evidence that supports real-world privacy claims.

  • Empirical evaluation showing what a tool can and cannot anonymise, ideally against shared benchmark datasets
  • A clear explanation of data flow, including why remote APIs are needed and which safeguards apply
  • Governance artefacts such as validation reports, audit evidence, and documented limits of the method

If these materials are not available, decision makers should treat deployment claims with caution and request formal clarification.

For research collaborations, interoperability discussions, or evaluation questions, contact textwash-pro@jocapps.eu

👥 Who developed Textwash Pro?

Textwash Pro is developed and distributed by Dr. Bennett Kleinberg & jocapps® GmbH and is based on Textwash (github.com/ben-aaron188/textwash) under the GNU General Public License v3.0. The original Textwash project was developed by Dr. Maximilian Mozes and Dr. Bennett Kleinberg.

Textwash Pro extends this foundation with a multi-platform GUI, deployment options, and additional tooling while preserving the open, research-driven ethos of the original project.