
Voice Transcription for Therapy Notes: How CBT Assistant Pro Does It Privately

Updated May 24, 2026
Encrypted · Audit-logged · Zero retention

Voice dictation can save therapists 15-30 minutes per session — but only if the privacy model is right. This guide explains exactly how CBT Assistant Pro handles voice transcription: where the audio goes, how long it stays there (spoiler: not long), and what protections are in place to ensure no clinical recording is ever retained, shared, or used to train an AI model.

Why voice transcription matters for therapists

Writing session notes is the single largest source of administrative burden in clinical practice. Studies estimate that therapists spend 20-30% of their working hours on documentation. Voice dictation can cut this by half or more, provided the privacy model is genuinely safe for protected health information (PHI).

The benefits compound:

  • More accurate notes (you capture detail while the session is fresh)
  • Less burnout from after-hours documentation
  • More time for clinical reasoning and case formulation
  • Better client engagement during sessions (less time spent typing)

The catch: most consumer voice transcription tools, including the dictation features built into common operating systems, are not safe for PHI. They may retain audio, send it to cloud services that are not covered by a Business Associate Agreement (BAA), or use recordings for model improvement.

How CBT Assistant Pro handles voice transcription

The technical flow:

  1. Capture in the browser: Audio is captured locally in your browser using the standard MediaRecorder API. Nothing is uploaded yet.
  2. Transmit encrypted: When you stop recording, the audio is sent over TLS 1.3 to our transcription service.
  3. Transcribe and discard: The transcription service converts audio to text and immediately returns the text. The audio is deleted from the processing buffer.
  4. Store text only: The transcribed text is encrypted with AES-256-GCM and stored in your client record. The audio is never written to persistent storage.
  5. Confirm deletion: Within minutes, any remaining temporary copies of the audio (for example, in processing buffers) are securely overwritten and removed.

At no point does the audio touch long-term storage. At no point is it backed up. At no point is it used for training.
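
Steps 3 to 5 can be pictured with a small in-memory sketch. This is an illustration under stated assumptions, not CBT Assistant Pro's actual implementation: `transcribe_and_discard` and `fake_transcriber` are hypothetical names, and a real service would call a speech-to-text engine where the stand-in function is used here.

```python
from typing import Callable


def transcribe_and_discard(audio: bytearray,
                           transcriber: Callable[[memoryview], str]) -> str:
    """Return the transcript, then zero the in-memory audio buffer.

    Hypothetical helper: the audio lives only in a mutable buffer and
    this function never writes it to disk.
    """
    view = memoryview(audio)  # share the buffer without copying it
    try:
        return transcriber(view)
    finally:
        view.release()
        # Overwrite the samples in place, even if transcription failed.
        audio[:] = bytes(len(audio))


def fake_transcriber(view: memoryview) -> str:
    # Stand-in for a real speech-to-text call (assumption for illustration).
    return f"transcript of {len(view)} bytes"


buffer = bytearray(b"\x01\x02\x03\x04")
text = transcribe_and_discard(buffer, fake_transcriber)
```

After the call, `text` holds the transcript and `buffer` contains only zero bytes. The `finally` clause is the design point: the overwrite runs whether transcription succeeds or raises.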

What our agreements with transcription providers require

CBT Assistant Pro uses enterprise-grade transcription services with the following contractual requirements:

  • Business Associate Agreement (BAA): Signed with every provider that processes clinical audio.
  • Zero retention: Audio is deleted immediately after transcription completes; no caching, no archiving.
  • No model training: Clinical audio cannot be used to train, fine-tune, or improve any AI model.
  • Sub-processor restrictions: Providers may not pass audio to their own sub-processors except for the immediate transcription task.
  • Geographic restrictions: Audio is processed in U.S. and EU data centers only, and audio from EU clients is processed exclusively in EU regions.
  • Audit rights: We retain the right to audit provider security practices annually.

These terms are non-negotiable for any provider we use.
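
One way to picture how requirements like these could be enforced in software is a check that runs before any audio is routed. The sketch below is hypothetical: `ProviderTerms`, `is_eligible`, and `processing_region` are illustrative names, and sending non-EU clients to the U.S. region is an assumption of this example rather than something stated above.

```python
from dataclasses import dataclass

ALLOWED_REGIONS = frozenset({"US", "EU"})


@dataclass(frozen=True)
class ProviderTerms:
    """Hypothetical record of what a provider has contractually agreed to."""
    name: str
    baa_signed: bool
    zero_retention: bool
    no_model_training: bool
    regions: frozenset


def is_eligible(p: ProviderTerms) -> bool:
    # Every contractual term must hold, and the provider must operate
    # only in allowed regions.
    return (p.baa_signed and p.zero_retention and p.no_model_training
            and bool(p.regions) and p.regions <= ALLOWED_REGIONS)


def processing_region(client_region: str, p: ProviderTerms) -> str:
    if not is_eligible(p):
        raise ValueError(f"{p.name} does not meet the required terms")
    # EU clients stay in the EU; routing everyone else to the US is an
    # assumption made for this sketch.
    target = "EU" if client_region == "EU" else "US"
    if target not in p.regions:
        raise ValueError(f"{p.name} has no {target} region")
    return target


provider = ProviderTerms("example-stt", True, True, True,
                         frozenset({"US", "EU"}))
region = processing_region("EU", provider)
```

Here `region` is `"EU"`. A provider without a signed BAA, or one operating outside the two allowed regions, is rejected before any audio is sent.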

What about the transcribed text? How is it protected?

Once transcribed, the text becomes part of your client record and inherits the full security model:

  • Encrypted at rest with AES-256-GCM
  • Encrypted in transit with TLS 1.3
  • Access-controlled (only you, and clinic colleagues you explicitly grant access)
  • Audit-logged (every read and write is recorded)
  • Subject to your right of access, edit, and deletion at any time
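
The access-control and audit-logging items above can be illustrated with a toy in-memory store. This is a minimal sketch, not the platform's code: `NoteStore` and its fields are hypothetical, the notes are kept as plain strings for brevity (a real system would hold encrypted text), and logging denied attempts is an assumption of this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class NoteStore:
    """Hypothetical store: owner plus explicitly granted colleagues."""
    owner: str
    granted: set = field(default_factory=set)
    notes: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def _authorize(self, user: str, action: str, note_id: str) -> None:
        allowed = user == self.owner or user in self.granted
        # Record the attempt before acting on it, including denials.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "action": action,
            "note": note_id,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{user} may not {action} note {note_id}")

    def write(self, user: str, note_id: str, text: str) -> None:
        self._authorize(user, "write", note_id)
        self.notes[note_id] = text

    def read(self, user: str, note_id: str) -> str:
        self._authorize(user, "read", note_id)
        return self.notes[note_id]


store = NoteStore(owner="dr_lee")
store.write("dr_lee", "n1", "Client reported improved sleep.")
store.granted.add("dr_kim")  # explicit grant to a clinic colleague
note = store.read("dr_kim", "n1")
```

Both the write and the read leave an entry in `store.audit_log`, and a user who is neither the owner nor explicitly granted access gets a `PermissionError`.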

If you later edit or delete the note, the change is applied across production systems within 24 hours and to backups within 30 days.
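
As a worked example of the 24-hour and 30-day windows, the sketch below computes the latest time by which a change should have reached each layer. The function name `propagation_deadlines` and the sample timestamp are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

PRODUCTION_WINDOW = timedelta(hours=24)
BACKUP_WINDOW = timedelta(days=30)


def propagation_deadlines(changed_at: datetime) -> dict:
    """Latest times by which an edit or deletion should have propagated."""
    return {
        "production": changed_at + PRODUCTION_WINDOW,
        "backups": changed_at + BACKUP_WINDOW,
    }


changed = datetime(2026, 5, 24, 9, 0, tzinfo=timezone.utc)
deadlines = propagation_deadlines(changed)
# production -> 2026-05-25 09:00 UTC, backups -> 2026-06-23 09:00 UTC
```

For a note deleted at 09:00 UTC on May 24, 2026, production should reflect the deletion by 09:00 UTC on May 25, and backups by 09:00 UTC on June 23.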

Best practices for safe dictation

Even with strong technical protections, clinicians should follow these practices:

  • Get explicit client consent to record any portion of a session. Recording without consent violates clinical ethics regardless of platform.
  • Prefer dictating summaries to recording the full session. Summarize key points after the client leaves rather than recording the entire conversation.
  • Use pseudonyms when dictating. Say "the client" or use the client code rather than the real name.
  • Check the transcribed text for accuracy before saving. Voice transcription is good but not perfect, especially with clinical terminology.
  • Avoid public spaces. Do not dictate session notes in a coffee shop or open office.

Combined with the platform's technical safeguards, these practices give you a defensible voice transcription workflow.

Frequently asked questions

Is the audio of my sessions ever stored?

No. Audio is deleted as soon as transcription completes, and any remaining temporary copies are overwritten within minutes. The transcribed text is stored encrypted; the audio is never written to persistent storage and never backed up.

Can I record a full session and have it transcribed?

Yes, technically, but we recommend dictating summaries rather than recording full sessions. Recording a full session requires explicit client consent and creates a substantially larger audio file, which means more sensitive audio to process and no privacy benefit.

What languages does voice transcription support?

Voice transcription supports the major languages on our platform: English, Spanish, German, French, Polish, Portuguese, Russian, and Indonesian. Accuracy is currently highest in English and is improving in the other languages.

Does the transcription provider see my client data?

The transcription provider sees the audio waveform only for the seconds it takes to transcribe. It does not see your client record, your formulations, or any contextual data. Under our BAA, it cannot retain the audio or train on it.

Can I disable voice transcription entirely?

Yes. Voice transcription is an optional feature. You can disable it for your account in Settings, and clients never have access to it.

Ready to speed up your CBT documentation?

CBT Assistant Pro helps therapists build formulations 3× faster with AI-assisted documentation. HIPAA compliant. Free trial, no credit card.
